2012-07-24 71 views
1

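Python: Slicing elements in a list in a list in a list, depending on the sublist

I've reached the end of my limited knowledge on this problem. I'm currently parsing diff results. Here is an example of the results I'm trying to manipulate: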

[ 
[[0, 0, '\xe2\x80\x9cWe are returning again statement. He depicted the attacks as part of a battle launched by Sunnis against the country\xe2\x80\x99s Shia leaders.\r\n\r\nThe first attack came about 5 a.m. on Monday when gunmen stormed onto an Iraqi '], 
[-1, 1, 'military base near the town of Duluiyah in S'], 
[0, 2, 'alahuddin Province and killed 15 Iraqi soldiers, according to security officials. Four soldiers, including a high-ranking was taken prisoner by the insurgents, who escaped with him.\r\n\r\nThe insurgents also attacked the home of a police official in Balad, seriously wounding ']], 

[[0, 4, 'eckpoint near Baquba, killing one policeman. In all, attacks were reported in at least five provinces.\r\n\r\nEight attacks were launched in Kirkuk Province, mostly targeting police patrols, with five people killed and 42 wounded.\r\n\r\nThe offensive started on the third day of the Islamic holy month of Ramadan, and '], 
[-1, 5, 'apparently took advantage of the wi'], 
[1, 6, 'll and the other.']] 
] 

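I'm building a diff summary. Here is how the structure breaks down:

The outer list is a list of diff results (two in the example above).

Each sublist has three elements:

  • the text before the change;
  • the text that constitutes the change; and
  • the text after the change.

Each sub-sublist also has three elements:

  • a number indicating whether the segment is a deletion, unaffected, or an addition (-1, 0, 1 respectively);
  • a position number (sequential); and
  • the string itself.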

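What I need to do is slice the strings in the sub-sublists, but how they are sliced depends on which sublist element they belong to.

  • For sublist element 1, I need to slice off everything except the last 4 characters of the string.
  • For sublist element 2, I need no slicing at all.
  • For sublist element 3, I need to slice off everything except the first 4 characters of the string.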
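The three rules above correspond to ordinary Python slice expressions. A minimal sketch on a single string (the sample string and variable names are made up for illustration):

```python
# Hypothetical sample string, used only to illustrate the three rules.
text = "this is a sentence"

before_tail = text[-4:]  # element 1: keep only the last 4 characters
unchanged = text[:]      # element 2: no slicing, the whole string
after_head = text[:4]    # element 3: keep only the first 4 characters
```

Slicing never raises on short input: a string with fewer than 4 characters comes back whole.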

Here is a simplified example of why I need to slice this way. Text before the solution:

[[[...]], [[this is a],[sentence],[to demonstrate.]], [[...]]] 

Text after the solution:

[[[...]], [[is a],[sentence],[to d]], [[...]]] 

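The other difficulty is that I want to keep the structure of the list.

It's been a tough day - apologies for the mind-bending nature of this problem, but that's what Stack Overflow is for...

Thoughts?

Answers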

2

You can do this with one big unpacking assignment:

[[[b_n, b_p, b_s[-4:]], change, [a_n, a_p, a_s[:4]]] 
for (b_n, b_p, b_s), change, (a_n, a_p, a_s) in results] 
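To try the unpacking version end to end, here is a sketch that runs it on abbreviated sample data. The strings are shortened from the question's example, and `trimmed` is a name introduced here for illustration.

```python
# Abbreviated sample data shaped like the diff results in the question
# (strings shortened for illustration).
results = [
    [[0, 0, "stormed onto an Iraqi "],
     [-1, 1, "military base near the town of Duluiyah in S"],
     [0, 2, "alahuddin Province and killed 15 Iraqi soldiers"]],
    [[0, 4, "holy month of Ramadan, and "],
     [-1, 5, "apparently took advantage of the wi"],
     [1, 6, "ll and the other."]],
]

# Unpack each result into (before, change, after); rebuild the before and
# after sublists with trimmed strings and pass the change through untouched.
trimmed = [
    [[b_n, b_p, b_s[-4:]], change, [a_n, a_p, a_s[:4]]]
    for (b_n, b_p, b_s), change, (a_n, a_p, a_s) in results
]
```

The nesting is preserved. Note that each `change` sublist in `trimmed` is the same list object as in `results`, not a copy, so mutating one affects the other.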

Another approach is to zip and apply slice objects:

[[[num, position, text[op]] 
    for (num, position, text), op in zip(chunk, [slice(-4, None), slice(None), slice(4)])] 
for chunk in results] 
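If the amount of context might change, the slice-object version can be wrapped in a function. This is only a sketch: `trim_context` and its `width` parameter are names introduced here, not part of the answer, and it assumes `width` is at least 1 and that every chunk has exactly three sub-sublists.

```python
def trim_context(results, width=4):
    """Return a new nested list with the context strings trimmed.

    Assumes width >= 1 (hypothetical parameter): with width=0 the first
    slice would become slice(0, None) and keep the whole string.
    """
    # One slice per position: tail of "before", all of "change", head of "after".
    ops = [slice(-width, None), slice(None), slice(width)]
    return [
        [[num, position, text[op]]
         for (num, position, text), op in zip(chunk, ops)]
        for chunk in results
    ]


# Simplified data modelled on the question's "this is a sentence" example.
sample = [[[0, 0, "this is a"], [-1, 1, "sentence"], [1, 2, "to demonstrate."]]]
summary = trim_context(sample)
```

Unlike the unpacking version, this rebuilds all three sub-sublists, so the result shares no inner lists with the input.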

I couldn't get these snippets to work without syntax errors. Any tips on the implementation? – Pat 2012-07-24 12:02:38


@Pat Typo, fixed. – ecatmur 2012-07-24 12:10:26


Nailed it. Got it working and tried various edge cases. Beautiful work. Thanks. – Pat 2012-07-24 12:22:38
