使用正则表达式

我试图提取第一款提取第一段。但我发现了任何运气。谁能帮我？这里是文字。 http://dpaste.com/638776/。我的文字是动态的。感谢使用正则表达式

更新：我在读使用eTree模块XML文件。在XML中有标签叫做<text></text>。 <text></text>is here之间的数据。我只想从text tags打印以下数据。可能吗？感谢

'''Zamindar''' ({{te|జమీందార్}}) is a 1965 [[Telugu language|Telugu]] "Thriller" film 
    directed by [[V. Madhusudhan Rao]] and produced by [[Tammareddy Krishna Murthy]] 
    of Ravindra Art Pictures.This is variety role for [[Akkineni Nageswara Rao]] 
    who is more popular with soft Romantic roles.He plays the role of a tough CID Officer  very well.The Movie has some Good songs.This movie has a considerable resemblance with the 1963 [[Cary Grant]] English Movie ''[[Charade (1963 film)|Charade]]''.

来源

2011-10-22 no_freedom

你是什么意思了款？从{{'到'}}'的所有东西？它似乎是一个维基百科模板，所以如果你使用pywikipedia，可能有更好的方法。 –

@wiso它是维基百科模板。感谢您的建议。 –

非常不清楚...... – heltonbiker

修订基于新的信息...

如果你能生产标签之间的文本，你只需要找到第一款的模式，将适合所有的情况下，因此基于在这个例子中：

#data - stuff between text tags 
firstparagraph = re.search("}}(.*?)\r*\n\r*\n",data,re.DOTALL) 
print firstparagraph.group(1)

来源

2011-10-22 13:48:20

感谢您的回复。但它不工作。 –

如果你喜欢发布一些细节...我还不确定你是否试图解析pastebin或只是文本？ –

它的工作很棒。但最后我也得到了警告信息。 '打印firstparagraph.group（1） AttributeError的： 'NoneType' 对象没有属性 '组' 。我只想要第一段，所以不需要'{{Infobox电影 | name = Bheemli Kabadi Jattu |图片= |字幕= |导演= [[Tatineni Satya]] |制片= NV普拉萨德，第耆那教 }}'谢谢 –

如果你建立在点换行符相匹配的正则表达式，你（在红宝石测试，但我猜想，这将在Python上班的）。这是完全一样的尼尔·伯恩回答：

}}\n(.*?)\n\n

请参阅在rubular效果。

来源

2011-10-22 15:50:58 lkuty

使用正则表达式

回答

相关问题