Python equivalent of PHP's preg_match

I am planning to move one of my scrapers to Python. I am comfortable using preg_match and preg_match_all in PHP, but I can't find a suitable function in Python that is similar to preg_match. Could anyone help me with this?

For example, if I want to get the content between <a class="title" and </a>, I use the following function in PHP:

preg_match_all('/a class="title"(.*?)<\/a>/si',$input,$output);

In Python I have not been able to find a similar function.


2012-01-30 funnyguy

Here is the Python regular expression documentation: http://docs.python.org/howto/regex.html – 2012-01-30 09:39:24

In Python, we don't use regular expressions to parse HTML; we use [BeautifulSoup](http://www.crummy.com/software/BeautifulSoup/). See http://stackoverflow.com/a/1732454/78845 – Johnsyweb 2012-01-30 09:44:12

You are looking for Python's re module.

Take a look at re.findall and re.search.
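To make the mapping concrete, here is a minimal sketch of how re.search and re.findall could stand in for preg_match and preg_match_all with the pattern from the question. The sample `html` string is made up for illustration; re.IGNORECASE and re.DOTALL are used as the counterparts of PHP's `i` and `s` modifiers.

```python
import re

# Hypothetical sample input, not taken from the question.
html = '''<a class="title" href="/one">First</a>
<A CLASS="title" href="/two">Second
item</A>'''

# Same pattern as the PHP example; Python needs no delimiters and no "\/".
pattern = r'a class="title"(.*?)</a>'
flags = re.IGNORECASE | re.DOTALL  # roughly PHP's /si

# Like preg_match: the first match only (or None if there is none).
first = re.search(pattern, html, flags)
if first is not None:
    print(first.group(1))

# Like preg_match_all: with one capture group, findall returns
# a list of the captured strings.
all_matches = re.findall(pattern, html, flags)
print(all_matches)
```

Note that re.findall returns the captured groups directly rather than filling an output array as preg_match_all does.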

Since, as you mentioned, you are trying to parse HTML, use an HTML parser instead. Several options are available in Python, such as lxml or BeautifulSoup.

See also: Why you should not parse html with regex
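As a rough illustration of the parser approach, the sketch below uses the standard library's html.parser rather than lxml or BeautifulSoup, simply so that it runs without installing anything. The class name `TitleLinkParser` and the sample markup are hypothetical, and the class check only handles an exact `class="title"` value.

```python
from html.parser import HTMLParser


class TitleLinkParser(HTMLParser):
    """Collects the text inside every <a class="title"> element."""

    def __init__(self):
        super().__init__()
        self.in_title_link = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; values are unchanged.
        # Assumption: the class attribute is exactly "title".
        if tag == "a" and ("class", "title") in attrs:
            self.in_title_link = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_title_link = False

    def handle_data(self, data):
        # Accumulate text, including text of nested tags such as <b>.
        if self.in_title_link:
            self.titles[-1] += data


parser = TitleLinkParser()
parser.feed('<p><a class="title" href="/one">First</a> '
            '<a href="/x">skip</a> '
            '<A CLASS="title">Second <b>item</b></A></p>')
parser.close()
print(parser.titles)
```

With BeautifulSoup or lxml the same lookup is usually a one-line query, which is why those libraries were suggested here.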


2012-01-30 09:39:32 RanRag

Thank you, gentlemen, for the replies. I have started using BeautifulSoup and ran into a problem with it. I passed the HTML data to BeautifulSoup and I am getting this error. soup = BeautifulSoup(data) print soup.prettify() line 52, in soup = BeautifulSoup(data) File "/home/infoken-user/Desktop/lin/BeautifulSoup.py", line 1519, in __init__ BeautifulStoneSoup.__init__(self, *args, **kwargs) File "/home/infoken-user/Desktop/lin/BeautifulSoup.py", line 1144, .. '^<\?.*encoding=[\'"](.*?)[\'"].*\?>').match(xml_data) TypeError: expected string or buffer – funnyguy 2012-01-30 12:54:10

You may be interested in reading Python Regular Expression Operations


2012-01-30 09:40:28

I think you need something like this:

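import re

# input is assumed to hold the HTML text to search
output = re.search(r'a class="title"(.*?)</a>', input, flags=re.IGNORECASE)
if output is not None:
    output = output.group(0)  # whole match; use group(1) for the captured part
    print(output)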

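You can add (?s) to the regular expression so that . also matches newlines (Python's DOTALL mode, the counterpart of PHP's s modifier):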

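import re

# (?s) at the start of the pattern turns on DOTALL for the whole expression
output = re.search(r'(?s)a class="title"(.*?)</a>', input, flags=re.IGNORECASE)
if output is not None:
    output = output.group(0)  # whole match; use group(1) for the captured part
    print(output)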
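To show what (?s) changes and how group(0) differs from group(1), here is a small sketch on a made-up string that contains a newline inside the link. The variable names are illustrative only.

```python
import re

# Hypothetical input with a line break inside the link text.
text = '<a class="title">Line one\nline two</a>'
pattern = r'a class="title"(.*?)</a>'

# Without DOTALL, "." stops at the newline, so there is no match.
no_flag = re.search(pattern, text)

# The inline (?s) flag and flags=re.DOTALL are two spellings of the same mode.
inline = re.search('(?s)' + pattern, text)
flagged = re.search(pattern, text, flags=re.DOTALL)

print(no_flag)          # None
print(inline.group(0))  # the whole match, including 'a class="title"' and '</a>'
print(inline.group(1))  # only what the parentheses captured
```

If only the content between the opening text and </a> is wanted, as with PHP's captured subpatterns, group(1) is probably the value to use.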


2016-07-22 07:07:53
