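Searching for a string using Python

I would like to know how to search for specific strings using Python. I have opened a markdown file that contains the following table: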

| --------- | -------- | --------- | 
|**propped**| - | -a flashlight in one hand and a large leather-bound book (A History of Magic by Bathilda Bagshot) propped open against the pillow. | 
|**Pointless**| - | -“Witch Burning in the Fourteenth Century Was Completely Pointless — discuss.”| 
|**unscrewed**| - | -Slowly and very carefully he unscrewed the ink bottle, dipped his quill into it, and began to write,| 
|**downtrodden**| - | -For years, Aunt Petunia and Uncle Vernon had hoped that if they kept Harry as downtrodden as possible, they would be able to squash the magic out of him.| 
|**sheets,**| - | -As long as he didn’t leave spots of ink on the sheets, the Dursleys need never know that he was studying magic by night.| 
|**flinch**| - | -But he hoped she’d be back soon — she was the only living creature in this house who didn’t flinch at the sight of him.|

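I need to get the strings decorated with |** from each row, for example:

propped
Pointless
unscrewed
downtrodden
sheets
flinch

I tried using a regular expression but could not extract them.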

Source

2017-02-26 Marco Mei

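The raw markdown file content looks like this: | --------- | -------- | --------- | |**propped**| - | -a flashlight in one hand and a large leather-bound book (A History of Magic by Bathilda Bagshot) propped open against the pillow. | |**Pointless**| - | -“Witch Burning in the Fourteenth Century Was Completely Pointless — discuss.”| –

There are [online regex testers](https://regex101.com/) that use Python-flavored regular expressions - they are very useful for fine-tuning patterns. – wwii

Are the '**' characters part of the text you are searching? – wwii

Try the following regular expression: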

(?<=\|)(?!\s).*?(?!\s)(?=\|)

See the demo/explanation.
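As a rough check of what this pattern returns, here is a minimal sketch that runs it over two rows copied from the table in the question. The names `rows`, `pattern` and `cells` are made up for illustration. The match still includes the surrounding asterisks (and the comma in `sheets,`), so the sketch strips them afterwards; that clean-up step is an addition, not part of the pattern above.

```python
import re

# Two rows copied from the table in the question.
rows = [
    "|**propped**| - | -a flashlight in one hand and a large leather-bound book "
    "(A History of Magic by Bathilda Bagshot) propped open against the pillow. | ",
    "|**sheets,**| - | -As long as he didn’t leave spots of ink on the sheets, "
    "the Dursleys need never know that he was studying magic by night.| ",
]

# Text between two pipes that does not start with whitespace.
pattern = re.compile(r"(?<=\|)(?!\s).*?(?!\s)(?=\|)")

cells = []
for row in rows:
    for cell in pattern.findall(row):
        # The pattern keeps the ** markers, so remove them and any trailing comma.
        cells.append(cell.strip("*").rstrip(","))

print(cells)
```

Only the first column matches here because the other cells begin with a space after the pipe, which the `(?!\s)` check rejects.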

Source

2017-02-26 16:54:36 m87

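Thank you very much. And the website you shared is very useful. –

Sorry, I'm new here... –

I thought I could accept more than one answer... sorry –

If the asterisks are part of the text you are searching, and you do not want the comma after sheets, the pattern is a pipe followed by two asterisks, then anything that is not an asterisk or a comma.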

\|\*{2}([^*,]+)

If you can live with the comma, or if there may be commas that you do want to capture:

\|\*{2}([^*]+)

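Use either pattern with re.findall or re.finditer to capture the text you want.

If you use the second pattern, you will need to iterate over the groups and strip the unwanted commas.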
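The two patterns can be compared side by side. This is only a sketch: the sample `text` holds two rows copied from the question's table, and the variable names `no_comma` and `with_comma` are invented for the example.

```python
import re

# Two rows copied from the table in the question, one per line.
text = (
    "|**unscrewed**| - | -Slowly and very carefully he unscrewed the ink bottle, "
    "dipped his quill into it, and began to write,| \n"
    "|**sheets,**| - | -As long as he didn’t leave spots of ink on the sheets, "
    "the Dursleys need never know that he was studying magic by night.| \n"
)

# First pattern: the group stops at the first asterisk or comma.
no_comma = re.findall(r"\|\*{2}([^*,]+)", text)

# Second pattern: the group stops only at an asterisk,
# so a trailing comma has to be stripped afterwards.
with_comma = [
    m.group(1).rstrip(",") for m in re.finditer(r"\|\*{2}([^*]+)", text)
]

print(no_comma)
print(with_comma)
```

With a single capture group, re.findall returns the group contents directly, which is why no extra unpacking is needed in the first case.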

Source

2017-02-26 17:42:32 wwii

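Yes, of course, I would be glad to, but I don't know how to accept it because this is the first time I have posted a question here. Would you mind telling me how? –

Thank you, wwii. –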

import re

# Raw string, so the backslashes reach the regex engine unchanged.
y = r'(?<=\|\*{2}).+?(?=,{0,1}\*{2}\|)'
reg = re.compile(y)
a = '| --------- | -------- | --------- | |**propped**| - | -a flashlight in one hand and a large leather-bound book (A History of Magic by Bathilda Bagshot) propped open against the pillow. | |**Pointless**| - | -“Witch Burning in the Fourteenth Century Was Completely Pointless — discuss.”|'
print(reg.findall(a))

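Explanation of the regular expression (y) above:

(?<=\|\*{2}) - a positive lookbehind: it matches if the current position in the string is preceded by a match for \|\*{2}, i.e. |**

.+? - matches any character (except a newline) one or more times. Adding ? after the quantifier makes the match non-greedy (minimal), so as few characters as possible are matched.

(?=,{0,1}\*{2}\|) - ?= is a positive lookahead: it matches only if the current position is followed by the given expression, without consuming it. Here the expression is ,{0,1}\*{2}\|, meaning zero or one , then two * and a closing |.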
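The same three pieces can also be written with re.VERBOSE so that each part carries its own comment. This is just an alternative layout of the pattern explained above, not a different technique; the sample `line` is one row from the question's table and the variable names are illustrative.

```python
import re

pattern = re.compile(
    r"""
    (?<=\|\*{2})        # lookbehind: must be preceded by |**
    .+?                 # one or more characters, as few as possible
    (?=,{0,1}\*{2}\|)   # lookahead: optional comma, then **|
    """,
    re.VERBOSE,
)

# One row copied from the table in the question.
line = (
    "|**sheets,**| - | -As long as he didn’t leave spots of ink on the sheets, "
    "the Dursleys need never know that he was studying magic by night.| "
)

found = pattern.findall(line)
print(found)
```

Because the comma sits inside the lookahead, it is checked but not included in the match.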

Source

2017-02-26 17:52:20

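Thanks, Dhruv Baveja. –

@MarcoMei If the solution works for you, could you please upvote it? Thanks –

Hi Dhruv Baveja. Thank you for your kind and useful help; it works very well. Of course I would like to upvote it, but when I try, I get the message "Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score." Is there anything else I can do? –

I wrote the following program to produce the desired output. I created a file named string_test and copied all of the original strings into it: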

import re

# Match |** at the start of a line, then capture up to the next * or comma.
a = re.compile(r"^\|\*\*([^*,]+)")
with open("string_test", "r") as file1:
    for i in file1:
        match = a.search(i)
        if match:
            print(match.group(1))

Source

2017-02-27 09:10:13
