How to find occurrences of words that have two or more vowels in a file using a regular expression?

I can't figure out how to find all the words that contain 2 or more vowels. This is what I have so far, but when I run it, it doesn't give me any output. I'd appreciate the help.

import re

def main():

    in_f = open("jobs-061505.txt", "r")
    read = in_f.read()
    in_f.close()
    for word in read:
        re.findall(r"\b[aAeEiIoOuU]*", read)
        in_f = open("twoVoweledWordList.txt", "w")
        in_f.write(word)
        in_f.close()

    print (word)
main()

I apologize if this isn't the right format.

Source

2014-11-01 GNExia

Consider using '\w' and multiple vowels in the same word. – 2014-11-01 00:21:45

for word in read: <--- iterating over chars in "read"! 
    re.findall(r"\b[aAeEiIoOuU]*", read) <-- using read again, discarding result

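Your iteration and your use of the pattern don't line up: the loop walks over the characters of the string, while the pattern is applied to the whole string each time. Also, you never use the result of re.findall.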
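To make the two problems concrete, here is a minimal sketch. The sample string stands in for the file contents and is an assumption for illustration, not data from the question; the pattern is one possible way to require two vowels inside a run of word characters.

```python
import re

# Hypothetical stand-in for the text returned by in_f.read().
read = "good idea by me"

# Iterating over a string yields one character at a time, not words.
chars = [ch for ch in read]

# re.findall returns a list; it only helps if the result is kept and used.
# This pattern requires two vowels somewhere inside a run of word characters.
words = re.findall(r"\w*[aeiouAEIOU]\w*[aeiouAEIOU]\w*", read)

print(chars[:4])  # ['g', 'o', 'o', 'd']
print(words)      # ['good', 'idea']
```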

Consider processing the file line by line, for example:

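import re

twovowels = re.compile(r".*[aeiou].*[aeiou].*", re.I)  # at least two vowels anywhere in the word
nonword = re.compile(r"\W+", re.U)  # runs of non-word characters separate the words
with open("filename") as file:  # the file is closed automatically
    for line in file:
        for word in nonword.split(line):
            if twovowels.match(word):
                print(word)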

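Source

2014-11-01 00:23:53

Actually 'read' is a string, the entire contents of the file; and Python is strongly typed, so your remark doesn't matter here. – 2014-11-01 00:45:04

You are right, he slurped the whole file earlier and is now iterating over its characters... so the typing problem is in the for loop: it is a 'for char in string' loop rather than a 'for line in file' loop, but because the source code carries no type information, the runtime cannot detect this mistake. – 2014-11-01 11:42:06

What information is lacking? Unlike languages such as C/C++, in Python you can always determine an object's type, and you cannot change the type of an object. To my mind that is a very strong type system... Now you are trying to deduce some nonsense from the fact that Python has no 'char' type, because a single character is just a string of length 1. Please stop. What you are doing can confuse newcomers and give them the wrong impression of what strong/weak and dynamic/static type systems are. – 2014-11-01 19:31:25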

Use the re.findall function to find all the words that contain at least two vowels:

>>> s = """foo bar hghghg ljklj jfjgf o jgjh aei 
bar oum""" 
>>> re.findall(r'\S*?[aAeEiIoOuU]\S*?[aAeEiIoOuU]\S*', s) 
['foo', 'aei', 'oum'] 
>>> re.findall(r'\w*?[aAeEiIoOuU]\w*?[aAeEiIoOuU]\w*', s) 
['foo', 'aei', 'oum']
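The two patterns give the same result above only because the sample has no punctuation. As a rough sketch of the difference (the sample string below is made up for illustration), \S keeps any non-whitespace character attached to the match, while \w stops at punctuation and hyphens:

```python
import re

# Hypothetical sample containing punctuation and a hyphenated word.
s = "read-on, audio! sky"

# \S matches any non-whitespace character, so punctuation stays in the match.
with_punct = re.findall(r'\S*?[aAeEiIoOuU]\S*?[aAeEiIoOuU]\S*', s)

# \w matches only letters, digits and underscore, so matches end at punctuation.
word_chars_only = re.findall(r'\w*?[aAeEiIoOuU]\w*?[aAeEiIoOuU]\w*', s)

print(with_punct)       # ['read-on,', 'audio!']
print(word_chars_only)  # ['read', 'audio']
```

Which one is preferable depends on whether "read-on" should count as one word or two.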

Source

2014-11-01 00:23:56


I suggest using the following command:

re.findall(r'\S*[aAeEiIoOuUyY]\S*[aAeEiIoOuUyY]\S*', str)

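where str is the string in which you are looking for words with two or more vowels.

Regex explanation:

\S - this means 'any non-whitespace character'

[aAeEiIoOuUyY] - it stands for any single character listed in the brackets (so 'a' or 'A' or 'e', etc.)

a* - it means that the character preceding the * ('a' in this case) may occur any number of times, including zero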
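Each of the three building blocks can be tried on its own. The following is only a small illustrative sketch; the sample strings are invented for the purpose and are not part of the answer.

```python
import re

# \S matches any single character that is not whitespace.
non_space = re.findall(r"\S", "ab  cd\te")

# A bracketed class matches exactly one of the listed characters.
# Note that this answer's class also counts 'y' and 'Y' as vowels.
vowels = re.findall(r"[aAeEiIoOuUyY]", "Why not")

# * allows zero or more repetitions of the item just before it.
runs = re.findall(r"ba*", "b ba baaa")

print(non_space)  # ['a', 'b', 'c', 'd', 'e']
print(vowels)     # ['y', 'o']
print(runs)       # ['b', 'ba', 'baaa']
```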

Example:

String:

str = "aaa bbb abb koo llk tr"

Python code:

import re
re.findall(r'\S*[aAeEiIoOuUyY]\S*[aAeEiIoOuUyY]\S*', str)

Output:

['aaa', 'koo']

Source

2014-11-01 00:24:58 Eenoku

a = 'hello how are you'
[x for x in a.split(' ') if len(re.findall('[aeiouAEIOU]', x)) >= 2]

Modification to your code:

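import re

def main():
    # Read the whole file into a single string.
    with open("jobs-061505.txt", "r") as in_f:
        read = in_f.read()
    # Keep every word in which at least two vowels are found.
    words = [x for x in re.findall(r'\w+', read) if len(re.findall('[aeiouAEIOU]', x)) >= 2]
    print(words)

main()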

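In the code above, read() reads the whole file as a string. re.findall('\w+', read) gives you the list of words. For each word, re.findall('[aeiouAEIOU]', x) returns the list of vowels it contains; if the length of that list is greater than or equal to 2, the word is stored in the resulting list. Now you can do anything you like with the output.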
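Since the question also wants the matching words written to a file, one possible way to finish the job is sketched below. The function names and the one-word-per-line output format are assumptions made for illustration; only the file names come from the question.

```python
import re

VOWEL = re.compile(r"[aeiou]", re.IGNORECASE)

def words_with_two_vowels(text):
    """Return every word in text that contains at least two vowels."""
    return [w for w in re.findall(r"\w+", text) if len(VOWEL.findall(w)) >= 2]

def write_two_vowel_words(in_path, out_path):
    """Read in_path and write the matching words to out_path, one per line."""
    with open(in_path, "r") as in_f:
        text = in_f.read()
    words = words_with_two_vowels(text)
    # Open the output file once, outside any loop, so earlier words are not overwritten.
    with open(out_path, "w") as out_f:
        out_f.write("\n".join(words))
    return words

# Example call with the file names from the question:
# write_two_vowel_words("jobs-061505.txt", "twoVoweledWordList.txt")
```

Opening the output file once matters here: the loop in the question reopened it in "w" mode on every iteration, which truncates the file each time.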

Source

2014-11-01 00:32:15 Hackaholic
