Extracting multiple values from a string in Python using regular expressions

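I have multiple strings that look like this: product: green apples price: 2.0 country: france company: somecompany. Some strings may have fewer fields; for example, some are missing the company name or the country. I want to extract only the values and skip the labels product, price, country and company. I tried to create several regular expressions, each starting from the left-hand side of the string.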

blah="product: green apples price: 2.0 country: france company: somecompany" 

product_reg = re.compile(r'.*?\bproduct\b:(.*).*') 
product_reg_strip = re.compile(r'(.*?)\s[a-z]:?') 

product_full=re.findall(product_reg, blah) 
prod=re.find(product_reg_strip, str(product_full)) 
print prod 

price_reg = re.compile(r'.*?\bprice\b:(.*).*') 
price_reg_strip = re.compile(r'(.*?)\s[a-z]:?') 

price_full=re.findall(price_reg, blah) 
price=re.find(price_reg_strip, str(price_full)) 
print price

But this does not work. What can I do to make this regex more robust?

Source

2017-04-20 pauts

Is the price the only numeric value in each string?

What output do you want? For your example, is it "green apples 2.0 france somecompany"? – tdelaney

You could split the string like this:

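import re

# Avoid naming the variable "str", which would shadow the built-in type.
text = "product: green apples price: 2.0 country: france company: somecompany"
p = re.compile(r'(\w+:)')
# The capturing group keeps the labels in the result list.
res = p.split(text)
print(res)
# Labels sit at odd indices; each value follows its label.
for i in range(1, len(res), 2):
    print(res[i], '==>', res[i + 1].strip())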

Output:

['', 'product:', ' green apples ', 'price:', ' 2.0 ', 'country:', ' france ', 'company:', ' somecompany'] 

product: ==> green apples 
price: ==> 2.0 
country: ==> france 
company: ==> somecompany
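
If a lookup by field name is more convenient than printing, the same split can feed a dictionary. This is only a minimal sketch: the helper name split_fields is hypothetical, and it assumes that no value itself contains a word immediately followed by a colon (a URL, for instance, would be split incorrectly).

```python
import re

def split_fields(text):
    """Map each label (without its colon) to its stripped value."""
    # The capturing group keeps the labels in the list returned by split().
    parts = re.split(r'(\w+):', text)
    # parts[0] is whatever precedes the first label; labels sit at odd indices.
    return {parts[i]: parts[i + 1].strip() for i in range(1, len(parts), 2)}

fields = split_fields("product: green apples price: 2.0 country: france company: somecompany")
print(fields.get("price"))    # a missing field would give None here
```

Strings with fewer fields simply produce a smaller dictionary, so dict.get can be used to test for an absent country or company.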

Source

2017-04-20 16:10:01 Toto

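I'm not entirely sure what you're after, but if you want to remove every word that is followed by a colon, the regex is quite simple. Here are a couple of samples.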

>>> import re 
>>> blah="product: green apples price: 2.0 country: france company: somecompany" 
>>> re.sub(r'\w+: ?', '', blah) 
'green apples 2.0 france somecompany' 
>>> re.split(r'\w+: ?', blah)[1:] 
['green apples ', '2.0 ', 'france ', 'somecompany']
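
The \w+: pattern treats any word followed by a colon as a label. If the set of labels is known in advance, a stricter variant may be safer. The sketch below assumes the four labels from the question (the LABELS tuple and the extract_values name are illustrative, not part of the answer above) and uses a lookahead to end each value at the next known label or at the end of the string, stripping the surrounding whitespace along the way.

```python
import re

# Assumed label list, taken from the question's example; adjust as needed.
LABELS = ("product", "price", "country", "company")
_alt = "|".join(LABELS)
# A known label, a colon, then a lazily matched value that stops just before
# the next known label or the end of the string.
FIELD_RE = re.compile(r'\b(%s):\s*(.*?)\s*(?=\b(?:%s):|$)' % (_alt, _alt))

def extract_values(text):
    """Return the values in the order they appear, without their labels."""
    return [value for _label, value in FIELD_RE.findall(text)]

print(extract_values("product: green apples price: 2.0 country: france company: somecompany"))
```

Because of the \b before each label, the word "somecompany" is not mistaken for the company label.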

Source

2017-04-20 16:16:41 tdelaney

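You can simply use a regex with named groups and read the results by name. The price, country and company groups are optional, so the regex still matches when some of those values are missing. Try it on regex101.com https://regex101.com/r/iccVUv/1/ with the global and multiline flags and this regex: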

^(?:product:(?P<product>.*?))(?:price:(?P<price>.*?))?(?:country:(?P<country>.*?))?(?:company:(?P<company>.*))?$

In Python you can do, for example:

import re

# Raw string so that backslashes, if added later, are not treated as escapes.
pattern = r'^(?:product:(?P<product>.*?))(?:price:(?P<price>.*?))?(?:country:(?P<country>.*?))?(?:company:(?P<company>.*))?$'
matches = re.search(pattern, 'product: green apples price: 2.0 country: italy company: italian company')

Now you can get the data simply with:

product = matches.group('product')

Finally, just check whether the group matched and strip the whitespace:

if matches.group('product') is not None: 
    product = matches.group('product').strip()
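
The per-field check can be folded into one helper with the match object's groupdict() method. This is a rough sketch rather than a definitive implementation: parse_record is a hypothetical name, and the pattern is the one above, so it assumes that product is always present and that the fields appear in the order product, price, country, company.

```python
import re

PATTERN = re.compile(
    r'^(?:product:(?P<product>.*?))'
    r'(?:price:(?P<price>.*?))?'
    r'(?:country:(?P<country>.*?))?'
    r'(?:company:(?P<company>.*))?$'
)

def parse_record(text):
    """Return the fields that matched, stripped; an empty dict if nothing matched."""
    m = PATTERN.search(text)
    if m is None:
        return {}
    # groupdict() maps every named group to its text, or None if it did not take part.
    return {name: value.strip() for name, value in m.groupdict().items() if value is not None}

print(parse_record('product: green apples country: italy'))
```

If the fields can arrive in a different order, this pattern will fold a later field into an earlier value, so a split-based approach is likely the better fit in that case.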

Source

2017-04-21 09:55:06
