Splitting letters from numbers within a string

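I'm working with strings like this: "125A12C15". I need to split them at the boundaries between letters and numbers, e.g. this one should become ["125","A","12","C","15"].

Is there a more elegant way to do this in Python than going through the string position by position, checking whether each character is a letter or a digit, and concatenating accordingly? E.g. a built-in function or module for this kind of thing?

Thanks for any pointers! Lastalda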

2013-03-22 Lastalda

The following (SO) post answers your question exactly ;) http://stackoverflow.com/questions/3340081/product-code-looks-like-abcd2343-what-to-split-by-letters-and-numbers gr, M. – Michael 2013-03-22 14:13:01

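Use itertools.groupby together with the str.isalpha method:

Docstring:

groupby(iterable[, keyfunc]) -> create an iterator which returns (key, sub-iterator) grouped by each value of key(value).

Docstring:

S.isalpha() -> bool

Return True if all characters in S are alphabetic and there is at least one character in S, False otherwise.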

In [1]: from itertools import groupby 

In [2]: s = "125A12C15" 

In [3]: [''.join(g) for _, g in groupby(s, str.isalpha)] 
Out[3]: ['125', 'A', '12', 'C', '15']
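
The groupby one-liner above can be wrapped in a small helper. This is only a minimal sketch: the function name `split_alpha_runs` is made up for illustration. Keep in mind that str.isalpha is False for anything that is not a letter, so punctuation or spaces would end up in the same group as neighbouring digits.

```python
from itertools import groupby


def split_alpha_runs(text):
    """Split text into maximal runs of letters and non-letters."""
    # groupby starts a new group each time str.isalpha flips
    # between True and False for consecutive characters.
    return ["".join(group) for _, group in groupby(text, key=str.isalpha)]


print(split_alpha_runs("125A12C15"))
```

If the input can contain characters other than letters and digits, a different key function (or one of the regex approaches) may be a better fit.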

Or perhaps re.findall or re.split from the regular expressions module:

In [4]: import re 

In [5]: re.findall(r'\d+|\D+', s) 
Out[5]: ['125', 'A', '12', 'C', '15'] 

In [6]: re.split(r'(\d+)', s) # note that you may have to filter out the empty 
                              # strings at the start/end if using re.split 
Out[6]: ['', '125', 'A', '12', 'C', '15', ''] 

In [7]: re.split(r'(\D+)', s) 
Out[7]: ['125', 'A', '12', 'C', '15']
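
Since re.split can return empty strings at the start or end (as in Out[6]), one way to handle that is to filter them out. A minimal sketch, with `split_digits` being a hypothetical helper name rather than anything from the re module:

```python
import re


def split_digits(text):
    """Split text around runs of digits, keeping the digit runs themselves."""
    # The capturing group makes re.split keep the matched digit runs.
    # Empty strings appear when the text starts or ends with digits,
    # so they are dropped here.
    return [part for part in re.split(r"(\d+)", text) if part]


print(split_digits("125A12C15"))
```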

As for performance, it seems that using a regex is probably faster:

In [8]: %timeit re.findall(r'\d+|\D+', s*1000) 
100 loops, best of 3: 2.15 ms per loop 

In [9]: %timeit [''.join(g) for _, g in groupby(s*1000, str.isalpha)] 
100 loops, best of 3: 8.5 ms per loop 

In [10]: %timeit re.split(r'(\d+)', s*1000) 
1000 loops, best of 3: 1.43 ms per loop
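
Outside IPython, a similar comparison can be run with the standard timeit module. This is a rough sketch; the function names are made up for illustration, and the numbers will differ by machine and Python version, so treat the timings above as indicative only.

```python
import re
import timeit
from itertools import groupby

s = "125A12C15" * 1000


def with_findall():
    return re.findall(r"\d+|\D+", s)


def with_groupby():
    return ["".join(g) for _, g in groupby(s, str.isalpha)]


def with_split():
    return re.split(r"(\d+)", s)


for func in (with_findall, with_groupby, with_split):
    # Best of 3 repeats of 20 calls each, reported as time per call.
    best = min(timeit.repeat(func, number=20, repeat=3)) / 20
    print(f"{func.__name__}: {best * 1e3:.3f} ms per call")
```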


2013-03-22 14:12:22 root

're.findall' works nicely, thanks! – Lastalda 2013-03-25 14:19:48
