2016-12-15

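Split a Pandas DataFrame column into two columns

I am working on a simple web scrape and DataFrame project. I have a simple 8x1 DataFrame that I am trying to split into an 8x2 DataFrame. So far, this is what my DataFrame looks like: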

dframe = DataFrame(data, columns=['Active NPGL Teams'], index=[1, 2, 3, 4, 5, 6, 7, 8]) 
Active NPGL Teams 
1 Baltimore Anthem (2015–present) 
2 Boston Iron (2014–present) 
3 DC Brawlers (2014–present) 
4 Los Angeles Reign (2014–present) 
5 Miami Surge (2014–present) 
6 New York Rhinos (2014–present) 
7 Phoenix Rise (2014–present) 
8 San Francisco Fire (2014–present) 

I want to add a column, "Years Active", and split the "(2014–present)" and "(2015–present)" parts out into that column. How do I split my data?

Answers

You can use Series.str.split:

dframe['Active NPGL Teams'].str.split(r' (?=\()', expand=True) 
    0    1 
1 Baltimore Anthem (2015–present) 
2   Boston Iron (2014–present) 
3   DC Brawlers (2014–present) 
4 Los Angeles Reign (2014–present) 
5   Miami Surge (2014–present) 
6  New York Rhinos (2014–present) 
7  Phoenix Rise (2014–present) 
8 San Francisco Fire (2014–present) 

The key is the regular expression r' (?=\()', which matches a space only when it is followed by an opening parenthesis (a lookahead assertion).
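To see what the lookahead does in isolation, here is a minimal sketch using only Python's standard `re` module on one of the strings from the question. The variable names are illustrative and not part of the original answer.

```python
import re

# Lookahead: match a single space only if the next character is "(".
# The "(" itself is not consumed, so it stays in the second piece.
pattern = re.compile(r" (?=\()")

name = "Los Angeles Reign (2014–present)"
parts = pattern.split(name)
print(parts)  # ['Los Angeles Reign', '(2014–present)']
```

The spaces inside the team name are left alone because none of them is followed by an opening parenthesis.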


Another way (about 5% slower, but more flexible) is to use Series.str.extract:

dframe['Active NPGL Teams'].str.extract(r'^(?P<Team>.+) (?P<YearsActive>\(.+\))$', 
             expand=True) 
    Team  YearsActive 
1 Baltimore Anthem (2015–present) 
2   Boston Iron (2014–present) 
3   DC Brawlers (2014–present) 
4 Los Angeles Reign (2014–present) 
5   Miami Surge (2014–present) 
6  New York Rhinos (2014–present) 
7  Phoenix Rise (2014–present) 
8 San Francisco Fire (2014–present) 
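To end up with the two-column frame the question asks for, the extracted pieces can be assigned back to the original DataFrame. The following is a rough sketch, assuming `data` is a plain list of strings like the scraped values (only three sample rows are used here) and that the new column should be called "Years Active".

```python
import pandas as pd

# Assumption: `data` stands in for the scraped list from the question.
data = [
    "Baltimore Anthem (2015–present)",
    "Boston Iron (2014–present)",
    "DC Brawlers (2014–present)",
]
dframe = pd.DataFrame(data, columns=["Active NPGL Teams"], index=[1, 2, 3])

# Named groups become the column names of the extracted frame.
parts = dframe["Active NPGL Teams"].str.extract(
    r"^(?P<Team>.+) (?P<YearsActive>\(.+\))$", expand=True
)

# Keep only the team name in the original column and add the new column.
dframe["Active NPGL Teams"] = parts["Team"]
dframe["Years Active"] = parts["YearsActive"]
print(dframe)
```

Rows that do not match the pattern would come back as NaN in both columns, so it may be worth checking for missing values before overwriting the original column.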

Love this! Wonderful! :) I didn't know about this `pandas` feature. – quapka

Thanks, this works! I never would have thought of this. –