Use a list comprehension with groupby:
import pandas as pd
from itertools import groupby

df = pd.DataFrame({'a':[['Andhra Pradesh-133', 'Meetai-1358', 'Meetai-2146', 'Meetai-2277'],
                        ['Andhra Pradesh-20', 'Rajasthan-60', 'Rajasthan-70']]})

data = []
for x in df['a']:
    #split each 'name-number' string into [name, number]
    b = [a.split('-') for a in x]
    #groupby merges consecutive items with the same name, then sum their numbers
    L = [t for k, g in groupby(b, key=lambda x: x[0])
           for t in [k + '-' + str(sum((int(j) for i, j in g)))]]
    data.append(L)

print (data)
[['Andhra Pradesh-133', 'Meetai-5781'], ['Andhra Pradesh-20', 'Rajasthan-130']]
df['b'] = data
print (df)
                                                   a  \
0  [Andhra Pradesh-133, Meetai-1358, Meetai-2146,...
1    [Andhra Pradesh-20, Rajasthan-60, Rajasthan-70]

                                    b
0   [Andhra Pradesh-133, Meetai-5781]
1  [Andhra Pradesh-20, Rajasthan-130]
EDIT:
data = []
for line in open('file.csv'):
    #strip new-line characters, split by [ and take the part after it
    items = line.strip('\r\n" ]').split('[')[1]
    #split into items, remove quotes and whitespace
    items = [item.strip("' ") for item in items.split(',')]
    #split each item into a [name, number] sublist
    items = [a.split('-') for a in items]
    #sum the numbers of consecutive sublists with the same name
    items = [t for k, g in groupby(items, key=lambda x: x[0])
               for t in [k + '-' + str(sum((int(j) for i, j in g)))]]
    data.append(items)

print (data)
[['Andhra Pradesh-133', 'Meetai-5781'], ['Andhra Pradesh-20', 'Rajasthan-130']]
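If the bracketed part of each line is a valid Python list literal, ast.literal_eval can replace the manual strip/split steps. This is only a sketch: the layout of file.csv is not shown here, so the sample line below is a hypothetical stand-in for one line of the file.

```python
import ast
from itertools import groupby

# hypothetical sample line; the real layout of file.csv is an assumption
line = '''0,"['Andhra Pradesh-20', 'Rajasthan-60', 'Rajasthan-70']"'''

# take everything from the first '[' to the last ']' and parse it as a list
start, end = line.index('['), line.rindex(']') + 1
items = ast.literal_eval(line[start:end])

# split each 'name-number' string, then sum consecutive items with the same name
pairs = [item.split('-') for item in items]
summed = [k + '-' + str(sum(int(n) for _, n in g))
          for k, g in groupby(pairs, key=lambda p: p[0])]
print(summed)
# ['Andhra Pradesh-20', 'Rajasthan-130']
```

ast.literal_eval only evaluates literals, so it is safe to use on file content, but it raises an error if the slice is not a well-formed list.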
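EDIT: if the lines of the input file contain a double [[:
Solution:
You need to split on the first occurrence of [ and then strip [ and ] too: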
data = []
for line in open('file.csv'):
    #strip new-line characters, split on the first [ only and take the part after it
    items = line.strip('\r\n" ]').split('[', 1)[1]
    #split into items, remove quotes, brackets and whitespace
    items = [item.strip("'[] ") for item in items.split(',')]
    #split each item into a [name, number] sublist
    items = [a.split('-') for a in items]
    print (items)
    #sum the numbers of consecutive sublists with the same name
    items = [t for k, g in groupby(items, key=lambda x: x[0])
               for t in [k + '-' + str(sum((int(j) for i, j in g)))]]
    data.append(items)
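I have a small doubt here. If I consider X = ['Panjim-20', 'Uttar Pradesh-23185', “ Gujurat-1013', 'Uttar Pradesh-51'], the groupby function does not seem to work. With b = [a.split('-') for a in x], the loop for k, g in groupby(b, key=lambda x: x[0]): does not group the 'uttar Pradesh' entries together, even though 'uttar Pradesh' is the same. Can you help us understand what is missing? –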
I think the problem is the double '[['. I edited the answer. – jezrael
Apologies for the typo in the list I am trying to process: x = ['panjim-20','Uttar Pradesh-23185','Gujurat-1013','Uttar Pradesh-51']. –
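On the list from the comment above: itertools.groupby only merges consecutive items with the same key, so the two 'Uttar Pradesh' entries, which are separated by 'Gujurat-1013', end up in separate groups. A minimal sketch of one way around this, assuming it is acceptable for the result to come back sorted by name (the helper name sum_by_name is made up for illustration):

```python
from itertools import groupby

def sum_by_name(items):
    """Sum the numeric suffix of 'name-number' strings per name."""
    # split on the last '-' so each item becomes [name, number]
    pairs = [item.rsplit('-', 1) for item in items]
    # groupby only groups adjacent equal keys, so sort by name first
    pairs.sort(key=lambda p: p[0])
    return [name + '-' + str(sum(int(n) for _, n in group))
            for name, group in groupby(pairs, key=lambda p: p[0])]

x = ['panjim-20', 'Uttar Pradesh-23185', 'Gujurat-1013', 'Uttar Pradesh-51']
print(sum_by_name(x))
# ['Gujurat-1013', 'Uttar Pradesh-23236', 'panjim-20']
```

Sorting changes the order of the output, and the comparison is case-sensitive, so 'uttar Pradesh' and 'Uttar Pradesh' would still count as different names.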