2015-08-09 65 views

Unwanted "array" in output, pandas

I have the following code, which gives me almost exactly what I want:

def stateCountAsList(filepath,state): 

    import pandas as pd 
    pd.set_option('display.width',200) 

    import numpy as np 

    dataFrame = pd.read_csv(filepath,header=0,sep='\t') 
    df = dataFrame.iloc[0:638,:] 

    dfState = df[df['State']== state] 
    yearList = range(1999,2012) 
    countsList =[] 

    for year in yearList:  # for every year in the range
        if year in dfState['Year'].tolist():  # if the year is in the list of years for the selected state
            value = dfState[(dfState.Year == year)]
            countsList.append(value.Count.values)
        else:
            countsList.append(np.nan.values)
    print(countsList)
    return countsList 

stateCountAsList('United States Cancer Statistics, 1999-2011 Incidencet.txt' ,'California') 

The problem is that my output should be a plain list, but I get the word "array" everywhere:

[array([ 5561.]), array([ 5588.]), array([ 6059.]), array([ 6043.]), array([ 5958.]), array([ 6566.]), array([ 7160.]), array([ 6780.]), array([ 7327.]), array([ 7585.]), array([ 7483.]), array([ 7635.]), array([ 7735.])] 

How do I remove "array" from my output?

Answers

1

A pandas DataFrame stores its data in NumPy arrays. That is why you see the word "array" in the output. If you want a plain Python list instead of a NumPy array, you can call tolist():

# untested
for year in yearList:  # for every year in the range
    if year in dfState['Year'].tolist():  # if the year is in the list of years for the selected state
        value = dfState[(dfState.Year == year)]
        countsList.append(value.Count.values.tolist())  # plain Python list, e.g. [5561.0]
    else:
        countsList.append([np.nan])  # np.nan is a plain float and has no .values attribute
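
To make the difference concrete, here is a minimal sketch comparing `.values` with `.values.tolist()` on a single filtered row. The small `df_state` frame is a hypothetical stand-in for the asker's `dfState`, not the real data file; only the two counts are borrowed from the output shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the filtered dfState frame from the question.
df_state = pd.DataFrame({
    "State": ["California", "California"],
    "Year": [1999, 2000],
    "Count": [5561.0, 5588.0],
})

# Select the single row for one year, as the loop in the question does.
row = df_state[df_state.Year == 1999]

as_array = row.Count.values          # one-element NumPy array
as_list = row.Count.values.tolist()  # one-element plain Python list

print(type(as_array).__name__)  # ndarray
print(as_list)                  # [5561.0]
```

Note that appending `as_list` still adds a one-element list per year, so the result is a list of lists. If a flat list is wanted, `countsList.extend(...)` or indexing the single element are possible alternatives.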

Cool, got it. Thanks! – madman

0

array is a data structure provided by NumPy, a scientific computing library for Python. Items can be retrieved from arrays and from lists in a similar way.

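Since value.Count.values returns a one-item array, you can append that item to countsList directly instead of the array. Note that np.nan is a plain float with no .values attribute, so it can be appended as is:

countsList.append(value.Count.values[0])
...
countsList.append(np.nan)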
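
A runnable sketch of that loop might look like the following. The function name `state_counts` and the two-row `df_state` frame are hypothetical and only illustrate the idea; the `float()` call is an optional extra that turns the NumPy scalar into a plain Python float.

```python
import numpy as np
import pandas as pd

def state_counts(df_state, years):
    """Return one count per year, with NaN where the year is absent."""
    known_years = set(df_state["Year"].tolist())
    counts = []
    for year in years:
        if year in known_years:
            rows = df_state[df_state.Year == year]
            # Take the first element rather than the whole one-item array.
            counts.append(float(rows.Count.values[0]))
        else:
            # np.nan is already a scalar float, so append it directly.
            counts.append(np.nan)
    return counts

# Hypothetical data: the year 2000 is deliberately missing.
df_state = pd.DataFrame({"Year": [1999, 2001], "Count": [5561.0, 6059.0]})
result = state_counts(df_state, range(1999, 2002))
print(result)
```

This assumes each year appears at most once per state. If a year had several rows, `[0]` would silently keep only the first one.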

Source: http://docs.scipy.org/doc/numpy/reference/arrays.html
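
As an aside, the same flat list can probably be built without an explicit loop. The sketch below is not what either answer proposes: it swaps in pandas' `reindex`, and it assumes the `Year` values are unique within the selected state. The `df_state` frame is hypothetical sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the year 2000 is deliberately missing.
df_state = pd.DataFrame({"Year": [1999, 2001], "Count": [5561.0, 6059.0]})
years = list(range(1999, 2002))

counts = (
    df_state.set_index("Year")["Count"]  # Series of counts indexed by year
    .reindex(years)                      # years not present become NaN
    .tolist()                            # plain Python list of floats
)
print(counts)
```

The trade-off is that `reindex` raises an error on duplicate index labels, so the loop-based version is more forgiving if a year can occur more than once.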