2015-07-11 101 views
import numpy as np 
from pandas import Series, DataFrame 
import pandas as pd 
import matplotlib.pyplot as plt 

iris_df = DataFrame() 

iris_data_path = 'Z:\WORK\Programming\Python\irisdata.csv' 

iris_df = pd.read_csv(iris_data_path,index_col=False,header=None,encoding='utf-8') 

iris_df.columns = ['sepal length','sepal width','petal length','petal width','class'] 

print iris_df.columns.values 
print iris_df.head() 
print iris_df.tail() 
irisX = irisdata[['sepal length','sepal width','petal length','petal width']] 
print irisX.tail() 
irisy = irisdata['class'] 
print irisy.head() 
print irisy.tail() 

colors = ['red','green','blue'] 
markers = ['o','>','x'] 

irisyn = np.where(irisy=='Iris-setosa',0,np.where(irisy=='Iris-virginica',2,1)) 

Col0 = irisdata['sepal length'] 
Col1 = irisdata['sepal width'] 
Col2 = irisdata['petal length'] 
Col3 = irisdata['petal width'] 

plt.figure(num=1,figsize=(16,10)) 
plt.subplot(2,3.1) 
for i in range(len(colors)): 
    xs = Col0[irisyn==i] 
    xy = Col1[irisyn==i] 
    plt.scatter(xs,xy,color=colors[i],marker=markers[i]) 
plt.legend(('Iris-setosa', 'Iris-versicolor', 'Iris-virginica')) 
plt.xlabel(irisdata.columns[0]) 
plt.ylabel(irisdata.columns[1]) 

plt.subplot(2,3,2) 
for i in range(len(colors)): 
    xs = Col0[irisyn==i] 
    xy = Col2[irisyn==i] 
    plt.scatter(xs,xy,color=colors[i],marker=markers[i]) 
plt.xlabel(irisdata.columns[0]) 
plt.ylabel(irisdata.columns[2]) 

plt.subplot(2,3,3) 
for i in range(len(colors)): 
    xs = Col0[irisyn==i] 
    xy = Col3[irisyn==i] 
    plt.scatter(xs,xy,color=colors[i],marker=markers[i]) 
plt.xlabel(irisdata.columns[0]) 
plt.ylabel(irisdata.columns[3]) 

plt.subplot(2,3,4) 
for i in range(len(colors)): 
    xs = Col1[irisyn==i] 
    xy = Col2[irisyn==i] 
    plt.scatter(xs,xy,color=colors[i],marker=markers[i]) 
plt.xlabel(irisdata.columns[1]) 
plt.ylabel(irisdata.columns[2]) 

plt.subplot(2,3,5) 
for i in range(len(colors)): 
    xs = Col1[irisyn==i] 
    xy = Col3[irisyn==i] 
    plt.scatter(xs,xy,color=colors[i],marker=markers[i]) 
plt.xlabel(irisdata.columns[1]) 
plt.ylabel(irisdata.columns[3]) 

plt.subplot(2,3,6) 
for i in range(len(colors)): 
    xs = Col2[irisyn==i] 
    xy = Col3[irisyn==i] 
    plt.scatter(xs,xy,color=colors[i],marker=markers[i]) 
plt.xlabel(irisdata.columns[2]) 
plt.ylabel(irisdata.columns[3]) 
plt.show() 

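This code is from Howard Bandy's book Quantitative Technical Analysis. The problem is that even though I typed it in exactly as it appears in the book, it still gives me errors.

I still get the "Series imported but unused" and "undefined name irisdata" errors/warnings.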

This is the console output:

runfile('Z:/WORK/Programming/Python/Scripts/irisplotpairsdata2.py', wdir='//AMN/annex/WORK/Programming/Python/Scripts') 
['sepal length' 'sepal width' 'petal length' 'petal width' 'class'] 
    sepal length sepal width petal length petal width  class 
0   5.1   3.5   1.4   0.2 Iris-setosa 
1   4.9   3.0   1.4   0.2 Iris-setosa 
2   4.7   3.2   1.3   0.2 Iris-setosa 
3   4.6   3.1   1.5   0.2 Iris-setosa 
4   5.0   3.6   1.4   0.2 Iris-setosa 
    sepal length sepal width petal length petal width   class 
145   6.7   3.0   5.2   2.3 Iris-virginica 
146   6.3   2.5   5.0   1.9 Iris-virginica 
147   6.5   3.0   5.2   2.0 Iris-virginica 
148   6.2   3.4   5.4   2.3 Iris-virginica 
149   5.9   3.0   5.1   1.8 Iris-virginica 
Traceback (most recent call last): 

    File "<ipython-input-100-f0b2002668bd>", line 1, in <module> 
    runfile('Z:/WORK/Programming/Python/Scripts/irisplotpairsdata2.py', wdir='//AMN/annex/WORK/Programming/Python/Scripts') 

    File "C:\MyPrograms\Spyder(Python)\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile 
    execfile(filename, namespace) 

    File "C:\MyPrograms\Spyder(Python)\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile 
    exec(compile(scripttext, filename, 'exec'), glob, loc) 

    File "Z:/WORK/Programming/Python/Scripts/irisplotpairsdata2.py", line 24, in <module> 
    irisX = irisdata[['sepal length','sepal width','petal length','petal width']] 

TypeError: list indices must be integers, not list 

Obviously, the program does not run.

I am using Spyder with Python 2.7. That is the platform the author uses in the book.

Thanks for any insight.

Answers


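Well, nothing is wrong with Python itself. You imported Series but never used it; that is only a warning and does not cause a crash. The crash happens because you are referencing the variable irisdata, which is never defined in the script. (Press Ctrl+F in your code, search for irisdata, and take a look.) Judging by your code, irisdata probably needs to hold the data parsed from Z:\WORK\Programming\Python\irisdata.csv, doesn't it? So you need to parse the file and assign the result to irisdata. See this post.

For example: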

import csv 
... 
irisdata = list(csv.reader(open(iris_data_path, 'rb'))) 
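One thing worth noting: the traceback in the question is a TypeError ("list indices must be integers, not list") rather than a NameError, which suggests that irisdata already existed in the Spyder console as a plain list, quite possibly left over from an earlier run. The following is only a sketch of the difference between a list produced by csv.reader and a DataFrame produced by pd.read_csv. The three-row inline sample is a made-up stand-in for irisdata.csv, and the code is written for Python 3 with a current pandas rather than the Python 2.7 setup in the question.

```python
import csv
import io

import pandas as pd

# Hypothetical three-row sample standing in for irisdata.csv.
sample = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "6.4,3.2,4.5,1.5,Iris-versicolor\n"
    "5.9,3.0,5.1,1.8,Iris-virginica\n"
)
cols = ['sepal length', 'sepal width', 'petal length', 'petal width', 'class']

# csv.reader yields plain lists of strings; a list cannot be indexed
# with a list of column names, so this raises TypeError.
rows = list(csv.reader(io.StringIO(sample)))
try:
    rows[['sepal length', 'sepal width']]
    list_error = None
except TypeError as exc:
    list_error = exc

# pd.read_csv returns a DataFrame, which does accept a list of column labels.
iris_df = pd.read_csv(io.StringIO(sample), header=None, names=cols)
irisX = iris_df[['sepal length', 'sepal width', 'petal length', 'petal width']]

print(type(rows).__name__, type(irisX).__name__)
```

So if the rest of the script indexes by column name, the object it indexes has to be a DataFrame, not the list of rows that csv.reader returns.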
Hmm... I was thinking it should use Series to parse the csv file, if that makes sense. Like, pd.read_csv should use Series. I am brand new to Python, so I don't know whether that is reasonable. – antiseptic

Maybe you should use this: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.from_csv.html#pandas.Series.from_csv e.g. 'irisdata = Series.from_csv(iris_data_path)' – initialxy

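Actually, on further inspection, I suspect the line should be 'irisX = iris_df[['sepal length','sepal width','petal length','petal width','class']]', where 'iris_df' has already been assigned the parsed data at 'iris_df = pd.read_csv(...)'. I'm sure the script doesn't end here, because the last line seems to be preparing to plot a chart that is going to use 'Series'. All of this is just guesswork on my part. You really should get all the missing information from your book. If you are sure you did not make a mistake, then you should complain loudly to the book's publisher. – initialxy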