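Converting CSV to ARFF
I'm working on a school project on data mining, and we were given CSV data from Kaggle. This is what the data looks like (2 lines out of 6970):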
4,1970,Female,150,DomesticPartnersKids,Bachelor's Degree,Democrat,,Yes,No,No,No,Yes,Public,No,Yes,No,Yes,No,No,Yes,Science,Study first,Yes,Yes,No,No,Receiving,No,No,Pragmatist,No,No,Cool headed,Standard hours,No,Happy,Yes,Yes,Yes,No,A.M.,No,End,Yes,No,Me,Yes,Yes,No,Yes,No,Mysterious,No,No,,,,,,,,,,Mac,Yes,Cautious,No,Umm...,No,Space,Yes,In-person,No,Yes,Yes,No,Yay people!,Yes,Yes,Yes,Yes,Yes,No,Yes,,,,,,,,,,,,,,,,,No,No,No,Only-child,Yes,No,No
5,1997,Male,75,Single,High School Diploma,Republican,,Yes,Yes,No,,Yes,Private,No,No,No,Yes,No,No,Yes,Science,Study first,,Yes,No,Yes,Receiving,No,Yes,Pragmatist,No,Yes,Cool headed,Odd hours,No,Right,Yes,No,No,Yes,A.M.,Yes,Start,Yes,Yes,Circumstances,No,Yes,No,Yes,Yes,Mysterious,No,No,Tunes,Technology,Yes,Yes,Yes,Yes,No,Supportive,No,PC,No,Cautious,No,Umm...,No,Space,No,In-person,No,No,Yes,Yes,Grrr people,Yes,No,No,No,No,No,No,Yes,No,No,Yes,No,Own,Pessimist,Mom,No,No,No,No,Nope,Yes,No,No,No,Yes,No,Yes,No,Yes,No
We have to get it into .arff format so that we can use it in Weka. I entered the header manually (107 attributes):
@ATTRIBUTE user_id NUMERIC
@ATTRIBUTE yob NUMERIC
@ATTRIBUTE gender {Male,Female}
@ATTRIBUTE income {150,100,75,50,25,10}
@ATTRIBUTE householdstatus {MarriedKids,Married,DomesticPartnersKids,DomesticPartners,Single,SingleKids}
@ATTRIBUTE educationlevel {Bachelor's Degree,High School Diploma,Current K-12,Current Undergraduate,Master's Degree,Associate's Degree,Doctoral Degree}
@ATTRIBUTE party {Democrat,Republican}
@ATTRIBUTE Q124742 {Yes,No}
@ATTRIBUTE Q124122 {Yes,No}
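and I get this error:
} expected at end of enumeration, read Token[EOL]
Then I tried to use the Weka converter, but it gives me an error as well:
Wrong number of values. Read 2, expected 1, read Token[EOL], line 4. Problem encountered on line: 3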
Which Kaggle project? If I can get the data file, I'll give it a try. – zbicyclist
[Link](https://inclass.kaggle.com/c/can-we-predict-voting-outcomes). Thanks for your response. – candy
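A likely cause of both errors (not confirmed against the full file) is the value `Bachelor's Degree`: ARFF treats an apostrophe as the start of a quoted string, and values containing spaces have to be quoted anyway, so the parser runs to the end of the line looking for the closing quote. Below is a minimal Python sketch of a converter that quotes such values and writes `?` for empty cells. It assumes the CSV has a header row; the names `quote_arff`, `csv_to_arff` and `numeric_columns` are made up for this illustration and are not part of Weka.

```python
import csv
import io


def quote_arff(value):
    """Return value as an ARFF token, quoting it when Weka would need that."""
    if value == "":
        return "?"  # ARFF marker for a missing value
    if any(ch in value for ch in " ,'\"{}%\\\t"):
        # Escape backslashes first, then single quotes, then wrap in quotes.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return value


def csv_to_arff(csv_text, relation, numeric_columns=()):
    """Convert CSV text (with a header row) to ARFF text.

    Columns listed in numeric_columns become NUMERIC; every other column
    becomes a nominal attribute built from the values seen in the data.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    lines = ["@RELATION " + quote_arff(relation), ""]
    for i, name in enumerate(header):
        if name in numeric_columns:
            lines.append("@ATTRIBUTE " + quote_arff(name) + " NUMERIC")
        else:
            # Distinct non-empty values, kept in first-seen order.
            distinct = dict.fromkeys(row[i] for row in data if row[i] != "")
            nominal = ",".join(quote_arff(v) for v in distinct)
            lines.append("@ATTRIBUTE " + quote_arff(name) + " {" + nominal + "}")

    lines += ["", "@DATA"]
    for row in data:
        lines.append(",".join(quote_arff(cell) for cell in row))
    return "\n".join(lines) + "\n"
```

For the real file you would read the CSV text from disk, pass `numeric_columns=("user_id", "yob")` (adjusted to the actual header names), and write the returned string to a `.arff` file. If you prefer to keep the hand-written header, the same idea applies: write the value as `'Bachelor\'s Degree'` and put quotes around every nominal value that contains a space, both in the header and in the data rows.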