2013-04-28

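Nominal attributes in an ARFF file created by the arff library in Python

The dump command in Python's arff library lets the user create an ARFF file from a given input. For example, the command: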

arff.dump("outputDir", data, relation="relation1",
          names=['age', 'fatRatio', 'hairColor'])

produces the following ARFF:

@relation relation1 
@attribute age real
@attribute fatRatio real
@attribute hairColor string
@data 
10,0.2,black 
22,10,yellow 
30,2,black 

given the data:

data = [[10,0.2,'black'],[22,10,'yellow'],[30,2,'black']] 

My question is: how do I tell the library that hairColor is a nominal attribute? That is, I want my ARFF header to look like this:

@relation relation1 
@attribute age real
@attribute fatRatio real
@attribute hairColor **nominal**
@data 
... 
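
For reference, the ARFF format has no `nominal` keyword: a nominal attribute is declared by listing its allowed values in braces, so the target line would read `@attribute hairColor {black,yellow}`. The sketch below derives that line from the data above. It does not use the arff library, and `nominal_declaration` is a hypothetical helper name chosen for illustration.

```python
data = [[10, 0.2, 'black'], [22, 10, 'yellow'], [30, 2, 'black']]


def nominal_declaration(name, rows, column):
    """Build an ARFF nominal attribute line from the distinct values in a column.

    Hypothetical helper, not part of the arff library.
    """
    seen = []
    for row in rows:
        # Keep first-seen order so the output is deterministic.
        if row[column] not in seen:
            seen.append(row[column])
    return '@attribute %s {%s}' % (name, ','.join(str(v) for v in seen))


line = nominal_declaration('hairColor', data, 2)
print(line)
```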

Answer

There are several different ways to do this, outlined here:

https://code.google.com/p/arff/wiki/Documentation

The approach I find best is the second one, which suggests:

arff_writer = arff.Writer(fname, relation='diabetics_data', names=names)
arff_writer.pytypes[arff.nominal] = '{not_parasite,parasite}' 
arff_writer.write([arff.nominal('parasite')]) 

If you look at the code for arff.nominal, it is defined like this:

class Nominal(str): 
    """Use this class to wrap strings which are intended to be nominals 
    and shouldn't have enclosing quote signs.""" 
    def __repr__(self): 
        return self
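
To see what this wrapper changes, the short sketch below compares `repr()` on a plain string and on a wrapped one. It redefines the class locally so that it runs without the arff library. That the writer formats values through `repr()` is an assumption based on the docstring, not something confirmed here.

```python
class Nominal(str):
    """String wrapper whose repr() omits the surrounding quotes."""
    def __repr__(self):
        return self


plain = 'black'
wrapped = Nominal('black')

print(repr(plain))    # a plain str is shown with quote signs
print(repr(wrapped))  # the wrapper is shown without them
```

The wrapped value is still a `str`, so it behaves like any other string everywhere else.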

So I ended up creating a separate "wrapper" class for each of my nominal attributes, like this:

class ZipCode(str): 
    """Use this class to wrap strings which are intended to be nominals 
    and shouldn't have enclosing quote signs.""" 
    def __repr__(self): 
        return self

Then, following the code above, you can do something like this:

arff_writer = arff.Writer(fname, relation='neighborhood_data', names=names)
arff_writer.pytypes[type(myZipCodeObject)] = '{85104,84095}' 
# then write out the rest of your attributes... 

arff_writer.write([arff.nominal('parasite')])
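
The reason one wrapper class per attribute works is that each class is a distinct type, so a type-keyed mapping can hold a different value set for each. The sketch below illustrates this without the arff library. `PYTYPES`, `header_lines`, and `HairColor` are hypothetical names standing in for the writer's internals, and the `int`, `float` and `str` entries are assumed defaults.

```python
class ZipCode(str):
    """Wrapper marking a value as a zip-code nominal."""
    def __repr__(self):
        return self


class HairColor(str):
    """Hypothetical second wrapper, one class per nominal attribute."""
    def __repr__(self):
        return self


# Hypothetical stand-in for the writer's pytypes mapping (type -> ARFF type).
PYTYPES = {
    int: 'integer',
    float: 'real',
    str: 'string',
    ZipCode: '{85104,84095}',
    HairColor: '{black,yellow}',
}


def header_lines(relation, names, first_row):
    """Build ARFF header lines, picking each type from the first data row."""
    lines = ['@relation ' + relation]
    for name, value in zip(names, first_row):
        # type(value) is the exact wrapper class, so each nominal gets its own entry.
        lines.append('@attribute %s %s' % (name, PYTYPES[type(value)]))
    lines.append('@data')
    return lines


lines = header_lines('neighborhood_data',
                     ['age', 'zip', 'hairColor'],
                     [10, ZipCode('85104'), HairColor('black')])
print('\n'.join(lines))
```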