2014-12-02

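MongoDB import succeeds on one machine, fails on another

I am trying to import the KDD-CUP-99 dataset (found here: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html) into MongoDB. I use the following command to do this on one machine: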

mongoimport --db dbName --collection colName --type csv --file kddcup.data.corrected --fieldFile kddcup99header 

When I use findOne() to look at the result, everything looks fine; the output is as follows:

> db.colName.findOne() 
{ 
    "_id" : ObjectId("547c33e376945996ed878f81"), 
    "duration" : 0, 
    "protocol_type" : "tcp", 
    "service" : "http", 
    "flag" : "SF", 
    "src_bytes" : 215, 
    "dst_bytes" : 45076, 
    "land" : 0, 
    "wrong_fragment" : 0, 
    "urgent" : 0, 
    "hot" : 0, 
    "num_failed_logins" : 0, 
    "logged_in" : 1, 
    "num_compromised" : 0, 
    "root_shell" : 0, 
    "su_attempted" : 0, 
    "num_root" : 0, 
    "num_file_creations" : 0, 
    "num_shells" : 0, 
    "num_access_files" : 0, 
    "num_outbound_cmds" : 0, 
    "is_host_login" : 0, 
    "is_guest_login" : 0, 
    "count" : 1, 
    "srv_count" : 1, 
    "serror_rate" : 0, 
    "srv_serror_rate" : 0, 
    "rerror_rate" : 0, 
    "srv_rerror_rate" : 0, 
    "same_srv_rate" : 1, 
    "diff_srv_rate" : 0, 
    "srv_diff_host_rate" : 0, 
    "dst_host_count" : 0, 
    "dst_host_srv_count" : 0, 
    "dst_host_same_srv_rate" : 0, 
    "dst_host_diff_srv_rate" : 0, 
    "dst_host_same_src_port_rate" : 0, 
    "dst_host_srv_diff_host_rate" : 0, 
    "dst_host_serror_rate" : 0, 
    "dst_host_srv_serror_rate" : 0, 
    "dst_host_rerror_rate" : 0, 
    "dst_host_srv_rerror_rate" : 0, 
    "unknown" : "normal." 
} 

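Now I am running the same import on another machine, using the same files and the same command, but it does not work correctly. The result of the import is as follows: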

> db.colName.findOne() 
{ 
    "_id" : ObjectId("547d8f94facff0761ae10688"), 
" : 0, "duration 
" : "tcp",rotocol_type 
" : "http",rvice 
" : "SF",flag 
" : 215,"src_bytes 
" : 45076,st_bytes 
" : 0, "land 
" : 0, "wrong_fragment 
" : 0, "urgent 
" : 0, "hot 
" : 0, "num_failed_logins 
" : 1, "logged_in 
" : 0, "num_compromised 
" : 0, "root_shell 
" : 0, "su_attempted 
" : 0, "num_root 
" : 0, "num_file_creations 
" : 0, "num_shells 
" : 0, "num_access_files 
" : 0, "num_outbound_cmds 
" : 0, "is_host_login 
" : 0, "is_guest_login 
" : 1, "count 
" : 1, "srv_count 
" : 0, "serror_rate 
" : 0, "srv_serror_rate 
" : 0, "rerror_rate 
" : 0, "srv_rerror_rate 
" : 1, "same_srv_rate 
" : 0, "diff_srv_rate 
" : 0, "srv_diff_host_rate 
" : 0, "dst_host_count 
" : 0, "dst_host_srv_count 
" : 0, "dst_host_same_srv_rate 
" : 0, "dst_host_diff_srv_rate 
" : 0, "dst_host_same_src_port_rate 
" : 0, "dst_host_srv_diff_host_rate 
" : 0, "dst_host_serror_rate 
" : 0, "dst_host_srv_serror_rate 
" : 0, "dst_host_rerror_rate 
" : 0, "dst_host_srv_rerror_rate 
    "unknown" : "normal." 
} 

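Since I am using the same data file and the same command, I assume the cause must be something in the environment. The system locale settings are the same, but the import still does not work correctly. Has anyone seen this behavior before?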

Edit: I should add that both machines are running the same version of MongoDB: 2.6.5

Answers

0

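In the end I took the long way round, based on @helmy's answer. I exported the data from the working Mongo instance and imported it into the non-working one.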

1

I would suggest verifying that the files are truly identical on both machines:

md5sum kddcup.data.corrected kddcup99header 

And also verify the version of the mongoimport tool:

mongoimport --version 
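
If the checksums do differ, one thing worth ruling out is line endings. The garbled findOne() output in the question looks like what a terminal shows when key names end in a carriage return, which could happen if kddcup99header has CRLF line endings on the second machine. That is only a guess and is not confirmed anywhere in this thread. Below is a minimal Python sketch for checking it; the function names find_cr_lines and strip_cr are made up for illustration.

```python
def find_cr_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that end with a carriage return."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if line.endswith(b"\r")
    ]


def strip_cr(data: bytes) -> bytes:
    """Convert CRLF line endings to plain LF."""
    return data.replace(b"\r\n", b"\n")


# In-memory sample standing in for a field file saved with CRLF endings.
sample = b"duration\r\nprotocol_type\r\nservice\r\n"
print(find_cr_lines(sample))  # [1, 2, 3]
print(strip_cr(sample))       # b'duration\nprotocol_type\nservice\n'
```

To check the real file, read it in binary mode, for example open("kddcup99header", "rb").read(), and pass the bytes to find_cr_lines. An empty list means the header has no carriage returns and the cause lies elsewhere.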