2016-08-05 64 views
-2

I have attached my dataset below. I tried to read it with fread in R, but every column comes back as character, even though I specified the column classes.

class <- c("character", "character", "numeric", "numeric","numeric", 
"character", "numeric", "numeric", rep("chracter", 5), "numeric",          
"chracter", "character", "factor", "character",  
"character", "character", "character", "character", "factor",           
"numeric", "numeric", "character",rep("numeric", 6), "character",          
"numeric", "factor", rep("numeric", 9) , "character", "numeric",          
"character", "character", "numeric", "numeric", "numeric", "factor",         
"factor", "numeric", "numeric", "factor", rep("numeric", 55)) 

data_q1 <- fread("LoanStats_2016Q1.csv", header = TRUE, skip = 1, nrows = 133887, colClasses = class, fill = TRUE) 

str(data_q1) 

Classes ‘data.table’ and 'data.frame': 133887 obs. of 111 variables: 
$ id       : chr "75577129" "75669195" "75769072" "75991583" ... 
$ member_id      : chr "81011841" "81136933" "81236807" "81482303" ... 
$ loan_amnt      : chr "25000" "4000" "3600" "8000" ... 
$ funded_amnt     : chr "25000" "4000" "3600" "8000" ... 
$ funded_amnt_inv    : chr "25000" "4000" "3600" "8000" ... 

I checked this answer and tried the following:

any(is.na(data_q1[, loan_amnt])) 
[1] FALSE 

The loan_amnt column contains no NA values, so now I don't know what the problem is.

data

+1

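I think you are running into the behaviour described in this doc: 'fread will only promote a column to a higher type if colClasses requests it. It won't downgrade a column to a lower type since NAs would result. You have to coerce such columns afterwards yourself, if you really require data loss.' – Vedda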

+0

But the problem is that the loan_amnt column has no NA values – zhichi

+1

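There may be no NA values as such, but the dataset could contain some other kind of value that stands in for 'NA' (no data); that is hard to tell from this end. I would look through the data visually to see whether missing values are being recorded as something else. – Vedda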
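
The check suggested in the comments above can be sketched in a few lines of base R. This is only an illustration under assumptions: the vector `loan_amnt` below is made-up stand-in data (the real file is not reproduced here), and "n/a" is a hypothetical missing-value marker, not something known to occur in the dataset.

```r
# Hypothetical stand-in for a column that was read as character.
loan_amnt <- c("25000", "4000", "n/a", "8000")

# Try to parse every entry as a number; entries that fail become NA.
parsed <- suppressWarnings(as.numeric(loan_amnt))

# Entries that are present as text but could not be parsed.
bad_values <- unique(loan_amnt[is.na(parsed) & !is.na(loan_amnt)])
print(bad_values)

# Coerce after reading, accepting that unparseable entries turn into NA.
loan_amnt_num <- parsed
str(loan_amnt_num)
```

If `bad_values` comes back empty for a real column, stray text markers are probably not the reason the column stayed character, and the cause is more likely elsewhere.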

Answers

1

You have a typo in your class vector. rep("chracter", 5) should be rep("character", 5), and the same mistake occurs a second time in the same vector.
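
One way to catch this kind of typo before calling fread() is to compare the vector against a list of expected class names. The sketch below is a rough illustration: `col_classes` is a shortened, hypothetical version of the vector from the question, and `known` is an assumed list of the class names intended here, not an exhaustive list of what fread accepts.

```r
# Shortened, hypothetical version of the vector from the question.
col_classes <- c("character", "numeric", rep("chracter", 5), "factor")

# Class names assumed to be the intended ones; extend as needed.
known <- c("character", "numeric", "integer", "factor", "logical")

# Flag entries that are not in the expected list.
bad <- !(col_classes %in% known)
unknown_names <- unique(col_classes[bad])

print(unknown_names)  # the misspelled names
print(which(bad))     # the positions where they occur
```

Once the flagged entries are corrected, the vector can be passed to colClasses as before.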
