2017-06-16

Why doesn't fread accept the skip argument?

I have a .txt dataset in which the first 12 lines are text, followed by 2 blank lines, and then the data:

DATE   HEIGHT INPUT  OUTPUT TESTMEASURE 
01/01/1933 NO RECORD NO RECORD MISSING  MISSING 
01/02/1933 NO RECORD NO RECORD MISSING  MISSING 

However, when I run

dat <- fread('data.txt')

it skips 15 lines and uses the first data row as the column names of the imported dataset. The header row is ignored:

01/01/1933 NO RECORD NO RECORD MISSING  MISSING 

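The skip argument has no effect on the import at all. How can I specify the line number that should be used for the column names? Alternatively, I could rename the columns myself, but then the first data row must not be dropped.

Diagnostics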

Input contains no \n. Taking this to be a filename to open 
File opened, filesize is 0.001319 GB. 
Memory mapping ... ok 
Detected eol as \r\n (CRLF) in that order, the Windows standard. 
Positioned on line 1 after skip or autostart 
This line is the autostart and not blank so searching up for the last non-blank ... line 1 
Detecting sep ... '\t' 
Detected 5 columns. Longest stretch was from line 15 to line 30 
Starting data input on line 15 (either column names or first row of data). First 10 characters: 01/01/1933 
The line before starting line 15 is non-empty and will be ignored (it has too few or too many items to be column names or data): DATE   HEIGHT INPUT OUTPUT TESTMEASURE 
the fields on line 15 are character fields. Treating as the column names. 
Shouldn't it be dat <- fread('data.txt', skip = 15)? – CPak

@ChiPak I need to skip 12 + 2 = 14 lines. But with any value below 15, the imported dataset is unaffected. – maximusdooku

No matter what I skip, the first row imported is 01/02/1933. – maximusdooku

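Answers

You have 12 lines of text, 2 blank lines, and then your data. However, I noticed that there is extra whitespace between DATE and HEIGHT. So I made a tab-delimited text file that looks like your data, with 2 tabs between DATE and HEIGHT instead of 1: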

garbage 
garbage 
garbage 
garbage 
garbage 
garbage 
garbage 
garbage 
garbage 
garbage 
garbage 
garbage 


DATE  HEIGHT INPUT OUTPUT TESTMEASURE 
01/01/1933 NO RECORD NO RECORD MISSING MISSING 
01/02/1933 NO RECORD NO RECORD MISSING MISSING 

Running fread on it gives me:

fread(data) 
    01/01/1933 NO RECORD NO RECORD MISSING MISSING 
1: 01/02/1933 NO RECORD NO RECORD MISSING MISSING 

Removing the extra tab between DATE and HEIGHT gives me:

  DATE HEIGHT  INPUT OUTPUT TESTMEASURE 
1: 01/01/1933 NO RECORD NO RECORD MISSING  MISSING 
2: 01/02/1933 NO RECORD NO RECORD MISSING  MISSING
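
If editing the file by hand is not an option, one possible workaround is to read the header line separately and pass the names to fread yourself. The sketch below is only an illustration under assumptions: it rebuilds a hypothetical copy of the test file above (12 text lines, 2 blank lines, a header on line 15 with a doubled tab after DATE), and the names path, header_line and col_names are made up for the example. Your real file may put the header on a different line, so adjust header_line accordingly.

```r
library(data.table)

# Hypothetical reproduction of the test file: 12 text lines, 2 blank lines,
# a header with a doubled tab after DATE, then tab-separated data rows.
path <- tempfile(fileext = ".txt")
writeLines(c(
  rep("garbage", 12), "", "",
  "DATE\t\tHEIGHT\tINPUT\tOUTPUT\tTESTMEASURE",
  "01/01/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING",
  "01/02/1933\tNO RECORD\tNO RECORD\tMISSING\tMISSING"
), path)

header_line <- 15L  # assumed position of the column names in this layout

# Split the header on runs of tabs so the doubled tab does not yield an empty name.
col_names <- strsplit(readLines(path, n = header_line)[header_line], "\t+")[[1]]

# Skip everything up to and including the header and read the rest as plain data,
# supplying the column names explicitly so no data row is consumed as a header.
dat <- fread(path, skip = header_line, header = FALSE, sep = "\t",
             col.names = col_names)
print(dat)
```

Because header = FALSE is set explicitly, the 01/01/1933 row stays in the data instead of being promoted to column names.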