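SAS - Reading the first and last observation of multiple csv files
I want to read the first and last record of a large number of .csv files (several gigabytes) stored in a folder on a Linux machine. Suppose they are named have1.csv, have2.csv, and so on.
So I tried the code below. It gives me only the first row of each file, but not the last one.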
%let datapath = ~/somefolder/;
data want;
  length finame $300.;
  /*Reference all CSV files in input data folder*/
  infile "&datapath.have*.csv" delimiter=","
    MISSOVER DSD lrecl=32767 firstobs=2
    eov=eov eof=eof filename=finame end=done;
  /*Define input format of variables*/
  informat Var1 COMMA. Var2 COMMA. Var3 COMMA.;
  /*Loop over files*/
  do while(not done);
    /*Set trailing @ to hold the input open for the next input statement
      this is because we have several files */
    input @;
    /*If first line in file is encountered eov is set to 1,
      however, we have firstobs=2, hence all lines would be skipped.
      So we need to reset EOV to 0.*/
    if eov then
    do;
      /*Additional empty input statement
        handles missing value at first loop*/
      input;
      eov = 2;
    end;
    /*First observation*/
    if eov=2 then do;
      input Var1--Var3;
      fname=finame;
      output;
      eov = 0;
    end;
    /*Last observation*/
    if 0 then do;
      eof: input Var1--Var3;
      fname=finame;
      output;
    end;
    input;
  end;
  stop;
run;
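I would really appreciate your help! If I have misunderstood the concepts or the interplay of infile, end, eov, eof and input @, please let me know! I don't know where my mistake is...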
Are you also trying to skip the header rows? Is that what the comment about the FIRSTOBS= option is about? – Tom
Yes, sorry for not replying sooner. –
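One possible way to get both rows, offered as a sketch rather than a tested solution: with a wildcard INFILE, FIRSTOBS= applies to the concatenated input as a whole, so only the header of the first file is skipped, and EOV is set only once the first record of the next file has already been read. The sketch below therefore swaps in a different technique from the one in the question: it drops FIRSTOBS=, EOV= and EOF=, and detects a file change by comparing the FILENAME= variable with a retained copy. It assumes every file starts with exactly one header row, contains no blank trailing lines, and holds three numeric columns named Var1 to Var3 as in the question. `nread` is a hypothetical helper variable introduced only for this illustration.

```sas
%let datapath = ~/somefolder/;

data want (keep=fname Var1-Var3);
  length finame fname $300;
  informat Var1 Var2 Var3 comma.;
  /* Retained values survive into the next iteration, so when a new file
     starts, Var1-Var3 and fname still describe the previous data row. */
  retain fname Var1 Var2 Var3 nread;

  infile "&datapath.have*.csv" dsd truncover lrecl=32767
         filename=finame end=done;

  /* Load the next line and hold it; finame now names its source file. */
  input @;

  if finame ne fname then do;
    /* The held line is the header of a new file (assumed to exist).
       Write the last data row of the previous file, unless that file
       had fewer than two data rows (the first row is already written). */
    if nread > 1 then output;
    fname = finame;
    nread = 0;
  end;
  else do;
    /* Data row: parse the held line. */
    input Var1-Var3;
    nread + 1;
    if nread = 1 then output;   /* first data row of this file */
  end;

  /* END= is set on the very last line of the last file. */
  if done and nread > 1 then output;
run;
```

Every line of every file is still read once, so the run time grows with the total size of the files. A file with a single data row produces one output row instead of two.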