Or, if you want a long-winded, cumbersome approach...
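# If you only want the data and not the header information,
# treat every line that starts with ">" as a comment
x<-read.table("1429271308686.txt",comment.char=">")
# In case all else fails, my somewhat cumbersome solution:
# read the file as whitespace-separated character tokens
# (what="raw" is itself a character string, so it also read character tokens)
x<-scan("1429271308686.txt",what="character")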
# Extract the lengths: ind1 flags the token that follows each "="
ind1<-x=="="
ind1<-c(FALSE,ind1[-length(ind1)]) # shift right by one position
# Each block has two headers (forward and Reverse) carrying a length; keep only one of them
lengths<-as.numeric(x[ind1])[c(TRUE,FALSE)]
# Remove the unwanted tokens
ind2<-x==">"
ind2<-c(FALSE,ind2[-length(ind2)]) # flag the id that follows each ">"
ind3<-x %in% c(">","Len","=","Reverse") # the header keywords themselves
dat<-as.numeric(x[!(ind1|ind2|ind3)]) # keep only the data values
# Arrange as a matrix with four columns
mat<-matrix(dat,ncol=4,byrow=TRUE)
# The number of rows in each block: distance between the id of the forward
# header and the id of the following Reverse header, minus the 5 non-data
# tokens in between ("Len", "=", the length, ">", the Reverse id), over 4 columns
pos<-which(ind2)
block<-(pos[c(FALSE,TRUE)]-pos[c(TRUE,FALSE)]-5)/4
# The id of each block
id<-as.numeric(x[ind2])[c(TRUE,FALSE)]
# Prepend the block id as the first column
mat<-cbind(rep(id,block),mat) # note: this assumes that the last line is again a "> ... Reverse" header
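To try the `read.table` shortcut without the original file, you can write a small stand-in to a temporary file. The layout below (a `> id Len = n` header, rows of four numbers, then a `> id Reverse Len = n` header with no rows of its own) is only an assumption inferred from the code above, not the real data.

```r
# Hypothetical sample; the layout is an assumption inferred from the code above
sample_lines <- c(
  "> 1 Len = 8",
  "1 2 3 4",
  "5 6 7 8",
  "> 1 Reverse Len = 8",
  "> 2 Len = 4",
  "9 10 11 12",
  "> 2 Reverse Len = 4"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Header lines start with ">", so comment.char makes read.table skip them
demo <- read.table(tmp, comment.char = ">")
dim(demo)
```

Note that this route throws the ids and lengths away, which is why the longer token-based version exists.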
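The usual approach is to load the file with 'readLines' and then convert each line to character or numeric as needed. If you search, you will find several questions similar to yours. –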
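A minimal sketch of that `readLines` idea, in case it helps. The helper name `parse_blocks` is made up for illustration, and it assumes that the file starts with a header, that header lines look like `> <id> Len = <n>` or `> <id> Reverse Len = <n>`, and that every other non-empty line holds four numbers.

```r
# Hypothetical helper, not part of any package; the file layout is an assumption
parse_blocks <- function(txt) {
  txt <- trimws(txt)
  txt <- txt[nzchar(txt)]                      # drop empty lines
  is_header <- startsWith(txt, ">")
  # Every data row belongs to the most recent header above it
  header_idx <- cumsum(is_header)
  # Pull the numeric id out of each header line
  header_ids <- as.numeric(sub("^>[[:space:]]*([0-9]+).*$", "\\1", txt[is_header]))
  # Convert the remaining lines to numbers, four per row
  values <- scan(text = txt[!is_header], quiet = TRUE)
  mat <- matrix(values, ncol = 4, byrow = TRUE)
  cbind(id = header_ids[header_idx[!is_header]], mat)
}

# Usage on a made-up sample; in practice: parse_blocks(readLines("1429271308686.txt"))
txt <- c("> 1 Len = 8", "1 2 3 4", "5 6 7 8", "> 1 Reverse Len = 8",
         "> 2 Len = 4", "9 10 11 12", "> 2 Reverse Len = 4")
res <- parse_blocks(txt)
```

Working line by line means rows that follow a Reverse header would simply get that header's id, so there is no need for the token-offset arithmetic.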
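Could you upload or link to a shortened version of the actual data file (.txt, .dat, ...) so that we can have a go at it? – steinbock
Basically, you have to write your own parser if nobody has written one for this file format yet. – Roland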