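Decimating large data files from disk in MATLAB?
I have very large data files in .txt format (typically 30 GB to 60 GB). I would like to find a way to decimate the files automatically without first importing them into memory. My .txt files consist of two columns of data; here is an example file: https://www.dropbox.com/s/87s7qug8aaipj31/RTL5_57.txt
What I have done so far is import the data into a variable "C" and then downsample it. The problem with this approach is that the variable "C" often fills MATLAB's memory before the program gets a chance to decimate: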
function [] = textscan_EPS(N,D,fileEPS)
%fileEPS: .txt address
%N: number of lines to read
%D: Decimation factor
fid = fopen(fileEPS);
format = '%f\t%f';
C = textscan(fid, format, N, 'CollectOutput', true);% this variable exceeds memory capacity
d = downsample(C{1},D);
plot(d);
fclose(fid);
end
How can I modify this line:
C = textscan(fid, format, N, 'CollectOutput', true);
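so that it effectively decimates the data by importing only every other line, or every third line, and so on, of the .txt file from disk into the variable "C" in memory?
Any help would be much appreciated.
Cheers, Jim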
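One possible way to approach the question above, as a minimal sketch rather than a tested solution for files of this size: call textscan repeatedly on the same open file, read a bounded block of rows each time, and keep only every D-th row of each block. The function name textscan_decimate_sketch and the blockRows parameter are hypothetical and do not come from the original post. Note that this is plain downsampling (the same rows downsample would keep), with no anti-aliasing filter.

```matlab
function d = textscan_decimate_sketch(fileEPS, D, blockRows)
% Hypothetical helper (not from the original post).
% fileEPS:   path to the two-column .txt file
% D:         decimation factor (keeps rows 1, 1+D, 1+2D, ...)
% blockRows: rows read per textscan call (assumed value, tune to available memory)
fid = fopen(fileEPS);
if fid == -1
    error('Could not open %s', fileEPS);
end
cleanup = onCleanup(@() fclose(fid)); % close the file even if an error occurs
fmt = '%f\t%f';
kept = {};    % retained rows, one cell per block
offset = 0;   % rows to skip at the start of the next block to keep the stride
while ~feof(fid)
    % textscan resumes from the current file position on every call
    C = textscan(fid, fmt, blockRows, 'CollectOutput', true);
    block = C{1};
    n = size(block, 1);
    if n == 0
        break; % nothing parsed; avoid looping forever on malformed input
    end
    kept{end+1} = block((offset + 1):D:n, :); %#ok<AGROW>
    % position of the first retained row within the following block
    offset = mod(offset - n, D);
end
d = vertcat(kept{:});
end
```

Only one block of blockRows rows plus the retained rows is held in memory at a time, so the peak footprint is roughly the decimated data rather than the whole file. A call might look like d = textscan_decimate_sketch(fileEPS, 10, 1e6);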
PS An alternative method that I have been playing around with uses "fread", but it encounters the same problem:
function [d] = fread_EPS(N,D,fileEPS)
%N: number of lines to read
%D: decimation factor
%fileEPS: location of .txt file
%read in the data as characters
fid = fopen(fileEPS);
c = fread(fid,N*19,'*char');% Each line of .txt has 19 characters
fclose(fid);
%Parse and read the data into floating point numbers
f=sscanf(c,'%f');
%Reshape the data into a two column format
format long
d=decimate((flipud(rot90(reshape(f,2,[])))),D); %reshape for 2 column format, rotate 90, flip vertically, decimation factor
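If every line really has the same byte length, as the 19-character figure in the code above suggests, the skip argument of fread offers another route: read one line's worth of bytes, skip the next D-1 lines, and repeat. The following is a sketch under that assumption. The function name fread_skip_sketch and the lineBytes parameter are hypothetical, and lineBytes must count the line terminator as well. As with downsample, no anti-aliasing filter is applied.

```matlab
function d = fread_skip_sketch(fileEPS, D, lineBytes)
% Hypothetical helper (not from the original post).
% fileEPS:   path to the two-column .txt file
% D:         decimation factor (keeps lines 1, 1+D, 1+2D, ...)
% lineBytes: bytes per line INCLUDING the line terminator; assumed to be
%            constant for the whole file (the post quotes 19)
fid = fopen(fileEPS);
if fid == -1
    error('Could not open %s', fileEPS);
end
cleanup = onCleanup(@() fclose(fid)); % close the file even if an error occurs
% Block precision 'N*uint8=>char': read N bytes as characters before each skip
precision = sprintf('%d*uint8=>char', lineBytes);
% After each retained line, skip the bytes of the next D-1 lines
c = fread(fid, Inf, precision, lineBytes * (D - 1));
% Parse the retained text into floating point numbers
f = sscanf(c.', '%f');
% Two values per line, so reshape to 2-by-M and transpose to M-by-2
d = reshape(f, 2, []).';
end
```

Because the unwanted lines are never read into MATLAB, memory use scales with the number of retained lines. If the line length varies anywhere in the file, the retained blocks drift out of alignment with the line boundaries, so it is worth checking the assumption on a small sample first.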