2012-03-28 1328 views

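Grouping and summing in MATLAB

I have a simple matrix with repeated values in some columns. I need to group the data by name and week, and sum the prices for each week. Here is an example: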

name day week price 
John 12 12 200 
John 14 12 70 
John 25 13 150 
John 1 14 10 
Ann 13 12 100 
Ann 15 12 100 
Ann 20 13 50 

The desired output would be:

name week sum
John 12 270
John 13 150
John 14 10
Ann 12 200
Ann 13 50

Is there a good way to do this? I used loops, but I'm not sure that is the best approach:

names = unique(data(:,1));   % unique names in the data
n = size(names, 1);          % number of unique names
m = size(data, 1);           % total number of rows
result = [];                 % output matrix [name week sum] (avoids shadowing the built-in sum)
s = 1;                       % next free row in result
for i = 1:n
    temp = [];               % temporary matrix holding the rows for one name
    k = 1;
    for j = 1:m
        if names(i) == data(j,1)    % collect every row with the current name
            temp(k,:) = data(j,:);
            k = k + 1;
        end
    end
    count = 0;
    for l = 1:size(temp,1)          % walk the rows of one name (e.g. John)
        count = count + temp(l,4);  % add the price (4th column) to the running sum
        % the data is sorted by name and date, so a week ends at the last row
        % or when the next row has a different week (3rd column)
        if l == size(temp,1) || temp(l,3) ~= temp(l+1,3)
            result(s,1:3) = [names(i) temp(l,3) count];
            count = 0;
            s = s + 1;
        end
    end
end

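The combination of single-letter variable names and a lack of comments makes your code very hard to follow. Could you expand the variable names and comment the intent of the code? – 2012-03-28 18:27:39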

Answers

3

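Use accumarray. It groups and sums values in exactly this way. The third output argument of unique(data(:,1)) gives you numeric indices that you can pass as the subs argument of accumarray. See doc accumarray for details.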
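A minimal sketch of this approach, using the sample data from the question. It assumes the names are stored in a cell array of strings and the weeks and prices in numeric column vectors; the variable names (name, week, price, nameIdx, pairs, groupIdx, sums) are illustrative and not taken from the original post. Because the grouping is by both name and week, the sketch calls unique a second time with the 'rows' option to get one index per (name, week) pair.

```matlab
% Sample data from the question (storage layout is an assumption).
name  = {'John'; 'John'; 'John'; 'John'; 'Ann'; 'Ann'; 'Ann'};
week  = [12; 12; 13; 14; 12; 12; 13];
price = [200; 70; 150; 10; 100; 100; 50];

% Third output of unique: for each row, the index of its name in uName.
[uName, ~, nameIdx] = unique(name);

% One index per distinct (name, week) pair.
[pairs, ~, groupIdx] = unique([nameIdx(:) week(:)], 'rows');

% Sum the prices within each group.
sums = accumarray(groupIdx(:), price(:));

% Print one line per group: name, week, sum.
for g = 1:size(pairs, 1)
    fprintf('%s %d %d\n', uName{pairs(g, 1)}, pairs(g, 2), sums(g));
end
```

Note that unique sorts its output, so the groups come out ordered by name and then by week (Ann before John) rather than in the order they first appear in the data.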

1

Perhaps the simplest way is to use the GRPSTATS function from the Statistics Toolbox. You first have to combine name and week to form the groups:

[name_week priceSum] = grpstats(price, strcat(name(:), '@', week(:)), {'gname','sum'});