Array of hashes

In Perl, I have an array of hashes like this:

0 HASH(0x98335e0) 
    'title' => 1177 
    'author' => 'ABC' 
    'quantity' => '-100' 


1 HASH(0x832a9f0) 
    'title' => 1177 
    'author' => 'ABC' 
    'quantity' => '100' 

2 HASH(0x98335e0) 
    'title' => 1127 
    'author' => 'DEF' 
    'quantity' => '5100' 


3 HASH(0x832a9f0) 
    'title' => 1277 
    'author' => 'XYZ' 
    'quantity' => '1030'

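Now I need to accumulate the quantity wherever the title and author are the same. In the structure above, the hashes with title = 1177 and author = 'ABC' can be accumulated into one, so the whole structure should look like this: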

0 HASH(0x98335e0) 
    'title' => 1177 
    'author' => 'ABC' 
    'quantity' => 0 

1 HASH(0x98335e0) 
    'title' => 1127 
    'author' => 'DEF' 
    'quantity' => '5100' 

2 HASH(0x832a9f0) 
    'title' => 1277 
    'author' => 'XYZ' 
    'quantity' => '1030'

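What is the best way to do this accumulation so that it is optimized? The number of array elements can be very large. I don't mind adding an extra key to the hashes to help with this, but I don't want n lookups. Please advise.

Source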

2010-07-08 Gopalakrishnan SA

You say "I don't want n lookups", but you cannot accumulate across the whole array without visiting every member of it. – 2010-07-08 15:48:27

Please add [perldoc perldsc](http://perldoc.perl.org/perldsc.html) and [perldoc perlreftut](http://perldoc.perl.org/perlreftut.html) to your reading list. – Ether 2010-07-08 16:12:51

my %sum;
for (@a) {
    # Add each quantity to the running total for its author/title pair.
    $sum{ $_->{author} }{ $_->{title} } += $_->{quantity};
}

# Flatten the nested totals back into a list of hashes.
my @accumulated;
foreach my $author (keys %sum) {
    foreach my $title (keys %{ $sum{$author} }) {
        push @accumulated => {
            title    => $title,
            author   => $author,
            quantity => $sum{$author}{$title},
        };
    }
}

I'm not sure whether map makes it look any better:

my @accumulated =
    map {
        my $author = $_;
        # The leading + tells Perl this is a hash constructor, not a block.
        map +{
            author   => $author,
            title    => $_,
            quantity => $sum{$author}{$_},
        },
        keys %{ $sum{$author} };
    }
    keys %sum;
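Both versions above return the groups in hash order, which is not the order of the input. If the first-seen order shown in the question matters, one possible variation (a sketch, not part of the original answer) keeps a single-level hash that points at the output records. The sample `@a`, the `%seen` name, and the use of `$;` as the key separator are assumptions made for illustration.

```perl
use strict;
use warnings;

# Sample data copied from the question; @a is the array name used above.
my @a = (
    { title => 1177, author => 'ABC', quantity => '-100' },
    { title => 1177, author => 'ABC', quantity => '100'  },
    { title => 1127, author => 'DEF', quantity => '5100' },
    { title => 1277, author => 'XYZ', quantity => '1030' },
);

my %seen;          # "author<sep>title" => reference to the accumulated record
my @accumulated;   # records in first-seen order

for my $rec (@a) {
    # Assumption: author and title never contain the $; separator ("\034").
    my $key = join($;, $rec->{author}, $rec->{title});
    if (my $hit = $seen{$key}) {
        $hit->{quantity} += $rec->{quantity};   # one hash lookup per element
    }
    else {
        # Copy the record so the input array is left untouched.
        push @accumulated, $seen{$key} = { %$rec };
    }
}

printf "%s %s %d\n", @{$_}{qw(title author quantity)} for @accumulated;
# prints:
# 1177 ABC 0
# 1127 DEF 5100
# 1277 XYZ 1030
```

This still visits every element once, which is unavoidable, but each visit costs a single hash lookup.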

Source

2010-07-08 15:44:30

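This example is just itching for some map/grep love. – Daenyth 2010-07-08 16:36:03

@Daenyth Usually yes, but in this case it doesn't look much better. – 2010-07-08 17:38:44

If you don't want N lookups, then you need a hash function, but you need to store the items using that hash function. By the time you have them in a list (or array), it's too late. Either you get lucky every time, or you are going to have N lookups.

Or insert them into a hash, as shown below. A hybrid solution is to store a locator as item 0 of the list/array.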

my $lot = get_lot_from_whatever(); 
my $tot = $list[0]{ $lot->{author} }{ $lot->{title} }; 
if ($tot) { 
    $tot->{quantity} += $lot->{quantity}; 
} 
else { 
    push @list, $list[0]{ $lot->{author} }{ $lot->{title} } = $lot; 
}

Before all of that, we'll reformat the data.

First, make it readable.

[ { title => 1177, author => 'ABC', quantity => '-100' } 
, { title => 1177, author => 'ABC', quantity => '100' } 
, { title => 1127, author => 'DEF', quantity => '5100' } 
, { title => 1277, author => 'XYZ', quantity => '1030' } 
]

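Next, you need to break the problem down. You want the quantities grouped by author and title, so you need those two things to uniquely identify the lots. To restate: you need a combination of names to identify an entity. Therefore you need a hash, which identifies things by name.

Since we have two things, a double hash is a good way to do this.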

my %hash; 
foreach my $lot (@list) { 
    $hash{ $lot->{author} }{ $lot->{title} } += $lot->{quantity}; 
} 
# consolidated by hash
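As a quick check of the double-hash step, here is a small self-contained run over the four sample records. The sorted print loop is not part of the answer; it is only there to make the output order predictable.

```perl
use strict;
use warnings;

# The four records from the question.
my @list = (
    { title => 1177, author => 'ABC', quantity => '-100' },
    { title => 1177, author => 'ABC', quantity => '100'  },
    { title => 1127, author => 'DEF', quantity => '5100' },
    { title => 1277, author => 'XYZ', quantity => '1030' },
);

# Same accumulation as above: author on the first level, title on the second.
my %hash;
$hash{ $_->{author} }{ $_->{title} } += $_->{quantity} for @list;

for my $author (sort keys %hash) {
    for my $title (sort keys %{ $hash{$author} }) {
        print "$author / $title: $hash{$author}{$title}\n";
    }
}
# prints:
# ABC / 1177: 0
# DEF / 1127: 5100
# XYZ / 1277: 1030
```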

To turn it back into a list, we need to unwind the levels.

my @consol
    = sort { $a->{author} cmp $b->{author} || $a->{title} cmp $b->{title} }
      map {
          # $_ is [ $author, {...} ]; avoid "my $a", which shadows sort's $a
          my ($author, $titles) = @$_;
          map { +{ title => $_, author => $author, quantity => $titles->{$_} } }
          keys %$titles;
      }
      map { [ $_ => $hash{$_} ] }   # group and freeze a pair
      keys %hash
    ;

# consolidated in a list.

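And then you have your list back, and I even sorted it for you. Of course, since these are publishers, you could also sort by descending quantity.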

sort { $b->{quantity} <=> $a->{quantity} 
    || $a->{author} cmp $b->{author} 
    || $a->{title} cmp $b->{title} 
    }
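To see the comparison block in context, here is a minimal sketch that applies it to the three consolidated records from the question's expected result. The `@by_quantity` name is an assumption made for the example.

```perl
use strict;
use warnings;

# The consolidated records from the question's expected result.
my @consol = (
    { title => 1177, author => 'ABC', quantity => 0    },
    { title => 1127, author => 'DEF', quantity => 5100 },
    { title => 1277, author => 'XYZ', quantity => 1030 },
);

my @by_quantity = sort {
       $b->{quantity} <=> $a->{quantity}   # largest quantity first
    || $a->{author}   cmp $b->{author}     # then author, ascending
    || $a->{title}    cmp $b->{title}      # then title, ascending
} @consol;

print join(' ', @{$_}{qw(quantity author title)}), "\n" for @by_quantity;
# prints:
# 5100 DEF 1127
# 1030 XYZ 1277
# 0 ABC 1177
```

Swapping `$a` and `$b` on the first line is what reverses the order for quantity while the tie-breakers stay ascending.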

Source

2010-07-08 16:15:33 Axeman

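I think it's important to step back and consider where the data comes from. If the data comes from a database, you should write the SQL query so that it returns one row per author/title combination, with the total in the quantity field. If you are reading the data from a file, you should read it directly into a hash, or use Tie::IxHash if order matters.

Once you have the data in an array of hashrefs like yours, you will have to create an auxiliary data structure and do a whole bunch of lookups, the cost of which may well dominate your program's run time (not that it matters if it runs for 15 minutes once a day), and you may run into memory problems.

Source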
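The suggestion to read a file straight into a hash might look roughly like the sketch below. The tab-separated record layout and the in-memory filehandle are assumptions, since the question does not say what the real input looks like. A plain array of keys stands in for Tie::IxHash (a CPAN module) so that the example runs with core Perl only.

```perl
use strict;
use warnings;

# Hypothetical input: one "title<TAB>author<TAB>quantity" record per line.
my $data = "1177\tABC\t-100\n"
         . "1177\tABC\t100\n"
         . "1127\tDEF\t5100\n"
         . "1277\tXYZ\t1030\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my (%total, @order);
while (my $line = <$fh>) {
    chomp $line;
    my ($title, $author, $quantity) = split /\t/, $line;
    # Remember each author/title pair the first time it appears.
    push @order, [ $author, $title ] unless exists $total{$author}{$title};
    $total{$author}{$title} += $quantity;
}
close $fh;

# Build the final list in first-seen order; no array of raw records is kept.
my @accumulated = map {
    my ($author, $title) = @$_;
    +{ title => $title, author => $author, quantity => $total{$author}{$title} };
} @order;

print join(' ', @{$_}{qw(title author quantity)}), "\n" for @accumulated;
```

Because the totals are built while reading, the un-accumulated records never have to be held in memory at once.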

2010-07-08 17:15:07
