2017-10-05 36 views

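Capture all the words in a line and count their occurrences using a Perl regex

I want to know how many words are in a paragraph, and then find how many times each word occurs. I can already do this, but is there another way that uses only a regex?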

my $string = "John is a good boy. John goes to school with his brother Johnny. When John is hungry, he eats his tiffin."; 
my @list =(); 
while($string =~ /(\b\w+\b)/gi) 
{ 
     push(@list, $1); 
} 

my %counts; 
for (@list) { 
    $counts{$_}++; 
} 
print scalar(@list), "\n"; 
foreach my $keys (keys %counts) { 
    print "$keys = $counts{$keys}\n"; 
} 

The output should be

21 
brother = 1 
a = 1 
goes = 1 
is = 2 
good = 1 
to = 1 
tiffin = 1 
When = 1 
boy = 1 
his = 2 
school = 1 
Johnny = 1 
he = 1 
eats = 1 
John = 3 
with = 1 
hungry = 1 

Aren't you already using a regex?

No, I mean using a regex to count the occurrences. I am using a list and a hash.

Do you want a solution that doesn't even use a hash?

Answers

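I don't see a way to do this purely with a regex, and if one does exist, it would be an overly complicated regex that is hard to maintain. However, you can simplify what you have by counting directly into the hash and dropping the list: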

use strict; 
use warnings; 

my $string = "John is a good boy. John goes to school with his brother Johnny. When John is hungry, he eats his tiffin."; 
my %counts; 
my $word_count = 0; 
while($string =~ /\b(\w+)\b/g) 
    { 
    $counts{$1}++; 
    $word_count++; 
    } 

print "$word_count\n"; 
foreach my $keys (keys %counts) 
    { 
    print "$keys = $counts{$keys}\n"; 
    } 

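Note: I have adjusted your regex slightly. You don't need the "\b" inside the capture group, and the case-insensitive flag ("i") isn't required because you aren't matching a specific string. I also added "use strict;" and "use warnings;", which you should always put at the top of your Perl scripts so that they flag any problems.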
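
If you want something more compact, here is a minimal sketch of the same idea using a list-context match instead of the while loop. It still uses a hash, so it is not a "regex only" solution. It assumes a plain "\w+" is acceptable for your data: because "\w+" is greedy, it picks out the same words as the "\b"-anchored pattern. The sort order at the end is my own choice, not something the question asked for.

```perl
use strict;
use warnings;

my $string = "John is a good boy. John goes to school with his brother Johnny. When John is hungry, he eats his tiffin.";

# In list context, a /g match returns every match at once, so no
# explicit while loop or intermediate array is needed.
my %counts;
$counts{$_}++ for $string =~ /\w+/g;

# Assigning the match list to an empty list and reading that assignment
# in scalar context yields the number of matches (the total word count).
my $word_count = () = $string =~ /\w+/g;

print "$word_count\n";

# Sort by descending count, then alphabetically, so the output order is
# stable (plain "keys %counts" returns keys in no particular order).
for my $word (sort { $counts{$b} <=> $counts{$a} || $a cmp $b } keys %counts) {
    print "$word = $counts{$word}\n";
}
```

Note that the keys are case-sensitive, so "When" and "when" would be counted separately. If that is not what you want, you could use lc($_) as the hash key instead.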
