How can I extract abbreviations from a file using Perl?

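I need to extract certain abbreviations, such as ABS, TVS and PERL, from a file. Any abbreviation written in uppercase letters should be picked up. Preferably I would like to do this with a regular expression. Any help is appreciated.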

2009-07-08 User1611

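How do you plan to determine whether a word is an abbreviation? There has to be some kind of reference, such as another file that lists all the abbreviations, or a database you can query. – ghostdog74 2009-07-08 08:15:19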

The implementation may treat any run of uppercase characters longer than 2 characters as an abbreviation. – 2009-07-08 08:25:56

I would also add an upper bound, because if a word is longer than, say, 5 or 6 characters, I would doubt that it is an abbreviation ;) – fortran 2009-07-08 09:32:51
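The two suggestions in the comments (a length bound and a list of known abbreviations) could be combined roughly as follows. This is only a sketch: `find_abbrevs`, its parameters, and the sample text are hypothetical names made up for illustration, and the bounds 2 and 5 are just the values floated in the comments.

```perl
use strict;
use warnings;

# Hypothetical helper: return the abbreviations found in $text.
# $min and $max bound the length of an uppercase run; $known is an
# optional hash ref acting as a whitelist of accepted abbreviations.
sub find_abbrevs {
    my ($text, $min, $max, $known) = @_;
    # \b on both sides so that a longer uppercase word is skipped
    # entirely instead of being matched in part.
    my @found = $text =~ /\b([A-Z]{$min,$max})\b/g;
    @found = grep { $known->{$_} } @found if $known;
    return @found;
}

my $text = "Install PERL, then read the TVS and ABS notes, NOT THE INTERNATIONAL one.";

# Length bound only: shouted words such as NOT and THE still get through.
print join(" ", find_abbrevs($text, 2, 5)), "\n";    # PERL TVS ABS NOT THE

# Length bound plus whitelist.
my %known = map { $_ => 1 } qw(ABS TVS PERL);
print join(" ", find_abbrevs($text, 2, 5, \%known)), "\n";    # PERL TVS ABS
```

The length bound alone cannot tell an abbreviation from an ordinary word typed in capitals, which is why a lookup list is still useful when one is available.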

Untested:


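my %abbr;
open(my $input, "<", "filename")
    || die "open: $!";
# Read line by line; <$input> must not contain spaces, or Perl treats it as a glob.
while (<$input>) {
    # Strip one run of two or more capitals at a time and count it.
    while (s/([A-Z][A-Z]+)//) {
        $abbr{$1}++;
    }
}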

Edited to look for at least two consecutive uppercase letters.


2009-07-08 08:09:28

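It would have been nice to hear which part in particular you were having trouble with. This reads the file named 'filename' and counts each abbreviation: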

my %abbr; 
open my $inputfh, '<', 'filename' 
    or die "open error: $!\n"; 
while (my $line = readline($inputfh)) { 
    while ($line =~ /\b([A-Z]{2,})\b/g) { 
        $abbr{$1}++;
    } 
} 

for my $abbr (sort keys %abbr) { 
    print "Found $abbr $abbr{$abbr} time(s)\n"; 
}


2009-07-08 09:18:08 ysth

#!/usr/bin/perl 

use strict; 
use warnings; 

my %abbrs =(); 

while(<>){ 
    my @words = split ' ', $_; 

    foreach my $word(@words){ 
        $word =~ /([A-Z]{2,})/ && $abbrs{$1}++;
    } 
} 

# %abbrs now contains all abbreviations


2009-07-08 09:25:35 dsm

Read the text and write all abbreviations found to standard output, separated by spaces:

my $text; 
# Slurp all text 
{ local $/ = undef; $text = <>; } 
# Extract all sequences of 2 or more uppercase characters 
my @abbrevs = $text =~ /\b([[:upper:]]{2,})\b/g; 
# Output separated by spaces 
print join(" ", @abbrevs), "\n";

Note the use of the POSIX character class [:upper:], which matches all uppercase letters, not just the English ones (A-Z).
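To illustrate the difference between the two character classes, here is a small sketch. The sample string is made up, and the non-Latin letters are written as \x{...} escapes so that the snippet does not depend on the encoding of the source file.

```perl
use strict;
use warnings;

binmode(STDOUT, ':encoding(UTF-8)');    # so the Cyrillic text prints cleanly

# "NATO" spelled with Cyrillic capitals: U+041D U+0410 U+0422 U+041E.
my $nato_cyrillic = "\x{41D}\x{410}\x{422}\x{41E}";
my $text = "NATO is written $nato_cyrillic in Russian.";

# [[:upper:]] follows Unicode rules on a character string like this one.
my @posix = $text =~ /\b([[:upper:]]{2,})\b/g;
# [A-Z] only ever covers the 26 ASCII capitals.
my @ascii = $text =~ /\b([A-Z]{2,})\b/g;

print "POSIX class: @posix\n";    # both spellings
print "A-Z range:   @ascii\n";    # NATO only
```

This only helps if the input has been decoded into characters first. When reading a UTF-8 file, for example, that would mean opening it with an encoding layer such as '<:encoding(UTF-8)' rather than as raw bytes.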


2009-07-08 10:15:16
