2013-02-27 47 views
1

Perl multilingual sort: want space to sort before letters

I have the following Perl script, which sorts a list of words and uses UTF-8 encoding:

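use strict;
use warnings;
use utf8;                            # string literals in this file are UTF-8
use HTML::Entities;
use Unicode::Collate::Locale;

binmode STDOUT, ':encoding(UTF-8)';  # print UTF-8 instead of Latin-1 bytes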

my @array = (
    "Another", 
    "An Other", 
    "Anóther", 
    "An Óther", 
    "Anòther", 
    "An Òther", 
    "Anôther", 
    "An Ôther", 
    "Anöther", 
    "An Öther", 
    "Anõther", 
    "An Õther" 
    ); 

my $lang = "da"; 

printf ("Lang code is: %s\n", $lang); 

my $coll = Unicode::Collate::Locale->new(locale => "$lang"); 

my @result = $coll->sort(@array); 


foreach my $item (@result) {
    print $item, "\n";
}

Here is what it outputs:

Lang code is: da 
An Other 
Another 
An Óther 
Anóther 
An Òther 
Anòther 
An Ôther 
Anôther 
An Õther 
Anõther 
An Öther 
Anöther 

However, I would like it to output:

An Other 
An Óther 
An Òther 
An Ôther 
An Õther 
An Öther 
Another 
Anóther 
Anòther 
Anôther 
Anõther 
Anöther 

The reason is that I want the space character to sort before any letter. Is there a way to get my collator object to do this for me?

Answer

3

Try setting the variable weighting to 'non-ignorable':

my $coll = Unicode::Collate::Locale->new(
    locale => $lang, 
    variable => 'non-ignorable', 
); 

For details, see the Variable Weighting section of the Unicode Collation Algorithm (UCA) specification.
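For illustration, here is a minimal sketch of the question's script with that option applied. The variable names `@words` and `@sorted` are my own, and the `binmode` line is an assumption that the terminal expects UTF-8. As far as I understand the Unicode::Collate documentation, the default weighting is 'shifted', which ignores spaces at the first comparison levels; with 'non-ignorable' the space gets a primary weight of its own, lower than any letter, so the "An ..." entries should group ahead of the "Ano..." entries.

```perl
use strict;
use warnings;
use utf8;                             # string literals below are UTF-8
use Unicode::Collate::Locale;

binmode STDOUT, ':encoding(UTF-8)';   # assumption: the terminal expects UTF-8

# Same word list as in the question.
my @words = (
    "Another", "An Other", "Anóther", "An Óther",
    "Anòther", "An Òther", "Anôther", "An Ôther",
    "Anöther", "An Öther", "Anõther", "An Õther",
);

# 'non-ignorable' makes variable characters such as the space
# count as ordinary primary-weight characters.
my $coll = Unicode::Collate::Locale->new(
    locale   => 'da',
    variable => 'non-ignorable',
);

my @sorted = $coll->sort(@words);
print "$_\n" for @sorted;
```

The accent order within each group still comes from the Danish tailoring, so only the placement of the space changes.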

+0

Super answer, thanks for the reference. – egilchri 2013-02-28 12:33:32