How to match two sequences using arrays in Perl

When looping over two arrays, I am confused about how to move the pointer through one loop while keeping it constant in the other. For example:

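Array 1: A T C G T C G A G C G
Array 2: A C G T C C T G T C G

So the A in the first array matches the A in the second array, and we move on to the next element. But since the T does not match the C at the second index, I want the program to compare that T with the next element of array 2, the G, and so on until it finds the matching T.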

my ($array1ref, $array2ref) = @_; 

my @array1 = @$array1ref; 
my @array2 = @$array2ref;
my $count = 0;
foreach my $element (@array1) {
    foreach my $element2 (@array2) {
        if ($element eq $element2) {
            $count++;
        } else {
            # ??? what should go here ???
        }
    }
}

Source

2013-05-02 user2344516

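Nested loops make no sense here. You don't want to loop over anything more than once.

You didn't specify what you want to happen after you resync, so you'll want to start with the following and tailor it to your needs.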

my ($array1, $array2) = @_; 

my $idx1 = 0; 
my $idx2 = 0; 
while ($idx1 < @$array1 && $idx2 < @$array2) { 
    if ($array1->[$idx1] eq $array2->[$idx2]) { 
     ++$idx1; 
     ++$idx2; 
    } else { 
     ++$idx2; 
    } 
} 

...

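As is, the above code will leave $idx1 at the last index where it could not (eventually) resync. If instead you want to stop as soon as you first resync, you want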

my ($array1, $array2) = @_; 

my $idx1 = 0; 
my $idx2 = 0; 
my $mismatched = 0;
while ($idx1 < @$array1 && $idx2 < @$array2) { 
    if ($array1->[$idx1] eq $array2->[$idx2]) { 
     last if $mismatched;   
     ++$idx1; 
     ++$idx2; 
    } else { 
     ++$mismatched; 
     ++$idx2; 
    } 
} 

...
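
To make the "stop at the first resync" variant easier to try out, here is a minimal sketch that wraps it in a sub. The name first_resync and the choice to return the two indices (or an empty list when no resync happens) are assumptions made for illustration, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the pair of indices at which the two
# arrays first match again after at least one mismatch, or an empty
# list if that never happens.
sub first_resync {
    my ($array1, $array2) = @_;
    my ($idx1, $idx2, $mismatched) = (0, 0, 0);
    while ($idx1 < @$array1 && $idx2 < @$array2) {
        if ($array1->[$idx1] eq $array2->[$idx2]) {
            return ($idx1, $idx2) if $mismatched;
            ++$idx1;
            ++$idx2;
        } else {
            ++$mismatched;
            ++$idx2;
        }
    }
    return;
}

my @resync = first_resync(
    [qw(A T C G T C G A G C G)],
    [qw(A C G T C C T G T C G)],
);
print "@resync\n";   # prints "1 3"
```

With the arrays from the question, the T at index 1 of the first array is found again at index 3 of the second.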

Source

2013-05-02 19:50:36 ikegami

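The foreach loops won't cut it: we either want to loop while there are still elements available in both arrays, or iterate over indices that we can increment as we like: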

EL1: while (defined(my $el1 = shift @array1) and @array2) { 
    EL2: while(defined(my $el2 = shift @array2)) { 
    ++$count and next EL1 if $el1 eq $el2; # break out of inner loop 
    } 
}

or

my $j = 0; # index of @array2 
for (my $i = 0; $i <= $#array1; $i++) { 
    $j++ until $j > $#array2 or $array1[$i] eq $array2[$j];
    last if $j > $#array2;
    $count++; 
}

or any combination thereof.
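
One thing to keep in mind with the shift-based loop is that shift removes elements, so the arrays are empty or shortened afterwards. A minimal sketch of one way around that, assuming you want to keep the caller's arrays intact, is to work on copies inside a sub. The name count_matches is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: counts elements of the first array that can be
# found, in order, in the second array. It shifts from private copies,
# so the caller's arrays are left untouched.
sub count_matches {
    my ($ref1, $ref2) = @_;
    my @a1 = @$ref1;   # copies, safe to consume
    my @a2 = @$ref2;
    my $count = 0;
    EL1: while (@a1 and @a2) {
        my $el1 = shift @a1;
        while (@a2) {
            my $el2 = shift @a2;
            if ($el1 eq $el2) {
                $count++;
                next EL1;   # move on to the next element of @a1
            }
        }
    }
    return $count;
}

my @array1 = qw(A T C G T C G A G C G);
my @array2 = qw(A C G T C C T G T C G);
print count_matches(\@array1, \@array2), "\n";   # prints 7
```

Testing the array sizes with @a1 and @a2 instead of defined(...) also avoids stopping early on an undefined element.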

Source

2013-05-02 19:51:08 amon

This condition is too complex for a for loop; use a while loop instead:

my ($array1ref, $array2ref) = @_; 

my @array1 = @$array1ref; 
my @array2= @$array2ref; 
my $count = 0; 
my ($index, $index2) = (0,0); 
# loop while both indices are inside their arrays
while ($index <= $#array1 && $index2 <= $#array2) {
    if ($array1[$index] eq $array2[$index2]) {
        $count++;
        $index++;
        $index2++;
    } else {
        # advance $index2 until we find a match or run off the end of @array2
        $index2++ until $index2 > $#array2 or $array1[$index] eq $array2[$index2];
    }
}

Source

2013-05-02 19:51:36 user1937198

Here is one possibility. It uses indices to walk through both lists.

my @array1 = qw(A T C G T C G A G C G); 
my @array2 = qw(A C G T C C T G T C G); 

my $count = 0; 
my $idx1 = 0; 
my $idx2 = 0; 

while(($idx1 < scalar @array1) && ($idx2 < scalar @array2)) { 
    if($array1[$idx1] eq $array2[$idx2]) { 
     print "Match of $array1[$idx1] array1 \@ $idx1 and array2 \@ $idx2\n"; 
     $idx1++; 
     $idx2++; 
     $count++; 
    } else { 
     $idx2++; 
    } 
} 

print "Count = $count\n";

Source

2013-05-02 19:54:56

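You can use a while loop to search for matches. If you find a match, advance in both arrays. If you don't, advance only the second array. At the end, you can print the remaining unmatched characters from the first array: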

# [1, 2, 3] is a reference to an anonymous array (1, 2, 3) 
# qw(1 2 3) is quote-word shorthand for ('1', '2', '3')
my $arr1 = [qw(A T C G T C G A G C G)]; 
my $arr2 = [qw(A C G T C C T G T C G)]; 

my $idx1 = 0; 
my $idx2 = 0; 

# Find matched characters 
# In numeric context, @$arr_ref is the number of elements in the array referenced by $arr_ref
while ($idx1 < @$arr1 && $idx2 < @$arr2) { 
    my $char1 = $arr1->[$idx1]; 
    my $char2 = $arr2->[$idx2]; 
    if ($char1 eq $char2) { 
     # Matched character, advance arr1 and arr2 
     printf("%s %s -- arr1[%d] matches arr2[%d]\n", $char1, $char2, $idx1, $idx2); 
     ++$idx1; 
     ++$idx2; 
    } else { 
     # Unmatched character, advance arr2 
     printf(". %s -- skipping arr2[%d]\n", $char2, $idx2); 
     ++$idx2; 
    } 
} 

# Remaining unmatched characters 
while ($idx1 < @$arr1) { 
    my $char1 = $arr1->[$idx1]; 
    printf("%s . -- arr1[%d] is beyond the end of arr2\n", $char1, $idx1); 
    $idx1++; 
}

The script prints:

A A -- arr1[0] matches arr2[0] 
. C -- skipping arr2[1] 
. G -- skipping arr2[2] 
T T -- arr1[1] matches arr2[3] 
C C -- arr1[2] matches arr2[4] 
. C -- skipping arr2[5] 
. T -- skipping arr2[6] 
G G -- arr1[3] matches arr2[7] 
T T -- arr1[4] matches arr2[8] 
C C -- arr1[5] matches arr2[9] 
G G -- arr1[6] matches arr2[10] 
A . -- arr1[7] is beyond the end of arr2 
G . -- arr1[8] is beyond the end of arr2 
C . -- arr1[9] is beyond the end of arr2 
G . -- arr1[10] is beyond the end of arr2

Source

2013-05-02 20:02:43 Andomar

It seems you can do this pretty easily with grep, if you are guaranteed that array2 is always as long as or longer than array1. Something like this:

sub align 
{ 
    my ($array1, $array2) = @_; 
    my $index = 0; 

    return grep 
      { 
       $array1->[$index] eq $array2->[$_] ? ++$index : 0 
      } 0 .. scalar(@$array2) - 1; 
}

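Basically, the grep is saying: "return me the list of increasing indices into array2 that match consecutive elements of array1."

If you run the above with this test code, you can see that it returns the expected alignment array: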

my @array1 = qw(A T C G T C G A G C G); 
my @array2 = qw(A C G T C C T G T C G); 

say join ",", align \@array1, \@array2;

This outputs the expected mapping: 0,3,4,7,8,9,10. The list means that @array1[0 .. 6] corresponds to @array2[0,3,4,7,8,9,10].

(Note: you need use Modern::Perl or similar in order to use say.)

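Now, you haven't really said what output you need from the operation. I assumed you wanted this mapping array. If you only need a count of the elements skipped in @array2 while aligning it with @array1, you can still use the grep above, but instead of returning the list, just return scalar(@$array2) - $index at the end.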
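
As a rough sketch of that counting variant: the sub name count_skipped is hypothetical, and the extra bounds check on $index is an added assumption so that the comparison never reads past the end of the first array. Calling grep in scalar context gives the number of matches, which here equals the final $index.

```perl
use strict;
use warnings;

# Hypothetical helper: number of elements of the second array that are
# not used when aligning it, in order, with the first array.
sub count_skipped {
    my ($array1, $array2) = @_;
    my $index = 0;

    # In scalar context grep returns how many indices matched.
    my $matched = grep {
        $index < @$array1 && $array1->[$index] eq $array2->[$_]
            ? ++$index
            : 0
    } 0 .. $#$array2;

    return @$array2 - $matched;
}

my @array1 = qw(A T C G T C G A G C G);
my @array2 = qw(A C G T C C T G T C G);
print count_skipped(\@array1, \@array2), "\n";   # prints 4
```

For the arrays from the question, 7 of the 11 elements of the second array are matched, so 4 are skipped.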

Source

2013-05-06 09:45:39

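As you may know, your problem is called sequence alignment. There are well-developed algorithms that do this efficiently, and one such module, Algorithm::NeedlemanWunsch, is available on CPAN. Here is how you might apply it to your problem.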

#!/usr/bin/perl 

use Algorithm::NeedlemanWunsch; 

my $arr1 = [qw(A T C G T C G A G C G)]; 
my $arr2 = [qw(A C G T C C T G T C G)]; 

my $matcher = Algorithm::NeedlemanWunsch->new(sub {@_==0 ? -1 : $_[0] eq $_[1] ? 1 : -2}); 

my (@align1, @align2); 
my $result = $matcher->align($arr1, $arr2, 
    { 
    align => sub {unshift @align1, $arr1->[shift]; unshift @align2, $arr2->[shift]}, 
    shift_a => sub {unshift @align1, $arr1->[shift]; unshift @align2,   '.'}, 
    shift_b => sub {unshift @align1,   '.'; unshift @align2, $arr2->[shift]},
    });

print join("", @align1), "\n";
print join("", @align2), "\n";

This prints an optimal solution, in terms of the costs we specified in the constructor, for the sequences in your original question:

ATCGT.C.GAGCG
A.CGTCCTG.TCG

A very different approach, but I think it is worth knowing about.
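
The costs in question are simply the values returned by the scoring callback. The following sketch isolates that callback so its three cases can be inspected without the module installed. The variable name $score is hypothetical, and the reading that a call with no arguments asks for the gap penalty is an assumption based on how the callback above is written.

```perl
use strict;
use warnings;

# Same scoring callback as the one passed to the constructor above:
# no arguments -> gap penalty, equal items -> reward, otherwise -> penalty.
my $score = sub { @_ == 0 ? -1 : $_[0] eq $_[1] ? 1 : -2 };

printf "gap: %d, match: %d, mismatch: %d\n",
    $score->(), $score->('A', 'A'), $score->('A', 'C');
# prints "gap: -1, match: 1, mismatch: -2"
```

Because a mismatch costs more than a gap, the aligner is nudged towards inserting gaps, which is broadly what the question asks for.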

Source

2013-05-09 22:26:17
