Is there a command to merge multiple files in the shell?

For example, there are 5 numbers => [1,2,3,4,5] and 3 groups.

File1 (group 1):

1 
3 
5

File2 (group 2):

3 
4

File3 (group 3):

1 
5

Output (column 1: whether in group 1, column 2: whether in group 2, column 3: whether in group 3 [NA means not..]):

1 NA 1 
3 3 NA 
NA 4 NA 
5 NA 5

Or something like this (+ means in the group, - means not):

1 + - + 
3 + + - 
4 - + - 
5 + - +

I tried join and merge, but it looks like neither of them works well with multiple files (for example, 8 files).

Source

2013-02-26 Hanfei Sun

I may be blind, but I don't see the merge logic you're after. Can you elaborate on what your merged result is based on? – favoretti 2013-02-26 16:12:59

What is NA? oO – 2013-02-26 16:14:49

@favoretti NA in column N means there is no such item in file N – 2013-02-26 16:15:10

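You say there are the numbers 1-5, but as far as I can see that is irrelevant to your desired output. Your output only uses the numbers actually found in the files. This code will do what you want: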

use strict; 
use warnings; 
use feature 'say'; 

my @hashes; 
my %seen; 
local $/; # read entire file at once 
while (<>) { 
    my @nums = split;       # split file into elements 
    $seen{$_}++ for @nums;      # dedupe elements 
    push @hashes, { map { $_ => $_ } @nums }; # map into hash 
} 

my @all = sort { $a <=> $b } keys %seen;  # sort deduped elements 
# my @all = 1 .. 5;       # OR: provide hard-coded list 

for my $num (@all) {       # for all unique numbers 
    my @fields; 
    for my $href (@hashes) {     # check each hash 
        push @fields, $href->{$num} // "NA"; # enter "NA" if not found 
    } 
    say join "\t", @fields;     # print the fields 
}

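You could replace the sorted, deduplicated list in @all with just my @all = 1 .. 5 or any other valid list. The script will then emit a row for each of those numbers and print extra "NA" fields for the missing values.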

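You should also be aware that this relies on your file contents being numeric, but only in the sort of the @all array. If you replace that with your own list, or your own sort routine, you can use any values.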

This script will take any number of files and process them. For example:

$ perl script.pl f1.txt f2.txt f3.txt 
1  NA  1 
3  3  NA 
NA  4  NA 
5  NA  5

Credit to Brent Stewart for figuring out what the OP meant.

Source

2013-02-26 17:32:05 TLP

For two files, you can conveniently use join as shown below (assuming file1 and file2 are already sorted):

$ join -e NA -o 1.1,2.1 -a 1 -a 2 file1 file2 
1 NA 
3 3 
NA 4 
5 NA

If you have more than two files, it gets more complicated.
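One way to sketch the multi-file case with join itself is to fold it over the files, carrying a helper key column along and widening the -o list at each step. The function name merge_join and the temp-file handling are hypothetical, not part of the original answer. It assumes every file holds one value per line and is sorted in the order join expects (plain sort rather than sort -n), and it relies on 0 in the -o list standing for the join field.

```shell
#!/bin/sh
# merge_join FILE...  (hypothetical helper, not from the original answer)
# Folds join(1) over any number of one-column files.
# Assumes each file is already sorted the way join expects.
merge_join() (
    acc=$(mktemp) && tmp=$(mktemp) || exit 1
    awk '{ print $1, $1 }' "$1" > "$acc"    # accumulator rows: key, column 1
    shift
    n=1                                      # data columns held in the accumulator
    for f in "$@"; do
        fmt=0                                # 0 = the join key itself
        i=2
        while [ "$i" -le $((n + 1)) ]; do    # keep accumulator fields 2..n+1
            fmt="$fmt,1.$i"
            i=$((i + 1))
        done
        fmt="$fmt,2.1"                       # then the value from the new file
        join -e NA -a 1 -a 2 -o "$fmt" "$acc" "$f" > "$tmp"
        mv "$tmp" "$acc"
        n=$((n + 1))
    done
    cut -d ' ' -f 2- "$acc"                  # drop the helper key column
    rm -f "$acc" "$tmp"
)
```

Called as merge_join file1 file2 file3 on the sample files, this should give the same four rows as the two-file example extended by a third column. The helper key column is what lets unpaired rows from either side keep lining up in later joins.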

Here is a brute-force grep solution:

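#!/bin/bash
files=(file1 file2 file3)
# Walk the sorted, de-duplicated union of all values
sort -nu "${files[@]}" | while IFS= read -r line; do
    for f in "${files[@]}"; do
        # -F: fixed string, -x: match the whole line, -q: quiet
        if grep -qFx -- "$line" "$f"; then
            printf '%s\t' "$line"    # never pass data as the printf format
        else
            printf 'NA\t'
        fi
    done
    printf '\n'
done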

Output:

1  NA  1 
3  3  NA 
NA  4  NA 
5  NA  5
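
The loop above runs grep once per value per file. As a rough alternative, a single awk pass can record which file each value appears in and print the +/- form from the question. The function name membership is hypothetical, and the sketch assumes numeric values, one per line, and no empty input files (an empty file would not be counted as a column).

```shell
#!/bin/sh
# membership FILE...  (hypothetical helper)
# Prints each distinct value followed by + or - for every input file.
membership() (
    awk '
        FNR == 1 { nfiles++ }                 # a new input file begins (assumes no empty files)
        NF       { has[$1, nfiles] = 1; seen[$1] = 1 }
        END {
            for (v in seen) {
                row = v
                for (i = 1; i <= nfiles; i++)
                    row = row " " (((v, i) in has) ? "+" : "-")
                print row
            }
        }
    ' "$@" | sort -n                          # awk array order is unspecified, so sort afterwards
)
```

For the three sample files, membership file1 file2 file3 prints the rows 1 + - +, 3 + + -, 4 - + - and 5 + - +, one per line. Each file is read only once, which matters more as the number of files and values grows.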

Source

2013-02-26 16:43:03 dogbane

#!/usr/bin/env perl 
use strict; 
use warnings; 
use autodie; 

my @lines; 
my $filecount = 0; 

# parse 
for my $filename (@ARGV){ 
    open my $fh, '<', $filename; 
    while(my $line = <$fh>){ 
        chomp($line); 
        next unless length $line; 
        # row index = the number itself, column index = position of the file 
        $lines[$line][$filecount]++; 
    } 
    close $fh; 
}continue{ 
    $filecount++; 
} 

# print 
for my $linenum (1..$#lines){ 
    my $line = $lines[$linenum]; 
    next unless $line; 

    print ' ' x (5-length $linenum), $linenum, ' '; 

    for my $elem(@$line){ 
        print $elem ? 'X' : ' ' 
    } 
    print "\n"; 
}

Source

2013-02-26 17:11:21

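If your input files are monotonically increasing and consist of just one integer per line, as your sample input suggests, you can simply preprocess the input files and use paste: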

for i in file{1,2,3}; do # List input files 
    awk '{ a += 1; while($1 > a) { print "NA"; a += 1 }} 1' "$i" > "$i.out" 
done 
paste file{1,2,3}.out

This leaves the trailing entries of some columns empty. Fixing that is left as an exercise for the reader.
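
One possible take on that exercise is to pad every column with NA up to the largest value found in any file before pasting. The function name pad_and_paste and the temporary directory are hypothetical. Like the loop above, the sketch assumes one positive integer per line, increasing within each file.

```shell
#!/bin/sh
# pad_and_paste FILE...  (hypothetical helper)
# Assumes one positive integer per line, strictly increasing within each file.
pad_and_paste() (
    dir=$(mktemp -d) || exit 1
    max=$(sort -n "$@" | tail -n 1)          # largest value in any file
    n=0
    for f in "$@"; do
        n=$((n + 1))
        awk -v max="$max" '
            NF {
                a++
                while ($1 > a) { print "NA"; a++ }   # fill gaps before this value
                print $1
            }
            END { while (a < max) { print "NA"; a++ } }   # pad the tail up to max
        ' "$f" > "$dir/$(printf '%04d' "$n")"
    done
    paste "$dir"/*                           # zero-padded names keep the file order
    rm -rf "$dir"
)
```

With the sample files, pad_and_paste file1 file2 file3 fills the last row of the second column with NA instead of leaving it empty. A value missing from every file (2 in the sample) still yields a row of nothing but NA, which would need filtering if it is unwanted.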

Source

2013-02-26 18:11:08
