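Compare consecutive rows and multiple columns in awk and randomly select one of the duplicate lines

I have read this question: Compare consecutive rows in awk/(or python) and random select one of duplicate lines. Now I have a follow-up question: how should I change the code if I want to compare not only the x value but also the y value, or even more columns? Perhaps something like: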

if ($1 != prev) && ($2 != prev) ???

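In other words: I want to check whether the x and y values of the current line are the same as the x and y values of the next consecutive line.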

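Data:

The output should look like:

or (because of the random selection):

The code from the link above, which does this for the x value but not with an AND condition on the y value: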

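$ cat tst.awk
# Print one randomly chosen line from the buffered group, then reset the buffer.
function prtBuf(   idx) {         # idx is a local variable
    if (cnt > 0) {
        idx = int((rand() * cnt) + 1)
        print buf[idx]
    }
    cnt = 0
}

BEGIN { srand() }                 # seed the random number generator
$1 != prev { prtBuf() }           # x value changed: flush the previous group
{ buf[++cnt] = $0; prev = $1 }    # buffer the current line
END { prtBuf() }                  # flush the last group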


2016-07-22 Jojo

This should do it:

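# Print one randomly chosen line from the buffered group, then reset the buffer.
function prtBuf(   idx) {                     # idx is a local variable
    if (cnt > 0) {
        idx = int((rand() * cnt) + 1)
        print buf[idx]
    }
    cnt = 0
}

BEGIN { srand() }
$1 != prev1 || $2 != prev2 { prtBuf() }       # either column changed: flush the group
{ buf[++cnt] = $0; prev1 = $1; prev2 = $2 }   # buffer the line, remember both columns
END { prtBuf() }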
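The condition above uses `||` rather than the `&&` guessed in the question: two lines belong to the same group only when both columns match, so a group ends as soon as either column differs. The following sketch illustrates the difference by counting group starts instead of picking random lines. The sample input and the function names `count_groups_or` and `count_groups_and` are hypothetical, made up for this illustration.

```shell
# Hypothetical sample input: columns are x, y and a label.
sample='1 2 a
1 2 b
1 3 c
2 3 d'

# A group ends when EITHER column differs from the previous line.
count_groups_or() {
    awk '$1 != p1 || $2 != p2 { n++ } { p1 = $1; p2 = $2 } END { print n + 0 }'
}

# A group ends only when BOTH columns differ: boundaries get missed.
count_groups_and() {
    awk '$1 != p1 && $2 != p2 { n++ } { p1 = $1; p2 = $2 } END { print n + 0 }'
}

printf '%s\n' "$sample" | count_groups_or    # prints 3
printf '%s\n' "$sample" | count_groups_and   # prints 1
```

With `&&`, the lines `1 2 b`, `1 3 c` and `2 3 d` would all be merged into the first group, because no pair of neighbouring lines differs in both columns at once.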


2016-07-23 17:42:10

Yes, that does it! Well done! If someone wants to do this comparison for more columns, it is easy to change as well. Example for 3 columns: BEGIN { srand() } $1 != prev1 || $2 != prev2 || $3 != prev3 { prtBuf() } { buf[++cnt] = $0; prev1 = $1; prev2 = $2; prev3 = $3 } END { prtBuf() } – Jojo
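Adding one `prevN` variable per column works, but it gets repetitive as the column count grows. One possible generalization, not taken from the thread, is to join the first `n` fields into a single key and compare that key with the previous one. In this sketch the wrapper name `pick_one_per_group` and the variable `n` are assumptions introduced for illustration; `SUBSEP` is awk's built-in subscript separator, used here only as a delimiter unlikely to appear in the data.

```shell
# Hypothetical helper: keep one random line from each run of consecutive
# lines whose first n whitespace-separated fields are identical.
# Usage: pick_one_per_group N < file   (N defaults to 2)
pick_one_per_group() {
    awk -v n="${1:-2}" '
    # Print one randomly chosen buffered line, then reset the buffer.
    function prtBuf(   idx) {
        if (cnt > 0) {
            idx = int(rand() * cnt) + 1
            print buf[idx]
        }
        cnt = 0
    }
    BEGIN { srand() }
    {
        # Build the comparison key from the first n fields.
        key = ""
        for (i = 1; i <= n; i++) key = key SUBSEP $i
        if (key != prev) prtBuf()   # key changed: flush the previous group
        buf[++cnt] = $0
        prev = key
    }
    END { prtBuf() }
    '
}
```

For example, `pick_one_per_group 3 < data.txt` would group on the first three columns, matching the three-column condition in the comment above (`data.txt` is a placeholder file name).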
