A fairly general approach, using awk:
awk 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) print $0 }' dict file
Explanation:
FNR==NR { }            ## FNR is the number of records read so far from the current input file.
                       ## NR is the total number of records read so far across all input files.
                       ## They are equal while the first file is being read, so this condition
                       ## simply means `while we're reading the 1st file, called dict; do ...`
array[$1]++;           ## Use the first column ($1) as a key in an array called `array`.
                       ## I could use $0 (the whole line) here, but since you have said
                       ## that there will only be one integer per line, I decided to use
                       ## $1 (it strips leading and trailing whitespace, if any).
next                   ## Skip the remaining rules and read the next line of `dict`.
for (i=1; i<=NF; i++)  ## Loop through each column of the current line in `file`.
if ($i in array)       ## If this column is a key in the array,
print $0               ## print the whole line.
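To see the whole-column matching in action, here is a small sketch run against made-up data. The contents of `dict` and `file` below are assumptions chosen for illustration (they are not from the question), and the scratch directory exists only so the example is self-contained.

```shell
# Hypothetical sample data in a scratch directory
tmp=$(mktemp -d)
printf '%s\n' 12 7 > "$tmp/dict"
printf '%s\n' 'a 12 b' 'c 120 d' 'e 7 f' 'g 77 h' > "$tmp/file"

# Same awk program as above: load dict keys, then print lines with a matching column
result=$(awk 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) print $0 }' "$tmp/dict" "$tmp/file")

printf '%s\n' "$result"
# a 12 b
# e 7 f

rm -rf "$tmp"
```

The lines containing 120 and 77 are not printed, because each column is compared as a whole against the keys rather than searched for a substring.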
To process multiple files using a bash loop:
## This will process files; like file, file1, file2, file3 ...
## And create output files like, file.out, file1.out, file2.out, file3.out ...
for j in file*; do awk -v FILE="$j.out" 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) print $0 > FILE }' dict "$j"; done
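One thing to watch for: because the glob `file*` also matches names such as `file1.out`, running the loop a second time in the same directory would feed earlier output files back in as input. The sketch below is one possible way to guard against that by skipping anything ending in `.out`. The sample files and the leftover `file1.out` are assumptions made up for the demonstration.

```shell
# Scratch directory with hypothetical sample inputs
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 12 7 > dict
printf '%s\n' 'a 12 b' 'c 120 d' > file1
printf '%s\n' 'e 7 f' 'g 77 h' > file2
: > file1.out    # simulate a leftover output file from a previous run

for j in file*; do
    case $j in
        *.out) continue ;;    # skip output files produced by earlier runs
    esac
    awk -v FILE="$j.out" 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) print $0 > FILE }' dict "$j"
done

cat file1.out    # a 12 b
cat file2.out    # e 7 f
```

Note that awk opens the output file only when the first matching line is printed, so an input file with no matches does not get a `.out` file at all.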
If you are interested in using tee with multiple files, you might want to try something like this:
for j in file*; do awk -v FILE="$j.out" 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) { print $0 > FILE; print FILENAME, $0 } }' dict "$j"; done 2>&1 | tee output
This will show you the name of the file being processed along with each matching record found, and write a 'log' to the file output.
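Do you have to use 'grep', or are you open to other solutions? – 2012-08-16 22:16:29
No. Any command-line tool is fine. – qazwsx 2012-08-16 22:18:41
Your 'grep' command works for me, without the false positives you pointed out. – 2012-08-17 00:48:29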