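Shell: script to group strings by substring
I have a program (sadly, changing it is not an option) that outputs log files of more than 500k lines.
I would like to group the lines of the log file together (and then sort those groups) based on a substring within each line.
For example, I have lines similar to the following: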
SELECT something WHERE TIM BETWEEN '*' AND '*' AND something;
What I am looking to group on is TIM BETWEEN '*' AND '*'
where the * values match between lines, for example:
SELECT something WHERE TIM BETWEEN '2010-03-04' AND '2010-03-10' AND something;
SELECT something WHERE TIM BETWEEN '2011-01-28' AND '2011-02-05' AND something;
SELECT something WHERE TIM BETWEEN '2010-03-04' AND '2010-03-10' AND something;
SELECT something WHERE TIM BETWEEN '2011-01-28' AND '2011-02-05' AND something;
would be grouped in the output as, for example:
SELECT something WHERE TIM BETWEEN '2010-03-04' AND '2010-03-10' AND something;
SELECT something WHERE TIM BETWEEN '2010-03-04' AND '2010-03-10' AND something;
SELECT something WHERE TIM BETWEEN '2011-01-28' AND '2011-02-05' AND something;
SELECT something WHERE TIM BETWEEN '2011-01-28' AND '2011-02-05' AND something;
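Each group would also be sorted on the whole string, so that lines which are more similar end up next to each other.
I have been trying to put together a shell script that outputs what I want from the log file, but have not had any success!
Edit: I should also mention that 'something' can be multiple words, for example: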
SELECT blah1, blah2 or SELECT blah1, blah2, blah3
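Thanks for the answer, Kristofer, but I cannot rely on the number of columns, or on the TIM BETWEEN '*' AND '*' chunk being in the same position from line to line. I have edited the original question to reflect this. – Tristan 2011-05-24 09:26:44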
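You can set the "delimiter" to something other than whitespace to define where a column ends. By doing so, you may be able to perform a multi-step sort in which you change the delimiter between each sort (if a word can be used as a delimiter). -t changes the delimiter. – Kristofer 2011-05-24 10:39:34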
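One way to avoid depending on column positions at all is a decorate-sort-undecorate pipeline: pull the TIM BETWEEN '*' AND '*' chunk out of each line wherever it occurs, prefix the line with it, sort on that prefix and then on the rest of the line, and strip the prefix again. The following is only a minimal sketch under some assumptions: the function name `group_by_tim` is made up for illustration, the extracted chunk is assumed to contain no tab characters, and lines without such a chunk are simply dropped.

```shell
#!/bin/sh
# group_by_tim (hypothetical name): group lines by their
# TIM BETWEEN '...' AND '...' substring, then sort each group by the whole line.
# Reads the files given as arguments, or standard input if there are none.
group_by_tim() {
    # Decorate: emit "<key><TAB><original line>" for every line that contains a key.
    # The single quote is passed in as q to avoid quoting trouble inside the awk program.
    awk -v q="'" '
        BEGIN { re = "TIM BETWEEN " q "[^" q "]*" q " AND " q "[^" q "]*" q }
        match($0, re) { print substr($0, RSTART, RLENGTH) "\t" $0 }
    ' "$@" |
    # Sort: first on the key (field 1), then on the rest of the line (field 2 onward).
    # LC_ALL=C gives plain byte-wise ordering, which is also usually faster.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2 |
    # Undecorate: drop the key and keep the original line.
    cut -f2-
}
```

It could then be called as `group_by_tim app.log > grouped.log`, where both file names are placeholders. Because `sort` works with temporary files when the input does not fit in memory, 500k lines should not be a problem for this approach.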