2017-02-27 73 views

Unique combinations of distinct values in JSON using jq

I have a JSON file (input.json) that looks like this:

{"header1":"a","header2":1a, "header3":1a, "header4":"apple"}, 
{"header1":"b","header2":2a, "header3":2a, "header4":"orange"} 
{"header1":"c","header2":1a, "header3":2a, "header4":"banana"}, 
{"header1":"d","header2":2a, "header3":1a, "header4":"apple"}, 
{"header1":"a","header2":2a, "header3":1a, "header4":"banana"}, 
{"header1":"b","header2":1a, "header3":2a, "header4":"orange"}, 
{"header1":"b","header2":1a, "header3":1a, "header4":"orange"}, 
{"header1":"d","header2":1a, "header3":1a, "header4":"apple"}, 
{"header1":"a","header2":2a, "header3":1a, "header4":"banana"} (repeat of line 5) 

I want to use jq to filter out the unique combinations of the values. The result should look like this:

{"header1":"a","header2":1a, "header3":1a, "header4":"apple"}, 
{"header1":"b","header2":2a, "header3":2a, "header4":"orange"} 
{"header1":"c","header2":1a, "header3":2a, "header4":"banana"}, 
{"header1":"d","header2":2a, "header3":1a, "header4":"apple"}, 
{"header1":"a","header2":2a, "header3":1a, "header4":"banana"}, 
{"header1":"b","header2":1a, "header3":2a, "header4":"orange"}, 
{"header1":"b","header2":1a, "header3":1a, "header4":"orange"}, 
{"header1":"d","header2":1a, "header3":1a, "header4":"apple"} 

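I tried grouping by header1 together with the other headers, but that did not produce unique results. I also tried unique, but it did not give the correct result.

How can I get this? I'm new to jq and haven't found many tutorials.

Thanks

Answers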

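  1. The sample lines you give are not valid JSON. Since your preamble introduces them as JSON, the following assumes that you intended to present an array of JSON objects.

  2. The question is unclear in several respects, but judging from the example, unique looks like what you are after, so consider:

Invocation: jq -c 'unique[]' input.json

Output: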

{"header1":"a","header2":"1a","header3":"1a","header4":"apple"} 
{"header1":"a","header2":"2a","header3":"1a","header4":"banana"} 
{"header1":"b","header2":"1a","header3":"1a","header4":"orange"} 
{"header1":"b","header2":"1a","header3":"2a","header4":"orange"} 
{"header1":"b","header2":"2a","header3":"2a","header4":"orange"} 
{"header1":"c","header2":"1a","header3":"2a","header4":"banana"} 
{"header1":"d","header2":"1a","header3":"1a","header4":"apple"} 
{"header1":"d","header2":"2a","header3":"1a","header4":"apple"} 
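  • If you need the output in some other form, that can be done using jq as well, but since the requirements are not so clear, let's leave that as an exercise :-)

    I have updated my question, please check again. I want to generate the unique combinations of the values by selecting only these specific keys – user2340345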
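
For readers more at home in Python, the effect of `unique[]` on the corrected sample data can be approximated as follows. This is only a sketch: `unique_sorted` is a hypothetical helper, and the sort key assumes that every row has the same set of keys. jq's `unique` returns its result sorted, which is why the output above is not in input order.

```python
import json

KEYS = ("header1", "header2", "header3", "header4")
VALUES = [
    ("a", "1a", "1a", "apple"),
    ("b", "2a", "2a", "orange"),
    ("c", "1a", "2a", "banana"),
    ("d", "2a", "1a", "apple"),
    ("a", "2a", "1a", "banana"),
    ("b", "1a", "2a", "orange"),
    ("b", "1a", "1a", "orange"),
    ("d", "1a", "1a", "apple"),
    ("a", "2a", "1a", "banana"),  # duplicate of the fifth row
]
rows = [dict(zip(KEYS, v)) for v in VALUES]


def unique_sorted(items):
    """Drop duplicate rows and return the rest sorted (hypothetical helper)."""
    # Equal rows serialize to the same string, so the dict keeps one copy of each.
    distinct = {json.dumps(item, sort_keys=True): item for item in items}
    # Assumption: all rows share the same keys, so comparing the values
    # in key order is enough to order the rows.
    return sorted(distinct.values(), key=lambda r: [r[k] for k in sorted(r)])


result = unique_sorted(rows)
for row in result:
    print(json.dumps(row, separators=(",", ":")))
```

Restricting the comparison to a subset of keys would only require building the dictionary key and the sort key from those keys instead of from the whole row.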


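    Since, as peak points out, your input is not valid JSON, I have taken the liberty of correcting it and converting it into a list of individual objects: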

    {"header1":"a","header2":"1a", "header3":"1a", "header4":"apple"} 
    {"header1":"b","header2":"2a", "header3":"2a", "header4":"orange"} 
    {"header1":"c","header2":"1a", "header3":"2a", "header4":"banana"} 
    {"header1":"d","header2":"2a", "header3":"1a", "header4":"apple"} 
    {"header1":"a","header2":"2a", "header3":"1a", "header4":"banana"} 
    {"header1":"b","header2":"1a", "header3":"2a", "header4":"orange"} 
    {"header1":"b","header2":"1a", "header3":"1a", "header4":"orange"} 
    {"header1":"d","header2":"1a", "header3":"1a", "header4":"apple"} 
    {"header1":"a","header2":"2a", "header3":"1a", "header4":"banana"} 
    

    If this data is in data.json and you run

    jq -M -s -f filter.jq data.json 
    

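    with the following filter.jq

    # The state object records, as a nested path, every combination of values seen so far.
    # A row whose path is already set is skipped; otherwise the path is set and the row is emitted.
    foreach .[] as $r (
        {} 
    ; ($r | map(.)) as $p | if getpath($p) then empty else setpath($p;1) end 
    ; $r 
    ) 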
    

    it will produce the following output, in the original order but without the duplicates.

    {"header1":"a","header2":"1a","header3":"1a","header4":"apple"} 
    {"header1":"b","header2":"2a","header3":"2a","header4":"orange"} 
    {"header1":"c","header2":"1a","header3":"2a","header4":"banana"} 
    {"header1":"d","header2":"2a","header3":"1a","header4":"apple"} 
    {"header1":"a","header2":"2a","header3":"1a","header4":"banana"} 
    {"header1":"b","header2":"1a","header3":"2a","header4":"orange"} 
    {"header1":"b","header2":"1a","header3":"1a","header4":"orange"} 
    {"header1":"d","header2":"1a","header3":"1a","header4":"apple"} 
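
The same first-occurrence-wins idea can be sketched in Python for comparison. This is an illustration rather than a translation of the jq filter: `dedupe_in_order` is a hypothetical name, and a set of value tuples stands in for the nested state object that the filter builds with `setpath`.

```python
KEYS = ("header1", "header2", "header3", "header4")
VALUES = [
    ("a", "1a", "1a", "apple"),
    ("b", "2a", "2a", "orange"),
    ("c", "1a", "2a", "banana"),
    ("d", "2a", "1a", "apple"),
    ("a", "2a", "1a", "banana"),
    ("b", "1a", "2a", "orange"),
    ("b", "1a", "1a", "orange"),
    ("d", "1a", "1a", "apple"),
    ("a", "2a", "1a", "banana"),  # duplicate of the fifth row
]
rows = [dict(zip(KEYS, v)) for v in VALUES]


def dedupe_in_order(items):
    """Keep the first occurrence of each row, preserving input order."""
    seen = set()
    kept = []
    for item in items:
        # Values-only key, in the spirit of ($r | map(.)) in the filter.
        key = tuple(item.values())
        if key in seen:
            continue  # already emitted once, so skip it
        seen.add(key)
        kept.append(item)
    return kept


result = dedupe_in_order(rows)
for row in result:
    print(row)
```

Only the ninth row is dropped here, so the first eight rows come out in their original order.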
    

    Note that

    ($r | map(.)) 
    

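    is used to produce, from each row, an array containing only the values, which is then used as a path and is assumed to always yield a unique key. That is true for the sample data, but may not be true for more complex values.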
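
A small Python illustration of how a values-only key can collide. The two rows below are hypothetical and are not part of the sample data; `values_key` is likewise an illustrative helper.

```python
def values_key(row):
    """Build a key from the values only, ignoring which key each value sits under."""
    return tuple(row.values())


# Hypothetical rows: same values, but stored under different keys.
row_a = {"header1": "x", "header2": "y"}
row_b = {"header1": "x", "header3": "y"}

rows_differ = row_a != row_b
keys_collide = values_key(row_a) == values_key(row_b)

print(rows_differ)   # True: the rows are different objects
print(keys_collide)  # True: yet their values-only keys are identical
```

With such input, a filter keyed on values alone would treat the second row as a duplicate of the first and drop it.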

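    A slower but more robust filter.jq

    # Same approach, but the path is a single key: the row serialized as a JSON string.
    foreach .[] as $r (
        {} 
    ; [$r | tojson] as $p | if getpath($p) then empty else setpath($p;1) end 
    ; $r 
    ) 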
    

    uses the JSON representation of the entire row as a unique key to determine whether a row has been seen before.
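
Keying on the serialized row can be sketched in Python as well. The names and the sample rows below are hypothetical. Passing `sort_keys=True` is a choice made for this sketch, so that two objects differing only in key order count as the same row; it is not a statement about how jq's `tojson` orders keys.

```python
import json


def dedupe_by_json(items):
    """Yield each distinct row once, in input order, keyed on its JSON text."""
    seen = set()
    for item in items:
        key = json.dumps(item, sort_keys=True)  # whole row as one string key
        if key not in seen:
            seen.add(key)
            yield item


# Hypothetical rows, including nested values and a same-values/different-keys pair.
rows = [
    {"header1": "x", "header2": "y"},
    {"header1": "x", "header3": "y"},
    {"header1": "x", "header2": "y"},
    {"header1": "z", "header2": {"nested": [1, 2]}},
    {"header1": "z", "header2": {"nested": [1, 2]}},
]

result = list(dedupe_by_json(rows))
for row in result:
    print(json.dumps(row))
```

Because the key includes the key names and the full nested structure, the first two rows are kept as distinct while the exact repeats are removed.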