2017-04-09 185 views

MongoDB aggregate results into nested array

I am fairly new to MongoDB and have run into a problem. Below are 2 sample records from my database:

{ 
    "_id": 1, 
    "Record": 1, 
    "Link": [ "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html" ], 
    "Location": [ "USA", "PAN", "USA", "USA", "PAN" ], 
    "Organization": [ "GN", "SOUTHCOM", "UCMJ", "PRC" ], 
    "Date": [ "2016" ], 
    "People": [ "P.Walter" ] 
} 
{ 
    "_id": 2, 
    "Record": 2, 
    "Link": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html" ], 
    "Location": [ "NIC", "GTM", "JAM", "GTM", "PAN" ], 
    "Organization": [ "CENTAM", "Calibre Mining Corporation", "STRATFOR", "Alder Resources" ], 
    "Date": [ "2013" ], 
    "People": [ "Daniel Ortega", "Hugo Chavez", "Paulo Gregoire" ] 
} 

Basically, I am trying to get output like this:

{ 
    "Country": "US", 
    "Years": [ 
     { 
      "Year": "2016", 
      "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html", 
      "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html", 
      "https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ] 
     }, 
     { 
      "Year": "2013", 
      "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html", 
      "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html", 
      "https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ] 
     } 
    ], 
    "Link_Count": 6 
} 
{ 
    "Country": "UK", 
    "Years": [ 
     { 
      "Year": "2009", 
      "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html", 
      "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html", 
      "https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ] 
     }, 
     { 
      "Year": "2011", 
      "Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html", 
      "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html"] 
     } 
    ], 
    "Link_Count": 5 
} 

I tried using aggregation, but I could not get the result shown in the output above. This is my query:

db.test.aggregate([ 
{ 
"$unwind": "$Location" 
}, 
{ 
    "$group" : { 
     "_id": { 
      "Country": "$Location", 
      "Year": "$Date", 
      "Links": "$Link" 
     }, 
     Loc: { 
      $addToSet: "$Location" 
     } 
    } 
}, 
{ 
    "$unwind": "$Loc" 
}, 
{ 
    "$group": { 
     "_id": "$Loc", 
     "Years": { "$push": { 
      "Year": "$_id.Year", 
      "Links": "$_id.Links" 
      } 
     } 
    } 
} 
]).toArray() 

I used $unwind and $addToSet on $Location because $Location contains duplicates. I am open to any suggestions or solutions, so please let me know. Thanks in advance!

Answer


You can use:

db.test.aggregate([{ 
    "$unwind": "$Location" 
}, { 
    "$unwind": "$Date" 
}, { 
    "$unwind": "$Link" 
}, { 
    "$group": { 
     "_id": { 
      "Country": "$Location", 
      "Year": "$Date" 
     }, 
     Links: { 
      $addToSet: "$Link" 
     } 
    } 
}, { 
    "$group": { 
     "_id": "$_id.Country", 
     Years: { 
      $push: { 
       "Year": "$_id.Year", 
       "Links": "$Links" 
      } 
     }, 
     Link_Count: { $sum: { $size: "$Links" } } 
    } 
}]) 

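The idea is to $unwind all the arrays so that the links can be collected into a new array for each Country and Year pair, then $push those per-year entries under each country and count the grouped links with $size in the last $group stage.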
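To make the stages easier to follow, here is a minimal sketch that replays them in plain JavaScript on the two sample records, without a MongoDB server. The helper names (unwind, groupByCountryYear, groupByCountry, runPipeline) are made up for this illustration and are not MongoDB APIs; the sketch only mimics what $unwind, $addToSet, $push and $sum over $size do in the query above.

```javascript
// Plain-JavaScript simulation of the aggregation stages (no MongoDB needed).
// Only the fields the pipeline touches are kept from the sample records.
const docs = [
  { _id: 1,
    Link: ["https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html"],
    Location: ["USA", "PAN", "USA", "USA", "PAN"],
    Date: ["2016"] },
  { _id: 2,
    Link: ["https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html"],
    Location: ["NIC", "GTM", "JAM", "GTM", "PAN"],
    Date: ["2013"] },
];

// Mimics $unwind: one copy of each row per element of the array field.
function unwind(rows, field) {
  return rows.flatMap(row => row[field].map(value => ({ ...row, [field]: value })));
}

// Mimics the first $group: key on (Country, Year); a Set plays the role of $addToSet.
function groupByCountryYear(rows) {
  const groups = new Map();
  for (const row of rows) {
    const key = JSON.stringify([row.Location, row.Date]);
    if (!groups.has(key)) {
      groups.set(key, { Country: row.Location, Year: row.Date, Links: new Set() });
    }
    groups.get(key).Links.add(row.Link);
  }
  return [...groups.values()].map(g => ({ ...g, Links: [...g.Links] }));
}

// Mimics the second $group: key on Country, push one {Year, Links} entry
// per year, and add up the per-year array sizes like $sum over $size.
function groupByCountry(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.Country)) {
      groups.set(row.Country, { _id: row.Country, Years: [], Link_Count: 0 });
    }
    const group = groups.get(row.Country);
    group.Years.push({ Year: row.Year, Links: row.Links });
    group.Link_Count += row.Links.length;
  }
  return [...groups.values()];
}

function runPipeline(input) {
  let rows = unwind(input, "Location");
  rows = unwind(rows, "Date");
  rows = unwind(rows, "Link");
  return groupByCountry(groupByCountryYear(rows));
}

const result = runPipeline(docs);
console.log(JSON.stringify(result, null, 2));
```

With the two sample records, PAN appears in both documents, so its group ends up with two Years entries, while USA collapses its three duplicate Location values into a single entry for 2016.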


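Hi, thanks for the idea! I managed to get the output I wanted with the query above. However, I would like to ask whether I can use $addToSet on Links, because I only want unique links and the query above returns duplicates. I really appreciate your help, and sorry for the trouble! – Levi0000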


See my updated answer to get unique values in the 'Years.Links' array and to make 'Link_Count' the count of unique 'Links' values for that country –
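The comment above separates two notions of uniqueness: links that are unique within one year, and links that are unique across a whole country. As a rough illustration of the difference, the plain-JavaScript sketch below compares the two counts on a hypothetical result document. The document, its example.org URLs, and the function names are assumptions made for this illustration; this is not the answerer's query.

```javascript
// Hypothetical result document shaped like the pipeline output.
// The URLs are placeholders, not data from the question.
const country = {
  _id: "PAN",
  Years: [
    { Year: "2016", Links: ["https://example.org/a", "https://example.org/b"] },
    { Year: "2013", Links: ["https://example.org/b"] },
  ],
};

// Adds up the per-year array sizes, which is what $sum over $size yields.
// A link that appears in two different years is counted twice.
function summedLinkCount(doc) {
  return doc.Years.reduce((total, year) => total + year.Links.length, 0);
}

// Counts distinct links across all years of the country.
function distinctLinkCount(doc) {
  return new Set(doc.Years.flatMap(year => year.Links)).size;
}

console.log(summedLinkCount(country));   // 3
console.log(distinctLinkCount(country)); // 2
```

If the same link can show up under several years for one country, the two numbers diverge, so it is worth deciding which one Link_Count should report.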