2017-05-24 34 views

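Elasticsearch: order by number of matches in nested fields

Beginner here, quite possibly trying to do the impossible. I have the following structure that I would like to store in Elasticsearch: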

{
  "id": 1,
  "code": "03f3301c-4089-11e7-a919-92ebcb67fe33",
  "countries": [
    {
      "id": 1,
      "name": "Netherlands"
    },
    {
      "id": 2,
      "name": "United Kingdom"
    }
  ],
  "tags": [
    {
      "id": 1,
      "name": "Scanned"
    },
    {
      "id": 2,
      "name": "Secured"
    },
    {
      "id": 3,
      "name": "Cleared"
    }
  ]
}

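I have full control over how it is stored, so the structure can change, but it should contain all of these fields in some form. I want to be able to query the data by countries and tags in such a way that every item with at least one match is returned, sorted by the number of matches. If possible, I would prefer not to do a full-text search.

For example: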

id, code, country ids, tag ids 
1, ..., [1, 2, 3], [1] 
2, ..., [1],   [1, 2, 3] 

For a question like "which of these was in country 1 or has tag 1 or has tag 2", it should return:

2, ..., [1], [1, 2, 3] 
1, ..., [1, 2, 3], [1] 

In that order, because row 2 matches more sub-queries of the disjunction above than row 1 does.
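The ordering rule can be sketched in plain Python, independent of Elasticsearch. This is only an illustration of the intended result; the names packages, sub_queries, count_matches and rank are made up for the example and do not come from any library.

```python
# Rows from the example above: id -> country ids and tag ids.
packages = {
    1: {"countries": [1, 2, 3], "tags": [1]},
    2: {"countries": [1], "tags": [1, 2, 3]},
}

# The disjunction "country 1 OR tag 1 OR tag 2" as (field, id) sub-queries.
sub_queries = [("countries", 1), ("tags", 1), ("tags", 2)]


def count_matches(package, sub_queries):
    """Count how many sub-queries the package satisfies."""
    return sum(1 for field, wanted in sub_queries if wanted in package[field])


def rank(packages, sub_queries):
    """Return (id, match count) pairs with at least one match, most matches first."""
    counted = [(pid, count_matches(p, sub_queries)) for pid, p in packages.items()]
    matched = [pair for pair in counted if pair[1] > 0]
    return sorted(matched, key=lambda pair: pair[1], reverse=True)


print(rank(packages, sub_queries))  # [(2, 3), (1, 2)]
```

Row 2 satisfies all three sub-queries and row 1 satisfies two, which is why row 2 comes first.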

Essentially, I want to replicate this SQL query:

SELECT p.id, p.code, COUNT(p.id) FROM packages p 
LEFT JOIN tags t ON t.package_id = p.id 
LEFT JOIN countries c ON c.package_id = p.id 
WHERE t.id IN (1, 2, 3) OR c.id IN (1, 2, 3) 
GROUP BY p.id 
ORDER BY COUNT(p.id) DESC;

I am using Elasticsearch 2.4.5, if that matters.

I hope I have been clear enough. Thanks for your help!

Answers

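You need countries and tags to be of type nested. You also need to take control of the scoring with function_score: give each filter inside function_score a weight of 1, and set boost_mode and score_mode so that the weights of the matching filters are summed and that sum replaces the score of the inner query. In the end, you can use this query: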

GET /nested/test/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "filter": {
            "nested": {
              "path": "tags",
              "query": {
                "term": {
                  "tags.id": 1
                }
              }
            }
          },
          "weight": 1
        },
        {
          "filter": {
            "nested": {
              "path": "tags",
              "query": {
                "term": {
                  "tags.id": 2
                }
              }
            }
          },
          "weight": 1
        },
        {
          "filter": {
            "nested": {
              "path": "countries",
              "query": {
                "term": {
                  "countries.id": 1
                }
              }
            }
          },
          "weight": 1
        }
      ],
      "boost_mode": "replace",
      "score_mode": "sum"
    }
  }
}
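Since the query repeats one function entry per id, it may be convenient to generate the body from lists of ids. The following is a minimal sketch that only builds the request body as a Python dict; the helper names nested_term_function and build_match_count_query are hypothetical, and sending the body to Elasticsearch is left out.

```python
import json


def nested_term_function(path, value, weight):
    """Build one function_score entry: a nested term filter on <path>.id with a weight."""
    return {
        "filter": {
            "nested": {
                "path": path,
                "query": {"term": {path + ".id": value}},
            }
        },
        "weight": weight,
    }


def build_match_count_query(tag_ids, country_ids, weight=1):
    """Build the same function_score body as above, with one function per requested id."""
    functions = [nested_term_function("tags", t, weight) for t in tag_ids]
    functions += [nested_term_function("countries", c, weight) for c in country_ids]
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": functions,
                "boost_mode": "replace",
                "score_mode": "sum",
            }
        }
    }


# Reproduces the query shown above: tags 1 and 2, country 1.
body = build_match_count_query(tag_ids=[1, 2], country_ids=[1])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the GET /nested/test/_search request.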

For a more complete test case, here are the mapping and some test data as well:

PUT nested
{
  "mappings": {
    "test": {
      "properties": {
        "tags": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "countries": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

POST nested/test/_bulk 
{"index":{"_id":1}} 
{"name":"Foo Bar","tags":[{"id":2,"name":"My Tag 5"},{"id":3,"name":"My Tag 7"}],"countries":[{"id":1,"name":"USA"}]} 
{"index":{"_id":2}} 
{"name":"Foo Bar","tags":[{"id":3,"name":"My Tag 6"}],"countries":[{"id":1,"name":"USA"},{"id":2,"name":"UK"},{"id":3,"name":"UAE"}]} 
{"index":{"_id":3}} 
{"name":"Foo Bar","tags":[{"id":1,"name":"My Tag 4"},{"id":3,"name":"My Tag 1"}],"countries":[{"id":3,"name":"UAE"}]} 
{"index":{"_id":4}} 
{"name":"Foo Bar","tags":[{"id":1,"name":"My Tag 1"},{"id":2,"name":"My Tag 4"},{"id":3,"name":"My Tag 2"}],"countries":[{"id":2,"name":"UK"},{"id":3,"name":"UAE"}]} 

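Thank you, this works perfectly! One small change I had to make was to set the "weight" of all functions to 2 and add a "min_score" of 2, because a document that does not match any of the filters would still get a "score" of 1. – Robert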
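The effect described in the comment above can be illustrated with a small Python model of the scoring. It is a simplification, not Elasticsearch itself: it assumes, as the comment reports, that a document matching no filter ends up with a score of 1, and it breaks ties by id only to keep the output deterministic. Document 5 is hypothetical and was added to show a document that matches nothing; model_score and model_search are made-up names.

```python
# Test documents from the bulk request above, reduced to their tag and country ids.
docs = {
    1: {"tags": [2, 3], "countries": [1]},
    2: {"tags": [3], "countries": [1, 2, 3]},
    3: {"tags": [1, 3], "countries": [3]},
    4: {"tags": [1, 2, 3], "countries": [2, 3]},
    5: {"tags": [3], "countries": [3]},  # hypothetical: matches none of the filters
}

# The three filters of the query: tag 1, tag 2, country 1.
filters = [("tags", 1), ("tags", 2), ("countries", 1)]


def model_score(doc, filters, weight):
    """Sum the weights of matching filters; assume 1.0 when nothing matches."""
    matched = sum(1 for field, wanted in filters if wanted in doc[field])
    return float(matched * weight) if matched else 1.0


def model_search(docs, filters, weight, min_score=None):
    """Return (id, score) hits, highest score first, dropping hits below min_score."""
    hits = [(doc_id, model_score(doc, filters, weight)) for doc_id, doc in docs.items()]
    if min_score is not None:
        hits = [hit for hit in hits if hit[1] >= min_score]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Weight 1: document 5 is indistinguishable from single-match documents 2 and 3.
print(model_search(docs, filters, weight=1))
# Weight 2 with min_score 2: document 5 (score 1.0) is filtered out.
print(model_search(docs, filters, weight=2, min_score=2))
```

With a weight of 1, one match and no match both produce a score of 1, so a threshold cannot separate them; doubling the weight makes the smallest real match score 2, which min_score can then enforce.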