2012-04-04 63 views
2

SQL query not grouping correctly

For some reason, the following query returns duplicate names. Why is that?

SELECT id, name_without_variants, SUM(relevance) AS total_relevance FROM (
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(card_definitions.name_without_variants) AGAINST ('lost soul site discard')) * 0.40 AS relevance
    FROM card_definitions
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(card_def_identities.special_ability_text) AGAINST ('lost soul site discard')) * 0.05 AS relevance
    FROM card_def_identities
    INNER JOIN card_definitions ON card_def_identities.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(brigades.brigade_color) AGAINST ('lost soul site discard')) * 0.30 AS relevance
    FROM brigades
    INNER JOIN card_def_brigades ON brigades.id = card_def_brigades.brigade_sid
    INNER JOIN card_definitions ON card_def_brigades.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(identifiers.identifier) AGAINST ('lost soul site discard')) * 0.20 AS relevance
    FROM identifiers
    INNER JOIN card_def_identifiers ON identifiers.id = card_def_identifiers.identifier_sid
    INNER JOIN card_definitions ON card_def_identifiers.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
    UNION
    SELECT
        card_definitions.id,
        card_definitions.name_without_variants,
        (MATCH(card_effects.effect) AGAINST ('lost soul site discard')) * 0.05 AS relevance
    FROM card_effects
    INNER JOIN card_def_effects ON card_effects.id = card_def_effects.effect_sid
    INNER JOIN card_definitions ON card_def_effects.card_def_sid = card_definitions.id
    GROUP BY name_without_variants, id
) AS combined_search
GROUP BY name_without_variants, id
HAVING total_relevance > 0
ORDER BY total_relevance DESC
LIMIT 10;

This is the result I get. Note the two Lost Soul [Site Doubler] rows:

2623 Lost Soul [Deck Discard] 6.35151714086533 
1410 Lost Soul [Hand Discard] 6.29273346662521 
1495 Lost Soul [Discard Card] 5.93360201716423 
1442 Lost Soul [Demon Discard] 5.91308708190918 
1497 Lost Soul [Site Doubler] 5.05888686180115 
1498 Lost Soul [Site Doubler] 5.05888686180115 
2572 Lost Soul [Site Guard] 4.82421946525574 
2774 Lost Soul [Far Country] 3.39325473308563 
2891 Fortify Site [RoA2] 2.77084048986435 
1418 Lost Soul [Hopper] 2.63041100502014 

Answers

2

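Because the ids are different and you are grouping by id, you get one row for each distinct (name, id) pair; that is what GROUP BY does. If you change the top-level SELECT to: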

SELECT name_without_variants, SUM(relevance) as total_relevance 

and the outer GROUP BY to:

GROUP BY name_without_variants 

you should see distinct names, but you will no longer have the id.
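To see the two groupings side by side, here is a minimal sketch on a toy table. The table `combined_search_demo` and its rows are made up for illustration; they only stand in for the derived table `combined_search` from the question.

```sql
-- Hypothetical stand-in for the combined_search derived table.
CREATE TABLE combined_search_demo (
    id INT,
    name_without_variants VARCHAR(100),
    relevance DOUBLE
);

INSERT INTO combined_search_demo VALUES
    (1497, 'Lost Soul [Site Doubler]', 3.0),
    (1497, 'Lost Soul [Site Doubler]', 2.0),
    (1498, 'Lost Soul [Site Doubler]', 3.0),
    (1498, 'Lost Soul [Site Doubler]', 2.0),
    (2572, 'Lost Soul [Site Guard]',   4.8);

-- Grouping by (name, id): one row per distinct pair,
-- so 'Lost Soul [Site Doubler]' appears once for 1497 and once for 1498.
SELECT id, name_without_variants, SUM(relevance) AS total_relevance
FROM combined_search_demo
GROUP BY name_without_variants, id;

-- Grouping by name only: one row per name, and id is no longer selected.
SELECT name_without_variants, SUM(relevance) AS total_relevance
FROM combined_search_demo
GROUP BY name_without_variants;
```

Note that grouping by name alone also adds together the relevance of every card sharing that name, so a name with two ids ends up with twice the score it had per card. Whether that is acceptable depends on how the ranking is used.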


If you remove id from the GROUP BY clause, you must also apply an aggregate function to the id column. You will need MIN/MAX or something similar. – 2012-04-04 17:24:41

0
GROUP BY name_without_variants, id 

You are grouping by name_without_variants, id. The id differs between these two records:

1497 Lost Soul [Site Doubler] 5.05888686180115 
1498 Lost Soul [Site Doubler] 5.05888686180115 

You need to decide how to handle id.

Remove id from the GROUP BY, then apply an aggregate function to the id column in the SELECT. Or just drop the column altogether.
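One way to keep a single id per name could look like the sketch below. It runs on a toy table; `card_hits_demo`, `per_card`, and `best_relevance` are hypothetical names, and the inner query only stands in for the question's existing query grouped by name and id. `MIN(id)` simply picks the lowest id in each name group, which is an arbitrary but repeatable choice.

```sql
-- Hypothetical toy data: several relevance rows per card id.
CREATE TABLE card_hits_demo (
    id INT,
    name_without_variants VARCHAR(100),
    relevance DOUBLE
);

INSERT INTO card_hits_demo VALUES
    (1497, 'Lost Soul [Site Doubler]', 3.0),
    (1497, 'Lost Soul [Site Doubler]', 2.0),
    (1498, 'Lost Soul [Site Doubler]', 3.0),
    (1498, 'Lost Soul [Site Doubler]', 2.0),
    (2572, 'Lost Soul [Site Guard]',   4.8);

-- Step 1 (inner query): total relevance per card, as in the question.
-- Step 2 (outer query): collapse cards that share a name, keeping the
-- lowest id and the best per-card score instead of summing across cards.
SELECT MIN(id) AS id,
       name_without_variants,
       MAX(total_relevance) AS best_relevance
FROM (
    SELECT id, name_without_variants, SUM(relevance) AS total_relevance
    FROM card_hits_demo
    GROUP BY id, name_without_variants
) AS per_card
GROUP BY name_without_variants
HAVING best_relevance > 0
ORDER BY best_relevance DESC
LIMIT 10;
```

Because `MIN(id)` and `MAX(total_relevance)` are computed independently, the id returned is not guaranteed to belong to the card with the best score. In the question's output the two duplicates score identically, so it makes no difference there.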

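Below is an example simplified into a single query. Please understand that I do not have a full view of your schema or data, and this is untested. I am also making some assumptions here. However, if the schema is relational, this should bring back what you are looking for: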

SELECT cd.id, cd.name_without_variants,
       (((MATCH(cd.name_without_variants) AGAINST ('lost soul site discard')) * 0.40) +
        ((MATCH(cdi.special_ability_text) AGAINST ('lost soul site discard')) * 0.05) +
        ((MATCH(b.brigade_color) AGAINST ('lost soul site discard')) * 0.30) +
        ((MATCH(i.identifier) AGAINST ('lost soul site discard')) * 0.20) +
        ((MATCH(ce.effect) AGAINST ('lost soul site discard')) * 0.05)
       ) AS total_relevance
FROM card_definitions cd
-- Join tables and keys below follow the queries in the question.
LEFT OUTER JOIN card_def_identities cdi ON cdi.card_def_sid = cd.id
LEFT OUTER JOIN card_def_brigades cdb ON cdb.card_def_sid = cd.id
LEFT OUTER JOIN brigades b ON b.id = cdb.brigade_sid
LEFT OUTER JOIN card_def_identifiers cdn ON cdn.card_def_sid = cd.id
LEFT OUTER JOIN identifiers i ON i.id = cdn.identifier_sid
LEFT OUTER JOIN card_def_effects cde ON cde.card_def_sid = cd.id
LEFT OUTER JOIN card_effects ce ON ce.id = cde.effect_sid
GROUP BY cd.id, cd.name_without_variants
HAVING total_relevance > 0
ORDER BY total_relevance DESC
LIMIT 10;

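Oh, I thought GROUP BY would group by name first and then, if there were still duplicates, group those by id. I still need the id, though. How can I get just one id and make sure there are no duplicate names? – LordZardeck 2012-04-04 17:33:28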


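You have to use MIN/MAX on the id column. However, you will then get an arbitrary id from the union select statements. It may be wise to remove the id from the top-level SELECT, since it does not mean much. Otherwise, you will have to add a static column to your union select statements so that you know which statement the id came from. Make sense? I can edit the answer to show this if needed. – 2012-04-04 17:38:52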
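The static-column idea from the comment above might be sketched like this. Everything here is an assumption for illustration: the toy tables `demo_name_hits` and `demo_effect_hits` stand in for two of the UNION branches, and `source` is a made-up column name. `GROUP_CONCAT` is MySQL-specific, and `UNION ALL` is used instead of `UNION` so that no rows are dropped as duplicates before aggregation.

```sql
-- Hypothetical toy tables standing in for two of the UNION branches.
CREATE TABLE demo_name_hits (
    id INT, name_without_variants VARCHAR(100), relevance DOUBLE
);
CREATE TABLE demo_effect_hits (
    id INT, name_without_variants VARCHAR(100), relevance DOUBLE
);

INSERT INTO demo_name_hits VALUES
    (1497, 'Lost Soul [Site Doubler]', 4.0),
    (1498, 'Lost Soul [Site Doubler]', 4.0);
INSERT INTO demo_effect_hits VALUES
    (1497, 'Lost Soul [Site Doubler]', 1.0);

-- Each branch tags its rows with a constant, so the outer query can
-- report which branches contributed to each name.
SELECT MIN(id) AS id,
       name_without_variants,
       GROUP_CONCAT(DISTINCT source) AS sources
FROM (
    SELECT id, name_without_variants, 'name' AS source, relevance
    FROM demo_name_hits
    UNION ALL
    SELECT id, name_without_variants, 'effect' AS source, relevance
    FROM demo_effect_hits
) AS combined_search
GROUP BY name_without_variants;
```

The `source` tag only records where rows came from; it does not by itself decide which id is kept, which is still the job of `MIN(id)` or `MAX(id)`.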


I need the id. It is the only thing I actually use. If you don't mind, could I see an example? I don't know anything about MIN/MAX. – LordZardeck 2012-04-04 17:45:25