2015-11-20 65 views

Creating a scoring algorithm across multiple tables using MySQL

I have 3 tables:

(1) films

id title 
1 AAA 
2 BBB 
3 CCC 
4 DDD 
5 EEE 

(2) genres

id film_id genre 
1 1  Action 
2 1  Comedy 
3 1  Horror 
4 2  Action 
5 2  Comedy 
6 3  Action 
7 3  Drama 
8 4  Sci-Fi 
9 4  Drama 
10 4  Western 
10 5  Romance 
10 5  Musical 
10 5  Avant-Garde 

(3) directors

id film_id director 
1 1  John Smith 
2 2  John Smith 
3 2  Ann Coates 
4 3  Tom Jones 
5 4  Ann Coates 
6 5  John Smith 

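I'm writing an algorithm that gives each film a score based on how closely it matches film #1: every matching genre scores 5 points and every matching director scores 100 points.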

When I compare just two tables, films and genres, this query gives the expected results:

SELECT f1.id as original_film_id, f2.id as matching_film_id, SUM(if(g1.genre = g2.genre,5,0)) as score 
FROM films f1 
JOIN films f2 
LEFT JOIN genres g1 ON f1.id = g1.film_id 
LEFT JOIN genres g2 ON f2.id = g2.film_id 
WHERE f1.id = 1 
GROUP BY f2.id 
HAVING score > 0 
ORDER BY score DESC; 

Results:

original_film_id matching_film_id score 
1     1     15 
1     2     10 
1     3     5 

That is, 3 genres in film #1 match the 3 genres in film #1 (obviously), 2 genres in film #2 match, and 1 genre in film #3 matches.

However, I don't understand the results I get when I add the directors table with this query:

SELECT f1.id as original_film_id, f2.id as matching_film_id, 
SUM(if(g1.genre = g2.genre,5,0)) 
+ SUM(IF(d1.director = d2.director,100,0)) as score 
FROM films f1 
JOIN films f2 
LEFT JOIN genres g1 ON f1.id = g1.film_id 
LEFT JOIN genres g2 ON f2.id = g2.film_id 
LEFT JOIN directors d1 ON f1.id = d1.film_id 
LEFT JOIN directors d2 ON f2.id = d2.film_id 
WHERE f1.id = 1 
GROUP BY f2.id 
HAVING score > 0 
ORDER BY score DESC; 

I expected to see these results:

original_film_id matching_film_id score 
1     1     115 
1     2     110 
1     5     100 
1     3     5 

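...because film #1 has the same genres and the same director, film #2 has 2 of the same genres and the same director, film #5 has the same director but no matching genres, and so on.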

But instead I see these results:

original_film_id matching_film_id score 
1     1     915 
1     2     620 
1     5     300 
1     3     5 

And I simply can't figure out why! Thanks for any help.

Answer

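Because the joins match many rows against each other (every genre row is paired with every director row), each match is counted more than once. You can see this if you remove the GROUP BY and the SUMs: every row that feeds into the sums is then listed individually.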
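To make the duplication visible, here is a minimal sketch of that un-aggregated query. It keeps the joins from the question's second query, drops the GROUP BY and the SUMs, and is restricted to film #2 only to keep the output short; the column aliases (`g1_genre`, `genre_points`, and so on) are names chosen here for illustration.

```sql
-- Same joins as the question's second query, limited to film #2,
-- with no GROUP BY or SUM so every joined row is shown.
SELECT f2.id AS matching_film_id,
       g1.genre AS g1_genre,
       g2.genre AS g2_genre,
       d1.director AS d1_director,
       d2.director AS d2_director,
       IF(g1.genre = g2.genre, 5, 0) AS genre_points,
       IF(d1.director = d2.director, 100, 0) AS director_points
FROM films f1
JOIN films f2
LEFT JOIN genres g1 ON f1.id = g1.film_id
LEFT JOIN genres g2 ON f2.id = g2.film_id
LEFT JOIN directors d1 ON f1.id = d1.film_id
LEFT JOIN directors d2 ON f2.id = d2.film_id
WHERE f1.id = 1
  AND f2.id = 2
ORDER BY g1.genre, g2.genre, d2.director;
```

With the tables as listed, film #1 has 3 genre rows and 1 director row, and film #2 has 2 genre rows and 2 director rows, so the join produces 3 x 2 x 1 x 2 = 12 rows. John Smith appears as `d2` in 6 of them (600 points), and each of the 2 genre matches is repeated once per `d2` row, so 4 times (20 points). That adds up to the 620 reported in the question.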

Instead, you can compute the genre score and the director score independently and then combine them.

select id, film_id, sum(s) as score from (
    select  a.id, c.film_id, sum(5) s 
    from  films a 
    left join genres b on(a.id = b.film_id) 
    left join genres c on(b.genre = c.genre) 
    where a.id = 1 
    group by a.id, c.film_id 
    union all 
    select  a.id, c.film_id, sum(100) s 
    from  films a 
    left join directors b on(a.id = b.film_id) 
    left join directors c on(b.director = c.director) 
    where a.id = 1 
    group by a.id, c.film_id 
) q 
group by id, film_id 
order by id, score desc 
; 
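As an alternative sketch, not part of the query above, the original joins can be kept if each matching genre and director is counted once with COUNT(DISTINCT ...) rather than summed per joined row. This assumes the score should count each distinct genre name and director name once per film; `f1.id` is added to the GROUP BY only so the query also runs when ONLY_FULL_GROUP_BY is enabled.

```sql
-- Count distinct matching names instead of summing over duplicated rows.
-- The CASE yields NULL for non-matching rows, and COUNT ignores NULLs.
SELECT f1.id AS original_film_id,
       f2.id AS matching_film_id,
       5 * COUNT(DISTINCT CASE WHEN g1.genre = g2.genre
                               THEN g2.genre END)
     + 100 * COUNT(DISTINCT CASE WHEN d1.director = d2.director
                                 THEN d2.director END) AS score
FROM films f1
JOIN films f2
LEFT JOIN genres g1 ON f1.id = g1.film_id
LEFT JOIN genres g2 ON f2.id = g2.film_id
LEFT JOIN directors d1 ON f1.id = d1.film_id
LEFT JOIN directors d2 ON f2.id = d2.film_id
WHERE f1.id = 1
GROUP BY f1.id, f2.id
HAVING score > 0
ORDER BY score DESC;
```

The joined row count still grows with the number of genres and directors per film, so on large tables the UNION ALL approach is likely the cheaper of the two.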

This is brilliant, thank you very much. – huey