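Global bounce rate
Group the data by guid and hostname (twice, filtering the second grouping so that it keeps only the bounces), then outer join the two together: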
SELECT count(bounces.guid) `bounces`,
       count(uniqueUsers.guid) `unique users`,
       count(bounces.guid)/count(uniqueUsers.guid) * 100 `global bounce rate`
FROM (
    -- one row per (guid, hostname) pair, i.e. one unique user per host
    SELECT guid, hostname
    FROM PageViews
    GROUP BY guid, hostname
) uniqueUsers
LEFT JOIN (
    -- only the pairs with exactly one page view, i.e. the bounces
    SELECT guid, hostname
    FROM PageViews
    GROUP BY guid, hostname
    HAVING COUNT(1) = 1
) bounces
    ON uniqueUsers.guid = bounces.guid
    AND uniqueUsers.hostname = bounces.hostname;
Example result:
bounces  unique users  global bounce rate
-------  ------------  ------------------
3        6             50.0000
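Note that all 4 hits by 'guid 3' on 'host1' count as only 1 unique user, whereas 'guid 1' hit both host1 and host2, so it counts as 2 unique users (I believe this is the desired logic).
Bounce rate per host
The same, but with a GROUP BY in the outer query: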
SELECT uniqueUsers.hostname,
       count(bounces.guid) bounces,
       count(uniqueUsers.guid) `unique users`,
       count(bounces.guid)/count(uniqueUsers.guid) * 100 `bounce rate`
FROM (
    -- one row per (guid, hostname) pair, i.e. one unique user per host
    SELECT guid, hostname
    FROM PageViews
    GROUP BY guid, hostname
) uniqueUsers
LEFT JOIN (
    -- only the pairs with exactly one page view, i.e. the bounces
    SELECT guid, hostname
    FROM PageViews
    GROUP BY guid, hostname
    HAVING COUNT(1) = 1
) bounces
    ON uniqueUsers.guid = bounces.guid
    AND uniqueUsers.hostname = bounces.hostname
GROUP BY uniqueUsers.hostname;
Example result:
hostname  bounces  unique users  bounce rate
--------  -------  ------------  -----------
host1     2        4             50.0000
host2     0        1             0.0000
host3     1        1             100.0000
Sample data
guid  hostname  path
----  --------  ----
1     host1     irrelevant   => bounce 1
2     host1     irrelevant   => bounce 2
3     host1     irrelevant   => non-bounce 1 (visit 1/4)
3     host1     irrelevant
3     host1     irrelevant
3     host1     irrelevant
4     host1     irrelevant   => non-bounce 2 (visit 1/2)
4     host1     irrelevant
1     host2     irrelevant   => non-bounce 3 (visit 1/2)
1     host2     irrelevant
2     host3     irrelevant   => bounce 3
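If you want to check these numbers yourself, here is a minimal sketch that loads the sample data into an in-memory SQLite database and runs both queries through Python's standard sqlite3 module. Some assumptions to flag: the original queries look like MySQL (backtick aliases), so this is a port and not the exact statements above; SQLite truncates integer division, which is why the rate is written as 100.0 * bounces / users; and the names ROWS, JOINED and bounce_rates are made up for this illustration.

```python
import sqlite3

# Sample rows copied from the sample data table: (guid, hostname, path).
ROWS = [
    (1, "host1", "irrelevant"),
    (2, "host1", "irrelevant"),
    (3, "host1", "irrelevant"),
    (3, "host1", "irrelevant"),
    (3, "host1", "irrelevant"),
    (3, "host1", "irrelevant"),
    (4, "host1", "irrelevant"),
    (4, "host1", "irrelevant"),
    (1, "host2", "irrelevant"),
    (1, "host2", "irrelevant"),
    (2, "host3", "irrelevant"),
]

# Shared FROM clause: one row per (guid, hostname) pair, left-joined to
# the pairs that have exactly one page view (the bounces).
JOINED = """
FROM (
    SELECT guid, hostname FROM PageViews GROUP BY guid, hostname
) uniqueUsers
LEFT JOIN (
    SELECT guid, hostname FROM PageViews
    GROUP BY guid, hostname HAVING COUNT(1) = 1
) bounces
  ON uniqueUsers.guid = bounces.guid
 AND uniqueUsers.hostname = bounces.hostname
"""


def bounce_rates(rows):
    """Return (global_row, per_host_rows) for the given page views."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PageViews (guid INTEGER, hostname TEXT, path TEXT)")
    conn.executemany("INSERT INTO PageViews VALUES (?, ?, ?)", rows)
    # 100.0 forces floating-point division (assumption: SQLite, not MySQL).
    select_counts = (
        "COUNT(bounces.guid), COUNT(uniqueUsers.guid), "
        "100.0 * COUNT(bounces.guid) / COUNT(uniqueUsers.guid) "
    )
    global_row = conn.execute("SELECT " + select_counts + JOINED).fetchone()
    per_host = conn.execute(
        "SELECT uniqueUsers.hostname, " + select_counts + JOINED
        + "GROUP BY uniqueUsers.hostname ORDER BY uniqueUsers.hostname"
    ).fetchall()
    conn.close()
    return global_row, per_host


if __name__ == "__main__":
    global_row, per_host = bounce_rates(ROWS)
    print(global_row)       # (bounces, unique users, global bounce rate)
    for row in per_host:    # (hostname, bounces, unique users, bounce rate)
        print(row)
```

On the sample data this gives (3, 6, 50.0) for the global query and (host1, 2, 4, 50.0), (host2, 0, 1, 0.0), (host3, 1, 1, 100.0) per host, which agrees with the example results.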
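I don't think your query identifies bounces correctly. You aren't doing anything with the hostname... can't the same GUID hit multiple hosts? – AjahnCharles
Also, using count(*) counts the same user once for every path they visit; if I visit 50 pages, I'm still 1 non-bounce user, not 50. Surely you want to count(bounced users) and count(unique users) per site, *then* sum across all sites and calculate your rate. – AjahnCharles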