2014-11-03 60 views
0

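SQL case statement based on number of occurrences

I have a table (T1) of email addresses, user names, and domains:

email              user    domain
[email protected]  joe123  domain.com
[email protected]  sue234  email.net
...                ...     ...

And another table (T2) recording whether mail sent to an address was opened: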

Opened    Email 
    0   [email protected] 
    1   [email protected] 
    0   [email protected] 
    ...    ... 

I want to join t1.domain onto t2, but only for domains that occur more than 100 times.

I can build a table of occurrence counts:

SELECT domain, count(domain) cntDomain 
from table1 
group by domain 

which gives this result:

domain   cntDomain 
domain.com  5000 
email.net  4300 
mybarber.com  67 

The resulting table should look like this:

Opened    Email     domain 
    0   [email protected]   domain.com 
    1   [email protected]   email.net 
    0   [email protected]  other 
    ...    ... 

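But I can't figure out the join. I think it would be a left join that creates an "other" value for the domains that don't occur often. The join needs a case statement: use the domain if it occurs more than 100 times, and "other" if it does not.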

+0

You need a 'HAVING COUNT(*) > 100' – paqogomez 2014-11-03 21:39:03

Answers

0
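-- The counts subquery exposes only domain, so go through table1 to map email to domain.
-- Inner joins keep only the rows whose domain occurs more than 100 times.
select t2.*, t1.domain 
from table2 t2 
inner join table1 t1 on t1.email = t2.email 
inner join 
(
    SELECT domain, count(1) cntDomain 
    from table1 
    group by domain 
    having count(1) > 100 
) d on d.domain = t1.domain 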
0

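It is unclear whether every email in table2 also appears in table1. If so, you can do this:

-- the domain column, and therefore the per-domain count, comes from table1
select t2.*, t1.domain 
from (select t1.*, count(*) over (partition by domain) as cnt 
      from table1 t1 
     ) t1 join 
     table2 t2 
     on t1.email = t2.email 
where cnt > 100; 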

If not, you can match on the domain inside the email address:

select t2.*, t1.domain 
from table2 t2 left join 
    (select t1.domain, count(*) as cnt 
     from table1 t1 
     group by t1.domain 
    ) t1 
    on t2.email like '%@' + t1.domain and 
     cnt > 100; 

Expect the performance of this version to be really, really bad.
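If the slow LIKE join is a concern, one possible alternative is to extract the domain from the address and join on equality instead. This is only a sketch, assuming SQL Server (CHARINDEX, SUBSTRING, LEN) and that everything after the first '@' is the domain. The temp tables #table1 and #table2, the toy rows, and the email_domain alias are invented for illustration, and the threshold is lowered from 100 to 1 so a few rows are enough.

```sql
-- Toy data; every value below is made up for illustration.
CREATE TABLE #table1 (email varchar(100), domain varchar(50));
CREATE TABLE #table2 (Opened int, email varchar(100));

INSERT INTO #table1 (email, domain) VALUES
    ('a@mail.example',  'mail.example'),
    ('b@mail.example',  'mail.example'),
    ('c@gmail.example', 'gmail.example');

INSERT INTO #table2 (Opened, email) VALUES
    (1, 'x@mail.example'),    -- not in #table1, but its domain is frequent
    (0, 'y@gmail.example');   -- its domain occurs only once

SELECT e.Opened, e.email, COALESCE(d.domain, 'Other') AS domain
FROM (
    -- Treat everything after the '@' as the domain of the address.
    SELECT t2.Opened,
           t2.email,
           SUBSTRING(t2.email, CHARINDEX('@', t2.email) + 1, LEN(t2.email)) AS email_domain
    FROM #table2 AS t2
) AS e
LEFT JOIN (
    SELECT domain, COUNT(*) AS cnt
    FROM #table1
    GROUP BY domain
    HAVING COUNT(*) > 1       -- 100 in the question; lowered for the toy data
) AS d ON d.domain = e.email_domain;

DROP TABLE #table1;
DROP TABLE #table2;
```

On this toy data the mail.example address keeps its domain and the gmail.example address is labelled 'Other'. Because the comparison is an equality rather than a pattern match, a domain can never match another domain that merely ends with it.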

+0

You may want to make the join condition of the second query 't2.email LIKE '%@' + t1.domain' to keep subdomains separate. – Allan 2014-11-04 00:13:52

+0

@Allan . . . That makes sense. Thanks. It also affects things like 'gmail.com' and 'mail.com'. – 2014-11-04 03:12:58

0

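This approach uses an inner query to get the counts, then a case statement to turn each count into either the domain or the string 'Other', as appropriate. I tested it on some toy data to make sure it works, but I have no opinion on its performance.

It feels a little awkward, because t1 is queried twice: once to get the domain and again to get the counts. Either way, it gets the job done.

If the threshold changes, you can replace the number 100 with another number (or a variable).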

select 
    t2.Opened 
, t2.Email 
, case when t3.cntDomain > 100 then t3.domain else 'Other' end as domain 
from t2 
left outer join t1 on t2.Email = t1.email 
left outer join (
    select t1.domain, count(1) cntDomain 
    from t1 
    left outer join t2 on t1.email = t2.email 
    group by t1.domain 
) as t3 on t1.domain = t3.domain 
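For anyone who wants to try this without real data, here is a minimal sketch on toy data with the threshold held in a variable. It assumes SQL Server (temp tables and DECLARE); the temp tables #t1 and #t2, the sample rows, and the variable @threshold are made up for illustration, and the threshold is set to 2 so a handful of rows is enough. As a simplification, the inner query counts the rows of #t1 only, as in the count query from the question.

```sql
-- Toy data; every value below is invented for illustration.
CREATE TABLE #t1 (email varchar(100), [user] varchar(50), domain varchar(50));
CREATE TABLE #t2 (Opened int, Email varchar(100));

INSERT INTO #t1 (email, [user], domain) VALUES
    ('a@big.example',   'a', 'big.example'),
    ('b@big.example',   'b', 'big.example'),
    ('c@big.example',   'c', 'big.example'),
    ('d@small.example', 'd', 'small.example');

INSERT INTO #t2 (Opened, Email) VALUES
    (0, 'a@big.example'),
    (1, 'b@big.example'),
    (0, 'd@small.example');

-- Hypothetical variable standing in for the hard-coded 100.
DECLARE @threshold int = 2;

SELECT
    t2.Opened
  , t2.Email
  , CASE WHEN t3.cntDomain > @threshold THEN t3.domain ELSE 'Other' END AS domain
FROM #t2 AS t2
LEFT OUTER JOIN #t1 AS t1 ON t2.Email = t1.email
LEFT OUTER JOIN (
    -- One row per domain with the number of addresses in that domain.
    SELECT domain, COUNT(1) AS cntDomain
    FROM #t1
    GROUP BY domain
) AS t3 ON t1.domain = t3.domain;

DROP TABLE #t1;
DROP TABLE #t2;
```

With these rows, big.example occurs three times and passes the threshold, so its two addresses keep their domain, while the small.example address comes back as 'Other'.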

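Edit

If you don't like the case statement, this approach may smell a bit more elegant. The inner query is modified with a having clause. Now, because of the left join, t3.domain will be null wherever the count does not exceed the threshold. Add a little ISNULL to the select list to replace those nulls, and you're golden.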

select 
    t2.Opened 
, t2.Email 
, ISNULL(t3.domain, 'Other') as domain 
from t2 
left outer join t1 on t2.Email = t1.email 
left outer join (
    select t1.domain, count(1) cntDomain 
    from t1 
    left outer join t2 on t1.email = t2.email 
    group by t1.domain 
    having count(1) > 100 
) as t3 on t1.domain = t3.domain 

Cheers!

0

I think the following query should solve your problem:

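 -- email cannot be selected next to GROUP BY domain, so filter the domains in a subquery
 SELECT t2.opened, 
     t2.Email, 
     CASE WHEN tempt1.email is NULL THEN 'Other' ELSE tempt1.domain END as domain 
     FROM t2 LEFT JOIN (SELECT email,domain 
     FROM t1 
     WHERE domain IN (SELECT domain 
                      FROM t1 
                      group by domain HAVING count(domain)>100)) tempt1 on t2.Email=tempt1.email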