2017-09-22 41 views

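Datalake analytics join

I have 2 tables. I would like to assign a category to each URL listed in the table [Actvite_Site]. I tried the query below, but it does not work. Does anyone have an idea? Thank you in advance.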

Table [Categorie]
URL                          CAT
http//www.site.com/business  B2B
http//www.site.com/office    B2B
http//www.site.com/job       B2B
http//www.site.com/home      B2C

Table [Actvite_Site] 
URL 
http//www.site.com/business/page2/test.html 
http//www.site.com/business/page3/pagetest/tot.html 
http//www.site.com/office/all/tot.html 
http//www.site.com/home/holiday/paris.html 
http//www.site.com/home/private/moncompte.html 

Desired output:

URL_SITE                                             CATEGORIE
http//www.site.com/business/page2/test.html          B2B
http//www.site.com/business/page3/pagetest/tot.html  B2B
http//www.site.com/office/all/tot.html               B2B
http//www.site.com/home/holiday/paris.html           B2C
http//www.site.com/home/private/moncompte.html       B2C
http//www.site.com/test/pte.html                     Null

My query:

    SELECT A.URL AS URL_SITE 
      C.CAT AS CATEGORIE 
    FROM Actvite_Site as A 
     LEFT Categorie as C ON C.URL==A.URL.PadLeft(C.URL.Lenght) 

Not working how? An error? Unexpected results? – user5226582


I can see typos right away... or is that how the query really is? –


To simplify, I corrected my query like this: SELECT A.URL AS URL_SITE C.CAT AS CATEGORIE FROM Actvite_Site AS A LEFT Categorie AS C ON C.URL == A.URL.PadLeft(10) Error E_CSC_USER_JOINCOLUMNSEXPECTEDONEACHSIDEOFCONDITION: the expressions A.URL.PadLeft(10) and C.url on each side of the comparison must both be columns – FranckSR

Answer


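Regarding the error E_CSC_USER_JOINCOLUMNSEXPECTEDONEACHSIDEOFCONDITION: U-SQL does not currently support join conditions on derived columns. Each side of a comparison in the ON clause must be a plain column, not an expression such as A.URL.PadLeft(10).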
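To make the restriction concrete, here is a minimal sketch of the usual way around it when the derived value has a fixed length: compute it as a named column in its own rowset first, then join on that column. Everything in it is an assumption made for illustration: the rowsets @categories and @site, their sample rows, the column name urlPrefix, the output path, and the prefix length of 10 borrowed from the simplified query in the comments. Because the category URLs in the question have different lengths, this pattern does not solve the question by itself; it only shows what "both sides must be columns" means.

```usql
// Hypothetical sample rowsets; every category key here is exactly 10 characters long.
@categories =
    SELECT *
    FROM (
        VALUES
            ("site.com/a", "B2B"),
            ("site.com/b", "B2C")
        ) AS x(url, cat);

@site =
    SELECT *
    FROM (
        VALUES
            ("site.com/a/page1.html"),
            ("site.com/b/page2.html"),
            ("site.com/c/page3.html")
        ) AS x(url);

// Step 1: compute the derived value as a named column in its own rowset.
// The length check keeps Substring from throwing on URLs shorter than 10 characters.
@siteKeyed =
    SELECT url,
           (url.Length >= 10 ? url.Substring(0, 10) : url) AS urlPrefix
    FROM @site;

// Step 2: the ON clause now compares two plain columns, which U-SQL accepts.
@joined =
    SELECT s.url AS URL_SITE,
           c.cat AS CATEGORIE
    FROM @siteKeyed AS s
         LEFT OUTER JOIN
             @categories AS c
         ON s.urlPrefix == c.url;

OUTPUT @joined TO "/output/joined.csv"
USING Outputters.Csv(quoting:false);
```

With a LEFT OUTER JOIN, URLs that match no category are kept with a null CATEGORIE, so no separate step is needed for the unmatched rows in this fixed-length case.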

One way to achieve this might be to find the matched URLs first, then combine them with the unmatched URLs using UNION ALL.

@category = SELECT * 
    FROM (
     VALUES 
      ("http//www.site.com/business", "B2B"), 
      ("http//www.site.com/office", "B2B"), 
      ("http//www.site.com/job", "B2B"), 
      ("http//www.site.com/home", "B2C") 
     ) AS x(url, cat); 


@siteActivity = SELECT * 
    FROM (
     VALUES 
      ("http//www.site.com/business/page2/test.html"), 
      ("http//www.site.com/business/page3/pagetest/tot.html"), 
      ("http//www.site.com/office/all/tot.html"), 
      ("http//www.site.com/home/holiday/paris.html"), 
      ("http//www.site.com/home/private/moncompte.html"), 
      ("http//www.site.com/test/pte.html") 
     ) AS x(url); 


// Find matched URLs 
@working = 
    SELECT sa.url, 
      c.cat 
    FROM @siteActivity AS sa 
     CROSS JOIN 
      @category AS c 
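     // StartsWith does the same prefix test as Substring(0, c.url.Length) == c.url,
     // but does not throw when sa.url is shorter than c.url
     WHERE sa.url.StartsWith(c.url);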


// Combine the matched and unmatched URLs 
@output = 
    SELECT url, 
      cat 
    FROM @working 

    UNION ALL 

    SELECT url, 
      (string) null AS cat 
    FROM @siteActivity AS sa 
     ANTISEMIJOIN 
      @working AS w 
     ON sa.url == w.url; 



OUTPUT @output TO "/output/output.csv" 
USING Outputters.Csv(quoting:false); 

I would be interested to know whether there is a more efficient way.