2014-10-31 74 views
5

我有一个SQLCLR标量函数,它会将我需要的XmlReader蒸发到内联结果集中。这些XML对象是按需生成的,所以我不能使用XML索引。在结果数据集中有超过100列是很常见的。考虑下面的示例代码:XQuery计划复杂性

CREATE XML SCHEMA COLLECTION RAB AS ' 
<xsd:schema xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified"> 
<xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" /> 

    <xsd:element name="r" type="r"/> 

    <xsd:complexType name="r"> 
    <xsd:attribute name="a" type="sqltypes:int" use="required"/> 
    <xsd:attribute name="b" type="sqltypes:int" use="required"/> 
    <xsd:attribute name="c" type="sqltypes:int" use="required"/> 
    </xsd:complexType> 
</xsd:schema>'; 
GO 

DECLARE @D TABLE(x XML(DOCUMENT RAB) NOT NULL); 

INSERT INTO @D 
VALUES 
('<r a="3" b="4" c="34"/>'), 
('<r a="5" b="6" c="56"/>'), 
('<r a="7" b="8" c="78"/>') 

SELECT x.value('/r/@a', 'int') a, x.value('/r/@b', 'int') b, x.value('/r/@c', 'int') c 
FROM @d a 

这与一些XML值的表变量填充类型化XML列,打破了属性为单独列。执行计划似乎过于凌乱:

|--Compute Scalar(DEFINE:([Expr1009]=[Expr1008], [Expr1016]=[Expr1015], [Expr1023]=[Expr1022])) 
    |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x])) 
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x])) 
      | |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x])) 
      | | |--Table Scan(OBJECT:(@d AS [a])) 
      | | |--Stream Aggregate(DEFINE:([Expr1008]=MIN([Expr1024]))) 
      | |   |--Compute Scalar(DEFINE:([Expr1024]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader wi 
      | |    |--Table-valued function 
      | |--Stream Aggregate(DEFINE:([Expr1015]=MIN([Expr1025]))) 
      |   |--Compute Scalar(DEFINE:([Expr1025]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader with XP 
      |    |--Table-valued function 
      |--Stream Aggregate(DEFINE:([Expr1022]=MIN([Expr1026]))) 
       |--Compute Scalar(DEFINE:([Expr1026]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader with XPath f 
        |--Table-valued function 

它的每列都有一个嵌套循环!如果我将这些表中的多个列连接在一起,那么查询计划将过于复杂。另外,我不明白这些运营商的目的。内容如下:

MIN(
    CASE WHEN @d.[x] as [a].[x] IS NULL 
    THEN NULL ELSE 
    CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant, 
      CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)) 
      ,0),0))>=(128) 
    THEN CONVERT_IMPLICIT(int, 
     CASE WHEN datalength(xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)))<(128) 
     THEN NULL 
     ELSE xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)) 
     END,0) 
     ELSE CONVERT_IMPLICIT(int, 
      CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)),0),0),0) 
    END 
END) 

Yu!我认为使用类型为sqltype的XML组应该避免转换?

要么我高估了这将是多么有效,或者我做错了什么。我的问题是我该如何解决这个问题,因此我没有为每列添加额外的查询计划运算符,并且理想地避免了转换,还是应该放弃并找到非xpath方法来执行此操作?

参考文献:

的SQLType http://msdn.microsoft.com/en-us/library/ee320775%28v=sql.105%29.aspx

XML数据类型方法http://technet.microsoft.com/en-us/library/ms190798%28v=sql.105%29.aspx

enter image description here

+1

通常的方法我做的是'SELECT r.value( '@一', '廉政'),r.value('@ b','int'),r.value('@ c','int')FROM @D CROSS APPLY x.nodes('/ r')as ca(r)',但是这个计划看起来形状差不多(TVF节点给出了更高的估计子树成本) – 2014-11-01 12:06:13

回答

11

有在需要先整理了一下查询计划的一些奥秘。计算标量做什么以及为什么会有一个流聚合。

表值函数返回碎化XML的节点表,每个碎化行一行。当您使用键入的XML时,这些列是值,左值,左值和tid。这些列用于计算标量来计算实际值。在那里的代码看起来有点奇怪,我不能说我明白为什么它是这样的,但它的要点是函数xsd_cast_to_maybe_large返回值,并且有代码处理的情况下,当值等于和大于128个字节。

CASE WHEN datalength(
        CONVERT_IMPLICIT(sql_variant, 
         CONVERT_IMPLICIT(nvarchar(64), 
             xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                   XML Reader with XPath filter.[lvalue], 
                   XML Reader with XPath filter.[lvaluebin], 
                   XML Reader with XPath filter.[tid],(15),(5),(0)),0),0))>=(128) 
    THEN CONVERT_IMPLICIT(int,CASE WHEN datalength(xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                     XML Reader with XPath filter.[lvalue], 
                     XML Reader with XPath filter.[lvaluebin], 
                     XML Reader with XPath filter.[tid],(15),(5),(0)))<(128) 
           THEN NULL 
           ELSE xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                  XML Reader with XPath filter.[lvalue], 
                  XML Reader with XPath filter.[lvaluebin], 
                  XML Reader with XPath filter.[tid],(15),(5),(0)) 
          END,0) 
    ELSE CONVERT_IMPLICIT(int,CONVERT_IMPLICIT(sql_variant, 
               CONVERT_IMPLICIT(nvarchar(64), 
                   xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                         XML Reader with XPath filter.[lvalue], 
                         XML Reader with XPath filter.[lvaluebin], 
                         XML Reader with XPath filter.[tid],(15),(5),(0)),0),0),0) 
END 

用于非类型化XML的相同计算标量非常简单且实际上可以理解。

CASE WHEN datalength(XML Reader with XPath filter.[value])>=(128) 
    THEN CONVERT_IMPLICIT(int,XML Reader with XPath filter.[lvalue],0) 
    ELSE CONVERT_IMPLICIT(int,XML Reader with XPath filter.[value],0) 
END 

如果有value超过128个字节lvalue获取来自其他value取。在非类型化XML的情况下,返回的节点表仅输出列id,值和左值。

当您使用键入的XML时,节点值的存储将根据模式中指定的数据类型进行优化。看起来它可能会以节点表中的值,左值或左值结束,具体取决于它的值是什么类型,xsd_cast_to_maybe_large可以帮助解决问题。

流聚合对来自计算标量的返回值执行min()。我们知道,SQL Server(至少有时)确实知道,在value()函数中指定XPath时,将只有从表值函数返回的一行。解析器确保我们正确地构建XPath,但是当查询优化器查看估计的行时,它会看到200行。用于解析XML的表值函数的基本估计值为10000行,然后使用所使用的XPath进行一些调整。在这种情况下,它最终只有200行,只有一行。对我而言,纯粹的猜测是流集合在那里来处理这种差异。它永远不会聚合任何东西,只发送返回的那一行,但它确实会影响整个分支的基数估计,并确保优化器使用1行作为该分支的估计值。当优化器选择加入策略等时,这当然非常重要。

那么100个属性如何呢?是的,如果您使用值函数100次,将会有100个分支。但是这里有一些优化要做。我创建了一个测试装置,查看使用10行以上的100个属性,查询的形状和形式是最快的。

获胜者将使用无类型的XML,并且不要使用nodes()函数粉碎r

select X.value('(/r/@a1)[1]', 'int') as a1, 
     X.value('(/r/@a2)[1]', 'int') as a2, 
     X.value('(/r/@a3)[1]', 'int') as a3 
from @T 

还有避免了100个分支机构使用透视但根据您的实际查询的样子,可能是不可能的方式。来自数据透视表的数据类型必须相同。你当然可以将它们作为字符串提取出来,并转换为列表中适当的类型。它还要求您的表具有主键/唯一键。

select a1, a2, a3 
from (
    select T.ID, -- primary key of @T 
      A.X.value('local-name(.)', 'nvarchar(50)') as Name, 
      A.X.value('.', 'int') as Value 
    from @T as T 
     cross apply T.X.nodes('/r/@*') as A(X) 
    ) as T 
pivot(min(T.Value) for Name in (a1, a2, a3)) as P 

查询计划枢轴查询,10点100的属性:

enter image description here

下面是结果和测试装备予使用。我测试了100个属性和10行以及所有int属性。

结果:

Test            Duration (ms) 
-------------------------------------------------- ------------- 
untyped XML value('/r[1]/@a')      195  
untyped XML value('(/r/@a)[1]')      108 
untyped XML value('@a') cross apply nodes('/r')  131 
untyped XML value('@a') cross apply nodes('/r[1]') 127 
typed XML value('/r/@a')       185 
typed XML value('(/r/@a)[1]')      148 
typed XML value('@a') cross apply nodes('/r')  176 
untyped XML pivot         34 
typed XML pivot          52 

代码:

drop type dbo.TRABType 
drop type dbo.TType; 
drop xml schema collection dbo.RAB; 

go 

declare @NumAtt int = 100; 
declare @Attribs nvarchar(max); 

with xmlnamespaces('http://www.w3.org/2001/XMLSchema' as xsd) 
select @Attribs = (
select top(@NumAtt) 'a'+cast(row_number() over(order by 1/0) as varchar(11)) as '@name', 

        'sqltypes:int' as '@type', 
        'required' as '@use' 
from sys.columns 
for xml path('xsd:attribute') 
) 
--CREATE XML SCHEMA COLLECTION RAB AS 

declare @Schema nvarchar(max) = 
' 
<xsd:schema xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified"> 
<xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" /> 
    <xsd:element name="r" type="r"/> 
    <xsd:complexType name="r">[ATTRIBS]</xsd:complexType> 
</xsd:schema>'; 

set @Schema = replace(@Schema, '[ATTRIBS]', @Attribs) 

create xml schema collection RAB as @Schema 

go 

create type dbo.TType as table 
(
    ID int identity primary key, 
    X xml not null 
); 

go 

create type dbo.TRABType as table 
(
    ID int identity primary key, 
    X xml(document rab) not null 
); 


go 

declare @NumAtt int = 100; 
declare @NumRows int = 10; 

declare @X nvarchar(max); 
declare @C nvarchar(max); 
declare @M nvarchar(max); 

declare @S1 nvarchar(max); 
declare @S2 nvarchar(max); 
declare @S3 nvarchar(max); 
declare @S4 nvarchar(max); 
declare @S5 nvarchar(max); 
declare @S6 nvarchar(max); 
declare @S7 nvarchar(max); 
declare @S8 nvarchar(max); 
declare @S9 nvarchar(max); 

set @X = N'<r '+ 
    (
    select top(@NumAtt) 'a'+cast(row_number() over(order by 1/0) as varchar(11))+'="'+cast(row_number() over(order by 1/0) as varchar(11))+'" ' 
    from sys.columns 
    for xml path('') 
)+ 
'/>'; 

set @C = 
    stuff((
    select top(@NumAtt) ',a'+cast(row_number() over(order by 1/0) as varchar(11)) 
    from sys.columns 
    for xml path('') 
), 1, 1, '') 

set @M = 
    stuff((
    select top(@NumAtt) ',MAX(CASE WHEN name = ''a'+cast(row_number() over(order by 1/0) as varchar(11))+''' THEN val END)' 
    from sys.columns 
    for xml path('') 
), 1, 1, '') 


declare @T dbo.TType; 
insert into @T(X) 
select top(@NumRows) @X 
from sys.columns; 

declare @TRAB dbo.TRABType; 
insert into @TRAB(X) 
select top(@NumRows) @X 
from sys.columns; 


-- value('/r[1]/@a') 
set @S1 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''/r[1]/@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
option (maxdop 1)'; 

-- value('(/r/@a)[1]') 
set @S2 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''(/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+')[1]'', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
option (maxdop 1)'; 

-- value('@a') cross apply nodes('/r') 
set @S3 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
    cross apply T.X.nodes(''/r'') as T2(X) 
option (maxdop 1)'; 


-- value('@a') cross apply nodes('/r[1]') 
set @S4 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
    cross apply T.X.nodes(''/r[1]'') as T2(X) 
option (maxdop 1)'; 

-- value('/r/@a') typed XML 
set @S5 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @TRAB as T 
option (maxdop 1)'; 

-- value('(/r/@a)[1]') 
set @S6 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''(/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+')[1]'', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @TRAB as T 
option (maxdop 1)'; 

-- value('@a') cross apply nodes('/r') typed XML 
set @S7 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @TRAB as T 
    cross apply T.X.nodes(''/r'') as T2(X) 
option (maxdop 1)'; 

-- pivot 
set @S8 = N' 
select ID, '[email protected]+' 
from (
    select T.ID, 
      A.X.value(''local-name(.)'', ''nvarchar(50)'') as Name, 
      A.X.value(''.'', ''int'') as Value 
    from @T as T 
     cross apply T.X.nodes(''/r/@*'') as A(X) 
    ) as T 
pivot(min(T.Value) for Name in ('[email protected]+')) as P 
option (maxdop 1)'; 

-- typed pivot 
set @S9 = N' 
select ID, '[email protected]+' 
from (
    select T.ID, 
      A.X.value(''local-name(.)'', ''nvarchar(50)'') as Name, 
      cast(cast(A.X.query(''string(.)'') as varchar(11)) as int) as Value 
    from @TRAB as T 
     cross apply T.X.nodes(''/r/@*'') as A(X) 
    ) as T 
pivot(min(T.Value) for Name in ('[email protected]+')) as P 
option (maxdop 1)'; 


exec sp_executesql @S1, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S2, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S3, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S4, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S5, N'@TRAB dbo.TRABType readonly', @TRAB; 
exec sp_executesql @S6, N'@TRAB dbo.TRABType readonly', @TRAB; 
exec sp_executesql @S7, N'@TRAB dbo.TRABType readonly', @TRAB; 
exec sp_executesql @S8, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S9, N'@TRAB dbo.TRABType readonly', @TRAB; 
+2

+1详细和信息丰富的配置文件替代品!我已经考虑过PIVOT选项,可能直接在CLR中直接生成键/名称/值元组。我希望使用XML来提高效率。这是使这么好的答案! – 2014-11-03 14:30:41