2013-03-01 68 views
7

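Using SQL to transpose/flatten an XML structure into columns

I'm using SQL Server (2008/2012). I know there are plenty of answers to similar questions, but I can't seem to find an appropriate example or pointer for my case.

I have a SQL Server table with an XML column holding data like this: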

<Items> 
<Item> 
    <FormItem> 
    <Text>FirstName</Text> 
    <Value>My First Name</Value> 
    </FormItem> 
    <FormItem> 
    <Text>LastName</Text> 
    <Value>My Last Name</Value> 
    </FormItem> 
    <FormItem> 
    <Text>Age</Text> 
    <Value>39</Value> 
    </FormItem> 
</Item> 
<Item> 
    <FormItem> 
    <Text>FirstName</Text> 
    <Value>My First Name 2</Value> 
    </FormItem> 
    <FormItem> 
    <Text>LastName</Text> 
    <Value>My Last Name 2</Value> 
    </FormItem> 
    <FormItem> 
    <Text>Age</Text> 
    <Value>40</Value> 
    </FormItem> 
</Item> 
</Items> 

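So, although the <FormItem> structure will always be the same, I can have multiple sets of form items (most commonly no more than 20-30).

I'm basically trying to return a query from SQL in the format below, i.e. with dynamic columns based on /FormItem/Text: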

FirstName   LastName   Age ---> More columns as new `<FormItem>` are returned 
My First Name  My Last Name  39   Whatever value etc.. 
My First Name 2 My Last Name 2 40   

So, at this stage, I have the following:

select 
    Tab.Col.value('Text[1]','nvarchar(100)') as Question, 
    Tab.Col.value('Value[1]','nvarchar(100)') as Answer 
from 
    @Questions.nodes('/Items/Item/FormItem') Tab(Col) 

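Of course, this doesn't transpose my XML rows into columns, and the fields are obviously fixed anyway. I've been trying various "dynamic SQL" approaches in which the SQL performs a distinct selection of (in my case) the <Text> nodes and then uses some sort of pivot, but I can't seem to find the magic combination that returns the results I need as a dynamic set of columns for each row (each <Item> within the <Items> collection).

I'm sure there are many very similar examples around, but once again the solution eludes me!

Any help gratefully received!

Answers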

7

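Parsing the XML is fairly expensive, so instead of parsing it once to build a dynamic query and once more to get the data, you can create a temporary table holding a name-value list and then use that as the source for a dynamic pivot query.
dense_rank is there to create the ID to pivot around.
To build the column list for the dynamic query, it uses the for xml path('') trick.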
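
The for xml path('') trick can be looked at in isolation. The following is a minimal sketch, not part of the original answer: the table variable @Names and its sample rows are made up for illustration, while the concatenation expression is the same one used in the solution.

```sql
-- Hypothetical sample data standing in for the Name column of #T.
declare @Names table (Name nvarchar(128));

insert into @Names (Name)
values (N'FirstName'), (N'LastName'), (N'Age'), (N'FirstName');

declare @Col nvarchar(max);

-- for xml path('') concatenates the rows into a single string,
-- quotename() wraps each name in square brackets,
-- and substring(..., 2) drops the leading comma.
select @Col =
    (
    select distinct ',' + quotename(Name)
    from @Names
    for xml path(''), type
    ).value('substring(text()[1], 2)', 'nvarchar(max)');

select @Col as ColumnList;
```

The result is a single comma-separated list of bracketed column names with duplicates removed, which can be spliced into both the select list and the IN (...) clause of a pivot.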

The solution below requires that your table has a primary key (ID). If you have the XML in a variable, it can be simplified somewhat.

select dense_rank() over(order by ID, I.N) as ID, 
     F.N.value('(Text/text())[1]', 'varchar(max)') as Name, 
     F.N.value('(Value/text())[1]', 'varchar(max)') as Value 
into #T 
from YourTable as T 
    cross apply T.XMLCol.nodes('/Items/Item') as I(N) 
    cross apply I.N.nodes('FormItem') as F(N) 

declare @SQL nvarchar(max) 
declare @Col nvarchar(max) 

select @Col = 
    (
    select distinct ','+quotename(Name) 
    from #T 
    for xml path(''), type 
).value('substring(text()[1], 2)', 'nvarchar(max)') 

set @SQL = 'select ' + @Col + ' 
      from #T 
      pivot (max(Value) for Name in (' + @Col + ')) as P' 

exec (@SQL) 

drop table #T 
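
For the case where the XML is in a variable rather than a table, a simplified version might look like the sketch below. It assumes the @Questions variable from the question, filled here with shortened sample data; ordering dense_rank by I.N alone is an adaptation of the order by ID, I.N used above, so treat it as an assumption to verify rather than a tested part of the answer.

```sql
-- Sample data based on the question's XML (shortened).
declare @Questions xml = N'
<Items>
  <Item>
    <FormItem><Text>FirstName</Text><Value>My First Name</Value></FormItem>
    <FormItem><Text>LastName</Text><Value>My Last Name</Value></FormItem>
    <FormItem><Text>Age</Text><Value>39</Value></FormItem>
  </Item>
  <Item>
    <FormItem><Text>FirstName</Text><Value>My First Name 2</Value></FormItem>
    <FormItem><Text>LastName</Text><Value>My Last Name 2</Value></FormItem>
    <FormItem><Text>Age</Text><Value>40</Value></FormItem>
  </Item>
</Items>';

-- Shred the XML once into a name-value list; no table or primary key
-- is involved, so the <Item> node itself identifies each output row.
select dense_rank() over(order by I.N) as ID,
       F.N.value('(Text/text())[1]', 'nvarchar(128)') as Name,
       F.N.value('(Value/text())[1]', 'nvarchar(max)') as Value
into #T
from @Questions.nodes('/Items/Item') as I(N)
    cross apply I.N.nodes('FormItem') as F(N);

declare @SQL nvarchar(max);
declare @Col nvarchar(max);

-- Build the bracketed, comma-separated column list from the distinct names.
select @Col =
    (
    select distinct ',' + quotename(Name)
    from #T
    for xml path(''), type
    ).value('substring(text()[1], 2)', 'nvarchar(max)');

-- ID is not selected, but it still groups the pivot: one row per <Item>.
set @SQL = 'select ' + @Col + '
            from #T
            pivot (max(Value) for Name in (' + @Col + ')) as P';

exec (@SQL);

drop table #T;
```

Name is declared as nvarchar(128) here because quotename returns NULL for inputs longer than 128 characters, which would otherwise turn the whole column list into NULL.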


+0

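Thanks! Works just as I needed. My XML actually sits one level above the sample I showed, but this pulls it all together. The key part was showing the variable XML nodes as columns with repeating rows. Many thanks for a very detailed example! – 2013-03-04 09:20:33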

2
select Tab.Col.value('(FormItem[Text = "FirstName"]/Value)[1]', 'varchar(32)') as FirstName, 
     Tab.Col.value('(FormItem[Text = "LastName"]/Value)[1]', 'varchar(32)') as LastName, 
     Tab.Col.value('(FormItem[Text = "Age"]/Value)[1]', 'int') as Age 
from @Questions.nodes('/Items/Item') Tab(Col) 
+1

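Thanks, nice example. I hadn't actually seen this technique before, but it doesn't show the fields dynamically as columns. It would probably be fine if you knew the number of columns in advance, but @Mikael's example above does what I need. Thanks for your reply though, and again a really nice, simple, clean example. – 2013-03-04 09:08:49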

2

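I thought I'd add my "own answer" really just for completeness, in case it helps others, but it is most definitely based on the great help from @Mikael above! So again, this is only for completeness; all credit to @Mikael.

Basically, I ended up with the following procedure. I needed to select and filter some data, fetch some joined data, and allow optional filtering on some input parameters. That feeds the next section, which uses cross apply to create a temp table of relational data together with the required XML nodes. The final step then pivots the results, dynamically creating columns from the selected XML nodes.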

CREATE PROCEDURE [dbo].[usp_RPT_ExtractFlattenentries] 
    @CompanyID   int, 
    @MainSelector  nvarchar(50) = null, 
    @SecondarySelector  nvarchar(255) = null, 
    @DateFrom   datetime = '01-jan-2012', 
    @DateTo    datetime = '31-dec-2100', 
    @SysReference  nvarchar(20) = null 
AS 
BEGIN 
    SET NOCOUNT ON; 

    -- Create the table var to hold the XML form data from the entries 
    declare @FeedbackXml table (
     ID int identity primary key, 
     XMLCol xml, 
     CompanyName nvarchar(20), 
     SysReference nvarchar(20), 
     RecordDate datetime, 
     EntryName nvarchar(255), 
     MainSelector nvarchar(50) 
    ) 

    -- STEP 1: Get the raw submission data based on the params passed in 
    -- *Note: The double casting is necessary as the "form" field is nvarchar (not varchar) and we need xml in UTF-8 format 
    begin 
     insert into @FeedbackXml 
      (XMLCol, CompanyName, SysReference, RecordDate, EntryName, MainSelector) 
     select cast(cast(e.form as nvarchar(max)) as xml), c.name, e.SysReference, e.RecordDate, e.name, e.wizard 
     from 
      entries e 
     left join 
      companies c on e.companies = c.ID 
     where 
      (@CompanyID = -1 or @CompanyID = e.companies) 
     and 
      (@MainSelector is null or @MainSelector = e.wizard) 
     and 
      (@SecondarySelector is null or @SecondarySelector = e.name) 
     and 
      (@SysReference is null or @SysReference = e.SysReference) 
     and 
      (e.RecordDate >= @DateFrom and e.RecordDate <= @DateTo) 
    end 

    -- STEP 2: Flatten the required XML structure to provide a base for the pivot, and include other fields we wish to output 
    select dense_rank() over(order by ID) as ID, 
      T.RecordDate, T.CompanyName, T.SysReference, T.EntryName, T.MainSelector, 
      F.N.value('(FieldNameNode/text())[1]', 'nvarchar(max)') as FieldName, 
      F.N.value('(FieldNameValue/text())[1]', 'nvarchar(max)') as FieldValue 
    into #TempData 
    from @FeedbackXml as T 
     cross apply T.XMLCol.nodes('/root/companies') as I(N) -- XPath to the desired node start point (no trailing slash, which is not valid in a path expression) 
     cross apply I.N.nodes('company') as F(N) -- The actual node collection that forms the "field name" and "field value" data 

    -- STEP 3: Pivot the #TempData table creating a dynamic column structure based on the selected XML nodes in step 2 
    declare @SQL nvarchar(max) 
    declare @Col nvarchar(max) 

    select @Col = 
     (
     select distinct ','+quotename(FieldName) 
     from #TempData 
     for xml path(''), type 
    ).value('substring(text()[1], 2)', 'nvarchar(max)') 

    set @SQL = 'select CompanyName, SysReference, EntryName, MainSelector, RecordDate, ' + @Col + ' 
       from #TempData 
       pivot (max(FieldValue) for FieldName in (' + @Col + ')) as P' 

    exec (@SQL) 
    drop table #TempData 

END 
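
As a usage sketch, a call to the procedure might look like the following. The parameter values are hypothetical; -1 for @CompanyID relies on the "all companies" check in the WHERE clause above, and parameters left out fall back to their declared defaults.

```sql
-- Hypothetical call: all companies, one wizard, entries recorded during 2013.
exec dbo.usp_RPT_ExtractFlattenentries
     @CompanyID    = -1,
     @MainSelector = N'SomeWizard',   -- hypothetical wizard name
     @DateFrom     = '20130101',
     @DateTo       = '20131231';
```

Because the pivoted column list is built at run time, the shape of the result set can differ from call to call, so any client code consuming it should not assume a fixed set of columns.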

Again, I've really only added this answer to give a complete picture from my side, in case it helps someone else.