2014-12-02 43 views

XML to data frame

Does anyone know how to convert the following XML into an R data frame?

<?xml version="1.0"?> 
    <soap:Envelope> 
     <soap:Body> 
     <getCampaignsResponse> 
       <getCampaignsResult> 
        <campaign> 
         <categoryBids> 
           <categoryBid> 
            <campaignCategoryUID>1234</campaignCategoryUID> 
            <campaignID>1211</campaignID> 
            <categoryID>1254</categoryID> 
            <selected>true</selected> 
            <bidInformation> 
             <biddingStrategy>Cpc</biddingStrategy> 
             <cpcBid> 
             <cpc>0.5</cpc> 
             </cpcBid> 
             <cpaBid xsi:nil="true"/> 
            </bidInformation> 
           </categoryBid> 
           <categoryBid> 
             <campaignCategoryUID>5487</campaignCategoryUID> 
             <campaignID>3244</campaignID> 
             <categoryID>1234</categoryID> 
             <selected>true</selected> 
             <bidInformation> 
             <biddingStrategy>Cpc</biddingStrategy> 
             <cpcBid> 
              <cpc>0.2</cpc> 
             </cpcBid> 
             <cpaBid xsi:nil="true"/> 
            </bidInformation> 
           </categoryBid> 
         </categoryBids> 
        </campaign> 
       </getCampaignsResult> 
      </getCampaignsResponse> 
     </soap:Body> 
    </soap:Envelope> 

The class of the XML object is:

> str(data) 
Classes 'XMLInternalDocument', 'XMLAbstractDocument' <externalptr> 

The data frame should have the following columns:
campaignCategoryUID
campaignID
categoryID
biddingStrategy
cpc

I could not get useful results with xmlToDataFrame or xmlToList. Any help is really appreciated!

Answers

1

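You will have to do some of this by hand, for example by using xpathSApply to extract the nodes. You may also need to change how you parse the response, since the sample has no namespace definitions for the soap and xsi prefixes: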

library(XML) 

xml <- '<?xml version="1.0"?> 
    <soap:Envelope> 
     <soap:Body> 
     <getCampaignsResponse> 
       <getCampaignsResult> 
        <campaign> 
         <categoryBids> 
           <categoryBid> 
            <campaignCategoryUID>1234</campaignCategoryUID> 
            <campaignID>1211</campaignID> 
            <categoryID>1254</categoryID> 
            <selected>true</selected> 
            <bidInformation> 
             <biddingStrategy>Cpc</biddingStrategy> 
             <cpcBid> 
             <cpc>0.5</cpc> 
             </cpcBid> 
             <cpaBid xsi:nil="true"/> 
            </bidInformation> 
           </categoryBid> 
           <categoryBid> 
             <campaignCategoryUID>5487</campaignCategoryUID> 
             <campaignID>3244</campaignID> 
             <categoryID>1234</categoryID> 
             <selected>true</selected> 
             <bidInformation> 
             <biddingStrategy>Cpc</biddingStrategy> 
             <cpcBid> 
              <cpc>0.2</cpc> 
             </cpcBid> 
             <cpaBid xsi:nil="true"/> 
            </bidInformation> 
           </categoryBid> 
         </categoryBids> 
        </campaign> 
       </getCampaignsResult> 
      </getCampaignsResponse> 
     </soap:Body> 
    </soap:Envelope>' 

# Parse the string into an internal (C-level) document and take its root node
doc <- xmlRoot(xmlTreeParse(xml, useInternalNodes = TRUE)) 

# One XPath query per column; each returns one value per <categoryBid>
data <- data.frame(campaignCategoryUID=xpathSApply(doc, "//campaignCategoryUID", xmlValue), 
        campaignID=xpathSApply(doc, "//campaignID", xmlValue), 
        categoryID=xpathSApply(doc, "//categoryID", xmlValue), 
        biddingStrategy=xpathSApply(doc, "//biddingStrategy", xmlValue), 
        cpc=xpathSApply(doc, "//cpc", xmlValue)) 

data 

## campaignCategoryUID campaignID categoryID biddingStrategy cpc 
## 1    1234  1211  1254    Cpc 0.5 
## 2    5487  3244  1234    Cpc 0.2 
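
On the namespace point: the sample above declares no namespaces, but a real SOAP response usually does, and then a plain element name in XPath may stop matching. The sketch below assumes a made-up default namespace URI (http://example.com/ws) and a trimmed-down response; neither comes from the question. It shows two common ways to query namespaced elements with the XML package.

```r
library(XML)

# Hypothetical response fragment with a default namespace (the URI is made up)
xml_ns <- '<getCampaignsResponse xmlns="http://example.com/ws">
  <categoryBid><campaignID>1211</campaignID></categoryBid>
  <categoryBid><campaignID>3244</campaignID></categoryBid>
</getCampaignsResponse>'

doc_ns <- xmlParse(xml_ns, asText = TRUE)

# Option 1: match on the local name and ignore the namespace entirely
by_local <- xpathSApply(doc_ns, "//*[local-name()='campaignID']", xmlValue)

# Option 2: bind a prefix of your own choosing ("ws") to the namespace URI
by_prefix <- xpathSApply(doc_ns, "//ws:campaignID", xmlValue,
                         namespaces = c(ws = "http://example.com/ws"))
```

Option 1 is the least fuss when you do not care which namespace an element belongs to; option 2 is stricter and is probably the safer choice if the response mixes several namespaces.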

You can also generalize the extraction by looping over the node names:

nodes <- c("campaignCategoryUID", "campaignID", "categoryID", "biddingStrategy", "cpc") 
data <- rbind.data.frame(sapply(nodes, function(x) xpathSApply(doc, sprintf("//%s", x), xmlValue))) 

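This works as long as you do not need to handle edge cases (that is, every extraction is uniform and there are no "misses", so each node appears exactly once in every record).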
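
If misses can occur, one option is to extract per record rather than per column, so that a missing node becomes NA instead of silently shifting values between rows. This is only a sketch: the input is a hypothetical variant of the question's XML in which the second categoryBid has no cpc element, and first_or_na is a helper name made up for this example.

```r
library(XML)

# Hypothetical input: the second record has no <cpc> element
xml_gap <- '<categoryBids>
  <categoryBid>
    <campaignCategoryUID>1234</campaignCategoryUID>
    <campaignID>1211</campaignID>
    <categoryID>1254</categoryID>
    <bidInformation>
      <biddingStrategy>Cpc</biddingStrategy>
      <cpcBid><cpc>0.5</cpc></cpcBid>
    </bidInformation>
  </categoryBid>
  <categoryBid>
    <campaignCategoryUID>5487</campaignCategoryUID>
    <campaignID>3244</campaignID>
    <categoryID>1234</categoryID>
    <bidInformation>
      <biddingStrategy>Cpa</biddingStrategy>
    </bidInformation>
  </categoryBid>
</categoryBids>'

doc_gap <- xmlParse(xml_gap, asText = TRUE)
nodes <- c("campaignCategoryUID", "campaignID", "categoryID", "biddingStrategy", "cpc")

# Hypothetical helper: text of the first matching descendant of `record`,
# or NA when the node is absent
first_or_na <- function(record, name) {
  hit <- xpathSApply(record, sprintf(".//%s", name), xmlValue)
  if (length(hit) == 0) NA_character_ else hit[[1]]
}

# Build one single-row data frame per <categoryBid>, then stack them
records <- getNodeSet(doc_gap, "//categoryBid")
rows <- lapply(records, function(rec) {
  vals <- vapply(nodes, function(n) first_or_na(rec, n), character(1))
  as.data.frame(as.list(vals), stringsAsFactors = FALSE)
})
data_gap <- do.call(rbind, rows)
```

All columns come back as character; convert them afterwards (for example with as.numeric on cpc) if you need numeric types.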


Thanks! Your solution works great for me! – jburkhardt 2014-12-03 10:10:37