2012-12-02 41 views
4

我正在使用Nokogiri来修改现有的XML,但我在选择某些节点时遇到了问题。Nokogiri XPath找不到某些节点

这里是XML的相关片段:

<ProductCatalog> 
    <ProductLineItem> 
    <updi:ProductIdentification> 
     <updi:ProductName>800-22283-03</updi:ProductName> 

我可以找到与较低的两个节点:

doc.xpath("//updi:ProductIdentification") => #<Nokogiri::XML... 
doc.xpath("//updi:ProductName") => #<Nokogiri::XML... 

但是如果我尽量选择上的一个节点:

doc.xpath("//ProductLineItem") => [] 

我找回一个空数组。这似乎与前缀有关。我可以找到任何有前缀的元素,但找不到没有前缀的元素。

更新:这里是(相当长)命名空间:

xsi:schemaLocation="urn:rosettanet:specification:interchange:ProductCatalogInformationDistribution:xsd:schema:01.00 ..\..\XML\Interchange\ProductCatalogInformationDistribution_01_00.xsd" 
xmlns:dplcs="urn:rosettanet:specification:domain:Design:ProductLifeCycleStatusCode:xsd:codelist:01.03" 
xmlns:rrt="urn:rosettanet:specification:domain:Shared:RateType:xsd:codelist:01.01" 
xmlns:dl="urn:rosettanet:specification:domain:Logistics:xsd:schema:02.15" 
xmlns:ictc="urn:rosettanet:specification:domain:Design:CatalogType:xsd:codelist:01.00" 
xmlns:updi="urn:rosettanet:specification:universal:ProductIdentification:xsd:schema:01.04" 
xmlns:dddt="urn:rosettanet:specification:domain:Design:DateType:xsd:codelist:01.00" 
xmlns:dsdc="urn:rosettanet:specification:domain:Logistics:ShipDateCode:xsd:codelist:01.03" 
xmlns:ucr="urn:rosettanet:specification:universal:Currency:xsd:codelist:01.02" 
xmlns:dpiac="urn:rosettanet:specification:domain:Logistics:PortIdentifierAuthorityCode:xsd:codelist:01.03" 
xmlns:rptc="urn:rosettanet:specification:domain:Shared:PricingTypeCode:xsd:codelist:01.03" 
xmlns:dit="urn:rosettanet:specification:domain:Procurement:InventoryType:xsd:codelist:01.03" 
xmlns:dtt="urn:rosettanet:specification:domain:Procurement:TransactionType:xsd:codelist:01.04" 
xmlns:upd="urn:rosettanet:specification:universal:PhysicalDimension:xsd:schema:01.05" 
xmlns:dcst="urn:rosettanet:specification:domain:Logistics:CustomsType:xsd:codelist:01.03" 
xmlns:dsd="urn:rosettanet:specification:domain:Logistics:ShippingDocument:xsd:codelist:01.02" 
xmlns:uci="urn:rosettanet:specification:universal:ContactInformation:xsd:schema:01.03" 
xmlns:dpcm="urn:rosettanet:specification:domain:Procurement:PurchaseMethod:xsd:codelist:01.03" 
xmlns:rpsc="urn:rosettanet:specification:domain:Shared:ProductStatusCode:xsd:codelist:01.01" 
xmlns:dgrc="urn:rosettanet:specification:domain:Marketing:GeographicRegionCode:xsd:codelist:01.02" 
xmlns:dtrt="urn:rosettanet:specification:domain:Logistics:TrackingReferenceType:xsd:codelist:01.06" 
xmlns:umtq="urn:rosettanet:specification:universal:MimeTypeQualifier:xsd:codelist:01.02" 
xmlns:dcrt="urn:rosettanet:specification:domain:Procurement:CustomerType:xsd:codelist:01.03" 
xmlns:dscd="urn:rosettanet:specification:domain:Logistics:ShipmentChangeDisposition:xsd:codelist:01.03" 
xmlns:uc="urn:rosettanet:specification:universal:Country:xsd:codelist:01.02" 
xmlns="urn:rosettanet:specification:interchange:ProductCatalogInformationDistribution:xsd:schema:01.00" 
xmlns:dpc="urn:rosettanet:specification:domain:Procurement:PaymentCondition:xsd:codelist:01.03" 
xmlns:rpmt="urn:rosettanet:specification:domain:Shared:PaymentType:xsd:codelist:01.01" 
xmlns:dft="urn:rosettanet:specification:domain:Procurement:FinanceTerms:xsd:codelist:01.03" 
xmlns:dtq="urn:rosettanet:specification:domain:Procurement:TotalQualifier:xsd:codelist:01.03" 
xmlns:ume="urn:rosettanet:specification:universal:MonetaryExpression:xsd:schema:01.04" 
xmlns:dcp="urn:rosettanet:specification:domain:Design:Compliant:xsd:codelist:01.02" 
xmlns:drsc="urn:rosettanet:specification:domain:Marketing:RegistrationStatusCode:xsd:codelist:01.03" 
xmlns:uat="urn:rosettanet:specification:universal:AbstractType:xsd:schema:01.02" 
xmlns:dp="urn:rosettanet:specification:domain:Procurement:xsd:schema:02.17" 
xmlns:rpm="urn:rosettanet:specification:domain:Shared:PaymentMethod:xsd:codelist:01.02" 
xmlns:dfrt="urn:rosettanet:specification:domain:Procurement:ForecastReferenceType:xsd:codelist:01.03" 
xmlns:dtec="urn:rosettanet:specification:domain:Procurement:TaxExemptionCode:xsd:codelist:01.03" 
xmlns:ulc="urn:rosettanet:specification:universal:Locations:xsd:schema:01.04" 
xmlns:dccc="urn:rosettanet:specification:domain:Procurement:CreditCardClassification:xsd:codelist:01.03" 
xmlns:drlc="urn:rosettanet:specification:domain:Logistics:ReturnLabelCode:xsd:codelist:01.03" 
xmlns:st="http://www.ascc.net/xml/schematron" 
xmlns:dnecc="urn:rosettanet:specification:domain:Logistics:NationalExportControlClassification:xsd:codelist:01.03" 
xmlns:rpktc="urn:rosettanet:specification:domain:Shared:PackageTypeCode:xsd:codelist:01.01" 
xmlns:uwt="urn:rosettanet:specification:universal:WeightType:xsd:codelist:01.01" 
xmlns:dfpt="urn:rosettanet:specification:domain:Logistics:FreightPaymentTerms:xsd:codelist:01.03" 
xmlns:dte="urn:rosettanet:specification:domain:Procurement:TransportEvent:xsd:codelist:01.03" 
xmlns:ul="urn:rosettanet:specification:universal:Language:xsd:codelist:01.02" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:dbpq="urn:rosettanet:specification:domain:Procurement:BookPriceQualifier:xsd:codelist:01.04" 
xmlns:drl="urn:rosettanet:specification:domain:Logistics:RouteLocation:xsd:codelist:01.03" 
xmlns:ssdh="urn:rosettanet:specification:system:StandardDocumentHeader:xsd:schema:01.16" 
xmlns:dmk="urn:rosettanet:specification:domain:Marketing:xsd:schema:02.12" 
xmlns:rmat="urn:rosettanet:specification:domain:Shared:MonetaryAmountType:xsd:codelist:01.01" 
xmlns:uuom="urn:rosettanet:specification:universal:UnitOfMeasure:xsd:codelist:01.03" 
xmlns:dfe="urn:rosettanet:specification:domain:Procurement:ForecastEvent:xsd:codelist:01.03" 
xmlns:dst="urn:rosettanet:specification:domain:Procurement:ShipmentTerms:xsd:codelist:01.03" 
xmlns:udt="urn:rosettanet:specification:universal:DataType:xsd:schema:01.04" 
xmlns:dacc="urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03" 
xmlns:dptt="urn:rosettanet:specification:domain:Logistics:PortType:xsd:codelist:01.03" 
xmlns:sha="urn:rosettanet:specification:domain:Shared:xsd:schema:01.10" 
xmlns:dlv="urn:rosettanet:specification:domain:Design:Level:xsd:codelist:01.02" 
xmlns:rict="urn:rosettanet:specification:domain:Shared:InvoiceChargeType:xsd:codelist:01.02" 
xmlns:utt="urn:rosettanet:specification:universal:TaxType:xsd:codelist:01.02" 
xmlns:ddwsr="urn:rosettanet:specification:domain:Marketing:DesignWinStatusReason:xsd:codelist:01.03" 
xmlns:dsm="urn:rosettanet:specification:domain:Logistics:ShipmentMode:xsd:codelist:01.05" 
xmlns:udct="urn:rosettanet:specification:universal:DocumentType:xsd:codelist:01.09" 
xmlns:dac="urn:rosettanet:specification:domain:Design:ActionCode:xsd:codelist:01.03" 
xmlns:dpsr="urn:rosettanet:specification:domain:Procurement:ProductSubstitutionReason:xsd:codelist:01.03" 
xmlns:sft="urn:rosettanet:specification:system:TPIRFileType:xsd:codelist:01.01" 
xmlns:dltcc="urn:rosettanet:specification:domain:Procurement:LeadTimeClassificationCode:xsd:codelist:01.03" 
xmlns:ri="urn:rosettanet:specification:domain:Shared:Interval:xsd:codelist:01.01" 
xmlns:urss="urn:rosettanet:specification:system:xml:1.0" 
xmlns:dds="urn:rosettanet:specification:domain:Design:xsd:schema:02.15" 
xmlns:dslt="urn:rosettanet:specification:domain:Procurement:SaleType:xsd:codelist:01.04" 
xmlns:udc="urn:rosettanet:specification:universal:Document:xsd:schema:01.08" 
xmlns:dabcc="urn:rosettanet:specification:domain:Design:ABCCode:xsd:codelist:01.02" 
xmlns:dppt="urn:rosettanet:specification:domain:Procurement:ProductProcurementType:xsd:codelist:01.03" 
xmlns:rwtc="urn:rosettanet:specification:domain:Shared:WarrantyType:xsd:codelist:01.01" 
xmlns:dlit="urn:rosettanet:specification:domain:Logistics:InstructionType:xsd:codelist:01.00" 
xmlns:rfob="urn:rosettanet:specification:domain:Shared:FreeOnBoard:xsd:codelist:01.01" 
xmlns:upri="urn:rosettanet:specification:universal:ProcessRoleIdentifier:xsd:codelist:01.08" 
xmlns:ddrn="urn:rosettanet:specification:domain:Marketing:DesignRegistrationNotification:xsd:codelist:01.02" 
xmlns:dsh="urn:rosettanet:specification:domain:Procurement:SpecialHandling:xsd:codelist:01.04" 
xmlns:ud="urn:rosettanet:specification:universal:Dates:xsd:schema:01.03" 
xmlns:dpms="urn:rosettanet:specification:domain:Marketing:ProjectMarketSegment:xsd:codelist:01.02" 
xmlns:rssl="urn:rosettanet:specification:domain:Shared:ShippingServiceLevel:xsd:codelist:01.01" 
xmlns:dldr="urn:rosettanet:specification:domain:Logistics:LotDiscrepancyReason:xsd:codelist:01.03" 
xmlns:rat="urn:rosettanet:specification:domain:Shared:AmountType:xsd:codelist:01.02" 
xmlns:upi="urn:rosettanet:specification:universal:PartnerIdentification:xsd:schema:01.12" 
xmlns:ddp="urn:rosettanet:specification:domain:Marketing:Disposition:xsd:codelist:01.02" 
xmlns:dsfr="urn:rosettanet:specification:domain:Procurement:SpecialFulfillmentRequest:xsd:codelist:01.03" 
xmlns:ucs="urn:rosettanet:specification:universal:CountrySubdivision:xsd:codelist:01.02 
+0

什么是文档根目录下的命名空间? – Phrogz

+0

这是相当长的....我会将它添加到我的问题 – Pynner

+0

请参阅下面的答案;您的更新不是文档的命名空间。您正在寻找根元素上的'xmlns =“...”'属性。 – Phrogz

回答

8

最简单快速黑客的解决方案是完全从文档中删除命名空间:

require 'nokogiri' 
xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'><a/><bar:b /></root>" 

p xml.xpath('//b').length  #=> 0 
p xml.xpath('//bar:b').length #=> 1 
p xml.xpath('//a').length  #=> 0 
xml.remove_namespaces! 
p xml.xpath('//a').length  #=> 1 
p xml.xpath('//b').length  #=> 1 

然而,上述不一个有效的解决方案,如果你需要保留名字空间(例如修改你的文档并保存它,或者你在各种名字空间中有相互冲突的元素或属性名称)。如果你不能核弹的命名空间,你可以创建一个前缀,并告诉引入nokogiri它所对应...

xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'><a/><bar:b /></root>" 
p xml.xpath('//x:a','x'=>'foo').length #=> 1 

...其中字符串foo是URI为拥有元素的文档中的命名空间有一个默认的命名空间(通常在根目录下)和字符串x就是你想要的(不与已经在你的文档中声明的另一个命名空间相冲突)。或者更简单地说,你可以只使用xmlns作为默认命名空间的前缀:

p xml.xpath('//xmlns:a').length #=> 1 

另外,如果你需要离开的命名空间,并可以构建一个合理的CSS样式选择,以获得您所需要那么节点您可以使用css方法:

require 'nokogiri' 
xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'> 
    <a/> 
    <bar:b /> 
    <c xmlns='jim'><d/></c> 
</root>" 

p xml.css('a').length, #=> 1 
    xml.css('b').length, #=> 0 
    xml.css('c').length, #=> 0 
    xml.css('d').length #=> 0 

如上图所示,注意,这仅适用于那些在相同的命名空间的根元素节点。

+0

我对命名空间的理解是非常简单的。我真的不明白为什么这个作品....但它确实!谢谢。 – Pynner

+0

@Pynner每个XML元素和属性都可以与一个名称空间相关联。命名空间只是一个用来唯一标识它的URI。但是,每次需要SVG圆元素时,编写''是不可能的,因此有两种更简单的方式来分配一个名称空间。 1)您可以通过您构成的简写标识符前缀来引用一个名称空间;例如' ... '在该命名空间中创建一个'jim'元素。 2)元素的默认命名空间('xmlns')被所有非前缀的后代继承。 – Phrogz

+0

感谢您的解释,同样也适用于其他人从事相同类型的问题。你可以使用'doc来引用默认的命名空间。xpath('// x:element','x'=> doc.namespaces ['xmlns'])' – Pynner