2017-03-02 65 views

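Split and route a large XML file with a namespace using WSO2 ESB and smooks

I need to process an XML file that declares a namespace on its root element and contains more than 133K child elements; it is about 500 MB in size. To do this I am using WSO2 ESB 5 and the smooks mediator.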

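Basically, what I am looking for is to split the input file into small chunks with a predefined structure and send each of them to a queue for later processing.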

My first attempt was to apply an XSLT transformation to remove the namespace from the input file, but I got an out-of-memory error like this:

TID: [-1234] [] [2017-03-02 03:04:43,900] ERROR {org.apache.axis2.transport.base.threads.NativeWorkerPool} - Uncaught exception {org.apache.axis2.transport.base.threads.NativeWorkerPool} 
java.lang.OutOfMemoryError: GC overhead limit exceeded 
    at org.apache.axiom.om.impl.llom.factory.OMLinkedListImplFactory.createOMText(OMLinkedListImplFactory.java:192) 
    at org.apache.axiom.om.impl.builder.StAXBuilder.createOMText(StAXBuilder.java:294) 
    at org.apache.axiom.om.impl.builder.StAXBuilder.createOMText(StAXBuilder.java:250) 
    at org.apache.axiom.om.impl.builder.StAXOMBuilder.next(StAXOMBuilder.java:252) 
    at org.apache.axiom.om.impl.llom.OMSerializableImpl.build(OMSerializableImpl.java:78) 
    at org.apache.axiom.om.impl.llom.OMElementImpl.build(OMElementImpl.java:722) 
    at org.apache.axiom.om.impl.llom.OMElementImpl.detach(OMElementImpl.java:700) 
    at org.apache.axiom.om.impl.llom.OMNodeImpl.setParent(OMNodeImpl.java:105) 
    at org.apache.axiom.om.impl.llom.OMNodeImpl.insertSiblingAfter(OMNodeImpl.java:203) 
    at org.apache.synapse.mediators.transform.XSLTMediator.performXSLT(XSLTMediator.java:366) 
    at org.apache.synapse.mediators.transform.XSLTMediator.mediate(XSLTMediator.java:202) 
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:97) 
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:59) 
    at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:158) 
    at org.apache.synapse.core.axis2.ProxyServiceMessageReceiver.receive(ProxyServiceMessageReceiver.java:210) 
    at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:180) 
    at org.apache.axis2.transport.base.AbstractTransportListener.handleIncomingMessage(AbstractTransportListener.java:328) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.processFile(VFSTransportListener.java:824) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.scanFileOrDirectory(VFSTransportListener.java:472) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:188) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:134) 
    at org.apache.axis2.transport.base.AbstractPollingTransportListener$1$1.run(AbstractPollingTransportListener.java:67) 
    at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:172) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:745) 

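I do not understand why this happens, because my VM is configured with -Xms4096m -Xmx6144m. Based on that error, I decided to use Smooks to implement a kind of streaming solution.

To do so, I defined a VFS proxy service that polls a folder and hands the file to the smooks mediator, but I keep getting an error that seems to be related to the namespace definition on the root element of the input file. I mention this because whenever I edit the input file and remove the namespace definition, what I have defined and deployed on WSO2 ESB works perfectly. The key point is that I receive the large file from a black-box backend system, so I have to deal with the namespace.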

Here are the definitions I have on my ESB:

Proxy service

<?xml version="1.0" encoding="UTF-8"?>
<proxy xmlns="http://ws.apache.org/ns/synapse"
       name="Tryzens_ProductProxy"
       startOnLoad="true"
       statistics="disable"
       trace="disable"
       transports="vfs">
    <target>
        <inSequence>
            <log level="custom">
                <property name="Tryzens_ProductProxy__tracing" value="before smooks"/>
            </log>
            <property name="DISABLE_SMOOKS_RESULT_PAYLOAD" value="true"/>
            <smooks config-key="ProductSplitJMS_Smook">
                <input type="xml"/>
                <output type="xml"/>
            </smooks>
            <log level="custom">
                <property name="Tryzens_ProductProxy__tracing" value="after smooks"/>
            </log>
        </inSequence>
    </target>
    <parameter name="transport.vfs.Streaming">true</parameter>
    <parameter name="transport.PollInterval">15</parameter>
    <parameter name="transport.vfs.ActionAfterProcess">MOVE</parameter>
    <parameter name="transport.vfs.FileURI">vfs:file:///home/jairof/wso2/00_test/working/tryzens/smook_product/</parameter>
    <parameter name="transport.vfs.MoveAfterProcess">vfs:file:///home/jairof/wso2/00_test/working/tryzens/output/</parameter>
    <parameter name="transport.vfs.MoveAfterFailure">vfs:file:///home/jairof/wso2/00_test/working/tryzens/fails/</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*.xml</parameter>
    <parameter name="transport.vfs.ContentType">application/xml</parameter>
    <parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
    <description/>
</proxy>

Smooks configuration

<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
                      xmlns:ftl="http://www.milyn.org/xsd/smooks/freemarker-1.1.xsd"
                      xmlns:xsl="http://www.milyn.org/xsd/smooks/xsl-1.1.xsd"
                      xmlns:core="http://www.milyn.org/xsd/smooks/smooks-core-1.3.xsd"
                      xmlns:jms="http://www.milyn.org/xsd/smooks/jms-routing-1.2.xsd">
    <params>
        <param name="stream.filter.type">SAX</param>
        <param name="default.serialization.on">false</param>
    </params>
    <resource-config selector="product">
        <resource>org.milyn.delivery.DomModelCreator</resource>
    </resource-config>
    <jms:router routeOnElement="product" beanId="productItem_xml" destination="dynamicQueues/TestFL">
        <jms:connection factory="QueueConnectionFactory"/>
        <jms:jndi contextFactory="org.apache.activemq.jndi.ActiveMQInitialContextFactory" providerUrl="tcp://localhost:61616"/>
        <jms:highWaterMark mark="-1"/>
    </jms:router>
    <ftl:freemarker applyOnElement="product">
        <ftl:template>/repository/resources/smooks/product.ftl</ftl:template>
        <ftl:use>
            <ftl:bindTo id="productItem_xml"/>
        </ftl:use>
    </ftl:freemarker>
</smooks-resource-list>

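Smooks template

This template is only for testing purposes; the real one covers the full structure of the product element, but this is enough to reproduce the error: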

<#ftl ns_prefixes={"ns1": "http://www.demandware.com/xml/impex/catalog/2006-10-31"}> 
<product id='${.vars["product"]["@product-id"]}'> 
    <ean>${product.ean}</ean>   
</product> 

Sample input file

Note that the actual file has more than 133K products; in this sample I cut most of the file and left only two products.

<?xml version="1.0" encoding="UTF-8"?> 
<catalog xmlns="http://www.demandware.com/xml/impex/catalog/2006-10-31" catalog-id="tml-catalog-en"> 
    <header> 
     <image-settings> 
      <internal-location base-path="/images"/> 
      <view-types> 
       <view-type>original</view-type> 
       <view-type>portrait</view-type> 
       <view-type>badge_GBP</view-type> 
       <view-type>badge_EUR</view-type> 
       <view-type>badge_USD</view-type> 
       <view-type>badge_AUD</view-type> 
       <view-type>badge_CZH</view-type> 
       <view-type>ctlimage</view-type> 
       <view-type>badge_FRA</view-type> 
       <view-type>badge_GER</view-type> 
       <view-type>landscape</view-type> 
      </view-types> 
      <alt-pattern>${productname}, ${variationvalue}, ${viewtype}</alt-pattern> 
      <title-pattern>${productname}, ${variationvalue}</title-pattern> 
     </image-settings> 
    </header> 

    <category category-id="MensShoes"> 
     <display-name xml:lang="de-DE">Schuhe</display-name> 
     <display-name xml:lang="x-default">Shoes</display-name> 
     <display-name xml:lang="fr-FR">Chaussures</display-name> 
     <online-flag>true</online-flag> 
     <parent>MENSWEAR</parent> 
     <position>12.0</position> 
     <image>images/slot/landing/men_menlanding_H1_GBP.jpg</image> 
     <template/> 
     <page-attributes/> 
     <custom-attributes> 
      <custom-attribute attribute-id="categoryRecommendationsEnable">false</custom-attribute> 
      <custom-attribute attribute-id="enableCompare">false</custom-attribute> 
      <custom-attribute attribute-id="enableGridItemButtonStrip">false</custom-attribute> 
      <custom-attribute attribute-id="enableGridItemMobileButtonStrip">false</custom-attribute> 
      <custom-attribute attribute-id="enableUserJourney">false</custom-attribute> 
      <custom-attribute attribute-id="enableWishlist">false</custom-attribute> 
      <custom-attribute attribute-id="fitsme_enabled">false</custom-attribute> 
      <custom-attribute attribute-id="rrGenere">false</custom-attribute> 
      <custom-attribute attribute-id="rsCategoryEnabled">false</custom-attribute> 
      <custom-attribute attribute-id="shopAllButton">false</custom-attribute> 
      <custom-attribute attribute-id="showInMenu">true</custom-attribute> 
      <custom-attribute attribute-id="showInMobileMenu">false</custom-attribute> 
      <custom-attribute attribute-id="show_alternate_image_on_plp">false</custom-attribute> 
      <custom-attribute attribute-id="slotBannerImage">images/slot/landing/men_menlanding_H1_GBP.jpg</custom-attribute> 
     </custom-attributes> 
    </category> 

    <category category-id="P50 SUIT"> 
     <display-name xml:lang="de-DE">Hosen</display-name> 
     <display-name xml:lang="x-default">Trousers</display-name> 
     <display-name xml:lang="fr-FR">Pantalons</display-name> 
     <online-flag>true</online-flag> 
     <parent>WomensTailoring</parent> 
     <position>0.0</position> 
     <template/> 
     <page-attributes/> 
    </category> 

    <product product-id="0"> 
     <ean/> 
     <upc/> 
     <unit/> 
     <min-order-quantity>1</min-order-quantity> 
     <step-quantity>1</step-quantity> 
     <store-force-price-flag>false</store-force-price-flag> 
     <store-non-inventory-flag>false</store-non-inventory-flag> 
     <store-non-revenue-flag>false</store-non-revenue-flag> 
     <store-non-discountable-flag>false</store-non-discountable-flag> 
     <online-flag>false</online-flag> 
     <available-flag>true</available-flag> 
     <searchable-flag>true</searchable-flag> 
     <images> 
      <image-group view-type="badge_EUR"> 
       <image path="badge/blank.png"/> 
      </image-group> 
      <image-group view-type="badge_GBP"> 
       <image path="badge/blank.png"/> 
      </image-group> 
      <image-group view-type="badge_GER"> 
       <image path="badge/blank.png"/> 
      </image-group> 
      <image-group view-type="badge_USD"> 
       <image path="badge/blank.png"/> 
      </image-group> 
     </images> 
     <page-attributes/> 
     <pinterest-enabled-flag>false</pinterest-enabled-flag> 
     <facebook-enabled-flag>false</facebook-enabled-flag> 
     <store-attributes> 
      <force-price-flag>false</force-price-flag> 
      <non-inventory-flag>false</non-inventory-flag> 
      <non-revenue-flag>false</non-revenue-flag> 
      <non-discountable-flag>false</non-discountable-flag> 
     </store-attributes> 
    </product> 

    <product product-id="12024"> 
     <ean/> 
     <upc/> 
     <unit/> 
     <min-order-quantity>1</min-order-quantity> 
     <step-quantity>1</step-quantity> 
     <store-force-price-flag>false</store-force-price-flag> 
     <store-non-inventory-flag>false</store-non-inventory-flag> 
     <store-non-revenue-flag>false</store-non-revenue-flag> 
     <store-non-discountable-flag>false</store-non-discountable-flag> 
     <online-flag>false</online-flag> 
     <available-flag>true</available-flag> 
     <searchable-flag>true</searchable-flag> 
     <images> 
      <image-group view-type="original"> 
       <image path="original/12024_original_original.jpg"/> 
      </image-group> 
     </images> 
     <brand>J FRANCOMB</brand> 
     <page-attributes/> 
     <custom-attributes> 
      <custom-attribute attribute-id="allocGroup">X</custom-attribute> 
      <custom-attribute attribute-id="colour"> 
       <value>3PNK-PINK</value> 
      </custom-attribute> 
      <custom-attribute attribute-id="cuffType"> 
       <value>SINGLE CUFF</value> 
      </custom-attribute> 
      <custom-attribute attribute-id="enable_pdp_asset_footer_layout">false</custom-attribute> 
      <custom-attribute attribute-id="fabric"> 
       <value>LEWIN 100 PD</value> 
      </custom-attribute> 
      <custom-attribute attribute-id="fit">SEMI FIT</custom-attribute> 
      <custom-attribute attribute-id="gender"> 
       <value>M</value> 
      </custom-attribute> 
      <custom-attribute attribute-id="look">PTRN447</custom-attribute> 
      <custom-attribute attribute-id="pattern"> 
       <value>PATTERN</value> 
      </custom-attribute> 
      <custom-attribute attribute-id="productIDCIMS">12024</custom-attribute> 
      <custom-attribute attribute-id="retailTypeCIMS">M FORMAL</custom-attribute> 
      <custom-attribute attribute-id="seasonCIMS">307B</custom-attribute> 
      <custom-attribute attribute-id="styleName">MILSC PATTERN DOOM AND BLOOM</custom-attribute> 
      <custom-attribute attribute-id="styleNameCIMS">MILSC PATTERN DOOM AND BLOOM</custom-attribute> 
      <custom-attribute attribute-id="styleNumberCIMS">MS17</custom-attribute> 
      <custom-attribute attribute-id="typeDesc">MS SHIRTS</custom-attribute> 
      <custom-attribute attribute-id="weight">0.3</custom-attribute> 
     </custom-attributes> 
     <options> 
      <shared-option option-id="sleeveLengthAlteration"/> 
      <shared-option option-id="giftBox"/> 
     </options> 
     <variations> 
      <attributes> 
       <shared-variation-attribute attribute-id="collarSize" variation-attribute-id="collarSize"/> 
       <shared-variation-attribute attribute-id="sleeveLength" variation-attribute-id="sleeveLength"/> 
      </attributes> 
     </variations> 
     <classification-category>S17 MILAN</classification-category> 
     <pinterest-enabled-flag>false</pinterest-enabled-flag> 
     <facebook-enabled-flag>false</facebook-enabled-flag> 
     <store-attributes> 
      <force-price-flag>false</force-price-flag> 
      <non-inventory-flag>false</non-inventory-flag> 
      <non-revenue-flag>false</non-revenue-flag> 
      <non-discountable-flag>false</non-discountable-flag> 
     </store-attributes> 
    </product> 

    <category-assignment category-id="T43 HERITAGE" product-id="505158991125"> 
     <primary-flag>true</primary-flag> 
    </category-assignment> 
    <category-assignment category-id="U30 BOXERS" product-id="505158774834"/> 
    <recommendation source-id="58462" source-type="product" target-id="505158886294" type="4"/> 
</catalog> 

Error in the wso2carbon.log file

TID: [-1234] [] [2017-03-02 12:15:27,793] INFO {org.apache.synapse.mediators.builtin.LogMediator} - Tryzens_ProductProxy__tracing = before smooks {org.apache.synapse.mediators.builtin.LogMediator} 
TID: [-1234] [] [2017-03-02 12:15:28,376] ERROR {freemarker.runtime} - {freemarker.runtime} 

Error on line 3, column 12 in repository/resources/smooks/product.ftl 
Expecting a string, date or number here, Expression product.ean is instead a freemarker.ext.dom.NodeListModel 
The problematic instruction: 
---------- 
==> ${product.ean} [on line 3, column 10 in repository/resources/smooks/product.ftl] 
---------- 

Java backtrace for programmers: 
---------- 
freemarker.core.NonStringException: Error on line 3, column 12 in repository/resources/smooks/product.ftl 
Expecting a string, date or number here, Expression product.ean is instead a freemarker.ext.dom.NodeListModel 
    at freemarker.core.Expression.getStringValue(Expression.java:126) 
    at freemarker.core.Expression.getStringValue(Expression.java:93) 
    at freemarker.core.DollarVariable.accept(DollarVariable.java:76) 
    at freemarker.core.Environment.visit(Environment.java:209) 
    at freemarker.core.MixedContent.accept(MixedContent.java:92) 
    at freemarker.core.Environment.visit(Environment.java:209) 
    at freemarker.core.Environment.process(Environment.java:189) 
    at freemarker.template.Template.process(Template.java:237) 
    at org.milyn.templating.freemarker.FreeMarkerTemplateProcessor.applyTemplate(FreeMarkerTemplateProcessor.java:358) 
    at org.milyn.templating.freemarker.FreeMarkerTemplateProcessor.applyTemplate(FreeMarkerTemplateProcessor.java:346) 
    at org.milyn.templating.freemarker.FreeMarkerTemplateProcessor.visitAfter(FreeMarkerTemplateProcessor.java:333) 
    at org.milyn.delivery.sax.SAXHandler.visitAfter(SAXHandler.java:389) 
    at org.milyn.delivery.sax.SAXHandler.endElement(SAXHandler.java:204) 
    at org.milyn.delivery.SmooksContentHandler.endElement(SmooksContentHandler.java:96) 
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source) 
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown Source) 
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source) 
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source) 
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) 
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) 
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) 
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) 
    at org.milyn.delivery.sax.SAXParser.parse(SAXParser.java:76) 
    at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:86) 
    at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:64) 
    at org.milyn.Smooks._filter(Smooks.java:526) 
    at org.milyn.Smooks.filterSource(Smooks.java:482) 
    at org.wso2.carbon.mediator.transform.SmooksMediator.mediate(SmooksMediator.java:146) 
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:97) 
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:59) 
    at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:158) 
    at org.apache.synapse.core.axis2.ProxyServiceMessageReceiver.receive(ProxyServiceMessageReceiver.java:210) 
    at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:180) 
    at org.apache.axis2.transport.base.AbstractTransportListener.handleIncomingMessage(AbstractTransportListener.java:328) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.processFile(VFSTransportListener.java:824) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.scanFileOrDirectory(VFSTransportListener.java:472) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:188) 
    at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:134) 
    at org.apache.axis2.transport.base.AbstractPollingTransportListener$1$1.run(AbstractPollingTransportListener.java:67) 
    at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:172) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:745) 

Please help; I would appreciate any advice on solving this issue. Thanks in advance.

Answer

In the Smooks template (the .ftl file), if you want to use something like ${product.ean}, you have to define the "product" variable:

<#assign product = .vars["product"]> 

In your XML input file, all nodes belong to the same default namespace "http://www.demandware.com/xml/impex/catalog/2006-10-31".

You can declare that default namespace in the FTL with the reserved prefix "D": <#ftl ns_prefixes={"D":"http://www.demandware.com/xml/impex/catalog/2006-10-31"}>
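
Putting the two changes together, the test template from the question might look like the sketch below. It assumes, as in the question's Smooks configuration, that DomModelCreator exposes the current element under the name "product"; the element and attribute names come from the sample input file. Without a declared default namespace, an unprefixed name such as product.ean only matches elements that have no namespace, so the expression most likely returned an empty node list, which would explain why FreeMarker reported a NodeListModel instead of a string.

<#-- Sketch only: "D" makes unprefixed element names resolve in the catalog namespace -->
<#ftl ns_prefixes={"D": "http://www.demandware.com/xml/impex/catalog/2006-10-31"}>
<#-- Bind the DOM model created by Smooks to a template variable -->
<#assign product = .vars["product"]>
<product id='${product["@product-id"]}'>
    <ean>${product.ean}</ean>
</product>

The product-id attribute carries no namespace, so it is read without a prefix.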

Hi Jean-Michel, I really appreciate your help; I applied your suggestion and now it works perfectly. Thanks a lot.