2011-08-19

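Reading XHTML with custom tags into a DOM tree

I am using Flying Saucer to convert XHTML to PDF and it works perfectly, but now I want to add bookmarks. According to the FS documentation, it should be done like this: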

<bookmarks> 
    <bookmark name='1. Foo bar baz' href='#1'> 
     <bookmark name='1.1 Baz quux' href='#1.2'> 
     </bookmark> 
    </bookmark> 
    <bookmark name='2. Foo bar baz' href='#2'> 
     <bookmark name='2.1 Baz quux' href='#2.2'> 
     </bookmark> 
    </bookmark> 
</bookmarks> 

This is supposed to go in the HEAD section, which I have done, but the SAXParser no longer reads the file and reports:

line 11 column 14 - Error: <bookmarks> is not recognized! 
line 11 column 25 - Error: <bookmark> is not recognized! 

I have a local entity resolver set up and have even added the bookmark elements to the DTD:

<!--flying saucer bookmarks --> 
<!ELEMENT bookmarks (#PCDATA)> 
<!ATTLIST bookmarks %attrs;> 

<!ELEMENT bookmark (#PCDATA)> 
<!ATTLIST bookmark %attrs;> 

But it just won't parse. I am out of ideas, please help.

Edit

I am using the following code to parse:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); 
DocumentBuilder builder = dbf.newDocumentBuilder(); 
builder.setEntityResolver(new LocalEntityResolver()); 
document = builder.parse(is); 

Edit

Here is the LocalEntityResolver:

class LocalEntityResolver implements EntityResolver { 

    private static final Logger LOG = ESAPI.getLogger(LocalEntityResolver.class); 
    private static final Map<String, String> DTDS; 
    static { 
     DTDS = new HashMap<String, String>(); 
     DTDS.put("-//W3C//DTD XHTML 1.0 Strict//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"); 
     DTDS.put("-//W3C//DTD XHTML 1.0 Transitional//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"); 
     DTDS.put("-//W3C//ENTITIES Latin 1 for XHTML//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"); 
     DTDS.put("-//W3C//ENTITIES Symbols for XHTML//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent"); 
     DTDS.put("-//W3C//ENTITIES Special for XHTML//EN", 
       "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent"); 
    } 

    @Override 
    public InputSource resolveEntity(String publicId, String systemId) 
      throws SAXException, IOException { 
     InputSource input_source = null; 
     if (publicId != null && DTDS.containsKey(publicId)) { 
      LOG.debug(Logger.EVENT_SUCCESS, "Looking for local copy of [" + publicId + "]"); 

      final String dtd_system_id = DTDS.get(publicId); 
      final String file_name = dtd_system_id.substring(
        dtd_system_id.lastIndexOf('/') + 1, dtd_system_id.length()); 

      InputStream input_stream = FileUtil.readStreamFromClasspath(
        file_name, "my/class/path", 
        getClass().getClassLoader()); 
      if (input_stream != null) { 
       LOG.debug(Logger.EVENT_SUCCESS, "Found local file [" + file_name + "]!"); 
       input_source = new InputSource(input_stream); 
      } 
     } 

     return input_source; 
    } 
} 

My DocumentBuilderFactory implementation is: com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl

+0

I think you need to provide more details. How can I, or anyone else, reproduce the problem? – mzjn

+0

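Basically, I want to parse valid XHTML that contains some unknown elements into a DOM tree using the W3C Transitional DTD. To reproduce it, take any valid XHTML, add the bookmarks markup, and try to parse it into a DOM tree. – epoch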

+0

What is LocalEntityResolver? Where does it come from? I cannot find a message matching '{element} is not recognized!' in the Xerces source code. –

Answers

0

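Ugh, I finally found the problem. Sorry for making you all debug the code. My code called JTidy.parse before the DOM parsing took place, which left the content empty by the time it reached the DOM parser. I was not even catching that; the actual error was a "Premature end of file" from SAX.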

Thanks to Matt Gibson: I found the bug while going through the code to put together a short input document.
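To illustrate the diagnosis, here is a minimal sketch (not the original code) of how the JDK's default, non-validating `DocumentBuilder` behaves: custom elements such as `<bookmarks>` are accepted as long as the markup is well-formed, while empty input is rejected with a `SAXException`, which the built-in parser typically reports as "Premature end of file". The class name `BookmarkParseDemo`, the sample markup, and the omission of a DOCTYPE (so that no DTD has to be resolved) are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of the original code base.
class BookmarkParseDemo {

    // Well-formed XHTML with Flying Saucer style bookmarks in the head.
    // No DOCTYPE is declared, so the parser does not need to load a DTD.
    static final String XHTML =
        "<html xmlns='http://www.w3.org/1999/xhtml'><head><title>t</title>"
        + "<bookmarks>"
        + "<bookmark name='1. Foo bar baz' href='#1'>"
        + "<bookmark name='1.1 Baz quux' href='#1.2'></bookmark>"
        + "</bookmark>"
        + "</bookmarks></head><body><p id='1'>text</p></body></html>";

    // Parses the content with a default (non-validating) DocumentBuilder.
    static Document parse(String content)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbf.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(
            content.getBytes(StandardCharsets.UTF_8)));
    }

    // Counts <bookmark> elements; checked exceptions are wrapped for brevity.
    static int countBookmarks(String content) {
        try {
            return parse(content).getElementsByTagName("bookmark").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    // Returns true if the parser rejects the content with a SAXException.
    static boolean failsToParse(String content) {
        try {
            parse(content);
            return false;
        } catch (SAXException e) {
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("unexpected failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bookmark elements: " + countBookmarks(XHTML));
        System.out.println("empty input rejected: " + failsToParse(""));
    }
}
```

If the unknown elements parse fine in isolation like this, the parser is probably not what is complaining, and it is worth checking what an earlier step (here, the JTidy call) actually hands to it.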

My code now includes a check to see whether the content is empty:

/** 
* parses String content into a valid XML document. 
* @param content the content to be parsed. 
* @return the parsed document or <tt>null</tt> 
*/ 
private static Document parse(final String content) { 
    Document document = null; 
    try { 
     if (StringUtil.isNull(content)) { 
      throw new IllegalArgumentException("cannot parse null " 
        + "content into a DOM object!"); 
     } 

     InputStream is = new ByteArrayInputStream(content 
       .getBytes(CONTEXT.getEncoding())); 

     DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance(); 
     DocumentBuilder builder = dbf.newDocumentBuilder(); 
     builder.setEntityResolver(new LocalEntityResolver()); 
     document = builder.parse(is); 
    } catch (Exception ex) { 
     LOG.error(Logger.EVENT_FAILURE, "parsing failed " 
       + "for content[" + content + "]", ex); 
    } 

    return document; 
} 
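One possible variation, offered only as a sketch: the method above throws the `IllegalArgumentException` inside its own `try` block, so the empty-content case is logged and still comes back to the caller as `null`. The hypothetical `GuardedParse` class below runs the check before any parsing and lets the exception propagate, so an empty result from an earlier step is reported where it happens. The class name, the `requireContent` helper, the `stage` label, and the fixed UTF-8 encoding are all assumptions, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical variant of the parse method; names are illustrative only.
class GuardedParse {

    // Rejects null or whitespace-only content and names the stage that produced it.
    static String requireContent(String content, String stage) {
        if (content == null || content.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "no content to parse after stage [" + stage + "]");
        }
        return content;
    }

    // Fails fast on empty content, then parses with a default DocumentBuilder.
    static Document parse(String content) {
        requireContent(content, "pre-processing");
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    content.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }
}
```

Whether to throw or to return `null` depends on how the callers handle failure; the sketch only shows the fail-fast option.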

+0

Another reason to refer to [SSCCE](http://sscce.org/) ;-) I did try to reproduce your problem and it was hard (which library does FileUtil come from, for example?) – Wivani

+0

No worries. Glad you found it! –