Parsing a line using Pattern.compile

I want to parse the line below (myline) in Java, but it keeps throwing a null value.

Here is my attempt to get '000000010'.

myline = "<status> <id>000000010</id> <created_at>2012/03/11</created_at> <text>@joerogan Played as Joe Savage Rogan in Undisputed3 Career mode, won Pride GP, got UFC title shot against Shields, lost 3 times, and retired</text> <retweet_count>0</retweet_count> <user> <name>Siggi Eggertsson</name> <location>Berlin, Germany</location> <description></description> <url>http://www.siggieggertsson.com</url> </user></status>" 
p = Pattern.compile("(?i)<id.*?>(.+?)</id>", Pattern.DOTALL); 
m = p.matcher(myline);
id =m.group(1);

Any suggestions?

2012-03-23 user1289238

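Extracting data from an XML document with regular expressions is a bad idea. Look into an XML parser. – pimaster 2012-03-23 22:16:30

@user1289238 Please accept an answer, thanks. – Adam 2012-12-23 17:46:25

You shouldn't be using regular expressions to parse XML in the first place.

Besides that, though, you are not using the regex correctly. Instantiating a matcher object is not enough; you also need to tell it to do something: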

if (m.find()) 
{ 
    id = m.group(1); 
}
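
As a rough, self-contained sketch of that point: the class name FindBeforeGroup, the helper firstId, and the shortened input string are made up for illustration and are not from the question. It reuses the asker's pattern and shows that group(1) is only usable once find() has returned true; calling it on a fresh matcher throws IllegalStateException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class FindBeforeGroup {
    // Same pattern as in the question.
    private static final Pattern ID =
            Pattern.compile("(?i)<id.*?>(.+?)</id>", Pattern.DOTALL);

    // Returns the first captured id, or null when nothing matches.
    static String firstId(String line) {
        Matcher m = ID.matcher(line);
        // find() runs the actual search; group(1) is valid only after it returns true.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened version of the line from the question (an assumption for brevity).
        String myline = "<status> <id>000000010</id> <created_at>2012/03/11</created_at></status>";
        System.out.println(firstId(myline));

        // A matcher that has not searched yet has no match state to read from.
        Matcher untouched = ID.matcher(myline);
        try {
            untouched.group(1);
        } catch (IllegalStateException e) {
            System.out.println("group() before find(): " + e.getMessage());
        }
    }
}
```

Returning null for "no match" is just one possible convention here; an Optional<String> would work equally well.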

2012-03-23 22:18:30

This site may give you some information on parsing XML with Java - http://www.java-samples.com/showtutorial.php?tutorialid=152

2012-03-23 22:19:06 aretai

This works:

String myline = "<status> <id>000000010</id> <created_at>2012/03/11</created_at> <text>@joerogan Played as Joe Savage Rogan in Undisputed3 Career mode, won Pride GP, got UFC title shot against Shields, lost 3 times, and retired</text> <retweet_count>0</retweet_count> <user> <name>Siggi Eggertsson</name> <location>Berlin, Germany</location> <description></description> <url>http://www.siggieggertsson.com</url> </user></status>"; 
// matches() must consume the whole input, hence the leading and trailing .*
Pattern p = Pattern.compile(".*<id>(.+)</id>.*");
Matcher m = p.matcher(myline);
if (m.matches()) {
    String id = m.group(1); 
    System.out.println(id); 
}

[Edit:] This also works, and it is better:

String myline = "<status> <id>000000010</id> <created_at>2012/03/11</created_at> <text>@joerogan Played as Joe Savage Rogan in Undisputed3 Career mode, won Pride GP, got UFC title shot against Shields, lost 3 times, and retired</text> <retweet_count>0</retweet_count> <user> <name>Siggi Eggertsson</name> <location>Berlin, Germany</location> <description></description> <url>http://www.siggieggertsson.com</url> </user></status>"; 
// find() searches anywhere in the input, so no .* padding is needed
Pattern p = Pattern.compile("<id>(.+)</id>");
Matcher m = p.matcher(myline);
if (m.find()) {
    String id = m.group(1); 
    System.out.println(id); 
}

2012-03-23 22:26:08

Both will fail if there is more than one <id> in the string, if the <id> tag has any attributes, or if the content of the tag contains newlines. – 2012-03-23 22:32:22

Of course, I fully agree with the "you shouldn't use regular expressions to parse XML" part of your comment – 2012-03-23 22:51:14
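
For anyone who still wants to stay with a regex, here is a hedged sketch of how the failure cases raised in the comments could be handled. The class name AllIds, the helper allIds, and the sample input are invented for illustration. The pattern tolerates attributes with [^>]*, uses a reluctant (.*?) so that each match stops at the nearest closing tag, and sets DOTALL so that newlines inside the tag are accepted. It is still not an XML parser: comments, CDATA sections, entities, and self-closing tags are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class AllIds {
    // \b stops "<id" from matching longer names such as "<identifier";
    // [^>]* allows attributes; (.*?) is reluctant; DOTALL lets . match newlines.
    private static final Pattern ID =
            Pattern.compile("<id\\b[^>]*>(.*?)</id>", Pattern.DOTALL);

    // Collects the content of every <id> element found in the text.
    static List<String> allIds(String text) {
        List<String> ids = new ArrayList<>();
        Matcher m = ID.matcher(text);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        String text = "<id>1</id> <id type=\"x\">2\n3</id>";
        System.out.println(allIds(text));

        // For comparison: the greedy (.+) from the answer runs to the LAST </id>.
        Matcher greedy = Pattern.compile("<id>(.+)</id>").matcher("<id>1</id> <id>2</id>");
        if (greedy.find()) {
            System.out.println(greedy.group(1));
        }
    }
}
```

In the comparison at the end, the greedy group captures everything between the first opening tag and the last closing tag, which is why the reluctant quantifier matters once more than one <id> is present.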

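Using an XML parser is strongly recommended. There is one built into Java; here is a sample solution for your problem. Exception handling is omitted for simplicity.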

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Build a DOM parser from the JDK's built-in factory.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
String input = "<status> <id>000000010</id> <created_at>2012/03/11</created_at> <text>@joerogan Played as Joe Savage Rogan in Undisputed3 Career mode, won Pride GP, got UFC title shot against Shields, lost 3 times, and retired</text> <retweet_count>0</retweet_count> <user> <name>Siggi Eggertsson</name> <location>Berlin, Germany</location> <description></description> <url>http://www.siggieggertsson.com</url> </user></status>";
// Wrap the string in a reader so the parser can consume it.
Document document = builder.parse(new InputSource(new StringReader(input)));
// Take the text of the first <id> element.
String value = document.getElementsByTagName("id").item(0).getTextContent();
System.out.println(value);
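
A fuller sketch of the same approach, with the exception handling put back in. The class name StatusIdReader, the helper firstId, and the shortened input are assumptions made for this example. It also shows that the input does not have to be an XML file: any string holding well-formed XML (for example one line read from a text file) can be handed to the parser through a StringReader.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration only.
final class StatusIdReader {
    // Returns the text of the first <id> element, or null if there is none.
    // Parse failures are rethrown unchecked to keep call sites short.
    static String firstId(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            NodeList ids = document.getElementsByTagName("id");
            return ids.getLength() > 0 ? ids.item(0).getTextContent() : null;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the line from the question (an assumption for brevity).
        String line = "<status> <id>000000010</id> <user> <name>Siggi Eggertsson</name> </user></status>";
        System.out.println(firstId(line));
    }
}
```

Checking getLength() before item(0) avoids a NullPointerException when the element is missing, which the shorter snippet does not guard against.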

2012-03-23 22:44:44 Adam

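The problem is that I'm not actually dealing with an XML file; it is a text file with XML input. So I don't think using an XML parser would work? – user1289238 2012-03-23 22:58:55

It does, he just showed you how :) – 2012-03-23 23:07:05

Thanks! It works like a charm – user1289238 2012-03-26 01:41:49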
