I know you asked about using regular expressions, but jsoup makes this so simple and far less error-prone:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HrefExtractor {
    public static void main(final String[] args) {
        // Parse the HTML fragment; jsoup tolerates the unclosed <a> and <img> tags.
        final Document document = Jsoup.parse("<a href=\"target0.html\"><img align=\"center\" src=\"thumbnails/image001.jpg\" width=\"154\" height=\"99\">");
        // Select every <a> element that carries an href attribute.
        final Elements links = document.select("a[href]");
        for (final Element element : links) {
            // Print the raw attribute value, here "target0.html".
            System.out.println(element.attr("href"));
        }
    }
}
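For comparison, here is a rough sketch of what a regex-only version might look like for the same input, using nothing but the standard library. The class name `RegexHrefSketch`, the method `extractHrefs`, and the pattern itself are my own illustration, not something from the question. It assumes every href value is quoted and it will be fooled by things like commented-out markup or a `data-href` attribute, which is exactly why I would reach for jsoup instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexHrefSketch {
    // Hypothetical pattern (an assumption for illustration): matches an <a ...> tag
    // whose href value is wrapped in single or double quotes.
    // Group 1 is the quote character, group 2 is the href value.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Collects the quoted href values of all <a> tags the pattern finds.
    static List<String> extractHrefs(final String html) {
        final List<String> hrefs = new ArrayList<>();
        final Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            hrefs.add(matcher.group(2));
        }
        return hrefs;
    }

    public static void main(final String[] args) {
        final String html = "<a href=\"target0.html\"><img align=\"center\" src=\"thumbnails/image001.jpg\" width=\"154\" height=\"99\">";
        for (final String href : extractHrefs(html)) {
            System.out.println(href); // prints "target0.html"
        }
    }
}
```

Unquoted values such as `<a href=target0.html>` are silently skipped by this sketch, whereas the jsoup selector `a[href]` handles them.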
Source
2011-11-21 18:45:16
laz
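I think the answer to this question is what you're looking for: http://download.csdn.net/questions/1670593/java-i-have-a-big-string-of-html-and-need-to-extract-the-href-text – DiogoDoreto
@DiogoDoreto Thanks for your reply. The answer to the question you mentioned is very good. –
Obligatory SO link: http://stackoverflow.com/questions/1732348 Read the top-voted answer ;) – TacticalCoder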