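Running HtmlUnit in a servlet

I want to use HtmlUnit to scrape data from a website. I pass the address as an attribute of a form. Even though I have imported the .jar files and set the javadoc location correctly, I still get the error "java.lang.NoClassDefFoundError: com/gargoylesoftware/htmlunit/WebClient". Am I missing something?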
package coreservlets;

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlDivision;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

@WebServlet("/WebScrape")
@SuppressWarnings("serial")
public class WebScrape extends HttpServlet {
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        PrintWriter out = response.getWriter();
        // Create and initialize WebClient object
        final WebClient webClient = new WebClient();
        String Address = (String) request.getAttribute("address");
        HtmlPage page = webClient.getPage(Address);
        final HtmlDivision div = (HtmlDivision) page.getByXPath("//*[@id=\"LDPOffMarketPropertyInfo\"]//div//ul//li[4]//span[1]//text()");
        out.println("<!DOCTYPE html>\n" +
            "<html>\n" +
            "<head>\n" +
            "<meta name=" + "\"viewport\" " + "content=" + "\"initial-scale=1.0, user-scalable=no\" " + "/>\n" +
            "<style type=" + "\"text/css\">\n" +
            " html { height: 100% }\n" +
            " body { height: 100%; margin: 0; padding: 0 }\n" +
            " #default { height: 800px;\n" +
            " width: 400px; }\n" +
            " </style>\n" + div);
    }
}
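Please explain what you mean by "I have imported the .jar files". Where did you put your jar files? – 2012-07-30 21:20:41
Are you sure you have *all* the required libraries? How are you specifying the classpath? – 2012-07-30 21:22:46
I used the build path and added the .jar files as external files (I put them in the project folder); I used "add external" because I wanted the path to be absolute. I added the entire contents of the htmlunit .zip file downloaded from their website. I also specified the javadoc location. – StackTraceYo 2012-07-30 21:30:23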
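A NoClassDefFoundError that appears only at runtime usually means the class was visible to the compiler but is not visible to the class loader that runs the code. In a servlet container, jars on the IDE build path are not necessarily deployed with the web application; they normally have to end up in the application's WEB-INF/lib directory. As a rough way to check what the running code can actually see, something like the sketch below might help. The class name `ClasspathCheck` and the method `isLoadable` are made up for illustration and are not part of HtmlUnit or the servlet API; the same call could be placed at the top of `doGet` to test from inside the container.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located by the given class loader.
     * Hypothetical helper, for illustration only.
     */
    public static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError for missing dependencies
            return false;
        }
    }

    public static void main(String[] args) {
        // Inside a servlet, the context class loader is the web application's loader
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        String[] names = {
            "java.lang.String",
            "com.gargoylesoftware.htmlunit.WebClient"
        };
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name, loader));
        }
    }
}
```

If the second line reports false when run inside the container, the HtmlUnit jar and the jars it depends on are probably not being deployed with the application, whatever the build path says.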