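I am using jsoup to scrape data from a website. I would like to know which exception is thrown when the site I am scraping is down. Is it SocketException, NoHttpResponseException, or something else?
I have read that NoHttpResponseException is thrown when the server receives the request but does not respond. Is that correct? Which exception is thrown when the site is down?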
I tested this against our own website. After taking Tomcat down, I got the following java.net.SocketTimeoutException:
java.net.SocketTimeoutException: connect timed out
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:668)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:563)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216)
at testing.Test.main(Test.java:19)
This is the code I used:
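import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class Test {
    public static void main(String[] args) {
        try {
            // Timeout is in milliseconds; certificate checks are disabled for this test only.
            Document document = Jsoup.connect("https://example/folder").validateTLSCertificates(false).timeout(1000).get();
            System.out.println(document);
        } catch (Exception e) {
            // With the server down, this printed the SocketTimeoutException shown above.
            e.printStackTrace();
        }
    }
}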
NoHttpResponseException appears to be an Apache HttpClient exception (org.apache.commons.httpclient.NoHttpResponseException). Since jsoup has no Apache dependency, SocketTimeoutException is probably the answer.
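Which exception you actually see can depend on how the site is "down". In my test the connection attempt timed out, but a host that is reachable with nothing listening on the port would more typically produce java.net.ConnectException, and a name that does not resolve produces java.net.UnknownHostException. All of these are IOException subclasses, which is what jsoup's get() declares. Here is a minimal sketch of how the cases could be told apart; the class name DownSiteClassifier and the labels it returns are made up for illustration and are not part of jsoup.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

// Hypothetical helper: maps an IOException from a fetch to a short label.
public class DownSiteClassifier {

    public static String classify(IOException e) {
        if (e instanceof SocketTimeoutException) {
            // Connect or read did not finish within the configured timeout.
            return "timed out";
        }
        if (e instanceof ConnectException) {
            // The host answered, but nothing accepted the connection on that port.
            return "connection refused";
        }
        if (e instanceof UnknownHostException) {
            // The host name could not be resolved.
            return "unknown host";
        }
        // Any other I/O failure.
        return "other I/O error";
    }

    public static void main(String[] args) {
        // Hand-built exceptions, so the sketch runs without network access.
        System.out.println(classify(new SocketTimeoutException("connect timed out")));
        System.out.println(classify(new ConnectException("Connection refused")));
        System.out.println(classify(new UnknownHostException("example.invalid")));
    }
}
```

In the scraping code you would replace the single catch (Exception e) with catch (IOException e) and pass e to classify, or write separate catch blocks for the types you care about.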
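I think it should be 'RequestTimeoutException', because the client could not establish a connection within the given timeout.
Point your program at a site that is down and see for yourself. Here is an example: http://cocacola.com:8989/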