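Apache HttpClient returns the page before it has loaded?

I've noticed a strange behavior when using the Apache HttpClient library, and I'd like to understand why it happens. I wrote some sample code to demonstrate it. Consider the following code: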
// Imports for Commons HttpClient 3.x
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.commons.httpclient.*;
import org.apache.commons.httpclient.cookie.CookiePolicy;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

// Example URL
String url = "http://rads.stackoverflow.com/amzn/click/05961580";
GetMethod get = new GetMethod(url);
// Retry at most once, and ignore cookies
HttpMethodRetryHandler httpHandler = new DefaultHttpMethodRetryHandler(1, false);
get.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, httpHandler);
get.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
HttpConnectionManager connectionManager = new SimpleHttpConnectionManager();
HttpClient client = new HttpClient(connectionManager);
// FIREFOX is a String constant holding a User-Agent value, defined elsewhere
client.getParams().setParameter("http.useragent", FIREFOX);
String line;
StringBuilder stringBuilder = new StringBuilder();
String toStreamBody = null;
String toStringBody = null;
try {
    int statusCode = client.executeMethod(get);
    if (statusCode != HttpStatus.SC_OK) {
        System.err.println("Internet Status: " + HttpStatus.getStatusText(statusCode));
        System.err.println("While getting page: " + url);
    }
    // toString: read the whole body as a String
    toStringBody = get.getResponseBodyAsString();
    // toStream: read the body line by line from the stream
    InputStreamReader isr = new InputStreamReader(get.getResponseBodyAsStream());
    BufferedReader rd = new BufferedReader(isr);
    while ((line = rd.readLine()) != null) {
        stringBuilder.append(line);
    }
} catch (java.io.IOException ex) {
    System.out.println("Failed to get page: " + url);
} finally {
    get.releaseConnection();
}
toStreamBody = stringBuilder.toString();
This prints nothing:
System.out.println(toStringBody); // ""
This prints the web page:
System.out.println(toStreamBody); // "Whole Page"
But it gets stranger... Replace:
get.getResponseBodyAsString();
with:
get.getResponseBodyAsString(150000);
Now we get the error: Failed to get page: http://www.amazon.com/gp/offer-listing/0596158068/ref=dp_olp_used?ie=UTF8
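The catch block in the sample discards the exception, so the message above does not say what actually went wrong. As a rough illustration of how a length-capped read can end up in that catch block, here is a JDK-only sketch. It is not HttpClient's implementation: `BoundedBodyReader` and `readUpTo` are hypothetical names, and the assumption that a capped read fails with an IOException once the body exceeds the cap would need to be checked against the library's documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of HttpClient.
final class BoundedBodyReader {

    // Reads the whole stream as UTF-8, failing if it holds more than maxBytes bytes.
    static String readUpTo(InputStream in, int maxBytes) throws IOException {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            if (out.size() + n > maxBytes) {
                throw new IOException("Body exceeds limit of " + maxBytes + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for a response body that is larger than the cap.
        byte[] page = "0123456789".getBytes(StandardCharsets.UTF_8);
        try {
            System.out.println(readUpTo(new ByteArrayInputStream(page), 5));
        } catch (IOException ex) {
            // Including the exception message exposes the underlying cause.
            System.out.println("Failed to get page: " + ex.getMessage());
        }
    }
}
```

Applying the same idea to the sample, printing `ex` (or its stack trace) in the catch block would show whether the failure is a size limit, a network error, or something else.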
I haven't been able to find another site besides Amazon that reproduces this behavior, but I assume there are others.
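I know that the documentation at http://hc.apache.org/httpclient-3.x/performance.html discourages the use of getResponseBodyAsString(), but it doesn't say the page won't be loaded, only that you may run into an out-of-memory exception. Is it possible that getResponseBodyAsString() returns before the page has loaded? And why does this only happen with Amazon?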
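Since the stream-based path is the one that returns the page in the sample, a stream-to-string helper could look like the sketch below. It uses only the JDK; `StreamBody` and `readAll` are hypothetical names, and UTF-8 is an assumed charset (the real one should come from the response headers). Unlike the `readLine()` loop in the sample, it reads raw characters and therefore keeps the line breaks.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of HttpClient.
final class StreamBody {

    // Reads the stream to the end with the given charset and closes it afterwards.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by get.getResponseBodyAsStream().
        InputStream body = new ByteArrayInputStream(
                "<html>\n<body>Whole Page</body>\n</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(body, StandardCharsets.UTF_8));
    }
}
```

In the sample, the call would take the response stream in place of the in-memory stand-in, and would have to run before `releaseConnection()`.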
Oh, that's not the link. I'll try to change it back. – Bob 2010-10-26 08:16:40
The site is http://www.amazon.com/gp/offer-listing/0596158068/ref=dp_olp_used?ie=UTF8 – Bob 2010-10-26 08:16:56
OK, for some reason the link keeps getting rewritten by the site. There's nothing I can do about it. – Bob 2010-10-26 08:27:22