Apache HttpClient 4.0: strange behavior

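I am using Apache HttpClient 4.0 for my web crawler, and I have run into strange behavior: when I try to fetch a page with an HTTP GET request, I get a 404 error response. If I fetch the same page in a browser, it loads successfully.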

Details: 1. I upload a multipart form to the server like this:

    // Build a multipart form: two text fields plus the image file
    HttpPost httpPost = new HttpPost("http://[host here]/in.php");

    MultipartEntity entity = new MultipartEntity(HttpMultipartMode.BROWSER_COMPATIBLE);
    entity.addPart("method", new StringBody("post"));
    entity.addPart("key", new StringBody("223fwe0923fjf23"));
    FileBody fileBody = new FileBody(new File("photo.jpg"), "image/jpeg");
    entity.addPart("file", fileBody);
    httpPost.setEntity(entity);

    HttpResponse response = httpClient.execute(httpPost);
    HttpEntity result = response.getEntity();

    // Read the response body into a string
    String responseString = "";
    if (result != null) {
        InputStream inputStream = result.getContent();

        byte[] buffer = new byte[1024];
        while (inputStream.read(buffer) > 0)
            responseString += new String(buffer);

        result.consumeContent();
    }

The upload finishes successfully.

Then I request some results from the web server:

    HttpGet httpGet = new HttpGet("http://[host here]/res.php?key=" + myKey + "&action=get&id=" + id);

    HttpResponse response = httpClient.execute(httpGet);
    HttpEntity entity = response.getEntity();

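I get a ClientProtocolException while the execute method is running. I am debugging this with log4j, and the server answers "404 Not Found". My browser, however, loads the page without any problem.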
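A request built in code and a URL typed into a browser can look identical when printed and still differ in characters that do not show up on screen. One way to check is to scan each query value for control characters, or percent-encode it, before building the URL. The following is only a sketch using the JDK; the class name, method names, and the sample value are hypothetical and not taken from the code above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class QueryValueSketch {

    // Returns the index of the first ISO control character in s, or -1 if there is none.
    static int firstControlChar(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isISOControl(s.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    // Percent-encodes one query value, which makes hidden characters visible (NUL becomes %00).
    static String encode(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String id = "89122037\0\0"; // hypothetical value carrying two trailing NUL characters
        System.out.println(firstControlChar(id)); // prints 8
        System.out.println(encode(id));           // prints 89122037%00%00
    }
}
```

A result other than -1 from `firstControlChar` means the value contains something a browser address bar would never send as-is.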

Can anyone help me?

Thanks.

2009-11-04 Mikhail T

Have you checked whether your browser is returning a cached page? – toolkit 2009-11-04 19:27:58

log4j reports this: DEBUG [org.apache.http.wire] >> "GET /res.php?key=sadf3f3f34f4f43f4f&action=get&id=89122037[0x0][0x0][0x0][0x0] ..... If that is the cause, how can I get rid of it? – 2009-11-04 19:28:11
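The `[0x0]` bytes after the id in that log line look like padding from a fixed-size read buffer. The read loop in the question converts the whole 1024-byte buffer to a string on every pass (`new String(buffer)`) and ignores the count returned by `read`, so any value taken from that response would carry trailing NUL characters. Whether the id really comes from that response is an assumption, since the question does not show where it is set. A minimal sketch using only the JDK, with a `ByteArrayInputStream` standing in for the server reply and hypothetical class and method names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadLoopSketch {

    // Mirrors the loop from the question: the whole buffer is converted on each
    // pass, so unused slots (still 0x00) end up in the string.
    // US-ASCII is used here only to keep the demo deterministic.
    static String readIgnoringCount(InputStream in) throws IOException {
        String s = "";
        byte[] buffer = new byte[1024];
        while (in.read(buffer) > 0) {
            s += new String(buffer, "US-ASCII");
        }
        return s;
    }

    // Uses the count returned by read() so only the bytes actually read are kept.
    static String readUsingCount(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toString("US-ASCII");
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "89122037".getBytes("US-ASCII"); // stand-in for the server's reply
        String padded = readIgnoringCount(new ByteArrayInputStream(body));
        String exact = readUsingCount(new ByteArrayInputStream(body));
        System.out.println(padded.length()); // prints 1024: 8 digits plus 1016 NUL characters
        System.out.println(exact.length());  // prints 8
    }
}
```

If the value is read from an HttpClient entity, `EntityUtils.toString(entity)` from HttpCore should do the same job without a manual loop. Calling `trim()` on an already padded string also drops the trailing NULs, because `String.trim()` strips every character up to U+0020 from both ends.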

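Have you tried switching from HTTP 1.1 to 1.0, or the other way round? I vaguely remember httpclient having some issues with how it communicates with certain servers, which caused the server to return 404. – 2009-11-04 19:41:13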

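I should point out that the problem does not depend on the web server. If I do not add the FileBody to the multipart form data, the exception does not occur and everything works, with no HTTP 404.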

2009-11-04 21:43:44
