2012-04-13
0

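How to fix a 502 status error

I am using the Jsoup Java HTML parser to fetch images from a particular URL. Some of the images fail with a status 502 error code and are not saved to my machine. Here is a snippet of the code I am using: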

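String url = "http://www.jabong.com"; 
// Jsoup.connect(...).get() already returns a parsed Document with the base URI set
Document doc = Jsoup.connect(url).get(); 
Elements images = doc.select("img"); 

for (Element element : images) { 
     // "abs:src" resolves the src attribute against the document's base URI
     String imgSrc = element.attr("abs:src"); 
     log.info(imgSrc); 
     // compare string contents; != on Strings only compares references
     if (!imgSrc.isEmpty()) { 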
      saveFromUrl(imgSrc, dirPath+"/" + nameCounter + ".jpg"); 
      try { 
       Thread.sleep(3000); 
      } catch (InterruptedException e) { 
       log.error("error in sleeping"); 
      } 
      nameCounter++; 
     } 
} 

The saveFromUrl function is as follows:

public static void saveFromUrl(String Url, String destinationFile) { 
    try { 
     URL url = new URL(Url); 
     InputStream is = url.openStream(); 
     OutputStream os = new FileOutputStream(destinationFile); 

     byte[] b = new byte[2048]; 
     int length; 

     while ((length = is.read(b)) != -1) { 
      os.write(b, 0, length); 
     } 

     is.close(); 
     os.close(); 
    } catch (IOException e) { 
     log.error("Error in saving file from url:" + Url); 
     //e.printStackTrace(); 
    } 
} 

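I searched the internet for status code 502, and it says the error is caused by a bad gateway. I don't understand what that means. One possibility I can think of is that the error occurs because I request the images in a loop. Maybe the web server cannot handle that much load and rejects a request for an image while the previous image has not been sent yet. So I tried sleeping after fetching each image, but no luck :( Any suggestions, please?

Answers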

1

Here is a complete code example that works for me...

import java.io.FileOutputStream; 
import java.io.IOException; 
import java.io.InputStream; 
import java.net.Authenticator; 
import java.net.HttpURLConnection; 
import java.net.InetSocketAddress; 
import java.net.MalformedURLException; 
import java.net.Proxy; 
import java.net.SocketAddress; 
import java.net.URL; 

public class DownloadImage { 

    public static void main(String[] args) { 

     // URLs for Images we wish to download 
     String[] urls = { 
       "http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon.png", 
       "http://www.google.co.uk/images/srpr/logo3w.png", 
       "http://i.microsoft.com/global/en-us/homepage/PublishingImages/sprites/microsoft_gray.png" 
       }; 

     for(int i = 0; i < urls.length; i++) { 
      downloadFromUrl(urls[i]); 
     } 

    } 

    /* 
    Extract the file name from the URL 
    */ 
    private static String getOutputFileName(URL url) { 

     String[] urlParts = url.getPath().split("/"); 

     return "c:/temp/" + urlParts[urlParts.length-1]; 
    } 

    /* 
    Assumes there is no Proxy server involved. 
    */ 
    private static void downloadFromUrl(String urlString) { 

     InputStream is = null; 
     FileOutputStream fos = null; 

     try { 
      URL url = new URL(urlString); 

      System.out.println("Reading..." + url); 

      // No proxy here: the proxied variant is shown further down
      HttpURLConnection conn = (HttpURLConnection)url.openConnection(); 

      is = conn.getInputStream(); 

      String filename = getOutputFileName(url); 

      fos = new FileOutputStream(filename); 

      byte[] readData = new byte[1024]; 

      int i = is.read(readData); 

      while(i != -1) { 
       fos.write(readData, 0, i); 
       i = is.read(readData); 
      } 

      System.out.println("Created file: " + filename); 
     } 
     catch (MalformedURLException e) { 
      e.printStackTrace(); 
     } 
     catch (IOException e) { 
      e.printStackTrace(); 
     } 
     finally { 
      if(is != null) { 
       try { 
        is.close(); 
       } catch (IOException e) { 
        System.out.println("Big problems if InputStream cannot be closed"); 
       } 
      }   
      if(fos != null) { 
       try { 
        fos.close(); 
       } catch (IOException e) { 
        System.out.println("Big problems if FileOutputStream cannot be closed"); 
       } 
      } 
     } 

     System.out.println("Completed"); 
    } 
} 

You should see the following output on the console...

Reading...http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon.png 
Created file: c:/temp/apple-touch-icon.png 
Completed 
Reading...http://www.google.co.uk/images/srpr/logo3w.png 
Created file: c:/temp/logo3w.png 
Completed 
Reading...http://i.microsoft.com/global/en-us/homepage/PublishingImages/sprites/microsoft_gray.png 
Created file: c:/temp/microsoft_gray.png 
Completed 

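So this is a working example with no proxy server involved.

Only if you need to authenticate with a proxy server will you need an additional class, based on this Oracle technote: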

import java.net.Authenticator; 
import java.net.PasswordAuthentication; 

public class ProxyAuthenticator extends Authenticator { 

    private String userName, password; 

    public ProxyAuthenticator(String userName, String password) { 
     this.userName = userName; 
     this.password = password; 
    } 

    protected PasswordAuthentication getPasswordAuthentication() { 
     return new PasswordAuthentication(userName, password.toCharArray()); 
    } 
} 

To use this new class, replace the openConnection() call shown above with the following code:

... 
try { 
    URL url = new URL(urlString); 

    System.out.println("Reading..." + url); 

    Authenticator.setDefault(new ProxyAuthenticator("username", "password")); 

    SocketAddress addr = new InetSocketAddress("proxy.server.com", 80); 
    Proxy proxy = new Proxy(Proxy.Type.HTTP, addr); 

    HttpURLConnection conn = (HttpURLConnection)url.openConnection(proxy); 

    ... 
1

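Your problem sounds like an HTTP communication issue, so you may be better off using a library to handle the communication side of things. Take a look at Apache Commons HttpClient.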
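Whichever client is used, an HTTP communication problem is easier to diagnose once the status code is visible. The following is a rough sketch using only the JDK's HttpURLConnection, not the HttpClient API. The class name StatusCheck and both method names are made up for illustration, and the wording of the hints is my own summary.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class StatusCheck {

    /** Maps an HTTP status code to a short hint (hypothetical helper). */
    static String describe(int status) {
        if (status == HttpURLConnection.HTTP_BAD_GATEWAY) {
            // 502: a proxy or gateway received an invalid response from the upstream server
            return "502 Bad Gateway";
        }
        if (status >= 200 && status < 300) {
            return "success";
        }
        if (status >= 400 && status < 500) {
            return "client error";
        }
        if (status >= 500 && status < 600) {
            return "server error";
        }
        return "other";
    }

    /** Opens the URL and returns its status code without reading the body. */
    static int fetchStatus(String urlString) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

Logging describe(fetchStatus(imgSrc)) for each image before saving it would show which URLs fail and with which code, instead of a generic IOException.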

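Some notes on your code example: you are not using a URLConnection object, so it is unclear what behaviour you will get with respect to web/proxy servers and cleanly closing resources. The HttpClient library mentioned above will help in that regard.

There also seem to be some examples of doing what you want with the J2ME libraries. It is not something I have used personally, but it might help you as well.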
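To make "cleanly closing resources" concrete: if read() or write() throws partway through, streams that are only closed at the end of the try block are never closed. Below is a minimal sketch, assuming Java 7 or later for try-with-resources. The class name CleanClose, the method names, and the 10-second timeouts are illustrative choices, not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CleanClose {

    /** Copies all bytes from in to out and returns how many were copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[2048];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Downloads urlString to destinationFile; both streams are closed even on failure. */
    static long save(String urlString, String destinationFile) throws IOException {
        URLConnection conn = new URL(urlString).openConnection();
        conn.setConnectTimeout(10000); // assumed values; tune as needed
        conn.setReadTimeout(10000);
        // try-with-resources closes 'out' and then 'in' automatically,
        // whether the copy finishes normally or throws
        try (InputStream in = conn.getInputStream();
             OutputStream out = Files.newOutputStream(Paths.get(destinationFile))) {
            return copy(in, out);
        }
    }
}
```

The explicit finally block with null checks in the other answer achieves the same effect on older Java versions.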

+0

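Thanks a lot. Being a newbie, I did not know about the URLConnection object. I am still not clear on what you mean by cleanly closing resources. Please explain. – sachinjain024 2012-04-13 17:06:06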

+1

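Are you still getting the 502 error? I have posted another code example without the JSoup business for you to try. Maybe it will help determine where the problem lies. – Brad 2012-04-18 10:32:08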

+0

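Thanks Brad, I appreciate you solving the problem. When I ran into this issue, I used jsoup to convert the relative URLs to absolute URLs, which served my purpose. Many thanks for helping me and giving the exact solution. Cheers \m/ – sachinjain024 2012-04-19 08:28:32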
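For readers wondering what converting relative URLs to absolute URLs involves: jsoup does it through the "abs:src" attribute prefix or Element.absUrl("src"), resolving against the document's base URI. The sketch below shows the same idea with only the JDK's java.net.URI, as a stand-in rather than jsoup's actual implementation. The class name, method name, and the example paths in the comments are hypothetical.

```java
import java.net.URI;

public class AbsoluteUrl {

    /**
     * Resolves src against baseUrl. An already absolute src is returned unchanged.
     * Note: URI.create throws IllegalArgumentException for characters such as
     * unencoded spaces, which real-world HTML sometimes contains.
     */
    static String toAbsolute(String baseUrl, String src) {
        return URI.create(baseUrl).resolve(src.trim()).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.jabong.com/index.html"; // page the <img> tags came from
        System.out.println(toAbsolute(base, "/images/logo.png"));
        System.out.println(toAbsolute(base, "images/banner.jpg"));
    }
}
```

Passing a relative src straight to new URL(...) throws MalformedURLException, so resolving it first is needed before the download code in this thread can fetch the image.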