2012-02-21 210 views

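I need to fetch the HTML page of the first search result on Google.
To do this I use Google's "I'm Feeling Lucky" feature, which basically means appending &btnI to the search query URL. For example, http://www.google.com/search?q=%22movie%22+site:amazon.com&btnI redirects to a movie-related page on amazon.com. Fetching the data from this redirecting URL in Java, however, gives a 403 error.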

Let this be our searchQuery:

// Needs java.io.BufferedReader, java.io.InputStream, java.io.InputStreamReader and java.net.URL
String searchQuery = "http://www.google.com/search?q=%22movie%22+site:amazon.com&btnI";
URL url = new URL(searchQuery);
InputStream response = url.openStream(); // this call throws the 403 IOException
BufferedReader reader = new BufferedReader(new InputStreamReader(response));
for (String line; (line = reader.readLine()) != null;) {
    System.out.println(line);
}
reader.close();

I am getting:
Error: Server returned HTTP response code: 403 for URL: http://www.google.com/search?q=%22movie%22+site:amazon.com&btnI
I need some help with this. Also, if there is a better way to do it, do let me know!

Answers


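Try using HttpURLConnection.

Then enable redirects with #setFollowRedirects(true) (a static method on HttpURLConnection; the per-connection form is #setInstanceFollowRedirects(true)) and set a User-Agent such as that of Firefox or IE.

Like this: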

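// Cast to HttpURLConnection: plain URLConnection has no redirect setter
HttpURLConnection connection = (HttpURLConnection) new URL(searchQuery).openConnection();
connection.setInstanceFollowRedirects(true);
// Send a browser-like User-Agent instead of Java's default one
connection.setRequestProperty("User-Agent",
     "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2");
connection.connect();
InputStream response = connection.getInputStream();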
... 
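
For completeness, here is a minimal, self-contained sketch of the same idea, combined with the read loop from the question. The class name RedirectFetch and its two helper methods are hypothetical names chosen for illustration, and reading the body as UTF-8 is an assumption; the User-Agent string is the one used above. Whether Google accepts the request may depend on more than the User-Agent, so treat this as a starting point rather than a guaranteed fix. Note also that HttpURLConnection does not follow redirects that switch protocol (for example from http to https).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class RedirectFetch {
    // Browser-like User-Agent taken from the answer above.
    static final String USER_AGENT =
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2";

    // Prepares a connection that follows redirects and sends the User-Agent.
    // No network traffic happens until the response is actually read.
    static HttpURLConnection open(String address) throws IOException {
        HttpURLConnection connection =
            (HttpURLConnection) new URL(address).openConnection();
        connection.setInstanceFollowRedirects(true);
        connection.setRequestProperty("User-Agent", USER_AGENT);
        return connection;
    }

    // Reads the whole response body as text (assumes UTF-8) and closes the stream.
    static String readBody(HttpURLConnection connection) throws IOException {
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            for (String line; (line = reader.readLine()) != null;) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }
}
```

It would be used as System.out.println(RedirectFetch.readBody(RedirectFetch.open(searchQuery))); with searchQuery defined as in the question.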

Any idea why my code was failing? – r15habh 2012-02-21 10:31:14