2011-04-18 68 views
2

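Multithreading issue with HttpClient

I'm running into a multithreading issue with HttpClient. Here is the scenario:

Thread A requests the URL http://blap.com?param=2

Thread B requests the URL http://blap.com?param=3

This works about 98% of the time, but occasionally thread A receives the data for thread B's URL, and vice versa.

Each thread currently creates its own HttpClient instance, so in theory I shouldn't need MultiThreadedHttpConnectionManager.

Does the behavior I'm describing sound plausible, and would using MultiThreadedHttpConnectionManager fix it?

I'm using Java 1.6 and Apache HttpComponents HttpClient 4.0.3.

Update: here is the function in question.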

public void get_url(String strDataSet) throws SQLException, MalformedURLException, IOException
{
    String query;

    query = "select * from jobs where data_set='" + strDataSet + "'";

    ResultSet rs2 = dbf.db_run_query(query);
    rs2.next();

    HttpClient httpclient = new DefaultHttpClient();
    HttpResponse response;

    String strURL;
    strURL = rs2.getString("url_static");

    if (rs2.getString("url_dynamic")!=null && !rs2.getString("url_dynamic").isEmpty())
        strURL = strURL.replace("${dynamic}", rs2.getString("url_dynamic"));

    UtilityFunctions.stdoutwriter.writeln("Retrieving URL: " + strURL,Logs.STATUS2,"DG25");

    if (!strURL.contains(":"))
        UtilityFunctions.stdoutwriter.writeln("WARNING: url is not preceeded with a protocol" + strURL,Logs.STATUS1,"DG25.5");

    // HttpGet chokes on the ^ character
    strURL = strURL.replace("^","%5E");

    HttpGet httpget = new HttpGet(strURL);

    /*
     * The following line fixes an issue where a non-fatal error is displayed about an invalid cookie data format.
     * It turns out that some sites generate a warning with this code, and others without it.
     * I'm going to kludge this for now until I get more data on which urls throw the
     * warning and which don't.
     *
     * warning with code: www.exchange-rates.org
     */
    if (!(strCurDataSet.contains("xrateorg") || strCurDataSet.contains("google") || strCurDataSet.contains("mwatch")))
    {
        httpget.getParams().setParameter("http.protocol.cookie-datepatterns",
            Arrays.asList("EEE, dd MMM-yyyy-HH:mm:ss z", "EEE, dd MMM yyyy HH:mm:ss z"));
    }

    response = httpclient.execute(httpget);

    HttpEntity entity = response.getEntity();

    BufferedReader in = new BufferedReader(
        new InputStreamReader(
            entity.getContent()));

    int nTmp;

    returned_content="";

    while ((nTmp = in.read()) != -1)
        returned_content = returned_content + (char)nTmp;

    in.close();

    httpclient.getConnectionManager().shutdown();

    UtilityFunctions.stdoutwriter.writeln("Done reading url contents",Logs.STATUS2,"DG26");
}

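Update: I've narrowed the problem down to this line:

response = httpclient.execute(httpget);

If I put a lock around that line, the problem goes away. The trouble is that this is the most time-consuming part, and I don't want only one thread at a time to be able to retrieve HTTP data.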

+0

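This method looks like it handles a lot of unrelated concerns at once: fetching the URL from the database, validating the URL, and reading data from the corresponding HTTP connection. Have you considered refactoring it into several classes to make it easier to maintain and unit test? By the way, what happens if 'rs.next()' returns 'false'? In the current code, the race condition could be anywhere, even at the database level. – 2011-04-18 20:09:48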
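As a rough illustration of the refactoring suggested in this comment, the URL assembly can be pulled out into a pure helper that needs neither the database nor the network, which makes it easy to unit test. The class name JobUrlBuilder and the method name build are made up for this sketch; the placeholder and the ^ escaping are taken from the question's code. The caller would still have to check the result of next() on the ResultSet before reading the two columns.

```java
/** Hypothetical helper: assembles the request URL without touching the database or network. */
final class JobUrlBuilder {
    private JobUrlBuilder() {}

    /**
     * Builds the URL from the url_static and url_dynamic column values.
     * urlDynamic may be null or empty, in which case the static URL is used as is.
     */
    static String build(String urlStatic, String urlDynamic) {
        if (urlStatic == null) {
            throw new IllegalArgumentException("url_static must not be null");
        }
        String url = urlStatic;
        if (urlDynamic != null && !urlDynamic.isEmpty()) {
            // String.replace works on literal text, so "${dynamic}" is not treated as a regex.
            url = url.replace("${dynamic}", urlDynamic);
        }
        // Same escaping as in the question: a raw '^' is replaced before the request is built.
        return url.replace("^", "%5E");
    }

    public static void main(String[] args) {
        System.out.println(build("http://blap.com?param=${dynamic}", "2"));
    }
}
```

Because the helper has no shared state, it can be called from any number of threads at once.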

Answers

0

Your code is not thread-safe. To fix your immediate problem, you need to declare the HttpClient as a ThreadLocal, but there is a lot more that needs to be fixed.
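For readers unfamiliar with ThreadLocal, here is a minimal sketch of the mechanism this answer refers to. StubClient is a hypothetical stand-in for HttpClient so the example runs with the JDK alone, and the code sticks to Java 6 syntax to match the question. It only shows that each thread gets its own instance; it does not claim that this resolves the problem described above.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an HTTP client; each instance gets a distinct id. */
final class StubClient {
    private static final AtomicInteger CREATED = new AtomicInteger();
    final int id = CREATED.incrementAndGet();
}

final class PerThreadClient {
    // Each thread that calls get() lazily receives its own StubClient.
    private static final ThreadLocal<StubClient> CLIENT = new ThreadLocal<StubClient>() {
        @Override
        protected StubClient initialValue() {
            return new StubClient();
        }
    };

    private PerThreadClient() {}

    static StubClient get() {
        return CLIENT.get();
    }

    /** Starts a fresh thread, reads that thread's client, and returns it to the caller. */
    static StubClient getOnNewThread() {
        final StubClient[] seen = new StubClient[1];
        Thread worker = new Thread(new Runnable() {
            public void run() {
                seen[0] = get();
            }
        });
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        StubClient a = getOnNewThread();
        StubClient b = getOnNewThread();
        System.out.println("distinct per thread: " + (a != b));
        System.out.println("stable within a thread: " + (get() == get()));
    }
}
```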

+0

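I don't see the benefit of making the HttpClient a ThreadLocal. ThreadLocal makes it global within the thread, but this is the only method that uses that variable. Any details on what else isn't thread-safe would also be appreciated. – opike 2011-04-18 21:03:11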
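One candidate that is visible in the question's code, offered as an assumption rather than a confirmed diagnosis: returned_content and strCurDataSet are not declared inside get_url, so they appear to be fields. If the same object is shared between threads, one thread can overwrite the other's content. The sketch below shows the alternative of keeping the result in a local variable and returning it. The fetch method is a hypothetical stand-in for the HTTP call and simply echoes the URL so the results are traceable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LocalResultFetcher {
    private LocalResultFetcher() {}

    /** Hypothetical stand-in for the HTTP request; a real version would read the response body. */
    static String fetch(String url) {
        StringBuilder body = new StringBuilder(); // local variable: invisible to other threads
        body.append("content of ").append(url);
        return body.toString();                   // returned to the caller, not stored in a field
    }

    /** Fetches all URLs concurrently and returns the results in the same order as the input. */
    static List<String> fetchAll(List<String> urls) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> pending = new ArrayList<Future<String>>();
            for (final String url : urls) {
                pending.add(pool.submit(new Callable<String>() {
                    public String call() {
                        return fetch(url);
                    }
                }));
            }
            List<String> results = new ArrayList<String>();
            for (Future<String> f : pending) {
                results.add(f.get()); // each Future carries the result of exactly one URL
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://blap.com?param=2", "http://blap.com?param=3");
        for (String result : fetchAll(urls)) {
            System.out.println(result);
        }
    }
}
```

With this shape no lock is needed around the request itself, because nothing written during one call is reachable from another thread.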

0

You need to create a new HttpContext in each thread and pass it to HttpClient.execute:

HttpContext localContext = new BasicHttpContext(); 
response = httpclient.execute(httpget, localContext); 

See the bottom of this document (from the HttpClient 4 tutorial):

http://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html

There is also a thread-safe HttpContext implementation (SyncBasicHttpContext), but I'm not sure whether it's needed in this case.