2010-04-22 82 views
10

How can I open multiple URLs at the same time using cURL with PHP?

Here is my current code:

$SQL = mysql_query("SELECT url FROM urls") or die(mysql_error()); // Query the urls table
while ($resultSet = mysql_fetch_array($SQL)) { // Fetch the urls one row at a time

    // Now for some cURL to run it.
    $ch = curl_init($resultSet['url']); // load the url
    curl_setopt($ch, CURLOPT_TIMEOUT, 2); // No need to wait for it to load. Execute it and go.
    curl_exec($ch); // Execute
    curl_close($ch); // Close it off
} // While loop

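I'm relatively new to cURL. By relatively new, I mean this is the first time I've ever used cURL. Currently it loads one URL for two seconds, then loads the next one for two seconds, then the next. However, I want it to load all of them at the same time. I'm sure it's possible, I'm just not sure how. If someone could point me in the right direction, I'd appreciate it.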

Do you need to do anything with the results of what cURL loads? – 2010-04-22 16:37:59

No. – Rob 2010-04-22 16:39:06

Answers

8

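You set up each cURL handle the same way, then add them to a curl_multi_ handle. The functions to look at are the curl_multi_* functions documented in the PHP manual. In my experience, though, there were problems with trying to load too many URLs at once (although I can't find my notes on it at the moment), so the last time I used curl_multi_, I set it up to run in batches of 5 URLs at a time.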

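Edit: here is a simplified version of the code I've been using with curl_multi_:

Edit: slightly rewritten, with lots of added comments. Hopefully that will help.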

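// -- $urls is assumed to be an array of URL strings
// -- how many URLs to run at once; 5 matches the batch size mentioned above
define('BLOCK_SIZE', 5);

// -- create all the individual cURL handles and set their options
$curl_handles = array();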
foreach ($urls as $url) { 
    $curl_handles[$url] = curl_init(); 
    curl_setopt($curl_handles[$url], CURLOPT_URL, $url); 
    // set other curl options here 
} 

// -- start going through the cURL handles and running them 
$curl_multi_handle = curl_multi_init(); 

$i = 0; // count where we are in the list so we can break up the runs into smaller blocks 
$block = array(); // to accumulate the curl_handles for each group we'll run simultaneously 

foreach ($curl_handles as $a_curl_handle) { 
    $i++; // increment the position-counter 

    // add the handle to the curl_multi_handle and to our tracking "block" 
    curl_multi_add_handle($curl_multi_handle, $a_curl_handle); 
    $block[] = $a_curl_handle; 

    // -- check to see if we've got a "full block" to run or if we're at the end of our list of handles
    if (($i % BLOCK_SIZE == 0) or ($i == count($curl_handles))) {
        // -- run the block

        $running = NULL;
        do {
            // track the previous loop's number of handles still running so we can tell if it changes
            $running_before = $running;

            // run the block or check on the running block and get the number of sites still running in $running
            curl_multi_exec($curl_multi_handle, $running);

            // if the number of sites still running changed, print out a message with the number of sites that are still running.
            if ($running != $running_before) {
                echo("Waiting for $running sites to finish...\n");
            }

            // wait for activity on one of the connections instead of spinning the CPU
            if ($running > 0) {
                curl_multi_select($curl_multi_handle);
            }
        } while ($running > 0);

        // -- once the number still running is 0, curl_multi_ is done, so check the results
        foreach ($block as $handle) {
            // HTTP response code
            $code = curl_getinfo($handle, CURLINFO_HTTP_CODE);

            // cURL error number
            $curl_errno = curl_errno($handle);

            // cURL error message
            $curl_error = curl_error($handle);

            // output if there was an error
            if ($curl_error) {
                echo(" *** cURL error: ($curl_errno) $curl_error\n");
            }

            // remove the (used) handle from the curl_multi_handle
            curl_multi_remove_handle($curl_multi_handle, $handle);
        }

        // reset the block to empty, since we've run its curl_handles
        $block = array();
    }
} 

// close the curl_multi_handle once we're done 
curl_multi_close($curl_multi_handle); 

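Given that you don't need anything back from the URLs, you probably don't need a lot of what's there, but this is how I chunked the requests into blocks of BLOCK_SIZE, waited for each block to finish before moving on, and caught errors from cURL.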
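For the fire-and-forget case in the question, the same idea can probably be condensed quite a bit. What follows is only a minimal sketch, not the code this answer was based on: the function name `run_url_batches` and its parameters are made up for illustration, the 2-second timeout is taken from the question, and it assumes the URLs are distinct strings. It reads per-transfer results with `curl_multi_info_read()` rather than asking each handle directly.

```php
<?php
/**
 * Hypothetical helper (name and signature are assumptions for illustration):
 * request the given URLs in parallel, $batchSize at a time, and return an
 * array of cURL error messages keyed by URL. Response bodies are discarded.
 */
function run_url_batches(array $urls, int $batchSize = 5, int $timeout = 2): array
{
    $errors = [];
    if ($urls === []) {
        return $errors; // nothing to do
    }

    $multi = curl_multi_init();

    // array_unique() keeps the URL => handle map below unambiguous
    foreach (array_chunk(array_unique($urls), $batchSize) as $batch) {
        $handles = [];
        foreach ($batch as $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo the body
            curl_multi_add_handle($multi, $ch);
            $handles[$url] = $ch;
        }

        // drive the transfers until every handle in this batch has finished
        $running = 0;
        do {
            $status = curl_multi_exec($multi, $running);
            if ($running > 0) {
                curl_multi_select($multi, 1.0); // wait for activity, up to 1 second
            }
        } while ($running > 0 && $status === CURLM_OK);

        // collect the result code of each finished transfer
        while (($info = curl_multi_info_read($multi)) !== false) {
            if ($info['result'] !== CURLE_OK) {
                $url = array_search($info['handle'], $handles, true);
                $errors[$url] = curl_strerror($info['result']);
            }
        }

        foreach ($handles as $ch) {
            curl_multi_remove_handle($multi, $ch);
        }
    }

    curl_multi_close($multi);
    return $errors;
}
```

Calling it might look like `$errors = run_url_batches($urls, 5, 2);`, where `$urls` is the list read from the database. With a 2-second timeout, scripts that keep running longer than that will show up in `$errors` as timeouts, which may be perfectly fine for this use case.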

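Well, all I need to do is load each URL (the URLs it will be loading are blank pages; visiting the URL just kicks off a script and lets it run for a preset time), not save or output any data. Do you think that would cause any problems in this case? – Rob 2010-04-22 16:45:17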

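My guess is that it won't be a problem in this case, but I'm not sure. If it doesn't run or throws errors when you try to load them all at once, you could put a counter in your 'while' loop, and whenever 'counter % batch_size == 0' inside the loop, run the batch and clear it. – Isaac 2010-04-22 18:11:12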
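A rough sketch of that counter idea, for illustration only: `for_each_batch`, `$runBatch`, and the example.com URLs are hypothetical names, and the callback stands in for whatever actually runs a batch (for example a curl_multi_ loop like the one in the answer). It assumes a batch size of at least 1.

```php
<?php
/**
 * Hypothetical helper: walk over $urls, collect them into batches of
 * $batchSize, and hand each full batch (plus any remainder) to $runBatch.
 * Assumes $batchSize >= 1.
 */
function for_each_batch(iterable $urls, int $batchSize, callable $runBatch): void
{
    $counter = 0;
    $batch = [];

    foreach ($urls as $url) {
        $batch[] = $url;
        $counter++;

        // a full batch has accumulated: run it and clear it
        if ($counter % $batchSize === 0) {
            $runBatch($batch);
            $batch = [];
        }
    }

    // run whatever is left over if the total was not a multiple of $batchSize
    if ($batch !== []) {
        $runBatch($batch);
    }
}

// Usage with a stand-in callback that only records what it was given.
$seen = [];
for_each_batch(
    ['http://example.com/a', 'http://example.com/b', 'http://example.com/c'],
    2,
    function (array $batch) use (&$seen) {
        $seen[] = $batch;
    }
);
// $seen now holds two batches: the first two URLs, then the remaining one
```

The final check after the loop matters: without it, the last few URLs would be silently skipped whenever the total is not a multiple of the batch size.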

Wow. Hate to bother you with this, but could you comment some things in that code so I can see what everything does? – Rob 2010-04-22 18:39:56