2011-02-07 69 views

Answers

7

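When you cURL a URL, all you receive is the resource at that URL, which is most likely just an HTML document.
cURL does not automatically download the 200 images referenced in that HTML document, because cURL does not care about HTML. Quite the opposite: if you did want to download all 200 images, you would have to parse the HTML yourself and make a separate cURL request for each image.

Example from the command line: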

$ curl -i www.w3.org 
HTTP/1.1 200 OK 
Date: Mon, 07 Feb 2011 02:46:36 GMT 
Server: Apache/2 
Content-Location: Home.html 
Vary: negotiate,accept,Accept-Encoding 
TCN: choice 
Last-Modified: Tue, 01 Feb 2011 20:42:28 GMT 
ETag: "74f2-49b3e92157500;89-3f26bd17a2f00" 
Accept-Ranges: bytes 
Content-Length: 29938 
Cache-Control: max-age=600 
Expires: Mon, 07 Feb 2011 02:56:36 GMT 
P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml" 
Connection: close 
Content-Type: text/html; charset=utf-8 

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 
<!-- Generated from data/head-home.php, ../../smarty/{head.tpl} --> 
<head> 
<title>World Wide Web Consortium (W3C)</title> 
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
<link rel="Help" href="/Help/" /> 
<link rel="stylesheet" href="/2008/site/css/minimum" type="text/css" media="handheld, all" /> 
<style type="text/css" media="print, screen and (min-width: 481px)"> 
/*<![CDATA[*/ 
@import url("/2008/site/css/advanced"); 
/*]]>*/ 
</style> 
<link href="/2008/site/css/minimum" rel="stylesheet" type="text/css" media="handheld, only screen and (max-device-width: 480px)" /> 
<meta name="viewport" content="width=device-width" /> 
<link rel="stylesheet" href="/2008/site/css/print" type="text/css" media="print" /> 
<link rel="shortcut icon" href="/2008/site/images/favicon.ico" type="image/x-icon" /> 
<meta name="description" content="The World Wide Web Consortium (W3C) is an international community where Member organizations, a full-time staff, and the public work together to develop Web standards." /> 
<link rel="alternate" type="application/atom+xml" title="W3C News" href="/News/atom.xml" /> 
</head> 
<body id="www-w3-org" class="w3c_public w3c_home"> 
<div id="w3c_container"> 
<!-- Generated from data/mast-home.php, ../../smarty/{mast.tpl} --> 
<div id="w3c_mast"><!-- #w3c_mast/Page top header --> 
<h1 class="logo"><a tabindex="2" accesskey="1" href="/"><img src="/2008/site/images/logo-w3c-mobile-lg" width="90" height="53" alt="W3C" /></a> <span class="alt-logo">W3C</span></h1> 
<div id="w3c_nav"> 

... 

That is everything the cURL request fetches. There is an image in there: <img src="/2008/site/images/logo-w3c-mobile-lg" width="90" height="53" alt="W3C" />. That tag is all you get; you do not get the image itself.
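To make the "parse the HTML yourself" step concrete, here is a minimal Python sketch using only the standard library. It lists the image URLs that a page references, each of which would need its own request if you actually wanted the image. The names `ImgSrcCollector` and `find_image_urls` are made up for this illustration, and the sample markup is a shortened copy of the logo line from the output above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are also routed here by default.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


def find_image_urls(html, base_url):
    """Return absolute URLs of all images referenced in the HTML string."""
    parser = ImgSrcCollector(base_url)
    parser.feed(html)
    return parser.image_urls


sample = (
    '<h1 class="logo"><a href="/">'
    '<img src="/2008/site/images/logo-w3c-mobile-lg" width="90" height="53" alt="W3C" />'
    "</a></h1>"
)
print(find_image_urls(sample, "http://www.w3.org/"))
```

Nothing here downloads anything: the function only reports which URLs a browser would go on to request, which is exactly the work cURL skips.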

+0

I don't want to download the 200 images. I want to block them. – 2011-02-07 02:34:39

0

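You can't fetch the page without its image tags, but you can easily strip them from the result with a regular expression or a DOM parser. With cURL, though, you are never actually making requests for the images, only for the page's HTML, so all you would be stripping out is the tags.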
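As a rough sketch of the regular-expression route, assuming the HTML has already been fetched into a string, the following Python removes every `<img>` tag. The helper name `strip_img_tags` is hypothetical, and the pattern is deliberately simple: it would break on an attribute value that contains a literal `>`, so a DOM parser is the safer choice for arbitrary pages.

```python
import re

# Matches an opening <img ...> or self-closing <img ... /> tag, case-insensitively.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def strip_img_tags(html):
    """Return the HTML string with all <img> tags removed."""
    return IMG_TAG.sub("", html)


page = '<p>Logo: <img src="/2008/site/images/logo-w3c-mobile-lg" alt="W3C" /> W3C</p>'
print(strip_img_tags(page))
```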
