Download a file after logging in with Goutte/Guzzle

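So far I have tried the code below. The problem is that it seems I am not authorized: the downloaded file is the login.html page.

Does anyone know how to make this work? Thanks in advance!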

<?php 

require 'vendor/autoload.php'; 

use Goutte\Client; 

$client = new Client(); 

$crawler = $client->request('GET', 'https://website.com/login.php'); 

$form = $crawler->selectButton('Login')->form(); 
$crawler = $client->submit($form, array('username' => 'username', 'password' => 'password')); 

... 

$download_link = 'https://website.com/extracted_download_link_from_crawler.pdf'; 

$guzzleClient = $client->getClient(); 

$response = $guzzleClient->get($download_link, ['save_to' => '/local_path/file.pdf']);

2016-07-22 Ramon Hollands

Have you tried setting a user agent? – lauda

Figured it out myself:

I take the cookies from the Goutte client and store them in the Guzzle client's cookie jar:

// Get the PHPSESSID cookie from the Goutte client.
$cookieJar = $client->getCookieJar();
$all_cookies = $cookieJar->all();
// Index 7 is presumably where PHPSESSID sat for this site; it may differ elsewhere.
$PHPSESSID_value = $all_cookies[7]->getValue();

// Put the session cookie into a Guzzle cookie jar and download with it.
$guzzleClient = $client->getClient();
$jar = \GuzzleHttp\Cookie\CookieJar::fromArray(['PHPSESSID' => $PHPSESSID_value], 'website.com');
$response = $guzzleClient->get($download_link, ['cookies' => $jar, 'save_to' => '/local_path/file.pdf']);

2016-08-07 18:25:18

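Do you have any idea how to read the file name for the download? In the case above the file name is in the URL, but the file name is not always provided in the URL. –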
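On the file name question: servers often send the name in the `Content-Disposition` response header rather than in the URL. Below is a minimal sketch in plain PHP, assuming you already have that header as a string (with Guzzle, `$response->getHeaderLine('Content-Disposition')` should return it). `guessDownloadName` and the `download.bin` default are hypothetical names made up for this illustration, not part of Goutte or Guzzle.

```php
<?php
/**
 * Hypothetical helper: pick a local file name for a download.
 * Prefers the Content-Disposition header and falls back to the URL path.
 */
function guessDownloadName(string $contentDisposition, string $url, string $default = 'download.bin'): string
{
    if (preg_match("/filename\\*\\s*=\\s*[\\w-]+'[^']*'([^;]+)/i", $contentDisposition, $m)) {
        // Extended form: filename*=UTF-8''annual%20report.pdf
        $name = rawurldecode(trim($m[1]));
    } elseif (preg_match('/filename\s*=\s*"?([^";]+)"?/i', $contentDisposition, $m)) {
        // Plain form: filename="report.pdf" or filename=report.pdf
        $name = trim($m[1]);
    } else {
        // No usable header: use the last segment of the URL path, if any.
        $name = (string) parse_url($url, PHP_URL_PATH);
    }

    // Keep only the last path segment so the server cannot pick the target directory.
    $name = basename(str_replace('\\', '/', $name));

    return $name !== '' ? $name : $default;
}
```

Since the name is only known once the response headers arrive, one option is to download to a temporary path first and rename the file afterwards.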

$cookieJar = $client->getCookieJar();
$guzzleClient = $client->getClient();
// fromArray() expects a name => value array, which allValues() returns for the given URL.
$jar = GuzzleHttp\Cookie\CookieJar::fromArray($cookieJar->allValues('https://website.com'), 'website.com');
$response = $guzzleClient->get('URL TO FILE', ['cookies' => $jar, 'sink' => 'my.pdf']);
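The symptom in the question was a login page silently saved under a .pdf name, so a quick sanity check on the saved file may be worth adding after the download. This is only a sketch in plain PHP; `looksLikePdf` is a hypothetical helper name, and it assumes the expected download is a PDF, as in the question.

```php
<?php
/**
 * Hypothetical check: does the saved file start with the PDF signature?
 * A login page saved under a .pdf name would start with HTML instead.
 */
function looksLikePdf(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // missing or unreadable file
    }

    // Every PDF file begins with the bytes "%PDF-".
    $head = fread($handle, 5);
    fclose($handle);

    return $head === '%PDF-';
}
```

If `looksLikePdf('my.pdf')` returns false, the session cookie most likely did not reach the download request.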

2018-01-26 17:15:49

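Welcome to Stack Overflow! While this code may solve the question, [including an explanation](//meta.stackexchange.com/questions/114762/explaining-entirely-code-based-answers) really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. Please also try not to crowd your code with explanatory comments, as this reduces the readability of both the code and the explanations! – FrankerZ

Even if your code solves the OP's problem, it is recommended that you add some descriptive text to the code snippet. –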
