2016-11-11 103 views

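I can find some docs explaining how to use the tqdm package, but I can't figure out how to produce a progress bar while downloading data from the web. How can I use `tqdm` in Python to show progress during a download?

Below is some sample code for downloading data, which I copied from ResidentMario: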

import requests

def download_file(url, filename):
    """
    Helper method handling downloading large files from `url` to `filename`. Returns a pointer to `filename`.
    """
    r = requests.get(url, stream=True)
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
    return filename


dat = download_file("https://data.cityofnewyork.us/api/views/h9gi-nx95/rows.csv?accessType=DOWNLOAD",
                    "NYPD Motor Vehicle Collisions.csv")

Can anyone tell me how to use the tqdm package here to show the download progress?

Thanks

Answers


For now, I do something like this:

import requests
from tqdm import tqdm

def download_file(url, filename):
    """
    Helper method handling downloading large files from `url` to `filename`. Returns a pointer to `filename`.
    """
    chunk_size = 1024
    r = requests.get(url, stream=True)
    # Total size in bytes, as reported by the server.
    total = int(r.headers['Content-Length'])
    # Using tqdm as a context manager closes the bar when the download ends.
    with open(filename, 'wb') as f, tqdm(unit="B", total=total) as pbar:
        for chunk in r.iter_content(chunk_size=chunk_size):
            if chunk:  # filter out keep-alive new chunks
                pbar.update(len(chunk))
                f.write(chunk)
    return filename
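
Building on the byte-based approach above, here is a minimal sketch that also copes with a server that sends no `Content-Length` header (in that case tqdm just shows a running count rather than a percentage). The helper name `copy_with_progress` and the `chunk_size` parameter are my own, not part of requests or tqdm. Note that if the response is compressed, the decoded bytes can exceed the advertised length, so treat the total as an estimate.

```python
import requests
from tqdm import tqdm


def copy_with_progress(chunks, out, total=None):
    """Write each bytes chunk to `out`, updating a tqdm bar. Returns bytes written."""
    written = 0
    # total=None makes tqdm show a running count instead of a percentage.
    with tqdm(total=total, unit="B", unit_scale=True) as pbar:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                out.write(chunk)
                written += len(chunk)
                pbar.update(len(chunk))
    return written


def download_file(url, filename, chunk_size=1024):
    """Stream `url` to `filename` with a progress bar. Returns `filename`."""
    with requests.get(url, stream=True) as r:
        r.raise_for_status()  # fail early on HTTP errors
        # Content-Length may be absent (e.g. chunked transfer encoding).
        length = r.headers.get("Content-Length")
        total = int(length) if length is not None else None
        with open(filename, "wb") as f:
            copy_with_progress(r.iter_content(chunk_size=chunk_size), f, total)
    return filename
```

Splitting out the write loop means it can be tried without a network connection, for example by passing a list of byte strings and an `io.BytesIO()` object.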

Thanks, silmaril, but the version below works and makes more sense to me.

import requests
from tqdm import tqdm

def download_file(url, filename):
    testread = requests.head(url)  # A HEAD request only downloads the headers
    filelength = int(testread.headers['Content-length'])

    r = requests.get(url, stream=True)  # actual download of the full file

    with open(filename, 'wb') as f:
        pbar = tqdm(total=int(filelength / 1024))  # total counted in 1024-byte chunks
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                pbar.update()  # advance by one chunk
                f.write(chunk)
        pbar.close()

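So basically you make two HTTP requests to download a single file. That is not very efficient, and even less so if the target URL involves dynamic processing. – silmaril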
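
To illustrate the point about the extra request: with `stream=True`, the headers of the GET response are available before the body is read, so a chunk-counting bar can probably be built from a single request. This is only a sketch. The helper `chunk_count` is a hypothetical name of mine, it assumes the server sends `Content-Length`, and it rounds up so that a final partial chunk is counted. Chunks can arrive smaller than `chunk_size`, so the count is an estimate.

```python
import math

import requests
from tqdm import tqdm


def chunk_count(content_length, chunk_size=1024):
    """Number of chunks needed to cover `content_length` bytes, rounded up."""
    return math.ceil(content_length / chunk_size)


def download_file(url, filename, chunk_size=1024):
    # One streamed GET: the headers arrive before the body is read,
    # so no separate HEAD request is needed.
    r = requests.get(url, stream=True)
    # Assumption: the server sends Content-Length (KeyError otherwise).
    total = chunk_count(int(r.headers["Content-Length"]), chunk_size)
    with open(filename, "wb") as f, tqdm(total=total, unit="chunk") as pbar:
        for chunk in r.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
                pbar.update()  # advance by one chunk
    return filename
```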
