2013-12-15

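Downloading files over HTTP with urllib.urlretrieve not working properly

I'm still working on my mp3 downloader, but now I've run into a problem with the files being downloaded. I have two versions of the part that is tripping me up. The first gives me a correct file but raises an error. The second gives me a file that is far too small but raises no error. I tried opening the file in binary mode, but that didn't help. I'm very new to working with HTML, so any help would be appreciated.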

import urllib 
import urllib2 

def milk(): 
    SongList = [] 
    SongStrings = [] 
    SongNames = [] 
    earmilk = urllib.urlopen("http://www.earmilk.com/category/pop") 
    reader = earmilk.read() 
    #gets the position of the playlist 
    PlaylistPos = reader.find("var newPlaylistTracks = ") 
    #finds the number of songs in the playlist 
    NumberSongs = reader[reader.find("var newPlaylistIds = "): PlaylistPos].count(",") + 1 
    initPos = PlaylistPos 

    #goes through the playlist and records the html address and name of the song

    for song in range(0, NumberSongs):
        songPos = reader[initPos:].find("http:") + initPos
        namePos = reader[songPos:].find("name") + songPos
        namePos += reader[namePos:].find(">")
        nameEndPos = reader[namePos:].find("<") + namePos
        SongStrings.append(reader[songPos: reader[songPos:].find('"') + songPos])
        SongNames.append(reader[namePos + 1: nameEndPos])
        initPos = nameEndPos

    for correction in range(0, NumberSongs):
        SongStrings[correction] = SongStrings[correction].replace('\\/', "/")

    #downloading songs 

    fileName = ''.join([a.isalnum() and a or '_' for a in SongNames[0]]) 
    fileName = fileName.replace("_", " ") + ".mp3" 


#   This version writes a file that can be played but gives an error saying: "TypeError: expected a character buffer object" 
## songDL = open(fileName, "wb") 
## songDL.write(urllib.urlretrieve(SongStrings[0], fileName)) 


#   This version creates the file but it cannot be played (file size is much smaller than it should be) 
## url = urllib.urlretrieve(SongStrings[0], fileName) 
## url = str(url) 
## songDL = open(fileName, "wb") 
## songDL.write(url) 


    songDL.close() 

    earmilk.close() 

Answer

Reread the documentation for urllib.urlretrieve:

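"Return a tuple (filename, headers) where filename is the local file name under which the object can be found, and headers is whatever the info() method of the object returned by urlopen() returned (for a remote object, possibly cached)."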
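To make that return value concrete, here is a minimal sketch that calls urlretrieve on a local file:// URL, so it runs without a network connection. The temporary file and the names source_path, target_path and scratch.bin are stand-ins invented for illustration; they are not part of the question's code. It shows that the result is a 2-tuple, that str() of it is only a short piece of text, and that write() rejects it with a TypeError.

```python
import os
import tempfile

try:
    from urllib import urlretrieve, pathname2url          # Python 2, as in the question
except ImportError:
    from urllib.request import urlretrieve, pathname2url  # Python 3 location

# Create a small local file to stand in for the remote mp3 (illustration only).
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "source.bin")
with open(source_path, "wb") as source:
    source.write(b"\x00" * 5000)

# urlretrieve copies the resource to target_path on its own.
target_path = os.path.join(work_dir, "copy.mp3")
result = urlretrieve("file:" + pathname2url(source_path), target_path)

# result is a 2-tuple (local file name, headers), not the downloaded bytes.
local_name, headers = result

# The second version in the question wrote str(result) over the file,
# which is only a short textual description of the tuple.
text_written_by_version_two = str(result)

# The first version passed the tuple straight to write(), which rejects it.
try:
    with open(os.path.join(work_dir, "scratch.bin"), "wb") as scratch:
        scratch.write(result)
    write_error = None
except TypeError as error:
    write_error = error
```

This would explain both symptoms: in the first version urlretrieve had already saved a playable file before write() failed, and in the second version the saved file was overwritten with the text of the tuple.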

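You seem to be expecting it to return the bytes of the file itself. The point of urlretrieve is that it handles writing to the file for you and returns the name of the file it wrote to (which will generally be the same as the second argument you passed to the function, if you provided one).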
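A minimal sketch of the download step written that way, assuming nothing else in the function needs to change. The helper name download_song is made up for this example; only urlretrieve itself comes from the standard library.

```python
try:
    from urllib import urlretrieve          # Python 2, as in the question
except ImportError:
    from urllib.request import urlretrieve  # Python 3 location of the same function


def download_song(url, file_name):
    """Save the resource at url to file_name and return the local path."""
    # urlretrieve opens file_name itself, writes the downloaded bytes to it,
    # and closes it, so no separate open() or write() call is needed.
    saved_path, headers = urlretrieve(url, file_name)
    # saved_path is the local file name and headers describes the response;
    # neither of them is the file's contents.
    return saved_path
```

In the question's function this would stand in for both commented-out versions, for example download_song(SongStrings[0], fileName). The songDL.close() line would then no longer be needed, because no file handle is opened by hand.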

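By the way, this kind of thing is a great reason to learn to use [pdb](http://docs.python.org/2/library/pdb.html). Run your function in the Python REPL, and when it crashes, type 'import pdb; pdb.pm()' to get a debugger prompt at the point where the code crashed. From there you can look directly at what a function like 'urlretrieve' actually returns. That should give you an idea of why the various things you tried to do with the return value failed. – Iguananaut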
