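I'm trying to write a Python script that searches through a directory tree, lists all .flac files, derives the Artist, Album and Title from the respective dir/subdir/filename, and writes them to a file. The code works fine until it hits a Unicode character. Here is the code: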
import os, glob, re
def scandirs(path):
    for currentFile in glob.glob(os.path.join(path, '*')):
        if os.path.isdir(currentFile):
            scandirs(currentFile)
        if os.path.splitext(currentFile)[1] == ".flac":
            rpath = os.path.relpath(currentFile)
            print "**DEBUG** rpath =", rpath
            title = os.path.basename(currentFile)
            title = re.findall(u'\d\d\s(.*).flac', title, re.U)
            title = title[0].decode("utf8")
            print "**DEBUG** title =", title
            fpath = os.path.split(os.path.dirname(currentFile))
            artist = fpath[0][2:]
            print "**DEBUG** artist =", artist
            album = fpath[1]
            print "**DEBUG** album =", album
            out = "%s | %s | %s | %s\n" % (rpath, artist, album, title)
            flist = open('filelist.tmp', 'a')
            flist.write(out)
            flist.close()
scandirs('./')
Output:
**DEBUG** rpath = Thriftworks/Fader/Thriftworks - Fader - 01 180°.flac
**DEBUG** title = 180°
**DEBUG** artist = Thriftworks
**DEBUG** album = Fader
Traceback (most recent call last):
  File "decflac.py", line 25, in <module>
    scandirs('./')
  File "decflac.py", line 7, in scandirs
    scandirs(currentFile)
  File "decflac.py", line 7, in scandirs
    scandirs(currentFile)
  File "decflac.py", line 20, in scandirs
    out = "%s | %s | %s | %s\n" % (rpath, artist, album, title)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 46: ordinal not in range(128)
But when I try it in the Python console, it works fine:
>>> import re
>>> title = "Thriftworks - Fader - 01 180°.flac"
>>> title2 = "dummy"
>>> title = re.findall(u'\d\d\s(.*).flac', title, re.U)
>>> title = title[0].decode("utf8")
>>> out = "%s | %s\n" % (title2, title)
>>> print out
dummy | 180°
So, my questions: 1) Why does the same code work in the console but not in the script? 2) How do I fix the script?
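One reading of the traceback, offered as a sketch rather than a definitive diagnosis: in Python 2, when `%` formatting mixes byte strings with a `unicode` object, the result becomes `unicode` and every byte string is implicitly decoded as ASCII. In the script, `title` is `unicode` (after `.decode("utf8")`) while `rpath` is still a byte string containing the UTF-8 bytes of `°`. In the console test the only byte string is `"dummy"`, which is pure ASCII, so nothing fails. The snippet below spells out that implicit decode explicitly so it can be run on either Python 2 or Python 3; the variable names `error_position` and `failing_byte` are made up for illustration.

```python
# -*- coding: utf-8 -*-
# rpath as the script sees it: a byte string holding the UTF-8 bytes of the path.
rpath = u"Thriftworks/Fader/Thriftworks - Fader - 01 180\xb0.flac".encode("utf-8")

try:
    # Python 2 performs this decode implicitly when "%s" formatting mixes
    # byte strings with a unicode object; here it is written out by hand.
    rpath.decode("ascii")
    error_position = None
    failing_byte = None
except UnicodeDecodeError as exc:
    # exc.start is the index of the first byte that is not valid ASCII.
    error_position = exc.start
    failing_byte = rpath[exc.start:exc.start + 1]

print(error_position)
```

The reported position is 46, the same offset as in the traceback, which points at `rpath` rather than `title` as the value that cannot be decoded. Under that reading, keeping all four values of the same type (all `unicode`, or all byte strings) before formatting should avoid the error.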
Thanks, Mark. I still couldn't get it to work with a u-prefixed glob, but using os.walk instead of the glob construct, the script works fine with Unicode in Python 2. – 2015-02-09 12:25:48
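The comment does not show the `os.walk` version, so the following is only a minimal sketch of what such a rewrite might look like, not the commenter's actual script. It assumes the `Artist/Album/... NN Title.flac` layout from the question; the helper names `describe_flac`, `scan_flac` and `write_list` are hypothetical. Passing a `unicode` root to `os.walk` makes Python 2 return `unicode` names, and `io.open` with an explicit encoding handles the encoding on write.

```python
# -*- coding: utf-8 -*-
import io
import os
import re

# Same pattern as in the question, with the dot before "flac" escaped.
TITLE_RE = re.compile(r'\d\d\s(.*)\.flac$', re.U)


def describe_flac(rel_path):
    """Split 'Artist/Album/... NN Title.flac' into (artist, album, title)."""
    directory, filename = os.path.split(rel_path)
    artist, album = os.path.split(directory)
    match = TITLE_RE.search(filename)
    # Fall back to the whole filename if the pattern does not match.
    title = match.group(1) if match else filename
    return artist, album, title


def scan_flac(root=u'.'):
    """Walk root and return one text line per .flac file found."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1] != u'.flac':
                continue
            rel_path = os.path.relpath(os.path.join(dirpath, name), root)
            artist, album, title = describe_flac(rel_path)
            lines.append(u"%s | %s | %s | %s\n" % (rel_path, artist, album, title))
    return lines


def write_list(lines, out_path=u'filelist.tmp'):
    """Append the lines to out_path, encoded as UTF-8."""
    with io.open(out_path, 'a', encoding='utf-8') as flist:
        flist.writelines(lines)
```

A call such as `write_list(scan_flac(u'.'))` would then append one line per file to `filelist.tmp`. Every value in the format string is text, so no implicit ASCII decode takes place.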