Is Python's "re" module sensitive to particular languages?

-1

import os
import re

# bot, process_request_step and process_request_step_markup are defined elsewhere in the script.
def process_file_step(message):
    chat_id = message.chat.id
    search = message.text
    # Escape the user's text so it is matched literally, then look for PDFs whose name contains it.
    pattern = re.compile(r'.*%s.*\.pdf' % re.escape(search), re.I)
    # List the directory once and reuse the result for both the check and the upload loop.
    matches = [name for name in os.listdir('Files') if pattern.search(name)]
    if matches:
        bot.send_chat_action(chat_id, 'typing')
        # "The files you needed:"
        bot.send_message(chat_id, 'فایل هایی که نیاز داشتید :')
        for name in matches:
            bot.send_chat_action(chat_id, 'upload_document')
            with open(os.path.join('Files', name), 'rb') as requested_file:
                bot.send_document(chat_id, requested_file, caption='@RavanPediaBot')
    else:
        bot.send_chat_action(chat_id, 'typing')
        # "No such document exists!"
        bot.send_message(chat_id, 'چنین سندی وجود ندارد !')
    # "Do you have another request?"
    bot.register_next_step_handler(bot.send_message(chat_id, 'درخواست دیگری دارید ؟', reply_markup=process_request_step_markup), process_request_step)

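This takes text from the user, searches the Files directory for file names containing that string, and uploads the matching files to the user. It works well, but it does not work for Persian: it always sends the message saying that the file does not exist. When I run the script on my own computer it works for Persian names too, but when I run it on codeanywhere.com it does not. I use pyTelegramBotAPI and Python 3.x, and I get the text from the function argument. I also escaped the search string, but that did not help.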

I printed the Persian message, which triggered this error:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)
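
For context, print() has to encode text with the encoding of standard output, and this error is what Python raises when that encoding is ASCII. The sketch below is only a diagnostic, under the assumption that the hosted environment runs with a C/POSIX locale; the helper name can_print and the sample string are made up for illustration.

```python
import sys

def can_print(text, encoding):
    """Return True if `text` can be encoded with `encoding`, as print() must do."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Four Persian characters, comparable to "position 0-3" in the error message.
sample = 'فایل'

# Encodings Python picked up from the environment. On a host with a C/POSIX
# locale these may report ASCII (sometimes spelled ANSI_X3.4-1968).
print('stdout encoding:', getattr(sys.stdout, 'encoding', None))
print('filesystem encoding:', sys.getfilesystemencoding())

ascii_ok = can_print(sample, 'ascii')
utf8_ok = can_print(sample, 'utf-8')
print('encodable as ASCII:', ascii_ok)
print('encodable as UTF-8:', utf8_ok)
```

If the first two lines report an ASCII encoding on the remote host but UTF-8 locally, the difference between the two machines is probably the locale rather than the regex. Setting the PYTHONIOENCODING environment variable to utf-8, or configuring a UTF-8 locale, is one way to change what Python uses for standard output.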

Source

2017-08-06 Ali Bahaari

I would first escape the search pattern: '.*%s.*\.pdf' % re.escape(search). As it stands, the user can write a regular expression in it. –
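
To illustrate the point of the comment above, here is a small sketch comparing an unescaped and an escaped pattern. The file names and the helper build_pattern are invented for the example.

```python
import re

def build_pattern(search):
    """Compile a case-insensitive pattern for PDF names containing `search` literally."""
    return re.compile(r".*%s.*\.pdf" % re.escape(search), re.I)

names = ["notes (v1).pdf", "notes v1.pdf"]

# Without escaping, "(v1)" is read as a capture group around "v1".
unescaped = re.compile(r".*%s.*\.pdf" % "(v1)", re.I)
escaped = build_pattern("(v1)")

unescaped_hits = [n for n in names if unescaped.search(n)]
escaped_hits = [n for n in names if escaped.search(n)]

print(unescaped_hits)  # both names match, parentheses or not
print(escaped_hits)    # only the name containing literal parentheses
```

Escaping does not change how Persian text is matched; it only stops regex metacharacters in the user's input from being interpreted.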

Take a look at the new Path class in the standard library's pathlib module. You can do glob matching very easily: Path("Files").glob("*{}*.pdf".format(search)) – PaulMcG
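
A minimal sketch of the pathlib suggestion, run against a temporary directory so it is self-contained. The file names are invented, and the glob.escape call is an addition that the comment does not mention: it keeps *, ? and [ in the user's text from acting as wildcards.

```python
import glob
import tempfile
from pathlib import Path

def find_pdfs(directory, search):
    """Return the sorted names of PDFs in `directory` whose name contains `search`."""
    # glob.escape makes *, ? and [ in the user's text match literally.
    pattern = "*{}*.pdf".format(glob.escape(search))
    return sorted(p.name for p in Path(directory).glob(pattern))

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files: one Persian PDF, one English PDF, one non-PDF.
    for name in ("روانشناسی عمومی.pdf", "notes.pdf", "روانشناسی.txt"):
        (Path(tmp) / name).touch()
    found = find_pdfs(tmp, "روانشناسی")

print(found)
```

Note that glob matching is case-sensitive on most Unix file systems, so this is not an exact replacement for a pattern compiled with re.I.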

@WillemVanOnsem Thanks, but it didn't work... –

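This is most likely an encoding problem: you get search from the command line, but are you sure its encoding matches the encoding of the file names you are searching?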

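What operating system are you using? Linux/Unix/OS X usually use UTF-8 for shell input, but Windows usually uses CP-1252 by default, an encoding which is almost the same... but not quite. You need to know what your input_encoding is and then decode the input to Unicode for this to work (this applies when search is a byte string, as it is in Python 2): unicode_search = search.decode(input_encoding, "strict")
What version of Python are you using? Python 3 strings are Unicode by default, but Python 2 strings are not; in that case, force the regex string into being Unicode with a u'string' literal: pattern = re.compile(u'.*%s.*\.pdf' % search, re.I)
If you are using Python 2, you should also pass re.UNICODE to your re.compile(..) call.
Finally, you are not escaping your input, so that, for example, ? is treated as a quantifier on the preceding character rather than matching an actual question mark; your first line should be search = re.escape(args[0])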
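
The decoding advice above can be sketched for Python 3 as follows. This is an illustration under assumptions, not a confirmed fix: it assumes the file names on disk are stored as UTF-8, and the function name find_matching_pdfs and the sample file names are hypothetical. Listing the directory through a bytes path returns raw bytes names, so the decoding step is explicit and does not depend on the locale of the process.

```python
import os
import re
import tempfile

def find_matching_pdfs(directory, search, name_encoding="utf-8"):
    """Return PDF names in `directory` that contain `search`, matched literally."""
    pattern = re.compile(r".*%s.*\.pdf" % re.escape(search), re.I)
    matches = []
    # A bytes path makes os.listdir return bytes names, which are then decoded
    # explicitly (assumption: the names are UTF-8 on disk).
    for raw_name in os.listdir(os.fsencode(directory)):
        name = raw_name.decode(name_encoding, "replace")
        if pattern.search(name):
            matches.append(name)
    return sorted(matches)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files for the demonstration.
    for name in ("روانشناسی عمومی.pdf", "notes.pdf"):
        open(os.path.join(tmp, name), "w").close()
    persian_hits = find_matching_pdfs(tmp, "روانشناسی")

print(persian_hits)
```

The names returned here are decoded strings for matching and display. When opening a file afterwards on a host whose locale is not UTF-8, it may be safer to keep the original bytes name and pass that to open().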

Source

2017-08-06 22:07:19 errantlinguist

@erranlinguist I get the user's text from Telegram... –

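That is something you will have to figure out, because I don't know what goes into your 'args' variable and I don't know how you get data from Telegram. – errantlinguist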

I'll post the code... –
