2017-04-07 139 views
1

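"ValueError: cannot filter palette images" during Pytesseract conversion

I'm running into this error with the following Pytesseract code (Python 3.6.1, Mac OS X).

import pytesseract
import requests
from PIL import Image
from PIL import ImageFilter
from io import StringIO, BytesIO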

def process_image(url): 
    image = _get_image(url) 
    image.filter(ImageFilter.SHARPEN) 
    return pytesseract.image_to_string(image) 


def _get_image(url): 
    r = requests.get(url) 
    s = BytesIO(r.content) 
    img = Image.open(s) 
    return img 

process_image("https://www.prepressure.com/images/fonts_sample_ocra_medium.png") 

Error:

/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/g/pyfo/reddit/ocr.py 
Traceback (most recent call last): 
    File "/Users/g/pyfo/reddit/ocr.py", line 20, in <module> 
    process_image("https://www.prepressure.com/images/fonts_sample_ocra_medium.png") 
    File "/Users/g/pyfo/reddit/ocr.py", line 10, in process_image 
    image.filter(ImageFilter.SHARPEN) 
    File "/usr/local/lib/python3.6/site-packages/PIL/Image.py", line 1094, in filter 
    return self._new(filter.filter(self.im)) 
    File "/usr/local/lib/python3.6/site-packages/PIL/ImageFilter.py", line 53, in filter 
    raise ValueError("cannot filter palette images") 
ValueError: cannot filter palette images 

Process finished with exit code 1 

It seems simple enough, but it doesn't work. Any help would be greatly appreciated.

+0

Possible duplicate of [Python3 error: initial\_value must be str or None](http://stackoverflow.com/questions/31064981/python3-error-initial-value-must-be-str-or-none) – Craig

+0

@Craig I saw that one, and unfortunately the answer didn't solve my problem. I'm using Python 3.6.1, by the way. – gmonz

+1

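So you replaced `StringIO` with `BytesIO` and you still get the same error message? If so, split `Image.open(StringIO(requests.get(url).content))` into several separate lines (basic debugging) to determine which call is throwing the error. – Craig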

Answer

2

The image you have is a palette-based image (PIL mode 'P'). You need to convert it to a full RGB image before you can use the PIL filters.

import pytesseract 
import requests 
from PIL import Image, ImageFilter 
from io import StringIO, BytesIO 

def process_image(url): 
    image = _get_image(url) 
    image = image.convert('RGB') 
    image = image.filter(ImageFilter.SHARPEN) 
    return pytesseract.image_to_string(image) 


def _get_image(url): 
    r = requests.get(url) 
    s = BytesIO(r.content) 
    img = Image.open(s) 
    return img 

process_image("https://www.prepressure.com/images/fonts_sample_ocra_medium.png") 
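If the input may or may not be a palette image, the conversion can be made conditional on the image's mode. The following is only a minimal sketch: the helper name `sharpen_any_mode` is hypothetical, and a small palette image is built in memory as a stand-in for the downloaded PNG.

```python
from PIL import Image, ImageFilter


def sharpen_any_mode(image):
    """Sharpen an image, converting palette ('P') images to RGB first.

    Hypothetical helper for illustration; not part of PIL.
    """
    if image.mode == "P":
        # Filters refuse palette images, so expand to full RGB first.
        image = image.convert("RGB")
    return image.filter(ImageFilter.SHARPEN)


# Stand-in for the downloaded PNG: a small palette image built in memory.
palette_image = Image.new("P", (8, 8))
sharpened = sharpen_any_mode(palette_image)
print(palette_image.mode, sharpened.mode)
```

The helper returns a new image and leaves the one passed in untouched, so the caller still has to use the return value.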

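You should also note that the .convert() and .filter() methods return a copy of the image; they do not modify the existing image object. You need to assign the return value to a variable, as shown in the code above.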
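The copy behaviour can be checked without any network access. This is a small sketch that assumes an in-memory palette image in place of the real PNG; the variable names are illustrative only.

```python
from PIL import Image, ImageFilter

# In-memory palette image used in place of the downloaded file.
original = Image.new("P", (8, 8))

# Calling convert() without assigning the result has no lasting effect:
# the returned RGB copy is simply discarded.
original.convert("RGB")
mode_after_discard = original.mode

# Assigning the return values keeps each new image.
converted = original.convert("RGB")
sharpened = converted.filter(ImageFilter.SHARPEN)

print(mode_after_discard, converted.mode, sharpened.mode)
print(converted is original, sharpened is converted)
```

This is why the original `image.filter(ImageFilter.SHARPEN)` line would have had no effect on `image` even if the filter had succeeded.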

Note: I don't have pytesseract installed, so I can't check the last line of process_image().

+0

Hmm, now I get `/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/g/pyfo/reddit/ocr2.py Traceback (most recent call last): File "/Users/g/pyfo/reddit/ocr2.py", line 19, in <module> process_image("https://www.prepressure.com/images/fonts_sample_ocra_medium.png") File "/Users/g/pyfo/reddit/ocr2.py", line 10, in process_image return pytesseract.image_to_string(image) AttributeError: module 'pytesseract' has no attribute 'image_to_string'` – gmonz

+0

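I know I'm close on another one. Do I just set image.enhance(2.0)? pastebin.com/uRfhsi8J @Craig Or do I really need both the filter and the enhancement, like this: pastebin.com/GbUkqp9c? Do I need to enhance explicitly before sharpening and filtering? – gmonz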

+0

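I can't help you with pytesseract. I suggest you keep experimenting with the code and, if you're still stuck, post a new question about how to use pytesseract with your image. – Craig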