tess-two OCR not decoding correctly

I have followed the tutorials to get Tesseract, specifically tess-two and eyes-two, installed as part of my Android application.

It runs, but the OCR text returned from baseApi.getUTF8Text() is complete gibberish.

BitmapFactory.Options options = new BitmapFactory.Options();
options.inSampleSize = 4;
Bitmap bmp = BitmapFactory.decodeFile(path, options);
receipt.setImageBitmap(bmp);

try {
    // Read the EXIF orientation so the bitmap can be turned upright before OCR.
    ExifInterface exif = new ExifInterface(path);
    int exifOrientation = exif.getAttributeInt(ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL);
    int rotate = 0;
    switch (exifOrientation) {
        case ExifInterface.ORIENTATION_ROTATE_90: rotate = 90; break;
        case ExifInterface.ORIENTATION_ROTATE_180: rotate = 180; break;
        case ExifInterface.ORIENTATION_ROTATE_270: rotate = 270; break;
    }
    if (rotate != 0) {
        int w = bmp.getWidth();
        int h = bmp.getHeight();
        Matrix matrix = new Matrix();
        matrix.preRotate(rotate);
        bmp = Bitmap.createBitmap(bmp, 0, 0, w, h, matrix, false);
    }

    // tess-two expects an ARGB_8888 bitmap.
    bmp = bmp.copy(Bitmap.Config.ARGB_8888, true);

    TessBaseAPI baseApi = new TessBaseAPI();
    baseApi.init(DATA_PATH, "eng");
    baseApi.setImage(bmp);
    String OCRText = baseApi.getUTF8Text();
    baseApi.end();

    Log.i("OCR Text", "rotate " + rotate);
    Log.i("OCR Text", "OCR ");
    Log.i("OCR Text", OCRText);
    Log.i("OCR Text", "=======================================================================================");
} catch (IOException e) {
    // The ExifInterface constructor throws IOException if the file cannot be read.
    Log.e("OCR Text", "Could not read EXIF data from " + path, e);
}

The image taken returns the following OCR characters:

05-14 11:01:59.131: I/OCR Text(18199): rotate 90 
05-14 11:01:59.131: I/OCR Text(18199): OCR 
05-14 11:01:59.131: I/OCR Text(18199): 4— ‘ ‘ 
05-14 11:01:59.131: I/OCR Text(18199): \Dxﬁ ‘ 
05-14 11:01:59.131: I/OCR Text(18199): I W man"! no Accounv 
05-14 11:01:59.131: I/OCR Text(18199): 1’ 
05-14 11:01:59.131: I/OCR Text(18199): my... «unblm m. mm. 
05-14 11:01:59.131: I/OCR Text(18199): :~A 
05-14 11:01:59.131: I/OCR Text(18199): «Ln. 
05-14 11:01:59.131: I/OCR Text(18199): ‘ “w “IN. N I “H‘M‘ 
05-14 11:01:59.131: I/OCR Text(18199): mmnwnmw- .; k. ' 
05-14 11:01:59.131: I/OCR Text(18199): Wilt-run”. uni” nl 
05-14 11:01:59.131: I/OCR Text(18199): mam. I 
05-14 11:01:59.131: I/OCR Text(18199): =======================================================================================

Any suggestions on how to clean up and correct the OCR recognition? The device used is a Samsung Galaxy 7".
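The orientation handling in the snippet above can be pulled out into a small helper so it can be checked without a device. This is only a sketch: the class name `ExifRotation` and its methods are hypothetical, and the constants are written out by hand with the standard EXIF Orientation tag values rather than taken from `android.media.ExifInterface`.

```java
final class ExifRotation {
    // Standard EXIF Orientation tag values, declared locally so the sketch
    // runs on a plain JVM (assumption: the caller passes the raw tag value).
    static final int ORIENTATION_NORMAL = 1;
    static final int ORIENTATION_ROTATE_180 = 3;
    static final int ORIENTATION_ROTATE_90 = 6;
    static final int ORIENTATION_ROTATE_270 = 8;

    private ExifRotation() {}

    /** Maps an EXIF orientation value to the clockwise rotation in degrees. */
    static int degreesFor(int exifOrientation) {
        switch (exifOrientation) {
            case ORIENTATION_ROTATE_90:  return 90;
            case ORIENTATION_ROTATE_180: return 180;
            case ORIENTATION_ROTATE_270: return 270;
            default:                     return 0; // normal, flipped, or unknown
        }
    }

    /** Returns {width, height} of an image after rotating it by the given degrees. */
    static int[] rotatedSize(int width, int height, int degrees) {
        boolean swapsAxes = degrees == 90 || degrees == 270;
        return swapsAxes ? new int[] {height, width} : new int[] {width, height};
    }

    public static void main(String[] args) {
        int degrees = degreesFor(ORIENTATION_ROTATE_90);
        int[] size = rotatedSize(640, 480, degrees);
        System.out.println(degrees + " degrees, " + size[0] + "x" + size[1]);
    }
}
```

The log in the question shows `rotate 90`, so the orientation step itself appears to be working; the helper mainly makes that easy to confirm in a unit test.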


2015-05-14 NewDev

The Samsung Galaxy Tab 2 7" does not have autofocus on its main (rear) camera, so you are likely to get better results with a different device. – rmtheis

You can use something like:

OCRText = OCRText.replaceAll("[^a-zA-Z0-9]+", " "); 
OCRText = OCRText.trim();

This is based on a Tesseract implementation I found here: SimpleAndroidOCRActivity.java
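As a rough, self-contained illustration of what those two lines do, the sketch below wraps them in a helper and runs it on one line of the logged output from the question. The class name `OcrTextCleaner` and the method `clean` are hypothetical names chosen for this example.

```java
final class OcrTextCleaner {
    private OcrTextCleaner() {}

    /**
     * Replaces every run of characters other than ASCII letters and digits
     * with a single space, then trims leading and trailing whitespace.
     */
    static String clean(String ocrText) {
        return ocrText.replaceAll("[^a-zA-Z0-9]+", " ").trim();
    }

    public static void main(String[] args) {
        // One line of the OCR output logged in the question.
        System.out.println(clean("I W man\"! no Accounv")); // prints: I W man no Accounv
    }
}
```

Note that this only strips stray symbols after the fact. It also removes legitimate punctuation, and it cannot recover characters that Tesseract misread in the first place.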


2015-05-14 17:23:12 determined

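Thanks. But I believe this may be related to focus. If I scan with the front camera (which has autofocus), the accuracy reaches about 90%, which makes more sense. When I scan with the rear camera (which has no autofocus), I get the gibberish above. It should be a name and address. – NewDev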
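If focus is the cause, one option is to reject blurry frames before handing them to Tesseract. The sketch below is not part of tess-two; it uses a common sharpness heuristic (the variance of a 4-neighbour Laplacian) on a plain grayscale array. The class name `SharpnessCheck`, its methods, and the threshold parameter are all hypothetical, and a usable threshold would have to be found empirically for the device and document type. On Android the grayscale values would have to be derived from the bitmap's pixels first, which is left out here.

```java
final class SharpnessCheck {
    private SharpnessCheck() {}

    /**
     * Variance of the 4-neighbour Laplacian over the interior pixels of a
     * grayscale image (gray[row][column], values 0..255). Blurry images tend
     * to give low values because they contain few sharp edges.
     */
    static double laplacianVariance(int[][] gray) {
        if (gray.length < 3 || gray[0].length < 3) {
            throw new IllegalArgumentException("image must be at least 3x3");
        }
        int height = gray.length;
        int width = gray[0].length;
        double sum = 0.0;
        double sumOfSquares = 0.0;
        int count = 0;
        for (int y = 1; y < height - 1; y++) {
            for (int x = 1; x < width - 1; x++) {
                int laplacian = gray[y - 1][x] + gray[y + 1][x]
                        + gray[y][x - 1] + gray[y][x + 1]
                        - 4 * gray[y][x];
                sum += laplacian;
                sumOfSquares += (double) laplacian * laplacian;
                count++;
            }
        }
        double mean = sum / count;
        return sumOfSquares / count - mean * mean;
    }

    /** True if the image's Laplacian variance reaches the caller-chosen threshold. */
    static boolean isLikelySharp(int[][] gray, double threshold) {
        return laplacianVariance(gray) >= threshold;
    }
}
```

A check like this could be used to ask the user to retake the photo instead of running OCR on a frame that is unlikely to decode.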
