2016-03-04

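Word splitting with Tesseract tess-two on Android

I am trying to read a question and its answers from an image on Android using Tesseract (tess-two). Currently I get a single string containing every word in the image. My problem is that I cannot split out the individual answers. Is it possible to split the answers with TessBaseAPI? A solution in plain Java/Android would also be fine ;)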

public String detectText(Bitmap bitmap) { 
    Log.d(TAG, "Initialization of TessBaseApi"); 
    TessDataManager.initTessTrainedData(context); 
    TessBaseAPI tessBaseAPI = new TessBaseAPI(); 
    String path = TessDataManager.getTesseractFolder(); 
    Log.d(TAG, "Tess folder: " + path); 
    tessBaseAPI.setDebug(true); 
    tessBaseAPI.init(path, "eng"); 
    tessBaseAPI.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ" + 
      "abcdefghijklnmopqrstuvwxyzäüößÄÖÜ[email protected]#$%^&*+=-;()/"); 
    tessBaseAPI.setPageSegMode(TessBaseAPI.OEM_TESSERACT_CUBE_COMBINED); 

    Log.d(TAG, "Ended initialization of TessEngine"); 
    Log.d(TAG, "Running inspection on bitmap"); 
    tessBaseAPI.setImage(bitmap); 

    String inspection = tessBaseAPI.getUTF8Text(); 
    Log.d(TAG, "Got data: " + inspection); 
    tessBaseAPI.end(); 
    System.gc(); 
    return inspection; 
} 

Here is an example of what the image looks like.

Answer


This is how it works:

tessBaseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SPARSE_TEXT);
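Once the page segmentation mode is set this way, the string returned by getUTF8Text() still has to be cut into pieces. A minimal sketch in plain Java follows, assuming each question or answer ends up on its own line of the OCR output, possibly separated by blank lines; that layout is an assumption and depends on the image. The class name AnswerSplitter and the method splitLines are hypothetical helpers, not part of tess-two.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of tess-two): splits raw OCR text into entries.
final class AnswerSplitter {

    private AnswerSplitter() {
    }

    // Returns the trimmed, non-empty lines of the OCR output, in order.
    // Assumption: every question or answer sits on its own line.
    static List<String> splitLines(String ocrText) {
        List<String> lines = new ArrayList<>();
        if (ocrText == null) {
            return lines;
        }
        // Accept both Unix and Windows line endings.
        for (String line : ocrText.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }
}
```

In detectText above, this could be called as AnswerSplitter.splitLines(inspection) before returning. If line-based splitting turns out to be too coarse, tess-two also exposes a ResultIterator through tessBaseAPI.getResultIterator(), which can be stepped at word or text-line level after recognition has run.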