english the api works well but if i use the arabic trained data the app crashes #428

Closed
yatharthgupta112 opened this issue Sep 17, 2016 · 2 comments

Comments

@yatharthgupta112

09-17 15:20:02.050 21768-21778/com.example.sigmaway.homeimage W/art: Suspending all threads took: 28.488ms
09-17 15:20:02.078 21768-25485/com.example.sigmaway.homeimage V/OCR: Ctesseract 1
09-17 15:20:02.085 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/libjpgt.so: unused DT entry: type 0x6ffffffe arg 0x29b0
09-17 15:20:02.085 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/libjpgt.so: unused DT entry: type 0x6fffffff arg 0x1
09-17 15:20:02.088 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/libpngt.so: unused DT entry: type 0x6ffffffe arg 0x58e0
09-17 15:20:02.088 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/libpngt.so: unused DT entry: type 0x6fffffff arg 0x2
09-17 15:20:02.093 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/liblept.so: unused DT entry: type 0x6ffffffe arg 0x231d0
09-17 15:20:02.093 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/liblept.so: unused DT entry: type 0x6fffffff arg 0x2
09-17 15:20:02.097 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/libtess.so: unused DT entry: type 0x6ffffffe arg 0x67f60
09-17 15:20:02.097 21768-25485/com.example.sigmaway.homeimage W/linker: /data/app/com.example.sigmaway.homeimage-1/lib/arm64/libtess.so: unused DT entry: type 0x6fffffff arg 0x3
09-17 15:20:02.156 21768-25485/com.example.sigmaway.homeimage V/OCR: Ctesseract 2
09-17 15:20:02.157 21768-25485/com.example.sigmaway.homeimage V/OCR: Ctesseract 3
09-17 15:20:02.293 21768-25485/com.example.sigmaway.homeimage A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 25485 (AsyncTask #4)
09-17 15:20:03.802 27329-27329/com.example.sigmaway.homeimage W/art: Before Android 4.1, method android.graphics.PorterDuffColorFilter android.support.graphics.drawable.VectorDrawableCompat.updateTintFilter(android.graphics.PorterDuffColorFilter, android.content.res.ColorStateList, android.graphics.PorterDuff$Mode) would have incorrectly overridden the package-private method in android.graphics.drawable.Drawable
09-17 15:20:04.033 27329-27329/com.example.sigmaway.homeimage A/add home: tess data or Document file found
09-17 15:20:04.037 27329-27329/com.example.sigmaway.homeimage A/add home: tess data or Document file found
09-17 15:20:04.090 27329-27372/com.example.sigmaway.homeimage D/OpenGLRenderer: Use EGL_SWAP_BEHAVIOR_PRESERVED: true
09-17 15:20:04.099 27329-27329/com.example.sigmaway.homeimage D/Atlas: Validating map...

// Imports assume the tess-two Android wrapper, which provides TessBaseAPI.
import android.content.Context;
import android.content.res.AssetManager;
import android.graphics.Bitmap;
import android.os.Environment;
import android.util.Log;

import com.googlecode.tesseract.android.TessBaseAPI;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;

public class Ocr {
    String TAG = "OCR";
    String DATA_PATH = Environment.getExternalStorageDirectory().toString() + "/Sigmaway/";
    String[] language = {"eng", "ara"};
    Context c;
    ArrayList Pics = new ArrayList();

    // Note: this has a return type, so it is an ordinary method named Ocr, not a
    // constructor. It must be called explicitly to create the folders and copy the data.
    public void Ocr(Context context) {
        this.c = context;
        String[] paths = new String[] { DATA_PATH, DATA_PATH + "tessdata/" };

        // Create the data directories on external storage if they are missing.
        for (String path : paths) {
            File dir = new File(path);
            if (!dir.exists()) {
                if (!dir.mkdirs()) {
                    Log.v(TAG, "ERROR: Creation of directory " + path + " on sdcard failed");
                    return;
                } else {
                    Log.v(TAG, "Created directory " + path + " on sdcard");
                }
            }
        }

        // Copy each language's traineddata file out of the APK assets if it is not there yet.
        for (String lang : language) {
            Log.v(TAG, "Checking traineddata for " + lang);
            File target = new File(DATA_PATH + "tessdata/" + lang + ".traineddata");
            if (!target.exists()) {
                try {
                    AssetManager assetManager = c.getAssets();
                    InputStream in = assetManager.open("tessdata/" + lang + ".traineddata");
                    OutputStream out = new FileOutputStream(target);

                    // Transfer bytes from in to out; read() returns -1 at end of stream.
                    byte[] buf = new byte[1024];
                    int len;
                    while ((len = in.read(buf)) != -1) {
                        out.write(buf, 0, len);
                    }
                    in.close();
                    out.close();

                    Log.v(TAG, "Copied " + lang + " traineddata");
                } catch (IOException e) {
                    Log.e(TAG, "Was unable to copy " + lang + " traineddata " + e.toString());
                }
            }
        }
    }

    public String tesseract(Context context, Bitmap bmpImg, String lang) {
        this.c = context;

        Log.v(TAG, "Ctesseract 1");
        TessBaseAPI baseApi = new TessBaseAPI();
        Log.v(TAG, "Ctesseract 2");
        baseApi.setDebug(true);
        Log.v(TAG, "Ctesseract 3");
        // With lang = "ara" the log stops after "Ctesseract 3", so the abort happens here.
        baseApi.init(DATA_PATH, lang);
        Log.v(TAG, "Ctesseract 4");
        baseApi.setImage(bmpImg);
        Log.v(TAG, "Ctesseract 5");
        String recognizedText = baseApi.getUTF8Text();
        Log.v(TAG, "Ctesseract 6");
        baseApi.end();

        // For English output, collapse anything that is not a letter or digit into a space.
        if (lang.equalsIgnoreCase("eng")) {
            recognizedText = recognizedText.replaceAll("[^a-zA-Z0-9]+", " ");
        }

        //recognizedText = recognizedText.trim();
        return recognizedText;
    }
}
This is the class I use to run OCR. I call its tesseract method from an AsyncTask in the main activity.
With the English trained data the API works well, but with the Arabic trained data the app crashes
with the error below at the baseApi.init(DATA_PATH,lang); call:
09-17 15:20:02.293 21768-25485/com.example.sigmaway.homeimage A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 25485 (AsyncTask #4)
09-17 15:20:03.802 27329-27329/com.example.sigmaway.homeimage W/art: Before Android 4.1, method android.graphics.PorterDuffColorFilter android.support.graphics.drawable.VectorDrawableCompat.updateTintFilter(android.graphics.PorterDuffColorFilter, android.content.res.ColorStateList, android.graphics.PorterDuff$Mode) would have incorrectly overridden the package-private method in android.graphics.drawable.Drawable

@amitdo
Collaborator

amitdo commented Sep 19, 2016

Your issue seems to be related to #235

Please read this:
https://github.com/tesseract-ocr/tesseract/blob/master/CONTRIBUTING.md

Make sure you are able to replicate the problem with the Tesseract command line program. For external programs that use Tesseract (including wrappers and your own program, if you are a developer), report the issue to the developers of that software if possible. You can also try to find help in the Tesseract forum.

https://groups.google.com/d/forum/tesseract-ocr

@amitdo
Collaborator

amitdo commented Sep 19, 2016

Also, Arabic uses a special OCR engine 'Cube' which isn't maintained anymore. It will be replaced by a better engine in the next release.
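
Since Cube-based languages were distributed as several files rather than a single .traineddata, one thing worth ruling out before calling init is an incomplete tessdata folder. The sketch below is not taken from the reporter's code or from Tesseract itself: the class name TessDataCheck, the method missingFiles, and especially the list of cube file suffixes are assumptions based on how the Tesseract 3.x Arabic data was typically packaged, so check the names against the data you actually downloaded. It only rules out one possible cause; the thread does not confirm that missing files are behind this particular crash.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class TessDataCheck {
    // ASSUMPTION: companion file suffixes of the Cube-based Arabic data in Tesseract 3.x.
    // Verify against the language data you actually downloaded.
    private static final String[] CUBE_SUFFIXES = {
        "cube.bigrams", "cube.fold", "cube.lm", "cube.nn",
        "cube.params", "cube.size", "cube.word-freq"
    };

    /**
     * Returns the names of expected data files that are absent from tessdataDir.
     * usesCube is a hypothetical flag: pass true for languages that need Cube files.
     */
    public static List<String> missingFiles(File tessdataDir, String lang, boolean usesCube) {
        List<String> expected = new ArrayList<>();
        expected.add(lang + ".traineddata");
        if (usesCube) {
            for (String suffix : CUBE_SUFFIXES) {
                expected.add(lang + "." + suffix);
            }
        }

        // Keep only the expected names that do not exist as regular files.
        List<String> missing = new ArrayList<>();
        for (String name : expected) {
            if (!new File(tessdataDir, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

In the class above, this could be called as TessDataCheck.missingFiles(new File(DATA_PATH, "tessdata"), lang, lang.equals("ara")) just before baseApi.init, logging any missing names and skipping the call if the list is not empty.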
