How to use the pytesseract.pytesseract.run_tesseract function in pytesseract

To help you get started, we’ve selected a few pytesseract examples, based on popular ways it is used in public projects.


Source: konstantint/PassportEye, passporteye/util/ocr.py (GitHub)
    output_file_name = "%s.txt" % output_file_name_base
    try:
        # Prevent annoying warning about lossy conversion to uint8
        if str(img.dtype).startswith('float') and np.nanmin(img) >= 0 and np.nanmax(img) <= 1:
            img = img.astype(np.float64) * (np.power(2.0, 8) - 1) + 0.499999999
            img = img.astype(np.uint8)
        imwrite(input_file_name, img)

        if mrz_mode:
            # NB: Tesseract 4.0 does not seem to support tessedit_char_whitelist
            config = ("--psm 6 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789><"
                      " -c load_system_dawg=F -c load_freq_dawg=F {}").format(extra_cmdline_params)
        else:
            config = "{}".format(extra_cmdline_params)

        pytesseract.run_tesseract(input_file_name,
                                  output_file_name_base,
                                  'txt',
                                  lang=None,
                                  config=config)
        
        if sys.version_info.major == 3:
            f = open(output_file_name, encoding='utf-8')
        else:
            f = open(output_file_name)
        
        try:
            return f.read().strip()
        finally:
            f.close()
    finally:
        pytesseract.cleanup(input_file_name)
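
The excerpt above starts partway through a function, so its imports and the definitions of input_file_name, output_file_name_base, mrz_mode and extra_cmdline_params are not visible. The following is a minimal, self-contained sketch of the same pattern, not the PassportEye implementation. It assumes Python 3, that the image already exists on disk (so the numpy conversion and imwrite step are skipped), and that the installed pytesseract exposes run_tesseract in the pytesseract.pytesseract module as the excerpt uses it. The names build_config, ocr_image_file and MRZ_CHARS are hypothetical helpers introduced here for illustration.

```python
import os
import shutil
import tempfile

# Characters that can appear in a machine-readable zone (taken from the excerpt).
MRZ_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789><"


def build_config(mrz_mode=True, extra_cmdline_params=""):
    """Build the Tesseract config string the same way the excerpt does."""
    if mrz_mode:
        return ("--psm 6 -c tessedit_char_whitelist={}"
                " -c load_system_dawg=F -c load_freq_dawg=F {}").format(
                    MRZ_CHARS, extra_cmdline_params)
    return "{}".format(extra_cmdline_params)


def ocr_image_file(image_path, mrz_mode=True, extra_cmdline_params=""):
    """Run Tesseract on an existing image file and return the recognized text."""
    # Imported lazily so build_config() can be used without pytesseract installed.
    from pytesseract import pytesseract

    # Tesseract writes "<output_base>.txt"; keep it in a private temp directory.
    work_dir = tempfile.mkdtemp()
    output_base = os.path.join(work_dir, "ocr_output")
    try:
        pytesseract.run_tesseract(image_path,
                                  output_base,
                                  "txt",
                                  lang=None,
                                  config=build_config(mrz_mode,
                                                      extra_cmdline_params))
        with open(output_base + ".txt", encoding="utf-8") as f:
            return f.read().strip()
    finally:
        # Remove the temporary directory even if Tesseract fails.
        shutil.rmtree(work_dir, ignore_errors=True)
```

With the Tesseract binary installed, a call such as ocr_image_file("passport.png") (a hypothetical file name) would return the recognized text as a string. Keeping the config construction in its own function makes that part easy to check without running Tesseract at all.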