
[Python] Why can't pytesseract handle OSD mode?

Discussion in 'Python' started by Stack, October 8, 2024.

  1. Stack

    Stack Active Member

    I can't run OSD mode in pytesseract inside an Ubuntu-based Docker image. On Windows, this call works like a charm:

    pytesseract.image_to_osd(image)


    But inside the Docker image, it raises the following error. What I want to achieve is to read the rotation info using OSD.

    File "/usr/local/lib/python3.9/site-packages/pytesseract/pytesseract.py", line 263, in run_tesseract
      raise TesseractError(proc.returncode, get_errors(error_string))
    pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v5.0.0-alpha-20210401 with Leptonica UZN file /tmp/tess__cujlspf loaded. Estimating resolution as 169 UZN file /tmp/tess__cujlspf loaded. Warning. Invalid resolution 0 dpi. Using 70 instead. Too few characters. Skipping this page Error during processing.')


    Tesseract itself is installed correctly, because all other methods such as image_to_string work properly. The surprising thing is that when I run OSD directly from the terminal, it works:

    tesseract /images/1.jpg output --psm 0
    # cat output.osd
    Page number: 0
    Orientation in degrees: 0
    Rotate: 0
    Orientation confidence: 5.69
    Script: Cyrillic
    Script confidence: 0.10


    Is this a bug in pytesseract, or is there a workaround? The rotation info is not returned by any other Tesseract method, only by OSD. Many thanks.
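    One workaround that might be worth trying, sketched here under assumptions rather than confirmed as a fix: skip pytesseract for this one step, run the same terminal command that already works, and parse the resulting .osd file. The helper names parse_osd and osd_via_cli are hypothetical (they are not part of pytesseract), and the sketch assumes the tesseract binary is on PATH inside the container.

```python
import subprocess
import tempfile
from pathlib import Path


def parse_osd(text):
    """Turn the 'Key: value' lines of Tesseract OSD output into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'Key: value' pairs
        key, value = key.strip(), value.strip()
        # Convert numeric values where possible, keep everything else as text.
        try:
            result[key] = int(value)
        except ValueError:
            try:
                result[key] = float(value)
            except ValueError:
                result[key] = value
    return result


def osd_via_cli(image_path):
    """Hypothetical helper: run `tesseract <image> <base> --psm 0`
    and parse the <base>.osd file it writes."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "output"
        subprocess.run(
            ["tesseract", str(image_path), str(base), "--psm", "0"],
            check=True,           # raise CalledProcessError on a non-zero exit
            capture_output=True,  # keep Tesseract's banner out of the console
        )
        return parse_osd(Path(f"{base}.osd").read_text())
```

    With this in place, osd_via_cli("/images/1.jpg")["Rotate"] would give the rotation value that the terminal output above shows. It does not explain why image_to_osd fails in the container; it only sidesteps the wrapper for this one call.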

