OCR is identifying characters incorrectly #4473

@saikrishnagopal1227

Description

Current Behavior

After running OCR on an image, the output contains multiple character recognition errors. For example, "T" was recognized as "I", and several other characters were also misidentified.
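To make a report like this easier to act on, the misrecognized characters can be listed explicitly by comparing the OCR output against the expected text. The following is only a minimal sketch using Python's standard `difflib`; the function name `char_substitutions` and the sample strings are hypothetical and are not taken from the attached file.

```python
from difflib import SequenceMatcher


def char_substitutions(expected: str, actual: str) -> list[tuple[str, str]]:
    """Return (expected, actual) pairs for every span where the texts differ."""
    # autojunk=False keeps the comparison exact on longer inputs.
    matcher = SequenceMatcher(None, expected, actual, autojunk=False)
    return [
        (expected[i1:i2], actual[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical sample strings for illustration only.
expected_text = "TOTAL"
ocr_text = "IOTAL"
print(char_substitutions(expected_text, ocr_text))
```

Each pair shows what was expected and what the OCR produced in its place, so a list of such pairs from the real document would show which characters are affected.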

Expected Behavior

All characters should be identified correctly after OCR.

Suggested Fix

No response

tesseract -v

5.5.1

Operating System

Ubuntu 20.04

Other Operating System

No response

uname -a

No response

Compiler

No response

CPU

No response

Virtualization / Containers

No response

Other Information

I'm attaching the file that I used:

P1.PC.00000003-1 (1).pdf
