open source – Application for pdf deskew


Looking for a tool that can deskew pdf pages. The pages are scanned images of text. The text (or all of it I’m interested in) is in two columns.

This is prepatory for OCR. All the OCR tools I have tried get confused by the columns and put chunks of text in the wrong places. I tried cutting each page in half vertically, but the pages are all different angles.

I need something

  1. Which will work offline (preferably)

  2. Will not compromise the image quality on the deskewed image, so not recompress it (this is bad for OCR)

  3. Will take a document more than 200 pages long

  4. One with a GUI would be good

  5. FOSS

  6. Windows

I have a good OCR solution (PDFGear, uses AI help which is necessary with such an old image). So once I have them all at the right angle, I can divide up the left and the right columns on all the pages and we’re golden.

I had a look here: https://www.pdfgear.com/pdf-editor-reader/deskew-pdf.htm#part3 but all options listed are CLI, paid-for, or do the OCR without giving me the intermediate step I want, which is a de-skewed image.

Example page:

page from an old book

Example page below



Source link

Related Posts

About The Author

Add Comment