leo charre

how to deskew an image

OVERVIEW

First thing you need is a way to figure out what the skew angle is.

unpaper 0.2

I tried using ‘unpaper’ v0.2.

But no matter how much I fumbled with the settings, I could not get it to detect a skew.

pagetools 0.1

The next thing I tried was something called ‘pagetools’:
http://sourceforge.net/projects/pagetools/
The current version is 0.1 (as of this writing).

This version has no real installer per se: you unzip, untar, and then
run ‘make’ in the directory where you extracted it.
The following is an example of how you would go about doing this.
This may fail on your machine; use your head and don’t despair.
(As root.)

	cd ~/
	mkdir tmp
	cd tmp
	wget -O pagetools-0.1.tar.gz "http://downloads.sourceforge.net/pagetools/pagetools-0.1.tar.gz?use_mirror=internap"
	gunzip pagetools-0.1.tar.gz
	tar -xvf pagetools-0.1.tar
	make

And you will likely get an error about a missing pbm.h.
Why? Because this header is provided by another package.
I did a yum search:

	yum provides '*/pbm.h'

That got some results, and from them I chose what to install:

	yum install netpbm-devel
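
To check that the header is really there before running make again, something like this would do. It is only a sketch: have_header is a hypothetical helper name, and /usr/include as the default place to look is an assumption about your distro.

```shell
# have_header NAME [DIR]: succeed if a file called NAME exists somewhere under DIR.
# Hypothetical helper; DIR defaults to /usr/include, which is an assumption.
have_header() {
	name=$1
	dir=${2:-/usr/include}
	# find prints every match; one non-empty line is enough to say yes.
	[ -n "$(find "$dir" -name "$name" 2>/dev/null | head -n 1)" ]
}

if have_header pbm.h; then
	echo "pbm.h found, make should get past the include"
else
	echo "pbm.h missing, install netpbm-devel first"
fi
```

A second argument lets you point it at a different include directory if your headers live elsewhere.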

Great. Then I did a ‘make clean’ to wipe my previous make attempt:

	make clean

Now run make again:

	make

And.. and?? What happened?
If you look in pbm_findskew/pbm_findskew, that binary is what
was compiled for you.

Try the sucker out.

Imagine you have a skewed.png image
(assuming you have ImageMagick installed on your system):

	convert skewed.png skewed.pbm
	./pbm_findskew/pbm_findskew skewed.pbm

The output is something like ‘0.839234’.
Great. Now use that value to fix your original:

	mogrify -rotate "-0.839234" skewed.png

Why the minus? Because pbm_findskew tells you how many degrees counterclockwise you must
rotate to get it straight, and ImageMagick’s -rotate treats positive angles as clockwise.
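
If you capture the angle in a shell variable instead of typing it, the sign flip has to cope with a value that is already negative (a page tilted the other way), otherwise you end up passing something like ‘--0.5’. A minimal sketch, assuming pbm_findskew prints just the number with a leading minus for negative angles; negate_angle is a made-up helper name, not part of pagetools or ImageMagick.

```shell
# negate_angle ANGLE: flip the sign of a decimal angle given as text.
# Hypothetical helper; it works on the string, so no bc or awk is needed.
negate_angle() {
	case "$1" in
		-*) printf '%s\n' "${1#-}" ;;   # "-0.5" becomes "0.5"
		*)  printf '%s\n' "-$1" ;;      # "0.839234" becomes "-0.839234"
	esac
}

angle=0.839234
negate_angle "$angle"   # prints -0.839234
```

The result can then be handed to mogrify as -rotate "$(negate_angle "$angle")".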

Check it out.

	eog skewed.png

Conclusion

You could script this together pretty easily with perl/bash.
I was doing this originally to prep stuff for gocr. But I think the quality of the rotated
image is not as good for OCR reading as the original! Weird, but it makes sense, presumably because rotating by a fraction of a degree resamples every pixel and softens the edges of the text.
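
The perl/bash idea above could be sketched roughly like this. Treat it as a starting point under a few assumptions: pbm_findskew prints nothing but the angle, the binary sits at the path in FINDSKEW (override it if yours is elsewhere), and deskew_file is a hypothetical name. Like the mogrify call earlier, it overwrites the image in place, so work on copies.

```shell
#!/bin/sh
# Assumed location of the binary built by make; override via the environment.
FINDSKEW=${FINDSKEW:-./pbm_findskew/pbm_findskew}

# deskew_file IMAGE: measure the skew of IMAGE and rotate it in place.
deskew_file() {
	img=$1
	pbm=$(mktemp "${TMPDIR:-/tmp}/deskew.XXXXXX") || return 1

	# pbm_findskew reads PBM, so convert a temporary copy first.
	convert "$img" "pbm:$pbm" || { rm -f "$pbm"; return 1; }
	angle=$("$FINDSKEW" "$pbm")
	rm -f "$pbm"

	# Refuse anything that does not look like a plain decimal number.
	case "$angle" in
		''|*[!0-9.-]*) echo "unexpected output for $img: $angle" >&2; return 1 ;;
	esac

	# Flip the sign: the tool reports counterclockwise degrees,
	# while -rotate treats positive values as clockwise.
	case "$angle" in
		-*) angle=${angle#-} ;;
		*)  angle=-$angle ;;
	esac

	mogrify -rotate "$angle" "$img"
}

# Usage: for f in *.png; do deskew_file "$f"; done
```

Keeping the tool path in a variable means the same function works whether you run it from the build directory or copy the binary somewhere on your PATH.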

