
Pages tagged: OSS

Debian releases - do not freeze, rather do an RC

Dear Debian,

OSS ideology says: release early, release often.

Instead of doing "the freeze" I would go for RC releases. IMHO this would simplify the process a lot:

  • it is in sync with the release practice of most OSS projects out there
  • "testing" will not be blocked
  • no need for complicated measures, like blocking package migrations, submission of new packages, etc.
  • ...

Then, when you feel the RC is in a state for a new stable release, just "name it".

A temporary intermediate suite, RC, can be introduced between testing and stable:

  • experimental
  • unstable
  • testing
  • RC
  • stable
  • oldstable
  • ...

Posted in dir: /blog/
Tags: Debian OSS

Open source OCR sucks

Tonight I tried text recognition with various open source tools. The input was a set of images packaged as a PDF. The text in the images looked bad, but was readable.

To summarize my experience:

  • Too much reading
  • Too much hassle to convert between various input formats accepted by the tools
  • Totally unacceptable results
  • Even segmentation faults in some of the apps

None of the tools came even close to what I expected. Maybe it was my fault, but I cannot spend a day each time I need to do a simple job that I do not even do every month.

In the end I did the job by googling for "Online OCR" and using (guess what?!) http://www.onlineocr.net/ for the first five pages. It had a limit of five pages per hour for unregistered users (and five pages total for registered ones), so I registered and OCRed the sixth and last page.

BTW, just to prove my point about not having read enough: I later found this site, http://www.free-ocr.com/, which also did the job and uses one of the tools I had tried, Tesseract.
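
For what it is worth, the format-conversion hassle for the "images packaged as PDF" case can probably be reduced to two steps: rasterize each page, then feed the images to Tesseract. The sketch below is only that, a sketch. It assumes the `pdftoppm` tool from poppler-utils and the `tesseract` command-line program are installed (pdftoppm is my assumption here, it is not one of the tools from the attempt above), and the function names, the 300 dpi default, and the `page` prefix are made up for illustration. It says nothing about whether the recognition quality would have been acceptable on my scans.

```python
import glob
import os
import subprocess


def rasterize_command(pdf_path, prefix, dpi=300):
    """Build the pdftoppm call that writes one PNG per PDF page.

    Output files are named <prefix>-<page number>.png.
    """
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, prefix]


def ocr_command(image_path, lang="eng"):
    """Build the tesseract call for one page image.

    Tesseract takes an output base name and appends ".txt" itself.
    """
    output_base = os.path.splitext(image_path)[0]
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_pdf(pdf_path, prefix="page", lang="eng"):
    """Rasterize a PDF, OCR every page image, and return the joined text."""
    subprocess.run(rasterize_command(pdf_path, prefix), check=True)
    texts = []
    # Process the page images in sorted file-name order.
    for image in sorted(glob.glob(glob.escape(prefix) + "-*.png")):
        subprocess.run(ocr_command(image, lang), check=True)
        text_file = os.path.splitext(image)[0] + ".txt"
        with open(text_file, encoding="utf-8") as handle:
            texts.append(handle.read())
    return "\n".join(texts)
```

Usage would be something like `print(ocr_pdf("scan.pdf"))`, where `scan.pdf` is a placeholder file name. The command builders are kept separate from the part that runs them, so the exact command lines can be inspected without having either tool installed.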


Posted in dir: /blog/
Tags: linux ocr oss
