



ocropus - command line OCR tool
ocroscript <script> <arguments>
You can see a list of all available commands by looking in the $OCROSCRIPTS path (/usr/share/ocropus/scripts/ by default).

The 'recognize' script uses Tesseract for recognition and sends the HTML-based hOCR output to stdout. Tesseract is probably the most mature text recognizer within OCRopus at the moment. Natively, Tesseract doesn't do layout analysis, but combined with OCRopus, it makes for a pretty good OCR system:

    $ ocroscript recognize page.png > page.html

Here is a brief summary of the remaining commands available. You will need to look at each script to see what its command line arguments are:

degrade.lua
    Simple document image degradation.

hocr-to-text.lua
    Convert hOCR output to plain text.

line-clean.lua
    Given a line image, remove marginal noise and fix some other problems.

sauvola.lua
    Perform Sauvola thresholding.
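The two tasks described above (finding the available scripts and running 'recognize' on a page) can be wrapped in small shell helpers. This is only a sketch: the function names list_ocroscripts and recognize_pages are hypothetical and not part of OCRopus, and the only ocroscript invocation used is the `ocroscript recognize <image>` form shown in this page. It assumes the scripts are stored as *.lua files in the script directory.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of OCRopus.

# Print the script files found in $OCROSCRIPTS, falling back to the
# default path named in this manual page.
list_ocroscripts() {
    dir="${OCROSCRIPTS:-/usr/share/ocropus/scripts}"
    for f in "$dir"/*.lua; do
        # Skip the unexpanded pattern when the directory has no .lua files.
        [ -e "$f" ] || continue
        basename "$f"
    done
}

# Run the 'recognize' script on each image given as an argument and
# write the hOCR output next to it (page.png -> page.html).
# Assumes every image name has a file extension.
recognize_pages() {
    for img in "$@"; do
        out="${img%.*}.html"
        ocroscript recognize "$img" > "$out" || return 1
    done
}
```

After sourcing the file, `list_ocroscripts` prints one script name per line, and `recognize_pages page1.png page2.png` writes page1.html and page2.html. The loop stops at the first page that ocroscript fails on, so a partial output file for that page may remain.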


tesseract(1), <http://code.google.com/p/ocropus/w/list>


ocroscript was written by Thomas Breuel. This manual page was written by Jeffrey Ratcliffe <Jeffrey.Ratcliffe@gmail.com>, for the Debian project (but may be used by others).

June 06, 2008                                                      OCROPUS(1)
