ocr-table

This project aims to extract tables from scanned image PDFs using Optical Character Recognition.

Install Requirements

  1. Tesseract OCR

    sudo apt-get install tesseract-ocr
    
  2. ImageMagick

    sudo apt-get install imagemagick
    
  3. PDF Utilities

    sudo apt-get install poppler-utils
    
  4. Python packages

    sudo python3 -m pip install -r requirements.txt
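
A quick way to confirm the system packages above are installed is to check that their executables are on the PATH. The following is a minimal sketch and not part of this project. The executable names are assumptions: `convert` is the ImageMagick 6 command (ImageMagick 7 uses `magick`), and `pdftoppm` is one of the tools shipped in poppler-utils.

```python
import shutil

# Assumed executable name -> apt package that provides it.
REQUIRED_TOOLS = {
    "tesseract": "tesseract-ocr",
    "convert": "imagemagick",    # ImageMagick 7 installs "magick" instead
    "pdftoppm": "poppler-utils",
}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the package names whose executable is not found on PATH."""
    return sorted(pkg for exe, pkg in tools.items() if which(exe) is None)


# Usage:
#   missing = missing_tools()
#   print("missing packages:", ", ".join(missing) or "none")
```

The `which` parameter exists only so the lookup can be swapped out for testing; in normal use the default `shutil.which` is enough.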
    

Usage

  1. Clear the pdf/ folder, then copy the PDF files you want to scan into it.

  2. Run the OCR:

    python3 shellocr.py
    
  3. Once the process completes, the extracted text files are available in the txt/ folder.
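
For readers curious how a pdf/ to txt/ pipeline of this kind can work, here is a rough sketch. It is not the contents of shellocr.py, which may work differently. It assumes each page is rendered to a PNG with `pdftoppm` and then passed to the `tesseract` command line. The function names, the `pages` working folder, and the 300 DPI setting are all hypothetical, and any ImageMagick preprocessing is omitted.

```python
import subprocess
from pathlib import Path


def pdftoppm_command(pdf_path, image_prefix, dpi=300):
    """Build the command that renders every PDF page to <image_prefix>-<page>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(image_prefix)]


def tesseract_command(image_path, output_base):
    """Build the command that runs OCR on one image and writes <output_base>.txt."""
    return ["tesseract", str(image_path), str(output_base)]


def ocr_folder(pdf_dir="pdf", txt_dir="txt", work_dir="pages"):
    """Run OCR on every PDF in pdf_dir, writing one text file per page to txt_dir."""
    pdf_dir, txt_dir, work_dir = Path(pdf_dir), Path(txt_dir), Path(work_dir)
    txt_dir.mkdir(exist_ok=True)
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        # Keep each PDF's page images in their own folder to avoid name clashes.
        page_dir = work_dir / pdf.stem
        page_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(pdftoppm_command(pdf, page_dir / "page"), check=True)
        for image in sorted(page_dir.glob("page-*.png")):
            output_base = txt_dir / f"{pdf.stem}-{image.stem}"
            subprocess.run(tesseract_command(image, output_base), check=True)


# Usage (requires pdftoppm and tesseract on PATH):
#   ocr_folder()
```

Building the commands in separate functions keeps them easy to inspect or adjust, for example to pass a different resolution for low-quality scans.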

Alternate

  1. If the above doesn't work for you, try the alternate method.

  2. Save your file as input.pdf in the repository's root directory.

  3. Run:

    python3 pdf_miner.py 
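
The script name suggests the pdfminer library, although that is an assumption. If so, the alternate method would read text already embedded in the PDF rather than running OCR, which only helps when the file has a text layer. The sketch below is not the project's pdf_miner.py. It uses `extract_text` from the third-party pdfminer.six package, and the helper names are hypothetical.

```python
from pathlib import Path


def text_output_path(pdf_path):
    """Return the .txt path that sits next to the given PDF."""
    return Path(pdf_path).with_suffix(".txt")


def extract_embedded_text(pdf_path="input.pdf"):
    """Write the PDF's embedded text to a .txt file and return that path."""
    # Imported here so the helper above works without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    text = extract_text(str(pdf_path))
    output = text_output_path(pdf_path)
    output.write_text(text, encoding="utf-8")
    return output


# Usage (requires: pip install pdfminer.six):
#   print(extract_embedded_text("input.pdf"))
```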
    