Awesome Open Source
The Top 25 Extraction Open Source Projects
Mtail
⭐
2,660
Extract whitebox monitoring data from application logs for collection in a timeseries database
Parsr
⭐
2,563
Transforms PDF, Documents and Images into Enriched Structured Data
Aubio
⭐
2,024
A library for audio and music analysis
Adversarial Robustness Toolbox
⭐
1,962
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference
Textract
⭐
1,348
node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!
Tika Python
⭐
970
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Garbro
⭐
721
Visual Novels resource browser
Stanford Openie Python
⭐
331
Stanford Open Information Extraction made simple!
Uritemplate
⭐
308
PHP URI Template (RFC 6570) supports both URI expansion & extraction
Unrpa
⭐
301
A program to extract files from the RPA archive format.
Survivcheatinjector
⭐
198
An actual, updated, surviv.io cheat. Works great and we reply fast.
Youtubeextractor
⭐
170
A helper to extract metadata, including streaming video URLs, from a YouTube video
Jarchivelib
⭐
159
A simple archiving and compression library for Java
Autolink Java
⭐
155
Java library to extract links (URLs, email addresses) from plain text; fast, small and smart
Bit7z
⭐
154
A C++ static library offering a clean and simple interface to the 7-zip DLLs.
Xioc
⭐
145
Extract indicators of compromise from text, including "escaped" ones.
Android Otp Extractor
⭐
141
Extracts OTP tokens from rooted Android devices
Full Text Rss
⭐
132
Full-Text RSS can transform partial feeds to deliver the full content stripped of clutter and ads
Ie Survey
⭐
115
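A survey of the information extraction field by Zhang Richong's research team at Beihang University's Advanced Innovation Center for Big Data. It covers subtasks including entity recognition, relation extraction, and attribute extraction, and surveys both academia and industry for each subtask.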
Florentino
⭐
90
Fast Static File Analysis Framework
Email Extractor
⭐
77
The main functionality is to extract all email addresses from one or several URLs
Stegextract
⭐
76
Detect hidden files and text in images
Locky
⭐
61
Ppe
⭐
16
Probabilistic plane extraction
Puree
⭐
8
Metadata extraction from the Pure Research Information System.