Awesome Open Source
Search results for extraction
4,866 search results found
Newspaper
⭐
13,147
News, full-text, and article metadata extraction in Python 3.
Warp
⭐
8,841
A super-easy, composable, web server framework for warp speeds.
Sm64
⭐
7,163
A Super Mario 64 decompilation, brought to you by a bunch of clever folks.
Archwsl
⭐
6,039
Arch Linux-based WSL distribution. Supports multiple installs.
Nlp.js
⭐
5,944
An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more
Pyaudioanalysis
⭐
5,453
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Parsr
⭐
5,423
Transforms PDF, Documents and Images into Enriched Structured Data
Pdfminer.six
⭐
5,112
Community maintained fork of pdfminer - we fathom PDF
Blind_watermark
⭐
4,875
Blind & Invisible Watermark: blind image watermarking; the watermark can be extracted without the original image!
Pdfminer
⭐
4,864
Python PDF Parser (Not actively maintained). Check out pdfminer.six.
Vibrant.js
⭐
4,484
Extract prominent colors from an image. JS port of Android's Palette.
Adversarial Robustness Toolbox
⭐
4,420
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Video Subtitle Extractor
⭐
4,267
GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. Deep-learning-based; text recognition runs locally, with no third-party API required.
Swiftsoup
⭐
4,203
SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jquery (Supports Linux, iOS, Mac, tvOS, watchOS)
Archiver
⭐
4,170
Easily create & extract archives, and compress & decompress files of various formats
Snips Nlu
⭐
3,796
Snips Python library to extract meaning from text
Python Goose
⭐
3,741
HTML Content / Article Extractor, web scraping lib in Python
Uefitool
⭐
3,726
UEFI firmware image viewer and editor
Textract
⭐
3,699
extract text from any document. no muss. no fuss.
Mtail
⭐
3,679
extract internal monitoring data from application logs for collection in a timeseries database
React Docgen
⭐
3,513
A CLI and library to extract information from React component files for documentation generation purposes.
Wikiextractor
⭐
3,440
A tool for extracting plain text from Wikipedia dumps
Camelot
⭐
3,376
Camelot: PDF Table Extraction for Humans
Feedbin
⭐
3,332
A nice place to read on the web.
Knowledgegraphcourse
⭐
3,319
Southeast University graduate course on Knowledge Graphs
Zotfile
⭐
3,109
Zotero plugin to manage your attachments: automatically rename, move, and attach PDFs (or other files) to Zotero items, sync PDFs from your Zotero library to your (mobile) PDF reader (e.g. an iPad, Android tablet, etc.), and extract PDF annotations.
Aubio
⭐
3,082
a library for audio and music analysis
Keybert
⭐
3,047
Minimal keyword extraction with BERT
Uniextract2
⭐
3,016
Universal Extractor 2 is a tool to extract files from any type of archive or installer.
Pdfsam
⭐
2,914
PDFsam, a desktop application to split, merge, mix, rotate PDF files and extract pages
Mitie
⭐
2,778
MITIE: library and tools for information extraction
Grobid
⭐
2,749
A machine learning software for extracting information from scholarly documents
Jionlp
⭐
2,724
A Chinese NLP preprocessing and parsing package: accurate, efficient, and easy to use. www.jionlp.com
Property Access
⭐
2,704
Provides functions to read and write from/to an object or array using a simple string notation
Oletools
⭐
2,665
oletools - python tools to analyze MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
Ios Artwork Extractor
⭐
2,642
Extract iOS artwork and emoji symbols into png files, generate glossy buttons png files
Jailer
⭐
2,623
Database Subsetting and Relational Data Browsing Tool.
Camelot
⭐
2,495
A Python library to extract tabular data from PDFs
Disunity
⭐
2,425
An experimental toolset for Unity asset and asset bundle files.
Webplotdigitizer
⭐
2,375
Online tool to extract numerical data from plot images.
Invoicenet
⭐
2,297
Deep neural network to extract intelligent information from invoice documents.
Keras Bert
⭐
2,281
Implementation of BERT that could load official pre-trained models for feature extraction and prediction
Awesome Oneliner Bugbounty
⭐
2,201
A collection of awesome one-liner scripts especially for bug bounty tips.
Pdfparser
⭐
2,191
PdfParser, a standalone PHP library, provides various tools to extract data from a PDF file.
Html To React Components
⭐
2,101
Converts HTML pages into React components
Property Info
⭐
2,095
Extracts information about PHP class' properties using metadata of popular sources
Information Extraction Chinese
⭐
2,086
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT
Pythonvscode
⭐
2,066
This extension is now maintained in the Microsoft fork.
Assetcatalogtinkerer
⭐
2,059
An app that lets you open .car files and browse/extract their images.
Earth Reverse Engineering
⭐
2,008
Reversing Google's 3D satellite mode
Tika
⭐
2,007
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Tabula Py
⭐
1,986
Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame
Unblob
⭐
1,964
Extract files from any kind of container formats
Garbro
⭐
1,944
Visual Novels resource browser
News Please
⭐
1,821
news-please - an integrated web crawler and information extractor for news that just works
Lsassy
⭐
1,811
Extract credentials from lsass remotely
Awesome_time_series_in_python
⭐
1,811
This curated list contains python packages for time series analysis
Ios Images Extractor
⭐
1,738
A Mac app to decode and extract images from iOS apps, support png/jpg/ipa/Assets.car files.
Utinyripper
⭐
1,731
GUI and API library to work with Engine assets, serialized and bundle files
Php Font Lib
⭐
1,689
A library to read, parse, export and make subsets of different types of font files.
Scrapely
⭐
1,668
A pure-python HTML screen-scraping library
Tabula Java
⭐
1,603
Extract tables from PDF files
Bonobo
⭐
1,548
Extract Transform Load for Python 3.5+
Goose
⭐
1,529
Html Content / Article Extractor in Scala - open sourced from Gravity Labs
Yake
⭐
1,522
Single-document unsupervised keyword extraction
Textract
⭐
1,487
node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!
Toutatis
⭐
1,456
Toutatis is a tool that allows you to extract information from Instagram accounts, such as e-mails, phone numbers, and more
Vscode Glean
⭐
1,435
The extension provides refactoring tools for your React codebase
Pke
⭐
1,431
Python Keyphrase Extraction module
Pdflayouttextstripper
⭐
1,390
Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).
Awesome Embedded And Iot Security
⭐
1,362
A curated list of awesome embedded and IoT security resources.
Zt Zip
⭐
1,353
ZeroTurnaround ZIP Library
Treat
⭐
1,347
Natural language processing framework for Ruby.
Excalibur
⭐
1,319
A web interface to extract tabular data from PDFs
Tika Python
⭐
1,316
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Article Extractor
⭐
1,297
To extract main article from given URL with Node.js
Wombat
⭐
1,297
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
Zip
⭐
1,275
A portable, simple zip library written in C
Talon
⭐
1,231
Flume
⭐
1,228
Extract logic from your apps with a user-friendly node editor powered by React.
Kerberoast
⭐
1,147
Download
⭐
1,147
Download and extract files
Psenet
⭐
1,142
Official Pytorch implementations of PSENet.
Xurls
⭐
1,105
Extract urls from text
Swiftinfo
⭐
1,095
📊 Extract and analyze the evolution of an iOS app's code.
Dlt
⭐
1,069
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
Depthy
⭐
1,064
Extract depth map and original from photos made with Google Camera's Lens Blur.
Saliencheat
⭐
1,019
👽 Cheating Salien minigame, the proper way
Parsel
⭐
1,010
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Pdfx
⭐
986
Extract text, metadata and references (pdf, url, doi, arxiv) from PDF. Optionally download all referenced PDFs.
Web Scraper Chrome Extension
⭐
980
Web data extraction tool implemented as chrome extension
Five Video Classification Methods
⭐
960
Code that accompanies my blog post outlining five video classification methods in Keras and TensorFlow
Redux React Router Async Example
⭐
956
A showcase of the Redux architecture with React Router
Mlscraper
⭐
935
🤖 Scrape data from HTML websites automatically by just providing examples
Dureader
⭐
924
Baseline Systems of DuReader Dataset
Hactool
⭐
906
hactool is a tool to view information about, decrypt, and extract common file formats for the Nintendo Switch, especially Nintendo Content Archives.
Ruby Readability
⭐
902
Port of arc90's readability project to Ruby
Andriller
⭐
899
📱 Andriller is a software utility with a collection of forensic tools for smartphones. It performs read-only, forensically sound, non-destructive acquisition from Android devices.
Jcseg
⭐
886
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, plus keyword extraction, key sentence extraction, and summary extraction based on the TEXTRANK algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.
Fastannotationtool
⭐
877
A tool using OpenCV to annotate images for image classification, optical character reading, ...
1-100 of 4,866 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.