Project Name | Description | Stars | Downloads / Repos / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|
Hacker News Digest | :newspaper: Let ChatGPT Summarize Hacker News for You | 620 | | 3 months ago | | | 9 | lgpl-3.0 | Python |
Awesome Scrapy | A curated list of awesome packages, articles, and other cool resources from the Scrapy community. | 450 | | a year ago | | | 2 | | |
Html2article | HTML web page main-content extraction. | 425 | 1 | 7 years ago | 5 | July 11, 2013 | 6 | other | C# |
Node Readability | Scrape/crawl articles from any site automatically. Make any web page readable, whether Chinese or English. | 302 | 10, 4 | 6 years ago | 67 | August 01, 2018 | 9 | | JavaScript |
Javacrawling | "奇伢爬虫" (Qiya Crawler) is built on Spring Boot and WebMagic. It crawls articles from WeChat official accounts, news sites, CSDN, info, and similar sites. Crawling rules and cleaning rules can be set dynamically, so it can crawl articles from most websites. | 252 | | 7 years ago | | | 10 | | Java |
Article_spider | WeChat official account crawler. | 187 | | 6 years ago | | | 4 | | TypeScript |
Wechatpubspider | WeChat spiders: a crawler for WeChat official accounts. | 107 | | 2 years ago | | | 2 | | Python |
Gospider | Some small projects and some articles. | 55 | | 2 years ago | | | 1 | mit | Jupyter Notebook |
Risjbot | A Scrapy project to extract the text and metadata of articles from news websites. | 50 | | 3 years ago | | | 4 | | Python |
Spider2 | A second-generation spider that crawls any article site and automatically reads the title and article. | 42 | 1 | 8 years ago | 6 | December 19, 2015 | 2 | | JavaScript |