Magic_google

A Google search results crawler: get the Google search results that you need.

magic_google

1. What is magic_google?

magic_google is a simple Google search crawler that lets you extract whatever you need from the results page.

While crawling, watch out for Google's per-IP rate limits and the exceptions they trigger. I suggest pausing between requests and using your own proxy IPs.

PHP version: MagicGoogle

2. How to use

Run

pip install magic_google
# Or
pip install git+https://github.com/howie6879/magic_google.git
# Or
git clone https://github.com/howie6879/magic_google.git
cd magic_google
vim google_search.py
# Or 
python setup.py install

Example

from magic_google import MagicGoogle
import pprint

# Or PROXIES = None
PROXIES = [{
    'http': 'http://192.168.2.207:1080',
    'https': 'http://192.168.2.207:1080'
}]

# Or MagicGoogle()
mg = MagicGoogle(PROXIES)

# Crawl the whole results page
result = mg.search_page(query='python')

# Crawl the result URLs only
for url in mg.search_url(query='python'):
    pprint.pprint(url)
    
# Output
# 'https://www.python.org/'
# 'https://www.python.org/downloads/'
# 'https://www.python.org/about/gettingstarted/'
# 'https://docs.python.org/2/tutorial/'
# 'https://docs.python.org/'
# 'https://en.wikipedia.org/wiki/Python_(programming_language)'
# 'https://www.codecademy.com/courses/introduction-to-python-6WeG3/0?curriculum_id=4f89dab3d788890003000096'
# 'https://www.codecademy.com/learn/python'
# 'https://developers.google.com/edu/python/'
# 'https://learnpythonthehardway.org/book/'
# 'https://www.continuum.io/downloads'

# Get {'title','url','text'}
for i in mg.search(query='python', num=1):
    pprint.pprint(i)
    
# Output
# {'text': 'The official home of the Python Programming Language.',
# 'title': 'Welcome to Python .org',
# 'url': 'https://www.python.org/'}

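See google_search.py for the full example.

If you need to run a large number of queries from a single IP address, I suggest waiting 5 to 30 seconds between requests.

If a search always returns empty results, the likely reason is that Google has flagged your IP address and is answering with a redirect to its "sorry" page, like this: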
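The delay suggested above could be sketched roughly as follows. This is only an illustration: `throttled`, `min_delay`, `max_delay` and the injectable `sleep` argument are hypothetical names and are not part of magic_google.

```python
import random
import time


def throttled(queries, min_delay=5.0, max_delay=30.0, sleep=time.sleep):
    """Yield each query, pausing a random interval between consecutive ones.

    Hypothetical helper, not part of magic_google. The 5-30 second default
    range follows the suggestion in this README.
    """
    for index, query in enumerate(queries):
        if index > 0:
            # Randomize the pause so requests do not arrive on a fixed schedule.
            sleep(random.uniform(min_delay, max_delay))
        yield query


# Possible usage, assuming `mg` is the MagicGoogle instance from the example:
#
# for query in throttled(['python', 'golang']):
#     for url in mg.search_url(query=query):
#         print(url)
```

No pause is inserted before the first query, so a single search runs immediately.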

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://ipv4.google.com/sorry/index?continue=https://www.google.me/s****">here</A>.
</BODY></HTML>
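If you fetch pages yourself, a response like the one above can be recognized with a small heuristic. The sketch below is an assumption-laden illustration: `looks_blocked` is a hypothetical helper, not a magic_google function, and it only checks for the "302 Moved" body and the `/sorry/` path shown in the sample.

```python
def looks_blocked(status_code, location="", body=""):
    """Return True if a response resembles Google's "sorry" redirect.

    Hypothetical helper, not part of magic_google. It checks either the
    redirect target (Location header) or the HTML body of the 302 page.
    """
    if status_code == 302 and "/sorry/" in location:
        return True
    # Fall back to the body text when the Location header is unavailable.
    return "302 Moved" in body and "/sorry/index" in body
```

When this returns True, pausing for longer or switching to another proxy is usually more useful than retrying immediately.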