Amundsensearchlibrary

Search service library for Amundsen

Deprecated: please visit https://github.com/amundsen-io/amundsen/tree/main/search

The Amundsen project moved to a monorepo. This repository will be kept up temporarily to allow users to transition gracefully, but new PRs won't be accepted.

Amundsen Search Service


Amundsen Search service serves a RESTful API and is responsible for searching metadata. The service leverages Elasticsearch for most of its search capabilities.

For information about Amundsen and our other services, visit the main repository README.md. Please also see our instructions for a quick start setup of Amundsen with dummy data, and an overview of the architecture.

Requirements

  • Python >= 3.6
  • Elasticsearch 6.x (7.x is not currently supported)

Doc

Instructions to start the Search service from distribution

$ venv_path=[path_for_virtual_environment]
$ python3 -m venv $venv_path
$ source $venv_path/bin/activate
$ pip3 install amundsen-search
$ python3 search_service/search_wsgi.py

# In a different terminal, verify the service is up by running
$ curl -v http://localhost:5001/healthcheck
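
The same check can be scripted, for example in a deployment or smoke test. The following is a minimal sketch using only the Python standard library; it assumes the healthcheck endpoint answers with HTTP 200 when the service is up, and the function names are illustrative rather than part of the service.

```python
import urllib.error
import urllib.request


def healthcheck_url(host: str = "localhost", port: int = 5001) -> str:
    """Build the healthcheck URL used by the curl command."""
    return f"http://{host}:{port}/healthcheck"


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = healthcheck_url()
    print(f"{url} healthy: {is_healthy(url)}")
```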

Instructions to start the Search service from source

$ git clone https://github.com/amundsen-io/amundsensearchlibrary.git
$ cd amundsensearchlibrary
$ venv_path=[path_for_virtual_environment]
$ python3 -m venv $venv_path
$ source $venv_path/bin/activate
$ pip3 install -r requirements.txt
$ python3 setup.py install
$ python3 search_service/search_wsgi.py

# In a different terminal, verify the service is up by running
$ curl -v http://localhost:5001/healthcheck

Instructions to start the service from Docker

$ docker pull amundsendev/amundsen-search:latest
$ docker run -p 5001:5001 amundsendev/amundsen-search
# Alternatively, for a production environment, run the container with Gunicorn (see Production environment below):
# $ docker run -p 5001:5001 amundsendev/amundsen-search gunicorn --bind 0.0.0.0:5001 search_service.search_wsgi

# In a different terminal, verify the service is up by running
$ curl -v http://localhost:5001/healthcheck

Production environment

By default, Flask comes with a Werkzeug web server, which is intended for development. For production environments, use a production-grade web server such as Gunicorn.

$ pip3 install gunicorn
$ gunicorn search_service.search_wsgi

# In a different terminal, verify the service is up by running
$ curl -v http://localhost:8000/healthcheck

For more information see the Gunicorn configuration documentation.
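
Gunicorn settings can also be kept in a Python configuration file instead of command-line flags. The sketch below is one possible starting point, not a recommended production setup: the file name gunicorn.conf.py, the choice of port 5001 (to match the other examples in this README), and the worker count are all assumptions. It would be passed with `gunicorn -c gunicorn.conf.py search_service.search_wsgi`.

```python
# gunicorn.conf.py (illustrative file name)
import multiprocessing

# Listen on all interfaces, on the port the other examples in this README use.
# Without this setting, Gunicorn binds to 127.0.0.1:8000.
bind = "0.0.0.0:5001"

# A common starting point for the number of worker processes; tune per host.
workers = multiprocessing.cpu_count() * 2 + 1

# Restart workers that stay silent for longer than this many seconds.
timeout = 30

# Write the access log to stdout ("-").
accesslog = "-"
```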

Configuration outside local environment

By default, the Search service uses LocalConfig, which looks for Elasticsearch running on localhost. To use a different endpoint, create a Config suited to your use case. Once the config class exists, reference it through the environment variable SEARCH_SVC_CONFIG_MODULE_CLASS.

For example, to use a different config in production, subclass Config to create a production config class and pass that class in the environment variable. If the class is named ProdConfig and lives in the search_service.config module, set the variable as follows:

SEARCH_SVC_CONFIG_MODULE_CLASS=search_service.config.ProdConfig

This way, the Search service uses the production config in the production environment. For more information on how the configuration is loaded and used, see the reference in the Flask documentation.
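
To make the mechanism concrete, here is a minimal, self-contained sketch of a production config class and of how a "module.ClassName" string can be resolved to a class. It is not the service's actual loading code: the Config class here is a stand-in, the ELASTICSEARCH_ENDPOINT attribute and the example host name are hypothetical, and load_config_class and configured_class_path are illustrative helpers.

```python
import importlib
import os


class Config:
    """Stand-in for the service's base Config class; the attribute is hypothetical."""
    ELASTICSEARCH_ENDPOINT = "http://localhost:9200"


class ProdConfig(Config):
    """Production settings: read the endpoint from the environment."""
    ELASTICSEARCH_ENDPOINT = os.environ.get(
        "ELASTICSEARCH_ENDPOINT", "http://elasticsearch.example.internal:9200"
    )


def load_config_class(dotted_path: str) -> type:
    """Resolve a 'package.module.ClassName' string to the class it names."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def configured_class_path(default: str) -> str:
    """Return the class path from SEARCH_SVC_CONFIG_MODULE_CLASS, or the default."""
    return os.environ.get("SEARCH_SVC_CONFIG_MODULE_CLASS") or default
```

With SEARCH_SVC_CONFIG_MODULE_CLASS=search_service.config.ProdConfig exported, configured_class_path returns that string, and load_config_class would import search_service.config and return its ProdConfig class.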

Developer guide

Code style

API documentation

We have Swagger documentation set up with OpenAPI 3.0.2. The documentation is generated via Flasgger. When adding or updating an API, please make sure to update the documentation. To see it, run the application locally and go to localhost:5001/apidocs/. Currently the documentation only works with the local configuration.

Code structure

Amundsen Search service consists of three packages: API, Models, and Proxy.

API package

A package that contains the Flask-RESTful resources that serve RESTful API requests. The API routes are also registered here.

Proxy package

The Proxy package contains proxy modules that talk to the dependencies of the Search service. There are currently two modules in the Proxy package, Elasticsearch and Statsd.

Elasticsearch proxy module

The Elasticsearch proxy module serves the various use cases for searching metadata in Elasticsearch. It builds each search with Query DSL, executes the query, and transforms the results into models.
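
The build, execute, and transform flow can be sketched as follows. This is an illustration under assumptions, not the module's actual code: the TableHit model, the field names name and description, and the boost factor are hypothetical. The execute step is left out so the example runs without a cluster.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class TableHit:
    """Hypothetical model for one search hit."""
    name: str
    description: str


def build_query(term: str, page_index: int = 0, page_size: int = 10) -> Dict[str, Any]:
    """Build a Query DSL body matching `term` against two hypothetical fields."""
    return {
        "from": page_index * page_size,  # offset of the first hit to return
        "size": page_size,               # number of hits per page
        "query": {
            "multi_match": {
                "query": term,
                "fields": ["name^3", "description"],  # boost matches on name
            }
        },
    }


def to_models(response: Dict[str, Any]) -> List[TableHit]:
    """Turn the hits of an Elasticsearch search response into TableHit objects."""
    return [
        TableHit(
            name=hit["_source"]["name"],
            description=hit["_source"].get("description", ""),
        )
        for hit in response["hits"]["hits"]
    ]
```

In a real proxy, the body returned by build_query would be sent through an Elasticsearch client, and the response dictionary it returns would be handed to to_models.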

Atlas proxy module

Apache Atlas proxy module uses Atlas to serve the Atlas requests. At the moment the Basic Search REST API is used via the Python Client.

Statsd utilities module

The Statsd utilities module has functions that publish metrics through statsd. By default, statsd integration is disabled; you can turn it on from the Search service configuration. Statsd-specific settings can be configured through environment variables.
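
For readers unfamiliar with statsd, the sketch below shows what publishing a metric amounts to: a short text line sent over UDP. It is not the module's implementation. The function names and the enabled flag (standing in for the configuration switch mentioned above) are hypothetical; the line format and UDP port 8125 are the usual StatsD defaults.

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Encode a counter increment in the StatsD line format: <name>:<value>|c."""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(name: str, enabled: bool = False,
                 host: str = "localhost", port: int = 8125) -> bool:
    """Send one counter increment over UDP; do nothing when disabled.

    Returns True if a datagram was sent, False if metrics are disabled.
    """
    if not enabled:
        return False
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name), (host, port))
    return True
```

Because UDP is fire-and-forget, a disabled or unreachable statsd daemon does not slow down or fail the request that emits the metric.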

Models package

The Models package contains many modules, each holding several Python classes. These classes serve as schemas and data holders. All data exchange within the Amundsen Search service uses classes in Models to ensure validity and to improve readability and maintainability.
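
As an illustration of a class acting as both schema and data holder, here is a small sketch built on standard-library dataclasses. The class and field names are hypothetical and the real models may be defined with a different library; the point is only that validation lives with the data.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Table:
    """Hypothetical model describing one table in a search result."""
    name: str
    cluster: str


@dataclass
class SearchResult:
    """Hypothetical model for one page of results plus the overall match count."""
    total_results: int
    results: List[Table] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject inconsistent data at construction time.
        if self.total_results < 0:
            raise ValueError("total_results must be non-negative")
        if len(self.results) > self.total_results:
            raise ValueError("results cannot exceed total_results")


if __name__ == "__main__":
    page = SearchResult(total_results=1, results=[Table(name="orders", cluster="gold")])
    print(asdict(page))  # plain dictionaries, ready to serialize as JSON
```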
