Amundsengremlin


amundsengremlin


Amundsen Gremlin contains code to use AWS Neptune as the graph backend for Amundsen. Specifically, it uploads two CSV files, one for vertices and one for edges, to an S3 bucket, then tells the Neptune bulk loader to import them into the graph database. To prevent duplicate vertices and edges, we specify the key of each one explicitly.
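
As a rough illustration of the two files described above, the sketch below renders vertices and edges in the Neptune Gremlin bulk-load CSV layout (system columns `~id`, `~label`, `~from`, `~to`). It is not the code this package uses: the function names, the `name:String` property column, and the `from->label->to` edge-key scheme are assumptions chosen for the example. The point is only that an explicit, deterministic key lets repeated rows collapse into one vertex or edge.

```python
import csv
import io
from typing import Iterable, Tuple


def vertices_to_csv(vertices: Iterable[Tuple[str, str, str]]) -> str:
    """Render (key, label, name) triples as a bulk-load vertex CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["~id", "~label", "name:String"])
    seen = set()
    for key, label, name in vertices:
        if key in seen:  # same key means same vertex, so emit it once
            continue
        seen.add(key)
        writer.writerow([key, label, name])
    return buf.getvalue()


def edges_to_csv(edges: Iterable[Tuple[str, str, str]]) -> str:
    """Render (from_key, label, to_key) triples as a bulk-load edge CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["~id", "~from", "~to", "~label"])
    seen = set()
    for from_key, label, to_key in edges:
        # Hypothetical key scheme: derive the edge key from its endpoints
        # and label so the same edge always maps to the same ~id.
        key = f"{from_key}->{label}->{to_key}"
        if key in seen:
            continue
        seen.add(key)
        writer.writerow([key, from_key, to_key, label])
    return buf.getvalue()
```

In this sketch, passing the same vertex or edge twice yields a single CSV row, which mirrors the role the keys play when the bulk loader imports the files.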

Requirements

Amundsen Gremlin works with Python 3.6, except for async_consume_in_chunks, which relies on asyncio functionality introduced in Python 3.7.

Prerequisites include a configured Neptune instance and an S3 bucket.

Example Code

This can be used by databuilder jobs to load data into the graph. Example code for batching:

    def load_tables(self, *, table_data: Iterable[Table], batch_size: int = 200000,
                    batch_metric: LoadTablesBatchMetric = LoadTablesBatchMetric.NUMBER_OF_NODES) -> int:
        """
        lazily loads Tables in chunks of batch_size
        :param table_data: the Iterable (possibly a Generator or stream) of Tables
        :param batch_size: the maximum chunk size to process, or <= 0 if process all at once
        :param batch_metric: what metric to count for chunks?  number of tables or number of nodes?
        """
        return consume_in_chunks(stream=table_data, n=batch_size, metric=batch_metric.value,
                                 consumer=self._load_some_tables)

    async def async_load_tables(self, *, table_data: AsyncIterator[Table], batch_size: int = 5000) -> int:
        """
        lazily loads Tables in chunks of batch_size
        """
        return await async_consume_in_chunks(stream=table_data, n=batch_size, consumer=self._load_some_tables)

    def _load_some_tables(self, data: Iterable[Table]) -> None:
        _data = list(data)
        entities = GetGraph.table_entities(table_data=_data, g=self.neptune_graph_traversal_source_factory())
        self.neptune_bulk_loader_api.bulk_load_entities(entities=entities)
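
The batching helpers called in the example are not shown in this README. To make the idea concrete, here is a simplified sketch of what chunked consumption could look like. The `_sketch` function names are hypothetical, the `metric` argument is left out (every item counts as 1), and the return value is assumed to be the number of items consumed. The real `consume_in_chunks` and `async_consume_in_chunks` may behave differently.

```python
import asyncio
from typing import AsyncIterator, Callable, Iterable, List, TypeVar

T = TypeVar("T")


def consume_in_chunks_sketch(stream: Iterable[T], n: int,
                             consumer: Callable[[List[T]], None]) -> int:
    """Feed `stream` to `consumer` in lists of at most `n` items."""
    total = 0
    chunk: List[T] = []
    for item in stream:
        chunk.append(item)
        if n > 0 and len(chunk) >= n:
            consumer(chunk)
            total += len(chunk)
            chunk = []  # start a fresh list; the consumer keeps the old one
    if chunk:  # final partial chunk, or everything when n <= 0
        consumer(chunk)
        total += len(chunk)
    return total


async def async_consume_in_chunks_sketch(stream: AsyncIterator[T], n: int,
                                         consumer: Callable[[List[T]], None]) -> int:
    """Same idea for an async stream; the consumer itself is synchronous here."""
    total = 0
    chunk: List[T] = []
    async for item in stream:
        chunk.append(item)
        if n > 0 and len(chunk) >= n:
            consumer(chunk)
            total += len(chunk)
            chunk = []
    if chunk:
        consumer(chunk)
        total += len(chunk)
    return total
```

With this sketch, consuming five items with `n=2` calls the consumer three times (two full chunks and one partial chunk), and only one chunk is held in memory at a time, which is what makes lazy loading from a generator practical. The async variant can be driven with `asyncio.run`, which itself requires Python 3.7.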

AWS Configuration Guide

Coming Soon...

Instructions to configure venv

Virtual environments for Python are a convenient way to avoid dependency conflicts. The venv module built into Python 3 is recommended for ease of use, but any managed virtual environment will do. To set up venv in this repo:

$ venv_path=[path_for_virtual_environment]
$ python3 -m venv $venv_path
$ source $venv_path/bin/activate
$ pip install -r requirements.txt

If something goes wrong, you can always delete the environment and start over:

$ rm -rf $venv_path

Roundtrip tests

The roundtrip tests hit the Neptune backend directly, which requires a valid Neptune configuration. Because the amundsen-gremlin CI does not currently have AWS configured, these tests do not run by default.

To run the roundtrip tests:

$ python -m pytest --roundtrip .
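
For readers unfamiliar with opt-in test flags, the sketch below shows one common pytest pattern for a flag like `--roundtrip`: a `conftest.py` that registers the option and skips marked tests unless it is given. This is an assumption about how such a flag can be wired up, not a copy of this repository's configuration; the `roundtrip` marker name and the skip message are hypothetical.

```python
# conftest.py (hypothetical sketch, not this repository's actual file)


def pytest_addoption(parser):
    """Register the --roundtrip command-line flag."""
    parser.addoption("--roundtrip", action="store_true", default=False,
                     help="also run tests that need a live Neptune backend")


def pytest_configure(config):
    """Declare the marker so pytest does not warn about an unknown mark."""
    config.addinivalue_line("markers", "roundtrip: test needs a live Neptune backend")


def pytest_collection_modifyitems(config, items):
    """Skip tests marked `roundtrip` unless --roundtrip was passed."""
    if config.getoption("--roundtrip"):
        return  # flag given: run everything
    import pytest  # local import: only needed when tests must be skipped
    skip = pytest.mark.skip(reason="needs --roundtrip and a configured Neptune instance")
    for item in items:
        if "roundtrip" in item.keywords:
            item.add_marker(skip)
```

Under this pattern, a test decorated with `@pytest.mark.roundtrip` is reported as skipped in a plain `pytest` run and executes only when the flag is present.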