Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Datakit | 999 | a year ago | 34 | apache-2.0 | OCaml | |||||
Connect processes into powerful data pipelines with a simple git-like filesystem interface | ||||||||||
Nodestream | 69 | 8 | 4 | 3 years ago | 11 | February 11, 2017 | 7 | bsd-3-clause | JavaScript | |
Storage-agnostic streaming library for binary data transfers | ||||||||||
Igneous | 36 | 2 days ago | 41 | December 13, 2022 | 15 | gpl-3.0 | Python | |||
Scalable Neuroglancer compatible Downsampling, Meshing, Skeletonizing, Contrast Normalization, Transfers and more. | ||||||||||
Fastdfscore | 30 | 5 | a year ago | 36 | December 15, 2021 | 2 | mit | C# | ||
distributed file system fastdfs c# client | ||||||||||
Owin.compression | 19 | 6 months ago | 1 | unlicense | F# | |||||
Compression (Deflate / GZip) module for Microsoft OWIN filesystem pipeline. Works with Selfhost and also on AspNetCore. | ||||||||||
Metalfs | 14 | a year ago | 3 | mit | C++ | |||||
Near-storage compute aware file system and FPGA operator pipelines. | ||||||||||
Fsweeper | 13 | 2 years ago | 2 | October 06, 2021 | mit | Go | ||||
A file management automation tool | ||||||||||
Ember Cli Deploy Cp | 7 | 5 years ago | 2 | mit | JavaScript | |||||
An ember-cli-deploy-plugin to copy your built assets on your filesystem | ||||||||||
Pipe Ist | 3 | 7 years ago | mit | TypeScript | ||||||
A back to basics build tool for JavaScript, TypeScript and pretty much everything else | ||||||||||
Tinypipeline | 2 | 6 years ago | 1 | gpl-3.0 | Python | |||||
A basic pipeline for small, personal CG projects. |
DataKit is a tool to orchestrate applications using a Git-like dataflow. It revisits the UNIX pipeline concept, with a modern twist: streams of tree-structured data instead of raw text. DataKit allows you to define complex build pipelines over version-controlled data.
DataKit is currently used as the coordination layer for HyperKit, the hypervisor component of Docker for Mac and Windows, and for the DataKitCI continuous integration system.
There are several components in this repository:
- src contains the main DataKit service. This is a Git-like database to which other services can connect.
- ci contains DataKitCI, a continuous integration system that uses DataKit to monitor repositories and store build results.
- ci/self-ci is the CI configuration for DataKitCI that tests DataKit itself.
- bridge/github is a service that monitors repositories on GitHub and syncs their metadata with a DataKit database. For example, when a pull request is opened or updated, it will commit that information to DataKit. If you commit a status message to DataKit, the bridge will push it to GitHub.
- bridge/local is a drop-in replacement for bridge/github that just monitors a local Git repository. This is useful for local testing.

The easiest way to use DataKit is to start both the server and the client in containers.
To expose a Git repository as a 9p endpoint on port 5640 on a private network, run:
$ docker network create datakit-net # create a private network
$ docker run -it --net datakit-net --name datakit -v <path/to/git/repo>:/data datakit/db
Note: The --name datakit option is mandatory. It will allow the client to connect to a known name on the private network.
You can then start a DataKit client, which will mount the 9p endpoint and expose the database as a filesystem API:
# In another terminal
$ docker run -it --privileged --net datakit-net datakit/client
$ ls /db
branch remotes snapshots trees
Note: the --privileged option is needed because the container will have to mount the 9p endpoint into its local filesystem.
Now you can explore, edit and script /db. See the Filesystem API for more details.
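As a small illustration of scripting the mounted database, the following Python sketch lists the top-level entries under a mount point and reports which of the four directories from the ls /db output above are absent. The names list_db and missing_entries are made up for this example. It uses only ordinary filesystem calls and assumes nothing about the layout below the top level.

```python
import os

# Top-level entries taken from the `ls /db` output shown above.
EXPECTED_ENTRIES = {"branch", "remotes", "snapshots", "trees"}


def list_db(root="/db"):
    """Return the sorted names of the entries directly under the mount point."""
    return sorted(os.listdir(root))


def missing_entries(root="/db"):
    """Return the expected top-level entries that are absent under root."""
    return sorted(EXPECTED_ENTRIES - set(os.listdir(root)))


# Example use inside the client container (hypothetical session):
#   list_db()          -> names found under /db
#   missing_entries()  -> empty list if the mount looks complete
```

An empty result from missing_entries is only a quick sanity check that the 9p mount succeeded; it says nothing about the contents of the database.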
The easiest way to build the DataKit project is to use Docker (which is what the start-datakit.sh script does under the hood):
docker build -t datakit/db -f Dockerfile .
docker run -p 5640:5640 -it --rm datakit/db --listen-9p=tcp://0.0.0.0:5640
These commands will expose the database's 9p endpoint on port 5640.
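To confirm that the published port is accepting connections, a plain TCP probe is often enough. The sketch below is a generic check, not part of DataKit: the function name endpoint_reachable and the default host 127.0.0.1 are assumptions for illustration. It only tests that something is listening on the port and does not speak the 9p protocol.

```python
import socket


def endpoint_reachable(host="127.0.0.1", port=5640, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks that the port accepts connections; it does not
    perform any 9p handshake.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A script could call endpoint_reachable() in a short retry loop after docker run and proceed only once it returns True.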
If you want to build the project from source without Docker, you will need to install OCaml and opam. Then run:
$ make depends
$ make && make test
For information about command-line options:
$ datakit --help
Run with --listen-prometheus 9090 to expose metrics at http://*:9090/metrics.
Note: there is no encryption and no access control. You are expected to run the database in a container and to not export this port to the outside world. You can either collect the metrics by running a Prometheus service in a container on the same Docker network, or front the service with nginx or similar if you want to collect metrics remotely.
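For quick inspection from another container on the same Docker network, the metrics endpoint can be read with a few lines of Python. This is a minimal sketch that assumes the endpoint serves the standard Prometheus text exposition format. The URL default, the function names, and any metric names you see in examples are illustrative, not taken from DataKit. The parser keeps only the sample value and ignores optional timestamps.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: value}.

    The series key is the metric name including any {label="..."} part.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Split after the closing brace so spaces inside label
            # values do not break the parse.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            series, *rest = line.split()
        if not rest:
            continue  # no value on this line; skip it
        samples[series] = float(rest[0])
    return samples


def fetch_metrics(url="http://localhost:9090/metrics", timeout=5.0):
    """Fetch and parse metrics; the default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

For anything beyond ad hoc checks, a Prometheus server scraping the endpoint, as described above, remains the more robust option.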
- Go bindings are in the api/go directory.
- OCaml bindings are in the api/ocaml directory. See examples/ocaml-client for an example.

DataKit is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.
Contributions are welcome under the terms of this license. You may wish to browse the weekly reports to read about overall activity in the repository.