Awesome Open Source


What is Vald?

Vald is a highly scalable, distributed, fast approximate nearest neighbor (ANN) dense vector search engine.

Vald is designed and implemented on a cloud-native architecture.

It uses NGT, the fastest ANN algorithm, to search for neighbors.

Vald provides automatic vector indexing, index backup, and horizontal scaling, which make it suitable for searching across billions of feature vectors.

Vald is easy to use, feature-rich, and highly customizable to fit your needs.

Go to the Get Started page to try out Vald :)

(If you are interested in ANN benchmarks, please refer to the official website.)
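
To make the term "nearest neighbor search" concrete, here is a minimal sketch of the exact, brute-force version of the problem. It is not NGT and not Vald code: the names `Neighbor`, `l2`, and `exactSearch` are hypothetical and exist only for this illustration. An ANN engine avoids this full scan by walking an index and accepting a small loss in accuracy.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Neighbor pairs a vector ID with its distance to the query.
type Neighbor struct {
	ID       string
	Distance float64
}

// l2 returns the Euclidean distance between two vectors of equal length.
func l2(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// exactSearch scans every stored vector and returns the k closest to query.
// The cost grows linearly with the number of vectors, which is what an
// approximate index is meant to avoid.
func exactSearch(vectors map[string][]float64, query []float64, k int) []Neighbor {
	res := make([]Neighbor, 0, len(vectors))
	for id, v := range vectors {
		res = append(res, Neighbor{ID: id, Distance: l2(v, query)})
	}
	sort.Slice(res, func(i, j int) bool {
		if res[i].Distance != res[j].Distance {
			return res[i].Distance < res[j].Distance
		}
		return res[i].ID < res[j].ID // tie-break for deterministic output
	})
	if k < 0 {
		k = 0
	}
	if k > len(res) {
		k = len(res)
	}
	return res[:k]
}

func main() {
	vectors := map[string][]float64{
		"a": {0, 0},
		"b": {1, 0},
		"c": {5, 5},
	}
	for _, n := range exactSearch(vectors, []float64{0.9, 0.1}, 2) {
		fmt.Printf("%s %.3f\n", n.ID, n.Distance)
	}
}
```

With the sample data, the query is closest to "b" and then to "a"; "c" is far away and is dropped because k is 2.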

Main Features

  • Asynchronous Auto Indexing

    • Usually the index graph must be locked during indexing, which causes a stop-the-world pause. Vald uses distributed index graphs, so it continues to serve requests during indexing.
  • Customizable Ingress/Egress Filtering

    • Vald implements its own highly customizable Ingress/Egress filters, which can be configured to fit the gRPC interface.
      • Ingress Filter: vectorizes request data through a filter at request time.
      • Egress Filter: reranks or filters the search results with your own algorithm.
  • Cloud-native vector search engine

    • Scales horizontally on memory and CPU to meet your demand.
  • Auto Backup for Index data

    • Vald can store a backup of the index data in MySQL or Cassandra, which enables disaster recovery.
  • Distributed Indexing

    • Vald distributes the vector index across multiple agents; each agent stores a different index.
  • Index Replication

    • Vald stores each index in multiple agents, which enables index replicas.
    • Vald automatically rebalances the replicas when a Vald agent goes down.
  • Easy to use

    • Vald can be easily installed in a few steps.
  • Highly customizable

    • You can configure the number of vector dimensions, the number of replicas, and so on.
  • Multi-language support

    • Client libraries for Go, Java, Clojure, Node.js, and Python are supported.
    • The gRPC APIs can be called from any programming language that supports gRPC.
    • A REST API is also supported.
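
The Distributed Indexing and Index Replication features above can be pictured with a small in-memory model. This is only a sketch under stated assumptions, not Vald's implementation: `Cluster`, `Agent`, `Insert`, `Fail`, `Copies`, and `Search` are hypothetical names, the hash-ring placement rule is an assumption made for illustration, and the search is a brute-force scan rather than an ANN query.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Agent is a hypothetical stand-in for one index-holding agent.
type Agent struct {
	Up      bool
	Vectors map[string][]float64
}

// Cluster is a fixed set of agents plus a replication factor.
type Cluster struct {
	Agents   []*Agent
	Replicas int
}

func NewCluster(agents, replicas int) *Cluster {
	c := &Cluster{Replicas: replicas}
	for i := 0; i < agents; i++ {
		c.Agents = append(c.Agents, &Agent{Up: true, Vectors: map[string][]float64{}})
	}
	return c
}

// Insert stores vec on the first Replicas live agents found by walking the
// agent ring from a hash-derived start (a placement rule assumed here).
func (c *Cluster) Insert(id string, vec []float64) {
	h := fnv.New32a()
	h.Write([]byte(id))
	start := int(h.Sum32() % uint32(len(c.Agents)))
	for i, placed := 0, 0; i < len(c.Agents) && placed < c.Replicas; i++ {
		a := c.Agents[(start+i)%len(c.Agents)]
		if a.Up {
			a.Vectors[id] = vec // rewriting an existing copy is harmless
			placed++
		}
	}
}

// Fail marks agent i as down and re-inserts its vectors so each is again held
// by Replicas live agents, provided enough agents remain.
func (c *Cluster) Fail(i int) {
	lost := c.Agents[i].Vectors
	c.Agents[i].Up, c.Agents[i].Vectors = false, map[string][]float64{}
	for id, vec := range lost {
		c.Insert(id, vec)
	}
}

// Copies reports how many agents hold the given ID (down agents hold nothing).
func (c *Cluster) Copies(id string) int {
	n := 0
	for _, a := range c.Agents {
		if _, ok := a.Vectors[id]; ok {
			n++
		}
	}
	return n
}

// Search scans every agent, de-duplicates IDs held by several replicas, and
// returns the k closest IDs ordered by squared Euclidean distance.
func (c *Cluster) Search(query []float64, k int) []string {
	seen := map[string]float64{}
	for _, a := range c.Agents {
		for id, v := range a.Vectors {
			var sum float64
			for j := range v {
				sum += (v[j] - query[j]) * (v[j] - query[j])
			}
			seen[id] = sum
		}
	}
	ids := make([]string, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return seen[ids[i]] < seen[ids[j]] })
	if k > len(ids) {
		k = len(ids)
	}
	return ids[:k]
}

func main() {
	c := NewCluster(4, 2)
	c.Insert("a", []float64{0, 0})
	c.Insert("b", []float64{1, 0})
	c.Insert("c", []float64{5, 5})
	c.Fail(0) // simulate one agent going down
	fmt.Println(c.Copies("a"), c.Copies("b"), c.Copies("c"))
	fmt.Println(c.Search([]float64{0.9, 0.1}, 2))
}
```

In this model, each vector stays on two live agents after a failure as long as at least two agents remain, and a search still sees every vector exactly once because replica hits are merged by ID.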

Requirements

  • Kubernetes 1.17 or later
  • AVX2 instructions (required by Vald Agent NGT)

Get Started

Please refer to Get Started.

Installation

Using Helm

helm repo add vald https://vald.vdaas.org/charts
helm install vald-cluster vald/vald

If you use the default values.yaml, the nightly images will be installed.

Docker image tagging policy

  • nightly ... latest build of the master branch
  • vX.X.X ... released versions
  • latest ... latest build of release versions
  • stable ... latest long-term supported version

Using Helm-operator

vald-helm-operator

Example

Write example here

Architecture Overview

More details of the architecture overview will be available here in the future.

Development

Before your first commit to this repository, it is strongly recommended to run the commands below.

make init

Components

The following components are provided as Docker images:

  • Agent NGT
  • Agent Sidecar
  • Discoverer
  • Gateways
  • Backup Managers
  • Compressor
  • Metas
  • Index Manager
  • Helm Operator

Contribution

Please read the contribution guide.

Contributors

All Contributors

Thanks go to these wonderful people (emoji key):


Yusuke Kato

💻 🎨 🚧 📆

Rintaro Okamura

💻 📖 🚧 📦

Kosuke Morimoto

💻 💡 🔧 ⚠️

Kiichiro YUKAWA

📖 🚧 ⚠️ ✅

datelier

💻 🤔

Kevin Diu

📖 💡 ⚠️ ✅

Hiroto Funakoshi

📖 🔧 ⚠️ ✅

taisho

🎨 📖 💡

Pierre Grimaud

📖

Omer Katz

📖 ✅

LICENSE

Vald is released under the Apache 2.0 license; refer to the LICENSE file.

FOSSA Status

