Awesome Open Source

file.d

Overview


file.d is a blazing fast tool for building data pipelines: read, process, and output events. It was primarily developed to read from files, but it also supports numerous input, action, and output plugins.

⚠ Although we use it in production, it still isn't v1.0.0. Please test your pipelines carefully in dev/stage environments.

Motivation

Well, we already have several similar tools: vector, filebeat, logstash, fluentd, fluent-bit, etc.

Performance tests state that the best ones achieve a throughput of roughly 100MB/sec. Guys, it's 2020 now. HDDs and NICs can handle a throughput of a few GB/sec, and CPUs process dozens of GB/sec. Are you sure 100MB/sec is what we deserve? Are you sure it is fast?

Main features

  • Fast: more than 10x faster compared to similar tools
  • Predictable: it uses pooling, so memory consumption is limited
  • Reliable: doesn't lose data thanks to its commit mechanism
  • Container / cloud / kubernetes native
  • Simply configurable with YAML
  • Prometheus-friendly: transform your events into metrics on any pipeline stage
  • Vault-friendly: store sensitive info and get it for any pipeline parameter
  • Well-tested and used in production to collect logs from a Kubernetes cluster with 3000+ total CPU cores
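
To illustrate the "Predictable" point, here is a minimal Go sketch of how pooling can bound memory in an input > action > output pipeline. This is not file.d's actual code: the `Event`, `Pool`, and `Run` names are hypothetical and exist only for this example. The idea is that a fixed number of events is allocated up front, and the input has to wait for a free one before reading more data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical pipeline event; file.d's real event type differs.
type Event struct {
	Raw    []byte
	Fields map[string]interface{}
}

// Pool hands out a fixed number of reusable events, so the number of events
// in flight (and therefore memory use) is bounded by its capacity.
type Pool struct {
	free chan *Event
}

// NewPool allocates all events up front.
func NewPool(capacity int) *Pool {
	p := &Pool{free: make(chan *Event, capacity)}
	for i := 0; i < capacity; i++ {
		p.free <- &Event{}
	}
	return p
}

// Get blocks until an event is free, which applies backpressure to the input.
func (p *Pool) Get() *Event { return <-p.free }

// Put resets the event and returns it to the pool for reuse.
func (p *Pool) Put(e *Event) {
	e.Raw = e.Raw[:0]
	e.Fields = nil
	p.free <- e
}

// Run pushes each line through input > JSON decode > output and returns the
// number of events passed to output. Lines that fail to decode are discarded.
func Run(lines []string, pool *Pool, output func(*Event)) int {
	delivered := 0
	for _, line := range lines {
		e := pool.Get()
		e.Raw = append(e.Raw, line...)
		if err := json.Unmarshal(e.Raw, &e.Fields); err == nil {
			output(e)
			delivered++
		}
		pool.Put(e)
	}
	return delivered
}

func main() {
	pool := NewPool(2)
	lines := []string{
		`{"level":"info","msg":"started"}`,
		`not json`,
		`{"level":"error","msg":"failed"}`,
	}
	n := Run(lines, pool, func(e *Event) {
		fmt.Println(e.Fields["level"], e.Fields["msg"])
	})
	fmt.Println("delivered:", n)
}
```

The sketch runs sequentially for brevity. In a concurrent pipeline the same pool would be shared between goroutines, and the blocking `Get` is what would keep a fast input from outrunning a slow output.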

Performance

On a 2017 MacBook Pro with two physical cores, file.d can achieve the following throughput:

  • 1.7GB/s in files > devnull case
  • 1.0GB/s in files > json decode > devnull case

TBD: throughput on production servers.

Plugins

Input: dmesg, fake, file, http, journalctl, k8s, kafka

Action: add_host, convert_date, debug, discard, flatten, join, json_decode, keep_fields, mask, modify, parse_es, parse_re2, remove_fields, rename, throttle

Output: devnull, elasticsearch, gelf, kafka, splunk, stdout

What's next


Generated using insane-doc

