Awesome Open Source


flume version : 1.7.0

canal version : 1.0.24

What is Flume

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application.

What is canal


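What flume-canal-source does

flume-canal-source is a source extension for Flume. It pulls data from canal into a Flume channel, from which the binlog data can be delivered to Kafka, HDFS, Hive, Elasticsearch, and other destinations. **Both canal and Flume have high-availability solutions, so synchronizing the binlog this way is highly available.** It combines proven existing tools instead of reinventing the wheel.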


Deploying canal and Flume is not covered here.

Configuring Flume

  • Configure the source type*
agent.sources = canalSource

agent.sources.canalSource.type = com.weiboyi.etl.flume.source.canal.CanalSource
  • Configure one of three ways to connect to canal*
# 1. zookeeper servers
agent.sources.canalSource.zkServers = zookeeper-host:2181

# 2. multiple canal server urls
agent.sources.canalSource.serverUrls = canal-server1:11111,canal-server2:11111

# 3. single canal server url
agent.sources.canalSource.serverUrl = canal-server1:11111
  • Configure the canal destination*
agent.sources.canalSource.destination = example
  • Configure the username and password
agent.sources.canalSource.username = user
agent.sources.canalSource.password = passwd
  • binlog batch size, default 1024
agent.sources.canalSource.batchSize = 1024
  • Whether the MySQL row data from before the change is required, default true
agent.sources.canalSource.oldDataRequired = true
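
The source settings above only become a runnable agent once a channel and a sink are wired in. The following is a minimal sketch of a complete agent file, assuming Flume's built-in memory channel and logger sink. The names memChannel and loggerSink and the capacity values are illustrative assumptions, not part of this project. It uses the zookeeper connection mode and assumes canal runs without authentication, so the username and password are left out.

```properties
# Illustrative names (assumptions): memChannel, loggerSink
agent.sources = canalSource
agent.channels = memChannel
agent.sinks = loggerSink

# Canal source, using the zookeeper connection mode
agent.sources.canalSource.type = com.weiboyi.etl.flume.source.canal.CanalSource
agent.sources.canalSource.zkServers = zookeeper-host:2181
agent.sources.canalSource.destination = example
agent.sources.canalSource.batchSize = 1024
agent.sources.canalSource.oldDataRequired = true
agent.sources.canalSource.channels = memChannel

# In-memory channel; transactionCapacity is kept no smaller than batchSize,
# on the assumption that the source may put a whole batch in one transaction
agent.channels.memChannel.type = memory
agent.channels.memChannel.capacity = 10000
agent.channels.memChannel.transactionCapacity = 1024

# Logger sink prints events to the Flume log, useful for a first smoke test
agent.sinks.loggerSink.type = logger
agent.sinks.loggerSink.channel = memChannel
```

If this is saved as, for example, conf/canal.conf, the agent can be started with `bin/flume-ng agent --conf conf --conf-file conf/canal.conf --name agent`. For production use, the logger sink would typically be swapped for a Kafka, HDFS, or Elasticsearch sink.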
