Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Luigi | 16,811 | 338 | 71 | 18 hours ago | 79 | May 04, 2023 | 119 | apache-2.0 | Python | |
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in. | ||||||||||
Hrider | 132 | 6 years ago | 10 | apache-2.0 | Java | |||||
hbase UI tool | ||||||||||
Druid Hadoop Inputformat | 9 | 7 years ago | apache-2.0 | Java | ||||||
Hadoop InputFormat for http://druid.io/ | ||||||||||
Fbus | 4 | 13 years ago | 6 | Java | ||||||
A file busing system for integration with Hadoop | ||||||||||
Wukonggpu | 4 | 6 years ago | apache-2.0 | C++ | ||||||
Fast and Concurrent Distributed RDF Queries using RDMA-assisted GPU Graph Exploration | ||||||||||
Hrider | 3 | 5 years ago | apache-2.0 | Java | ||||||
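HBase GUI tool with support for HBase 2.x. In addition, when exporting to Excel, it automatically sets the value of the c:exportFlag field of the table data to 1, which makes export operations easier. Forked from NiceSystems/hrider (https://github.com/NiceSystems/hrider). | ||||||||||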
Hbase Rule | 2 | a year ago | Java | |||||||
JUnit rule which provides an embedded HBase server. | ||||||||||
Parquet Tools Assembly | 2 | 7 years ago | apache-2.0 | Shell | ||||||
Parquet-tools assembly and distribution |
Parquet-tools assembly and distribution
This repository provides a script that clones apache/parquet-mr and builds a distribution of its parquet-tools submodule, a command-line utility for reading Parquet files.
The repository has the following structure:
bin - binaries copied from parquet-tools with some minor changes, e.g. parquet-cat
lib - jar files that will be included in the distribution; acts as a staging folder
sbin - scripts to build the distribution
staging - staging folder for cloned repositories (one folder per tag)
The script creates tar.gz and zip distributions with or without the Hadoop dependency. In the archive name parquet-tools-dist-TAG-VERSION.tar.gz, TAG is the provided tag and VERSION is the version of this repository, not of parquet-tools or Hadoop. The suffix -dh is included in the name when the client dependency is included. Some versions of parquet-tools have already been prepared; see releases for more info.
To build parquet-tools you must have python, git and mvn installed; the script checks whether they are available. The project currently works with, and has been tested on, Python 2.7.x only, but it should be trivial to extend it to Python 3.x.
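The prerequisite check can also be done by hand before starting a build. The following is a minimal sketch, not the repository's own check: the function name require_tools is hypothetical, and it only verifies that each command is on PATH, not which version is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): report which of the
# given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# The three tools listed above as build prerequisites.
if require_tools python git mvn; then
  echo "all build prerequisites found"
else
  echo "install the missing tools before building" >&2
fi
```

Since only Python 2.7.x is tested, it may also be worth running python --version to confirm which interpreter the name python resolves to.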
You can build the distribution with or without the Hadoop dependency (see parquet-tools for more info), that is, with or without the client library included in the uber-jar.
Without Hadoop dependency:
cd parquet-tools-assembly && sbin/make-distribution.sh --tag=XYZ
With Hadoop dependency:
cd parquet-tools-assembly && sbin/make-distribution.sh --tag=XYZ --client=true
where:
--tag - the parquet-mr repository tag to use, e.g. apache-parquet-1.8.1. See all available tags.
--client - whether or not the client library should be included. If true, the distribution name will include the -dh suffix.
Once the archives are built, unpack them into the desired directory:
tar zxf parquet-tools-dist.tar.gz
cd parquet-tools-dist
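To unpack into a directory other than the current one, the tar step can be wrapped in a small function. This is a sketch under assumptions: unpack_dist is a hypothetical helper, not something shipped with the distribution, and it assumes the archive's top-level folder is parquet-tools-dist, as the cd command above suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of the distribution): unpack a .tar.gz
# archive into a chosen directory, creating the directory if needed.
# Usage: unpack_dist ARCHIVE DEST_DIR
unpack_dist() {
  archive=$1
  dest=$2
  # Fail early with a clear message if the archive does not exist.
  [ -f "$archive" ] || { echo "no such archive: $archive" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # -x extract, -z gunzip, -f archive file, -C switch to DEST_DIR first
  tar -xzf "$archive" -C "$dest"
}
```

For example, unpack_dist parquet-tools-dist.tar.gz "$HOME/tools" would, under the assumption above, leave the scripts in $HOME/tools/parquet-tools-dist/bin.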
Then use the scripts:
bin/parquet-schema /path/to/parquet-file
bin/parquet-head /path/to/parquet-file
bin/parquet-cat /path/to/parquet-file