| Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Spark | 35,873 | | 2,394 | 882 | 16 hours ago | 46 | May 09, 2021 | 262 | apache-2.0 | Scala | Apache Spark - A unified analytics engine for large-scale data processing |
| ClickHouse | 29,013 | | | | 16 hours ago | 699 | December 16, 2021 | 3,264 | apache-2.0 | C++ | ClickHouse® is a free analytics DBMS for big data |
| TDengine | 21,396 | | 1 | | 17 hours ago | 12 | April 14, 2022 | 1,015 | agpl-3.0 | C | TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, Industrial IoT and DevOps |
| Flink | 21,315 | | 26 | 284 | 17 hours ago | 62 | September 22, 2021 | 1,046 | apache-2.0 | Java | |
| ShardingSphere | 18,455 | | 8 | | 16 hours ago | 7 | June 04, 2020 | 657 | apache-2.0 | Java | Ecosystem to transform any database into a distributed database system, and enhance it with sharding, elastic scaling, encryption features & more |
| Presto | 14,746 | | 178 | 11 | 16 hours ago | 278 | September 08, 2022 | 1,401 | apache-2.0 | Java | The official home of the Presto distributed SQL query engine for big data |
| QuestDB | 11,505 | | 1 | | 16 hours ago | 38 | August 26, 2022 | 354 | apache-2.0 | Java | An open source time-series database for fast ingest and SQL queries |
| Trino | 7,886 | | 3 | 9 | 17 hours ago | 51 | December 29, 2020 | 2,328 | apache-2.0 | Java | Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io) |
| Beam | 6,898 | | 12 | | 17 hours ago | 532 | August 17, 2022 | 4,271 | apache-2.0 | Java | Apache Beam is a unified programming model for Batch and Streaming data processing |
| Hive | 4,833 | | | | 17 hours ago | | | 102 | apache-2.0 | Java | |
Presto is a distributed SQL query engine for big data.
See the User Manual for deployment instructions and end user documentation.
Presto is a standard Maven project. Simply run the following command from the project root directory:
./mvnw clean install
On the first build, Maven will download all the dependencies from the internet and cache them in the local repository (~/.m2/repository), which can take a considerable amount of time. Subsequent builds will be faster.
Presto has a comprehensive set of unit tests that can take several minutes to run. You can disable the tests when building:
./mvnw clean install -DskipTests
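When iterating on a single module, standard Maven reactor flags (not Presto-specific) can narrow the build. For example, to build only presto-main and the modules it depends on:

```shell
# -pl selects the module, -am also builds its upstream dependencies
./mvnw clean install -pl presto-main -am -DskipTests
```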
Presto native is a C++ rewrite of the Presto worker. It uses Velox as its primary engine to run Presto workloads.
Velox is a C++ database library which provides reusable, extensible, and high-performance data processing components.
Check out building instructions to get started.
After building Presto for the first time, you can load the project into your IDE and run the server. We recommend using IntelliJ IDEA. Because Presto is a standard Maven project, you can import it into your IDE using the root pom.xml file. In IntelliJ, choose Open Project from the Quick Start box, or choose Open from the File menu and select the root pom.xml file.
After opening the project in IntelliJ, double-check that the Java SDK is properly configured for the project.
Presto comes with sample configuration that should work out-of-the-box for development. Use the following options to create a run configuration:

- VM options: `-ea -XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:+UseGCOverheadLimit -XX:+ExplicitGCInvokesConcurrent -Xmx2G -Dconfig=etc/config.properties -Dlog.levels-file=etc/log.properties`
- Working directory: `$MODULE_DIR$` (depending on your version of IntelliJ)

The working directory should be the presto-main subdirectory. In IntelliJ, using `$MODULE_DIR$` accomplishes this automatically.
Additionally, the Hive plugin must be configured with the location of your Hive metastore Thrift service. Add the following to the list of VM options, replacing localhost:9083 with the correct host and port (or use the value below if you do not have a Hive metastore):
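The option in question is the Hive connector's metastore URI setting, passed as a JVM system property; assuming the standard hive.metastore.uri property name and the conventional metastore port:

```
-Dhive.metastore.uri=thrift://localhost:9083
```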
If your Hive metastore or HDFS cluster is not directly accessible from your local machine, you can use SSH port forwarding to access it. Set up a dynamic SOCKS proxy with SSH listening on local port 1080:
ssh -v -N -D 1080 server
Then add the following to the list of VM options:
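The option points the metastore Thrift client at the local SOCKS proxy; assuming the hive.metastore.thrift.client.socks-proxy property and the port 1080 used above:

```
-Dhive.metastore.thrift.client.socks-proxy=localhost:1080
```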
Start the CLI to connect to the server and run SQL queries:
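The build produces a self-executing CLI JAR in the presto-cli module's target directory; a typical invocation (the * stands in for the version of your checkout):

```shell
presto-cli/target/presto-cli-*-executable.jar
```

By default the CLI connects to a coordinator at localhost:8080; pass --server host:port to point it elsewhere.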
Run a query to see the nodes in the cluster:
SELECT * FROM system.runtime.nodes;
In the sample configuration, the Hive connector is mounted in the hive catalog, so you can run the following query to show the tables in the Hive database default:
SHOW TABLES FROM hive.default;
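Beyond listing tables, a couple of standard exploratory statements work the same way (the orders table below is hypothetical; substitute a table that exists in your metastore):

```sql
SHOW CATALOGS;                              -- list all mounted catalogs
SHOW SCHEMAS FROM hive;                     -- list databases in the Hive catalog
SELECT * FROM hive.default.orders LIMIT 10; -- sample rows from a (hypothetical) table
```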
See Contributions for guidelines around making new contributions and reviewing them.
To learn how to build the docs, see the docs README.
The compiled Web UI assets are checked in to the Presto source code (in the dist folder). You must have Node.js and Yarn installed to execute these commands. To update this folder after making changes, run:
yarn --cwd presto-main/src/main/resources/webapp/src install
If no Javascript dependencies have changed (i.e., no changes to package.json), it is faster to run:
yarn --cwd presto-main/src/main/resources/webapp/src run package
To simplify iteration, you can also run in
watch mode, which automatically re-compiles when changes to source files are detected:
yarn --cwd presto-main/src/main/resources/webapp/src run watch
To iterate quickly, simply re-build the project in IntelliJ after packaging is complete. Project resources will be hot-reloaded and changes are reflected on browser refresh.
When authoring a pull request, the PR description should include its relevant release notes. Follow Release Notes Guidelines when authoring release notes.