Awesome Open Source

Asakusa Framework

Asakusa is a full-stack framework for distributed/parallel computing. It provides a development platform and runtime libraries that support various distributed/parallel computing environments, such as Hadoop, Spark, and M3 for Batch Processing. Users can get the best performance from distributed/parallel computing by transparently switching execution engines among MapReduce, Spark RDD, and C++ native, depending on their data size.

Unlike query-based languages, Asakusa makes it easier to develop complicated data-flow programs efficiently and comprehensively, thanks to the following components.

  • Data-flow oriented DSL

    A data-flow based approach is well suited to constructing DAGs, which in turn are appropriate for distributed/parallel computing. Asakusa offers a Java-based domain-specific language with a data-flow design, which is integrated with the compilers.

  • Compilers

    A multi-tier compiler is supported. Java-based source code is first compiled to an intermediate representation and then optimized for each execution environment: Hadoop (MapReduce), Spark (RDD), and M3 for Batch Processing (C++ native).

  • Data-Modeling language

    A data-modeling language is supported, which is comprehensive enough to map to relational models, CSV files, and other data formats.

  • Test Environment

    JUnit-based unit testing and end-to-end testing are supported, and tests are portable across execution environments. Source code, test code, and test data are fully compatible across Hadoop, Spark, M3 for Batch Processing, and others.

  • Runtime execution driver

    A transparent job execution driver is supported.

All these features have been designed and developed with expertise gained from decades of enterprise-scale system development, and they are intended to make large-scale systems in distributed/parallel environments more robust and stable.
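
To make the data-flow idea above more concrete, here is a minimal sketch in plain Java of a flow described as a DAG of operators, planned in dependency order, and then evaluated. It is not the Asakusa DSL: the class `DataFlowSketch`, the `Vertex` type, and the methods `from`, `plan`, `run`, and `sampleFlow` are all hypothetical names invented for illustration. The in-memory `run` method merely stands in for an execution engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical illustration only; none of these names belong to the Asakusa DSL. */
public class DataFlowSketch {

    /** A DAG vertex: a named operator applied to the concatenated outputs of its upstream vertices. */
    public static final class Vertex {
        final String name;
        final Function<List<Integer>, List<Integer>> operator;
        final List<Vertex> upstream = new ArrayList<>();

        Vertex(String name, Function<List<Integer>, List<Integer>> operator) {
            this.name = name;
            this.operator = operator;
        }

        /** Connects upstream vertices to this one and returns this vertex for chaining. */
        Vertex from(Vertex... sources) {
            upstream.addAll(Arrays.asList(sources));
            return this;
        }
    }

    /** Returns the vertices reachable from the sink in dependency order (depth-first post-order). */
    public static List<Vertex> plan(Vertex sink) {
        List<Vertex> order = new ArrayList<>();
        visit(sink, new HashSet<>(), order);
        return order;
    }

    private static void visit(Vertex vertex, Set<Vertex> seen, List<Vertex> order) {
        if (!seen.add(vertex)) {
            return; // already planned, so a shared upstream vertex appears only once
        }
        for (Vertex source : vertex.upstream) {
            visit(source, seen, order);
        }
        order.add(vertex);
    }

    /** Evaluates the plan in memory; a real engine would distribute this work instead. */
    public static List<Integer> run(Vertex sink) {
        Map<Vertex, List<Integer>> results = new HashMap<>();
        for (Vertex vertex : plan(sink)) {
            List<Integer> input = new ArrayList<>();
            for (Vertex source : vertex.upstream) {
                input.addAll(results.get(source));
            }
            results.put(vertex, vertex.operator.apply(input));
        }
        return results.get(sink);
    }

    /** Builds a diamond-shaped flow: one source feeds two branches, which are merged again. */
    public static Vertex sampleFlow() {
        Vertex source = new Vertex("source", in -> Arrays.asList(1, 2, 3, 4));
        Vertex evens = new Vertex("evens",
                in -> in.stream().filter(x -> x % 2 == 0).collect(Collectors.toList()))
                .from(source);
        Vertex doubledOdds = new Vertex("doubledOdds",
                in -> in.stream().filter(x -> x % 2 != 0).map(x -> x * 2).collect(Collectors.toList()))
                .from(source);
        return new Vertex("merge", in -> in).from(evens, doubledOdds);
    }

    public static void main(String[] args) {
        System.out.println(run(sampleFlow())); // prints [2, 4, 2, 6]
    }
}
```

The point of the sketch is the separation between describing the flow (`sampleFlow`) and deciding how to execute it (`plan` and `run`). Because the description is only a graph, the same flow could in principle be handed to a different backend without changing the code that builds it.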

How to build

Maven artifacts

./mvnw clean install -DskipTests

Gradle plug-ins

cd gradle
./gradlew clean [build] install

How to run tests

Maven artifacts

export HADOOP_CMD=/path/to/bin/hadoop
./mvnw test

Gradle plug-ins

cd gradle
./gradlew [clean] check

How to import projects into Eclipse

Maven artifacts

./mvnw eclipse:eclipse

Then import the generated projects into Eclipse as existing projects.

To run tests in Eclipse, enable Preferences > Java > Debug > 'Only include exported classpath entries when launching'.

Gradle plug-ins

cd gradle
./gradlew eclipse

Then import the generated projects into Eclipse as existing projects.

Sub Projects

Related Projects

Resources

Bug reports, Patch contribution

License

