Awesome Open Source
Search results for scala pipeline
61 search results found
Papermill
⭐
5,513
📚 Parameterize, execute, and analyze notebooks
Mleap
⭐
1,479
MLeap: Deploy ML Pipelines to Production
Keystone
⭐
472
Simplifying robust end-to-end machine learning on Apache Spark.
Blaze
⭐
344
Blazing fast NIO microframework and Http Parser
Koober
⭐
301
Big Data Rosetta Code
⭐
283
Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Chalk
⭐
262
Chalk is a natural language processing library.
Setl
⭐
177
A simple Spark-powered ETL framework that just works 🍺
Mario
⭐
137
Functional, Typesafe, Declarative Data Pipelines
Basel Face Pipeline
⭐
127
Superglue
⭐
102
Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs and reports.
Crunch
⭐
100
Mirror of Apache Crunch (Incubating)
Qstreaming
⭐
89
A simplified, lightweight ETL pipeline framework for building stream/batch processing applications on top of Apache Spark
Smart Data Lake
⭐
87
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Mleap
⭐
76
MLeap allows for easily putting Spark ML pipelines into production
Learn By Examples
⭐
72
Real-world Spark pipelines examples
Pipeline
⭐
68
Complete Pipeline Training at Big Data Scala By the Bay
Dagr
⭐
67
A Scala-based DSL and framework for writing and executing bioinformatics pipelines as Directed Acyclic GRaphs
Jgit Spark Connector
⭐
67
jgit-spark-connector is a library for running scalable data retrieval pipelines that process any number of Git repositories for source code analysis.
Sparklingml
⭐
65
Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python)
Jigg
⭐
63
Pipeline framework for easy natural language processing
Lighthouse
⭐
54
Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines and apply best practices.
Pravda Ml
⭐
52
This project is used to capture machine learning pipelines created on top of Spark as OK
Til
⭐
51
Today I Learned
Albedomm
⭐
50
Albedo Morphable Model
Aardpfark
⭐
47
A library for exporting Spark ML models and pipelines to PFA
Spark Ml Serving
⭐
44
Spark ML Lib serving library
Trembita
⭐
43
Model complex data transformation pipelines easily
Sparkplug
⭐
42
A framework for creating composable and pluggable data processing pipelines using Apache Spark, and running them on a cluster.
Hyperdrive
⭐
41
Extensible streaming ingestion pipeline on top of Apache Spark
Ckoocnlp
⭐
41
Web crawlers and machine learning
Streamcorpus
⭐
33
common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text
Streamliner Starter
⭐
33
Starter project for building MemSQL Streamliner Pipelines
Spark Flow
⭐
32
Library for organizing batch processing pipelines in Apache Spark
Sparkpipe Core
⭐
30
Modular, non-linear pipeline framework for Spark
Build
⭐
28
Source code and build system used to generate the book Hands-on Scala Programming
Chuckwagon
⭐
28
a Scala/sbt AWS Lambda Toolkit
Uimascala
⭐
26
A toolkit to write UIMA components and applications
Ctakes Server
⭐
25
A simple REST-server around ctakes clinical pipeline.
Kyogenrv
⭐
25
A simple 5-stage pipelined RISC-V written in Chisel3 for Intel FPGAs.
Scala Datapipeline Dsl
⭐
25
Domain-specific language to help build and maintain AWS Data Pipelines
Spark Featureselection
⭐
24
Feature selection methods as Spark MLlib Pipelines
Streamliner Examples
⭐
23
Example code for building your own MemSQL Streamliner Pipelines
Suim
⭐
23
Analytic UIMA pipelines using Spark
Spark Intro Ml Pipeline Workshop
⭐
23
A simple introduction to using Spark ML pipelines
Piper
⭐
22
A genomics pipeline built on top of the GATK Queue framework. Main repository: https://github.com/NationalGenomicsInfrastructure/ (make sure you fork from there)
Elitzur
⭐
21
Click Through Rate Prediction
⭐
21
Kaggle's click through rate prediction with Spark Pipeline API
Azure Databricks Anomaly
⭐
17
Anomaly Detection Pipeline on Azure Databricks
Databricks Workflow
⭐
16
Example of a scalable IoT data processing pipeline setup using Databricks
Rflows
⭐
16
reactive pipelines for Agoda.com; lightweight and Future-oriented; automatic yammer metrics and pipeline visualization
Biopet
⭐
15
Biopet docs
Pipelines Examples
⭐
15
Pipelines Example Applications
Peapod
⭐
15
Dependency and data pipeline management framework for Spark and Scala
Graphsense Transformation
⭐
15
GraphSense Transformation Pipeline
Scamr
⭐
15
A Hadoop map reduce framework for Scala.
Hashtagcashtag
⭐
13
insight data engineering project
Cromwell Client
⭐
13
Client for the Cromwell workflow engine
Nyc_taxi_pipeline
⭐
12
Design/Implement stream/batch architecture on NYC taxi data | #DE
Spark Ranking Algorithms
⭐
11
Ranking algorithms for Spark machine learning pipeline
Singlecelllineage
⭐
11
Updated scripts and pipelines for processing GESTALT data at single-cell resolution
Stackexchange Spark Scala Analyser
⭐
10
Still in Beta
Scalcium
⭐
9
Scala NLP Algorithms
Beam Scala Examples
⭐
9
Scala examples for using Apache Beam Java API (2.1.0)
Lambdaconf 2017 Bigdata
⭐
9
Materials for "Big Data Pipelines with Scala" Workshop at LambdaConf 2017
Project Fortis Spark
⭐
9
A repository for all spark jobs running on fortis
Sc_gestalt
⭐
9
GESTALT processing pipeline for barcodes captured with single-cell RNA sequencing
Spark Kaggle
⭐
9
Spark in Kaggle competitions
Generic Event Parser
⭐
8
This project is a Google Dataflow pipeline that processes generic JSON messages from Google PubSub or Apache Kafka and writes the parsed output to Google BigQuery.
Vamana
⭐
8
Autoscaling toolkit based on custom Application Metrics
Fluid
⭐
8
Fluid Pipelines
Streaming Pipeline
⭐
7
Real-time text classification based on Kafka and Spark.
Openmrs Etl
⭐
7
openmrs - mysql - debezium - kafka - spark - scala
Snyk Tekton
⭐
7
A set of Tekton Tasks for using Snyk to check for vulnerabilities in your pipelines
Platform Etl Backend
⭐
7
Spark Vs
⭐
7
Structure-Based Virtual Screening in Spark
Morenlp
⭐
6
Capabilities of StanfordNLP and OpenNLP on Spark
Arc Jupyter
⭐
6
Arc-Jupyter is an interactive Jupyter Notebooks extension for building Arc data pipelines.
Scala Pipeline
⭐
6
Pipeline Pattern implementation in Scala
Insight18b Sparksql Array
⭐
6
Repo of my Insight project. Extended SparkSQL functionality internally and tested its performance against UDFs. Additionally, implemented a batch pipeline HDFS->SparkSQL->MySQL->Flask and a streaming pipeline Kafka->Spark Streaming->MySQL->Flask to analyze Amazon User Data.
Fa18 Smartnic
⭐
6
SmartNIC
Sbt Pipeline.playframework.g8
⭐
5
template for playframework projects
Genetics Pipe
⭐
5
Spark Smile
⭐
5
Integrating SMILE and Spark
Mambo
⭐
5
A simple in-memory, configuration driven, data processing pipeline for Apache Spark.
Daice_databrickssparkdevops
⭐
5
A set of example build and release pipelines for deploying Python and Scala to Azure Databricks and HDInsight
Streaming Data Pipeline
⭐
5
Streaming pipeline repo for data engineering training program
Related Searches
Python Pipeline (4,199)
Scala Sbt (4,178)
Scala Spark (3,279)
Scala Akka (2,120)
Java Scala (1,794)
Javascript Pipeline (1,369)
Scala Play Framework (1,309)
Pipeline Jenkins (1,150)
Shell Pipeline (1,143)
Plugin Scala (1,079)
Copyright 2018-2024 Awesome Open Source. All rights reserved.