Awesome Open Source
Search results for spark dataframe
126 search results found
Koalas
⭐
3,291
Koalas: pandas API on Apache Spark
Ballista
⭐
2,244
Distributed compute platform implemented in Rust, and powered by Apache Arrow.
Graphframes
⭐
944
Mobius
⭐
943
C# and F# language binding and extensions to Apache Spark
Spark Redis
⭐
926
A connector for Spark that allows reading and writing to/from Redis cluster
Spark Daria
⭐
738
Essential Spark extensions and helper methods ✨😲
Datafusion
⭐
626
DataFusion has now been donated to the Apache Arrow project
Metorikku
⭐
536
A simplified, lightweight ETL Framework based on Apache Spark
Spark Avro
⭐
535
Avro Data Source for Apache Spark
Traceml
⭐
490
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
Shc
⭐
484
The Apache Spark - Apache HBase Connector is a library to support Spark accessing HBase table as external data source or sink.
Spark Scala Examples
⭐
443
This project provides Apache Spark SQL, RDD, DataFrame and Dataset examples in Scala language
Spark Solr
⭐
440
Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
Spark Excel
⭐
421
A Spark plugin for reading and writing Excel files
Ballista
⭐
411
Experimental Distributed Compute Platform based on Kubernetes and Apache Arrow
Learningspark
⭐
406
Scala examples for learning to use Spark
Spark Fast Tests
⭐
385
Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
Datacompy
⭐
339
Pandas and Spark DataFrame comparison for humans and more!
Sparkflow
⭐
301
Easy to use library to bring Tensorflow on Apache Spark
Neo4j Spark Connector
⭐
300
Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs
Spark Hbase Connector
⭐
287
Connect Spark to HBase for reading and writing data with ease
Nimdata
⭐
276
DataFrame API written in Nim, enabling fast out-of-core data processing
Geni
⭐
268
A Clojure dataframe library that runs on Spark
Pyspark Style Guide
⭐
264
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
Rust Dataframe
⭐
250
A Rust DataFrame implementation, built on Apache Arrow
Sql Spark Connector
⭐
242
Apache Spark Connector for SQL Server and Azure SQL
Rasterframes
⭐
226
Geospatial Raster support for Spark DataFrames
Abris
⭐
215
Avro SerDe for Apache Spark structured APIs.
Isolation Forest
⭐
211
A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
Rumble
⭐
194
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Ddf
⭐
160
Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data Engine
Spark Binlog
⭐
153
A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
Data Algorithms With Spark
⭐
151
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Pyspark Cheatsheet
⭐
140
PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster
Apache Spark Node
⭐
134
Node.js bindings for Apache Spark DataFrame APIs
Handyspark
⭐
129
HandySpark - bringing pandas-like capabilities to Spark dataframes
Aut
⭐
128
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Pulsar Spark
⭐
103
Spark Connector to read and write with Pulsar
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Spark Llap
⭐
82
Spark Highcharts
⭐
80
Support Highcharts in Apache Zeppelin
Mleap
⭐
76
MLeap allows for easily putting Spark ML pipelines into production
Dataframecheatsheet
⭐
74
Cheatsheet for Spark DataFrame
Doric
⭐
73
Type safety for Spark columns
Spark Sftp
⭐
69
Spark connector for SFTP
Spark Bigquery
⭐
69
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Net.jgp.labs.spark
⭐
63
Apache Spark examples exclusively in Java
Learn Spark
⭐
60
Examples To Help You Learn Apache Spark
Vectorpipe
⭐
60
Convert Vector data to VectorTiles with GeoTrellis.
Sparkgbm
⭐
57
Spark-based GBM
Delta Plus
⭐
56
A library based on delta for Spark and MLSQL
Spark Tutorial
⭐
55
This tutorial provides a quick introduction to using Spark
Big_data
⭐
55
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark.
Spark Salesforce
⭐
54
Spark data source for Salesforce
Spark Stringmetric
⭐
50
Spark functions to run popular phonetic and string matching algorithms
Spark Json Schema
⭐
50
JSON schema parser for Apache Spark
Facets Overview Spark
⭐
48
Spark Implementation of Google Facets Overview https://github.com/PAIR-code/facets
Spark Nkp
⭐
47
Natural Korean Processor for Apache Spark
Spark Hive Udf
⭐
47
Example project showing how to use Hive UDFs in Apache Spark
Spark Google Spreadsheets
⭐
46
Google Spreadsheets datasource for SparkSQL and DataFrames
Megasparkdiff
⭐
46
A Spark-based data comparison tool at scale that helps software development engineers compare many pair combinations of possible data sources. Multiple execution modes in multiple environments let the user generate a diff report as a Java/Scala-friendly DataFrame or as a file for future use. Comes with out-of-the-box SparkFactory and SparkCompare tools.
Struct Type Encoder
⭐
44
Deriving Spark DataFrame schemas from case classes
Spark Dataframe Introduction
⭐
42
An introduction to Apache Spark DataFrames.
Flowml
⭐
41
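A pipeline-based machine learning framework written in Scala and Java; a one-stop automated machine learning platform covering data analysis, feature engineering, machine learning models, automatic deployment, hyperparameter optimization, automatic model optimization, and automatic scaling and resource allocation, similar to 4Paradigm, Alibaba PAI, AutoML, and Amazon SageMaker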
Sparkoptics
⭐
40
Optics for Spark DataFrames
Spark And Python For Big Data With Pyspark
⭐
38
Course on Udemy by Jose Portilla
Spark Hadoopoffice Ds
⭐
37
A Spark datasource for the HadoopOffice library
Spark Tools
⭐
35
Spark In Practice
⭐
34
Getting started with Spark, Spark Streaming, Spark SQL, DataFrame
Dx
⭐
34
Data Explorer for Python
Spark In Practice Scala
⭐
33
Getting started with Spark, Spark streaming, Spark SQL and DataFrame.
Pyspark Algorithms
⭐
33
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Spark Flow
⭐
32
Library for organizing batch processing pipelines in Apache Spark
Hivemall Spark
⭐
31
A Hivemall wrapper for Spark
Spark Hats
⭐
29
Nested array transformation helper extensions for Apache Spark
Isarn Sketches Spark
⭐
27
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Spark Cloudant
⭐
27
Cloudant integration with Spark as Spark SQL external datasource
Ggplot2.sparkr
⭐
26
Rebooting ggplot2 for scalable big data visualization
Spark Hive Streaming Sink
⭐
26
A sink to save Spark Structured Streaming DataFrame into Hive table
Chronicler Spark
⭐
25
InfluxDB connector to Apache Spark on top of Chronicler
Movies Analytics In Spark And Scala
⭐
24
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Mleap
⭐
23
R Interface to MLeap
Pbspark
⭐
17
protobuf pyspark conversion
Featuretoolsonspark
⭐
16
A simplified version of featuretools for Spark
Spark Vcf
⭐
15
Spark VCF data source implementation for Dataframes
Spark Hive Streaming Sink
⭐
15
A sink to save Spark Structured Streaming DataFrame into Hive table
Tiledb Spark
⭐
15
Spark interface to the TileDB storage manager
Sparkphoenix
⭐
14
Spark Example using Phoenix to interact with HBase
Sparklingwater
⭐
14
Sparkling Water for R
Mison
⭐
14
Implementing MISON by Microsoft in C++ as a test
Spark To Tableau
⭐
14
Spark to Tableau Extractor library
Spark Meta
⭐
14
Spark data profiling utilities
Blog
⭐
13
Pyspark Ml Examples
⭐
13
Spark ML Tutorial and Examples for Beginners
Vectordisassembler
⭐
12
Orange3 Spark
⭐
11
A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML
Spark Example
⭐
11
Examples for Spark 1.6 and Spark 2.2, including kafka, flume, structuredstrea
Greenplum Spark Connector
⭐
11
Example of using greenplum-spark connector
Net.jgp.books.spark.ch03
⭐
10
Spark in Action, 2nd edition - chapter 3
Pyspark Dataframe Made Easy
⭐
10
PySpark DataFrame made easy
1-100 of 126 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.