Awesome Open Source
Search results for apache pyspark
46 search results found
Spark Nlp
⭐
3,578
State of the Art Natural Language Processing
Awesome Spark
⭐
1,461
A curated list of awesome Apache Spark packages and resources.
Spark Standalone Cluster On Docker
⭐
311
Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ⚡
Learningapachespark
⭐
233
LearningApacheSpark
Joblib Spark
⭐
226
Joblib Apache Spark Backend
Pyspark Stubs
⭐
116
Apache (Py)Spark type annotations (stub files).
Spark Df Profiling
⭐
115
Create HTML profiling reports from Apache Spark DataFrames
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Pyspark Cookbook
⭐
76
PySpark Cookbook, published by Packt
Pyspark Twitter Stream Mining
⭐
63
Real-time Machine Learning with Apache Spark on Twitter Public Stream
Spark
⭐
60
Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References
Pyspark Setup Guide
⭐
54
A guide for setting up Spark + PySpark under Ubuntu Linux
Data_processing_course
⭐
53
Some class materials for a data processing course using PySpark
Sparkora
⭐
46
Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟
Dlsa
⭐
33
Distributed least squares approximation (dlsa) implemented with Apache Spark
Docker Pyspark
⭐
28
Docker image of Apache Spark with its Python interface, pyspark.
Isarn Sketches Spark
⭐
27
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Odsc_india_2018
⭐
26
My presentation at ODSC India 2018 about Deep Learning with Apache Spark
Spark Tdd Example
⭐
20
A simple Spark TDD example
Spark Sframe
⭐
19
This project contains the code to translate between Apache Spark and SFrame.
Spark Sparql Connector
⭐
17
spark-sparql-connector
Setup Spark
⭐
17
✨ Setup Apache Spark in GitHub Action workflows
Sparklanes
⭐
16
A lightweight data processing framework for Apache Spark
Django Libspark
⭐
16
Apache Spark API for Django
Spark Fits
⭐
15
FITS data source for Spark SQL and DataFrames
Listenbrainz Labs
⭐
15
A collection of tools/scripts to explore the ListenBrainz data using Apache Spark.
Pyspark S3 Parquet Example
⭐
13
This repo demonstrates how to load a sample Parquet-formatted file from an AWS S3 bucket. A Python job is then submitted to an Apache Spark instance running on AWS EMR, which uses a SQLContext to create a temporary table from a DataFrame. SQL queries can then be run against the temporary table.
Pyspark For Beginners
⭐
12
PySpark for Beginners by Packt Publishing
Sparkling Titanic
⭐
12
Training models with Apache Spark, PySpark for Titanic Kaggle competition
Bigdata Spark
⭐
12
BerkeleyX: CS100.1x, Introduction to Big Data with Apache Spark
Distributed Machine Learning
⭐
12
PySpark, Databricks, H2O, MLlib
Pyspark Atlas
⭐
11
PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection
Orange3 Spark
⭐
11
A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML
Pyspark Dataframe Made Easy
⭐
10
pyspark dataframe made easy
Sparkitecture
⭐
9
A collection of “cookbook-style” scripts for simplifying data engineering and machine learning in Apache Spark.
Livy Server Docker
⭐
7
Spark Tutorials
⭐
6
PySpark notebooks to learn Apache Spark (WIP)
Optimus Examples
⭐
6
Examples for Optimus, a data cleansing library for big data.
Datascience Playground
⭐
6
A scalable, cloud-ready environment for Data Science using Docker
Docker Spark Anaconda
⭐
5
Spark and Anaconda in Docker
Spark Streaming In Python
⭐
5
Apache Spark 3 - Structured Streaming Course Material
Spark Course
⭐
5
Nginx Log Analytics With Spark
⭐
5
Using Apache Spark (PySpark) to Analyze the Access Logfiles of Nginx
Apachespark Pyspark 2023
⭐
5
PySpark is a distributed data processing library for Python that processes large volumes of data on clusters using the Apache Spark framework, offering high performance and an integrated set of tools for analyzing and managing data at scale.
Machine Learning Pipeline Lr Pyspark
⭐
5
Power Plant ML Pipeline Application - Apache Spark
Spark Streaming
⭐
5
Twitter Spark Streaming using PySpark
Related Searches
Java Apache (4,331)
Php Apache (2,627)
Javascript Apache (1,555)
Python Apache (1,438)
Shell Apache (1,374)
Docker Apache (1,363)
Apache Spark (1,207)
Mysql Apache (961)
Spark Pyspark (773)
Scala Apache (707)
Copyright 2018-2024 Awesome Open Source. All rights reserved.