Apachespark Pyspark 2023

PySpark is a Python library for distributed data processing that lets you process large volumes of data on clusters using the Apache Spark framework, offering high performance and an integrated set of tools for large-scale data analysis and handling.
Alternatives To Apachespark Pyspark 2023
| Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Spark | 37,661 | 5 months ago | 46 | May 09, 2021 | 186 | apache-2.0 | Scala | Apache Spark - A unified analytics engine for large-scale data processing |
| Cookbook | 12,557 | 6 months ago | | | 111 | apache-2.0 | | The Data Engineering Cookbook |
| God Of Bigdata | 8,483 | a year ago | | | 3 | | | Focused on big data study and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive... |
| Iceberg | 5,179 | 5 months ago | 3 | October 29, 2022 | 1,485 | apache-2.0 | Java | Apache Iceberg |
| Bigdl | 4,728 | 5 months ago | 16 | April 19, 2021 | 958 | apache-2.0 | Jupyter Notebook | Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using bigdl-llm |
| Sparkinternals | 4,665 | 3 years ago | | | 27 | | | Notes talking about the design and implementation of Apache Spark |
| Tensorflowonspark | 3,851 | a year ago | 32 | April 21, 2022 | 13 | apache-2.0 | Python | TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters. |
| Spark Nlp | 3,578 | 5 months ago | 134 | December 08, 2023 | 43 | apache-2.0 | Scala | State of the Art Natural Language Processing |
| Roaringbitmap | 3,308 | 5 months ago | 187 | September 22, 2023 | 89 | apache-2.0 | Java | A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Tablesaw, and many others |
| Koalas | 3,291 | 9 months ago | 47 | October 19, 2021 | 112 | apache-2.0 | Python | Koalas: pandas API on Apache Spark |