Awesome Open Source
Search results for jupyter notebook parquet
27 search results found
Quilt
⭐
1,299
Quilt is a data mesh for connecting people with actionable data
Kglab
⭐
518
Graph Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, NetworkX, RAPIDS, RDFlib, pySHACL, PyVis, morph-kgc, pslpython, pyarrow, etc.
Lonboard
⭐
237
Python library for fast, interactive geospatial vector data visualization in Jupyter.
Rumble
⭐
194
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
D6tstack
⭐
166
Quickly ingest messy CSV and XLS files. Export to clean pandas, SQL, parquet
Nyc Transport
⭐
144
A Unified Database of NYC transport (subway, taxi/Uber, and citibike) data.
Scientificsummarizationdatasets
⭐
88
Datasets I have created for scientific summarization, and a trained BertSum model
Snowset
⭐
41
Snowflake dataset containing statistics for 70 million queries over a 14-day period
Functions
⭐
35
MLRun template functions and examples
Chicago Crimes
⭐
25
Exploring Chicago crimes dataset with Jupyter notebooks, DuckDB, Malloy and new Panel/PyScript data and dashboard tools.
Perspective Parquet
⭐
24
Parquet file reader and editor in JupyterLab, built with `perspective` for pivoting, filtering, aggregating, etc.
Sql Based Etl With Apache Spark On Amazon Eks
⭐
23
A solution that provides declarative data processing capability, and workflow orchestration automation to help your business users (such as analysts and data scientists) access their data and create meaningful insights without the need for manual IT processes.
Pycon2016
⭐
13
Code and Presentation for PyCon2016
Pudl Examples
⭐
13
Example Jupyter notebooks hosted on Kaggle that demonstrate how to work with US energy data from PUDL.
Infoflow
⭐
12
An Apache Spark implementation of the InfoMap community detection algorithm
Vector Io
⭐
11
Use the universal VDF format for vector datasets to easily export and import data from all vector databases
Chicagocrimes
⭐
11
Exploring public Chicago crimes data set in Python
Pyspark Dataframe Made Easy
⭐
10
PySpark DataFrame made easy
Bitcoin Insights
⭐
8
Nanodoc
⭐
8
RNA modification detection using Nanopore raw reads with Deep One Class classification
Spark Streaming Twitter
⭐
7
Building a pipeline to process real-time data using Spark and MongoDB.
Azure Sql Db Databricks
⭐
7
Azure SQL and Databricks samples and best practices for loading data quickly and efficiently
Spark For Noobs By A Noob
⭐
7
Jupyter notebooks for learning PySpark
Decisiveworkflowresearch
⭐
6
Workbooks for the Discord channel
Computeai Integrations
⭐
6
The Most Efficient and Scalable SQL Analytics Platform
Bigdata Platform
⭐
6
End-to-end big data project that aims to show how to implement the different big data layers, from the infrastructure layer to the end-user layer. [HADOOP][Spark][Kafka][Cassandra][Ansible][Jupyter]
Iota
⭐
6
Strava Spark
⭐
6
Analyzing my Strava history with Spark
Schema_evolution_exploration
⭐
5
Explore schema evolution using parquet and Spark or Presto
Bids2table
⭐
5
Efficiently index large-scale BIDS neuroimaging datasets and derivatives
Genomic Bigdata Spark
⭐
5
Genomic BigData Warehousing with Apache Spark and LakeHouse Architecture
Copyright 2018-2024 Awesome Open Source. All rights reserved.