Tablesaw

License: Apache 2.0

Overview

Tablesaw is a dataframe and visualization library that supports loading, cleaning, transforming, filtering, and summarizing data. If you work with data in Java, it may save you time and effort. Tablesaw also supports descriptive statistics and can be used to prepare data for machine learning libraries such as Smile, Tribuo, H2O.ai, and DL4J.

Tablesaw features

Data processing & transformation

  • Import data from RDBMS, Excel, CSV, TSV, JSON, HTML, or Fixed Width text files, whether they are local or remote (http, S3, etc.)
  • Export data to CSV, JSON, HTML, or Fixed Width files
  • Combine tables by appending or joining
  • Add and remove columns or rows
  • Sort, Group, Filter, Edit, Transpose, etc.
  • Map/Reduce operations
  • Handle missing values
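
A few of the operations listed above can be sketched roughly as follows. This is a minimal illustration, not code from the project: the class name `FeatureSketch`, the table name `storms`, the column names, and the sample values are all made up, and it assumes `tablesaw-core` is on the classpath.

```java
import static tech.tablesaw.aggregate.AggregateFunctions.mean;

import tech.tablesaw.api.DoubleColumn;
import tech.tablesaw.api.StringColumn;
import tech.tablesaw.api.Table;

public class FeatureSketch {

    // Build a small in-memory table; the data is invented for illustration.
    static Table storms() {
        StringColumn state = StringColumn.create("state", "TX", "OK", "TX", "KS");
        DoubleColumn width = DoubleColumn.create("width", 10.0, 50.0, 30.0, 20.0);
        return Table.create("storms", state, width);
    }

    // Filter: keep only rows whose width exceeds the threshold.
    static Table widerThan(Table t, double threshold) {
        return t.where(t.doubleColumn("width").isGreaterThan(threshold));
    }

    // Group and reduce: mean width per state.
    static Table meanWidthByState(Table t) {
        return t.summarize("width", mean).by("state");
    }

    public static void main(String[] args) {
        Table t = storms();

        // Filter, then sort the result from widest to narrowest.
        System.out.println(widerThan(t, 15.0).sortDescendingOn("width"));

        // One summary row per state.
        System.out.println(meanWidthByState(t));

        // Combine tables by appending; append modifies t in place.
        t.append(storms());
        System.out.println(t.rowCount());
    }
}
```

Filtering and summarizing return new tables, while `append` changes the receiving table, so build a fresh table if you need the original afterwards.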

Visualization

Tablesaw supports data visualization by providing a wrapper for the Plot.ly JavaScript plotting library. Here are a few examples of the new library in action.

[Images: a grid of example charts, each captioned "Tornadoes"]

Statistics

  • Descriptive stats: mean, min, max, median, sum, product, standard deviation, variance, percentiles, geometric mean, skewness, kurtosis, etc.
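
As a rough sketch of how these statistics might be read off a numeric column: the class name `StatsSketch`, the column name `values`, and the sample numbers are assumptions made for the example, and `tablesaw-core` is assumed to be on the classpath.

```java
import tech.tablesaw.api.DoubleColumn;

public class StatsSketch {

    // Invented sample data, used only to have something to summarize.
    static DoubleColumn sample() {
        return DoubleColumn.create("values", 2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0);
    }

    public static void main(String[] args) {
        DoubleColumn c = sample();

        // Location and range
        System.out.println("mean     = " + c.mean());
        System.out.println("median   = " + c.median());
        System.out.println("min      = " + c.min());
        System.out.println("max      = " + c.max());
        System.out.println("sum      = " + c.sum());

        // Spread and shape
        System.out.println("std dev  = " + c.standardDeviation());
        System.out.println("variance = " + c.variance());
        System.out.println("skewness = " + c.skewness());
        System.out.println("kurtosis = " + c.kurtosis());

        // Other summaries
        System.out.println("geo mean = " + c.geometricMean());
        System.out.println("p90      = " + c.percentile(90));

        // summary() collects several of these into a small table.
        System.out.println(c.summary());
    }
}
```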

Getting started

Add tablesaw-core to your project as a Maven dependency. You can find the version number for the latest release in the release notes:

<dependency>
    <groupId>tech.tablesaw</groupId>
    <artifactId>tablesaw-core</artifactId>
    <version>VERSION_NUMBER_GOES_HERE</version>
</dependency>

You may also add supporting projects:

  • tablesaw-beakerx - for using Tablesaw inside BeakerX
  • tablesaw-excel - for using Excel workbooks
  • tablesaw-html - for using HTML
  • tablesaw-json - for using JSON
  • tablesaw-jsplot - for creating charts

External supporting projects - outside of this organization:

Documentation and support

Integrations

Jupyter Notebooks

Other integrations
