FireAnt - Analytics and Reporting

fireant is a data analysis tool used for quickly building charts, tables, reports, and dashboards. It defines a schema for configuring metrics and dimensions, which removes most of the legwork of writing queries and formatting charts. fireant also works well in Jupyter notebooks and in the Python shell, providing quick and easy access to your data.

Read more at http://fireant.readthedocs.io/en/latest/

Installation

To install fireant, run the following command in the terminal:

pip install fireant

Introduction

fireant arose out of an environment where several different teams, each working with data sets that often overlapped, were individually building their own dashboard platforms. fireant was developed as a centralized way of building dashboards without the legwork.

fireant is used to configure data sets with the DataSet class. A DataSet is backed by a database table containing analytics and defines a set of Field objects. A Field can be used to group data by a property, such as a timestamp, an account, or a device type, or to render a quantifier, such as clicks, ROI, or conversions, in a widget such as a chart or table.

A DataSet exposes a rich builder API that allows a wide range of queries to be constructed and rendered as several kinds of widgets. A DataSet can be used directly in a Jupyter notebook, eliminating the need to write repetitive custom queries and rendering code for visualizations.

Data Sets

The DataSet is the core component of fireant. A DataSet is a representation of a data set and is used to execute queries and transform result sets into widgets such as charts or tables.

A DataSet requires only a few definitions: a database connector, a database table, join tables, and fields for dimensions and metrics. Dimension and metric definitions tell fireant how to query the data and use it in widgets. Once a data set is created, its query API can be used to build queries in just a few lines of code by selecting which dimensions and metrics to use and how to filter the data.
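To make the dimension and metric distinction concrete, here is a minimal sketch using only Python's standard library. It does not use fireant and is not the SQL fireant generates; the in-memory table and its rows are made up for illustration. A dimension corresponds to a column the data is grouped by, and a metric corresponds to an aggregate computed per group.

```python
import sqlite3

# Hypothetical in-memory table standing in for an `analytics` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE analytics (device_type TEXT, clicks INTEGER)")
conn.executemany(
    "INSERT INTO analytics VALUES (?, ?)",
    [("desktop", 10), ("mobile", 4), ("desktop", 5), ("mobile", 6)],
)

# `device_type` plays the role of a dimension (a GROUP BY column);
# SUM(clicks) plays the role of a metric (an aggregate expression).
rows = conn.execute(
    "SELECT device_type, SUM(clicks) AS clicks "
    "FROM analytics GROUP BY device_type ORDER BY device_type"
).fetchall()

print(rows)  # [('desktop', 15), ('mobile', 10)]
```

Writing this kind of query by hand for every chart is the repetitive work that field definitions are meant to replace.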

Instantiating a Data Set

from fireant.dataset import *
from fireant.database import VerticaDatabase
from pypika import Tables, functions as fn

vertica_database = VerticaDatabase(user='myuser', password='mypassword')
analytics, customers = Tables('analytics', 'customers')

my_dataset = DataSet(
    database=vertica_database,
    table=analytics,
    fields=[
        Field(
            # Non-aggregate definition
            alias='customer',
            definition=customers.id,
            label='Customer'
        ),
        Field(
            # Date/Time type, also non-aggregate
            alias='date',
            definition=analytics.timestamp,
            type=DataType.date,
            label='Date'
        ),
        Field(
            # Text type, also non-aggregate
            alias='device_type',
            definition=analytics.device_type,
            type=DataType.text,
            label='Device_type'
        ),
        Field(
            # Aggregate definition (The SUM function aggregates a group of values into a single value)
            alias='clicks',
            definition=fn.Sum(analytics.clicks),
            label='Clicks'
        ),
        Field(
            # Aggregate definition over an expression (sums the per-row spend divided by clicks)
            alias='customer-spend-per-clicks',
            definition=fn.Sum(analytics.customer_spend / analytics.clicks),
            type=DataType.number,
            label='Spend / Clicks'
        )
    ],
    joins=[
        Join(customers, analytics.customer_id == customers.id),
    ],
)

Building queries with a Data Set

Use the query property of a data set instance to start building a data set query. A data set query allows method calls to be chained together to select what should be included in the result.

This example uses the data set defined above.

from datetime import date

# WeekOverWeek is assumed to be exported from the top-level package like the other names
from fireant import Matplotlib, Pandas, WeekOverWeek, day

# Parentheses let the chained calls span multiple lines with comments in between
matplotlib_chart, pandas_df = (
    my_dataset.query
    .dimension(
        # Select the date dimension with a daily interval to group the data by day
        # Fields are referenced by `dataset.fields.{alias}`
        day(my_dataset.fields.date),

        # Select the device_type dimension to break the data down further by device
        my_dataset.fields.device_type,
    )
    .filter(
        # Limit the result set to data from the year 2018
        my_dataset.fields.date.between(date(2018, 1, 1), date(2018, 12, 31))
    )
    # Add a week over week reference to compare data to values from the week prior
    .reference(WeekOverWeek(my_dataset.fields.date))
    .widget(
        # Add a matplotlib chart widget
        Matplotlib()
        # Add axes with series to the chart
        .axis(Matplotlib.LineSeries(my_dataset.fields.clicks))

        # Aliases that are not valid Python identifiers are referenced by `dataset.fields['{alias}']`
        .axis(Matplotlib.ColumnSeries(
            my_dataset.fields['customer-spend-per-clicks']
        ))
    )
    .widget(
        # Add a pandas data frame table widget
        Pandas(
            my_dataset.fields.clicks,
            my_dataset.fields['customer-spend-per-clicks']
        )
    )
    .fetch()
)

# Display the chart
matplotlib_chart.plot()

# Display the data frame
print(pandas_df)
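The week over week reference in the query above compares each value with the value from the week prior. The following is a rough standard-library sketch of that idea only; it is not how fireant implements references, and the function name `week_over_week` and the sample click counts are hypothetical.

```python
from datetime import date, timedelta


def week_over_week(series):
    """Pair each day's value with the value from seven days earlier.

    `series` maps a date to a metric value. Days with no value one week
    earlier get None for both the reference value and the delta.
    """
    result = {}
    for day, value in sorted(series.items()):
        previous = series.get(day - timedelta(weeks=1))
        delta = None if previous is None else value - previous
        result[day] = (value, previous, delta)
    return result


# Made-up daily click counts for illustration.
clicks = {
    date(2018, 1, 1): 100,
    date(2018, 1, 2): 120,
    date(2018, 1, 8): 130,
    date(2018, 1, 9): 90,
}

comparison = week_over_week(clicks)
for day, row in comparison.items():
    print(day, row)
```

Each entry holds the current value, the value from one week earlier, and their difference, which is the kind of side-by-side comparison a reference adds to a chart or table.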

License

Copyright 2020 KAYAK Germany, GmbH

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Crafted with ♥ in Berlin.

