Blurr transforms structured, streaming raw data into features for model training and prediction using a high-level, expressive, YAML-based language called the Blurr Transform Spec (BTS). The BTS merges the schema and computation model for data processing.
A BTS is a data transform definition for structured data. It encapsulates the business logic of the data transforms, while Blurr orchestrates their execution. Blurr is runner-agnostic, so a BTS can be run by event processors such as Spark, Spark Streaming, or Flink.
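The separation between a declarative transform definition and the engine that executes it can be illustrated with a small plain-Python sketch. This is not Blurr's API and not BTS syntax: `SPEC`, `run`, and the event fields (`user_id`, `event`, `amount`) are hypothetical names chosen only for illustration. The spec holds the business logic (which fields exist and when they update), and the runner only walks the event stream.

```python
from collections import defaultdict

# Hypothetical declarative spec (NOT real BTS syntax): it names the identity
# to group by and, for each feature field, when and how it is updated.
SPEC = {
    "identity": lambda event: event["user_id"],
    "fields": {
        "games_played": {
            "initial": 0,
            "when": lambda event: event["event"] == "game_start",
            "update": lambda value, event: value + 1,
        },
        "total_spend": {
            "initial": 0.0,
            "when": lambda event: event["event"] == "purchase",
            "update": lambda value, event: value + event["amount"],
        },
    },
}


def run(spec, events):
    """Minimal stand-in runner: applies the spec to a stream of raw events.

    The runner knows nothing about the business logic; it only iterates
    over events and applies whatever the spec declares.
    """
    # One feature record per identity, created lazily from the initial values.
    state = defaultdict(
        lambda: {name: field["initial"] for name, field in spec["fields"].items()}
    )
    for event in events:
        features = state[spec["identity"](event)]
        for name, field in spec["fields"].items():
            if field["when"](event):
                features[name] = field["update"](features[name], event)
    return dict(state)


# Raw, structured events as they might arrive from a stream.
events = [
    {"user_id": "u1", "event": "game_start"},
    {"user_id": "u1", "event": "purchase", "amount": 1.5},
    {"user_id": "u2", "event": "game_start"},
    {"user_id": "u1", "event": "game_start"},
    {"user_id": "u1", "event": "purchase", "amount": 3.0},
]

features = run(SPEC, events)
```

Because the runner depends only on the shape of the spec, the same definition could in principle be handed to a different execution engine without changing the business logic, which is the idea behind being runner-agnostic.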
Blurr is for you if you are well on your way along the ML 'curve of enlightenment' and are thinking about how to do online scoring.
Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering --- Andrew Ng
Streaming BTS Tutorial | Window BTS Tutorial
Preparing data for specific use cases using Blurr:
Welcome to the Blurr community! We are so glad that you share our passion for building MLOps!
Please create a new issue to begin a discussion. Alternatively, feel free to pick up an existing issue!
Please sign the Contributor License Agreement before raising a pull request.
Inspired by the (old school) Joel Test to rate software teams, here's our version for data science teams. What's your score?
Blurr is currently in Developer Preview. Stay in touch: star this project or email [email protected]