Awesome Open Source

Santander Value Prediction Challenge: Open Solution

Join the chat at https://gitter.im/minerva-ml/open-solution-value-prediction

This is an open solution to the Santander Value Prediction Challenge 😃

More competitions 🎇

Check the collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments and outputs.

Our goals

We are building an entirely open solution to this competition. Specifically:

  1. Learning from the process: updates about new ideas, code and experiments are the best way to learn data science. Our activity is especially useful for people who want to enter the competition but lack appropriate experience.
  2. Encouraging more Kagglers to start working on this competition.
  3. Delivering an open source solution with no strings attached. The code is available on our GitHub repository 💻. This solution should establish a solid benchmark and provide a good base for your custom ideas and experiments. We care about clean code 😃
  4. Opening our experiments as well: everybody can have a live preview of our experiments, parameters, code, etc. Check: Santander-Value-Prediction-Challenge 📈 and the screens below.
LightGBM train and validation performance on folds 📊 (figure: train-validation-results-on-folds)
LightGBM experiment logged values 📊 (figure: LightGBM-learning-curves)

Disclaimer

In this open source solution you will find references to neptune.ml. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script 😉.

How to start?

Learn more about our solutions

  1. Check the Kaggle discussion for the most recent updates and comments.
  2. Read the Wiki pages, where we describe the solutions in more detail. Click on the tropical fish to get started 🐠 or pick a solution from the table below.
| link to code | name | CV | LB | link to the description |
|---|---|---|---|---|
| solution 1 | honey bee 🐝 | 1.39 | 1.43 | LightGBM and 5-fold CV |
| solution 2 | beetle 🐞 | 1.60 | 1.77 | LightGBM on binarized dataset |
| solution 3 | dromedary camel 🐪 | 1.35 | 1.41 | LightGBM with row aggregations |
| solution 4 | whale 🐳 | 1.3416 | 1.41 | LightGBM on dimension reduced dataset |
| solution 5 | water buffalo 🐃 | 1.336 | 1.39 | Exploring various dimension reduction techniques |
| solution 6 | blowfish 🐡 | 1.333 | 1.38 | Bucketing row aggregations |
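
To help read the table: the competition is scored with root mean squared logarithmic error (RMSLE), which is presumably what the CV and LB columns report. The sketch below, in plain Python with no dependencies, shows that metric together with the kind of per-row aggregation that solution 3 refers to. The function names `rmsle` and `row_aggregations`, the choice of statistics, and the decision to aggregate over non-zero values only are illustrative assumptions, not code taken from this repository.

```python
import math
from statistics import mean


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(mean(squared_log_errors))


def row_aggregations(row):
    """Summarise one row of raw feature values.

    Assumption: statistics are computed over the non-zero entries only,
    which is a common choice for sparse data. The actual solution may differ.
    """
    non_zero = [value for value in row if value != 0]
    if not non_zero:
        return {"non_zero_count": 0, "sum": 0.0, "mean": 0.0, "min": 0.0, "max": 0.0}
    return {
        "non_zero_count": len(non_zero),
        "sum": float(sum(non_zero)),
        "mean": float(mean(non_zero)),
        "min": float(min(non_zero)),
        "max": float(max(non_zero)),
    }


if __name__ == "__main__":
    # Hypothetical row with two non-zero values.
    print(row_aggregations([0, 0, 400000.0, 0, 1200000.0]))
    print(rmsle([100.0, 1000.0], [110.0, 900.0]))
```

Aggregations like these would be appended to each row as extra columns before the model is trained; a lower RMSLE is better.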

Start experimenting with ready-to-use code

You can jump-start your participation in the competition by using our starter pack. The installation instructions below will guide you through the setup.

Installation (fast track)

  1. Clone the repository and install the requirements (check requirements.txt)
  2. Register at neptune.ml (if you wish to use it)
  3. Run the experiment:

🔱

neptune run --config neptune_random_search.yaml main.py train_evaluate_predict --pipeline_name SOME_NAME

🐍

python main.py -- train_evaluate_predict --pipeline_name SOME_NAME
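
The command name `train_evaluate_predict` suggests three stages run in sequence. As a rough illustration only, the sketch below shows how such a flow could be organised around K-fold cross-validation: train one model per fold, score it on the held-out fold, then average the fold models for the test prediction. Everything here is an assumption for illustration: the function names are hypothetical, and the "model" is a stand-in (the mean of the log-transformed targets) used in place of LightGBM so the example runs with the standard library alone.

```python
import math
import random
from statistics import mean


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and split them into n_folds validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def train(log_targets):
    """Stand-in 'model' (not LightGBM): the mean of the log-transformed targets."""
    return mean(log_targets)


def train_evaluate_predict(targets, n_test_rows, n_folds=5):
    """Train per fold, score out-of-fold, and average the fold models for the test set."""
    log_targets = [math.log1p(t) for t in targets]
    fold_scores, fold_models = [], []
    for valid_idx in k_fold_indices(len(targets), n_folds):
        valid_set = set(valid_idx)
        train_part = [y for i, y in enumerate(log_targets) if i not in valid_set]
        model = train(train_part)
        # RMSE in log1p space equals RMSLE on the original target scale.
        errors = [(model - log_targets[i]) ** 2 for i in valid_idx]
        fold_scores.append(math.sqrt(mean(errors)))
        fold_models.append(model)
    # Average the fold models in log space, then map back to the target scale.
    test_prediction = math.expm1(mean(fold_models))
    return mean(fold_scores), [test_prediction] * n_test_rows
```

Calling `train_evaluate_predict(targets, n_test_rows=3)` returns the mean out-of-fold score and a list of test predictions. A real pipeline would replace `train` with a fitted gradient boosting model and predict per row.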

Installation (step by step)

  1. Clone this repository
git clone https://github.com/minerva-ml/open-solution-value-prediction.git
  2. Install the requirements in your Python 3 environment
pip3 install -r requirements.txt
  3. Register at neptune.ml (if you wish to use it)
  4. Update the data directories in the neptune.yaml configuration file
  5. Run the experiment:

🔱

neptune login
neptune run --config neptune_random_search.yaml main.py train_evaluate_predict --pipeline_name SOME_NAME

🐍

python main.py -- train_evaluate_predict --pipeline_name SOME_NAME
  6. Collect the submission from the experiment_directory specified in neptune.yaml
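
If the experiment directory accumulates several runs, a small helper can locate the newest submission file. This is a minimal sketch under stated assumptions: the submission is assumed to be a CSV file somewhere under the experiment directory, and `find_latest_submission` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def find_latest_submission(experiment_dir, pattern="*.csv"):
    """Return the most recently modified file matching pattern, or None.

    Assumption: the submission is written as a CSV file somewhere under
    experiment_dir; adjust the pattern if the pipeline names it differently.
    """
    candidates = [p for p in Path(experiment_dir).rglob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, `find_latest_submission("path/to/experiment_directory")` returns a `Path` to the newest CSV, where the argument is a placeholder for the value set in neptune.yaml.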

Get involved

You are welcome to contribute your code and ideas to this open solution. To get started:

  1. Check the competition project on GitHub to see what we are working on right now.
  2. Express your interest in a particular task by writing a comment in that task, or by creating a new one with your fresh idea.
  3. We will get back to you quickly in order to start working together.
  4. Check CONTRIBUTING for more information.

User support

There are several ways to seek help:

  1. The Kaggle discussion is our primary channel of communication.
  2. Read the project's Wiki, where we publish descriptions of the code, pipelines and supporting tools such as neptune.ml.
  3. Submit an issue directly in this repo.
