ml-ease
ml-ease is LinkedIn's open-source large-scale machine learning library; it currently provides ADMM-based large-scale logistic regression.
Copyright 2014 LinkedIn Corporation. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 . Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. The license information on third-party code is included in NOTICE.
ADMM stands for Alternating Direction Method of Multipliers (Boyd et al. 2011). The basic idea is as follows: ADMM treats large-scale logistic regression model fitting as a convex optimization problem with constraints: while minimizing the user-defined loss function, it enforces an extra constraint that the coefficients from all partitions must be equal. To solve this optimization problem, ADMM uses an iterative process. In each iteration it partitions the big data into many small partitions and fits an independent logistic regression on each partition. It then aggregates the coefficients collected from all partitions, learns the consensus coefficients, and sends them back to all partitions for retraining. The algorithm is guaranteed to converge; after 10-20 iterations it ends up with a solution that is theoretically close to what you would have obtained by training on a single machine.
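The iteration described above can be sketched in a few lines of Python. This is a toy, single-process illustration of consensus ADMM, not the actual ml-ease Hadoop implementation: all function and parameter names here are made up for illustration, and an L2 (ridge) penalty is used for the consensus step to keep the update in closed form.

```python
import numpy as np
from scipy.optimize import minimize

def local_loss(w, X, y, z, u, rho):
    """Partition-level objective: logistic loss (labels y in {-1, +1})
    plus the augmented-Lagrangian penalty pulling w toward the consensus z."""
    margin = X @ w
    nll = np.sum(np.logaddexp(0.0, -y * margin))  # stable log(1 + exp(-y*m))
    return nll + 0.5 * rho * np.sum((w - z + u) ** 2)

def admm_logistic(partitions, dim, rho=1.0, lam=0.1, iters=15):
    """partitions: list of (X, y) chunks of the full dataset."""
    K = len(partitions)
    ws = [np.zeros(dim) for _ in range(K)]  # per-partition coefficients
    us = [np.zeros(dim) for _ in range(K)]  # per-partition dual variables
    z = np.zeros(dim)                       # consensus coefficients
    for _ in range(iters):
        # 1) fit each partition independently (done in parallel in a real system)
        for k, (X, y) in enumerate(partitions):
            ws[k] = minimize(local_loss, ws[k], args=(X, y, z, us[k], rho)).x
        # 2) aggregate: consensus update with L2 shrinkage lam * ||z||^2
        w_bar = np.mean([w + u for w, u in zip(ws, us)], axis=0)
        z = (K * rho) / (2.0 * lam + K * rho) * w_bar
        # 3) dual update enforces the constraint w_k == z over the iterations
        for k in range(K):
            us[k] = us[k] + ws[k] - z
    return z
```

After the loop, `z` plays the role of the single converged coefficient vector that all partitions agree on.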
Each feature is represented as a name/term/value triple; for example:
name="age"
term="[10,20]"
value=1.0
Record 1:
{
"response" : 0,
"features" : [
{
"name" : "7",
"term" : "33",
"value" : 1.0
}, {
"name" : "8",
"term" : "151",
"value" : 1.0
}, {
"name" : "3",
"term" : "0",
"value" : 1.0
}, {
"name" : "12",
"term" : "132",
"value" : 1.0
}
],
"weight" : 1.0,
"offset" : 0.0,
"foo" : "whatever"
}
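To make the roles of "weight", "offset", and the feature triples concrete, here is a minimal sketch (not ml-ease code; the function name is illustrative) of how one such record contributes to the logistic-regression log-likelihood: "offset" is added to the linear score, and "weight" multiplies the record's log-likelihood term.

```python
import math

def record_log_likelihood(record, coefficients):
    """coefficients: dict mapping (name, term) -> learned coefficient."""
    # linear score = offset + sum of coefficient * value over the features
    score = record["offset"] + sum(
        coefficients.get((f["name"], f["term"]), 0.0) * f["value"]
        for f in record["features"]
    )
    p = 1.0 / (1.0 + math.exp(-score))  # predicted P(response = 1)
    y = record["response"]
    return record["weight"] * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
```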
key:string,
model:[{name:string, term:string, value:float}]
The "key" column stores the lambda (regularization) value, and "model" stores the model output for that lambda. "name" + "term" again represents a feature string, and "value" is the learned coefficient. Note that the first element of the model is always "(INTERCEPT)", i.e. the intercept. Below is a sample of the learned models for lambda = 1.0 and 2.0:
Record 1:
{
"key" : "1.0",
"model" : [ {
"name" : "(INTERCEPT)",
"term" : "",
"value" : -2.5
}, {
"name" : "7",
"term" : "33",
"value" : 0.98
}, {
"name" : "8",
"term" : "151",
"value" : 0.34
}, {
"name" : "3",
"term" : "0",
"value" : -0.4
}, {
"name" : "12",
"term" : "132",
"value" : -0.3
} ]
}
Record 2:
{
"key" : "2.0",
"model" : [ {
"name" : "(INTERCEPT)",
"term" : "",
"value" : -2.5
}, {
"name" : "7",
"term" : "33",
"value" : 0.83
}, {
"name" : "8",
"term" : "151",
"value" : 0.32
}, {
"name" : "3",
"term" : "0",
"value" : -0.3
}, {
"name" : "12",
"term" : "132",
"value" : -0.1
} ]
}
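One model record like the samples above is enough to score new examples. The sketch below (illustrative only, not part of ml-ease) turns a record into a coefficient lookup table, separates out the "(INTERCEPT)" entry, and applies the logistic function:

```python
import math

def score(model_record, features):
    """Return P(response = 1) for a feature list under one model record."""
    # index the learned coefficients by their (name, term) feature key
    coefs = {(m["name"], m["term"]): m["value"] for m in model_record["model"]}
    s = coefs.pop(("(INTERCEPT)", ""), 0.0)  # intercept has an empty term
    s += sum(coefs.get((f["name"], f["term"]), 0.0) * f["value"]
             for f in features)
    return 1.0 / (1.0 + math.exp(-s))
```

For example, scoring the training record shown earlier against the lambda = 1.0 model sums -2.5 + 0.98 + 0.34 - 0.4 - 0.3 = -1.88 and passes it through the sigmoid.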
This tool was developed by the Applied Relevance Science team at LinkedIn. People who contributed to this tool include: