Wisconsin Breast Cancer

[ICMLSC 2018] On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset

On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset


Note: This repository is retired and will not be ported to use TF2. However, you may use this as a reference in doing so.

This paper was presented at the 2nd International Conference on Machine Learning and Soft Computing (ICMLSC), held in Phu Quoc Island, Vietnam, on February 2-4, 2018.

The full paper on this project may be read at arXiv.org.


This paper presents a comparison of six machine learning (ML) algorithms: GRU-SVM [4], Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor (NN) search, Softmax Regression, and Support Vector Machine (SVM) on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset [22] by measuring their classification test accuracy and their sensitivity and specificity values. The dataset consists of features computed from digitized images of fine needle aspirate (FNA) tests on a breast mass [22]. For the implementation of the ML algorithms, the dataset was partitioned as follows: 70% for the training phase and 30% for the testing phase. The hyper-parameters used for all the classifiers were manually assigned. Results show that all the presented ML algorithms performed well (all exceeded 90% test accuracy) on the classification task. The MLP algorithm stands out among the implemented algorithms with a test accuracy of ~99.04%. Lastly, the results are comparable with the findings of the related studies [18, 23].
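The 70/30 partition described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `split_dataset`, the shuffling step, and the seed are assumptions, not the repository's actual preprocessing code. The sample count of 569 is the size of the WDBC dataset; a 30% hold-out of that size gives 171 test samples, which agrees with the 171 data points listed for the nearest neighbor runs in Table 1.

```python
import random


def split_dataset(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test).

    Hypothetical helper for illustration; the shuffle and seed are
    assumptions and may differ from what the repository does.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# WDBC has 569 samples; indices stand in for the feature vectors here.
train, test = split_dataset(range(569))
print(len(train), len(test))  # 398 171
```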


To cite the paper, kindly use the following BibTeX entry:

 author = {Agarap, Abien Fred M.},
 title = {On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset},
 booktitle = {Proceedings of the 2nd International Conference on Machine Learning and Soft Computing},
 series = {ICMLSC '18},
 year = {2018},
 isbn = {978-1-4503-6336-5},
 location = {Phu Quoc Island, Viet Nam},
 pages = {5--9},
 numpages = {5},
 url = {http://doi.acm.org/10.1145/3184066.3184080},
 doi = {10.1145/3184066.3184080},
 acmid = {3184080},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {artificial intelligence, artificial neural networks, classification, linear regression, machine learning, multilayer perceptron, nearest neighbors, softmax regression, supervised learning, support vector machine, wisconsin diagnostic breast cancer dataset},

To cite the repository/software, kindly use the following BibTeX entry:

  author       = {Abien Fred Agarap},
  title        = {AFAgarap/wisconsin-breast-cancer: v0.1.0-alpha},
  month        = dec,
  year         = 2017,
  doi          = {10.5281/zenodo.1098533},
  url          = {https://doi.org/10.5281/zenodo.1098533}

Machine Learning (ML) Algorithms


All experiments in this study were conducted on a laptop computer with Intel Core(TM) i5-6300HQ CPU @ 2.30GHz x 4, 16GB of DDR3 RAM, and NVIDIA GeForce GTX 960M 4GB DDR5 GPU.

Figure 1. Training accuracy of the machine learning algorithms on breast cancer detection using WDBC.

Figure 1 shows the training accuracy of the ML algorithms:

(1) GRU-SVM finished its training in 2 minutes and 54 seconds with an average training accuracy of 90.6857639%.
(2) Linear Regression finished its training in 35 seconds with an average training accuracy of 92.8906257%.
(3) MLP finished its training in 28 seconds with an average training accuracy of 96.9286785%.
(4) Softmax Regression finished its training in 25 seconds with an average training accuracy of 97.366573%.
(5) L2-SVM finished its training in 14 seconds with an average training accuracy of 97.734375%.

There was no recorded training accuracy for Nearest Neighbor search since it does not require any training: the norm equations (L1 and L2) are applied directly to the dataset to determine the “nearest neighbor” of a given data point p_{i} ∈ p.
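To make the training-free nature of nearest neighbor search concrete, here is a minimal sketch in plain Python. It assumes a 1-nearest-neighbor rule in which a query point takes the label of the closest stored point under the L1 or L2 norm. The function names and the toy two-feature points are hypothetical and are not taken from the repository, which uses TensorFlow.

```python
def l1_distance(a, b):
    """L1 (Manhattan) norm of the difference between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """L2 (Euclidean) norm of the difference between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def nearest_neighbor_label(query, points, labels, distance):
    """Return the label of the stored point closest to `query`."""
    best = min(range(len(points)), key=lambda i: distance(query, points[i]))
    return labels[best]


# Toy data (made up for illustration): three stored points with labels.
points = [(1.0, 1.0), (4.0, 5.0), (0.0, 3.0)]
labels = ["benign", "malignant", "benign"]
query = (3.0, 4.0)

print(nearest_neighbor_label(query, points, labels, l1_distance))
print(nearest_neighbor_label(query, points, labels, l2_distance))
```

There is no fitting step: the stored points themselves act as the model, so the only cost is computing distances at prediction time.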

Table 1. Summary of experiment results on the machine learning algorithms.

| Parameter | GRU-SVM | Linear Regression | MLP | L1-NN | L2-NN | Softmax Regression | L2-SVM |
|---|---|---|---|---|---|---|---|
| Accuracy | 93.75% | 96.09375% | 99.038449585420729% | 93.567252% | 94.736844% | 97.65625% | 96.09375% |
| Data points | 384000 | 384000 | 512896 | 171 | 171 | 384000 | 384000 |
| Epochs | 3000 | 3000 | 3000 | 1 | 1 | 3000 | 3000 |
| FPR | 16.666667% | 10.204082% | 1.267042% | 6.25% | 9.375% | 5.769231% | 6.382979% |
| FNR | 0% | 0% | 0.786157% | 6.542056% | 2.803738% | 0% | 2.469136% |
| TPR | 100% | 100% | 99.213843% | 93.457944% | 97.196262% | 100% | 97.530864% |
| TNR | 83.333333% | 89.795918% | 98.732958% | 93.75% | 90.625% | 94.230769% | 93.617021% |

Table 1 summarizes the results of the experiment on the ML algorithms. The parameters recorded were test accuracy, number of data points (epochs * dataset_size), epochs, false positive rate (FPR), false negative rate (FNR), true positive rate (TPR), and true negative rate (TNR). All implementations of the algorithms were written in Python, with TensorFlow as the machine intelligence library.
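The four rates in Table 1 follow from the confusion-matrix counts by their standard definitions, sketched below in plain Python. The function name `classification_rates` is hypothetical. The example counts (TP=104, TN=58, FP=6, FN=3) are an assumption: they were inferred here because they reproduce the L2-NN column of Table 1 on 171 test points, and they are not taken from the repository's logs.

```python
def classification_rates(tp, tn, fp, fn):
    """Compute accuracy and the four rates from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "TPR": tp / (tp + fn),  # sensitivity
        "TNR": tn / (tn + fp),  # specificity
        "FPR": fp / (fp + tn),  # equals 1 - TNR
        "FNR": fn / (fn + tp),  # equals 1 - TPR
    }


# Counts inferred from the L2-NN column of Table 1 (an assumption).
rates = classification_rates(tp=104, tn=58, fp=6, fn=3)
for name, value in rates.items():
    print(f"{name}: {100 * value:.6f}%")
```

Because TPR + FNR = 1 and TNR + FPR = 1, each column of Table 1 can be sanity-checked by adding the corresponding pairs of percentages.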


Copyright 2017 Abien Fred Agarap

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at


Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.