Awesome Open Source


Sytora is a multilingual symptom-disease classification app. Translation is managed through the UMLS coding standard. A multinomial Naive Bayes classifier is trained on a hand-picked dataset, which is freely available under CC 4.0.
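
To give a feel for the classification idea, here is a minimal from-scratch sketch of multinomial Naive Bayes with Laplace smoothing. It is not the training script from this repo, which relies on scikit-learn. The function names (`train_mnb`, `predict`) and the toy records are made up for illustration only and are not taken from the dataset.

```python
import math
from collections import Counter, defaultdict


def train_mnb(records, alpha=1.0):
    """Fit a multinomial Naive Bayes model.

    records: list of (disease, [symptom, ...]) pairs (hypothetical format).
    alpha:   Laplace smoothing constant.
    """
    class_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for disease, symptoms in records:
        class_counts[disease] += 1
        for symptom in symptoms:
            feature_counts[disease][symptom] += 1
            vocab.add(symptom)

    total = sum(class_counts.values())
    log_prior = {}
    log_likelihood = {}
    for disease, n in class_counts.items():
        log_prior[disease] = math.log(n / total)
        # Smoothed denominator: all symptom counts for this disease
        # plus alpha for every symptom in the vocabulary.
        denom = sum(feature_counts[disease].values()) + alpha * len(vocab)
        log_likelihood[disease] = {
            s: math.log((feature_counts[disease][s] + alpha) / denom)
            for s in vocab
        }
    return {"log_prior": log_prior, "log_likelihood": log_likelihood}


def predict(model, symptoms, top_n=3):
    """Return the top_n diseases ranked by log posterior (up to a constant)."""
    scores = {}
    for disease, prior in model["log_prior"].items():
        ll = model["log_likelihood"][disease]
        # Symptoms never seen during training are ignored.
        scores[disease] = prior + sum(ll[s] for s in symptoms if s in ll)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy data, invented for this example only.
records = [
    ("influenza", ["fever", "cough", "myalgia"]),
    ("influenza", ["fever", "headache"]),
    ("migraine", ["headache", "nausea", "photophobia"]),
    ("gastroenteritis", ["nausea", "vomiting", "diarrhea"]),
]
model = train_mnb(records)
print(predict(model, ["fever", "cough"]))
```

The ranking is only as good as the symptom lists behind it, which is why the output should be read as a list of candidates rather than a diagnosis.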

To get started:

  • Clone this repo
  • Install requirements
  • Run the scripts (see below) and install the npm dependencies
  • Get a UMLS license to download the UMLS lexica and generate the DB
  • Run and check http://localhost:5001
  • Done! 🎉


Check out for a demo.


Finding the right diagnosis cannot be achieved just by extracting symptoms and running a classification algorithm. The hardest part is asking the right questions, focusing on what is important in the situation, connecting other events, and much more. Despite all this, I have long been excited about writing a symptom-disease lookup system to quickly gather symptoms related to other symptoms, and so on. Not everything the model outputs is nonsense. It actually helps a lot to quickly get a list of diseases for a given set of symptoms.


The data is formatted as CSV files. Example entry:


Data sources:

  • DiseaseSymptomKB.csv: extracted from Disease-Symptom Knowledge Database. This data solely belongs to the respective authors. The authors are not affiliated with this project.
  • disease-symptom.csv: Created by hand. Freely available under CC 4.0.


Training models & generating files from data:

  1. Run to convert to GloVe-format. You need to get the pretrained embeddings first, available here: Place them in the data folder.
  2. Run to create the option labels for the select fields. Languages are currently hardcoded as a list and can be extended if needed.
  3. Run to train an MNB classifier (for the disease prediction). Other necessary files are generated, too.
  4. Run to train the model for the autosuggestion feature. This uses cui2vec. Please note that the authors of cui2vec are not affiliated with this code.
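
The autosuggestion step can be pictured as a nearest-neighbour lookup in embedding space. The sketch below is an assumption about the general technique, not the code in this repo: it averages the vectors of the symptoms already selected and ranks the remaining concepts by cosine similarity. The names `cosine` and `suggest` and the 3-dimensional vectors are made up; real cui2vec vectors are keyed by CUI and are much longer.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def suggest(embeddings, selected, top_n=3):
    """Suggest concepts closest to the mean vector of the selected ones."""
    vectors = [embeddings[c] for c in selected if c in embeddings]
    if not vectors:
        return []
    dim = len(vectors[0])
    query = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    # Score every concept that has not been selected yet.
    scored = [
        (cosine(query, vec), concept)
        for concept, vec in embeddings.items()
        if concept not in selected
    ]
    scored.sort(reverse=True)
    return [concept for _, concept in scored[:top_n]]


# Made-up toy vectors; plain names stand in for CUIs.
toy_embeddings = {
    "fever": [0.9, 0.1, 0.0],
    "chills": [0.8, 0.2, 0.1],
    "cough": [0.6, 0.5, 0.0],
    "rash": [0.0, 0.1, 0.9],
}
print(suggest(toy_embeddings, ["fever"], top_n=2))
```

A linear scan like this is fine for a few thousand concepts; for larger vocabularies an approximate nearest-neighbour index would likely be a better fit.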

React client: cd into flaskapp and run npm install. For development use npm run watch, for production use npm run build.

Flask Service

A small Flask app is available to showcase the trained models. cd into the flaskapp folder and start the app.



Make sure to export REACT_APP_ENDPOINT with the correct address (e.g.

Get going in ~10 min:

sudo apt update
sudo apt install python3-pip python3-dev build-essential libssl-dev libffi-dev python3-setuptools
sudo apt install python-pip python-dev
sudo apt install nodejs npm
pip install flask pandas scikit-learn numpy
pip install Flask-Limiter flask-expects-json
pip install more-itertools requests configparser
sudo apt-get install nginx supervisor

git clone
cd sytora/flaskapp && npm i

vi /etc/supervisor/conf.d/sytora.conf
sudo supervisorctl reread
sudo service supervisor restart
sudo supervisorctl status

sudo vim /etc/nginx/conf.d/virtual.conf
sudo nginx -t
sudo service nginx restart


command=gunicorn app:app -b


server {
    listen       80;

    location / {

Don't forget to transfer the umls.db, e.g. scp ./umls.db <user>@<host>:/root/sytora/flaskapp/umls/database

Coding quality, security & stability

This project was written very quickly with no performance or stability features in mind; the code base suffered accordingly. Expect things to be cleaned up soon though.

Please note that I'm a machine learning hobbyist and a medical student. The code may not be in accordance with common conventions.


This project is heavily inspired by:
