Awesome Open Source
Awesome Open Source


Analyzing the Drugs Descriptions, conditions, reviews and then recommending it using Deep Learning Models, for each Health Condition of a Patient.


The freestyle format of hackathons has time and again stimulated groundbreaking and innovative data insights and technologies. The Kaggle University Club Hackathon recreates this environment virtually on our platform. We challenge you to build a meaningful project around the UCI Machine Learning - Drug Review Dataset. Teams are free to let their creativity run and propose methods to analyze this dataset and form interesting machine learning models.

Machine learning has permeated nearly all fields and disciplines of study. One hot topic is using natural language processing and sentiment analysis to identify, extract, and make use of subjective information. The UCI ML Drug Review dataset provides patient reviews on specific drugs along with related conditions and a 10-star patient rating system reflecting overall patient satisfaction. The data was obtained by crawling online pharmaceutical review sites. This data was published in a study on sentiment analysis of drug experience over multiple facets, ex. sentiments learned on specific aspects such as effectiveness and side effects (see the acknowledgments section to learn more).

The sky's the limit here in terms of what your team can do! Teams are free to add supplementary datasets in conjunction with the drug review dataset in their Kernel. Discussion is highly encouraged within the forum and Slack so everyone can learn from their peers.

Here are just a couple ideas as to what you could do with the data:


Can you predict the patient's condition based on the review?


Can you predict the rating of the drug based on the review?

Sentiment analysis:

What elements of a review make it more helpful to others? Which patients tend to have more negative reviews? Can you determine if a review is positive, neutral, or negative?

Data visualizations:

What kind of drugs are there? What sorts of conditions do these patients have?


How many domains of analysis and topics does this Kernel cover? Does it attempt machine learning methods? Does the Kernel offer a variety of unique analyses and interesting conclusions or solutions?


What is the subject matter of this Kernel? Does it have a well-defined and interesting project scope, narrative or problem? Could the results make an impact? Is it thought provoking?


How easy is it to understand this Kernel? Are all thought processes clear? Is the code clean, with useful comments? Are visualizations and processes articulated and self-explanatory? Teams with top submissions have a chance to receive exclusive Kaggle University Club swag and be featured on our official blog and across social media.

Thankyou for Visiting. If you like it Please Star my repository. and feel free to fork and clone.

Get A Weekly Email With Trending Projects For These Topics
No Spam. Unsubscribe easily at any time.
jupyter-notebook (6,448
deep-learning (4,077
machine-learning (3,737
data-visualization (439
data-analysis (292
data-mining (170
feature-engineering (52
recommendation-system (47
feature-selection (30
data-cleaning (25