The Top 108 Lemmatization Open Source Projects on Github
Topic > Lemmatization
The Bus-Mama is a bus tracking mobile application for the transportation of the students of BSMRSTU. It helps the students of our university by showing the available route, bus, and their exact location. This app includes real-time bus tracking which is going to solve a problem that university students have been facing for many years. Students are often seen missing their buses. Often they can't maintain the bus time. Since there are many buses in our university, students can easily catch a bus if they know where and when it will pass by. My goal is to track the buses and make hardware, mobile application, and machine learning solution to solve the issue. This way the students can get relief from missing the bus and use the buses efficiently. The main idea is to track the buses. GPS trackers will be attached to every bus that will give the current position of them and automatically sync on the server. The Bus-Mama mobile application will show every real-time position of those buses. This application will be installed on students' mobile phones and in this way the students can easily maintain their transportation. In this application, the current location of the bus can be seen through Google map. Every bus will have a specific marker on Google map and all the details about a specific bus will be shown by clicking on the marker. There will be seen about how far the bus is, from which direction it will come, how much time to reach the bus, how much time it will take if there is any traffic on road, etc. There is also a search option to know about any specific bus details. There is also a list of all buses with sufficient details that will help students to know about all the details. Every student will have an account through which they can access bus data. Another main objective is the Bus-Mama Chatbot in the Bengali language so that the students can communicate to know about the bus easily. For now, they can make conversation only about bus-related information. The Chatbot is not yet able to make conversation except bus-related questions. If anyone asks anything except bus-related questions, it cannot reply to the question rather it will give a tag to the question as a reply. As the Chatbot is created in the Bengali language, it has used the "trie" data structure in lemmatization. A library has been designed to lemmatize the Bengali words. Almost 63,205 Bengali words have been lemmatized by using the library to train the SVM machine learning model. Assignment-11-Text-Mining-01-Elon-Musk, Perform sentimental analysis on the Elon-musk tweets (Exlon-musk.csv), Text Preprocessing: remove both the leading and the trailing characters, removes empty strings, because they are considered in Python as False, Joining the list into one string/text, Remove Twitter username handles from a given twitter text. (Removes @usernames), Again Joining the list into one string/text, Remove Punctuation, Remove https or url within text, Converting into Text Tokens, Tokenization, Remove Stopwords, Normalize the data, Stemming (Optional), Lemmatization, Feature Extraction, Using BoW CountVectorizer, CountVectorizer with N-grams (Bigrams & Trigrams), TF-IDF Vectorizer, Generate Word Cloud, Named Entity Recognition (NER), Emotion Mining - Sentiment Analysis.