top of page

Analyzing Opioid Users on Reddit
Natural Language Processing

Reddit is a network of communities, called subreddits, where groups of people can gather and communicate anonymously. Exploring subreddits dedicated to opioids can provide insights into the social workings of opioid users. This project aims to get a deeper understanding of opioid users by analyzing two subreddits: r/Opiates and r/OpiatesRecovery. The goal is to identify any measurable differences between the posts during opioid use and the posts after opioid use. If there are any differences, can these differences help improve treatments and interventions? In order to achieve this goal, sentiment analysis, word networks, and topic modeling were utilized.

 

Please click on title for a detailed view of the project and click here for project code.

Projects

Telco Customer Churn Prediction
Classification Modeling

Customer churn rate is the percentage of customers that stop using a product within a given time frame. The goal of this project is to identify important features that help determine if a customer will churn and to build a model that will predict if a customer will churn. To achieve this goal, three models were trained and tested: Logistic Regression, Random Forest Classifier, and Gradient Boosting Classifier. 

 

The features that were deemed as important across all three classifiers are "tenure", "Contract_Two year", "TotalCharges", "InternetService_Fiber optic", and "MonthlyCharges". The model that had the best performance was the Random Forest Classifier with data that was oversampled and undersampled; this model resulted in an f1-score of 81% and accuracy of 82%.

​

Please click on title for detailed view of the project and click here for project code.

bottom of page