Leonardo Patricelli Data Science Portfolio
Final capstone project for the Analytics for Business Decision Making post-graduate course, in collaboration with MAGID
Click here to read the PDF presentation (redacted version)
Click here to go to the repository and read more details
As project leader of a team of four, I led the development of new film genres and a movie recommendation app based on emotions. Using original survey data collected by MAGID on respondents' emotional responses after watching a movie, together with demographic data, we built an XGBoost model that achieved over 80% accuracy and surfaced several insights. The project steps were:
Master’s degree thesis at the University of Turin, in collaboration with OPTA Sport
I created a model that predicts the outcome of a match with 85% accuracy, using metrics related to a team’s network and playstyle. OPTA Sport provided high-quality data on games from the 2012/2013 season of the Italian championship. The project involved:
Project done for the business analytics course during my master’s degree at the University of Turin
Click here to read the PDF Report for more details
Click here to go to the repository and see the R code
This project analyzed and predicted which customers renew their subscription to Piemonte’s museums and which do not, using 2014 data to predict 2015 renewals in the R programming language. In the end, I built a Random Forest model that correctly predicted almost 75% of the renewals. The project included:
Project for Hackathon 2021 at Seneca College, in collaboration with Orphadata
Click here to read the PDF presentation
Click here to go to the repository and read more details
Orphadata provides the scientific community with data about rare diseases. In this challenge, we had to retrieve data about rare diseases to gain insights and develop an application to assist doctors and researchers with diagnosing rare diseases. The challenge involved:
Report done for the blog The Pizza Statistician, with data from the City of Toronto Open Data Portal
Click here to view the Tableau Dashboard
Click here to read the Report for more details
Click here to go to the repository and see the R code
This project analyzes all bicycle theft occurrences reported to the Toronto Police Service from 2014 to 2019, using the R programming language. As expected, most thefts occur in the evening during the summer months. The project involved:
Report done in college for the course Business, Web and Social Media Metrics and Analysis, based on a case study from the book Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python
Click here to read the PDF presentation
Click here to go to the repository for more details
In this case study, my group was asked to run a customer segmentation on transaction data and then identify the cluster of customers who stick with their favorite brand and don’t churn. I was responsible for all of the code and analysis. In the end, we built 7 clusters using different metrics, along with a logit model that reached 86% precision on imbalanced data. The project involved:
Report done in college for the course Business, Web and Social Media Metrics and Analysis, based on a case study from the book Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python
Click here to read the PDF presentation
Click here to go to the repository for more details
In this project, my group was asked to run a market basket analysis using association rules on transaction data, in order to analyze sales and propose a cross-selling strategy to increase revenue. The steps involved were: