Sarang Gupta


Machine Learning

Spotify Playlist Prediction

The project aims to predict the number of followers that a Spotify playlist will attract. Various quantitative and qualitative features of the playlists and the tracks within them are used to construct a predictive model. An algorithm that assembles a playlist of songs similar to a user-specified song has also been developed. Github

Otto Product Classification

This project aims to train a machine learning algorithm that classifies products into categories based on their characteristics. The data set provided by the Otto group consists of product features and their categories for around 60,000 products. Labelled data and supervised learning techniques are used to develop an algorithm that predicts a product's category from its features. Github


Analyzing Airbnb Rentals Dataset

In this project, we aim to understand the Airbnb rental landscape in New York City through exploratory analysis of the Airbnb dataset. Through static and interactive visualizations, we answer questions related to rental pricing, demand, and booking policies, among others. Over 1 million user reviews are also mined and presented through interactive word clouds. more info Github

Web Development

Appétit - A Novel Social Network

Appétit is an innovative web-based social networking platform that helps users arrange lunch meet-ups with friends and colleagues. At the core of Appétit is a recommendation engine that uses Dijkstra's shortest path algorithm and analyses common connections and prior meetups between users in the network to suggest new engagements. more info PDF
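As a rough illustration of how a shortest-path search could feed such a recommendation engine, here is a minimal Python sketch. It is not the Appétit implementation: the graph layout, the edge weights (smaller meaning a stronger existing tie), the user names and the helper `suggest_meetups` are all assumptions made for this example.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node.

    graph maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def suggest_meetups(graph, user, k=3):
    """Hypothetical helper: rank users who are not yet directly connected
    to `user` by shortest-path distance, closest first."""
    dist = dijkstra(graph, user)
    direct = graph.get(user, {})
    candidates = [
        (d, other)
        for other, d in dist.items()
        if other != user and other not in direct
    ]
    return [other for _, other in sorted(candidates)[:k]]


# Toy network with made-up users and weights (assumed: lower = closer tie).
example_graph = {
    "ana": {"ben": 1.0, "cy": 4.0},
    "ben": {"ana": 1.0, "cy": 1.0, "dee": 5.0},
    "cy": {"ana": 4.0, "ben": 1.0, "dee": 1.0},
    "dee": {"ben": 5.0, "cy": 1.0},
}

if __name__ == "__main__":
    print(suggest_meetups(example_graph, "ana"))
```

In this toy network, "dee" is the only user not directly connected to "ana", so "dee" is the one suggestion. How common connections and prior meetups would translate into edge weights is left open here, since the description above does not specify it.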

Statistics/Operations Research

Simulation of Laundry System in Student Residences

This project aimed to improve the efficiency of the laundry system in student halls of the Hong Kong University of Science and Technology (HKUST) through the analysis of statistical simulation models. Simulation models of the laundry system were built using Arena software, and the simulation outputs were used to analyse inefficiencies. A revamped system with 25% higher system utilization and reduced waiting time was proposed. PDF

Correlation Analysis of Components of the HDI Across Countries

In this study, a correlation analysis is conducted on the components of the United Nations' Human Development Index: life expectancy, mean years of schooling, and income per capita. The key question is whether higher levels of education and longer life expectancy of citizens indicate greater prosperity of a nation. The study takes a random sample of countries from different development levels and concludes that the components are statistically significantly correlated. PDF
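For readers unfamiliar with this kind of analysis, the sketch below shows one common way to measure and test a correlation between two indicators: the Pearson coefficient and its t statistic. The study's actual method and data are not given here, so the choice of Pearson's r, the function names and the toy numbers are all assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two samples of equal length, at least 3 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing H0: no linear correlation (n - 2 degrees
    of freedom). Undefined when |r| == 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Placeholder values only; these are NOT real HDI figures.
indicator_a = [1.0, 2.0, 3.0, 4.0, 5.0]
indicator_b = [2.0, 4.0, 5.0, 4.0, 5.0]

if __name__ == "__main__":
    r = pearson_r(indicator_a, indicator_b)
    t = t_statistic(r, len(indicator_a))
    print(f"r = {r:.4f}, t = {t:.4f}")
```

The t statistic is then compared against a critical value of the t distribution with n - 2 degrees of freedom to decide whether the correlation is significant at the chosen level.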


NapkinAd: Using Data Analytics for Market Research

NapkinAd is an Australian advertising business that publishes advertisements for companies on paper napkins. The aim of the study was to evaluate whether NapkinAd should place its product in HKUST's cafeteria. Through exploratory and descriptive market research and analysis of data collected through surveys, focus groups and interviews, the study concluded that HKUST's cafeteria could serve as a profitable advertising location for NapkinAd. PDF