Sarang Gupta

Airbnb Rental Listings Dataset Mining

An Exploratory Analysis of Airbnb’s Data to understand the rental landscape in New York City

Airbnb has seen meteoric growth since its inception in 2008, with the number of rentals listed on its website growing exponentially each year. It has successfully disrupted the traditional hospitality industry as more and more travellers, not only those looking for a bang for their buck but also business travellers, turn to Airbnb as their preferred accommodation provider.
New York City has been one of the hottest markets for Airbnb, with over 52,000 listings as of November 2018. This means there are over 40 homes being rented out per square kilometre in NYC on Airbnb! One can perhaps attribute Airbnb's success in NYC to the high rates charged by hotels, which are driven primarily by the exorbitant rental prices in the city.

In this post, I perform an exploratory analysis of the Airbnb dataset sourced from the Inside Airbnb website to understand the rental landscape in NYC through various static and interactive visualisations. Read it on Medium.

Time-Series Calendar Heatmaps

A new way to visualize Time-Series data

A time series is a sequence of data points indexed in time order. The time order can be expressed in days, weeks, months or years. The most common way to visualize time series data is a simple line chart, where the horizontal axis plots the increments of time and the vertical axis plots the variable being measured. Such a chart can be drawn using geom_line() in ggplot2 or simply using the plot() function in base R.
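
As a rough illustration of the line chart described above, here is a minimal sketch in base R. The `series` data frame, its column names, and the simulated random-walk values are assumptions made purely for the example; they are not data from the tutorial.

```r
# Hypothetical daily series: 90 days of simulated values (not real data)
set.seed(42)
dates  <- seq(as.Date("2018-01-01"), by = "day", length.out = 90)
values <- cumsum(rnorm(length(dates)))  # random walk, for illustration only

series <- data.frame(date = dates, value = values)

# Base R line chart: time on the horizontal axis,
# the measured variable on the vertical axis
plot(series$date, series$value, type = "l",
     xlab = "Date", ylab = "Value",
     main = "Simulated daily series")
```

The same data frame could be passed to ggplot2 with geom_line() instead; base R is used here only so the sketch runs without extra packages.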

In this tutorial, I introduce a new tool for visualizing time series data: the Time-Series Calendar Heatmap. We will look at how Time-Series Calendar Heatmaps can be drawn using ggplot2. We will also explore the calendarHeat() function written by Paul Bleicher (released as open source under the GPL license), which provides an easy way to create the visualization. Read it on Medium.