The Etsy Data Explorer

Kyle Scully
3/22/2015

The Overview

Etsy.com is a unique marketplace that has grown by enabling small companies and hobbyists to open online stores and extend their reach.

This project analyzes that marketplace without direct access to Etsy's internal data.

The Data

  • A crawler scraped approximately 30,000 Etsy stores for the following attributes:
    • store name
    • date the store opened
    • number of admirers the store has
    • number of sold items
    • number of reviews the store has
    • location the store identifies with
  • The raw data, cleaned data, and cleaning script can be found here:

The Application

Where to find:

The application can be cloned from here:

The application is hosted on shinyapps.io:

The Front End

  • Users can build their own filters to subset the data.
  • Users can select which variables to plot.
  • Users can select the number of k-means clusters.
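
In plain R, those three controls amount to choosing a row filter, a pair of columns, and a value of k. The sketch below is illustrative only: the `stores` data frame, its column names and values, and the `apply_filters` helper are hypothetical stand-ins, not the app's actual data or code.

```r
# Hypothetical toy data; column names and values are made up for illustration.
stores <- data.frame(
  name     = c("a", "b", "c", "d", "e", "f"),
  admirers = c(10, 250, 40, 900, 5, 120),
  sales    = c(3, 400, 25, 1500, 0, 210),
  reviews  = c(1, 180, 9, 700, 0, 95),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose values fall inside every user-chosen range.
# `ranges` is a named list of c(min, max) pairs, e.g. list(sales = c(1, 500)).
apply_filters <- function(data, ranges) {
  keep <- rep(TRUE, nrow(data))
  for (col in names(ranges)) {
    r <- ranges[[col]]
    keep <- keep & data[[col]] >= r[1] & data[[col]] <= r[2]
  }
  data[keep, , drop = FALSE]
}

# The three user choices: filters, variables to plot, and number of clusters.
user_ranges <- list(sales = c(1, 500))
user_vars   <- c("reviews", "sales")
user_k      <- 2

filtered <- apply_filters(stores, user_ranges)
selected <- filtered[, user_vars]
```

Here `selected` and `user_k` are what would be handed on to the clustering and plotting step.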


The Back End

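  • The app is implemented with ggvis, but the core logic is essentially:
filtered <- user_filter(all_stores)            # apply the user's filters
two_vars <- filtered[, c("reviews", "sales")]  # keep the two selected variables
cluster <- kmeans(two_vars, centers = 3)       # k is chosen by the user (3 here)
plot(two_vars, col = cluster$cluster)          # color each store by its cluster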

[Figure: scatter plot of reviews against sales, with points colored by k-means cluster]
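
To try the same idea without the scraped data, here is a minimal self-contained sketch. It assumes synthetic `reviews` and `sales` values drawn from three made-up groups, so the numbers say nothing about real Etsy stores. The `nstart` setting and the center markers are additions for illustration, not something the app is known to use.

```r
set.seed(42)  # kmeans starts from random centers, so fix the seed for repeatability

# Synthetic stand-in for the scraped stores: three loose groups of (reviews, sales).
two_vars <- data.frame(
  reviews = c(rnorm(50, 10, 3), rnorm(50, 100, 15), rnorm(50, 400, 40)),
  sales   = c(rnorm(50, 30, 8), rnorm(50, 300, 40), rnorm(50, 1200, 100))
)

k <- 3
# nstart = 10 reruns the algorithm from several random starts and keeps the best fit.
cluster <- kmeans(two_vars, centers = k, nstart = 10)

# Scatter plot colored by cluster, with the fitted centers marked by crosses.
plot(two_vars, col = cluster$cluster, pch = 19)
points(cluster$centers, pch = 4, cex = 2, lwd = 2)
```

Because the two variables can sit on very different scales, one option is to cluster on `scale(two_vars)` instead, so that neither variable dominates the distance calculation. That is a design choice to consider rather than something the original snippet does.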