Identify regions where Flickr is used most and least.
Flickr's marketing team can then focus its efforts on the regions where Flickr is used least.
Predict the number of uploads based on the tags used and the region where the photo is uploaded.
This prediction matrix can be used to design and strengthen the storage servers and the system architecture so that they can sustain traffic and handle the upcoming load.
We will use the following Flickr API methods to collect our data:
flickr.interestingness.getList
flickr.tags.getListPhoto
flickr.tags.getRelated
flickr.tags.getHotList
flickr.places.placesForTags
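As a rough illustration of how one of these methods might be called, the sketch below builds a request URL for the Flickr REST endpoint. The helper name build_flickr_url, the placeholder API key, and the period and count values are assumptions made for this example, not part of the project code; the snippet only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Flickr's REST endpoint; every API method is selected via the "method" parameter.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_flickr_url(method, api_key, **params):
    """Return a request URL for a Flickr API method (hypothetical helper)."""
    query = {
        "method": method,
        "api_key": api_key,
        "format": "json",      # ask for JSON instead of the default XML
        "nojsoncallback": 1,   # plain JSON, without a JSONP wrapper
    }
    query.update(params)
    return FLICKR_REST_ENDPOINT + "?" + urlencode(query)


# "YOUR_API_KEY" is a placeholder; period and count are assumed example values.
url = build_flickr_url("flickr.tags.getHotList", "YOUR_API_KEY", period="day", count=20)
print(url)
```

The same helper could be reused for the other methods by changing the method name and passing that method's own parameters.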
We may use external data for analysis if required, but for now we will use only Flickr data.
We have collected 166,376 instances of data, which we will clean by keeping only unique tags and tags that have geo-location data available.
The columns used for analysis are:
Tags
Place_type: 22 for neighborhood, 7 for locality, 8 for region, 12 for country and 29 for continent
Photo_count
Photo_id
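The cleaning step could look roughly like the sketch below. It assumes each record is a dictionary keyed by lower-case versions of the column names, treats a record as having geo-location data when its place_type is one of the five listed codes, and treats a repeated (tag, place_type) pair as a duplicate. These choices and the sample rows are illustrative assumptions, not the actual dataset or the final cleaning rules.

```python
# Place type codes as listed in the column description.
PLACE_TYPES = {
    22: "neighborhood",
    7: "locality",
    8: "region",
    12: "country",
    29: "continent",
}


def clean_records(records):
    """Keep the first record per (tag, place_type) that has a known place type."""
    seen = set()
    cleaned = []
    for rec in records:
        tag = (rec.get("tags") or "").strip().lower()
        place_type = rec.get("place_type")
        # Drop rows with no tag or no usable geo-location information.
        if not tag or place_type not in PLACE_TYPES:
            continue
        key = (tag, place_type)
        if key in seen:
            continue  # duplicate tag for the same place type
        seen.add(key)
        cleaned.append({**rec, "tags": tag})
    return cleaned


# Made-up rows for illustration only.
sample = [
    {"tags": "sunset", "place_type": 7, "photo_count": 120, "photo_id": "1"},
    {"tags": "Sunset ", "place_type": 7, "photo_count": 120, "photo_id": "1"},
    {"tags": "beach", "place_type": None, "photo_count": 45, "photo_id": "2"},
    {"tags": "snow", "place_type": 12, "photo_count": 30, "photo_id": "3"},
]
cleaned = clean_records(sample)
print(len(cleaned), [r["tags"] for r in cleaned])
```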
Given a photo uploaded with certain tags in a certain region, we will predict how many similar photos are likely to be uploaded soon.
Trends drive much of today's online activity, and many tags have been observed to correlate with the regions where they originate.
Predicting the number of uploads Flickr should expect could help it manage its storage and server resources for the upcoming traffic and data.
For this prediction we will use the following algorithms:
KNN (k-nearest neighbors)
Linear Regression
Neural Network
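To make the first of these concrete, here is a minimal sketch of KNN regression for upload counts. The distance function (Jaccard distance between tag sets plus a 0/1 penalty for a different place type), the value of k, and the training rows are all assumptions chosen for illustration; the project may well settle on a different feature encoding or a library implementation.

```python
def jaccard_distance(a, b):
    """1 minus the overlap ratio of two tag collections (0.0 means identical sets)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance(x, y):
    """Assumed distance: tag dissimilarity plus a penalty for differing place types."""
    tag_dist = jaccard_distance(x["tags"], y["tags"])
    place_dist = 0.0 if x["place_type"] == y["place_type"] else 1.0
    return tag_dist + place_dist


def knn_predict(train, query, k=3):
    """Predict photo_count as the mean over the k nearest training rows."""
    if not train:
        raise ValueError("training data must not be empty")
    nearest = sorted(train, key=lambda row: distance(row, query))[:k]
    return sum(row["photo_count"] for row in nearest) / len(nearest)


# Made-up training rows for illustration only.
train = [
    {"tags": ["sunset", "beach"], "place_type": 7, "photo_count": 100},
    {"tags": ["sunset", "sea"], "place_type": 7, "photo_count": 80},
    {"tags": ["snow", "ski"], "place_type": 12, "photo_count": 10},
    {"tags": ["city", "night"], "place_type": 22, "photo_count": 40},
]
query = {"tags": ["sunset"], "place_type": 7}
prediction = knn_predict(train, query, k=2)
print(prediction)  # 90.0: the mean of the two closest rows (100 and 80)
```

Linear regression and a neural network would need the tags and place types encoded as numeric features first, which is one reason KNN with a set-based distance is a convenient baseline.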