Student Profile: Rebecca Elliott

Introduction & About the Data

Yoga is a tradition that goes back thousands of years, but more recently there has been a yoga resurgence in the Western world. Most of us have probably noticed the increase in people practicing and studios popping up everywhere. I wanted to see if this resurgence was also present in other Western countries. The data that I have linked lists the current yoga, pilates and tai chi locations in Victoria, Australia, separated by region and suburb. This data is interesting to me because I wanted to see what yoga studio growth looks like in another part of the Western world. I know that this data is open because it comes from a site that advertises its open data and has explicit notes saying that the data can be accessed, used and re-used, and that it allows universal participation.

Preparations

## Installing package into '/home/rstudio-user/R/x86_64-pc-linux-gnu-library/3.5'
## (as 'lib' is unspecified)

## Installing package into '/home/rstudio-user/R/x86_64-pc-linux-gnu-library/3.5'
## (as 'lib' is unspecified)

## Warning: package 'knitrExtra' is not available (for R version 3.5.2)

## Installing package into '/home/rstudio-user/R/x86_64-pc-linux-gnu-library/3.5'
## (as 'lib' is unspecified)

## Installing package into '/home/rstudio-user/R/x86_64-pc-linux-gnu-library/3.5'
## (as 'lib' is unspecified)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:plyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

## 
## Attaching package: 'data.table'

## The following objects are masked from 'package:dplyr':
## 
##     between, first, last

## 
## Attaching package: 'kableExtra'

## The following object is masked from 'package:dplyr':
## 
##     group_rows

Data

The data that I worked with can be found at https://data.gov.au/data/dataset/6bac616d-552b-4112-8f59-b0b0a6cc79b5

yoga <- read.csv(url("https://data.gov.au/data/storage/f/2013-05-12T194735/tmpJwybOUyoga-%26-relaxation.csv"), stringsAsFactors = FALSE)

yoga2 <- read.csv(url("https://docs.google.com/spreadsheets/d/e/2PACX-1vRVjXODslMk-GSB7hDtsXu6hojGVGZhKkZ0ACnABVTFW6f8znWVGkIrvfePyaH3v5b8-Bk2nQW9pAfM/pub?output=csv"))

yoga3 <- read.csv(url("https://docs.google.com/spreadsheets/d/e/2PACX-1vQTp3Gtd9BjXW9PLjJwBeSJvdcpOGGNYmmHfNmsHWVSEtgFzjyCqcWbE18MoEVXlAR2uiUNPEvZxjD_/pub?output=csv"))

Charts

For this first chart, I have been trying to change the legend titles, but I am having trouble doing so without disturbing the whole ggplot. With that said, X.1 = Regions and X.2 = Number of Studios in that Region. Thank you for bearing with me!
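One way the legend titles might be changed without touching the rest of the plot is ggplot2's labs() function, which renames the title of each aesthetic. The sketch below is not the chart from this report: the small data frame is a stand-in built from three rows of the table further down, and mapping X.1 to colour and X.2 to size is an assumption based on the legend titles described above.

```r
library(ggplot2)

# Stand-in data (assumption): three regions taken from the table in this report.
# The real spreadsheet columns are named X.1 and X.2.
studios <- data.frame(
  X.1 = c("Barwon S/W", "Gippsland", "Loddon-Mallee"),
  X.2 = c(9, 5, 11),
  stringsAsFactors = FALSE
)

# Assumed mapping: colour shows the region, size shows the number of studios.
p <- ggplot(studios, aes(x = X.1, y = X.2, colour = X.1, size = X.2)) +
  geom_point() +
  # labs() only changes the titles; the data and the layers stay the same.
  labs(
    x = "Region",
    y = "Number of Studios",
    colour = "Region",
    size = "Number of Studios"
  )

p
```

If the legends come from a different aesthetic, such as fill, the same idea should apply with that aesthetic's name inside labs().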

This second ggplot above is a bar graph that includes all of the regions in the data and the number of yoga studios/tai chi (fewer than 5) that corresponds to each region. The metro areas are the three tallest bars, and the smallest, with only 3 studios, which is still a lot, is Hume. I realized after my first chart that it would be really interesting to see a different visual that includes all of the data regions!

This final ggplot is exactly like the first but contains all of the data. I was really curious to see what the plot would look like after seeing the bar chart.

Table

Business Category	Region	Studio Number
Yoga	Barwon S/W	9
Yoga	Eastern Metro	55
Yoga	Gippsland	5
Yoga	Loddon-Mallee	11
Yoga	Southern Metro	60

The table above has two rows highlighted in purple. I chose to highlight these because of the large number of studios in each region, which to me seems virtually impossible. There is an obvious relationship between the two: both are metro areas. I would like to learn more about the data. I am wondering whether these two areas are umbrellas over some of the smaller regions and the data just does not show it. I am not sure, and would be curious to find out!
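As a rough illustration of how rows like these can be highlighted, here is a minimal sketch using kable() with kableExtra's row_spec(). The data frame is retyped from the table above rather than loaded from the spreadsheet, the column names category, region and count are made up for the example, and picking the rows by matching "Metro" in the region name is an assumption about which two rows were highlighted.

```r
library(knitr)
library(kableExtra)

# Values retyped from the table above; the column names are hypothetical.
studios <- data.frame(
  category = rep("Yoga", 5),
  region = c("Barwon S/W", "Eastern Metro", "Gippsland",
             "Loddon-Mallee", "Southern Metro"),
  count = c(9, 55, 5, 11, 60),
  stringsAsFactors = FALSE
)

# Assumption: the highlighted rows are the ones whose region contains "Metro".
metro_rows <- which(grepl("Metro", studios$region))

highlighted <- kable(
  studios,
  format = "html",
  col.names = c("Business Category", "Region", "Studio Number")
) %>%
  kable_styling(full_width = FALSE) %>%
  # Purple background with white text for the metro rows only.
  row_spec(metro_rows, background = "purple", color = "white")

highlighted
```

Selecting the rows with a condition instead of typing the row numbers by hand means the highlighting should still land on the metro regions if the table is re-sorted.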

Conclusion

The biggest issue that I had when preparing the data was figuring out how to find the link on the site to load the data. I found a few different links in different formats, which I now realize are for different purposes, but it was really a challenge to begin with. I found that I relied less on DataCamp for this assignment and was able to work through the majority of the steps just by looking at my notes and from memory. What I am going to continue working on is changing the key labels; for some reason, when I do, it alters the entire graph. Another ongoing challenge for me is the process of loading all the data and packages and having everything be in the right format. I am confident in my abilities with the content, but I always seem to accidentally add a comma or mess it up slightly, so that I second-guess myself and change the data. My final challenge was adding the code folding. Even though I added it in properly, the output in my preparations is not hidden, so I am still working on this as well. I really enjoyed experimenting with the different colors for this assignment, which is something that I didn't really have time to do last assignment! I was still a bit daunted by this assignment when I first started, but as we continue the class I am getting more and more confident that I will be able to use what I have learned and put it into R Markdown, and it is actually really cool to see the work come to life when you are done.