** Please click all the tabs (in sequence) to get the entire set of information in these pages. **
** To download code, see the instructions in Session 2: https://rpubs.com/hkb/DAX-Session2 **
Objectives
* Teams and Planning for Your Hands-on Data Analysis Project
  - State the area or domain you want to work on (e.g., Sports, or Social Media Use)
  - List 2-5 main questions you want to answer from the data
  - State your data source, and how you would access the data
  - List how and where you would do your computations (e.g., on RStudio)
  - Try to identify some tasks that you would need to accomplish (e.g., set up the data, how to transform it, what visualizations you would want to build)
  - Come up with a timeline to do all these things, so that you have a first draft by August 4th.
* Visualizing Multiple Dimensions Through Line Charts, Bubble Charts, Position Plots
* Interpreting Alternative Visualizations and Picking the (or a) Right One
* Storytelling with Data - from FindHotel.com, https://blog.findhotel.net/2020/06/data-story-of-the-travel-market-recovery/
Project Possibilities and Data Sets
* Covid related data: World (multiple countries), US (multiple states, county level) with multiple metrics, specific country (e.g., India and states within)
* Climate and weather data, climate change
* Data from fivethirtyeight.com (Elections and polls, but much more): https://data.fivethirtyeight.com/
* Teens and Social Media: https://www.pewresearch.org/internet/2018/05/31/teens-social-media-technology-2018/
* Data sets at Kaggle (https://www.kaggle.com/), data.world (https://data.world/)
Some Example Visualizations
Let’s start with this very simple chart. What is it telling us? How can you improve it?

In the previous chart, each object of interest was a point: for instance, the number of movies for each year in the dataset. Now let us add one more dimension, so that each object of interest is a line (i.e., a number of points that are connected to each other). For instance, for a specific year, the total revenue of the movie at each rank. That is, each single object of interest is a curve (or a series of line segments). And of course we want to display and compare multiple objects.
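To make the idea of "one line per object" concrete, here is a minimal sketch of how such a chart could be drawn with ggplot2. It is not the code behind the figure in these notes. The data frame `toy` and all of its numbers are made up for illustration; only the column names `Year`, `rank`, and `total_gross` follow the movies data used later in this session.

```r
library(ggplot2)

# Hypothetical toy data: made-up gross revenue at ranks 1-4 for two years.
toy <- data.frame(
  Year        = rep(c(2018, 2019), each = 4),
  rank        = rep(1:4, times = 2),
  total_gross = c(700, 500, 300, 100, 900, 400, 250, 150)
)

# One line per year: x = rank within the year, y = gross revenue at that rank.
# factor(Year) gives each year its own discrete color.
p <- ggplot(toy, aes(x = rank, y = total_gross,
                     color = factor(Year), group = Year)) +
  geom_line() +
  geom_point() +
  labs(x = "Rank within year", y = "Total gross", color = "Year")

print(p)
```

Each year is a separate object (a line), and placing the lines on shared axes is what lets us compare them.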

Because we have multiple dimensions, we have multiple ways in which we could organize them. Here is one additional alternative. What do you see here? How do you decide which visualization is better?

Here’s a totally different way to display this kind of data. First, what is the visualization about (what is the information it is giving you)?

Is it an effective way of conveying the information in this particular instance?
Here is another example using the same type of visualization.


Movies Data Set
Let’s load the data by making a call to boxofficemojo.com through the boxoffice() function of the boxoffice package. If, for some reason, you have not yet installed the package, look through the Session 2 notes and install it.
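library(boxoffice)  # the package should already be installed (see Session 2)
# One date per year: December 31 of 2000 through 2019
date.seq <- paste0(2000:2019, "-12-31")
# Fetch the top 50 movies for each of those dates
movies <- boxoffice(date = as.Date(date.seq), top_n = 50)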
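Before making the network call, it can help to check what is being passed to boxoffice(). The sketch below uses only base R and rebuilds the same date vector; the name `dates` is introduced here for illustration and does not appear in the original code.

```r
# Build the same date strings as above and convert them to Date objects.
date.seq <- paste0(2000:2019, "-12-31")
dates <- as.Date(date.seq)

length(dates)           # number of dates requested (one per year)
head(date.seq, 3)       # first few date strings
class(dates)            # class of the converted vector
format(dates[1], "%Y")  # year part of the first date, as a character string
```

The last line uses the same format(..., "%Y") call that serves to extract a year from a date when working with the movies data.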
We’ll extend the data frame by adding - for each movie in the database - Year, and Rank within Year based on gross revenues.
library(dplyr)
# Drop rows with NA values, then add a Year column extracted from the date
movies <- movies %>% na.omit() %>% mutate(Year = as.numeric(format(as.Date(date), "%Y")))
# Within each year, rank movies by total gross (rank 1 = highest gross)
movies <- movies %>% group_by(Year) %>% arrange(desc(total_gross)) %>% mutate(rank = row_number())
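To see what the ranking step does without calling boxofficemojo.com, here is a small sketch on made-up data. The data frame `toy`, its movie names, and its gross values are hypothetical stand-ins for the boxoffice() result; only the column names `date` and `total_gross` are taken from the code above.

```r
library(dplyr)

# Hypothetical toy data standing in for the boxoffice() result;
# the movie names and gross values are made up for illustration.
toy <- data.frame(
  movie       = c("A", "B", "C", "D", "E"),
  date        = as.Date(c("2018-12-31", "2018-12-31", "2018-12-31",
                          "2019-12-31", "2019-12-31")),
  total_gross = c(300, 700, 500, 200, 900)
)

ranked <- toy %>%
  mutate(Year = as.numeric(format(date, "%Y"))) %>%
  group_by(Year) %>%
  arrange(desc(total_gross), .by_group = TRUE) %>%  # sort within each year
  mutate(rank = row_number()) %>%                   # 1 = highest gross in that year
  ungroup()

print(ranked)
```

One design note: arrange() ignores grouping unless `.by_group = TRUE` is set. The code in these notes omits it, which sorts all rows by gross across years; the within-year ranks still come out the same, because row_number() is evaluated per group on rows that are already in descending order. The sketch sets `.by_group = TRUE` only so that the printed rows stay together by year.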