Good job today in class! It was nice seeing people working together.
You should type load("~/Library-small.rda"), not load("Library-small.rda"), since you downloaded the library data to your home directory and “Library-small.rda” isn’t the correct path.
To see the number of cases and variables, you can use the functions nrow() and ncol().
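Here is a minimal sketch of nrow() and ncol() on a small made-up data frame. The name toy_books and its contents are hypothetical stand-ins, not the library data; in your report you would call the same functions on the data tables you loaded.

```r
# Hypothetical toy data frame standing in for a real data table
toy_books <- data.frame(
  Title = c("Book A", "Book B", "Book C"),
  Issued.Count = c(0, 2, 5)
)

n_cases <- nrow(toy_books)      # number of cases (rows)
n_variables <- ncol(toy_books)  # number of variables (columns)

print(n_cases)
print(n_variables)
```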
You should add some comments between the chunks to make your report more readable.
To embed the source .Rmd file in your html document, read my instructions in Wednesday’s lecture, in the section headed “Embedded Files within HTML files”.
To submit your html document save it on your computer. Then upload it to b-courses like you would upload any other document. You are allowed to upload multiple documents for your assignment. Don’t use RPubs since then other students can go to the RPubs website and see your solution.
I am changing the due dates for assignments (including this one) from Monday at 1pm to Monday at 8pm so that you can take advantage of the GSI office hours on Monday if you have any last-minute questions.
In this small project, you will be exploring some of the records on books from the UC Berkeley library. You will be creating an .Rmd file that summarises certain aspects of the data. That file will contain three things:
1. Your narrative.
2. Your computer commands.
3. The outputs produced by those commands.
You only need to write (1) and (2). The outputs (3) will be generated automatically when you compile the .Rmd to .html.
The .Rmd file is called the source document. In general, you write, revise, and update source documents. Compilation of the source document to html (or other formats) is done automatically. The point of compilation is to bring together your narrative and computer commands with the computer output into one easy-to-read document.
You will need to get access to the library-book data. The full collection of books is too large to download in a short time, so you will be using a small sample of books.
download.file(
  "http://tiny.cc/dcf/Library-small.rda",
  destfile = "~/Library-small.rda",  # save the file in your home directory
  mode = "wb")                       # binary mode keeps the .rda file intact on Windows
You only need to do this once. You can confirm that the data file has been downloaded by going to the “Files” tab in RStudio.
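If you would rather check from the console than from the “Files” tab, a sketch along these lines should work. It assumes the file was saved as ~/Library-small.rda; the variable name data_file is just for illustration, so adjust the path if you saved the file somewhere else.

```r
# Assumed location of the downloaded file; change this if yours differs
data_file <- path.expand("~/Library-small.rda")

if (file.exists(data_file)) {
  message("Found the data file: ", data_file)
} else {
  message("Data file not found at: ", data_file)
}
```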
Step 2. Open a blank .Rmd file. Actually, the file will not be completely blank; it will already contain a few chunks that make it easier to get started.
Name the file small-project-books. The suffix .Rmd will be automatically added to the file name.
Check in the “Files” tab that small-project-books.Rmd has been created. (Look at the “modified” column to see when the file was created. It should match the clock time.)
Before adding anything to the .Rmd file, press the “Knit” button in the editor to compile the document from .Rmd to .html. Depending on how your RStudio system is set up, the compiled document will be displayed in either a new window or in the “Viewer” tab. On some computers, particularly Windows, the new window may be behind RStudio, so if you don’t see the new window immediately, look back there.
Why compile the document before you have added anything to it? So that you can confirm that you are starting with a working document. Then, after you add a little bit, compile the document again. If the compilation works, then continue on. If it doesn’t, you have a huge hint about where the problem is: somewhere in the stuff you just added to the document.
Compilation doesn’t cost anything. And don’t think of it as the final step. Compilation is part of the work process and you will be doing it many, many times as you construct your document.
You can start writing at the bottom of the “blank” template. Remember to compile often, perhaps after each step.
load("~/Library-small.rda")  # the data file is in your home directory
This will create two data table objects, Inv and Bks. The library’s collection is in Inv. The Bks data table is about individual books that might or might not be in the library’s collection.
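To see how load() brings named objects into your session, here is a small self-contained sketch. It does not use the library data: inv_toy, bks_toy, and the temporary file are hypothetical stand-ins created only for this illustration.

```r
# Hypothetical toy objects standing in for the real data tables
inv_toy <- data.frame(
  Title = c("Book A", "Book B"),
  Current.Status = c("Available", "Lost")
)
bks_toy <- data.frame(Title = "Book C")

# Save both objects into one temporary .rda file, then remove them
toy_file <- tempfile(fileext = ".rda")
save(inv_toy, bks_toy, file = toy_file)
rm(inv_toy, bks_toy)

# load() recreates the objects under their original names
# and returns those names as a character vector
restored <- load(toy_file)
print(restored)
print(nrow(inv_toy))
```

Saving the return value of load() is a handy way to find out which objects a .rda file contains.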
4. Create new chunks to calculate the number of cases in each data table, the names of the variables, and whatever else might occur to you. In addition, write a narrative with a short description of the contents of each data table.
5. Create a chunk to look at the number of books with each different Current.Status. Your command will look like this:
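# group_by() and tally() come from dplyr; run library(dplyr) first if it is not already loaded
Inv %>%
  group_by(Current.Status) %>%
  tally()
Then do the same for Issued.Count:
Inv %>%
  group_by(Issued.Count) %>%
  tally()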
Try to figure out what the results mean, and write your interpretation in a narrative following the chunk.
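If you want to see what group_by() followed by tally() produces before running it on the real data, you can try it on a tiny made-up table. This sketch assumes the dplyr package is installed; inv_toy and its status values are invented for illustration and are not taken from the library records.

```r
library(dplyr)

# Hypothetical toy table with one status per book
inv_toy <- data.frame(
  Current.Status = c("Available", "Checked out", "Available", "Lost", "Available")
)

# One row per distinct status, with the count of books in column n
status_counts <- inv_toy %>%
  group_by(Current.Status) %>%
  tally()

print(status_counts)
```

Each row of the result is one group, and the column n says how many cases fell into that group.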
Congratulations! You’ve completed your data project. You will submit this with your Assignment 2. The best thing is for you to submit the URL of your HTML file, small-project-books.html, with the .Rmd file embedded in it. I discussed how to do this in the previous lecture. This will be separate from the other parts of your assignment.