STA 363 Project Part 3
Getting Started
Open the same Markdown file you used for Project Part 2. You will be continuing to work on the same Markdown file for this part of the project. Keep everything!
Again, you are ADDING to your Part 2 file. You need all the previous sections in your submission for this project.
PRO TIP 1
Note that this project is a paper, not a lab. This means you need complete sentences, proper grammar and spelling, and you need to be clear in your steps and explanations. Let Dr. Dalzell know if you have any questions!
PRO TIP 2
Every section in your paper must have a transition sentence, something like “In this section, we will…”. This helps your reader follow your work, and it will also help you structure your paper.
Section R2:
Section R2: Revisions
This section is going to feel a little disjointed from the rest of the paper, and that is okay. If you don’t like it in the paper, you can submit it in the comments of your Canvas submission instead! Either way, you need to:
- Look at your comments from Project Part 2.
- If I made any suggestions for changes or improvements, you need to make these changes in the Part 2 portion of the document you submit with Part 3. In other words, since you are putting your Part 3 in the same document as Parts 1 and 2, scroll up to the Part 2 component and make your changes there.
- In this section, you are going to clearly indicate how you addressed each of those suggestions. For example: “Comment: Add transition sentences. Addressed: Transition sentences were added.”
- You WILL lose points if you do not address each comment, so make sure you let me know if you have any questions!!
I will also note that there will not be any revisions for Part 3. If I have already mentioned something to you in comments in the past, you will lose points for it in Part 3. Use the revisions as a way to help you check your work for Part 3!!
Section 5:
Section 5.1: Tree
In this section, you are going to build an appropriate tree for your response \(Y\) using all your features (or at least all the ones you used in Section 4). Your task is to:
- Build either a regression or classification tree (depending on what is appropriate for your analysis) to predict \(Y\).
- Show a formatted visual of your tree.
- Describe a few key relationships highlighted by your visualization. In other words, interpret the relationships shown in your tree. If your tree is really deep, you do not need to describe all of them, but you should have at least 2-3 sentences here.
- Compute an appropriate metric to assess how well your tree would do at prediction on test data.
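As a rough sketch of what these steps might look like in R, here is one possible workflow. Everything specific in it is an assumption made for illustration: the built-in `iris` data stands in for your data, `Sepal.Length` stands in for your response \(Y\), the `rpart` package is used to grow the tree, and test error is estimated with a single 80/20 train/test split. Your analysis may call for different choices (for example, cross-validation or a classification tree).

```r
# Sketch only: iris and Sepal.Length are stand-ins for your own data and response Y.
library(rpart)

set.seed(363)  # fix the seed so the train/test split is reproducible
test_rows <- sample(nrow(iris), size = round(0.2 * nrow(iris)))
train <- iris[-test_rows, ]
test  <- iris[test_rows, ]

# method = "anova" grows a regression tree (numeric Y);
# use method = "class" for a classification tree (categorical Y).
tree <- rpart(Sepal.Length ~ ., data = train, method = "anova")

# Draw the tree. rpart.plot is optional here; if it is not installed,
# fall back to the basic plotting functions that come with rpart.
if (requireNamespace("rpart.plot", quietly = TRUE)) {
  rpart.plot::rpart.plot(tree, main = "Regression tree for sepal length")
} else {
  plot(tree, margin = 0.1)
  text(tree, use.n = TRUE)
  title(main = "Regression tree for sepal length")
}

# Test RMSE: the typical size of a prediction error on rows the tree
# never saw, measured in the same units as Y.
tree_pred <- predict(tree, newdata = test)
tree_rmse <- sqrt(mean((test$Sepal.Length - tree_pred)^2))
tree_rmse
```

For a classification tree, the analogous metric would be something like the test classification error rate rather than the RMSE.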
Section 5.2: Forest
In this section, you are going to apply an appropriate forest to predict your response \(Y\). Your task is to:
- Build either a regression or classification forest (depending on what is appropriate for your analysis) to predict \(Y\). Make sure you are clear about how many trees you use.
- Show an importance plot for your forest. (1) Interpret the importance for one feature of your choice and (2) describe in general which three features seem most important in your forest. Do these match the features that seemed important in your tree?
- Compute an appropriate metric to assess how well your forest would do at prediction on test data.
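A minimal sketch of these steps is shown here, under several assumptions that are for illustration only: the `randomForest` package is used to fit the forest, the built-in `iris` data stands in for your data, `Sepal.Length` stands in for your response \(Y\), and 500 trees is an arbitrary choice. The out-of-bag (OOB) error is used as the estimate of test error; a separate test set would also be a reasonable choice.

```r
# Sketch only: iris and Sepal.Length are stand-ins for your own data and response Y.
library(randomForest)

set.seed(363)  # forests involve random sampling, so fix the seed

# Y is numeric here, so randomForest() grows a regression forest;
# a factor Y would give a classification forest instead.
forest <- randomForest(Sepal.Length ~ ., data = iris,
                       ntree = 500,        # number of trees; report this in your paper
                       importance = TRUE)  # also compute permutation importance

# Importance plot. type = 1 shows the permutation-based measure
# (percent increase in MSE when a feature is shuffled, for regression).
varImpPlot(forest, type = 1, main = "Feature importance in the forest")

# With no new data supplied, predict() returns the OOB predictions:
# each row is predicted using only the trees that did not train on it.
oob_pred <- predict(forest)
forest_rmse <- sqrt(mean((iris$Sepal.Length - oob_pred)^2))
forest_rmse
```

Because each OOB prediction comes from trees that never saw that row, the OOB RMSE can be read as an estimate of how the forest would perform on test data.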
Section 5.3: Comparison
Your task is to:
- State whether you would recommend using your tree or your forest for prediction and explain your choice.
- Using the model you recommend, show a plot or table of your predictions versus the true values of \(Y\) and comment on how well the model is doing at predicting \(Y\).
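One way to show predictions against the true values is sketched here. The details are assumptions for illustration only: a regression tree from `rpart` plays the role of the recommended model, the built-in `iris` data stands in for your data, and `Sepal.Length` stands in for \(Y\). Substitute whichever model you actually recommend.

```r
# Sketch only: the model, data, and response here are stand-ins for your own.
library(rpart)

set.seed(363)  # reproducible train/test split
test_rows <- sample(nrow(iris), size = 30)
train <- iris[-test_rows, ]
test  <- iris[test_rows, ]

model <- rpart(Sepal.Length ~ ., data = train)
pred  <- predict(model, newdata = test)

# Predicted versus true values on the test rows, with labelled axes and a title.
plot(test$Sepal.Length, pred,
     xlab = "True sepal length (cm)",
     ylab = "Predicted sepal length (cm)",
     main = "Predicted versus true values on test data")

# Reference line y = x: points on this line are predicted perfectly.
abline(a = 0, b = 1, lty = 2)
```

Points far from the dashed line are rows where the model predicts poorly. For a categorical \(Y\), a formatted table of predicted versus true classes would play the same role.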
Section 6
Section 6: Conclusion
Your task is to:
- Look at all the methods you have tried across all 3 parts of the project. State which you would recommend using for prediction of \(Y\) in your application and why.
- Clearly describe how well this method does at prediction (if you have done this previously, you can copy that here).
- Clearly describe if you have any reservations or concerns about using this model to predict in practice, and justify your answer.
Turning in your assignment
You have completed the project!!
Submission Component 1:
Submit a .Rmd file showing all of your code in such a way that Dr. Dalzell can re-run it and get the same answers as you show in your paper.
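Since train/test splits and forests involve random sampling, one common way to make sure your code gives the same answers every time it is re-run is to set a seed before each random step. The short sketch below illustrates the idea; the seed value 363 and the object names are arbitrary choices made for illustration.

```r
# Setting the same seed before a random step makes the result repeatable.
set.seed(363)
first_split <- sample(1:100, size = 20)

set.seed(363)
second_split <- sample(1:100, size = 20)

# The two draws are the same because the seed was the same.
identical(first_split, second_split)
```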
Submission Component 2:
Submit a PDF or HTML version of your work. Make sure:
- You have run spell check.
- There is NO raw R output (meaning no R output that is not formatted).
- There are NO formatted outputs or plots of any kind without words right near them to describe the output.
- All plots have labelled axes and titles or captions.
- You do NOT have any super long output (like printing 50 numbers on the screen).
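As one illustration of turning raw R output into formatted output, the sketch below uses `kable()` from the `knitr` package to display a small summary table with readable column names and a caption. The data (`iris`), the summary computed, and the caption text are all stand-ins chosen for illustration; other table packages would work as well.

```r
# Sketch only: iris is a stand-in for your own data.
library(knitr)

# A small summary table: mean of the response within each group.
means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# kable() renders the data frame as a formatted table when the document is knit,
# instead of printing the raw data frame.
kable(means,
      digits = 2,
      col.names = c("Species", "Mean sepal length (cm)"),
      caption = "Mean sepal length by species.")
```

Summarizing first and then formatting the summary also helps avoid super long output, since only a few rows reach the page.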
When your Markdown document is complete, do a final knit and make sure everything compiles. Look through the final document to make sure everything has knit correctly, then submit it on Canvas. You must submit a PDF or HTML document to Canvas. No other formats will be accepted.