title: "DSci110 Homework 1 Target"
author: "Rosario Ciancio"
date: "01 20, 2020"
output:
  pdf_document:
    number_sections: TRUE
fontsize: 12pt

You will need to use either direct PDF generation or conversion of a Word or HTML file to PDF.

This is where you tell the reader what it is you are going to be showing them. Typically, if this is an article, you will be outlining a background thesis, and presenting support for your conclusions.

Typing some Math

What follows is a math document that describes how you calculate the average of a set of numbers. Remember, it isn’t the math we are trying to do here. This is why we picked a very simple formula. We are trying to learn how to communicate data science ideas effectively!

So in particular: we are not trying to write a

We are trying to describe an analysis. So here we go:

Let \(x_{1}, x_{2}, x_{3}, \ldots, x_{N} = \{x_{k}\}_{k=1}^{N}\) be a set of data values. We define the average of the values \(x_{k}\) as \(\frac{1}{N}\sum_{k = 1}^{N} x_{k}\).

Now we can also describe a very important measure in data terms, namely the standard deviation.

This is a measure of how “spread out” a set of data values is. If we let \(\tilde{x}\) be the average of our data, then we define this “standard” measure of spread-outness as \(s\), given by

\[ s = \left[\frac{1}{N}\sum_{k = 1}^{N} \left(x_{k}^{2}-\tilde{x}^{2}\right)\right]^{1/2} \]

Note that these values are called sample statistics, which means they were derived from a set of data. An overall population can have a population mean, which is often denoted by \(\mu\), and a population standard deviation, denoted by \(\sigma\).
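As a concrete illustration of the two definitions above, consider a small made-up data set (chosen only for this example, not taken from the assignment): \(2, 4, 4, 4, 5, 5, 7, 9\), so \(N = 8\). A sketch of the arithmetic:

\[ \tilde{x} = \frac{2+4+4+4+5+5+7+9}{8} = \frac{40}{8} = 5 \]

\[ s = \left[\frac{1}{8}\sum_{k = 1}^{8}\left(x_{k}^{2} - 5^{2}\right)\right]^{1/2} = \left[\frac{232 - 8\cdot 25}{8}\right]^{1/2} = \left[\frac{32}{8}\right]^{1/2} = 2 \]

Here \(232\) is the sum of the squared data values, \(4+16+16+16+25+25+49+81\).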

Remarks

This formula can be simplified a bit, using the “bar” notation for average, as

\[s = \left[\overline{x^2}-\tilde{x}^2\right]^{1/2}\] where the symbol \(\overline{x^2}\) denotes the average of the squares of the data values:

\[\overline{x^2} = \frac{1}{N}\sum_{k = 1}^{N} x_{k}^{2}\]
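To see why the simplification holds, one can split the sum in the definition of \(s\). The constant \(\tilde{x}^2\) is added \(N\) times and then divided by \(N\), which leaves just \(\tilde{x}^2\):

\[ \frac{1}{N}\sum_{k = 1}^{N}\left(x_{k}^{2} - \tilde{x}^{2}\right) = \frac{1}{N}\sum_{k = 1}^{N} x_{k}^{2} - \frac{1}{N}\cdot N\tilde{x}^{2} = \overline{x^2} - \tilde{x}^{2} \]

The same quantity also equals the perhaps more familiar average squared deviation from the mean, since expanding the square gives

\[ \frac{1}{N}\sum_{k = 1}^{N}\left(x_{k} - \tilde{x}\right)^{2} = \overline{x^2} - 2\tilde{x}\cdot\tilde{x} + \tilde{x}^{2} = \overline{x^2} - \tilde{x}^{2} \]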

Basic R code (without leveraging the built-in functions) could be as follows:

  std <- sqrt( sum(x^2)/length(x) - (sum(x)/length(x))^2)

We chose “std” since it is a bit more suggestive as a variable name than just “s”. Recall that the mean, \(\tilde{x}\) (often written \(\bar{x}\), or \(\mu\) for a population), is:

  mu <- sum(x)/length(x)
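As a quick sanity check, the two one-liners above can be compared against R's built-in `mean()` and `sd()`. The following is a minimal sketch, assuming a small made-up vector `x` (not part of the assignment). The helper name `std_from_sd` is also only illustrative. Note that `sd()` divides by \(N - 1\) rather than \(N\), so it has to be rescaled before it matches the formula used in this document.

```r
# Hypothetical sample data, chosen only for illustration.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Mean and standard deviation exactly as defined in the text (divide by N).
mu  <- sum(x) / n
std <- sqrt(sum(x^2) / n - mu^2)

# Built-in sd() uses an N - 1 denominator; rescale it to the N version.
std_from_sd <- sd(x) * sqrt((n - 1) / n)

print(c(mu = mu, builtin_mean = mean(x)))
print(c(std = std, rescaled_sd = std_from_sd, builtin_sd = sd(x)))
```

For this data the hand-written `mu` agrees with `mean(x)`, and `std` agrees with the rescaled `sd(x)`. The unscaled `sd(x)` is slightly larger because of the \(N - 1\) denominator.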

Eventually, we will generate fancier graphics similar to this, using other R packages: