Question 1 [2 pts]

The displacement of a combustion engine is given by the volume displaced by the pistons in all the cylinders of the engine. Because the volume of a cylinder is the area of the base of the cylinder times the height of the cylinder, one gets the following expression for the displacement of a combustion engine:

\[\text{displacement} = \frac{\pi}{4} \times D^2 \times \text{stroke} \times \text{number of cylinders}\]

where D is the diameter of the cylinder and stroke is the distance travelled by the piston.

The engine of a Chevrolet 350 V-8 has eight cylinders, each with a cylinder base RADIUS of 2 inches and a stroke of 3.48 inches.

1.1 Use R to calculate the displacement in cubic inches for this engine.

diameter=2*2 # diameter of the cylinder base: twice the 2-inch radius
stroke=3.48 # stroke in inches
ncyl=8 # number of cylinders
disp=(pi/4)*diameter^2*stroke*ncyl # displacement formula from above
disp # print the displacement in cubic inches

## [1] 349.8478

The value of the displacement in cubic inches is 349.8477579.
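The question gives the radius, while the formula uses the diameter. A minimal sketch that makes the conversion explicit is shown here; the function name `engine_displacement` and its argument names are illustrative assumptions, not part of the assignment.

```r
# Hypothetical helper (name and arguments are illustrative assumptions).
# Displacement = (pi/4) * D^2 * stroke * number of cylinders, with D = 2 * radius.
engine_displacement <- function(radius, stroke, ncyl) {
  diameter <- 2 * radius
  (pi / 4) * diameter^2 * stroke * ncyl
}

chevy <- engine_displacement(radius = 2, stroke = 3.48, ncyl = 8)
print(round(chevy, 4))  # 349.8478 cubic inches
```

Passing the radius and doubling it inside the function avoids the common slip of plugging the radius in where the diameter belongs.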

1.2 Use this cell of R code to calculate this displacement in cubic centimetres, knowing that one inch is 2.54 centimetres:

print(paste(round(disp*2.54^3,2),"cubic centimeters")) # one cubic inch is 2.54^3 cubic centimetres

## [1] "5732.98 cubic centimeters"
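Because displacement is a volume, the length factor of 2.54 has to be cubed when converting cubic inches to cubic centimetres. The sketch below recomputes the displacement from the Question 1.1 values so that it runs on its own; the variable names are illustrative.

```r
# Volume conversion: 1 cubic inch = 2.54^3 = 16.387064 cubic centimetres.
cm_per_inch <- 2.54
cc_per_cubic_inch <- cm_per_inch^3

disp_in3 <- (pi / 4) * 4^2 * 3.48 * 8    # displacement from Question 1.1
disp_cc <- disp_in3 * cc_per_cubic_inch

print(round(disp_cc, 2))          # 5732.98 cubic centimetres
print(round(disp_cc / 1000, 2))   # 5.73 litres
```

A result close to 5.7 litres is a useful sanity check for an engine with a 350 cubic inch displacement.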

Question 2 [2 pts]

Write R code that creates the truth table of the logical function

\[!(a | b)\]

(This formula means not(a or b), where a and b are logical variables that can each take two values, T or F.)

a=c(T,T,F,F) # four logical values for a
b=c(T,F,F,T) # four logical values for b, chosen so that all four (a, b) combinations appear
M=cbind(a,b,!(a|b)) # column-bind a, b and the logical function
print(M) # print the truth table

##          a     b      
## [1,]  TRUE  TRUE FALSE
## [2,]  TRUE FALSE FALSE
## [3,] FALSE FALSE  TRUE
## [4,] FALSE  TRUE FALSE
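As a cross-check, the truth table can also be generated with `expand.grid`, which enumerates every combination of the inputs automatically. This sketch also verifies De Morgan's law, !(a | b) = !a & !b. The column names `nor` and `demorgan` are illustrative choices.

```r
# Enumerate all combinations of two logical inputs.
tt <- expand.grid(a = c(TRUE, FALSE), b = c(TRUE, FALSE))

tt$nor <- !(tt$a | tt$b)        # the function from the question
tt$demorgan <- !tt$a & !tt$b    # De Morgan equivalent

print(tt)
print(identical(tt$nor, tt$demorgan))  # TRUE when the two columns agree
```

The function is TRUE only in the row where both a and b are FALSE, which matches the table printed above.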

Question 3 [5 pts]

The probability density function of a gamma distribution with shape parameter k and scale parameter theta is the function given by the following formula:

\[ y=f(x|k,\theta)=\frac{1}{\Gamma(k)\theta^k}x^{k-1}e^{-x/\theta}\]

Write R code to calculate this function for x=3 when k is 2 and theta is 2. Use the variable name y to store the result of your computation and print y.

You will need to use the exponential function and the gamma function, which are both available in R.

For example \(e^3\) and \(\Gamma(5)\) can be calculated like this in R:

exp(3)

## [1] 20.08554

gamma(5)

## [1] 24

x=3
k=2
theta=2
y=1/(gamma(k)*theta^k)*x^(k-1)*exp(-x/theta) # gamma density formula evaluated at x
print(y) # print the y value

## [1] 0.1673476

To check that your calculation is right, you can execute this cell of code and you should find the same result as y:

dgamma(x,k,scale=theta)

## [1] 0.1673476

The value of y is 0.1673476
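The formula is vectorised in R, so the same check can be run at several points at once. The sketch below wraps the formula in a function (the name `gamma_pdf` is a hypothetical choice), compares it with `dgamma`, and confirms numerically that the density integrates to approximately 1.

```r
# Hypothetical helper implementing the density formula from the question.
gamma_pdf <- function(x, k, theta) {
  1 / (gamma(k) * theta^k) * x^(k - 1) * exp(-x / theta)
}

xs <- c(0.5, 1, 3, 10)
manual <- gamma_pdf(xs, k = 2, theta = 2)
builtin <- dgamma(xs, shape = 2, scale = 2)
print(all.equal(manual, builtin))  # TRUE if the two agree to numerical tolerance

# A probability density should integrate to 1 over its support.
area <- integrate(gamma_pdf, lower = 0, upper = Inf, k = 2, theta = 2)$value
print(area)
```

Naming the `scale` argument in `dgamma` matters because its third positional argument is the rate, not the scale.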

Question 4 [13 pts]

Download the file asmall.txt from BS and place it in the same folder as your Rmd file.

+ Read the numbers into a vector named Vs using the "scan" command. Print the vector. [2 pts]

Vs=scan("asmall.txt") # scan reads numeric values by default
Vs

##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

+ Arrange Vs into a matrix named m1 with 4 rows and 6 columns, where the first 6 numbers in the vector are in the first row, the next 6 numbers in the second row, and so on.  Use the matrix command, with the "byrow=T" argument. Print the matrix. [2 pts]

m1=matrix(as.numeric(Vs), nrow = 4, ncol = 6, byrow = T) # fill row by row from Vs; as.numeric() guards against Vs holding text
m1

##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    2    3    4    5    6
## [2,]    7    8    9   10   11   12
## [3,]   13   14   15   16   17   18
## [4,]   19   20   21   22   23   24

+  Arrange Vs into a matrix named m2 with 6 rows and 4 columns, where the first 6 numbers in the vector Vs are in the first column, the next 6 numbers in the second column, etc.  Print the matrix m2. [2 pts]

m2=matrix(as.numeric(Vs), nrow = 6, ncol = 4) # the default column-wise fill puts the first 6 numbers in column 1
m2

##      [,1] [,2] [,3] [,4]
## [1,]    1    7   13   19
## [2,]    2    8   14   20
## [3,]    3    9   15   21
## [4,]    4   10   16   22
## [5,]    5   11   17   23
## [6,]    6   12   18   24

+ Calculate, and print, the matrix product of the matrices m1 and m2, using the matrix multiplication operator "%*%". Use the name m3 for the result. [2 pts]

m3=m1%*%m2 # a 4x6 matrix times a 6x4 matrix gives a 4x4 matrix
m3

##      [,1] [,2] [,3] [,4]
## [1,]   91  217  343  469
## [2,]  217  559  901 1243
## [3,]  343  901 1459 2017
## [4,]  469 1243 2017 2791

+ Use the "det" command to calculate the determinant of matrix m3. Save the result in an object called m3det, and print m3det. [2 pts]

m3det=round(det(m3),6) # m3 has rank 2, so its exact determinant is 0; round() removes floating-point noise
m3det

## [1] 0

+ Use the "dim" command to get the dimensions of the product matrix, save this in variable m3dim and print m3dim. [1 pt]

m3dim=dim(m3) # number of rows and columns of m3
print(m3dim)

## [1] 4 4

+ Extract the third column of the product matrix into a vector V2, and print V2. [2 pts]

V2=m3[, 3] # all rows, third column
V2

## [1]  343  901 1459 2017

The value of m1 (listed column by column) is 1, 7, 13, 19, 2, 8, 14, 20, 3, 9, 15, 21, 4, 10, 16, 22, 5, 11, 17, 23, 6, 12, 18, 24

The value of m2 (listed column by column) is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24

The value of m3 (listed column by column) is 91, 217, 343, 469, 217, 559, 901, 1243, 343, 901, 1459, 2017, 469, 1243, 2017, 2791

The value of the determinant is 0

The dimension of m3 is 4, 4
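A determinant that is zero, or a tiny number such as 1e-22, suggests that the product matrix is singular. The self-contained sketch below rebuilds a 4x6 and a 6x4 matrix from the numbers 1 to 24 and checks the rank of their product with `qr`. The names `A`, `B` and `P` are illustrative and independent of the variables above.

```r
# Rebuild a 4x6 matrix (filled by row) and a 6x4 matrix (filled by column).
A <- matrix(1:24, nrow = 4, ncol = 6, byrow = TRUE)
B <- matrix(1:24, nrow = 6, ncol = 4)
P <- A %*% B

# Every row of A equals its first row plus a constant, so A has rank 2,
# and the rank of the 4x4 product cannot exceed 2.
rank_P <- qr(P)$rank
print(rank_P)

# A rank-deficient square matrix is singular: its exact determinant is 0.
# det() may still return a tiny nonzero value because of rounding error.
print(abs(det(P)) < 1e-6)
```

Comparing the determinant against a small tolerance is more robust than testing for exact equality with zero.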

Question 5 [13=3+5+5 pts]

Auto.csv contains data on an outcome variable “mpg”, which is automobile mileage (i.e. number of miles covered per gallon of gas), and a number of other variables (for example the weight of the automobile) which can be used to predict mileage.

5.1 [3 pts]

Use the read.csv command to read the data into a data frame called mydata. Use the summary command to get summary statistics for all of the variables in the data set.

mydata=read.csv("Auto.csv",header=T,quote="") # replace NA with your code
summary(mydata) # get a summary : add your line of code

##       mpg          cylinders      displacement    horsepower       
##  Min.   : 9.00   Min.   :3.000   Min.   : 68.0   Length:397        
##  1st Qu.:17.50   1st Qu.:4.000   1st Qu.:104.0   Class :character  
##  Median :23.00   Median :4.000   Median :146.0   Mode  :character  
##  Mean   :23.52   Mean   :5.458   Mean   :193.5                     
##  3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:262.0                     
##  Max.   :46.60   Max.   :8.000   Max.   :455.0                     
##      weight      acceleration        year           origin     
##  Min.   :1613   Min.   : 8.00   Min.   :70.00   Min.   :1.000  
##  1st Qu.:2223   1st Qu.:13.80   1st Qu.:73.00   1st Qu.:1.000  
##  Median :2800   Median :15.50   Median :76.00   Median :1.000  
##  Mean   :2970   Mean   :15.56   Mean   :75.99   Mean   :1.574  
##  3rd Qu.:3609   3rd Qu.:17.10   3rd Qu.:79.00   3rd Qu.:2.000  
##  Max.   :5140   Max.   :24.80   Max.   :82.00   Max.   :3.000  
##      name          
##  Length:397        
##  Class :character  
##  Mode  :character  
##                    
##                    
##
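In the summary, horsepower is reported as a character column, which usually means that at least one entry is not a number. The sketch below uses a small inline CSV rather than Auto.csv, and assumes, hypothetically, that missing values are coded as "?"; the actual marker in the file should be checked before relying on this.

```r
# Toy CSV text standing in for Auto.csv; the "?" marker is an assumption.
csv_text <- "mpg,horsepower\n18,130\n25,?\n30,95"

raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(class(raw$horsepower))     # "character": the "?" blocks numeric parsing

clean <- read.csv(text = csv_text, na.strings = "?", stringsAsFactors = FALSE)
print(class(clean$horsepower))   # "integer": "?" is now read as NA
print(sum(is.na(clean$horsepower)))
```

With the column read as numeric, `summary` reports quartiles and a count of NA values instead of Length, Class and Mode.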

5.2 [5 pts]

Make boxplots of the mileage values for the different numbers of cylinders. Make sure to add appropriate labels to the X and Y axes, and add a title to the plot.

The syntax for the boxplot command is in the Rintro notes.

In order to do this, you need to either attach the created data frame prior to using the plot command, or include the “data=mydata” option on the boxplot command. I recommend the second solution.

# one box of mpg per number of cylinders, using the data= option recommended above
boxplot(mpg~cylinders, data=mydata, main="Automobile Mileage by Number of Cylinders", xlab="Number of Cylinders", ylab="Miles per Gallon")
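Grouped boxplots are usually written with a formula of the form `response ~ group`. The sketch below uses the built-in `mtcars` data as a stand-in for Auto.csv so that it runs without the file; its column names (`mpg`, `cyl`) differ from those in the assignment data.

```r
# Built-in mtcars data used as a stand-in for Auto.csv.
bp <- boxplot(mpg ~ cyl, data = mtcars,
              main = "Mileage by Number of Cylinders",
              xlab = "Number of Cylinders",
              ylab = "Miles per Gallon")

# boxplot() invisibly returns the statistics it drew, one column per group.
print(bp$names)
print(bp$n)   # number of cars in each group
```

Passing a single vector, as in `boxplot(mtcars$mpg)`, draws one box for all cars combined; the formula is what splits the data by group.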

5.3 [5 pts]

Make a scatterplot (function plot) with miles per gallon (column mpg) on the Y axis and displacement (column displacement) on the X axis.

Make sure to label the axes.

Choose ‘displacement (cu. in.)’ as label for the x-axis.

Choose ‘miles per gallon’ as label for the y-axis.

Choose ‘mileage vs engine displacement’ as title for the plot.

You can use the ‘main’ argument for the title and the xlab and ylab arguments for the axis labels.

If you have “attached” the data frame, you won’t need the “data=mydata” syntax on the command line. Otherwise, you will.

# columns are referenced through mydata$, so attach() is not needed
plot(mydata$displacement, mydata$mpg, main = 'mileage vs engine displacement', xlab = 'displacement (cu. in.)', ylab = 'miles per gallon')

STAT 2450 Assignment 1 (35 points)

B00737628 Junyi Yin

Jan. 25 2022

PLEASE