Assignment 2

1. Read the dataset from "https://www.stats.govt.nz/assets/Uploads/Government-finance-statistics-general-government/Government-finance-statistics-general-government-Year-ended-June-2020/Download-data/government-finance-statistics-general-government-year-ended-june-2020-csv.csv" and find the datatypes of the variables. If there are any character variables, convert them to factors in the next step. Show that the data is read properly by printing the first few rows.

#Read the dataset
data = read.csv("https://www.stats.govt.nz/assets/Uploads/Government-finance-statistics-general-government/Government-finance-statistics-general-government-Year-ended-June-2020/Download-data/government-finance-statistics-general-government-year-ended-june-2020-csv.csv")

head(data)
##   Series_reference  Period Data_value  STATUS   UNITS MAGNTUDE
## 1 GFSA.SGS01G01Z90 2009.06       3374   FINAL Dollars        6
## 2 GFSA.SGS01G01Z90 2010.06      -2971   FINAL Dollars        6
## 3 GFSA.SGS01G01Z90 2011.06     -13101 REVISED Dollars        6
## 4 GFSA.SGS01G01Z90 2012.06      -3367 REVISED Dollars        6
## 5 GFSA.SGS01G01Z90 2013.06       -476 REVISED Dollars        6
## 6 GFSA.SGS01G01Z90 2014.06       1715 REVISED Dollars        6
##                                 Subject
## 1 Government Financial Statistics - GFS
## 2 Government Financial Statistics - GFS
## 3 Government Financial Statistics - GFS
## 4 Government Financial Statistics - GFS
## 5 Government Financial Statistics - GFS
## 6 Government Financial Statistics - GFS
##                                                   Group        Series_title_1
## 1 General Government, Operating Statement, Net Balances Net operating balance
## 2 General Government, Operating Statement, Net Balances Net operating balance
## 3 General Government, Operating Statement, Net Balances Net operating balance
## 4 General Government, Operating Statement, Net Balances Net operating balance
## 5 General Government, Operating Statement, Net Balances Net operating balance
## 6 General Government, Operating Statement, Net Balances Net operating balance
##   Series_title_2 Series_title_3 Series_title_4 Series_title_5
## 1                                                          NA
## 2                                                          NA
## 3                                                          NA
## 4                                                          NA
## 5                                                          NA
## 6                                                          NA
#find the datatypes of the variables.
str(data)
## 'data.frame':    2012 obs. of  13 variables:
##  $ Series_reference: chr  "GFSA.SGS01G01Z90" "GFSA.SGS01G01Z90" "GFSA.SGS01G01Z90" "GFSA.SGS01G01Z90" ...
##  $ Period          : num  2009 2010 2011 2012 2013 ...
##  $ Data_value      : int  3374 -2971 -13101 -3367 -476 1715 4244 6080 8016 9134 ...
##  $ STATUS          : chr  "FINAL" "FINAL" "REVISED" "REVISED" ...
##  $ UNITS           : chr  "Dollars" "Dollars" "Dollars" "Dollars" ...
##  $ MAGNTUDE        : int  6 6 6 6 6 6 6 6 6 6 ...
##  $ Subject         : chr  "Government Financial Statistics - GFS" "Government Financial Statistics - GFS" "Government Financial Statistics - GFS" "Government Financial Statistics - GFS" ...
##  $ Group           : chr  "General Government, Operating Statement, Net Balances" "General Government, Operating Statement, Net Balances" "General Government, Operating Statement, Net Balances" "General Government, Operating Statement, Net Balances" ...
##  $ Series_title_1  : chr  "Net operating balance" "Net operating balance" "Net operating balance" "Net operating balance" ...
##  $ Series_title_2  : chr  "" "" "" "" ...
##  $ Series_title_3  : chr  "" "" "" "" ...
##  $ Series_title_4  : chr  "" "" "" "" ...
##  $ Series_title_5  : logi  NA NA NA NA NA NA ...

If there are any character variables, convert them to factors in the next step. Show that the data is read properly by printing the first few rows.

#Read the dataset again, this time converting character columns to factors
data = read.csv("https://www.stats.govt.nz/assets/Uploads/Government-finance-statistics-general-government/Government-finance-statistics-general-government-Year-ended-June-2020/Download-data/government-finance-statistics-general-government-year-ended-june-2020-csv.csv", stringsAsFactors = TRUE)

#printing first few rows.
head(data)
##   Series_reference  Period Data_value  STATUS   UNITS MAGNTUDE
## 1 GFSA.SGS01G01Z90 2009.06       3374   FINAL Dollars        6
## 2 GFSA.SGS01G01Z90 2010.06      -2971   FINAL Dollars        6
## 3 GFSA.SGS01G01Z90 2011.06     -13101 REVISED Dollars        6
## 4 GFSA.SGS01G01Z90 2012.06      -3367 REVISED Dollars        6
## 5 GFSA.SGS01G01Z90 2013.06       -476 REVISED Dollars        6
## 6 GFSA.SGS01G01Z90 2014.06       1715 REVISED Dollars        6
##                                 Subject
## 1 Government Financial Statistics - GFS
## 2 Government Financial Statistics - GFS
## 3 Government Financial Statistics - GFS
## 4 Government Financial Statistics - GFS
## 5 Government Financial Statistics - GFS
## 6 Government Financial Statistics - GFS
##                                                   Group        Series_title_1
## 1 General Government, Operating Statement, Net Balances Net operating balance
## 2 General Government, Operating Statement, Net Balances Net operating balance
## 3 General Government, Operating Statement, Net Balances Net operating balance
## 4 General Government, Operating Statement, Net Balances Net operating balance
## 5 General Government, Operating Statement, Net Balances Net operating balance
## 6 General Government, Operating Statement, Net Balances Net operating balance
##   Series_title_2 Series_title_3 Series_title_4 Series_title_5
## 1                                                          NA
## 2                                                          NA
## 3                                                          NA
## 4                                                          NA
## 5                                                          NA
## 6                                                          NA
#find the datatypes of the variables.
str(data)
## 'data.frame':    2012 obs. of  13 variables:
##  $ Series_reference: Factor w/ 167 levels "GFSA.SGS01G01Z90",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Period          : num  2009 2010 2011 2012 2013 ...
##  $ Data_value      : int  3374 -2971 -13101 -3367 -476 1715 4244 6080 8016 9134 ...
##  $ STATUS          : Factor w/ 3 levels "FINAL","PROVISIONAL",..: 1 1 3 3 3 3 3 3 3 3 ...
##  $ UNITS           : Factor w/ 1 level "Dollars": 1 1 1 1 1 1 1 1 1 1 ...
##  $ MAGNTUDE        : int  6 6 6 6 6 6 6 6 6 6 ...
##  $ Subject         : Factor w/ 1 level "Government Financial Statistics - GFS": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Group           : Factor w/ 26 levels "General Government, Balance Sheet, Financial Assets",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ Series_title_1  : Factor w/ 57 levels "ACC outstanding claims liability",..: 30 30 30 30 30 30 30 30 30 30 ...
##  $ Series_title_2  : Factor w/ 47 levels "","Cash","Customs and other import duties",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Series_title_3  : Factor w/ 12 levels "","Additions",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Series_title_4  : Factor w/ 6 levels "","Additions",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Series_title_5  : logi  NA NA NA NA NA NA ...
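
Re-reading the file with `stringsAsFactors = TRUE` works, but the conversion can also be done on a data frame that is already in memory. The following is a minimal sketch of that idea, assuming a small hypothetical data frame `toy` that only mimics two columns of the dataset (it is not the real data):

```r
# Hypothetical toy data frame standing in for the downloaded dataset
toy <- data.frame(
  STATUS     = c("FINAL", "REVISED", "FINAL"),
  Data_value = c(3374L, -2971L, -13101L),
  stringsAsFactors = FALSE
)

# Flag the character columns, then convert only those to factors
is_chr <- sapply(toy, is.character)
toy[is_chr] <- lapply(toy[is_chr], factor)

str(toy)            # STATUS is now a factor; Data_value is still an integer
levels(toy$STATUS)  # "FINAL" "REVISED"
```

Converting in place avoids downloading the file a second time and leaves the non-character columns untouched.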

2. Create a 3*3 matrix A using the following data [4,5,7,8,0,9,5,4,8]. Find the transpose of the matrix.

#Create a 3*3 matrix A
A = matrix(c(4,5,7,8,0,9,5,4,8),3,3)
A
##      [,1] [,2] [,3]
## [1,]    4    8    5
## [2,]    5    0    4
## [3,]    7    9    8
#Find the transpose of the matrix
A_trans = t(A)
A_trans
##      [,1] [,2] [,3]
## [1,]    4    5    7
## [2,]    8    0    9
## [3,]    5    4    8
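
Note that `matrix()` fills column by column unless told otherwise, which is why the first column of A is 4, 5, 7. As a small sketch (the names `x`, `A_col` and `A_row` are introduced here for illustration only), filling by row with `byrow = TRUE` gives the same matrix as the transpose of the column-wise fill:

```r
x <- c(4, 5, 7, 8, 0, 9, 5, 4, 8)

A_col <- matrix(x, nrow = 3, ncol = 3)                # default: fill by column
A_row <- matrix(x, nrow = 3, ncol = 3, byrow = TRUE)  # fill by row instead

A_row
all(A_row == t(A_col))     # TRUE: row-wise fill equals the transpose
all(t(t(A_col)) == A_col)  # TRUE: transposing twice returns the original
```

Which fill order is intended depends on how the question's data is meant to be read, so it is worth stating the choice explicitly.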

3. Create a 3*3 matrix B using the following data [14,52,75,89,10,91,51,44,28]. Using matrix A and B find A + B, A - B and AB.

#Create a 3*3 matrix B
B = matrix(c(14,52,75,89,10,91,51,44,28),3,3)
B
##      [,1] [,2] [,3]
## [1,]   14   89   51
## [2,]   52   10   44
## [3,]   75   91   28
#A + B
A + B
##      [,1] [,2] [,3]
## [1,]   18   97   56
## [2,]   57   10   48
## [3,]   82  100   36
#A-B
A-B
##      [,1] [,2] [,3]
## [1,]  -10  -81  -46
## [2,]  -47  -10  -40
## [3,]  -68  -82  -20
#A*B multiplies element by element; the matrix product AB is written A %*% B
A*B
##      [,1] [,2] [,3]
## [1,]   56  712  255
## [2,]  260    0  176
## [3,]  525  819  224
#A/B
A/B
##            [,1]       [,2]       [,3]
## [1,] 0.28571429 0.08988764 0.09803922
## [2,] 0.09615385 0.00000000 0.09090909
## [3,] 0.09333333 0.09890110 0.28571429
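
If AB in the question means the usual matrix product (rows of A times columns of B), the operator is `%*%` rather than `*`. A minimal sketch using the same A and B as above; the name `AB` is introduced here for illustration:

```r
A <- matrix(c(4, 5, 7, 8, 0, 9, 5, 4, 8), 3, 3)
B <- matrix(c(14, 52, 75, 89, 10, 91, 51, 44, 28), 3, 3)

# Matrix product: entry [i, j] is the sum of A[i, ] * B[, j]
AB <- A %*% B
AB
##      [,1] [,2] [,3]
## [1,]  847  891  696
## [2,]  370  809  367
## [3,] 1166 1441  977

# Check one entry by hand: 4*14 + 8*52 + 5*75 = 847
sum(A[1, ] * B[, 1])
```

Unlike the element-wise product, the matrix product is generally not commutative, so `A %*% B` and `B %*% A` usually differ.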

4. Create 2 vectors with 7 elements each and perform all the vector operations.

V1 = c(1,2,3,4,5,6,7)
V2 = c(7,6,5,4,3,2,1)

V1+V2
## [1] 8 8 8 8 8 8 8
V1-V2
## [1] -6 -4 -2  0  2  4  6
V1*V2
## [1]  7 12 15 16 15 12  7
V1/V2
## [1] 0.1428571 0.3333333 0.6000000 1.0000000 1.6666667 3.0000000 7.0000000
cbind(V1,V2)
##      V1 V2
## [1,]  1  7
## [2,]  2  6
## [3,]  3  5
## [4,]  4  4
## [5,]  5  3
## [6,]  6  2
## [7,]  7  1
rbind(V1,V2)
##    [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## V1    1    2    3    4    5    6    7
## V2    7    6    5    4    3    2    1
as.numeric(V1)
## [1] 1 2 3 4 5 6 7
as.character(V2)
## [1] "7" "6" "5" "4" "3" "2" "1"
data.frame(V1,V2)
##   V1 V2
## 1  1  7
## 2  2  6
## 3  3  5
## 4  4  4
## 5  5  3
## 6  6  2
## 7  7  1
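
A few further element-wise operations could be added to the list above. This is only a sketch of some common ones, not an exhaustive set; the result names (`dot`, `rem`, `quot`, `sq`, `gt`) are introduced here for illustration:

```r
V1 <- c(1, 2, 3, 4, 5, 6, 7)
V2 <- c(7, 6, 5, 4, 3, 2, 1)

dot  <- sum(V1 * V2)  # dot product: 84
rem  <- V1 %% V2      # remainder after division
quot <- V1 %/% V2     # integer division
sq   <- V1 ^ 2        # element-wise power
gt   <- V1 > V2       # element-wise comparison, gives a logical vector

dot
rem
quot
sq
gt
```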

5. Create the following vectors and create a data frame using those vectors. Also find the characteristics of the data frame.

id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance

# One vector per column. Dates must be quoted and parsed with as.Date();
# unquoted, 2012-01-01 is evaluated as the subtraction 2012 - 1 - 1 = 2010.
id = 1:8
name = c("Rick","Dan","Michelle","Ryan","Gary","Nina","Simon","Guru")
salary = c(623.3,515.2,611,729,843.25,578,632.8,722.5)
start_date = as.Date(c("2012-01-01","2013-09-23","2014-11-15","2014-05-11","2015-03-27","2013-05-21","2013-07-30","2014-06-17"))
dept = c("IT","Operations","IT","HR","Finance","IT","Operations","Finance")


# create a data frame using those vectors.

data_frame = data.frame(id,name,salary,start_date,dept)

data_frame
##   id     name salary start_date       dept
## 1  1     Rick 623.30 2012-01-01         IT
## 2  2      Dan 515.20 2013-09-23 Operations
## 3  3 Michelle 611.00 2014-11-15         IT
## 4  4     Ryan 729.00 2014-05-11         HR
## 5  5     Gary 843.25 2015-03-27    Finance
## 6  6     Nina 578.00 2013-05-21         IT
## 7  7    Simon 632.80 2013-07-30 Operations
## 8  8     Guru 722.50 2014-06-17    Finance
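
The question also asks for the characteristics of the data frame. One way to inspect them is with `str()`, `dim()`, `names()` and `summary()`. The sketch below is self-contained: it reads the table from the question as CSV text, and the names `csv_text` and `emp` are hypothetical, chosen here for illustration:

```r
# CSV text copied from the table in the question
csv_text <- "id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Michelle,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Nina,578,2013-05-21,IT
7,Simon,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance"

emp <- read.csv(text = csv_text, stringsAsFactors = FALSE)
emp$start_date <- as.Date(emp$start_date)  # parse the date strings as Date

str(emp)             # column types and a preview of the values
dim(emp)             # 8 rows, 5 columns
names(emp)           # column names
summary(emp$salary)  # five-number summary plus the mean of salary
```

Here `id` is read as integer, `salary` as numeric, and `name` and `dept` as character; `start_date` becomes a `Date` only after the explicit `as.Date()` call.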