1. Data Exploration: This should include summary statistics, means, medians, quartiles, or any other relevant information about the data set. Please include some conclusions in the R Markdown text.

Titanic1<-read.table(file="C:\\Users\\a\\Desktop\\Cuny Homework\\Titanic1.csv",header=TRUE,sep=",")
View(Titanic1)
head(Titanic1,10)
##     X                                          Name PClass   Age    Sex
## 1   1                  Allen, Miss Elisabeth Walton    1st 29.00 female
## 2   2                   Allison, Miss Helen Loraine    1st  2.00 female
## 3   3           Allison, Mr Hudson Joshua Creighton    1st 30.00   male
## 4   4 Allison, Mrs Hudson JC (Bessie Waldo Daniels)    1st 25.00 female
## 5   5                 Allison, Master Hudson Trevor    1st  0.92   male
## 6   6                            Anderson, Mr Harry    1st 47.00   male
## 7   7              Andrews, Miss Kornelia Theodosia    1st 63.00 female
## 8   8                        Andrews, Mr Thomas, jr    1st 39.00   male
## 9   9  Appleton, Mrs Edward Dale (Charlotte Lamson)    1st 58.00 female
## 10 10                        Artagaveytia, Mr Ramon    1st 71.00   male
##    Survived SexCode
## 1         1       1
## 2         0       1
## 3         0       0
## 4         0       1
## 5         1       0
## 6         1       0
## 7         1       1
## 8         0       0
## 9         1       1
## 10        0       0
tail(Titanic1,10)
##         X                   Name PClass Age    Sex Survived SexCode
## 1304 1304     Yasbeck, Mr Antoni    3rd  27   male        0       0
## 1305 1305    Yasbeck, Mrs Antoni    3rd  15 female        1       1
## 1306 1306     Youssef, Mr Gerios    3rd  NA   male        0       0
## 1307 1307    Zabour, Miss Hileni    3rd  NA female        0       1
## 1308 1308    Zabour, Miss Tamini    3rd  NA female        0       1
## 1309 1309     Zakarian, Mr Artun    3rd  27   male        0       0
## 1310 1310 Zakarian, Mr Maprieder    3rd  26   male        0       0
## 1311 1311       Zenni, Mr Philip    3rd  22   male        0       0
## 1312 1312       Lievens, Mr Rene    3rd  24   male        0       0
## 1313 1313         Zimmerman, Leo    3rd  29   male        0       0
summary(Titanic1)
##        X                                  Name      PClass   
##  Min.   :   1   Carlsson, Mr Frans Olof     :   2   *  :  1  
##  1st Qu.: 329   Connolly, Miss Kate         :   2   1st:322  
##  Median : 657   Kelly, Mr James             :   2   2nd:279  
##  Mean   : 657   Abbing, Mr Anthony          :   1   3rd:711  
##  3rd Qu.: 985   Abbott, Master Eugene Joseph:   1            
##  Max.   :1313   Abbott, Mr Rossmore Edward  :   1            
##                 (Other)                     :1304            
##       Age            Sex         Survived         SexCode      
##  Min.   : 0.17   female:462   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:21.00   male  :851   1st Qu.:0.0000   1st Qu.:0.0000  
##  Median :28.00                Median :0.0000   Median :0.0000  
##  Mean   :30.40                Mean   :0.3427   Mean   :0.3519  
##  3rd Qu.:39.00                3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :71.00                Max.   :1.0000   Max.   :1.0000  
##  NA's   :557
str(Titanic1)
## 'data.frame':    1313 obs. of  7 variables:
##  $ X       : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Name    : Factor w/ 1310 levels "Abbing, Mr Anthony",..: 22 25 26 27 24 31 45 46 50 54 ...
##  $ PClass  : Factor w/ 4 levels "*","1st","2nd",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ Age     : num  29 2 30 25 0.92 47 63 39 58 71 ...
##  $ Sex     : Factor w/ 2 levels "female","male": 1 1 2 1 2 2 1 2 1 2 ...
##  $ Survived: int  1 0 0 0 1 1 1 0 1 0 ...
##  $ SexCode : int  1 1 0 1 0 0 1 0 1 0 ...
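
The summary above reports 557 missing ages out of 1313 rows (about 42%), so any age statistic describes only the passengers whose age was recorded. The sketch below shows one way to count missing values and compute quartiles explicitly. The vector `age` is an illustrative stand-in (the ten ages printed by `head()` plus three `NA`s), not the full column.

```r
# Illustrative excerpt only: the ten ages shown by head(Titanic1, 10),
# followed by three missing values like those visible in tail().
age <- c(29, 2, 30, 25, 0.92, 47, 63, 39, 58, 71, NA, NA, NA)

# Number and share of missing ages
n_missing <- sum(is.na(age))
share_missing <- mean(is.na(age))

# Quartiles; na.rm = TRUE drops the NAs before computing
age_quartiles <- quantile(age, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)

n_missing
share_missing
age_quartiles
```

On the real data, the same calls would be applied to `Titanic1$Age`.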

2. Data wrangling: Please perform some basic transformations. They will need to make sense but could include column renaming, creating a subset of the data, replacing values, or creating new columns with derived data (for example, if it makes sense, you could sum two columns together).

###Data subsetting
Titanic2<-Titanic1[, c(2,4,6)]
head(Titanic2,5)
##                                            Name   Age Survived
## 1                  Allen, Miss Elisabeth Walton 29.00        1
## 2                   Allison, Miss Helen Loraine  2.00        0
## 3           Allison, Mr Hudson Joshua Creighton 30.00        0
## 4 Allison, Mrs Hudson JC (Bessie Waldo Daniels) 25.00        0
## 5                 Allison, Master Hudson Trevor  0.92        1
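
Selecting columns by position works, but it silently picks the wrong columns if the column order ever changes. A possible alternative is to select by name, sketched here on a small stand-in data frame (`Titanic_demo` is a hypothetical three-row excerpt built from the `head()` output above).

```r
# Hypothetical three-row excerpt with the same column layout as Titanic1
Titanic_demo <- data.frame(
  X = 1:3,
  Name = c("Allen, Miss Elisabeth Walton",
           "Allison, Miss Helen Loraine",
           "Allison, Mr Hudson Joshua Creighton"),
  PClass = c("1st", "1st", "1st"),
  Age = c(29, 2, 30),
  Sex = c("female", "female", "male"),
  Survived = c(1, 0, 0),
  SexCode = c(1, 1, 0)
)

# Select by name; for this layout it matches selection by position
by_name <- Titanic_demo[, c("Name", "Age", "Survived")]
by_position <- Titanic_demo[, c(2, 4, 6)]

identical(by_name, by_position)
```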
###Rename Columns
colnames(Titanic1)<-c("Count","Passenger Name", "Class", "Passenger Age", "Passenger Sex", "Surv", "SCode")
View(Titanic1)
###Replacing Value
Titanic1$"Surv"[Titanic1$`Surv` == 1]<-"Y"
Titanic1$"Surv"[Titanic1$`Surv` == 0]<-"N"
View(Titanic1)
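
The two-step replacement above works because the first assignment converts the column to character, so the second comparison matches the string "0". A single-step recode with `factor()` is one alternative that avoids relying on that coercion. The vector `survived` below is an illustrative excerpt (the ten values printed by `head()`), and the names `survived` and `surv` are assumptions for this sketch.

```r
# Survived codes for the first ten passengers, from head(Titanic1, 10)
survived <- c(1, 0, 0, 0, 1, 1, 1, 0, 1, 0)

# Map 0 -> "N" and 1 -> "Y" in one step; levels/labels fix the order
surv <- factor(survived, levels = c(0, 1), labels = c("N", "Y"))

surv_counts <- table(surv)
surv_counts
```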
###Creating a new column by concatenating 2 columns
Titanic1$NewCol<-paste(Titanic1$`Passenger Name`,Titanic1$`Passenger Sex`)
summary(Titanic1)
##      Count                           Passenger Name Class    
##  Min.   :   1   Carlsson, Mr Frans Olof     :   2   *  :  1  
##  1st Qu.: 329   Connolly, Miss Kate         :   2   1st:322  
##  Median : 657   Kelly, Mr James             :   2   2nd:279  
##  Mean   : 657   Abbing, Mr Anthony          :   1   3rd:711  
##  3rd Qu.: 985   Abbott, Master Eugene Joseph:   1            
##  Max.   :1313   Abbott, Mr Rossmore Edward  :   1            
##                 (Other)                     :1304            
##  Passenger Age   Passenger Sex     Surv               SCode       
##  Min.   : 0.17   female:462    Length:1313        Min.   :0.0000  
##  1st Qu.:21.00   male  :851    Class :character   1st Qu.:0.0000  
##  Median :28.00                 Mode  :character   Median :0.0000  
##  Mean   :30.40                                    Mean   :0.3519  
##  3rd Qu.:39.00                                    3rd Qu.:1.0000  
##  Max.   :71.00                                    Max.   :1.0000  
##  NA's   :557                                                      
##     NewCol         
##  Length:1313       
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
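
The summary shows one passenger whose class is recorded as `*`, which looks like a placeholder rather than a real class. One possible cleanup, sketched on a hypothetical five-value factor (`class_demo` is made up for illustration), is to treat that value as missing and drop the unused level.

```r
# Hypothetical excerpt: class labels including the stray "*" value
class_demo <- factor(c("1st", "2nd", "*", "3rd", "3rd"))

# Treat "*" as missing, then drop the now-empty factor level
class_clean <- class_demo
class_clean[class_clean == "*"] <- NA
class_clean <- droplevels(class_clean)

levels(class_clean)
sum(is.na(class_clean))
```

Whether to recode the value as `NA` or remove the row depends on the analysis; this sketch only shows the mechanics.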

3. Graphics: Please make sure to display at least one scatter plot, box plot and histogram. Don’t be limited to this. Please explore the many other options in R packages such as ggplot2.

hist(Titanic1$`Passenger Age`, main = "Histogram", xlab = "Age", xlim = c(0,80),ylim=c(0,150), col = 31)

plot.default(Titanic1$`Passenger Age`,main="Scatter plot",ylab=" Age", col=30, xlim = c(0,1313))

plot(Titanic1$"Class", Titanic1$"Passenger Age",main="Box plot", xlab="Class",ylab="Age", ylim=c(0,80), col=30)

###Survived Passengers vs Not Survived; with respect to Gender
Survived_Gender<-table(Titanic1$`Passenger Sex`,Titanic1$Surv)
Survived_Gender
##         
##            N   Y
##   female 154 308
##   male   709 142
plot(Survived_Gender)

###Survived Passengers vs Not Survived; with respect to Class
Survived_Class<-table(Titanic1$Class,Titanic1$Surv)
Survived_Class
##      
##         N   Y
##   *     1   0
##   1st 129 193
##   2nd 160 119
##   3rd 573 138
plot(Survived_Class)
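
Raw counts are hard to compare because the groups differ in size (851 males vs. 462 females, 711 third-class vs. 322 first-class passengers). Row proportions make the survival rates directly comparable. The sketch below re-enters the counts printed above as matrices (the single `*` row is left out) and uses `prop.table()`; the object names are assumptions for this example.

```r
# Counts copied from the two tables printed above (the "*" row is omitted)
gender_counts <- matrix(c(154, 709, 308, 142), nrow = 2,
                        dimnames = list(c("female", "male"), c("N", "Y")))
class_counts <- matrix(c(129, 160, 573, 193, 119, 138), nrow = 3,
                       dimnames = list(c("1st", "2nd", "3rd"), c("N", "Y")))

# margin = 1 divides each row by its total, so every row sums to 1
rate_gender <- round(prop.table(gender_counts, margin = 1), 3)
rate_class <- round(prop.table(class_counts, margin = 1), 3)

rate_gender
rate_class
```

On the real objects, `prop.table(Survived_Gender, 1)` and `prop.table(Survived_Class, 1)` would give the same kind of result without retyping the counts.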

4. Meaningful question for analysis: Please state at the beginning a meaningful question for analysis. Use the first three steps and anything else that would be helpful to answer the question you are posing from the data set you chose. Please write a brief conclusion paragraph in R Markdown at the end.

Based on the data analysis, the following conclusions can be drawn: 1. Among passengers with a recorded age, adults aged 20-50 outnumbered children and elderly people, and within that group ages 20-25 were the most common. 2. Older passengers tended to travel in a higher class. 3. More females survived than males (308 vs. 142), even though males outnumbered females on board (851 vs. 462). 4. More passengers from 1st class survived (193) than from 2nd class (119) or 3rd class (138).
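
The link between age and class can also be checked numerically, for example with the median age per class. The sketch below uses only the twenty rows printed by `head()` and `tail()` earlier (ten 1st-class and ten 3rd-class passengers), so its numbers illustrate the method and are not results for the full data set. The names `demo` and `median_age` are assumptions.

```r
# Excerpt: the 20 rows shown by head() and tail(), not the full data
demo <- data.frame(
  PClass = c(rep("1st", 10), rep("3rd", 10)),
  Age = c(29, 2, 30, 25, 0.92, 47, 63, 39, 58, 71,
          27, 15, NA, NA, NA, 27, 26, 22, 24, 29)
)

# Median age within each class, ignoring missing ages
median_age <- tapply(demo$Age, demo$PClass, median, na.rm = TRUE)
median_age
```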

5. BONUS - place the original .csv in a GitHub file and have R read from the link. This will be a very useful skill as you progress in your data science education and career.

Titanic3<-read.csv("https://raw.githubusercontent.com/uplotnik/Final-Project-R-Bridge-Course/master/Titanic1.csv")