This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.
Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter.
plot(cars)
Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).
The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.
R SQL
#getwd()
#Installed the package sqldf.
install.packages("sqldf")
Installing package into ‘/cloud/lib/x86_64-pc-linux-gnu-library/4.2’
(as ‘lib’ is unspecified)
trying URL 'http://rspm/default/__linux__/focal/latest/src/contrib/sqldf_0.4-11.tar.gz'
Content type 'application/x-gzip' length 76354 bytes (74 KB)
==================================================
downloaded 74 KB
* installing *binary* package ‘sqldf’ ...
* DONE (sqldf)
The downloaded source packages are in
‘/tmp/RtmpZvOBz8/downloaded_packages’
library(sqldf)
Loading required package: gsubfn
Loading required package: proto
Warning: couldn't connect to display ":0"Loading required package: RSQLite
data1<-read.csv("quiz2 (2).csv",header = TRUE,sep = ",")
#A summary of the data.
summary(data1)
subregion region pop
Length:29 Length:29 Min. : 2311
Class :character Class :character 1st Qu.: 19005
Mode :character Mode :character Median : 32122
Mean : 135940
3rd Qu.: 101482
Max. :1554720
income ipaddr ufo2010
Min. :26784 Min. : 637 Min. : 0.00
1st Qu.:37970 1st Qu.: 12294 1st Qu.: 0.00
Median :41595 Median : 30418 Median : 2.00
Mean :43858 Mean : 440130 Mean : 16.66
3rd Qu.:47469 3rd Qu.: 102104 3rd Qu.: 9.00
Max. :70821 Max. :5394949 Max. :169.00
infections
Min. : 39
1st Qu.: 123
Median : 245
Mean :1117
3rd Qu.: 672
Max. :6781
#I had to run a query that returned every record from the table data1.
sqldf("SELECT * FROM data1")
#I created a query that returns the first 10 records of the data1 table.
sqldf('SELECT * FROM data1 LIMIT 10')
sqldf("SELECT * FROM data1 ORDER BY infections DESC LIMIT 5")
#Run a similar query (limit 7) including the attributes ipadddr , ufo2010 ,and infections.
sqldf("SELECT ipaddr , ufo2010 ,infections FROM data1 ORDER BY infections DESC LIMIT 7")
#Created a query income,region and population when the condition is income>=50,000.
#run a query that returns every region and population with a number of infections between 1000 and 7000.
sqldf('SELECT region,pop FROM data1 WHERE infections between 1000 and 7000')
The scenario could be the variables of infections with the first letter A in subregions.
sqldf('SELECT * FROM data1 WHERE subregion LIKE "A%"')
The average number of infections.
sqldf("SELECT AVG(infections) FROM data1")
Applied the summary to data2 and examined the output.
summary(data2)
region pop income ipaddr ufo2010
Length:29 Min. : 2311 Min. :26784 Min. : 637 Min. : 0.00
Class :character 1st Qu.: 19005 1st Qu.:37970 1st Qu.: 12294 1st Qu.: 0.00
Mode :character Median : 32122 Median :41595 Median : 30418 Median : 2.00
Mean : 135940 Mean :43858 Mean : 440130 Mean : 16.66
3rd Qu.: 101482 3rd Qu.:47469 3rd Qu.: 102104 3rd Qu.: 9.00
Max. :1554720 Max. :70821 Max. :5394949 Max. :169.00
infections
Min. : 39
1st Qu.: 123
Median : 245
Mean :1117
3rd Qu.: 672
Max. :6781
Run a query where region affects the number of infections.
sqldf('SELECT region,infections FROM data2')
Run a query where population affects the number of infections.
sqldf('SELECT pop,infections FROM data2')
Run a query where income affects the number of infections.
sqldf('SELECT income,infections FROM data2')
Run a query where ufo2010 affects the number of infections
sqldf('SELECT ufo2010,infections FROM data2')
Run a query where the different interactions of region and income, and income and population affects the number of infections
sqldf('SELECT region, income, income, pop, infections FROM data2 ORDER by infections')