Chatterjee and Hadi (Regression Analysis by Example, 2006) provide a link to the Right-To-Work data set on their web page. Display the relationship between Income and Taxes.
# Import tab-delimited files
q1fL <- "http://www1.aucegypt.edu/faculty/hadi/RABE5/Data5/P005.txt"
q1dta <- read.csv(q1fL, sep='\t')
dplyr::glimpse(q1dta)
## Rows: 38
## Columns: 8
## $ City <chr> "Atlanta", "Austin", "Bakersfield", "Baltimore", "Baton Rouge",~
## $ COL <int> 169, 143, 339, 173, 99, 363, 253, 117, 294, 291, 170, 239, 174,~
## $ PD <int> 414, 239, 43, 951, 255, 1257, 834, 162, 229, 1886, 643, 1295, 3~
## $ URate <dbl> 13.6, 11.0, 23.7, 21.0, 16.0, 24.4, 39.2, 31.5, 18.2, 31.5, 29.~
## $ Pop <int> 1790128, 396891, 349874, 2147850, 411725, 3914071, 1326848, 162~
## $ Taxes <int> 5128, 4303, 4166, 5001, 3965, 4928, 4471, 4813, 4839, 5408, 463~
## $ Income <int> 2961, 1711, 2122, 4654, 1620, 5634, 7213, 5535, 7224, 6113, 480~
## $ RTWL <int> 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, ~
head(q1dta[,c(1,6,7)]) |> knitr::kable()
City | Taxes | Income |
---|---|---|
Atlanta | 5128 | 2961 |
Austin | 4303 | 1711 |
Bakersfield | 4166 | 2122 |
Baltimore | 5001 | 4654 |
Baton Rouge | 3965 | 1620 |
Boston | 4928 | 5634 |
# Select the variables to analyze
mydata <- q1dta[,c(1,6,7)]
summary(mydata)
## City Taxes Income
## Length:38 Min. :3965 Min. : 782
## Class :character 1st Qu.:4620 1st Qu.:3110
## Mode :character Median :4858 Median :4865
## Mean :4903 Mean :4709
## 3rd Qu.:5166 3rd Qu.:6082
## Max. :6404 Max. :8392
# Scatter plot
with(mydata, plot(Income, Taxes,
xlab = "Income",
ylab = "Taxes",
main = "Overall relationship between tax and income",
xlim = c(750,8500),
ylim = c(3900, 6500),
type='n', bty='n'))
grid()
abline(lm(Taxes ~ Income, data=mydata))
with(mydata, text(Income, Taxes,
labels=City,
cex=.5))
There appears to be a weak positive relationship between income and taxes.
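The visual impression can be checked numerically. The sketch below uses only the six rows printed by head() above, so whatever it returns is illustrative and is not the result for all 38 cities. The object name six_cities is made up for this example.

```r
# Six rows copied from the head() output above (not the full data set)
six_cities <- data.frame(
  City   = c("Atlanta", "Austin", "Bakersfield", "Baltimore", "Baton Rouge", "Boston"),
  Taxes  = c(5128, 4303, 4166, 5001, 3965, 4928),
  Income = c(2961, 1711, 2122, 4654, 1620, 5634)
)

# Pearson correlation between Income and Taxes
r <- with(six_cities, cor(Income, Taxes))

# Slope of the least-squares line, the same model abline() draws in the plot
slope <- unname(coef(lm(Taxes ~ Income, data = six_cities))["Income"])

print(c(correlation = r, slope = slope))
```

Replacing six_cities with mydata would give the values for the full data set.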
The PDF version of the paper by Hsu et al. (2020), Health inequality: a longitudinal study on geographic variations in lung cancer incidence and mortality in Taiwan, is available online. Extract the values in Table 1 of the article to replicate Figure 2 in the paper.
# Extract tables from web pages
pacman::p_load(rvest)
q2fL <- "https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-020-09044-2/tables/1"
df <- q2fL |>
read_html() |>
html_nodes("tables") |>
html_table(fill = T,header = T) # fill = T pads missing cells with NA
#dta <- data.frame(df[[1]])[-1,1:5] # take the first table, drop the first (header) row, keep columns 1 to 5
#head(dta,7) |> knitr::kable()
This produces the error message "Error in df[[1]] : subscript out of bounds". I do not understand why the subscript is out of bounds?!
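A likely cause, offered as a guess rather than a confirmed diagnosis: html_nodes() takes a CSS selector, and the HTML element is named table, not tables. A selector that matches nothing yields an empty node set, html_table() then returns an empty list, and df[[1]] has nothing to index. The sketch below reproduces this on a small inline page built with rvest::minimal_html(). The page content is made up, and whether the article page serves Table 1 as a plain table element is an assumption.

```r
library(rvest)

# A made-up page with one table, standing in for the article page
page <- minimal_html("
  <table>
    <tr><th>Region</th><th>Rate</th></tr>
    <tr><td>North</td><td>1.5</td></tr>
    <tr><td>South</td><td>2.5</td></tr>
  </table>")

# The selector "tables" matches no element, so the result is an empty list
no_match <- page |> html_nodes("tables") |> html_table()
length(no_match)

# The selector "table" matches the <table> element
tabs <- page |> html_nodes("table") |> html_table()
length(tabs)
tabs[[1]]
```

If the assumption holds, changing the selector to "table" in the pipeline above should make df[[1]] available.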
Reproducible data and code are available for Leongomez, J.D. et al. (2020). Self-reported Health is Related to Body Height and Waist Circumference in Rural Indigenous and Urbanised Latin-American Populations. Scientific Reports, 10, 4391.
Summarize the mean of height, waist, and weight in each study sample by gender.
# Download a file to a directory
dir.create(file.path(getwd(), "tmp_data"), showWarnings=FALSE)
fL <- "https://osf.io/kxut3/"
fD <- "./tmp_data/Full_data.csv(Version: 2)"
download.file(fL, destfile = fD)
This prints the message: trying URL 'https://osf.io/kxut3/' Content type 'text/html; charset=utf-8' length 41091 bytes (40 KB) downloaded 40 KB
Does this mean the data have been downloaded??
#data.table::fread(fD, fill=F)
I could not resolve the error :(
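The download message reports Content type 'text/html', which suggests that download.file() saved the OSF web page for the file rather than the CSV itself. That would also explain why fread() fails on it. A minimal sketch of how to check this, and what to try instead, follows. The helper looks_like_html() is hypothetical, and the /download suffix on the OSF URL is an assumption about how OSF serves files, not something confirmed by the source.

```r
# Hypothetical helper: TRUE if the file starts like an HTML page
looks_like_html <- function(path) {
  first <- readLines(path, n = 5, warn = FALSE)
  any(grepl("<!DOCTYPE html|<html", first, ignore.case = TRUE))
}

# Offline demonstration with two small temporary files
html_file <- tempfile(fileext = ".csv")
csv_file  <- tempfile(fileext = ".csv")
writeLines(c("<!DOCTYPE html>", "<html><body>OSF</body></html>"), html_file)
writeLines(c("ID,Sex,Height", "F001,Female,157.8"), csv_file)

looks_like_html(html_file)  # a saved web page
looks_like_html(csv_file)   # a real CSV

# Assumed direct-download form of the URL; not run here
if (FALSE) {
  fD <- "./tmp_data/Full_data.csv"
  download.file("https://osf.io/kxut3/download", destfile = fD, mode = "wb")
  stopifnot(!looks_like_html(fD))
}
```

A plain file name such as Full_data.csv is used for the destination because characters like ":" are not allowed in Windows file names.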
# Download the file manually instead, then read the data
q3dta <- read.csv("C:/Users/Ching-Fang Wu/Documents/dataM/Full_data.csv", header=T)
head(q3dta)
## ID Country Population Sex Age Waist Hip Height Weight
## 1 F001 Colombia Urban Female 23 67.33333 90.43333 157.8000 48.80000
## 2 F003 Colombia Urban Female 24 97.50000 107.50000 164.6000 71.43333
## 3 F004 Colombia Urban Female 19 81.13333 106.06667 164.7667 73.90000
## 4 F005 Colombia Urban Female 19 70.30000 96.06667 161.0667 56.16667
## 5 F009 Colombia Urban Female 18 66.73333 91.46667 162.1667 54.80000
## 6 F010 Colombia Urban Female 18 82.53333 102.06667 160.9667 72.73333
## Fat VisceralFat BMI Muscle Health Sample
## 1 29.96667 3.000000 19.70000 24.66667 75.00 Colombia-Urban
## 2 42.56667 5.000000 26.23333 26.36667 50.00 Colombia-Urban
## 3 43.50000 5.000000 27.10000 23.73333 43.75 Colombia-Urban
## 4 34.23333 3.666667 21.66667 25.66667 50.00 Colombia-Urban
## 5 32.36667 3.000000 20.90000 26.23333 68.75 Colombia-Urban
## 6 45.96667 5.000000 28.10000 22.23333 62.50 Colombia-Urban
#Summarize the mean of height, waist, and weight in each study sample by gender.
show(mss <- aggregate(cbind(Height, Waist, Weight) ~ Sex, data=q3dta, FUN=mean))
## Sex Height Waist Weight
## 1 Female 157.4532 75.34818 58.44692
## 2 Male 170.1851 80.66893 68.16513
Mean height is 157.4532 for women and 170.1851 for men. Mean waist circumference is 75.35 for women and 80.67 for men. Mean weight is 58.45 for women and 68.17 for men.
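The aggregate() call above groups by Sex only. Because the task asks for each study sample by gender, Sample can be added to the right-hand side of the formula. Here is a small sketch on made-up rows; the values and the label "Sample-B" are placeholders, not taken from the data.

```r
# Made-up rows that mimic the columns used from q3dta
toy <- data.frame(
  Sample = rep(c("Colombia-Urban", "Sample-B"), each = 4),
  Sex    = rep(c("Female", "Female", "Male", "Male"), times = 2),
  Height = c(158, 162, 170, 174, 150, 154, 160, 166),
  Waist  = c(70, 74, 80, 84, 72, 76, 82, 86),
  Weight = c(50, 54, 66, 70, 52, 56, 60, 64)
)

# One row per Sample-by-Sex combination
by_sample_sex <- aggregate(cbind(Height, Waist, Weight) ~ Sample + Sex,
                           data = toy, FUN = mean)
by_sample_sex
```

On the real data the same formula would be used with data = q3dta.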
The following zip file contains one subject's laser-evoked potentials (LEP) data for 4 separate conditions (different levels of stimulus intensity), each in a plain text file (1w.dat, 2w.dat, 3w.dat and 4w.dat).
The rows are time points from -100 to 800 ms sampled at 2 ms per record. The columns are channel IDs.
Input all the files into R for graphical exploration.
# Import zipped plain-text files
# Step 1: create a temporary file
tmp <- tempfile()
# Step 2: use download.file() to fetch the file into the temporary file
fL <- "http://140.116.183.121/~sheu/dataM/Data/Subject1.zip"
#download.file(fL, tmp)
Warning in download.file(fL, tmp, mode = "wb") : cannot open URL 'http://140.116.183.121/~sheu/dataM/Data/Subject1.zip': HTTP status was '401 Unauthorized'. Does this mean I am not authorized to download the file??
# Download the zip file manually, then unzip it
q4dta <- unzip("C:/Users/Ching-Fang Wu/Documents/dataM/tmp_data/Subject1.zip",unzip = "internal")
str(q4dta)
## chr [1:4] "./Subject1/1w.dat" "./Subject1/2w.dat" "./Subject1/3w.dat" ...
#dplyr::glimpse(q4dta <- foreign::read.dta("./Subject1/1w.dat"))
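foreign::read.dta() is meant for Stata files, whereas the task describes 1w.dat to 4w.dat as plain text, so read.table() may be the better fit. The sketch below writes four small stand-in files to a temporary directory and reads them into a named list. The whitespace delimiter, the absence of a header row, and the three-channel layout are all assumptions, since the real files are not shown here.

```r
# Stand-in files: 451 time points (-100 to 800 ms in 2 ms steps), 3 made-up channels
demo_dir <- file.path(tempdir(), "Subject1_demo")
dir.create(demo_dir, showWarnings = FALSE)
set.seed(1)
for (f in paste0(1:4, "w.dat")) {
  write.table(matrix(rnorm(451 * 3), ncol = 3), file.path(demo_dir, f),
              row.names = FALSE, col.names = FALSE)
}

# Read every .dat file into one list, named by condition
files <- list.files(demo_dir, pattern = "\\.dat$", full.names = TRUE)
lep <- lapply(files, read.table, header = FALSE)
names(lep) <- sub("\\.dat$", "", basename(files))

# Attach the time axis described in the task
lep <- lapply(lep, function(d) {
  cbind(time = seq(-100, by = 2, length.out = nrow(d)), d)
})

# Quick look: first channel of each condition over time
matplot(lep[[1]]$time, sapply(lep, function(d) d$V1), type = "l", lty = 1,
        xlab = "Time (ms)", ylab = "Amplitude (channel V1)")
legend("topright", legend = names(lep), col = 1:4, lty = 1, bty = "n")
```

On the real data, files would be the character vector returned by unzip(), that is, q4dta.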
The ASCII (plain text) file schiz.asc contains response times (in milliseconds) for 11 non-schizophrenics and 6 schizophrenics (30 measurements for each person). Summarize and compare descriptive statistics of the measurements from the two groups. Source: Belin, T., & Rubin, D. (1995). The analysis of repeated-measures data on schizophrenic reaction times using mixture models. Statistics in Medicine 14(8), 747-768.
# Import plain text files
q5fL <- "http://www.stat.columbia.edu/~gelman/book/data/schiz.asc"
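Once the file is read, the two groups can be compared with per-group descriptive statistics. The sketch below uses simulated numbers in place of the real measurements. It assumes the data form 17 rows (persons) by 30 columns (measurements), with the 11 non-schizophrenics first; the actual layout of schiz.asc should be checked before reusing it.

```r
# Simulated stand-in: 17 persons x 30 response times (ms); not the real data
set.seed(2)
rt <- matrix(round(rnorm(17 * 30, mean = 400, sd = 80)), nrow = 17)

# Assumed ordering: 11 non-schizophrenics, then 6 schizophrenics
group <- factor(rep(c("non-schizophrenic", "schizophrenic"), times = c(11, 6)))

# Long format: one row per measurement (as.vector() stacks the columns)
long <- data.frame(group = rep(group, times = 30), rt = as.vector(rt))

# Descriptive statistics by group
desc <- aggregate(rt ~ group, data = long,
                  FUN = function(x) c(n = length(x), mean = mean(x),
                                      sd = sd(x), median = median(x)))
desc
```

With the real data, rt would be replaced by the matrix read from q5fL.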
Built my profile on Google Scholar.
myfile <- "https://scholar.google.com/citations?hl=en&user=g8B2YwwAAAAJ"