https://vincentarelbundock.github.io/Rdatasets/doc/Ecdat/DoctorAUS.html
http://cameron.econ.ucdavis.edu/racd/racddata.html
http://faculty.econ.ucdavis.edu/faculty/cameron/racd2/
http://site.ebrary.com/lib/upenn/reader.action?docID=10073578 Page 68
a cross-section from 1977-1978
number of observations: 5190
observation: individuals
country: Australia
15 variables (2 discretised variables, 9 count variables, 1 ordinal variable, 3 factors)
*sex factor w/ 2 levels: Male, Female
*age factor w/ 12 levels: 19,22,27...
*income annual income, factor w/ 14 levels: 0, 0.1, 0.6... (the source records income in tens of thousands of dollars; the factor levels used here are in thousands)
*insurance insurance contract, factor w/ 4 levels: medlevy = Medicare => Universal, levyplus = private health insurance => Private, freepoor = government insurance due to low income => govPoor, freerepa = government insurance due to old age, disability or veteran status => govVA
*illness number of illnesses in past 2 weeks, numeric
*actdays number of days of reduced activity in past 2 weeks due to illness or injury, numeric
*hscore general health score using Goldberg's method (from 0 to 12), numeric
*chcond chronic condition, factor w/ 3 levels: np = no problem, la = limiting activity, nla = not limiting activity
*doctorco number of consultations with a doctor or specialist in the past 2 weeks, numeric
*nondocco number of consultations with non-doctor health professionals (chemist, optician, physiotherapist, social worker, district community nurse, chiropodist or chiropractor) in the past 2 weeks, numeric
*hospadmi number of admissions to a hospital, psychiatric hospital, nursing or convalescent home in the past 12 months (up to 5 or more admissions which is coded as 5)
*hospdays number of nights in a hospital, etc. during most recent admission: taken, where appropriate, as the mid-point of the intervals 1, 2, 3, 4, 5, 6, 7, 8-14, 15-30, 31-60, 61-79, with 80 or more nights coded as 80. If no admission in past 12 months then equals zero. Numeric.
*medecine total number of prescribed and nonprescribed medications used in past 2 days, numeric
*prescrib total number of prescribed medications used in past 2 days, numeric.
*nonpresc total number of nonprescribed medications used in past 2 days, numeric
Converted the numerical variables sex, age and income into factors with 2, 12 and 14 levels respectively, which are easier to process. Added the variable ageZ, which indicates the age group in 3 levels: youth (18-24), midage (25-59) or senior (60+).
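The recoding described above might look roughly like the sketch below. It is not the original preprocessing code. The function name recode_demographics and the toy rows are hypothetical, and the raw encoding (sex as 0/1 with 1 = female, age in years divided by 100, income in tens of thousands) is an assumption about the source data.

```r
# Toy rows in the ASSUMED source encoding (hypothetical values, not survey data):
# sex as 0/1 with 1 = female, age in years / 100, income in tens of thousands.
raw <- data.frame(
  sex    = c(0, 1, 1, 0),
  age    = c(0.19, 0.22, 0.47, 0.72),
  income = c(0.06, 0.25, 0.90, 0.25)
)

# Hypothetical helper: turn the numeric columns into factors and add ageZ.
recode_demographics <- function(df) {
  years <- round(df$age * 100)                  # back to whole years
  df$sex <- factor(df$sex, levels = c(0, 1), labels = c("Male", "Female"))
  df$age <- factor(years)
  df$income <- factor(round(df$income * 10, 2)) # tens of thousands -> thousands
  # right = FALSE gives the intervals [18,25), [25,60), [60,Inf)
  df$ageZ <- cut(years, breaks = c(18, 25, 60, Inf), right = FALSE,
                 labels = c("youth", "midage", "senior"))
  df
}

recoded <- recode_demographics(raw)
print(recoded)
```

Using cut() with explicit breaks keeps the three age bands in one place, so the boundaries can be changed without touching the rest of the code.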
library(ggplot2)    # ggplot(), geom_bar(), facet_grid()
library(gridExtra)  # grid.arrange()
# One bar chart per demographic variable, arranged in a 2 x 2 grid
a1 <- ggplot(DoctorAUS, aes(sex)) + geom_bar() + ylab("Count")
a2 <- ggplot(DoctorAUS, aes(age)) + geom_bar() + ylab("Count")
a3 <- ggplot(DoctorAUS, aes(insurance)) + geom_bar() + ylab("Count")
a4 <- ggplot(DoctorAUS, aes(income)) + geom_bar() + ylab("Count")
grid.arrange(a1, a2, a3, a4, nrow=2, widths=c(2,3))

There are slightly more females than males. The age distribution is curiously U-shaped. According to the reference, this is because only single people were included in the dataset, which explains why there are relatively many young people (not yet married) and old people (divorced, widowed or never married). Almost half have private health insurance ('levyplus') and a few have government insurance due to low income ('freepoor'). Income is measured in thousands of Australian dollars and shows a clear mode at 2.5 (the interval of 2001 to 3000 AUD).
*The final age and income categories represent 70+ in age and 14,000+ AUD in income respectively.
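Statements such as "slightly more females" or "almost half" can be checked numerically alongside the bar charts. A minimal sketch follows, assuming a hypothetical helper share_pct and a made-up vector of insurance labels; the numbers it prints are not the survey's.

```r
# Hypothetical helper: percentage share of each level, rounded to one decimal.
share_pct <- function(x) round(100 * prop.table(table(x)), 1)

# Made-up insurance labels for illustration only (NOT the survey data).
toy_insurance <- c("medlevy", "levyplus", "levyplus", "freepoor", "medlevy",
                   "levyplus", "freerepa", "levyplus")

shares <- share_pct(toy_insurance)
print(shares)
```

On the real data the same helper would be called as share_pct(DoctorAUS$insurance) or share_pct(DoctorAUS$sex).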
# Age and income distributions, one panel per sex
ggplot(DoctorAUS, aes(age)) + geom_bar() + facet_grid(sex~.) + ylab("Count")
ggplot(DoctorAUS, aes(income)) + geom_bar() + facet_grid(sex~.) + ylab("Count")

There are more young single men in the study than young single women, and a lot more old single women than old single men. A large proportion of females have an income in the range of 2001 to 3000 AUD a year, so the overall mode of the income distribution is driven mainly by females. More males than females have higher incomes. The mode for male income is 9 (8,000 to 10,000 AUD).
# Income distribution by age group (rows) and sex (columns)
ggplot(DoctorAUS, aes(income)) + geom_bar() + facet_grid(ageZ~sex) + ylab("Count") +
  scale_y_continuous(breaks=c(0,200,400,600))

It is now clear that a major portion of the females in income class 2.5 are seniors. Relatively many senior males fall in income class 2.5 as well; that may be the amount of a standard pension.
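The faceted plot can be complemented with a three-way count table, which gives the share of seniors within one income class and sex directly. The sketch below uses a made-up data frame with the same column names (sex, ageZ, income); the values are hypothetical and only show the mechanics.

```r
# Made-up rows for illustration only (NOT the survey data).
toy <- data.frame(
  sex    = factor(c("Female", "Female", "Female", "Male", "Male", "Male")),
  ageZ   = factor(c("senior", "senior", "youth", "senior", "midage", "midage")),
  income = factor(c("2.5", "2.5", "0.6", "2.5", "9", "9"))
)

# Counts of income class by age group, one slice per sex
counts <- xtabs(~ income + ageZ + sex, data = toy)

# Share of seniors among females in income class 2.5
f25 <- counts["2.5", , "Female"]
senior_share <- f25[["senior"]] / sum(f25)
print(senior_share)
```

Replacing toy with DoctorAUS in the xtabs() call would produce the corresponding table for the survey, assuming ageZ has been added as described earlier.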
b1 <- ggplot(DoctorAUS, aes(chcond)) + geom_bar() + ylab("Count") +
xlab("chronic condition")
b2 <- ggplot(DoctorAUS, aes(factor(hscore))) + geom_bar() + ylab("Count") +
xlab("General health score")
b3 <- ggplot(DoctorAUS, aes(factor(illness))) + geom_bar() + ylab("Count") +
xlab("Number of illnesses")
b4 <- ggplot(DoctorAUS, aes(factor(actdays))) + geom_bar() + ylab("Count") +
xlab("Number of days of reduced activity")
grid.arrange(b1, b2, b3, b4, nrow=2, widths=c(2,3))

Most people in the survey have either a chronic condition which limits their activity (la) or no chronic problem (np). Fewer than 10% of survey participants have a chronic condition which does not limit their activity (nla). As many as 60% of the survey participants reported good health according to the general health score. More than two thirds of participants reported having one or more illnesses in the previous two weeks. The vast majority of participants had no days of reduced activity due to illness or injury in the previous two weeks, but a few had reduced activity for the whole two weeks.
# Number of medicines used, by insurance type (rows) and age group (columns)
ggplot(DoctorAUS, aes(factor(medecine))) + geom_bar() + facet_grid(insurance~ageZ) +
  xlab("Count of Medicine") + ylab("Count of Participants")

The youth group mostly uses fewer than two medicines and is mostly enrolled in Universal or Private insurance, with a few enrolled in government insurance for low income. The midage group mostly uses fewer than three medicines and is mostly enrolled in Universal or Private insurance, with a few enrolled in government insurance for disability and veterans. The senior group mostly uses fewer than five medicines and is mostly enrolled in Private insurance or government insurance for disability and veterans.
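The "mostly fewer than k medicines" readings can be quantified per age group. The following is a minimal sketch with a hypothetical helper share_below and made-up values; only the column names medecine and ageZ come from the text above.

```r
# Made-up rows for illustration only (NOT the survey data).
toy <- data.frame(
  ageZ     = factor(c("youth", "youth", "midage", "midage", "senior", "senior"),
                    levels = c("youth", "midage", "senior")),
  medecine = c(0, 1, 1, 4, 3, 6)
)

# Hypothetical helper: for each age group, the share of participants
# who used fewer than k medicines.
share_below <- function(df, k) tapply(df$medecine < k, df$ageZ, mean)

below_two <- share_below(toy, 2)
print(below_two)
```

With the survey data, share_below(DoctorAUS, 2), share_below(DoctorAUS, 3) and share_below(DoctorAUS, 5) would give the figures behind the three statements, one threshold per age group of interest.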