Overview

The “World Values Survey” has conducted several waves of data collection to understand values across cultures and over time. This study concentrates on the environment-related part of the survey. People across the world were asked how serious they consider “Environmental problems in the world: Global warming or the greenhouse effect”, and below is an analysis of the responses received.

Specifications

Here we concentrate on data from three countries: the US, China and India. These three countries are presumably among the world’s biggest polluters, so what their people think about the environment can have a major impact across the globe.

Data Capturing


Step 1:

Go to http://www.worldvaluessurvey.org and download the Codebook files and the “WVS_Longitudinal_1981-2014_rdata_v_2015_04_18” file listed under “Longitudinal Data Files”.

Step 2:

Read the R data file and save the relevant information. Here we have saved the environment question (labeled B021), together with survey year, country, age and education level, in the “Environment_Subset.RData” file.
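
Step 2 could look roughly like the sketch below. B021 is the question label given above; the other four column names (S003, S020, X003, X025) are assumptions about the codebook naming and should be checked against the downloaded codebook. A toy data frame with made-up values stands in for the real longitudinal file.

```r
# Toy stand-in for the full longitudinal data frame (hypothetical values).
# Column names other than B021 are assumptions; verify them in the codebook.
wvs_full <- data.frame(
  S003 = c(156, 356, 840, 276),      # country code (assumed name)
  S020 = c(2007, 2006, 2006, 2006),  # survey year (assumed name)
  X003 = c(34, 51, 22, 40),          # age (assumed name)
  X025 = c(5, 2, 8, 6),              # highest educational level (assumed name)
  B021 = c(2, 1, 1, 3),              # environment question
  A001 = c(1, 1, 2, 1)               # an unrelated column that gets dropped
)

# Keep only the five columns used in this study, in the order expected later
Environment_Subset <- wvs_full[, c("B021", "S020", "S003", "X003", "X025")]

# Save the reduced object so that later steps can simply load() it
subset_path <- file.path(tempdir(), "Environment_Subset.RData")
save(Environment_Subset, file = subset_path)
```

Saving only the needed columns keeps the working file small compared with the full longitudinal download.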

Step 3:

Reading file

library(plotly)  # provides plot_ly(), layout() and %>% for the plots below
# fread() reads delimited text, not .RData, so this earlier attempt stays disabled:
#mydat <- fread('https://raw.githubusercontent.com/chirag-vithlani/606/master/project%20proposal/Environment_Subset.RData')
Environment_Subset_Data <- "Environment_Subset.RData"
load(Environment_Subset_Data)  # creates the Environment_Subset data frame

colnames(Environment_Subset) <- c("Environment_Related_Question",
                          "Year_Of_survey", 
                          "Country_or_Region", 
                          "Age", 
                          "Highest_educational_level")

Data Filtering

We keep only the rows for the US, China and India and label the country codes for readability. Rows where the respondent did not answer the question (codes of zero or below) are dropped.

# Subset to US,India & China
# also removing missing/unanswered data
Environment_Subset_US_India_China<-subset( Environment_Subset,(Country_or_Region==356 | Country_or_Region==156 |Country_or_Region==840) & Environment_Related_Question >0  )
Environment_Subset_US_India_China_Country_Number<-Environment_Subset_US_India_China
#Country specific labeling

Environment_Subset_US_India_China$Country_or_Region[Environment_Subset_US_India_China$Country_or_Region==156] <-"China"
Environment_Subset_US_India_China$Country_or_Region[Environment_Subset_US_India_China$Country_or_Region==356] <-"India"
Environment_Subset_US_India_China$Country_or_Region[Environment_Subset_US_India_China$Country_or_Region==840] <-"US"
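
The three assignments above can also be written as a single lookup. The sketch below is only an alternative illustration on a toy data frame; the names `country_labels`, `toy` and `Country_label` are hypothetical and not part of the original analysis.

```r
# Lookup from numeric country code to label (codes as used in this study)
country_labels <- c("156" = "China", "356" = "India", "840" = "US")

# Toy data standing in for the filtered survey rows
toy <- data.frame(Country_or_Region = c(840, 156, 356, 156))

# Index the lookup by the code converted to character;
# unname() drops the codes that would otherwise be kept as element names
toy$Country_label <- unname(country_labels[as.character(toy$Country_or_Region)])
print(toy)
```

Writing the label to a separate column keeps the numeric code available, which the analysis otherwise preserves through a second copy of the data frame.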

Grouping

Compute the arithmetic mean of the answers, grouped by country.

aggregate(Environment_Subset_US_India_China[1],list(Environment_Subset_US_India_China$Country_or_Region),mean)
##   Group.1 Environment_Related_Question
## 1   China                     1.816924
## 2   India                     1.709510
## 3      US                     1.779605

Create a new column called “avg” and fill it with the values above (truncated to two decimals).

Environment_Subset_US_India_China$avg[Environment_Subset_US_India_China$Country_or_Region=='India']<-1.70
Environment_Subset_US_India_China$avg[Environment_Subset_US_India_China$Country_or_Region=='China']<-1.81
Environment_Subset_US_India_China$avg[Environment_Subset_US_India_China$Country_or_Region=='US']<-1.77
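
The hard-coded averages could instead be computed directly, so that they stay in sync with the data. A minimal sketch using base R's `ave()` on a toy data frame (the values are made up for illustration):

```r
# Toy data with the same two column names as the real subset
toy <- data.frame(
  Environment_Related_Question = c(1, 2, 2, 3, 1, 1),
  Country_or_Region = c("China", "China", "India", "India", "US", "US")
)

# ave() returns, for every row, the mean of the group that row belongs to
toy$avg <- ave(toy$Environment_Related_Question,
               toy$Country_or_Region,
               FUN = mean)
print(toy)
```

This gives each row its own country's exact mean without typing the numbers by hand.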

Plotting average values

#Plotting

# plotly >= 4.0 maps data frame columns with the formula (~) syntax
plot_ly(Environment_Subset_US_India_China,
  x = ~Country_or_Region,
  y = ~avg,
  name = "Environment",
  type = "bar"
) %>%
    layout(title = "Average response of people country wise", yaxis = list(title = "Average (1|Very serious->4|Not serious at all)", range = c(1.6, 1.9)))

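Observation 1 : As we can see, on average, people in all three countries are reasonably serious about global warming (every mean lies between 1, “Very serious”, and 2, “Somewhat serious”), which is a good thing. Among the three countries, Indians are the most serious, since a lower mean indicates greater concern.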

Checking correlation

with(Environment_Subset_US_India_China_Country_Number,cor.test(Environment_Related_Question,Country_or_Region))
## 
##  Pearson's product-moment correlation
## 
## data:  Environment_Related_Question and Country_or_Region
## t = -0.32547, df = 3789, p-value = 0.7448
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.03711559  0.02655138
## sample estimates:
##          cor 
## -0.005287463

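Observation 2 : The correlation coefficient is close to zero (cor = -0.005, p-value = 0.74), which indicates no significant linear relationship, positive or negative. Note that the country codes (156, 356, 840) are nominal labels rather than quantities, so a Pearson correlation on them has limited meaning.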
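
Because country is a categorical variable, a test designed for groups may be a better fit than a correlation on the numeric codes. The sketch below is not part of the original analysis; it runs a Kruskal-Wallis test and a chi-squared test of independence on simulated toy data, so its p-values say nothing about the real survey.

```r
set.seed(1)

# Simulated answers (1-4) for three countries; values are made up
toy <- data.frame(
  Environment_Related_Question = sample(1:4, 300, replace = TRUE,
                                        prob = c(0.45, 0.35, 0.15, 0.05)),
  Country_or_Region = factor(rep(c("China", "India", "US"), each = 100))
)

# Kruskal-Wallis rank-sum test: do the answer distributions differ by country?
kw <- kruskal.test(Environment_Related_Question ~ Country_or_Region, data = toy)

# Chi-squared test of independence on the country-by-answer table
# (R may warn when some expected cell counts are small)
chi <- chisq.test(table(toy$Country_or_Region, toy$Environment_Related_Question))

print(kw$p.value)
print(chi$p.value)
```

Both tests treat country as a grouping factor, so the arbitrary size of the numeric codes does not influence the result.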

Younger People : Comparison

Get the responses of younger people (age less than 25).

# Age > 0 excludes the negative codes for missing ages (e.g. -2 = No answer)
Environment_Subset_US_India_China25<-subset(Environment_Subset_US_India_China, Age > 0 & Age < 25)

grp25<-aggregate(Environment_Subset_US_India_China25[1],list(Environment_Subset_US_India_China25$Country_or_Region),mean)

names(grp25) <- c("Country","Avg")

# plotly >= 4.0 maps data frame columns with the formula (~) syntax
plot_ly(grp25,
  x = ~Country,
  y = ~Avg,
  name = "Environment",
  type = "bar"
) %>%
    layout(title = "Average response of people country wise (Age under 25)", yaxis = list(range = c(1.6, 1.9)))

Observation 3 : Among younger people (under 25), the US population is more serious about the environment.

Higher Education : Comparison

Environment_Subset_US_India_China_High_Degree<-subset(Environment_Subset_US_India_China,Environment_Subset_US_India_China$Highest_educational_level==8)

highlyQualified<-aggregate(Environment_Subset_US_India_China_High_Degree[1],list(Environment_Subset_US_India_China_High_Degree$Country_or_Region),mean)

names(highlyQualified) <- c("Country","Avg")

# plotly >= 4.0 maps data frame columns with the formula (~) syntax
plot_ly(highlyQualified,
  x = ~Country,
  y = ~Avg,
  name = "Environment",
  type = "bar"
) %>%
    layout(title = "Avg response of people country wise (Higher education)", yaxis = list(range = c(1, 2)))

Observation 4 : If we look only at respondents with the highest education level (code 8, university degree), the average responses in all three countries are about the same.

Environment_Subset_US_India_China_Less_serious<-subset(Environment_Subset_US_India_China,Environment_Subset_US_India_China$Environment_Related_Question==4)

table(Environment_Subset_US_India_China_Less_serious$Country_or_Region)
## 
## China India    US 
##    26    75    68
table(Environment_Subset_US_India_China_Less_serious$Highest_educational_level)
## 
## -3  1  2  3  4  5  6  7  8 
## 38  5 13  6 21 23 34 13 16

Observation 5 : As the second table shows, education level has little to do with whether people are not at all serious about the environment; the “Not serious at all” answers are spread across all education levels. Country-wise, however, China has far fewer respondents who gave this answer (26) than India (75) or the US (68). So we can say that in China very few people think that global warming is not a serious issue.
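
Raw counts depend on how many people were interviewed in each country, so within-country shares are easier to compare. The sketch below shows one way to compute them with `prop.table()`; the toy data frame is made up and does not reproduce the survey counts.

```r
# Toy answers: 10 respondents in China, 5 in India, 5 in the US (made-up values)
toy <- data.frame(
  Country_or_Region = c(rep("China", 10), rep("India", 5), rep("US", 5)),
  Environment_Related_Question = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 4,
                                   1, 1, 2, 3, 4,
                                   1, 2, 2, 4, 4)
)

# Country-by-answer contingency table
counts <- table(toy$Country_or_Region, toy$Environment_Related_Question)

# margin = 1 normalises each row, giving the share of each answer within a country
shares <- prop.table(counts, margin = 1)

# Share of "Not serious at all" (answer 4) per country
not_serious_share <- shares[, "4"]
print(round(not_serious_share, 2))
```

Applied to the real subset, the same two calls would show whether China's low count also corresponds to a low share.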

Summary

Although there is no strong relationship between the variables studied, we still have some noticeable points.

  • On average, people from all three countries are serious about the environment and think that global warming is a serious or very serious issue.
  • In comparison, very few Chinese people think that this issue is not serious at all.

Research question

You should phrase your research question in a way that matches up with the scope of inference your dataset allows for.

How do people in three different countries (China, the US and India, presumably the world’s biggest polluters) think about the environment?

If time permits, other variables such as age and level of education will be explored.

Cases

What are the cases, and how many are there?

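Each case is one survey respondent from China, India or the US who answered the question. There are 3791 cases in total.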

Data collection

Describe the method of data collection.

As per “Collection Procedures” section at http://www.worldvaluessurvey.org/WVSContents.jsp

“The mode of data collection for WVS surveys is face-to-face interviewing. Other modes (e.g., telephone, mail, internet) are not acceptable except under very exceptional circumstances and only on an experimental basis”

The R data file was downloaded from http://www.worldvaluessurvey.org/WVSDocumentationWVL.jsp and then filtered as required.

Type of study

What type of study is this (observational/experiment)?

This is an observational study.

Data Source

Link :

http://www.worldvaluessurvey.org/WVSDocumentationWVL.jsp

Citation : WORLD VALUES SURVEY 1981-2014 LONGITUDINAL AGGREGATE v.20150418. World Values Survey Association (www.worldvaluessurvey.org). Aggregate File Producer: JDSystems, Madrid SPAIN.

Response

The response variable is the answer to the question “Environmental problems in the world: Global warming or the greenhouse effect.”

The answer is one of the following values (for this project we consider only the first four values).

Value Description
1 Very serious
2 Somewhat serious
3 Not very serious
4 Not serious at all

We will check how the answer to this question varies with region and, possibly, with age and education.

Explanatory

What is the explanatory variable, and what type is it (numerical/categorical)?

The explanatory variables are region (US, China and India), educational qualification and age.

Region is a categorical variable.

Value Description
156 China
356 India
840 US

Educational qualification is a categorical variable.

Value Description
1 Inadequately completed elementary education
2 Completed (compulsory) elementary education
3 Incomplete secondary school: technical/vocational type/(Compulsory) elementary education and basic vocational qualification
4 Complete secondary school: technical/vocational type/Secondary, intermediate vocational qualification
5 Incomplete secondary: university-preparatory type/Secondary, intermediate general qualification
6 Complete secondary: university-preparatory type/Full secondary, maturity level certificate
7 Some university without degree/Higher education - lower-level tertiary certificate
8 University with degree/Higher education - upper-level tertiary certificate
-5 Missing; Unknown
-4 Not asked in survey
-3 Not applicable; No formal education
-2 No answer
-1 Don’t know

Age is a numerical variable.

Value Description
15 to 98 age ranges from 15 to 98
-5 Missing; Unknown
-4 Not asked in survey
-3 Not applicable
-2 No answer
-1 Don’t know

Relevant summary statistics

** Provide summary statistics relevant to your research question. For example, if you’re comparing means across groups provide means, SDs, sample sizes of each group. This step requires the use of R, hence a code chunk is provided below. Insert more code chunks as needed. **

We can ask the following questions of the available data.

  1. Which country’s people are most serious about the environment?
  2. How does this answer vary with age and educational qualification?
  3. Who are the people who are not serious about the environment? What are their ages and educational qualifications, and which country do they belong to?
  4. We can also try different combinations, such as China+India (most populous), China+US (largest economies) and US+India (oldest and largest democracies).

summary(Environment_Subset_US_India_China)
##  Environment_Related_Question Year_Of_survey Country_or_Region 
##  Min.   :1.000                Min.   :2006   Length:3791       
##  1st Qu.:1.000                1st Qu.:2006   Class :character  
##  Median :2.000                Median :2006   Mode  :character  
##  Mean   :1.767                Mean   :2006                     
##  3rd Qu.:2.000                3rd Qu.:2007                     
##  Max.   :4.000                Max.   :2007                     
##       Age        Highest_educational_level      avg       
##  Min.   :-2.00   Min.   :-3.000            Min.   :1.700  
##  1st Qu.:31.00   1st Qu.: 2.000            1st Qu.:1.700  
##  Median :42.00   Median : 5.000            Median :1.770  
##  Mean   :43.57   Mean   : 3.838            Mean   :1.758  
##  3rd Qu.:54.00   3rd Qu.: 6.000            3rd Qu.:1.810  
##  Max.   :93.00   Max.   : 8.000            Max.   :1.810

The mean value of the question “Environmental problems in the world: Global warming or the greenhouse effect” is 1.82 in China, 1.71 in India and 1.78 in the US.
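
Since the prompt above asks for means, SDs and sample sizes of each group, a per-country table of all three could be produced as sketched below. The toy data frame and the `stats` object are illustrative only; the real analysis would use the filtered survey subset instead.

```r
# Toy answers for three countries (made-up values)
toy <- data.frame(
  Environment_Related_Question = c(1, 2, 3, 1, 1, 2, 2, 4),
  Country_or_Region = c("China", "China", "China", "India", "India",
                        "US", "US", "US")
)

# Split the answers by country, summarise each group, and stack the rows
by_country <- split(toy$Environment_Related_Question, toy$Country_or_Region)
stats <- do.call(rbind, lapply(by_country, function(x) {
  data.frame(mean = mean(x), sd = sd(x), n = length(x))
}))
print(stats)
```

The row names of `stats` are the country labels, so a single row can be looked up with, for example, `stats["China", ]`.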

References

WORLD VALUES SURVEY 1981-2014 LONGITUDINAL AGGREGATE v.20150418. World Values Survey Association (www.worldvaluessurvey.org). Aggregate File Producer: JDSystems, Madrid SPAIN.