The following R packages are required.

library(maps)
library(mapdata)
library(plyr)
library(graphics)
library(ggplot2)
library(ggmap)
library(plotGoogleMaps)
## Loading required package: sp
## Loading required package: spacetime
library(stats)
library(reshape2)
library(maptools)
## Checking rgeos availability: TRUE
library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:plyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(scales)

Setting the directory location.

dir<-"C:/Homework [DC]/Groundwater_Proj"
setwd(dir)

The data were downloaded from http://www.bgs.ac.uk/research/groundwater/health/arsenic/Bangladesh/data.html as the “DPHE/BGS National Hydrochemical Survey”, which describes various groundwater quality parameters for several thousand wells measured by the British Geological Survey throughout Bangladesh in 1998-1999. Subsets of the .csv data are created for the administrative division of Dhaka and for the district (zila) of Dhaka, the most populous in the country.

gwbang <- read.csv("NationalSurveyData (1).csv",header=TRUE,skip=4,sep=",")
dhaka <- subset(gwbang,subset=(DIVISION == "Dhaka"))
dhakadis <- subset(dhaka,subset=(DISTRICT == "Dhaka"))
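A small check on the subsetting step, sketched on a made-up data frame rather than the survey file. The column names mirror the ones used above; the rows and the name `wells` are hypothetical. If the grouping columns are factors, `subset()` keeps unused levels, which can show up later as empty groups in tables or plots; `droplevels()` removes them.

```r
# Toy data only (not the survey data); columns are built as factors
# explicitly so the behaviour is the same in any R version.
wells <- data.frame(
  DIVISION = factor(c("Dhaka", "Dhaka", "Khulna", "Dhaka")),
  DISTRICT = factor(c("Dhaka", "Gazipur", "Jessore", "Dhaka")),
  As = c(12, 80, 300, 5)
)

# Same two-step subsetting as above: division first, then district.
div_dhaka <- subset(wells, subset = (DIVISION == "Dhaka"))
dis_dhaka <- subset(div_dhaka, subset = (DISTRICT == "Dhaka"))

nrow(div_dhaka)  # rows in the division subset
nrow(dis_dhaka)  # rows in the district subset

# Unused factor levels survive subsetting ...
levels_before <- nlevels(dis_dhaka$DISTRICT)

# ... until they are dropped explicitly.
dis_dhaka <- droplevels(dis_dhaka)
levels_after <- nlevels(dis_dhaka$DISTRICT)
```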

Simple statistics for the well data as a whole (mean, median, range) are obtained with the mean() and summary() functions.

gwbang$As <- as.numeric(gwbang$As)

mean(gwbang$As)
## [1] 229.1186
summary(gwbang$As)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     1.0     1.0    88.0   229.1   413.0   908.0
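One caveat is worth checking before relying on these numbers. If `As` was read in as a factor (which `read.csv` does by default in R versions before 4.0 when a column contains non-numeric entries), `as.numeric()` returns the internal level codes rather than the measured concentrations. The sketch below uses made-up values, not the survey data, and the `"<0.5"` entry is a hypothetical stand-in for a below-detection-limit flag.

```r
# Toy values only (not the survey data).
as_raw <- factor(c("<0.5", "12.3", "250", "12.3", "908"))

# as.numeric() on a factor returns the integer level codes,
# which depend on how the levels happen to be sorted.
codes <- as.numeric(as_raw)

# Converting through character first recovers the values;
# entries that are not numbers become NA (a warning is suppressed here).
values <- suppressWarnings(as.numeric(as.character(as_raw)))

print(codes)
print(values)
summary(values)  # the NA count is reported separately
```

How to treat the resulting `NA` entries (drop them, or substitute a value related to the detection limit) is an analysis decision and is left open here.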

I first create a simple histogram of how frequently various levels of groundwater arsenic occur throughout Bangladesh. The red dashed line marks the national water quality standard of 50 ug/L As; in 2010, JECFA (FAO/WHO) confirmed that exceeding this threshold produces adverse health effects.

gwbang$As <- as.numeric(gwbang$As) # factor gwbang$As must be converted to class numeric
hist(gwbang$As,
     breaks=35,
     main="Arsenic Contamination of Groundwater in Bangladesh",
     xlab="Arsenic Con. (ug/L)",
     ylab="Number Wells",
     col="lightblue")
abline(v=50,col="red",lty=2) # national standard, 50 ug/L

While a majority of wells have “safe” levels of arsenic, a substantial number are contaminated, across a wide range of concentrations, with several extreme sites nearing 1 mg/L As.
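The share of wells above the standard can also be computed directly rather than read off the histogram. A minimal sketch on made-up concentrations (not the survey data), with the 50 ug/L threshold taken from the text; the names `as_ugl` and `threshold` are hypothetical:

```r
# Toy concentrations in ug/L (not the survey data); NA marks a missing value.
as_ugl <- c(0.5, 3, 12, 48, 50, 75, 210, 640, 905, NA)

threshold <- 50  # national standard cited in the text, ug/L

# Logical vector: TRUE where the standard is exceeded.
exceeds <- as_ugl > threshold

# Count and proportion of exceedances, ignoring missing values.
n_exceed <- sum(exceeds, na.rm = TRUE)
share_exceed <- mean(exceeds, na.rm = TRUE)

cat(n_exceed, "of", sum(!is.na(as_ugl)), "wells exceed", threshold, "ug/L\n")
cat(sprintf("Share exceeding: %.1f%%\n", 100 * share_exceed))
```

Taking the mean of a logical vector gives the proportion of `TRUE` values, so the same two lines would apply to `gwbang$As` once it holds numeric concentrations.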

To provide an overview of the geographic distribution, I plot all measured sites by their GPS coordinates on a map, colored by arsenic concentration. Sites at or above 850 ug/L (an arbitrary cut-off for extreme values) are overplotted as larger dark red points:

extremeAs <- subset(gwbang,subset=(As >= 850))
map <- get_map(location='Bangladesh', zoom=7)
## Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=Bangladesh&zoom=7&size=640x640&scale=2&maptype=terrain&language=en-EN&sensor=false
## Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Bangladesh&sensor=false
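mapPoints <- ggmap(map) +
  # all wells, coloured by arsenic concentration; the fixed point size is set outside aes()
  geom_point(aes(x=LONG_DEG, y=LAT_DEG, colour=As), size=2, data=gwbang) +
  # extreme wells (As >= 850) overplotted in dark red
  geom_point(aes(x=LONG_DEG, y=LAT_DEG), size=4, colour="#990000", data=extremeAs)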

mapPoints
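With a continuous color scale, a few extreme sites can make the lower range hard to tell apart. One alternative, sketched here on made-up values rather than the survey data, is to bin the concentrations with `cut()` before mapping. The break points reuse the 50 ug/L standard and the 850 ug/L cut-off from the text; the class labels and the name `as_class` are assumptions for illustration.

```r
# Toy concentrations in ug/L (not the survey data).
as_ugl <- c(0.5, 49.9, 50, 120, 849, 850, 908)

# Bin into three classes; right = FALSE makes each interval
# closed on the left: [-Inf, 50), [50, 850), [850, Inf).
as_class <- cut(
  as_ugl,
  breaks = c(-Inf, 50, 850, Inf),
  labels = c("< 50", "50 to < 850", ">= 850"),
  right = FALSE
)

# Number of wells per class.
class_counts <- table(as_class)
print(class_counts)
```

The resulting factor could be mapped to `colour` inside `aes()` in place of the raw concentration, which gives a discrete legend with one entry per class.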