Contents
Summary Discussion
Column Descriptions
1. SR ID unique identifier assigned to a service request
2. Opened intake date of a service request
3. Closed date on which request was marked complete by MC311 Center
4. Status open/closed
5. Department id of county department
6. Area intra departmental unit
7. Sub-Area subunit
8. Attached Solution (Topic) standard description of work performed
9. Attached Solution SLA Days prescribed elapsed business days to complete the work described
10. City
11. State
12. Zip Code
13. Source intake by phone/web/email/etc
14. Election District ambiguous
15. Maryland State District ambiguous
16. Congressional District
17. Congressional Member
18. Council District
19. Council Member Name
20. Changed Date
21. # of Days Open count of elapsed business days to close request
22. Within SLA Windows logical Yes/No
23. SLA Yes 1 if yes; 0 if no
24. SLA No 1 if no; 0 if yes
R/Rstudio Code
Summary Discussion
I chose to explore the Drainage-Erosion-Repair dataset with a project in mind that would tie complaints of flooded basements, sidewalks, and streets to precipitation recorded at NOAA weather stations in Montgomery County. Such a link could suggest where remediation would address the causes of local flooding.
The CSV data on the DataMontgomery portal collects the calls to the county’s 311 Center that are referred to the Department of Transportation (DOT) for investigation.
Several issues are likely to pose difficulties for this project. The most significant are:
- lack of precise location
- electoral districts that have changed over the time span of the data
- uncertainty in connecting recorded precipitation events to the time a service request was created
- inability to determine the scale of flood events
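One way to approach the timing uncertainty is to total the precipitation recorded over a window of days before each request was opened, rather than matching on a single date. The following is a minimal sketch under stated assumptions: the `requests` and `precip` data frames, their column names, the rainfall values, and the three-day window are all hypothetical and are not taken from the 311 dataset or from NOAA records.

```r
# Hypothetical toy data; not drawn from the 311 dataset or NOAA records.
requests <- data.frame(
  sr_id  = c("A", "B"),
  opened = as.Date(c("2020-07-03", "2020-07-10"))
)
precip <- data.frame(
  date   = as.Date("2020-07-01") + 0:9,
  inches = c(0, 1.2, 0.8, 0, 0, 0, 0, 0, 0.1, 0)
)

# Total precipitation over the `window_days` days ending on the day a
# request was opened (that day included).
precip_before <- function(opened, precip, window_days = 3) {
  vapply(seq_along(opened), function(i) {
    in_window <- precip$date <= opened[i] &
      precip$date > opened[i] - window_days
    sum(precip$inches[in_window])
  }, numeric(1))
}

requests$precip_3day <- precip_before(requests$opened, precip)
print(requests)
```

The window length is the open question here: a short window may miss requests filed days after a storm, while a long one blurs separate events together.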
Additionally, there are the typical problems with the structure of the data:
- column naming: embedded spaces and punctuation
- column names are verbose and do not adequately describe their contents
- redundant columns
- timestamps are overly precise.
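Two of these structural problems can be handled mechanically. The sketch below is illustrative only: `clean_names` is a hypothetical helper written for this example (it is not part of the report's code), and it addresses the embedded spaces and punctuation but not the verbosity of the names. The last lines show one way to drop time-of-day precision from a timestamp.

```r
# Hypothetical helper: turn the portal's column titles into syntactic,
# lower-case names with underscores.
clean_names <- function(x) {
  x <- gsub("#", "n", x, fixed = TRUE)   # "# of Days Open" -> "n of Days Open"
  x <- gsub("[^A-Za-z0-9]+", "_", x)     # runs of spaces/punctuation -> "_"
  x <- gsub("^_+|_+$", "", x)            # trim leading and trailing "_"
  tolower(x)
}

titles <- c("SR ID", "Attached Solution (Topic)", "# of Days Open", "Sub-Area")
print(clean_names(titles))

# Truncate a timestamp to the day when the time of day is not needed.
opened <- as.POSIXct("2012-07-02 09:02:36", tz = "UTC")
print(as.Date(opened))
```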
1. SR ID unique identifier assigned to a service request
1 Variables 16058 Observations
SR ID
| n | missing | distinct |
| 16058 | 0 | 16058 |
| lowest : | 1106594008 | 1106613556 | 1106626493 | 1106626564 | 1106635476 |
| highest: | 1484150476 | 1484157546 | 1484161503 | 1484177607 | 1484185305 |
2. Opened intake date of a service request
1 Variables 16058 Observations
Opened
| n | missing | distinct | Info |
| 16058 | 0 | 16052 | 1 |
| Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
| 2017-06-06 20:40:22 | 96238969 | 2013-04-12 10:58:26 | 2013-10-25 02:57:18 | 2015-03-23 13:41:02 | 2017-06-21 09:43:21 | 2019-07-15 13:47:50 | 2021-02-08 11:45:27 | 2021-08-04 11:36:15 |
| lowest : | 2012-07-02 09:02:36 | 2012-07-02 10:59:16 | 2012-07-02 11:45:19 | 2012-07-02 11:49:13 | 2012-07-02 12:48:42 |
| highest: | 2022-02-10 13:08:12 | 2022-02-10 13:23:58 | 2022-02-10 14:49:07 | 2022-02-10 16:12:45 | 2022-02-10 16:29:01 |
3. Closed date on which request was marked complete by MC311 Center
1 Variables 16058 Observations
Closed
| n | missing | distinct | Info |
| 15869 | 189 | 15867 | 1 |
| Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
| 2017-07-13 04:23:43 | 94221632 | 2013-05-08 11:08:48 | 2014-01-16 07:55:02 | 2015-05-26 07:21:50 | 2017-08-01 15:37:14 | 2019-07-29 13:29:37 | 2021-01-22 07:47:23 | 2021-08-04 00:06:19 |
| lowest : | 2012-07-02 11:52:32 | 2012-07-06 09:37:54 | 2012-07-09 13:32:18 | 2012-07-10 09:15:48 | 2012-07-10 11:18:01 |
| highest: | 2022-02-10 09:30:53 | 2022-02-10 13:45:33 | 2022-02-10 14:51:21 | 2022-02-10 16:31:42 | 2022-02-11 13:36:46 |
Status
| Value | Closed | In Progress |
| Frequency | 15869 | 189 |
| Proportion | 0.988 | 0.012 |
5. Department id of county department
1 Variables 16058 Observations
Department
| n | missing | distinct | value |
| 16058 | 0 | 1 | DOT |
Value DOT
Frequency 16058
Proportion 1
Area
| n | missing | distinct | value |
| 16058 | 0 | 1 | Highway Services |
Value Highway Services
Frequency 16058
Proportion 1
Sub-Area
| lowest : | Curb and Gutter Repair | Drainage Repair | Office | Road Repair | Sidewalk Repair |
| highest: | Curb and Gutter Repair | Drainage Repair | Office | Road Repair | Sidewalk Repair |
| Value | Curb and Gutter Repair | Drainage Repair | Office | Road Repair | Sidewalk Repair |
| Frequency | 3 | 16051 | 1 | 1 | 2 |
| Proportion | 0 | 1 | 0 | 0 | 0 |
8. Attached Solution (Topic) standard description of work performed
1 Variables 16058 Observations
Attached Solution (Topic)
| n | missing | distinct |
| 16052 | 6 | 18 |
| lowest : | Clogged Storm Drain | Connect Sump Pump to Street Drain | Curb and Gutter Repair | Debris Pickup | Drain Under Driveway Apron |
| highest: | Sidewalk Repair | Sinkhole Repair | Status of storm drain repair | Status of Storm Drain Repair | Street Drainage Repair |
9. Attached Solution SLA Days prescribed elapsed business days to complete the work described
1 Variables 16058 Observations
Attached Solution SLA Days
| n | missing | distinct |
| 16052 | 6 | 12 |
lowest : 1 10 120 15 20 , highest: 40 45 5 60 90
| Value | 1 | 10 | 120 | 15 | 20 | 3 | 30 | 40 | 45 | 5 | 60 | 90 |
| Frequency | 913 | 3003 | 77 | 10 | 1 | 2646 | 1350 | 1376 | 5749 | 914 | 10 | 3 |
| Proportion | 0.057 | 0.187 | 0.005 | 0.001 | 0.000 | 0.165 | 0.084 | 0.086 | 0.358 | 0.057 | 0.001 | 0.000 |
City
| n | missing | distinct |
| 13539 | 2519 | 55 |
| lowest : | Ashton | ASHTON | BARNESVILLE | BEALLSVILLE | Bethesda |
| highest: | SILVER SPRING | SPENCERVILLE | Takoma Park | TAKOMA PARK | unkown |
State
| n | missing | distinct | value |
| 14857 | 1201 | 1 | MD |
Value MD
Frequency 14857
Proportion 1
Zip Code
| n | missing | distinct |
| 14740 | 1318 | 50 |
lowest : 20707 20777 20812 20814 20815 , highest: 20906 20910 20912 20993 21771
13. Source intake by phone/web/email/etc
1 Variables 16058 Observations
Source
lowest : CE Event Email Internal Map App Phone , highest: Internal Map App Phone Twitter Web
| Value | CE Event | Email | Internal | Map App | Phone | Twitter | Web |
| Frequency | 1 | 2 | 188 | 1 | 12750 | 11 | 3105 |
| Proportion | 0.000 | 0.000 | 0.012 | 0.000 | 0.794 | 0.001 | 0.193 |
Election District
| n | missing | distinct |
| 14741 | 1317 | 22 |
lowest : 01 02 03 04 05 , highest: 5 6 7 8 9
15. Maryland State District ambiguous
1 Variables 16058 Observations
Maryland State District
| n | missing | distinct |
| 14741 | 1317 | 8 |
lowest : 14 15 16 17 18 , highest: 17 18 19 20 39
Value 14 15 16 17 18 19 20 39
Frequency 3328 2491 3132 106 1490 2010 1380 804
Proportion 0.226 0.169 0.212 0.007 0.101 0.136 0.094 0.055
Congressional District
| n | missing | distinct |
| 14741 | 1317 | 4 |
Value 3 4 6 8
Frequency 2085 147 4597 7912
Proportion 0.141 0.010 0.312 0.537
Congressional Member
| n | missing | distinct |
| 14741 | 1317 | 15 |
| lowest : | CHRISTOPHER VAN HOLLEN,JR -Dem | CHRISTOPHER VAN HOLLEN, JR. (Democrat) | David Throne -Dem | DAVID TRONE (Democrat) | DONNA EDWARDS (Democrat) |
| highest: | John P. Sarbanes -Dem | JOHN P. SARBANES -Dem | JOHN P. SARBANES (Democrat) | ROSCOE BARTLETT - REP | ROSCOE BARTLETT (Republican) |
Council District
| n | missing | distinct |
| 14741 | 1317 | 5 |
lowest : 1 2 3 4 5 , highest: 1 2 3 4 5
Value 1 2 3 4 5
Frequency 5041 2052 1266 3836 2546
Proportion 0.342 0.139 0.086 0.260 0.173
Council Member Name
| n | missing | distinct |
| 14741 | 1317 | 14 |
| lowest : | Andrew Friedson | Cherri Branson | Craig Rice | CRAIG RICE | Nancy Navarro |
| highest: | ROGER BERLINER | Sidney Katz | Tom Hucker | Valerie Ervin | VALERIE ERVIN |
Andrew Friedson (1333, 0.090), Cherri Branson (283, 0.019), Craig Rice (1944, 0.132), CRAIG RICE (108, 0.007), Nancy Navarro (3730, 0.253), NANCY NAVARRO (106, 0.007), Phil Andrews (277, 0.019), PHIL ANDREWS (61, 0.004), Roger Berliner (3530, 0.239), ROGER BERLINER (178, 0.012), Sidney Katz (928, 0.063), Tom Hucker (2050, 0.139), Valerie Ervin (139, 0.009), VALERIE ERVIN (74, 0.005)
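The frequencies above count the same member under two capitalizations (for example, Craig Rice and CRAIG RICE). A minimal sketch of how such variants might be merged before counting follows; the `member` vector is a small illustrative sample whose spellings mirror the listing, not the dataset itself.

```r
# Illustrative values only; the spellings mirror the listing above.
member <- c("Craig Rice", "CRAIG RICE", "Nancy Navarro",
            "NANCY NAVARRO", "Tom Hucker")

# Upper-case and trim before counting so case variants fall together.
member_key <- toupper(trimws(member))
print(table(member_key))
```

The same normalization would apply to the City column, which shows the same pattern (Ashton and ASHTON). It would not repair misspellings, which need an explicit lookup.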
Changed Date
| n | missing | distinct | Info |
| 16058 | 0 | 16055 | 1 |
| Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
| 2017-07-30 03:56:25 | 94887311 | 2013-05-09 18:07:57 | 2014-01-28 09:49:09 | 2015-05-29 17:31:24 | 2017-08-24 02:22:20 | 2019-08-15 13:40:46 | 2021-03-15 18:00:01 | 2021-08-20 10:09:40 |
| lowest : | 2012-07-02 15:52:33 | 2012-07-06 13:37:56 | 2012-07-09 17:32:19 | 2012-07-10 13:15:49 | 2012-07-10 15:18:01 |
| highest: | 2022-02-11 11:03:41 | 2022-02-11 11:27:44 | 2022-02-11 16:02:22 | 2022-02-11 16:42:01 | 2022-02-11 18:36:47 |
21. # of Days Open count of elapsed business days to close request
1 Variables 16058 Observations
# of Days Open
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
| 16058 | 0 | 587 | 0.986 | 38.59 | 64.36 | 0 | 0 | 1 | 3 | 20 | 117 | 240 |
lowest : 0 1 2 3 4 , highest: 753 797 831 1740 1778
22. Within SLA Windows logical Yes/No
1 Variables 16058 Observations
Within SLA Windows
Value No Yes
Frequency 4450 11608
Proportion 0.277 0.723
SLA Yes
Value 0 1
Frequency 4450 11608
Proportion 0.277 0.723
SLA No
Value 0 1
Frequency 11608 4450
Proportion 0.723 0.277
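The three SLA summaries above report the same split (11608 versus 4450), which is what the summary means by redundant columns. A minimal sketch of how that redundancy might be confirmed row by row follows. The `sla` data frame and its column names are hypothetical stand-ins, and if the real columns were read as character they would need `as.integer()` first.

```r
# Toy rows mimicking the three SLA columns; values are illustrative only.
sla <- data.frame(
  within  = c("Yes", "No", "Yes", "Yes"),
  sla_yes = c(1L, 0L, 1L, 1L),
  sla_no  = c(0L, 1L, 0L, 0L)
)

# TRUE when both indicator columns can be derived from `within`.
is_redundant <- all(sla$sla_yes == as.integer(sla$within == "Yes")) &&
  all(sla$sla_no == 1L - sla$sla_yes)
print(is_redundant)
```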
Code:
---
title: "mc311waterWaterSomewhere"
author: "Steve Dutky Montgomery College"
date: "DATA 205 Spring 2022"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
```
```{r, results='asis'}
# Emit the table of contents stored in the file "toc", with an HTML break after each line
cat(paste(readLines("toc"), "<br>"))
```
### Summary Discussion {#summary}
I chose to explore the [Drainage-Erosion-Repair dataset](https://data.montgomerycountymd.gov/Government/Drainage-Erosion-Repair/3bzi-qh3p) with a project in mind that would tie complaints of flooded basements, sidewalks, and streets to precipitation recorded at NOAA weather stations in Montgomery County. Such a link could suggest where remediation would address the causes of local flooding.<br>
The [CSV data](https://data.montgomerycountymd.gov/api/views/3bzi-qh3p/rows.csv?accessType=DOWNLOAD) on the DataMontgomery portal collects the calls to the county's 311 Center that are referred to the Department of Transportation (DOT) for investigation.
<br><br>
Several issues are likely to pose difficulties for this project. The most significant are:
<br>
* lack of precise location
* electoral districts that have changed over the time span of the data
* uncertainty in connecting recorded precipitation events to the time a service request was created
* inability to determine the scale of flood events
<br>
Additionally, there are the typical problems with the structure of the data:
<br>
* column naming: embedded spaces and punctuation
* column names are verbose and do not adequately describe their contents
* redundant columns
* timestamps are overly precise.
```{r loadLibraries}
library(tidyverse)
library(Hmisc)
library(lubridate)
```
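```{r ingestData}
# Load the cached copy of the dataset (the code below expects an object named `erode`)
load(file = "erode.save")
# Original download, reading all 24 columns as character:
# erode <- read_delim("https://data.montgomerycountymd.gov/api/views/3bzi-qh3p/rows.csv?accessType=DOWNLOAD", ",", col_types = paste(rep("c", 24), collapse = ""))
# One descriptive title per column, used to label each summary
columnTitle <- readLines("columnDetail")
```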
```{r cleanAndWrangle}
# Temporarily give columns 20 and 21 syntactic names so mutate() can refer to them
tmp <- names(erode)
names(erode)[20] <- "tmpChange"
names(erode)[21] <- "tmpOpen"
erode <- erode %>%
  mutate(
    Opened    = parse_date_time(Opened, "%m/%d/%Y %I:%M:%S %p"),
    Closed    = parse_date_time(Closed, "%m/%d/%Y %I:%M:%S %p"),
    tmpChange = parse_date_time(tmpChange, "%m/%d/%Y %I:%M:%S %p"),
    tmpOpen   = as.integer(tmpOpen)
  )
# Restore the original column titles
names(erode) <- tmp
```
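The chunk above renames columns 20 and 21 before converting them, presumably because their titles contain spaces and a `#`. As a hedged alternative, base R's `[[ ]]` accepts any column title as a string, so the conversion can be done without renaming. The sketch below uses a hypothetical two-column stand-in, `erode_demo`, with illustrative values in the same timestamp format; it uses `as.POSIXct` instead of `parse_date_time` to stay free of package dependencies.

```r
# Toy stand-in for two awkwardly named columns; values are illustrative.
erode_demo <- data.frame(
  `Changed Date`   = c("07/02/2012 03:52:33 PM", "02/11/2022 06:36:47 PM"),
  `# of Days Open` = c("0", "1"),
  check.names = FALSE,          # keep the original column titles
  stringsAsFactors = FALSE
)

# [[ ]] takes the column title as a string, so no renaming is needed.
# Note: parsing %p (AM/PM) depends on the session locale.
erode_demo[["Changed Date"]] <- as.POSIXct(
  erode_demo[["Changed Date"]],
  format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC"
)
erode_demo[["# of Days Open"]] <- as.integer(erode_demo[["# of Days Open"]])
str(erode_demo)
```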
```{r, results='asis'}
html(describe(erode[, 1], columnTitle[1]))
html(describe(erode[, 2], columnTitle[2]))
html(describe(erode[, 3], columnTitle[3]))
html(describe(erode[, 4], columnTitle[4]))
html(describe(erode[, 5], columnTitle[5]))
html(describe(erode[, 6], columnTitle[6]))
html(describe(erode[, 7], columnTitle[7]))
html(describe(erode[, 8], columnTitle[8]))
html(describe(erode[, 9], columnTitle[9]))
html(describe(erode[, 10], columnTitle[10]))
html(describe(erode[, 11], columnTitle[11]))
html(describe(erode[, 12], columnTitle[12]))
html(describe(erode[, 13], columnTitle[13]))
html(describe(erode[, 14], columnTitle[14]))
html(describe(erode[, 15], columnTitle[15]))
html(describe(erode[, 16], columnTitle[16]))
html(describe(erode[, 17], columnTitle[17]))
html(describe(erode[, 18], columnTitle[18]))
html(describe(erode[, 19], columnTitle[19]))
html(describe(erode[, 20], columnTitle[20]))
html(describe(erode[, 21], columnTitle[21]))
html(describe(erode[, 22], columnTitle[22]))
html(describe(erode[, 23], columnTitle[23]))
html(describe(erode[, 24], columnTitle[24]))
```