Contents
Summary Discussion

Column Descriptions
1. SR ID refers to the unique identifier assigned to a service request
2. Opened intake date of a service request
3. Closed date on which request was marked complete by MC311 Center
4. Status open/closed
5. Department id of county department
6. Area intra departmental unit
7. Sub-Area subunit
8. Attached Solution (Topic) standard description of work performed
9. Attached Solution SLA Days prescribed elapsed business days to complete the work described
10. City
11. State
12. Zip Code
13. Source intake by phone/web/email/etc
14. Election District ambiguous
15. Maryland State District ambiguous
16. Congressional District
17. Congressional Member
18. Council District
19. Council Member Name
20. Changed Date
21. # of Days Open count of elapsed business days to close request
22. Within SLA Windows logical 1/0
23. SLA Yes 1 if yes; 0 if no
24. SLA No 1 if no; 0 if yes
R/Rstudio Code

Summary Discussion

I chose to explore the Drainage-Erosion-Repair dataset with a project in mind that would tie complaints of flooded basements, sidewalks, and streets to precipitation recorded at NOAA weather stations located in Montgomery County. Such a link could suggest where remediation would address the causes of local flooding.

The CSV data on the DataMontgomery portal collect the calls to the county’s 311 Center that are referred to the Department of Transportation (DOT) for investigation.

Several issues are likely to pose difficulties for this project. The most significant are:

  • lack of precise location
  • electoral districts have changed over the time span of the data
  • uncertainties connecting recorded precipitation events to the time of creation of service requests
  • inability to determine the scale of flood events.
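One way to probe the timing uncertainty is to count the requests opened within a fixed window after each precipitation event. The sketch below is only an illustration: both data frames are hypothetical toy data, `count_requests_after` is a made-up helper, and the three-day window is an assumption that would need tuning against real NOAA records.

```r
# Hypothetical toy data: neither table comes from the dataset or from NOAA.
requests <- data.frame(
  opened = as.Date(c("2020-07-01", "2020-07-03", "2020-07-10", "2020-07-11"))
)
precip <- data.frame(
  date   = as.Date(c("2020-06-30", "2020-07-09")),
  inches = c(1.8, 2.4)
)

# Count requests opened on the day of an event or within `lag_days` days
# after it. The window length is an assumption, not a known property of
# how residents report flooding.
count_requests_after <- function(event_date, opened, lag_days = 3) {
  delta <- as.integer(opened - event_date)   # whole days after the event
  sum(delta >= 0 & delta <= lag_days)
}

precip$requests_within_3d <- vapply(
  seq_along(precip$date),
  function(i) count_requests_after(precip$date[i], requests$opened),
  integer(1)
)
precip
```

Widening or narrowing `lag_days` and watching how the counts change gives a rough sense of how sensitive any precipitation link is to the assumed reporting delay.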

Additionally, there are the typical problems with the structure of the data:

  • column naming: embedded spaces and punctuation
  • column names are verbose and do not adequately describe their contents
  • redundant columns
  • timestamps are overly precise.
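The naming and timestamp problems can be handled with a few lines of base R. This is a minimal sketch rather than the cleaning used for this report: `clean_name` is a hypothetical helper, and reducing timestamps to calendar dates assumes that day-level resolution is enough for matching against daily precipitation.

```r
# Turn a verbose column title into a short, syntactic name.
clean_name <- function(x) {
  x <- tolower(x)
  x <- gsub("#", "n", x, fixed = TRUE)   # "# of Days Open" -> "n of days open"
  x <- gsub("[^a-z0-9]+", "_", x)        # spaces and punctuation -> "_"
  gsub("^_+|_+$", "", x)                 # trim leading and trailing "_"
}

new_names <- clean_name(c("SR ID", "Attached Solution (Topic)", "# of Days Open"))
new_names

# Reduce an overly precise timestamp to a calendar date.
opened     <- as.POSIXct("2012-07-02 09:02:36", tz = "UTC")
opened_day <- as.Date(opened)
opened_day
```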

1. SR ID refers to the unique identifier assigned to a service request
Back to Contents

1 Variables   16058 Observations

SR ID
       n  missing distinct
   16058        0    16058

lowest : 1106594008 1106613556 1106626493 1106626564 1106635476
highest: 1484150476 1484157546 1484161503 1484177607 1484185305

2. Opened intake date of a service request
Back to Contents

1 Variables   16058 Observations

Opened
                   n             missing            distinct                Info 
               16058                   0               16052                   1 
                Mean                 Gmd                 .05                 .10 
 2017-06-06 20:40:22            96238969 2013-04-12 10:58:26 2013-10-25 02:57:18 
                 .25                 .50                 .75                 .90 
 2015-03-23 13:41:02 2017-06-21 09:43:21 2019-07-15 13:47:50 2021-02-08 11:45:27 
                 .95 
 2021-08-04 11:36:15 
 
lowest : 2012-07-02 09:02:36, 2012-07-02 10:59:16, 2012-07-02 11:45:19, 2012-07-02 11:49:13, 2012-07-02 12:48:42
highest: 2022-02-10 13:08:12, 2022-02-10 13:23:58, 2022-02-10 14:49:07, 2022-02-10 16:12:45, 2022-02-10 16:29:01

3. Closed date on which request was marked complete by MC311 Center
Back to Contents

1 Variables   16058 Observations

Closed
                   n             missing            distinct                Info 
               15869                 189               15867                   1 
                Mean                 Gmd                 .05                 .10 
 2017-07-13 04:23:43            94221632 2013-05-08 11:08:48 2014-01-16 07:55:02 
                 .25                 .50                 .75                 .90 
 2015-05-26 07:21:50 2017-08-01 15:37:14 2019-07-29 13:29:37 2021-01-22 07:47:23 
                 .95 
 2021-08-04 00:06:19 
 
lowest : 2012-07-02 11:52:32, 2012-07-06 09:37:54, 2012-07-09 13:32:18, 2012-07-10 09:15:48, 2012-07-10 11:18:01
highest: 2022-02-10 09:30:53, 2022-02-10 13:45:33, 2022-02-10 14:51:21, 2022-02-10 16:31:42, 2022-02-11 13:36:46

4. Status open/closed
Back to Contents

1 Variables   16058 Observations

Status
       n  missing distinct
   16058        0        2
 Value           Closed In Progress
 Frequency        15869         189
 Proportion       0.988       0.012
 

5. Department id of county department
Back to Contents

1 Variables   16058 Observations

Department
       n  missing distinct    value
   16058        0        1      DOT
 Value        DOT
 Frequency  16058
 Proportion     1
 

6. Area intra departmental unit
Back to Contents

1 Variables   16058 Observations

Area
       n  missing distinct             value
   16058        0        1  Highway Services
 Value      Highway Services
 Frequency             16058
 Proportion                1
 

7. Sub-Area subunit
Back to Contents

1 Variables   16058 Observations

Sub-Area
       n  missing distinct
   16058        0        5

lowest : Curb and Gutter Repair, Drainage Repair, Office, Road Repair, Sidewalk Repair
highest: Curb and Gutter Repair, Drainage Repair, Office, Road Repair, Sidewalk Repair
 Value      Curb and Gutter Repair        Drainage Repair                 Office
 Frequency                       3                  16051                      1
 Proportion                      0                      1                      0
                                                         
 Value                 Road Repair        Sidewalk Repair
 Frequency                       1                      2
 Proportion                      0                      0
 

8. Attached Solution (Topic) standard description of work performed
Back to Contents

1 Variables   16058 Observations

Attached Solution (Topic)
       n  missing distinct
   16052        6       18

lowest : Clogged Storm Drain, Connect Sump Pump to Street Drain, Curb and Gutter Repair, Debris Pickup, Drain Under Driveway Apron
highest: Sidewalk Repair, Sinkhole Repair, Status of storm drain repair, Status of Storm Drain Repair, Street Drainage Repair

9. Attached Solution SLA Days prescribed elapsed business days to complete the work described
Back to Contents

1 Variables   16058 Observations

Attached Solution SLA Days
       n  missing distinct
   16052        6       12

lowest : 1 10 120 15 20 , highest: 40 45 5 60 90
 Value          1    10   120    15    20     3    30    40    45     5    60    90
 Frequency    913  3003    77    10     1  2646  1350  1376  5749   914    10     3
 Proportion 0.057 0.187 0.005 0.001 0.000 0.165 0.084 0.086 0.358 0.057 0.001 0.000
 

10. City
Back to Contents

1 Variables   16058 Observations

City
       n  missing distinct
   13539     2519       55

lowest : Ashton, ASHTON, BARNESVILLE, BEALLSVILLE, Bethesda
highest: SILVER SPRING, SPENCERVILLE, Takoma Park, TAKOMA PARK, unkown
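The City summary lists the same place under different capitalization ("Ashton" and "ASHTON") and includes the misspelled placeholder "unkown". A minimal sketch of one way to collapse these variants follows; `normalize_city` is a hypothetical helper, and treating "unkown" as missing is an assumption about what that value means.

```r
# Collapse case and whitespace variants of city names.
normalize_city <- function(x) {
  x <- toupper(trimws(x))
  # Assumption: the placeholder (as spelled in the data) marks a missing city.
  x[x %in% c("UNKOWN", "UNKNOWN", "")] <- NA_character_
  x
}

cities <- normalize_city(c("Ashton", "ASHTON", " Takoma Park", "unkown"))
cities
length(unique(cities[!is.na(cities)]))   # distinct cities after normalization
```

The same approach would apply to the Congressional Member and Council Member Name columns, which show the same mixed capitalization.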

11. State
Back to Contents

1 Variables   16058 Observations

State
       n  missing distinct    value
   14857     1201        1       MD
 Value         MD
 Frequency  14857
 Proportion     1
 

12. Zip Code
Back to Contents

1 Variables   16058 Observations

Zip Code
       n  missing distinct
   14740     1318       50

lowest : 20707 20777 20812 20814 20815 , highest: 20906 20910 20912 20993 21771

13. Source intake by phone/web/email/etc
Back to Contents

1 Variables   16058 Observations

Source
       n  missing distinct
   16058        0        7

lowest : CE Event Email Internal Map App Phone , highest: Internal Map App Phone Twitter Web
 Value      CE Event    Email Internal  Map App    Phone  Twitter      Web
 Frequency         1        2      188        1    12750       11     3105
 Proportion    0.000    0.000    0.012    0.000    0.794    0.001    0.193
 

14. Election District ambiguous
Back to Contents

1 Variables   16058 Observations

Election District
       n  missing distinct
   14741     1317       22

lowest : 01 02 03 04 05 , highest: 5 6 7 8 9

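The lowest and highest values above show the same districts coded both with and without a leading zero ("05" and "5"), which inflates the distinct count. A small sketch of one possible fix, assuming every non-missing code is a plain integer (the name `pad_district` is hypothetical):

```r
# Re-code district labels to a fixed two-digit form.
pad_district <- function(x) {
  n <- as.integer(x)                       # "05" and "5" both become 5
  ifelse(is.na(n), NA_character_, sprintf("%02d", n))
}

districts <- pad_district(c("01", "1", "5", "05", NA))
districts
```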
15. Maryland State District ambiguous
Back to Contents

1 Variables   16058 Observations

Maryland State District
       n  missing distinct
   14741     1317        8

lowest : 14 15 16 17 18 , highest: 17 18 19 20 39
 Value         14    15    16    17    18    19    20    39
 Frequency   3328  2491  3132   106  1490  2010  1380   804
 Proportion 0.226 0.169 0.212 0.007 0.101 0.136 0.094 0.055
 

16. Congressional District
Back to Contents

1 Variables   16058 Observations

Congressional District
       n  missing distinct
   14741     1317        4

 Value          3     4     6     8
 Frequency   2085   147  4597  7912
 Proportion 0.141 0.010 0.312 0.537
 

17. Congressional Member
Back to Contents

1 Variables   16058 Observations

Congressional Member
       n  missing distinct
   14741     1317       15

lowest : CHRISTOPHER VAN HOLLEN,JR -Dem; CHRISTOPHER VAN HOLLEN, JR. (Democrat); David Throne -Dem; DAVID TRONE (Democrat); DONNA EDWARDS (Democrat)
highest: John P. Sarbanes -Dem; JOHN P. SARBANES -Dem; JOHN P. SARBANES (Democrat); ROSCOE BARTLETT - REP; ROSCOE BARTLETT (Republican)

18. Council District
Back to Contents

1 Variables   16058 Observations

Council District
       n  missing distinct
   14741     1317        5

lowest : 1 2 3 4 5 , highest: 1 2 3 4 5
 Value          1     2     3     4     5
 Frequency   5041  2052  1266  3836  2546
 Proportion 0.342 0.139 0.086 0.260 0.173
 

19. Council Member Name
Back to Contents

1 Variables   16058 Observations

Council Member Name
       n  missing distinct
   14741     1317       14

lowest : Andrew Friedson, Cherri Branson, Craig Rice, CRAIG RICE, Nancy Navarro
highest: ROGER BERLINER, Sidney Katz, Tom Hucker, Valerie Ervin, VALERIE ERVIN
Andrew Friedson (1333, 0.090), Cherri Branson (283, 0.019), Craig Rice (1944, 0.132), CRAIG RICE (108, 0.007), Nancy Navarro (3730, 0.253), NANCY NAVARRO (106, 0.007), Phil Andrews (277, 0.019), PHIL ANDREWS (61, 0.004), Roger Berliner (3530, 0.239), ROGER BERLINER (178, 0.012), Sidney Katz (928, 0.063), Tom Hucker (2050, 0.139), Valerie Ervin (139, 0.009), VALERIE ERVIN (74, 0.005)

20. Changed Date
Back to Contents

1 Variables   16058 Observations

Changed Date
                   n             missing            distinct                Info 
               16058                   0               16055                   1 
                Mean                 Gmd                 .05                 .10 
 2017-07-30 03:56:25            94887311 2013-05-09 18:07:57 2014-01-28 09:49:09 
                 .25                 .50                 .75                 .90 
 2015-05-29 17:31:24 2017-08-24 02:22:20 2019-08-15 13:40:46 2021-03-15 18:00:01 
                 .95 
 2021-08-20 10:09:40 
 
lowest : 2012-07-02 15:52:33, 2012-07-06 13:37:56, 2012-07-09 17:32:19, 2012-07-10 13:15:49, 2012-07-10 15:18:01
highest: 2022-02-11 11:03:41, 2022-02-11 11:27:44, 2022-02-11 16:02:22, 2022-02-11 16:42:01, 2022-02-11 18:36:47

21. # of Days Open count of elapsed business days to close request
Back to Contents

1 Variables   16058 Observations

# of Days Open
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95
   16058        0      587    0.986    38.59    64.36        0        0        1        3       20      117      240

lowest : 0 1 2 3 4 , highest: 753 797 831 1740 1778

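The column description calls this a count of elapsed business days. The county's exact counting rule is not documented here, so the following is only a sketch of one plausible rule: `business_days` is a hypothetical helper that counts Monday to Friday, ignores holidays, and does not count the opening day itself.

```r
# Count weekdays after `opened` up to and including `closed`.
# Assumptions: weekends are excluded, holidays are not handled,
# and a request closed on the day it was opened counts as 0.
business_days <- function(opened, closed) {
  if (is.na(opened) || is.na(closed) || closed < opened) return(NA_integer_)
  days <- seq(opened, closed, by = "day")
  # "%u" is the ISO weekday number: 1 = Monday ... 7 = Sunday
  is_weekday <- as.integer(format(days, "%u")) <= 5
  sum(is_weekday[-1])                      # drop the opening day
}

business_days(as.Date("2022-02-11"), as.Date("2022-02-14"))   # Friday to Monday
```

Comparing such a recomputed count against the published column for a sample of closed requests would show whether the county's rule differs, for example by skipping holidays.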
22. Within SLA Windows logical 1/0
Back to Contents

1 Variables   16058 Observations

Within SLA Windows
       n  missing distinct
   16058        0        2
 Value         No   Yes
 Frequency   4450 11608
 Proportion 0.277 0.723
 

23. SLA Yes 1 if yes; 0 if no
Back to Contents

1 Variables   16058 Observations

SLA Yes
       n  missing distinct
   16058        0        2
 Value          0     1
 Frequency   4450 11608
 Proportion 0.277 0.723
 

24. SLA No 1 if no; 0 if yes
Back to Contents

1 Variables   16058 Observations

SLA No
       n  missing distinct
   16058        0        2
 Value          0     1
 Frequency  11608  4450
 Proportion 0.723 0.277
 

Code:

---
title: "mc311waterWaterSomewhere"
author: "Steve Dutky Montgomery College"
date: "DATA 205 Spring 2022"
output: html_document

---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
```

```{r, results='asis'}
cat(paste(readLines("toc"), "<br>"))
```


### Summary Discussion {#summary}

I chose to explore the [Drainage-Erosion-Repair dataset](https://data.montgomerycountymd.gov/Government/Drainage-Erosion-Repair/3bzi-qh3p) with a project in mind that would tie complaints of flooded basements, sidewalks, and streets to precipitation recorded at NOAA weather stations located in Montgomery County. Such a link could suggest where remediation would address the causes of local flooding.<br>

The [CSV data](https://data.montgomerycountymd.gov/api/views/3bzi-qh3p/rows.csv?accessType=DOWNLOAD) on the DataMontgomery portal collect the calls to the county's 311 Center that are referred to the Department of Transportation (DOT) for investigation.
<br><br>
Several issues are likely to pose difficulties for this project. The most significant are:
<br>

*  lack of precise location
*  electoral districts have changed over the time span of the data
*  uncertainties connecting recorded precipitation events to the time of creation of service requests
*  inability to determine the scale of flood events.
<br>

Additionally, there are the typical problems with the structure of the data:
<br>
 
*  column naming: embedded spaces and punctuation
*  column names are verbose and do not adequately describe their contents
*  redundant columns
*  timestamps are overly precise.

```{r loadLibraries}

library(tidyverse)
library(Hmisc)
library(lubridate)
```

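```{r ingestData }
# Load the previously saved copy of the dataset (object `erode`).
load(file="erode.save")
# To refresh from the portal instead, read all 24 columns as character:
#erode<-read_delim("https://data.montgomerycountymd.gov/api/views/3bzi-qh3p/rows.csv?accessType=DOWNLOAD",",",col_types=paste(rep("c",24),collapse = ""))
# One description line per column, used as the title of each summary.
columnTitle<-readLines("columnDetail")
```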
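```{r cleanAndWrangle}

# Columns 20 ("Changed Date") and 21 ("# of Days Open") have non-syntactic
# names, so give them temporary names for use inside mutate().
tmp<-names(erode)
names(erode)[20]<-"tmpChange"
names(erode)[21]<-"tmpOpen"

# Parse the 12-hour timestamps and convert the day count to integer.
erode<- erode %>%
  mutate(
    Opened=parse_date_time(Opened,"%m/%d/%Y %I:%M:%S %p"),
    Closed=parse_date_time(Closed,"%m/%d/%Y %I:%M:%S %p"),
    tmpChange=parse_date_time(tmpChange,"%m/%d/%Y %I:%M:%S %p"),
    tmpOpen=as.integer(tmpOpen)
  )

# Restore the original column names.
names(erode)<-tmp

```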




```{r, results='asis'}
html(describe(erode[, 1], columnTitle[1]))
html(describe(erode[, 2], columnTitle[2]))
html(describe(erode[, 3], columnTitle[3]))
html(describe(erode[, 4], columnTitle[4]))
html(describe(erode[, 5], columnTitle[5]))
html(describe(erode[, 6], columnTitle[6]))
html(describe(erode[, 7], columnTitle[7]))
html(describe(erode[, 8], columnTitle[8]))
html(describe(erode[, 9], columnTitle[9]))
html(describe(erode[, 10], columnTitle[10]))
html(describe(erode[, 11], columnTitle[11]))
html(describe(erode[, 12], columnTitle[12]))
html(describe(erode[, 13], columnTitle[13]))
html(describe(erode[, 14], columnTitle[14]))
html(describe(erode[, 15], columnTitle[15]))
html(describe(erode[, 16], columnTitle[16]))
html(describe(erode[, 17], columnTitle[17]))
html(describe(erode[, 18], columnTitle[18]))
html(describe(erode[, 19], columnTitle[19]))
html(describe(erode[, 20], columnTitle[20]))
html(describe(erode[, 21], columnTitle[21]))
html(describe(erode[, 22], columnTitle[22]))
html(describe(erode[, 23], columnTitle[23]))
html(describe(erode[, 24], columnTitle[24]))

```