Doing Excel Things in R #2

Here is an involved exercise using a larger data base in the form of an Excel .csv file to work with.

If you want to download the .csv file from github you can find it here.

Excel can be kind of squirrelly, and I found it best to download the ZIP on the github page and drag the Employee_Earnings_Report_2014.csv file to my desktop without opening the file.

Open Data File and View Data

earnings <- read.csv('C:\\Users\\Steven\\Desktop\\Employee_Earnings_Report_2014.csv', stringsAsFactors = FALSE)

Explore the data a bit using the dim, head, or tail commands. We will see there are 22,233 rows and 12 columns.

dim(earnings)

## [1] 22233    12

head(earnings)

##                        NAME                         TITLE
## 1          Adario,Anthony J  Supvising Claims Agent (Asd)
## 2        Baker,Katherine E.            Nurse Case Manager
## 3            Braga,Ann Hess HumanResourcesComplianceOffcr
## 4      Calobrisi,Jane Ellen             Personnel Analyst
## 5 Castillo-Cook,Wendolyn M.             Alcholism Coord I
## 6       Curran,Liam Patrick         Asst Corp Counsel III
##       DEPARTMENT.NAME   REGULAR RETRO    OTHER OVERTIME INJURED DETAIL
## 1 ASD Human Resources $98538.32 $0.00 $1842.87    $0.00   $0.00  $0.00
## 2 ASD Human Resources $83942.21 $0.00  $225.00    $0.00   $0.00  $0.00
## 3 ASD Human Resources $98538.34 $0.00    $0.00    $0.00   $0.00  $0.00
## 4 ASD Human Resources $70073.81 $0.00  $225.00    $0.00   $0.00  $0.00
## 5 ASD Human Resources $79253.11 $0.00    $0.00    $0.00   $0.00  $0.00
## 6 ASD Human Resources $69266.69 $0.00    $0.00    $0.00   $0.00  $0.00
##   QUINN TOTAL.EARNINGS   ZIP
## 1 $0.00     $100381.19 02132
## 2 $0.00      $84167.21 02136
## 3 $0.00      $98538.34 02126
## 4 $0.00      $70298.81 02090
## 5 $0.00      $79253.11 02121
## 6 $0.00      $69266.69 02122

tail(earnings)

##                          NAME               TITLE        DEPARTMENT.NAME
## 22228 Yanosick,Justin Timothy        Fire Fighter Boston Fire Department
## 22229               Yee,Jacob        Fire Fighter Boston Fire Department
## 22230          Yetman,Patrick        Fire Fighter Boston Fire Department
## 22231           Young,Allan M        Fire Fighter Boston Fire Department
## 22232         Young,Michael L        Fire Fighter Boston Fire Department
## 22233        Zagami,Anthony J Fire Alarm Operator Boston Fire Department
##         REGULAR     RETRO    OTHER  OVERTIME   INJURED   DETAIL QUINN
## 22228 $92646.90 $14250.84  $550.00 $28269.86     $0.00    $0.00 $0.00
## 22229 $80073.63 $13529.71  $550.00 $15560.09 $13353.47 $7245.75 $0.00
## 22230 $69886.18  $5191.64  $550.00  $8227.25     $0.00    $0.00 $0.00
## 22231 $97234.17 $14091.91 $3050.00  $6193.76     $0.00 $4464.00 $0.00
## 22232 $95694.84 $13973.38  $550.00 $10678.12     $0.00  $132.00 $0.00
## 22233     $0.00  $5494.93    $0.00     $0.00     $0.00    $0.00 $0.00
##       TOTAL.EARNINGS   ZIP
## 22228     $135717.60 02132
## 22229     $130312.65 02136
## 22230      $83855.07 02127
## 22231     $125033.84 02136
## 22232     $121028.34 02121
## 22233       $5494.93 02124

Examine Data Structure

You will notice that the data file consists entirely of characters. All cells are non-numeric. We will have to change this in order to analyze the data. We’ll get to that in a bit.

str(earnings)

## 'data.frame':    22233 obs. of  12 variables:
##  $ NAME           : chr  "Adario,Anthony J" "Baker,Katherine E." "Braga,Ann Hess" "Calobrisi,Jane Ellen" ...
##  $ TITLE          : chr  "Supvising Claims Agent (Asd)" "Nurse Case Manager" "HumanResourcesComplianceOffcr" "Personnel Analyst" ...
##  $ DEPARTMENT.NAME: chr  "ASD Human Resources" "ASD Human Resources" "ASD Human Resources" "ASD Human Resources" ...
##  $ REGULAR        : chr  "$98538.32" "$83942.21" "$98538.34" "$70073.81" ...
##  $ RETRO          : chr  "$0.00" "$0.00" "$0.00" "$0.00" ...
##  $ OTHER          : chr  "$1842.87" "$225.00" "$0.00" "$225.00" ...
##  $ OVERTIME       : chr  "$0.00" "$0.00" "$0.00" "$0.00" ...
##  $ INJURED        : chr  "$0.00" "$0.00" "$0.00" "$0.00" ...
##  $ DETAIL         : chr  "$0.00" "$0.00" "$0.00" "$0.00" ...
##  $ QUINN          : chr  "$0.00" "$0.00" "$0.00" "$0.00" ...
##  $ TOTAL.EARNINGS : chr  "$100381.19" "$84167.21" "$98538.34" "$70298.81" ...
##  $ ZIP            : chr  "02132" "02136" "02126" "02090" ...

Target Specific Rows, Columns, or Cells

If you want to find values in a specific row, column, or cell, you can use the following codes. As usual in R, there are different ways of doing the same thing.

earnings[3, 1] #row 3, column 1 or

## [1] "Braga,Ann Hess"

earnings$NAME[3]

## [1] "Braga,Ann Hess"

earnings[1:10, 1] #column 1 (rows 1 through 10 only) or

##  [1] "Adario,Anthony J"          "Baker,Katherine E."       
##  [3] "Braga,Ann Hess"            "Calobrisi,Jane Ellen"     
##  [5] "Castillo-Cook,Wendolyn M." "Curran,Liam Patrick"      
##  [7] "De Jesus,Winnie Yam"       "Donnelly,Jennifer P."     
##  [9] "Donovan,Carol Anne"        "Donovan,John J"

earnings[1:10, 'NAME']

##  [1] "Adario,Anthony J"          "Baker,Katherine E."       
##  [3] "Braga,Ann Hess"            "Calobrisi,Jane Ellen"     
##  [5] "Castillo-Cook,Wendolyn M." "Curran,Liam Patrick"      
##  [7] "De Jesus,Winnie Yam"       "Donnelly,Jennifer P."     
##  [9] "Donovan,Carol Anne"        "Donovan,John J"

earnings[3, ] #row 3

##             NAME                         TITLE     DEPARTMENT.NAME
## 3 Braga,Ann Hess HumanResourcesComplianceOffcr ASD Human Resources
##     REGULAR RETRO OTHER OVERTIME INJURED DETAIL QUINN TOTAL.EARNINGS   ZIP
## 3 $98538.34 $0.00 $0.00    $0.00   $0.00  $0.00 $0.00      $98538.34 02126

Formatting

We noted that the information in the file is all non-numeric. We need to deal with that. Let’s start by removing the dollar sign from a selected column.

If you are working in R Studio, it is useful to keep the file view open in the upper left (the script editing pane) you will instantly see changes you make.

earnings$TOTAL.EARNINGS <- gsub('\\$', '', earnings$TOTAL.EARNINGS)

The column has to be a numeric vector if we are to manipulate it, so time to convert the character vector to numeric.

earnings$TOTAL.EARNINGS <- as.numeric(earnings$TOTAL.EARNINGS)
str(earnings$TOTAL.EARNINGS)

##  num [1:22233] 100381 84167 98538 70299 79253 ...

Sorting by a Column

Sorting is not a problem in R. The negative sign (-) in the code means sort in descending order. The default is ascending order.

earnings <- earnings[order(-earnings$TOTAL.EARNINGS), ]
earnings[1:10, 11] #column 1 (rows 1 through 10 only)

##  [1] 415709.5 361025.0 350510.5 342249.4 325943.6 325579.0 317181.5
##  [8] 300065.6 292205.7 289436.2

Create a New Column With a Formula

In this case, we are going to create a new column called Total.less.OT to find each employee’s base salary. The Total.Earnings column includes base plus overtime.

But first we have to remove the dollar signs and then convert the Overtime column to a numeric vector, as we did to Total.Earnings, or we will induce an error.

earnings$OVERTIME <- sub('\\$', '', earnings$OVERTIME)
earnings$OVERTIME <- as.numeric(earnings$OVERTIME)
earnings$Total.less.OT <- earnings$TOTAL.EARNINGS - earnings$OVERTIME
head(earnings)

##                       NAME                          TITLE
## 17835    Crossen,Patrick J             Police Captain/DDC
## 19414     Murray,Timothy J  Capt.D.D.C-pd Details Section
## 18825    Kervin,Timothy M. Police Lieutenant/Hdq Dispatch
## 8342           Hite,Brenda                        Teacher
## 17822 Creaven,Jacqueline D              Police Lieutenant
## 18633        Hosein,Haseeb             Police Captain/DDC
##                DEPARTMENT.NAME    REGULAR     RETRO      OTHER  OVERTIME
## 17835 Boston Police Department $132427.70 $44690.23 $140711.86  77287.73
## 19414 Boston Police Department $133014.78 $36744.79  $70567.98  46059.04
## 18825 Boston Police Department $112948.06 $41654.06  $16975.92 115079.93
## 8342      BPS Kenny Elementary  $38453.84     $0.00 $303795.59      0.00
## 17822 Boston Police Department $111090.95 $30161.50 $131044.53  26375.89
## 18633 Boston Police Department $117206.34 $32585.30  $14656.40  50739.18
##        INJURED    DETAIL     QUINN TOTAL.EARNINGS   ZIP Total.less.OT
## 17835    $0.00     $0.00 $20592.01       415709.5 02043      338421.8
## 19414    $0.00 $53956.00 $20682.44       361025.0 02021      314966.0
## 18825    $0.00 $46314.00 $17538.56       350510.5 02135      235430.6
## 8342     $0.00     $0.00     $0.00       342249.4 02136      342249.4
## 17822  $132.97  $9918.00 $17219.79       325943.6 02132      299567.7
## 18633 $1294.34 $90840.00 $18257.43       325579.0 02186      274839.8

Subset or Filter a Column

Basically, in Excel terms, we are going to cut-and-paste columns into new R objects. In other words, we are extracting all data in rows from the earnings data frame that include “Boston Fire Department” in the Department.Name column and putting them into a separate object. Likewise for the Boston PD.

fire_dept <- subset(earnings, DEPARTMENT.NAME == 'Boston Fire Department')
police_dept <- subset(earnings, DEPARTMENT.NAME == 'Boston Police Department')
head(fire_dept)

##                     NAME                          TITLE
## 21554  McDevitt,Daniel J    Dep Fire Chief Adm-Tech-ADR
## 22111    Tobin,William A District Fire Chief-Technician
## 21986 Shea,Bartholomew J                 Dep Fire Chief
## 20853 Doherty,Michael J.                 Dep Fire Chief
## 21232      Holt,Thomas E         Fire Captain Admin-ADR
## 21933  Ruggere,Michael J                 Dep Fire Chief
##              DEPARTMENT.NAME    REGULAR     RETRO      OTHER OVERTIME
## 21554 Boston Fire Department   $8656.63  $8457.93 $132814.00     0.00
## 22111 Boston Fire Department   $6049.45  $1522.99 $114358.64     0.00
## 21986 Boston Fire Department $172250.77 $22037.04  $10061.23 55245.78
## 20853 Boston Fire Department $167289.41 $23861.77   $3741.54 63222.42
## 21232 Boston Fire Department  $27005.66 $13978.43 $101051.37  3153.96
## 21933 Boston Fire Department $183195.52 $20912.07   $4565.94 33749.38
##          INJURED DETAIL QUINN TOTAL.EARNINGS   ZIP Total.less.OT
## 21554 $118050.73  $0.00 $0.00       267979.3 02127      267979.3
## 22111 $140484.65  $0.00 $0.00       262415.7 02191      262415.7
## 21986      $0.00  $0.00 $0.00       259594.8 02132      204349.0
## 20853    $603.66  $0.00 $0.00       258718.8 01867      195496.4
## 21232 $107359.66  $0.00 $0.00       252549.1 02135      249395.1
## 21933      $0.00  $0.00 $0.00       242422.9 02188      208673.5

head(police_dept)

##                       NAME                          TITLE
## 17835    Crossen,Patrick J             Police Captain/DDC
## 19414     Murray,Timothy J  Capt.D.D.C-pd Details Section
## 18825    Kervin,Timothy M. Police Lieutenant/Hdq Dispatch
## 17822 Creaven,Jacqueline D              Police Lieutenant
## 18633        Hosein,Haseeb             Police Captain/DDC
## 17906     Danilecki,John H                 Police Captain
##                DEPARTMENT.NAME    REGULAR     RETRO      OTHER  OVERTIME
## 17835 Boston Police Department $132427.70 $44690.23 $140711.86  77287.73
## 19414 Boston Police Department $133014.78 $36744.79  $70567.98  46059.04
## 18825 Boston Police Department $112948.06 $41654.06  $16975.92 115079.93
## 17822 Boston Police Department $111090.95 $30161.50 $131044.53  26375.89
## 18633 Boston Police Department $117206.34 $32585.30  $14656.40  50739.18
## 17906 Boston Police Department $131219.38 $37210.94  $19757.04  78010.40
##        INJURED    DETAIL     QUINN TOTAL.EARNINGS   ZIP Total.less.OT
## 17835    $0.00     $0.00 $20592.01       415709.5 02043      338421.8
## 19414    $0.00 $53956.00 $20682.44       361025.0 02021      314966.0
## 18825    $0.00 $46314.00 $17538.56       350510.5 02135      235430.6
## 17822  $132.97  $9918.00 $17219.79       325943.6 02132      299567.7
## 18633 $1294.34 $90840.00 $18257.43       325579.0 02186      274839.8
## 17906    $0.00 $30573.00 $20410.79       317181.5 02081      239171.1

We could work on these objects separately on their own, away from the original earnings data frame.

Summarizing Columns

As in Excel, we can calculate sums, means, medians, or any other value we like on columns we choose.

sum(earnings$TOTAL.EARNINGS)

## [1] 1487852769

mean(earnings$TOTAL.EARNINGS)

## [1] 66920.92

median(earnings$TOTAL.EARNINGS)

## [1] 61714.87

max(earnings$TOTAL.EARNINGS)

## [1] 415709.5

range(earnings$TOTAL.EARNINGS)

## [1]      2.53 415709.53

And we can create separate object for each of these if we need them. For example, for the sum of TOTAL.EARNINGS.

earnings_total <- sum(earnings$TOTAL.EARNINGS)
earnings_total

## [1] 1487852769

Text Editing

You may have noticed the Name column contains first and last names, and middle initials. It may be necessary to break this aggregated column into its constituent parts. This can be done in R but the process is a little complicated.

Let’s begin by breaking out the last names. The sub command below will delete everything in the Name column after the comma following the last name then place the remainder in a new column called Last.Name.

earnings$Last.Name <- sub(',.*', '', earnings$NAME) #deletes everything in NAME
#column after the comma
earnings[1:10, 'Last.Name']

##  [1] "Crossen"   "Murray"    "Kervin"    "Hite"      "Creaven"  
##  [6] "Hosein"    "Danilecki" "Spillane"  "Assad"     "Greeley"

Next, we will create a new column called First.Name and extract contents from the Name column.

earnings$First.Name <- sub('.*,', '', earnings$NAME)
earnings[1:10, 'First.Name']

##  [1] "Patrick J"      "Timothy J"      "Timothy M."     "Brenda"        
##  [5] "Jacqueline D"   "Haseeb"         "John H"         "Matthew J"     
##  [9] "Mark L"         "Darrin Patrick"

Still, there is the problem of the middle initials. We want to remove it from the First.Name column and create a new column for them. This gets a little complicated because we take a couple steps, one involving a function.

Part 1: Create an Array of All Lines in earnings

earnings_list <- 1:nrow(earnings)
head(earnings_list)

## [1] 1 2 3 4 5 6

Part 2: Use Function (and Loop)

In a couple steps, we are going to create a column for middle names, then clean up the column First.Name.

for(i in earnings_list){
        if (grepl(' ', earnings$First.Name[i])) {
                earnings$Mid.Name[i] <- sub('.* ', '', earnings$First.Name[i])
        } else {
                earnings$Mid.Name[i] <- ''
        }
}
head(earnings)

##                       NAME                          TITLE
## 17835    Crossen,Patrick J             Police Captain/DDC
## 19414     Murray,Timothy J  Capt.D.D.C-pd Details Section
## 18825    Kervin,Timothy M. Police Lieutenant/Hdq Dispatch
## 8342           Hite,Brenda                        Teacher
## 17822 Creaven,Jacqueline D              Police Lieutenant
## 18633        Hosein,Haseeb             Police Captain/DDC
##                DEPARTMENT.NAME    REGULAR     RETRO      OTHER  OVERTIME
## 17835 Boston Police Department $132427.70 $44690.23 $140711.86  77287.73
## 19414 Boston Police Department $133014.78 $36744.79  $70567.98  46059.04
## 18825 Boston Police Department $112948.06 $41654.06  $16975.92 115079.93
## 8342      BPS Kenny Elementary  $38453.84     $0.00 $303795.59      0.00
## 17822 Boston Police Department $111090.95 $30161.50 $131044.53  26375.89
## 18633 Boston Police Department $117206.34 $32585.30  $14656.40  50739.18
##        INJURED    DETAIL     QUINN TOTAL.EARNINGS   ZIP Total.less.OT
## 17835    $0.00     $0.00 $20592.01       415709.5 02043      338421.8
## 19414    $0.00 $53956.00 $20682.44       361025.0 02021      314966.0
## 18825    $0.00 $46314.00 $17538.56       350510.5 02135      235430.6
## 8342     $0.00     $0.00     $0.00       342249.4 02136      342249.4
## 17822  $132.97  $9918.00 $17219.79       325943.6 02132      299567.7
## 18633 $1294.34 $90840.00 $18257.43       325579.0 02186      274839.8
##       Last.Name   First.Name Mid.Name
## 17835   Crossen    Patrick J        J
## 19414    Murray    Timothy J        J
## 18825    Kervin   Timothy M.       M.
## 8342       Hite       Brenda         
## 17822   Creaven Jacqueline D        D
## 18633    Hosein       Haseeb

Now that we have three columns – Last.Name, First.Name, and Mid.Name we have to go back to the First.Name column and remove the middle initials.

earnings$First.Name <- sub(' .*', '', earnings$First.Name)
head(earnings)

##                       NAME                          TITLE
## 17835    Crossen,Patrick J             Police Captain/DDC
## 19414     Murray,Timothy J  Capt.D.D.C-pd Details Section
## 18825    Kervin,Timothy M. Police Lieutenant/Hdq Dispatch
## 8342           Hite,Brenda                        Teacher
## 17822 Creaven,Jacqueline D              Police Lieutenant
## 18633        Hosein,Haseeb             Police Captain/DDC
##                DEPARTMENT.NAME    REGULAR     RETRO      OTHER  OVERTIME
## 17835 Boston Police Department $132427.70 $44690.23 $140711.86  77287.73
## 19414 Boston Police Department $133014.78 $36744.79  $70567.98  46059.04
## 18825 Boston Police Department $112948.06 $41654.06  $16975.92 115079.93
## 8342      BPS Kenny Elementary  $38453.84     $0.00 $303795.59      0.00
## 17822 Boston Police Department $111090.95 $30161.50 $131044.53  26375.89
## 18633 Boston Police Department $117206.34 $32585.30  $14656.40  50739.18
##        INJURED    DETAIL     QUINN TOTAL.EARNINGS   ZIP Total.less.OT
## 17835    $0.00     $0.00 $20592.01       415709.5 02043      338421.8
## 19414    $0.00 $53956.00 $20682.44       361025.0 02021      314966.0
## 18825    $0.00 $46314.00 $17538.56       350510.5 02135      235430.6
## 8342     $0.00     $0.00     $0.00       342249.4 02136      342249.4
## 17822  $132.97  $9918.00 $17219.79       325943.6 02132      299567.7
## 18633 $1294.34 $90840.00 $18257.43       325579.0 02186      274839.8
##       Last.Name First.Name Mid.Name
## 17835   Crossen    Patrick        J
## 19414    Murray    Timothy        J
## 18825    Kervin    Timothy       M.
## 8342       Hite     Brenda         
## 17822   Creaven Jacqueline        D
## 18633    Hosein     Haseeb

It’s not perfect (or easy) but it can be done.

Create a Pivot Table

Suppose we want a count of employees in each department in our data frame. We are going to follow these steps.

Create a new data frame
Sort it
Rename columns in the new data frame
Create the pivot table

Start with a new data frame.

Dept_Workers <- data.frame(table(earnings$DEPARTMENT.NAME))
head(Dept_Workers)

##                           Var1 Freq
## 1              Achievement Gap   15
## 2  Alighieri Montessori School   12
## 3          ASD Human Resources   53
## 4  ASD Intergvernmtl Relations   11
## 5 ASD Office of Budget Mangmnt   25
## 6 ASD Office Of Labor Relation   10

Now we know there are 12 employees at the Alighieri Montessori School.

Next, let’s sort the data frame by number of employees in descending order.

Dept_Workers <- Dept_Workers[order(-Dept_Workers$Freq), ]
head(Dept_Workers)

##                               Var1 Freq
## 22        Boston Police Department 3178
## 20          Boston Fire Department 1834
## 120   BPS Substitute Teachers/Nurs 1007
## 119          BPS Special Education  721
## 64         BPS Facility Management  581
## 19  Boston Cntr - Youth & Families  562

With 3178 employees, the Boston Police Department is largest.

If we don’t like the current column names – and who does? – we can change Var1 and Freq to something more descriptive.

colnames(Dept_Workers) <- c('Dept', 'No. of Employees')
head(Dept_Workers)

##                               Dept No. of Employees
## 22        Boston Police Department             3178
## 20          Boston Fire Department             1834
## 120   BPS Substitute Teachers/Nurs             1007
## 119          BPS Special Education              721
## 64         BPS Facility Management              581
## 19  Boston Cntr - Youth & Families              562

We are in a position now to create a pivot table of total income for employees in each department.

Start by creating a new object – income – with a new column – Dept. You will understand why we are doing this in a moment.

income <- tapply(earnings$TOTAL.EARNINGS, earnings$DEPARTMENT.NAME, sum)
income <- data.frame(income)
income$Dept <- rownames(income)
head(income)

##                                 income                         Dept
## Achievement Gap              1399700.2              Achievement Gap
## Alighieri Montessori School   649428.2  Alighieri Montessori School
## ASD Human Resources          3300290.3          ASD Human Resources
## ASD Intergvernmtl Relations   792921.8  ASD Intergvernmtl Relations
## ASD Office of Budget Mangmnt 1993052.6 ASD Office of Budget Mangmnt
## ASD Office Of Labor Relation  734024.6 ASD Office Of Labor Relation

Now we know the 12 employees at the Montessori School earn a sum of nearly $650,000. That’s about $54,000 per employee.

Next, we will the two data frames we’ve created – income and Dept_Workers – into a data frame called merged. In order to make this merge work, we had to create the Dept column in the income data frame

merged <- merge(Dept_Workers, income, by = 'Dept')
head(merged)

##                           Dept No. of Employees    income
## 1              Achievement Gap               15 1399700.2
## 2  Alighieri Montessori School               12  649428.2
## 3          ASD Human Resources               53 3300290.3
## 4  ASD Intergvernmtl Relations               11  792921.8
## 5 ASD Office of Budget Mangmnt               25 1993052.6
## 6 ASD Office Of Labor Relation               10  734024.6

Now that we have created this data frame, we can manipulate it as we can any other object in R. For example, let’s figure out average income per employee in each department. We will have to create a new column for the results. Then we will sort the data frame by the data in the new column in descending order.

merged$income.per.employee <- merged$income / merged$'No. of Employees'
mergedPer <- merged[order(-merged$income.per.employee), ]
head(mergedPer)

##                             Dept No. of Employees      income
## 13 Asst Superintendent-Network E                3    392022.2
## 14 Asst Superintendent-Network F                2    242305.9
## 10 Asst Superintendent-Network B                3    359097.6
## 15 Asst Superintendent-Network G                5    584471.3
## 20        Boston Fire Department             1834 209723252.9
## 22      Boston Police Department             3178 357809909.7
##    income.per.employee
## 13            130674.1
## 14            121152.9
## 10            119699.2
## 15            116894.3
## 20            114352.9
## 22            112589.7

We can sort merged by income, if that is more useful to us.

merged <- merged[order(-merged$income), ]
head(merged)

##                         Dept No. of Employees    income
## 22  Boston Police Department             3178 357809910
## 20    Boston Fire Department             1834 209723253
## 119    BPS Special Education              721  39917713
## 64   BPS Facility Management              581  25468073
## 210  Public Works Department              485  24888294
## 23     Boston Public Library              558  24858698
##     income.per.employee
## 22            112589.65
## 20            114352.92
## 119            55364.37
## 64             43834.89
## 210            51316.07
## 23             44549.64

Now, we’re finished and, just like Excel, we can save our work in the fammiliar .csv format.

write.csv(merged, 'C:\\Users\\Steven\\Desktop\\merged.csv')
write.csv(mergedPer, 'C:\\Users\\Steven\\Desktop\\mergedPer.csv')