1 Introduction

This document forms part of a project I am working on to analyse and present primary school admissions and performance data in Surrey and Hampshire in a meaningful way to parents looking to choose a school for their child. The full project, including all source data and scripts, can be found on Github.

1.1 Data Overview

In this analysis, I will explore data relating to primary school admissions in Surrey and Hampshire for academic years beginning September 2014. The data was taken from the UK Government Explore Education Statistics service and contains public sector information licensed under the Open Government Licence v3.0. Full details of the data sources used for this analysis can be found in the README file within the Github repo.

The CSV dataset is located here.

The dataset includes the number of offers made to applicants for both primary and secondary school places to start in September 2021, methods of application and the proportion who received preferred offers of various type. A time series is included going back to 2014, when primary data was collected for the first time following the introduction of the first primary national offer day.

While the dataset does include data for both primary and secondary applications, and for every local authority in England, my analysis will focus on primary applications in Surrey and Hampshire.

Variable names and descriptions for this file are provided below:

Variable name Variable description
admission_numbers Number of admission places available
applications_received Number of applications received
first_preference_offers a. Number of first preference offers
first_preference_offers_percent b. First preference rate
NC_year_admission Year of admission
no_of_preferences Number of preferences allowed
no_offer m. Number made no offer
no_offer_percent n. No offer rate
non_preferred_offer k. Number of non preferred offers
non_preferred_offer_percent l. Non preferred offer rate
offers_to_nonapplicants Total number of offers to non applicants
one_of_the_three_preference_offers g. Number of top three preference offers
one_of_the_three_preference_offers_percent h. Top three preference rate
online_applications Number of applications submitted online
online_apps_percent Percentage of applications submitted online
preferred_school_offer i. Number of preferred offers
preferred_school_offer_percent j. Preferred offer rate
school_phase School phase - collection phase
schools_in_another_la_offer q. Number of offers in another LA
schools_in_another_la_offer_percent r. Other LA offer rate
schools_in_la_offer o. Number of offers in home LA
schools_in_la_offer_percent p. Home LA offer rate
second_preference_offers c. Number of second preference offers
second_preference_offers_percent d. Second preference rate
third_preference_offers e. Number of third preference offers
third_preference_offers_percent f. Third preference rate

1.2 R Setup

I will make use of several libraries during this analysis:

library(tidyverse)
library(janitor)

2 Questions

I am primarily interested in answering the following questions with this data:

  • What proportion of applicants typically receive their first preference place across England as a whole?
  • What proportion of applicants typically receive any of their preferred places across England as a whole?
  • How do Surrey and Hampshire compare against the national (English) average for number of first preference and any preference places offered?
  • Has the number of applications in Surrey and Hampshire gone up or down since 2014?
  • Has the number of available places kept pace with the number of applications in Surrey and Hampshire since 2014?
  • Is there any relationship between the rate of online applications and the rate of first preference offers?

As I work through the data wrangling and exploration steps, I may uncover further questions that I wish to answer with this data.

3 Data Wrangling

3.1 Gather

First I need to read in the data from the CSV file so that I can explore, clean and analyse it further.

r_offers_la <- read_csv("data-in/appsandoffers_2021.csv")

I use the prefix r_ at the start of the variable name for the dataframe I just read in. This denotes the variable as being the ‘raw’ version of the data (i.e. exactly as it appeared at source). When I start to clean and wrangle this data, I will store the manipulated version as a new variable, without the r_ prefix. In this way, I always have a copy of the raw data to refer back to if needed.

3.1.1 Warnings

The output of the read_csv() code in the chunk above resulted in several warnings relating to parsing failures (the warnings are not displayed here but can be called using problems(df). I will print the warnings into a friendly format to see what is going on.

r_offers_la_problems <- problems(r_offers_la) %>% 
  group_by(col, expected, actual) %>% 
  summarise(nrows = n())
Table 2: Warnings displayed when reading in the applications and offers data
col expected actual nrows
no_offer a double c 49
no_offer_percent a double c 49
non_preferred_offer a double c 51
non_preferred_offer_percent a double c 51
offers_to_nonapplicants a double c 11
one_of_the_three_preference_offers a double c 4
one_of_the_three_preference_offers_percent a double c 4
online_applications a double c 3
online_apps_percent a double c 3
preferred_school_offer a double c 4
preferred_school_offer_percent a double c 4
schools_in_another_la_offer a double c 4
schools_in_another_la_offer_percent a double c 4
schools_in_la_offer a double c 4
schools_in_la_offer_percent a double c 4
second_preference_offers a double c 8
second_preference_offers_percent a double c 8
third_preference_offers a double c 10
third_preference_offers_percent a double c 10

Here I can see that there are 19 columns that the read_csv() function determined should be numeric based on analysis of the first 1,000 rows of data. However, in some rows it found a string "c" in these columns (as shown by the value in the actual column above).

From the material that accompanied the data in its original source location I know that some values were suppressed, and replaced with a string denoting the suppression:

Rounding

Primary and secondary suppression was applied in the 2014 dataset. No suppression has been applied from 2015 onwards

Conventions

The following convention is used throughout the underlying data.

  • ‘c’ Suppressed (2014 only)
  • ‘:’ Not applicable.

The suppressed values have been read into the R dataframe as null values (NA). As I work through the analysis I will need to decide if and how to impute these missing values.

3.2 Assess

Now that I have read the data into R, I can spend some time assessing it to find out what the features look like and what might need to be cleaned.

I’ll start by looking at a summary of the numerical columns of the dataset.

r_offers_la %>% 
  select(where(is.numeric)) %>% 
  summary()
##   time_period      old_la_code    admission_numbers applications_received
##  Min.   :201415   Min.   :201.0   Min.   :     0    Min.   :    16       
##  1st Qu.:201516   1st Qu.:334.0   1st Qu.:  2434    1st Qu.:  2185       
##  Median :201718   Median :807.0   Median :  3551    Median :  3162       
##  Mean   :201768   Mean   :614.5   Mean   : 12364    Mean   : 11000       
##  3rd Qu.:201920   3rd Qu.:873.0   3rd Qu.:  6252    3rd Qu.:  5333       
##  Max.   :202122   Max.   :938.0   Max.   :713090    Max.   :641572       
##                   NA's   :176                                            
##  online_applications online_apps_percent first_preference_offers
##  Min.   :     1      Min.   :  0.05692   Min.   :    10         
##  1st Qu.:  1886      1st Qu.: 86.02105   1st Qu.:  1851         
##  Median :  2845      Median : 93.56607   Median :  2687         
##  Mean   :  9972      Mean   : 89.15999   Mean   :  9501         
##  3rd Qu.:  4846      3rd Qu.: 97.18310   3rd Qu.:  4756         
##  Max.   :580938      Max.   :100.00000   Max.   :566836         
##  NA's   :3           NA's   :3                                  
##  first_preference_offers_percent second_preference_offers
##  Min.   :51.40                   Min.   :    0.0         
##  1st Qu.:81.87                   1st Qu.:  131.0         
##  Median :88.43                   Median :  218.5         
##  Mean   :85.82                   Mean   :  771.5         
##  3rd Qu.:92.27                   3rd Qu.:  381.5         
##  Max.   :99.72                   Max.   :54187.0         
##                                  NA's   :8               
##  second_preference_offers_percent third_preference_offers
##  Min.   : 0.000                   Min.   :    0.0        
##  1st Qu.: 4.533                   1st Qu.:   29.0        
##  Median : 6.407                   Median :   64.0        
##  Mean   : 7.204                   Mean   :  250.2        
##  3rd Qu.: 9.056                   3rd Qu.:  132.0        
##  Max.   :36.842                   Max.   :20186.0        
##  NA's   :8                        NA's   :10             
##  third_preference_offers_percent one_of_the_three_preference_offers
##  Min.   : 0.0000                 Min.   :    13                    
##  1st Qu.: 0.9258                 1st Qu.:  2082                    
##  Median : 1.7637                 Median :  3006                    
##  Mean   : 2.3479                 Mean   : 10534                    
##  3rd Qu.: 3.1664                 3rd Qu.:  5176                    
##  Max.   :10.3448                 Max.   :617899                    
##  NA's   :10                      NA's   :4                         
##  one_of_the_three_preference_offers_percent preferred_school_offer
##  Min.   : 74.77                             Min.   :    15        
##  1st Qu.: 93.98                             1st Qu.:  2111        
##  Median : 96.65                             Median :  3048        
##  Mean   : 95.36                             Mean   : 10651        
##  3rd Qu.: 98.04                             3rd Qu.:  5185        
##  Max.   :100.00                             Max.   :621904        
##  NA's   :4                                  NA's   :4             
##  preferred_school_offer_percent non_preferred_offer non_preferred_offer_percent
##  Min.   : 83.80                 Min.   :    0.0     Min.   : 0.000             
##  1st Qu.: 95.52                 1st Qu.:   43.0     1st Qu.: 1.532             
##  Median : 97.14                 Median :   90.0     Median : 2.529             
##  Mean   : 96.56                 Mean   :  330.6     Mean   : 3.050             
##  3rd Qu.: 98.21                 3rd Qu.:  181.0     3rd Qu.: 4.102             
##  Max.   :100.00                 Max.   :27893.0     Max.   :16.199             
##  NA's   :4                      NA's   :51          NA's   :51                 
##     no_offer       no_offer_percent   schools_in_la_offer
##  Min.   :   0.00   Min.   : 0.00000   Min.   :     0     
##  1st Qu.:   0.00   1st Qu.: 0.00000   1st Qu.:  2010     
##  Median :   1.00   Median : 0.03357   Median :  2948     
##  Mean   :  39.12   Mean   : 0.40254   Mean   : 10398     
##  3rd Qu.:   9.00   3rd Qu.: 0.22144   3rd Qu.:  5170     
##  Max.   :3868.00   Max.   :15.06708   Max.   :622789     
##  NA's   :49        NA's   :49         NA's   :4          
##  schools_in_la_offer_percent schools_in_another_la_offer
##  Min.   :  0.00              Min.   :    0              
##  1st Qu.: 91.67              1st Qu.:   71              
##  Median : 95.81              Median :  163              
##  Mean   : 92.98              Mean   :  601              
##  3rd Qu.: 97.91              3rd Qu.:  324              
##  Max.   :100.00              Max.   :48916              
##  NA's   :4                   NA's   :4                  
##  schools_in_another_la_offer_percent offers_to_nonapplicants
##  Min.   :  0.000                     Min.   :   0.00        
##  1st Qu.:  2.091                     1st Qu.:   0.00        
##  Median :  4.190                     Median :   0.00        
##  Mean   :  7.017                     Mean   :  23.37        
##  3rd Qu.:  8.327                     3rd Qu.:   2.00        
##  Max.   :100.000                     Max.   :2494.00        
##  NA's   :4                           NA's   :11

From the summary I observe the following:

  • The timeperiod column denotes an academic year, but is in the format ‘202021’. I would like to mutate this numeric column (a numeric column so that it displays only the start year of the academic year (i.e. ‘202021’ becomes ‘2020’. I could also rename it to year_of_entry for clarity).
  • The old_la_code column contains a categorical variable rather than a true numerical measure, and could benefit from being converted to a string.
  • Several of the columns show a large difference between the mean and the median. Looking at the description of the dataset on the source website, I see that it states: “Figures are provided at national, regional and local authority level.” This means that there are some records that display an aggregation of other records within the dataset. To help my analysis, I will remove these aggregated records (I can easily recreate the aggregations from the remaining granular data if needed).
  • As mentioned earlier, some of the columns have missing data and I will need to decide the best way to handle these missing values for my analyses.
  • online_apps_percent has a large range (min: 0.06%, max: 100%). It might be interesting to see how this measure varies by local authority or over time.
  • The columns containing percentages follow the format 97.6, 88.3, 3.2 etc. I would like to convert these to decimals (i.e. divide each value by 100) to make it easier to work with later.

I can also view a summary of the character columns.

r_offers_la %>% 
  select(where(is.character)) %>% 
  lapply(table)
## $time_identifier
## 
## Academic year 
##          2588 
## 
## $geographic_level
## 
## Local authority        National        Regional 
##            2412              16             160 
## 
## $country_code
## 
## E92000001 
##      2588 
## 
## $country_name
## 
## England 
##    2588 
## 
## $region_code
## 
## E12000001 E12000002 E12000003 E12000004 E12000005 E12000006 E12000008 E12000009 
##       208       384       256       160       240       192       320       252 
## E13000001 E13000002 
##       240       320 
## 
## $region_name
## 
##            East Midlands          East of England             Inner London 
##                      160                      192                      240 
##               North East               North West             Outer London 
##                      208                      384                      320 
##               South East               South West            West Midlands 
##                      320                      252                      240 
## Yorkshire and The Humber 
##                      256 
## 
## $new_la_code
## 
## E06000001 E06000002 E06000003 E06000004 E06000005 E06000006 E06000007 E06000008 
##        16        16        16        16        16        16        16        16 
## E06000009 E06000010 E06000011 E06000012 E06000013 E06000014 E06000015 E06000016 
##        16        16        16        16        16        16        16        16 
## E06000017 E06000018 E06000019 E06000020 E06000021 E06000022 E06000023 E06000024 
##        16        16        16        16        16        16        16        16 
## E06000025 E06000026 E06000027 E06000028 E06000029 E06000030 E06000031 E06000032 
##        16        16        16        12        12        16        16        16 
## E06000033 E06000034 E06000035 E06000036 E06000037 E06000038 E06000039 E06000040 
##        16        16        16        16        16        16        16        16 
## E06000041 E06000042 E06000043 E06000044 E06000045 E06000046 E06000047 E06000048 
##        16        16        16        16        16        16        16        16 
## E06000049 E06000050 E06000051 E06000052 E06000054 E06000055 E06000056 E06000058 
##        16        16        16        16        16        16        16         4 
## E06000059 E08000001 E08000002 E08000003 E08000004 E08000005 E08000006 E08000007 
##         4        16        16        16        16        16        16        16 
## E08000008 E08000009 E08000010 E08000011 E08000012 E08000013 E08000014 E08000015 
##        16        16        16        16        16        16        16        16 
## E08000016 E08000017 E08000018 E08000019 E08000020 E08000021 E08000022 E08000023 
##        16        16        16        16        16        16        16        16 
## E08000024 E08000025 E08000026 E08000027 E08000028 E08000029 E08000030 E08000031 
##        16        16        16        16        16        16        16        16 
## E08000032 E08000033 E08000034 E08000035 E08000036 E09000001 E09000002 E09000003 
##        16        16        16        16        16        16        16        16 
## E09000004 E09000005 E09000006 E09000007 E09000008 E09000009 E09000010 E09000011 
##        16        16        16        16        16        16        16        16 
## E09000012 E09000013 E09000014 E09000015 E09000016 E09000017 E09000018 E09000019 
##        16        16        16        16        16        16        16        16 
## E09000020 E09000021 E09000022 E09000023 E09000024 E09000025 E09000026 E09000027 
##        16        16        16        16        16        16        16        16 
## E09000028 E09000029 E09000030 E09000031 E09000032 E09000033 E10000002 E10000003 
##        16        16        16        16        16        16        16        16 
## E10000006 E10000007 E10000008 E10000009 E10000011 E10000012 E10000013 E10000014 
##        16        16        16        12        16        16        16        16 
## E10000015 E10000016 E10000017 E10000018 E10000019 E10000020 E10000021 E10000023 
##        16        16        16        16        16        16        16        16 
## E10000024 E10000025 E10000027 E10000028 E10000029 E10000030 E10000031 E10000032 
##        16        16        16        16        16        16        16        16 
## E10000034 
##        16 
## 
## $la_name
## 
##               Barking and Dagenham                             Barnet 
##                                 16                                 16 
##                           Barnsley       Bath and North East Somerset 
##                                 16                                 16 
##                            Bedford                             Bexley 
##                                 16                                 16 
##                         Birmingham              Blackburn with Darwen 
##                                 16                                 16 
##                          Blackpool                             Bolton 
##                                 16                                 16 
##                        Bournemouth Bournemouth Christchurch and Poole 
##                                 12                                  4 
##                   Bracknell Forest                           Bradford 
##                                 16                                 16 
##                              Brent                  Brighton and Hove 
##                                 16                                 16 
##                   Bristol, City of                            Bromley 
##                                 16                                 16 
##                    Buckinghamshire                               Bury 
##                                 16                                 16 
##                         Calderdale                     Cambridgeshire 
##                                 16                                 16 
##                             Camden               Central Bedfordshire 
##                                 16                                 16 
##                      Cheshire East          Cheshire West and Chester 
##                                 16                                 16 
##                     City of London                           Cornwall 
##                                 16                                 16 
##                           Coventry                            Croydon 
##                                 16                                 16 
##                            Cumbria                         Darlington 
##                                 16                                 16 
##                              Derby                         Derbyshire 
##                                 16                                 16 
##                              Devon                          Doncaster 
##                                 16                                 16 
##                             Dorset                             Dudley 
##                                 16                                 16 
##                             Durham                             Ealing 
##                                 16                                 16 
##           East Riding of Yorkshire                        East Sussex 
##                                 16                                 16 
##                            Enfield                              Essex 
##                                 16                                 16 
##                          Gateshead                    Gloucestershire 
##                                 16                                 16 
##                          Greenwich                            Hackney 
##                                 16                                 16 
##                             Halton             Hammersmith and Fulham 
##                                 16                                 16 
##                          Hampshire                           Haringey 
##                                 16                                 16 
##                             Harrow                         Hartlepool 
##                                 16                                 16 
##                           Havering                      Herefordshire 
##                                 16                                 16 
##                      Hertfordshire                         Hillingdon 
##                                 16                                 16 
##                           Hounslow                      Isle of Wight 
##                                 16                                 16 
##                          Islington             Kensington and Chelsea 
##                                 16                                 16 
##                               Kent        Kingston Upon Hull, City of 
##                                 16                                 16 
##               Kingston upon Thames                           Kirklees 
##                                 16                                 16 
##                           Knowsley                            Lambeth 
##                                 16                                 16 
##                         Lancashire                              Leeds 
##                                 16                                 16 
##                          Leicester                     Leicestershire 
##                                 16                                 16 
##                           Lewisham                       Lincolnshire 
##                                 16                                 16 
##                          Liverpool                              Luton 
##                                 16                                 16 
##                         Manchester                             Medway 
##                                 16                                 16 
##                             Merton                      Middlesbrough 
##                                 16                                 16 
##                      Milton Keynes                Newcastle upon Tyne 
##                                 16                                 16 
##                             Newham                            Norfolk 
##                                 16                                 16 
##            North East Lincolnshire                 North Lincolnshire 
##                                 16                                 16 
##                     North Somerset                     North Tyneside 
##                                 16                                 16 
##                    North Yorkshire                   Northamptonshire 
##                                 16                                 16 
##                     Northumberland                         Nottingham 
##                                 16                                 16 
##                    Nottinghamshire                             Oldham 
##                                 16                                 16 
##                        Oxfordshire                       Peterborough 
##                                 16                                 16 
##                           Plymouth                              Poole 
##                                 16                                 12 
##                         Portsmouth                            Reading 
##                                 16                                 16 
##                          Redbridge               Redcar and Cleveland 
##                                 16                                 16 
##               Richmond upon Thames                           Rochdale 
##                                 16                                 16 
##                          Rotherham                            Rutland 
##                                 16                                 16 
##                            Salford                           Sandwell 
##                                 16                                 16 
##                             Sefton                          Sheffield 
##                                 16                                 16 
##                         Shropshire                             Slough 
##                                 16                                 16 
##                           Solihull                           Somerset 
##                                 16                                 16 
##              South Gloucestershire                     South Tyneside 
##                                 16                                 16 
##                        Southampton                    Southend-on-Sea 
##                                 16                                 16 
##                          Southwark                         St. Helens 
##                                 16                                 16 
##                      Staffordshire                          Stockport 
##                                 16                                 16 
##                   Stockton-on-Tees                     Stoke-on-Trent 
##                                 16                                 16 
##                            Suffolk                         Sunderland 
##                                 16                                 16 
##                             Surrey                             Sutton 
##                                 16                                 16 
##                            Swindon                           Tameside 
##                                 16                                 16 
##                 Telford and Wrekin                           Thurrock 
##                                 16                                 16 
##                             Torbay                      Tower Hamlets 
##                                 16                                 16 
##                           Trafford                          Wakefield 
##                                 16                                 16 
##                            Walsall                     Waltham Forest 
##                                 16                                 16 
##                         Wandsworth                         Warrington 
##                                 16                                 16 
##                       Warwickshire                     West Berkshire 
##                                 16                                 16 
##                        West Sussex                        Westminster 
##                                 16                                 16 
##                              Wigan                          Wiltshire 
##                                 16                                 16 
##             Windsor and Maidenhead                             Wirral 
##                                 16                                 16 
##                          Wokingham                      Wolverhampton 
##                                 16                                 16 
##                     Worcestershire                               York 
##                                 16                                 16 
## 
## $school_phase
## 
##   Primary Secondary 
##      1294      1294 
## 
## $NC_year_admission
## 
##    :    7    9    R 
##  176 1190   16 1206 
## 
## $no_of_preferences
## 
##    :    3    4    5    6 
##  176 1245  221  180  766

From this I can make some further observations:

  • Several columns appear to be redundant as they contain only a single value. time_identifer, country_code, country_name can probably all be removed since they contain no useful information.
  • NC_year_admission and no_of_preferences both contain ‘not applicable’ values denoted by :. I believe these null values are tied to the records with the regional and national aggregated data and will therefore disappear when I remove those rows.
  • no_of_preferences would be better converted to a numeric column once the ‘not applicable’ records have been removed.

3.3 Clean

I’ll start by fixing the issues I uncovered during my initial assessment. I will save my changes into a new dataframe offers_la so that I have a copy of both the raw data and my edited version.

# convert the timeperiod column to year_of_entry
offers_la <- r_offers_la %>% 
  mutate(time_period = as.character(time_period),
         time_period = str_sub(time_period, 1, 4),
         time_period = as.numeric(time_period)) %>% 
  rename(year_of_entry = time_period)

head(offers_la$year_of_entry)
## [1] 2021 2021 2021 2021 2021 2021
# convert the old_la_code column to character
offers_la <- offers_la %>% 
  mutate(old_la_code = as.character(old_la_code))

tail(offers_la$old_la_code)
## [1] "931" "933" "935" "936" "937" "938"
# remove the aggregated data
offers_la <- offers_la %>% 
  filter(geographic_level == "Local authority")
# convert percentages to decimal
offers_la <- offers_la %>% 
  mutate(across(ends_with("_percent"), ~ .x / 100))

offers_la %>% 
  select(ends_with("_percent")) %>% 
  head()
## # A tibble: 6 x 10
##   online_apps_perc… first_preference_of… second_preference_… third_preference_o…
##               <dbl>                <dbl>               <dbl>               <dbl>
## 1             1                    0.893              0.0357              0     
## 2             0.999                0.778              0.0974              0.0395
## 3             0.996                0.889              0.0664              0.0193
## 4             0.981                0.910              0.0438              0.0189
## 5             0.983                0.796              0.0878              0.0367
## 6             1                    0.785              0.0871              0.0356
## # … with 6 more variables: one_of_the_three_preference_offers_percent <dbl>,
## #   preferred_school_offer_percent <dbl>, non_preferred_offer_percent <dbl>,
## #   no_offer_percent <dbl>, schools_in_la_offer_percent <dbl>,
## #   schools_in_another_la_offer_percent <dbl>
# remove redundant columns
offers_la <- offers_la %>% 
  select(!c(time_identifier, country_code, country_name))
offers_la <- offers_la %>% 
  mutate(no_of_preferences = as.numeric(no_of_preferences))

3.3.1 Handling Missing Values

I have several columns that contain missing values, and I need to decide if and how to impute these values. I’ll start by inspecting the clean dataframe to find out where the missing values are located.

knitr::kable(colSums(is.na(offers_la)),
             col.names = c("count_na"))
count_na
year_of_entry 0
geographic_level 0
region_code 0
region_name 0
new_la_code 0
old_la_code 0
la_name 0
school_phase 0
NC_year_admission 0
admission_numbers 0
applications_received 0
online_applications 3
online_apps_percent 3
no_of_preferences 0
first_preference_offers 0
first_preference_offers_percent 0
second_preference_offers 8
second_preference_offers_percent 8
third_preference_offers 10
third_preference_offers_percent 10
one_of_the_three_preference_offers 4
one_of_the_three_preference_offers_percent 4
preferred_school_offer 4
preferred_school_offer_percent 4
non_preferred_offer 51
non_preferred_offer_percent 51
no_offer 49
no_offer_percent 49
schools_in_la_offer 4
schools_in_la_offer_percent 4
schools_in_another_la_offer 4
schools_in_another_la_offer_percent 4
offers_to_nonapplicants 11

4 Export Clean Data

I will save my clean data to a new CSV file that I can use for visualization in Tableau.

write_csv(offers_la, file="data-out/offers-la.csv", na="")