Rows: 13,247
Columns: 14
$ id <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
$ entity <dbl> 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189…
$ good <dbl> 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, …
$ units <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkg <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkgold <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ avglifespan <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ totalkg <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ tariff <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ selected <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ dimension <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ wasteclass <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ period <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ invoice <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
PICO Working Paper 02
PoM Data Wrangling
In this document we cast a glance at the ‘entitygood’ table for the current and previous periods. Also viewed are the:
- ‘collectiondetails’,
- ‘entitygoodapproval’,
- ‘good’,
- ‘servicerequest’,
- ‘servicerequestfrequency’,
- ‘servicerequestreport’,
- ‘treatmentdetails’,
- ‘address’,
- ‘contactnumber’,
- ‘email’,
- ‘entity’,
- objectvalue, and
- transaction tables.
The objective for viewing the tables is to verify that said tables contain PoM data for the period under review.
Quick glimpse of tables
‘entitygood’ - current period (2024)
2024 - ‘entitygood’ ‘period’ column set
[1] NA 202204 2022001 2021004 202201 202202 202203 202205 202111
[10] 202212 202112 202206 2022002 202207 202208 202209 2022003 202210
[19] 202211 2022004 202301 202302 2023001 202303 202312 202304 202305
[28] 202306 2023002 202307 202308 2023003 202309 202310 202311 2023004
[37] 2021001 2021002 2021003
‘entitygood’ - prior year (2023)
Rows: 14,879
Columns: 14
$ id <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
$ entity <dbl> 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189…
$ good <dbl> 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, …
$ units <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkg <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkgold <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ avglifespan <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ totalkg <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ tariff <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ selected <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ dimension <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ wasteclass <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ period <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ invoice <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
2023 - ‘entitygood’ ‘period’ column set
[1] NA 2021001 202204 2022001 2021004 202201 202202 202203 202205
[10] 202111 202212 202112 202206 2022002 202207 202208 202209 2022003
[19] 202210 202211 2022004 202301 202302 2023001 202303 202312 202304
[28] 202305 202306 2023002 202307 202308 2023003 202309 202310 202311
[37] 2023004 2021002 2021003
‘entitygood’ current and prior comparison analysis
‘totalkg’ maximum value and row location - [2024]
[1] 3537041640
# A tibble: 1 × 14
id entity good units avgkg avgkgold avglifespan totalkg tariff selected
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 361 166 109 62316 56760. NA NA 3.54e9 1.5 1
# ℹ 4 more variables: dimension <lgl>, wasteclass <dbl>, period <dbl>,
# invoice <dbl>
Aggregate ‘totalkg’ - [2024]
[1] 4383202398
‘totalkg’ maximum value and row location - [2023]
[1] 3537041640
# A tibble: 1 × 14
id entity good units avgkg avgkgold avglifespan totalkg tariff selected
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 361 166 109 62316 56760. NA NA 3.54e9 1.5 1
# ℹ 4 more variables: dimension <lgl>, wasteclass <dbl>, period <dbl>,
# invoice <dbl>
Aggregate ‘totalkg’ - [2023]
[1] 4528590649
Tables listed by developer as relevant for collating PoM data
‘collectiondetails’
Rows: 835
Columns: 13
$ id <dbl> 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,…
$ servicerequestreport <dbl> 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 7…
$ wasteclassification <dbl> 63858, 63858, 63858, 63858, 63858, 63858, 63858, …
$ dimension <dbl> 58227, 58307, 58307, 58307, 58305, 58305, 58305, …
$ collect <dbl> 10000, 10000, 10000, 10000, 10000, 10000, 10000, …
$ transport <dbl> 20000, 20000, 20000, 20000, 20000, 20000, 20000, …
$ recycling <dbl> 30000, 30000, 30000, 30000, 30000, 30000, 30000, …
$ kgs <dbl> 930650.00, 13591.78, 3228.84, 945.30, 3866.55, 22…
$ source <dbl> 63860, 63860, 63860, 63860, 63860, 63860, 63860, …
$ spwaybill <chr> "S03163", "HB82403P", "MI112218P", "MI111918P", "…
$ productdescription <chr> "Electric and electronic machines", "Air conditio…
$ collectiondate <date> 2023-06-30, 2023-06-30, 2023-06-30, 2023-06-30, …
$ producer <chr> "Nestle (South Africa) (PTY) LTD", "Various", "Va…
‘collectiondetails’ ‘collectiondate’ column
[1] "2023-06-30" "2024-01-31" "2024-02-21" "2024-02-14" "2024-02-19"
[6] "2024-02-23" "2024-02-28" "2024-02-08" "2024-02-06" "2024-02-12"
[11] "2024-02-16" "2024-04-15" "2024-04-05" "2024-04-22" "2024-04-16"
[16] "2024-04-29" "2024-04-18" "2024-04-17" "2024-04-11" "2024-04-09"
[21] "2024-04-08" "2024-04-04" "2024-03-31" "2024-04-30" "2024-05-31"
[26] "2024-02-29" "2024-06-30" "2024-07-25" "2024-07-24" "2024-07-19"
[31] "2024-07-23" "2024-07-15" "2024-07-12" "2024-07-09" "2024-07-31"
[36] "2024-08-31" "2024-08-23" "2024-08-22" "2024-08-20" "2024-08-13"
[41] "2024-08-05" "2024-08-15" "2024-08-14" "2024-08-01" "2024-08-28"
[46] "2024-08-16" "2024-08-06" "2024-09-16" "2024-09-10" "2024-09-30"
[51] "2024-09-13" "2024-09-25" "2024-09-12" "2024-09-11" "2024-09-09"
[56] "2024-09-23" "2024-09-20" "2024-09-04" "2024-09-02" "2024-10-31"
[61] "2024-10-10" "2024-10-25" "2024-10-17" "2024-10-11" "2024-10-03"
[66] "2024-10-15" "2024-10-07" "2024-10-09" "2024-10-24" "2024-10-21"
[71] "2024-10-16" "2024-10-14" "2024-10-08" "2024-10-29"
’collectiondetails aggregate kgs
[1] 10579081
‘entitygoodapproval’
Rows: 238
Columns: 13
$ id <dbl> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, …
$ approvedby <dbl> 189, 229, 229, 229, 229, 229, 229, 229, 229, 229, 22…
$ approvedon <dttm> 2022-10-14 10:31:00, 2022-10-14 10:44:47, 2022-10-1…
$ invoiceapprovedby <dbl> 229, NA, 229, 229, NA, NA, NA, NA, NA, NA, 229, 229,…
$ invoiceapprovedon <dttm> 2023-07-13 14:09:56, NA, 2023-07-14 14:16:40, 2023-…
$ entity <dbl> 73, 43, 107, 107, 10, 24, 130, 7, 3, 24, 73, 73, 73,…
$ period <dbl> 202207, 2022002, 202208, 202207, 202207, 202207, 202…
$ purchaseorder <chr> "-", "-", "-", "-", "-", "-", "-", "-", "-", "-", "-…
$ invoiceno <chr> "R7-22-DAT0", "R2-22-MUS0", "R8-22-DGS1", "R7-22-DGS…
$ usear <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ usevat <dbl> 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1…
$ query <dbl> 41, NA, 60, 59, NA, NA, NA, NA, NA, NA, 34, 35, 36, …
$ status <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
‘entitygoodapproval’ period column contents
[1] 202207 2022002 202208 2022001 202206 202112 202201 202202 202203
[10] 202204 202205 2021004 202111 202209 2022003 202210 202211 202212
[19] 2022004 202301 202302 2023001 202312 202303 202304 202305 202306
[28] 2023002 202307 202308 2023003 202309 202310 202311 2023004 2021001
[37] 2021002 2021003
‘good’
Rows: 6,514
Columns: 10
$ id <dbl> 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, …
$ hscode <chr> "H841510", "H841581", "H841582", "H841583", "H841821",…
$ description <chr> "Air conditioners window/wall types, self-contained", …
$ eu6 <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 4, 4, 4, …
$ eu10 <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 10, 10, 5, 1, 1, 1…
$ unu <chr> "0111", "0111", "0111", "0113", "0108", "0108", "0108"…
$ avgkg <dbl> 26.70, 26.70, 26.70, 110.10, 40.79, 40.79, 40.79, 44.0…
$ category <dbl> 58226, 58228, 58229, 58230, 58231, 58232, 58233, 58234…
$ hazardsubstance <dbl> 58225, 58225, 58225, 58225, 58225, 58225, 58225, 58225…
$ dimension <dbl> 58227, 58227, 58227, 58227, 58227, 58227, 58227, 58227…
‘servicerequest’
Rows: 30
Columns: 11
$ id <dbl> 18, 21, 22, 23, 24, 25, 26, 32, 33, 34, 35, 36, 37, 3…
$ servicerequestno <chr> "018102023-Levanya GMAIL Trading", "021102023-Levanya…
$ serviceprovider <dbl> 638, 638, 636, 431, 651, 638, 638, 636, 636, 459, 636…
$ createdby <dbl> 10439, 10439, 10417, 10362, 10362, 10439, 10417, 1041…
$ createdon <dttm> 2023-10-05 12:03:00, 2023-10-10 15:58:00, 2023-10-11…
$ fromdate <dttm> 2023-10-05 12:02:00, 2023-10-14 00:00:00, 2023-10-12…
$ todate <dttm> 2023-10-21, 2023-10-23, 2023-10-17, 2023-10-24, 2023…
$ services <chr> "XXXX", "ipads", "Laptops", "1. Cartage & Labour cost…
$ locations <chr> "XXXX", "Cape town", "Laptops", "Cnr Strandfontein &,…
$ deliverables <chr> "XXXX", "ipads", "Laptops", "1. Provide a 8 ton vehic…
$ deliverydate <dttm> 2023-10-06, 2023-10-23, 2023-10-23, 2023-10-25, 2023…
‘servicerequestfrequency’
Rows: 30
Columns: 4
$ id <dbl> 14, 17, 18, 19, 20, 21, 22, 28, 29, 30, 31, 32, 33, 34…
$ servicerequest <dbl> 18, 21, 22, 23, 24, 25, 26, 32, 33, 34, 35, 36, 37, 38…
$ reportfrequency <dbl> 63853, 63853, 63852, 63853, 63853, 63852, 63853, 63853…
$ active <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, …
‘servicerequestreport’
Rows: 26
Columns: 9
$ id <dbl> 78, 79, 80, 81, 82, 83, 85, 88, 89, 93, 94, 95, 96, 9…
$ servicerequest <dbl> 21, 18, 25, 36, 40, 32, 45, 45, 45, 41, 48, 50, 48, 4…
$ serviceprovider <dbl> 638, 638, 638, 638, 636, 636, 317, 317, 317, 636, 765…
$ createdby <dbl> 10474, 10474, 10474, 10474, 10473, 10473, NA, NA, NA,…
$ treatmentdetails <dbl> 67, 68, 69, 70, 71, 72, 74, 77, 78, 82, 83, 84, 85, 8…
$ createdon <date> 2023-10-12, 2023-10-12, 2023-10-19, 2023-10-26, 2024…
$ reportdate <date> 2023-10-12, 2023-10-12, 2023-10-19, 2023-10-26, 2024…
$ approved <dbl> 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,…
$ rejected <dbl> 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
‘treatmentdetails’
Rows: 27
Columns: 9
$ id <dbl> 67, 68, 69, 70, 71, 72, 74, 75, 77, 78, 82, 83,…
$ servicerequestreport <dbl> 78, 79, 80, 81, 82, 83, 85, NA, 88, 89, 93, 94,…
$ openingbalance <dbl> 1116264.32, 1116264.32, 1289301.05, 1289301.05,…
$ refurbished <dbl> 1000000.00, 1000000.00, 1000000.00, 1000000.00,…
$ recovered <dbl> 116264.32, 116264.32, 289301.05, 289301.05, 758…
$ export <dbl> 50000.00, 50000.00, 189301.05, 189301.05, 7897.…
$ energy <dbl> 66264.32, 66264.32, 100000.00, 100000.00, 0.00,…
$ landfill <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 1131.05, 138.15, …
$ localsecondaryproducts <dbl> 0.0, 0.0, 0.0, 0.0, 750287.6, 0.0, 0.0, 1161388…
‘address’
Rows: 8
Columns: 17
$ id <dbl> 1, 6, 10, 28, 30, 79, 80, 81
$ streetno <chr> NA, "82", "10", NA, "3", "11", NA, "CNR"
$ streetname <chr> "ghjg", "Morkels Close Halfway House", "Silverstone Street"…
$ building <chr> NA, "Capital Hill Commercial Estate", "Raceway Industrial P…
$ postal <chr> NA, "PO Box 483", NA, "PO Box 1923", NA, NA, "PO Box 10648"…
$ suburb <chr> "Saron", "Midrand", "Gosforth Park", "Pietermaritzburg", "B…
$ city <chr> NA, NA, NA, NA, NA, "Cape Town", NA, NA
$ province <chr> "Eastern Cape", "Gauteng", "Gauteng", "Kwazulu-Natal", "Kwa…
$ country <dbl> 90, 77, 77, 77, 77, 77, 77, 77
$ postcode <dbl> 6812, 1685, 1401, 3201, 4399, 7708, 1505, 9585
$ type <dbl> 26, 28, 28, 28, 28, 28, 28, 28
$ person <dbl> 1, 105, 170, 315, 317, 692, 312, 312
$ movedate <date> 0001-01-01, 0001-01-01, 0001-01-01, 0001-01-01, 0001-01-01,…
$ preferred <dbl> 1, 0, 0, 1, 1, 1, 0, 1
$ dispatch <dbl> 1, 0, 0, 0, 0, 0, 0, 0
$ latitude <dbl> 1e-07, NA, NA, NA, NA, NA, NA, NA
$ longitude <dbl> 1e-07, NA, NA, NA, NA, NA, NA, NA
‘contactnumber’
Rows: 119
Columns: 5
$ id <dbl> 7, 9, 12, 13, 22, 23, 28, 29, 30, 43, 44, 47, 50, 51, 54, 55…
$ number <chr> "27114743423", "0837000441", "0826916759", "082691 6759", "0…
$ type <dbl> 33, 0, 0, 33, 0, 33, 33, 0, 33, 0, 33, 33, 0, 33, 0, 33, 0, …
$ person <dbl> 8, 13, 20, 20, 36, 36, 1, 42, 42, 67, 67, 105, 132, 132, 154…
$ preferred <dbl> 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, …
‘email’
Rows: 128
Columns: 5
$ id <dbl> 12, 15, 17, 22, 25, 26, 33, 35, 37, 39, 48, 49, 50, 51, 52, …
$ address <chr> "wilfred.m@silicon-online.co.za", "russell@temo.co.za", "era…
$ type <dbl> 145, 0, 0, 0, 145, 0, 0, 145, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ person <dbl> 8, 13, 20, 36, 1, 42, 67, 105, 132, 154, 195, 199, 202, 205,…
$ preferred <dbl> 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, …
‘entity’
Rows: 307
Columns: 16
$ id <dbl> 0, 1, 8, 15, 16, 19, 20, 22, 25, 28, 31, 34, 35, 36, 38, 40…
$ externalno <chr> "null", "001", "008", NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ level <dbl> 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0…
$ type <dbl> 8294, 8293, 58033, 0, 0, 0, 8293, 0, 0, 0, 0, 0, 0, 8293, 0…
$ title <dbl> 7145, 7147, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ name <chr> "Philip", "Aurelia", "Silicon Engineering", "Mark", "Conny"…
$ surname <chr> "Webb", "Forbes", "Silicon Engineering", "Helfrich", "Motau…
$ identity <chr> "null", NA, "80/04310/07", NA, NA, NA, NA, NA, NA, NA, NA, …
$ birthdate <date> 2022-03-10, 2021-10-07, 2021-10-20, 0001-01-01, 0001-01-01…
$ gender <dbl> 8310, 8311, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ language <dbl> 8341, 8341, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ status <dbl> 8300, 58198, 8301, 0, 0, 0, 8300, 0, 0, 0, 0, 0, 0, 0, 0, 8…
$ note <chr> "null", NA, NA, "Packaging Engineering Manager", "Public Af…
$ passport <chr> "null", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ hasphoto <dbl> 0, 1, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ ispaid <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
‘objectvalue’
Rows: 282
Columns: 7
$ id <dbl> 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, …
$ trait <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ value <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ instance <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 19, 2…
$ valuejson <chr> "{\\Children\\: 3, \\EntityProduct\\: 0}", "{\\Cost\\: 50, \…
$ object <chr> "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", …
$ entity <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
‘transactions’
This table is an empty set.
‘wasteclass’ and ‘wasteclassification’
[1] NA 58308 58466 58467 58635
[1] NA 58466 58308 58467 58635
[1] 63858 63859