PICO Working Paper 02

Published

December 11, 2024

PoM Data Wrangling

In this document we cast a glance at the ‘entitygood’ table for the current and previous periods. Also viewed are the:

  • ‘collectiondetails’,
  • ‘entitygoodapproval’,
  • ‘good’,
  • ‘servicerequest’,
  • ‘servicerequestfrequency’,
  • ‘servicerequestreport’,
  • ‘treatmentdetails’,
  • ‘address’,
  • ‘contactnumber’,
  • ‘email’,
  • ‘entity’,
  • objectvalue, and
  • transaction tables.

The objective for viewing the tables is to verify that said tables contain PoM data for the period under review.

Quick glimpse of tables

‘entitygood’ - current period (2024)

Rows: 13,247
Columns: 14
$ id          <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
$ entity      <dbl> 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189…
$ good        <dbl> 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, …
$ units       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkg       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkgold    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ avglifespan <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ totalkg     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ tariff      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ selected    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ dimension   <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ wasteclass  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ period      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ invoice     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
2024 - ‘entitygood’ ‘period’ column set
 [1]      NA  202204 2022001 2021004  202201  202202  202203  202205  202111
[10]  202212  202112  202206 2022002  202207  202208  202209 2022003  202210
[19]  202211 2022004  202301  202302 2023001  202303  202312  202304  202305
[28]  202306 2023002  202307  202308 2023003  202309  202310  202311 2023004
[37] 2021001 2021002 2021003

‘entitygood’ - prior year (2023)

Rows: 14,879
Columns: 14
$ id          <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
$ entity      <dbl> 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189, 189…
$ good        <dbl> 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, …
$ units       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkg       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ avgkgold    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ avglifespan <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ totalkg     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ tariff      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ selected    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ dimension   <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ wasteclass  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ period      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ invoice     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
2023 - ‘entitygood’ ‘period’ column set
 [1]      NA 2021001  202204 2022001 2021004  202201  202202  202203  202205
[10]  202111  202212  202112  202206 2022002  202207  202208  202209 2022003
[19]  202210  202211 2022004  202301  202302 2023001  202303  202312  202304
[28]  202305  202306 2023002  202307  202308 2023003  202309  202310  202311
[37] 2023004 2021002 2021003

‘entitygood’ current and prior comparison analysis

‘totalkg’ maximum value and row location - [2024]
[1] 3537041640
# A tibble: 1 × 14
     id entity  good units  avgkg avgkgold avglifespan   totalkg tariff selected
  <dbl>  <dbl> <dbl> <dbl>  <dbl>    <dbl>       <dbl>     <dbl>  <dbl>    <dbl>
1   361    166   109 62316 56760.       NA          NA    3.54e9    1.5        1
# ℹ 4 more variables: dimension <lgl>, wasteclass <dbl>, period <dbl>,
#   invoice <dbl>
Aggregate ‘totalkg’ - [2024]
[1] 4383202398
‘totalkg’ maximum value and row location - [2023]
[1] 3537041640
# A tibble: 1 × 14
     id entity  good units  avgkg avgkgold avglifespan   totalkg tariff selected
  <dbl>  <dbl> <dbl> <dbl>  <dbl>    <dbl>       <dbl>     <dbl>  <dbl>    <dbl>
1   361    166   109 62316 56760.       NA          NA    3.54e9    1.5        1
# ℹ 4 more variables: dimension <lgl>, wasteclass <dbl>, period <dbl>,
#   invoice <dbl>
Aggregate ‘totalkg’ - [2023]
[1] 4528590649

Tables listed by developer as relevant for collating PoM data

‘collectiondetails’

Rows: 835
Columns: 13
$ id                   <dbl> 150, 151, 152, 153, 154, 155, 156, 157, 158, 159,…
$ servicerequestreport <dbl> 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 78, 7…
$ wasteclassification  <dbl> 63858, 63858, 63858, 63858, 63858, 63858, 63858, …
$ dimension            <dbl> 58227, 58307, 58307, 58307, 58305, 58305, 58305, …
$ collect              <dbl> 10000, 10000, 10000, 10000, 10000, 10000, 10000, …
$ transport            <dbl> 20000, 20000, 20000, 20000, 20000, 20000, 20000, …
$ recycling            <dbl> 30000, 30000, 30000, 30000, 30000, 30000, 30000, …
$ kgs                  <dbl> 930650.00, 13591.78, 3228.84, 945.30, 3866.55, 22…
$ source               <dbl> 63860, 63860, 63860, 63860, 63860, 63860, 63860, …
$ spwaybill            <chr> "S03163", "HB82403P", "MI112218P", "MI111918P", "…
$ productdescription   <chr> "Electric and electronic machines", "Air conditio…
$ collectiondate       <date> 2023-06-30, 2023-06-30, 2023-06-30, 2023-06-30, …
$ producer             <chr> "Nestle (South Africa) (PTY) LTD", "Various", "Va…
‘collectiondetails’ ‘collectiondate’ column
 [1] "2023-06-30" "2024-01-31" "2024-02-21" "2024-02-14" "2024-02-19"
 [6] "2024-02-23" "2024-02-28" "2024-02-08" "2024-02-06" "2024-02-12"
[11] "2024-02-16" "2024-04-15" "2024-04-05" "2024-04-22" "2024-04-16"
[16] "2024-04-29" "2024-04-18" "2024-04-17" "2024-04-11" "2024-04-09"
[21] "2024-04-08" "2024-04-04" "2024-03-31" "2024-04-30" "2024-05-31"
[26] "2024-02-29" "2024-06-30" "2024-07-25" "2024-07-24" "2024-07-19"
[31] "2024-07-23" "2024-07-15" "2024-07-12" "2024-07-09" "2024-07-31"
[36] "2024-08-31" "2024-08-23" "2024-08-22" "2024-08-20" "2024-08-13"
[41] "2024-08-05" "2024-08-15" "2024-08-14" "2024-08-01" "2024-08-28"
[46] "2024-08-16" "2024-08-06" "2024-09-16" "2024-09-10" "2024-09-30"
[51] "2024-09-13" "2024-09-25" "2024-09-12" "2024-09-11" "2024-09-09"
[56] "2024-09-23" "2024-09-20" "2024-09-04" "2024-09-02" "2024-10-31"
[61] "2024-10-10" "2024-10-25" "2024-10-17" "2024-10-11" "2024-10-03"
[66] "2024-10-15" "2024-10-07" "2024-10-09" "2024-10-24" "2024-10-21"
[71] "2024-10-16" "2024-10-14" "2024-10-08" "2024-10-29"
’collectiondetails aggregate kgs
[1] 10579081

‘entitygoodapproval’

Rows: 238
Columns: 13
$ id                <dbl> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, …
$ approvedby        <dbl> 189, 229, 229, 229, 229, 229, 229, 229, 229, 229, 22…
$ approvedon        <dttm> 2022-10-14 10:31:00, 2022-10-14 10:44:47, 2022-10-1…
$ invoiceapprovedby <dbl> 229, NA, 229, 229, NA, NA, NA, NA, NA, NA, 229, 229,…
$ invoiceapprovedon <dttm> 2023-07-13 14:09:56, NA, 2023-07-14 14:16:40, 2023-…
$ entity            <dbl> 73, 43, 107, 107, 10, 24, 130, 7, 3, 24, 73, 73, 73,…
$ period            <dbl> 202207, 2022002, 202208, 202207, 202207, 202207, 202…
$ purchaseorder     <chr> "-", "-", "-", "-", "-", "-", "-", "-", "-", "-", "-…
$ invoiceno         <chr> "R7-22-DAT0", "R2-22-MUS0", "R8-22-DGS1", "R7-22-DGS…
$ usear             <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ usevat            <dbl> 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1…
$ query             <dbl> 41, NA, 60, 59, NA, NA, NA, NA, NA, NA, 34, 35, 36, …
$ status            <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
‘entitygoodapproval’ period column contents
 [1]  202207 2022002  202208 2022001  202206  202112  202201  202202  202203
[10]  202204  202205 2021004  202111  202209 2022003  202210  202211  202212
[19] 2022004  202301  202302 2023001  202312  202303  202304  202305  202306
[28] 2023002  202307  202308 2023003  202309  202310  202311 2023004 2021001
[37] 2021002 2021003

‘good’

Rows: 6,514
Columns: 10
$ id              <dbl> 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, …
$ hscode          <chr> "H841510", "H841581", "H841582", "H841583", "H841821",…
$ description     <chr> "Air conditioners window/wall types, self-contained", …
$ eu6             <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 4, 4, 4, …
$ eu10            <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 10, 10, 5, 1, 1, 1…
$ unu             <chr> "0111", "0111", "0111", "0113", "0108", "0108", "0108"…
$ avgkg           <dbl> 26.70, 26.70, 26.70, 110.10, 40.79, 40.79, 40.79, 44.0…
$ category        <dbl> 58226, 58228, 58229, 58230, 58231, 58232, 58233, 58234…
$ hazardsubstance <dbl> 58225, 58225, 58225, 58225, 58225, 58225, 58225, 58225…
$ dimension       <dbl> 58227, 58227, 58227, 58227, 58227, 58227, 58227, 58227…

‘servicerequest’

Rows: 30
Columns: 11
$ id               <dbl> 18, 21, 22, 23, 24, 25, 26, 32, 33, 34, 35, 36, 37, 3…
$ servicerequestno <chr> "018102023-Levanya GMAIL Trading", "021102023-Levanya…
$ serviceprovider  <dbl> 638, 638, 636, 431, 651, 638, 638, 636, 636, 459, 636…
$ createdby        <dbl> 10439, 10439, 10417, 10362, 10362, 10439, 10417, 1041…
$ createdon        <dttm> 2023-10-05 12:03:00, 2023-10-10 15:58:00, 2023-10-11…
$ fromdate         <dttm> 2023-10-05 12:02:00, 2023-10-14 00:00:00, 2023-10-12…
$ todate           <dttm> 2023-10-21, 2023-10-23, 2023-10-17, 2023-10-24, 2023…
$ services         <chr> "XXXX", "ipads", "Laptops", "1. Cartage & Labour cost…
$ locations        <chr> "XXXX", "Cape town", "Laptops", "Cnr Strandfontein &,…
$ deliverables     <chr> "XXXX", "ipads", "Laptops", "1. Provide a 8 ton vehic…
$ deliverydate     <dttm> 2023-10-06, 2023-10-23, 2023-10-23, 2023-10-25, 2023…

‘servicerequestfrequency’

Rows: 30
Columns: 4
$ id              <dbl> 14, 17, 18, 19, 20, 21, 22, 28, 29, 30, 31, 32, 33, 34…
$ servicerequest  <dbl> 18, 21, 22, 23, 24, 25, 26, 32, 33, 34, 35, 36, 37, 38…
$ reportfrequency <dbl> 63853, 63853, 63852, 63853, 63853, 63852, 63853, 63853…
$ active          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, …

‘servicerequestreport’

Rows: 26
Columns: 9
$ id               <dbl> 78, 79, 80, 81, 82, 83, 85, 88, 89, 93, 94, 95, 96, 9…
$ servicerequest   <dbl> 21, 18, 25, 36, 40, 32, 45, 45, 45, 41, 48, 50, 48, 4…
$ serviceprovider  <dbl> 638, 638, 638, 638, 636, 636, 317, 317, 317, 636, 765…
$ createdby        <dbl> 10474, 10474, 10474, 10474, 10473, 10473, NA, NA, NA,…
$ treatmentdetails <dbl> 67, 68, 69, 70, 71, 72, 74, 77, 78, 82, 83, 84, 85, 8…
$ createdon        <date> 2023-10-12, 2023-10-12, 2023-10-19, 2023-10-26, 2024…
$ reportdate       <date> 2023-10-12, 2023-10-12, 2023-10-19, 2023-10-26, 2024…
$ approved         <dbl> 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,…
$ rejected         <dbl> 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…

‘treatmentdetails’

Rows: 27
Columns: 9
$ id                     <dbl> 67, 68, 69, 70, 71, 72, 74, 75, 77, 78, 82, 83,…
$ servicerequestreport   <dbl> 78, 79, 80, 81, 82, 83, 85, NA, 88, 89, 93, 94,…
$ openingbalance         <dbl> 1116264.32, 1116264.32, 1289301.05, 1289301.05,…
$ refurbished            <dbl> 1000000.00, 1000000.00, 1000000.00, 1000000.00,…
$ recovered              <dbl> 116264.32, 116264.32, 289301.05, 289301.05, 758…
$ export                 <dbl> 50000.00, 50000.00, 189301.05, 189301.05, 7897.…
$ energy                 <dbl> 66264.32, 66264.32, 100000.00, 100000.00, 0.00,…
$ landfill               <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 1131.05, 138.15, …
$ localsecondaryproducts <dbl> 0.0, 0.0, 0.0, 0.0, 750287.6, 0.0, 0.0, 1161388…

‘address’

Rows: 8
Columns: 17
$ id         <dbl> 1, 6, 10, 28, 30, 79, 80, 81
$ streetno   <chr> NA, "82", "10", NA, "3", "11", NA, "CNR"
$ streetname <chr> "ghjg", "Morkels Close Halfway House", "Silverstone Street"…
$ building   <chr> NA, "Capital Hill Commercial Estate", "Raceway Industrial P…
$ postal     <chr> NA, "PO Box 483", NA, "PO Box 1923", NA, NA, "PO Box 10648"…
$ suburb     <chr> "Saron", "Midrand", "Gosforth Park", "Pietermaritzburg", "B…
$ city       <chr> NA, NA, NA, NA, NA, "Cape Town", NA, NA
$ province   <chr> "Eastern Cape", "Gauteng", "Gauteng", "Kwazulu-Natal", "Kwa…
$ country    <dbl> 90, 77, 77, 77, 77, 77, 77, 77
$ postcode   <dbl> 6812, 1685, 1401, 3201, 4399, 7708, 1505, 9585
$ type       <dbl> 26, 28, 28, 28, 28, 28, 28, 28
$ person     <dbl> 1, 105, 170, 315, 317, 692, 312, 312
$ movedate   <date> 0001-01-01, 0001-01-01, 0001-01-01, 0001-01-01, 0001-01-01,…
$ preferred  <dbl> 1, 0, 0, 1, 1, 1, 0, 1
$ dispatch   <dbl> 1, 0, 0, 0, 0, 0, 0, 0
$ latitude   <dbl> 1e-07, NA, NA, NA, NA, NA, NA, NA
$ longitude  <dbl> 1e-07, NA, NA, NA, NA, NA, NA, NA

‘contactnumber’

Rows: 119
Columns: 5
$ id        <dbl> 7, 9, 12, 13, 22, 23, 28, 29, 30, 43, 44, 47, 50, 51, 54, 55…
$ number    <chr> "27114743423", "0837000441", "0826916759", "082691 6759", "0…
$ type      <dbl> 33, 0, 0, 33, 0, 33, 33, 0, 33, 0, 33, 33, 0, 33, 0, 33, 0, …
$ person    <dbl> 8, 13, 20, 20, 36, 36, 1, 42, 42, 67, 67, 105, 132, 132, 154…
$ preferred <dbl> 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, …

‘email’

Rows: 128
Columns: 5
$ id        <dbl> 12, 15, 17, 22, 25, 26, 33, 35, 37, 39, 48, 49, 50, 51, 52, …
$ address   <chr> "wilfred.m@silicon-online.co.za", "russell@temo.co.za", "era…
$ type      <dbl> 145, 0, 0, 0, 145, 0, 0, 145, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ person    <dbl> 8, 13, 20, 36, 1, 42, 67, 105, 132, 154, 195, 199, 202, 205,…
$ preferred <dbl> 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, …

‘entity’

Rows: 307
Columns: 16
$ id         <dbl> 0, 1, 8, 15, 16, 19, 20, 22, 25, 28, 31, 34, 35, 36, 38, 40…
$ externalno <chr> "null", "001", "008", NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ level      <dbl> 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0…
$ type       <dbl> 8294, 8293, 58033, 0, 0, 0, 8293, 0, 0, 0, 0, 0, 0, 8293, 0…
$ title      <dbl> 7145, 7147, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ name       <chr> "Philip", "Aurelia", "Silicon Engineering", "Mark", "Conny"…
$ surname    <chr> "Webb", "Forbes", "Silicon Engineering", "Helfrich", "Motau…
$ identity   <chr> "null", NA, "80/04310/07", NA, NA, NA, NA, NA, NA, NA, NA, …
$ birthdate  <date> 2022-03-10, 2021-10-07, 2021-10-20, 0001-01-01, 0001-01-01…
$ gender     <dbl> 8310, 8311, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ language   <dbl> 8341, 8341, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ status     <dbl> 8300, 58198, 8301, 0, 0, 0, 8300, 0, 0, 0, 0, 0, 0, 0, 0, 8…
$ note       <chr> "null", NA, NA, "Packaging Engineering Manager", "Public Af…
$ passport   <chr> "null", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ hasphoto   <dbl> 0, 1, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ ispaid     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…

‘objectvalue’

Rows: 282
Columns: 7
$ id        <dbl> 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, …
$ trait     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ value     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ instance  <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 19, 2…
$ valuejson <chr> "{\\Children\\: 3, \\EntityProduct\\: 0}", "{\\Cost\\: 50, \…
$ object    <chr> "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", …
$ entity    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …

‘transactions’

This table is an empty set.

‘wasteclass’ and ‘wasteclassification’

[1]    NA 58308 58466 58467 58635
[1]    NA 58466 58308 58467 58635
[1] 63858 63859