This document is intended to show how R can be used to upload, manipulate and visualize data as well as sharing it with an audience
In this example, we are using data from a business process management tool in order to analyze what share of applications are incomplete when they are sent from deparment A to department B. If the application is sent back by department B to department A, this is considered as a rework and shows that there was some issue with the application. The input data is from an excel file and is as follows. Each row corresponds to the applications being with one department. Here is a glimpse of the raw data. In reality there are more columns but I just showed the essential ones.
| ApplicationID | StepStart | StepDuration | User |
|---|---|---|---|
| 2016/600402 | 2016-08-30 11:05:21 | 3.25315e+03 | c620 |
| 2016/600402 | 2018-04-16 16:14:32 | 9.50000e-01 | b083 |
| 2016/600402 | 2018-04-17 09:12:06 | 2.15000e+00 | c620 |
| 2016/600402 | 2018-04-17 11:22:06 | 4.50000e-01 | b083 |
| 2016/600402 | 2018-04-17 11:50:12 | 1.13333e+00 | la78 |
| 2016/600402 | 2018-04-17 12:59:02 | 1.66667e-02 | b083 |
| 2017/003252 | 2017-08-03 16:48:31 | 5.00000e-02 | am06 |
| 2017/003252 | 2017-08-03 16:51:27 | 2.36667e+00 | lp05 |
| 2017/003252 | 2017-08-04 11:13:29 | 2.76667e+00 | Syst |
| 2017/003252 | 2017-08-04 11:13:42 | 2.66667e-01 | ca01 |
In order to get to the desired output, we need to transform this data in order to see for each individual application whether it has reworks or not. When we run the code, we get the following table with just one row per application:
| APPLICATION NO | N_RW | R_Time |
|---|---|---|
| 2017/301356 | 1 | 67.716633 |
| 2017/301513 | 1 | 16.683340 |
| 2017/301594 | 2 | 10.750000 |
| 2017/301840 | 2 | 56.116670 |
| 2017/301880 | 0 | 0.000000 |
| 2017/301954 | 2 | 990.466363 |
| 2017/302047 | 2 | 5.949997 |
| 2017/302048 | 2 | 5.933329 |
| 2017/302071 | 3 | 25.600000 |
| 2017/302113 | 2 | 19.400033 |
Then, we need to aggregate this data in order to see week over week what the rework rate is:
| TaskPeriod | Pilot_Running | StepTime | Ncases | CasesWRW | RWRate | AvgNReworks |
|---|---|---|---|---|---|---|
| 2018 51 | Yes | 9.271508 | 31 | 15 | 0.4838710 | 0.5483871 |
| 2018 52 | Yes | 17.670005 | 15 | 6 | 0.4000000 | 0.8666667 |
| 2019 01 | Yes | 26.811896 | 7 | 3 | 0.4285714 | 0.5714286 |
| 2019 02 | Yes | 17.402176 | 23 | 13 | 0.5652174 | 0.8260870 |
| 2019 03 | Yes | 20.946078 | 17 | 9 | 0.5294118 | 0.8235294 |
| 2019 04 | Yes | 7.623114 | 31 | 10 | 0.3225806 | 0.4193548 |
| 2019 05 | Yes | 20.275558 | 15 | 6 | 0.4000000 | 0.5333333 |
| 2019 06 | Yes | 29.423796 | 28 | 16 | 0.5714286 | 0.9642857 |
| 2019 07 | Yes | 34.658628 | 29 | 15 | 0.5172414 | 0.7241379 |
| 2019 08 | Yes | 20.796433 | 28 | 13 | 0.4642857 | 0.7500000 |
Which can then be used to plot the data to visualise the evolution
We can see that for the weeks where the pilot is running, the rework rate is a little bit better than before.