EDA SIP

What the data are

## Rows: 12,299
## Columns: 17
## $ task_number            <chr> "1735", "1742", "1971", "2134", "2251", "2283",~
## $ summary                <chr> "Flag RI on SCM Message Summary screen using me~
## $ priority               <dbl> 1, 1, 2, 5, 10, 1, 5, 5, 6, 5, 2, 1, 3, 1, 1, 8~
## $ raised_by_id           <chr> "58", "58", "7", "50", "46", "13", "13", "13", ~
## $ assigned_to_id         <chr> "58", "42", "58", "42", "13", "13", "13", "58",~
## $ authorised_by_id       <chr> "6", "6", "6", "6", "6", "58", "6", "6", "6", "~
## $ status_code            <chr> "FINISHED", "FINISHED", "FINISHED", "FINISHED",~
## $ project_code           <chr> "PC2", "PC2", "PC2", "PC2", "PC2", "PC9", "PC2"~
## $ project_breakdown_code <chr> "PBC42", "PBC21", "PBC75", "PBC42", "PBC21", "P~
## $ category               <chr> "Development", "Development", "Operational", "D~
## $ sub_category           <chr> "Enhancement", "Enhancement", "In House Support~
## $ hours_estimate         <dbl> 14.00, 7.00, 0.70, 0.70, 3.50, 7.00, 7.00, 7.00~
## $ hours_actual           <dbl> 1.75, 7.00, 0.70, 0.70, 3.50, 7.00, 7.00, 7.00,~
## $ developer_id           <chr> "58", "42", "58", "42", "13", "13", "43", "58",~
## $ developer_hours_actual <dbl> 1.75, 7.00, 0.70, 0.70, 3.50, 7.00, 7.00, 7.00,~
## $ task_performance       <dbl> 12.25, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00~
## $ developer_performance  <dbl> 12.25, 0.00, 0.00, 0.00, 0.00, 0.00, NA, 0.00, ~
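
The first rows of the summary suggest that `task_performance` is the estimated hours minus the actual hours (for the first task, 14.00 - 1.75 = 12.25). The sketch below checks that reading against the eight leading values visible in the output above. It is only an assumption about how the column was derived, not something the data dictionary confirms, and `estimation_error` is a name introduced here for illustration.

```r
# First eight values of each column, copied from the glimpse() output above.
hours_estimate   <- c(14.00, 7.00, 0.70, 0.70, 3.50, 7.00, 7.00, 7.00)
hours_actual     <- c(1.75, 7.00, 0.70, 0.70, 3.50, 7.00, 7.00, 7.00)
task_performance <- c(12.25, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00)

# Assumption: the estimation error is the estimate minus the actual hours.
estimation_error <- hours_estimate - hours_actual

# all.equal() compares with a numeric tolerance, which is safer than == for doubles.
matches <- isTRUE(all.equal(estimation_error, task_performance))
print(matches)
```

If the check holds on the full data set as well, a positive value would mean the task was overestimated and a negative value that it was underestimated.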

How is the error in the estimates distributed across the different task subcategories? If you wish, also use the categories in the data.

To explore how the error in the estimates is distributed across task subcategories, we present a point graph (Figure 1). Each point represents the error in an estimate. Subcategories are shown along the vertical axis and the error in the estimates along the horizontal axis. Most points are concentrated between zero (0) and ten (10), while two outliers lie above twenty-five (25). We conclude that the error distribution differs considerably between task subcategories and that in some cases the error is quite high.
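
A point graph of this kind, together with a numeric summary per subcategory, could be produced roughly as follows. This is a minimal base R sketch, not the code behind Figure 1: `tasks` is a hypothetical name for the data frame summarised earlier, both function names are made up for illustration, and the use of `task_performance` as the error measure is an assumption.

```r
# Per-subcategory summary of the estimation error.
# Assumption: `tasks` has the columns sub_category and task_performance.
summarise_error_by_subcategory <- function(tasks) {
  stats <- aggregate(
    task_performance ~ sub_category,
    data = tasks,
    FUN = function(x) c(median = median(x), max = max(x), n = length(x))
  )
  # aggregate() returns the statistics as a matrix column; flatten it.
  data.frame(sub_category = stats$sub_category, stats$task_performance)
}

# Point graph: error on the horizontal axis, one row of points per subcategory.
plot_error_by_subcategory <- function(tasks) {
  stripchart(
    task_performance ~ sub_category,
    data = tasks,
    method = "jitter",  # spread overlapping points so dense regions stay readable
    pch = 16,
    xlab = "Error in estimate",
    ylab = ""
  )
}
```

Calling `summarise_error_by_subcategory(tasks)` next to the plot gives the median, maximum, and number of tasks per subcategory, which makes it easier to confirm where the points concentrate and which subcategories contain the outliers. Rows with a missing error are dropped by the formula interface.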

How do the distributions of (actual) task time compare between the different teams? Are there teams with considerably larger tasks?

Figure 2 shows the differences in the distributions of actual task time between the different teams. This point graph also shows the sample size of each category. There is evidence of a larger difference in results for Development, while Management and Operational have relatively similar results. The graph also shows Operational with values considerably higher than the others. Finally, the graph makes atypical observations, outliers, and extreme values visible for further study.
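
The comparison above can be backed by a small table of sample size and spread per group. The sketch below is one possible way to compute it in base R, under two assumptions: `tasks` is a hypothetical name for the data frame, and `category` is the grouping variable, as in the discussion above (the data has no explicit team column). The function name is also made up for illustration.

```r
# Sample size and distribution summary of actual hours per category.
# Assumption: `tasks` has the columns category and hours_actual.
summarise_hours_by_category <- function(tasks) {
  groups <- split(tasks$hours_actual, tasks$category)
  rows <- lapply(names(groups), function(g) {
    x <- groups[[g]]
    x <- x[!is.na(x)]  # ignore tasks without recorded hours
    data.frame(
      category = g,
      n = length(x),
      median = median(x),
      p90 = unname(quantile(x, 0.9)),
      max = max(x)
    )
  })
  do.call(rbind, rows)
}
```

Comparing `median`, `p90`, and `max` across rows separates a group whose typical task is larger from one that only has a few extreme tasks, and `n` shows how much data each comparison rests on.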

Fernando Tomaz