1 Report Purpose

The purpose of this report is to provide you, the analyst, with an estimate of a project’s DevOps change cycle time. This metric reflects whether a project’s development team(s) can respond with a predictable cadence to change requests from teams that are using, maintaining, supporting, deploying, or providing requirements for the developed product(s).

2 Benefits

Measuring a project’s change cycle time is important in establishing a consistent software production life cycle. The change cycle time is also a key indicator of a product team’s progress in achieving a DevOps culture. A project team that establishes a consistent change cycle time will be better able to support, and continually improve support for, teams dependent on its product(s). A project with a consistent change cycle time is more predictable, which makes the entire product life cycle more manageable.

Knowing its change cycle time will help a team to:

3 Fundamental Analytic Parameter Settings Used In This Report

The following parameter settings were used:


This report examines activities between 2009-12-16 and 2019-06-06.


Project Personnel Included in This Analysis
Adam Pickeral Chris Psaltis Jay Kumar Oleg Suharev
Amir Elaguizy Christian Paredes Jed Smith Olivier Grisel
Andre Merzky Cyrille Verrier Jeff Moody Patrick Armstrong
Andrew Montalenti Daemian Mack Jerry Chen Paul Oswald
Andrew Udvare DaeMyung Kang Jim Divine Pedro Romano
Andrey Zhuchkov Dag Stenstad joe miller Perry Zou
Arfrever Frehtes Taifersar Arahesis Dan Di Spaltro John Bresnahan Peter Danecek
Arthur Lutz Dave King John Carr Philip Schwartz
ASF GitHub Bot daveb Jon Chen Philipp Strube
Aymeric Barantal David Bryson Joseph Hall Rashit Azizbaev
Ben Agricola David LaBissoniere Jouke Waleson Richard Bross
Ben Meng Dilip Patharachalam Juan Carlos Moreno Rick Wright
Benno Rice Dinesh Bhoopathy K Jonathan Harker robert
Benoît Canet Emanuele Rocca Kevin McDonald Robert Chiniquy
Bernard Kerckenaere Eric Johnson Kyoung-chan Lee Roman Bogorodskiy
Bernard Metzler Erich Eckner L. Schaub Rudolf J Streif
Bill Woodward Erinn Looney-Triggs Lex Russell Keith-Magee
Birk Nilson Eugene Polyan Loic Lambiel sam song
Bob Thompson Florent Cayré Lucy Mendel Sebastien Goasguen
Bojan Mihelac Frederic Michaud Magnus Andersson Sebb
Borja Martin Gabriel Reid Mahendra M Sengor Kusturica
Gary Wilson Marcin Kuzminski Shawn Smith
Brad Morgan Gavin McCance Mario Loria Simon Delamare
Brian Curtin Geoff Greer Mark Nottingham Stefan Friesel
Brian DeGeeter Gilly Barr Markos Gogoulos Stephane Roy
Brian McDaniel Glyph Lefkowitz Matt Black Tim Fletcher
Brian Mingus Grischa Meyer Matt Perry Tomaz Muraus
Bruno Mahé Guillaume ZITTA Michael Bennett Torsten Schlabach
Caio Romão Hugh Esco Michael Farrell Trevor Pounds
Carlo Hutson Betts Michael Mior Trevor Powell (RMS)
Carlos Reategui Ilgiz Islamgulov Michal Galet Wiktor Kołodziej
charles walker Ivan Michel Samia Xavier Barbosa
Chris Adams James E. Blair Miguel Jacq Zak Estrada
Chris Clarke Jason Gionta Mike Nerone Anthony Shaw
Chris DeRamus Jason Johnson Neil Wilson Paul Querna
Chris Gilmer Jaume Devesa Nick Bailey Rahul Ranjan
Chris Johnson Jay Doane Noah Kantrowitz sai krishna


Note that there are many more parameters that are used in tuning this report’s technical calculations. These parameters are described in appendix section 7.1.

4 Results

4.1 Fundamental Analysis

Based on a matched cadence between the following pair(s) of Jira and Git metrics, the LIBCLOUD project’s change cycle time is:

Primary:
Estimated Change Cycle Time: 3 months

Secondary estimates are listed below. Secondary estimates usually occur because change cycle times recur cyclically. For example, a two-week change cycle time will also recur at approximately four weeks, six weeks, eight weeks, and so on.

Estimated Change Cycle Time: 10 months

4.2 Advanced Analysis

The analysis in this report uses a cross-correlation technique from signal processing to compare periods of work increase, decrease, and stasis between Jira and the git-like tool to characterize the time offset between Jira and the git-like tool. This time offset is an estimate of the project’s change cycle time.

The stronger the project’s implementation of DevOps techniques and, in particular, the more the project uses its toolchain to support implementation, the stronger the change cycle time estimate based on tool usage data.

The data plots below show the most likely estimates of the DevOps change cycle time for the project.
The “y-axis” shows how strong a relationship exists between the “signal” from the Git-like tool metric being used, as documented in the data plot title, versus the “signal” from the Jira metric, also documented in the title.

  • Negative values on the y-axis, with -1 being the minimum possible value, correspond to a negative relationship: as one metric rises, the other falls.
  • A value of 0 on the y-axis means that the metrics have no apparent relationship.
  • Positive values on the y-axis, with 1 being the maximum possible value, correspond to a positive relationship: as one metric rises or falls, the other metric also rises or falls.

The “x-axis” shows the offset in time between the two metrics. For instance, if the plot shows metrics measured on a daily basis, a value of 14 on the x-axis means that the events generating the Jira metric occurred 14 days prior to the events generating the Git-like tool metric.

The coloring of data points corresponds to the quality of the underlying data as calculated based on the number of matching data points. The higher the data quality, the more likely the calculated DevOps change cycle time reflects the project’s genuine behavior.

We are most interested in the offset values on the x-axis that correspond to the largest non-negative values on the y-axis and that have high data quality. The points emphasized with arrows and labels are the points where the y-axis value is highest in comparison to all of the values around it; these points are the best estimates for the project’s change cycle time. Furthermore, smaller values on the x-axis are preferred to larger values, since events occurring closer in time are (usually) more likely estimates. Larger x-axis values may just be cyclic repetitions of earlier strong values. For example, a relationship that occurs at 45 days would also be observed at 90 days, 135 days, and so on.

The graphs shown and the points highlighted for the listed time intervals provide the most likely estimate for the project’s DevOps Change Cycle Time.
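
The peak selection described above can be sketched as follows. This is a minimal illustration, not the report's actual algorithm: the function names `local_peaks` and `drop_harmonics`, the `tolerance` parameter, and the sample correlation values are all hypothetical, and the data-quality filter is left out for brevity.

```python
def local_peaks(correlations):
    """Return (lag, value) pairs whose value is non-negative and exceeds both neighbours."""
    peaks = []
    for lag in range(1, len(correlations) - 1):
        value = correlations[lag]
        if value >= 0 and value > correlations[lag - 1] and value > correlations[lag + 1]:
            peaks.append((lag, value))
    return peaks


def drop_harmonics(peaks, tolerance=1):
    """Discard peaks whose lag is (nearly) an integer multiple of an earlier kept lag."""
    kept = []
    for lag, value in sorted(peaks):  # smallest lag first, since earlier lags are preferred
        is_repeat = any(
            round(lag / base) >= 2 and abs(lag - round(lag / base) * base) <= tolerance
            for base, _ in kept
        )
        if not is_repeat:
            kept.append((lag, value))
    return kept


# Hypothetical correlation values indexed by lag in days (index 0 is lag 0).
corr = [0.0] * 140
corr[45], corr[90], corr[135] = 0.62, 0.55, 0.41
corr[44] = corr[46] = 0.30

candidates = drop_harmonics(local_peaks(corr))
print(candidates)
```

With this made-up data, the peaks at 90 and 135 days are treated as repetitions of the 45-day peak, so only the 45-day estimate survives.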

5 Recommendations

Any improvement in the change cycle time must be done in the context of what the customer wants. Begin by determining the rate of change at which the customer(s) can accept changes. (This customer-accepted rate-of-change measure is called the “Takt Time”. Takt is a German word for drum beat or cadence.)

6 Supplementary Technical Information

Projects implementing DevOps practices will have a strong cadence at which work is performed, with project changes beginning with a ticket/epic/story/task change request and, through the project’s life cycle, resulting in software code changes, testing, and ultimately deployment to a production environment. In order to determine the project’s DevOps change cycle time, this report uses data collected from events recorded from the project’s use of Jira and git-like tools. As project personnel perform work, they make changes to the tickets/epics/stories/tasks with which they are associated in Jira. Likewise, as personnel perform work, components that are controlled by the project using a git-like tool (for example, GitHub, GitLab, or git itself) are changed. The analysis in this report is based on the assertion that as project personnel perform work, the level of activity recorded in Jira and git-like tools will rise and fall in step with the project’s DevOps change cycle time. Work will typically begin by being documented in Jira. As project team members make changes to fulfill the ticket/epic/story/task from Jira, they will make changes to software code and associated artifacts and update the git-like tool with those changes. In a typical project life cycle for any project (agile, iterative, waterfall, etc.), the work being done is a “wave” that begins in Jira and flows through the git-like tool. The frequency with which that work wave crests or ebbs is the project’s change cycle time.

The analysis uses a variety of metrics derived from the tool transaction data for comparison. Depending on changes being made, some metrics may provide better insight than others. The report provides all of the comparisons, but highlights the comparisons that provide the clearest estimates.

Projects that are not implementing DevOps practices may not have a strong change cycle time indicator, since non-DevOps practices do not necessarily have a consistent change cycle cadence. This is neither good nor bad, but just a reflection of the project’s life cycle.

Projects that use tools in a haphazard or pathological manner may have an unclear or counter-intuitive change cycle time estimate. For example, if a project typically makes software code changes, deploys the code, and only then updates Jira with task descriptions, then the estimated change cycle time will not make sense. This particular report does not analyze the project’s underlying tool usage consistency. Other available analytic reports focus on addressing this area.

6.1 Metrics Used

This report calculates metrics from project transaction data generated either through automation or by project personnel with Jira or Git-like tools. The metrics used are:

Git-Like Tool Metrics
Metric Description
Number Of Commits Rate The number of commits in the git-like tool repository, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Number Of File Changes Rate The number of files changed in the git-like tool repository, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Number Of Line Updates Rate The number of lines changed (added, deleted, edited) in the git-like tool repository, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Number Of Line Additions Rate The number of lines added in the git-like tool repository, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Number Of Line Deletions Rate The number of lines deleted in the git-like tool repository, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Jira Metrics
Metric Description
Number Changes Contributor Rate The number of Jira ticket/epic/story/task changes made per all active project members, collating the data monthly/weekly/daily as a time series.
Number Creations Contributor Rate The number of Jira ticket/epic/story/task creations made per all active project members, collating the data monthly/weekly/daily as a time series.
Number Actions Contributor Rate The number of Jira ticket/epic/story/task actions (creations or changes) made per all active project members, collating the data monthly/weekly/daily as a time series.
Number Changes Changer Rate The number of Jira ticket/epic/story/task changes made per each project member listed as making a change, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Number Of Changes Assignee Rate The number of Jira ticket/epic/story/task changes made per project members listed as assigned a change, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Number Of Creations Reporter Rate The number of Jira ticket/epic/story/task creations made per project members listed as tickets/epic/story/task reporter, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Number Of Resolutions Resolver Rate The number of Jira ticket/epic/story/task resolutions made by each resolver, collating the data monthly/weekly/daily as a time series.
Number Of Resolutions Assignee Rate The number of Jira ticket/epic/story/task resolutions made per project members listed as tickets/epic/story/task resolvers, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Total Days Worked Resolve Assignee Rate The total calendar days worked to resolve a ticket/epic/story/task per project members listed as the assignee, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Total Days Worked Resolve Resolver Rate The number of days worked per Jira ticket/epic/story/task resolution per resolver, collating the data monthly/weekly/daily as a time series.
Mean Days Worked Resolve Assignee Rate The average (arithmetic mean) calendar days worked to resolve a ticket/epic/story/task per project members listed as the assignee, collating the data monthly/weekly/daily as a time series, divided by the number of project contributors.
Mean Days Worked Resolve Resolver Rate The average (mean) number of days worked per Jira ticket/epic/story/task resolution per resolver, collating the data monthly/weekly/daily as a time series.
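
As an illustration of how a rate metric such as ‘Number Of Commits Rate’ might be collated, the sketch below buckets commits by month and divides each count by the number of contributors. It assumes that "number of project contributors" means the distinct authors seen in the data; the function name `monthly_commit_rate` and the sample commits are hypothetical.

```python
from collections import Counter
from datetime import date


def monthly_commit_rate(commits):
    """commits: list of (commit_date, author). Returns {(year, month): commits per contributor}."""
    # Assumption: the divisor is the number of distinct authors across the whole data set.
    contributors = {author for _, author in commits}
    per_month = Counter((d.year, d.month) for d, _ in commits)
    return {month: count / len(contributors) for month, count in sorted(per_month.items())}


# Hypothetical commit log.
commits = [
    (date(2019, 1, 3), "alice"),
    (date(2019, 1, 17), "bob"),
    (date(2019, 1, 29), "alice"),
    (date(2019, 2, 8), "bob"),
]
rates = monthly_commit_rate(commits)
print(rates)
```

The same bucketing could be done by week or by day to produce the other time-series intervals.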

6.1.1 Metrics With Adequate Supporting Data

The analytic techniques used in this report are likely to give incorrect results if the data being analyzed has more values that are the same (tied) versus unique (not tied). In particular, if many data values are absent, the analytic techniques must interpolate these absent values, which often results in a larger number of ties.

For this report, a metric is labeled as having an adequate number of non-tied and present values if fewer than 50% of its values are tied or absent.
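
A minimal sketch of this adequacy check is shown below. It assumes that a value counts as tied when at least one other entry in the series has the same value and that absent values are represented as `None`; the report does not spell out its exact definition, and the function names and sample series are hypothetical.

```python
from collections import Counter


def tied_or_absent_fraction(values):
    """Fraction of entries that are absent (None) or share their value with another entry."""
    if not values:
        return 1.0  # an empty series carries no usable data
    counts = Counter(v for v in values if v is not None)
    bad = sum(1 for v in values if v is None or counts[v] > 1)
    return bad / len(values)


def is_adequate(values, max_fraction=0.5):
    """A series is adequate when fewer than max_fraction of its entries are tied or absent."""
    return tied_or_absent_fraction(values) < max_fraction


# Hypothetical series: a sparse daily metric and a denser monthly metric.
sparse = [0, 0, 0, None, 2, 0, None, 5]
dense = [3, 7, 1, 9, 4, 7, 8, 2]
print(is_adequate(sparse), is_adequate(dense))
```

Sparse daily series tend to fail this check because most days record zero activity, which produces many ties.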

The following metrics had an adequate number of non-tied and non-absent values to be used in the analysis:

For Git-like Tools:

  • For the month interval:
    • Number Of Commits Rate
    • Number Of File Changes Rate
    • Number Of Line Updates Rate
    • Number Of Line Additions Rate
    • Number Of Line Deletions Rate
  • For the week interval:
    • Number Of Line Updates Rate
    • Number Of Line Additions Rate
    • Number Of Line Deletions Rate
  • For the day interval:
    • Number Of Line Updates Rate

For Jira:

  • For the month interval:
    • Number Changes Contributor Rate
    • Number Creations Contributor Rate
    • Number Actions Contributor Rate
    • Number Changes Changer Rate

6.1.2 Metrics Without Adequate Supporting Data

The following metrics did not have an adequate number of non-tied and non-absent values to be used in the analysis:

For Git-Like Tools:

  • For the week interval:
    • Number Of Commits Rate
    • Number Of File Changes Rate
  • For the day interval:
    • Number Of Commits Rate
    • Number Of File Changes Rate
    • Number Of Line Additions Rate
    • Number Of Line Deletions Rate

For Jira:

  • For the month interval:
    • Number Of Changes Assignee Rate
    • Number Of Creations Reporter Rate
    • Number Of Resolutions Resolver Rate
    • Number Of Resolutions Assignee Rate
    • Total Days Worked Resolve Assignee Rate
    • Total Days Worked Resolve Resolver Rate
    • Mean Days Worked Resolve Assignee Rate
    • Mean Days Worked Resolve Resolver Rate
  • For the week interval:
    • Number Changes Contributor Rate
    • Number Creations Contributor Rate
    • Number Actions Contributor Rate
    • Number Changes Changer Rate
    • Number Of Changes Assignee Rate
    • Number Of Creations Reporter Rate
    • Number Of Resolutions Resolver Rate
    • Number Of Resolutions Assignee Rate
    • Total Days Worked Resolve Assignee Rate
    • Total Days Worked Resolve Resolver Rate
    • Mean Days Worked Resolve Assignee Rate
    • Mean Days Worked Resolve Resolver Rate
  • For the day interval:
    • Number Changes Contributor Rate
    • Number Creations Contributor Rate
    • Number Actions Contributor Rate
    • Number Changes Changer Rate
    • Number Of Changes Assignee Rate
    • Number Of Creations Reporter Rate
    • Number Of Resolutions Resolver Rate
    • Number Of Resolutions Assignee Rate
    • Total Days Worked Resolve Assignee Rate
    • Total Days Worked Resolve Resolver Rate
    • Mean Days Worked Resolve Assignee Rate
    • Mean Days Worked Resolve Resolver Rate

6.2 Jira and Git Metric Selection Weighting Algorithm

The metrics selected to be used in computing the Change Cycle Time are listed below. The figure of merit for each metric is computed based on a set of factors. Each factor is assigned a weight scaled to range from 0 to 100, with 100 being the highest weight, reflecting that factor’s relative importance in selecting the metrics to be used.

The data plots shown are those that have scored highest with respect to the selection criteria. The selected metrics and time period, selection criteria, criteria weights, and metric score are shown below. Any metric and time period combination not shown scored zero.
For this report, the cut-off score was set to 25.
Selected Git Versus Jira Metrics Providing The Best Change Cycle Time Estimates
Interval Metric Score
Month Git Metric ‘Number Of Commits Rate’ versus Jira Metric ‘Number Changes Contributor Rate’ 35.256410
Month Git Metric ‘Number Of Commits Rate’ versus Jira Metric ‘Number Changes Changer Rate’ 10.256410
Month Git Metric ‘Number Of File Changes Rate’ versus Jira Metric ‘Number Changes Changer Rate’ 10.256410
Month Git Metric ‘Number Of File Changes Rate’ versus Jira Metric ‘Number Changes Contributor Rate’ 1.923077
Month Git Metric ‘Number Of Commits Rate’ versus Jira Metric ‘Number Creations Contributor Rate’ 1.923077
Month Git Metric ‘Number Of File Changes Rate’ versus Jira Metric ‘Number Creations Contributor Rate’ 1.923077
Month Git Metric ‘Number Of Line Updates Rate’ versus Jira Metric ‘Number Creations Contributor Rate’ 1.923077
Month Git Metric ‘Number Of Line Deletions Rate’ versus Jira Metric ‘Number Creations Contributor Rate’ 1.923077
Month Git Metric ‘Number Of Commits Rate’ versus Jira Metric ‘Number Actions Contributor Rate’ 1.923077
Month Git Metric ‘Number Of File Changes Rate’ versus Jira Metric ‘Number Actions Contributor Rate’ 1.923077
Month Git Metric ‘Number Of Line Updates Rate’ versus Jira Metric ‘Number Actions Contributor Rate’ 1.923077
Month Git Metric ‘Number Of Line Deletions Rate’ versus Jira Metric ‘Number Actions Contributor Rate’ 1.923077
Month Git Metric ‘Number Of Line Additions Rate’ versus Jira Metric ‘Number Changes Changer Rate’ 1.923077
Metric Evaluation Factors For Computing Change Cycle Time
Factor Weight
Metrics with the highest correlation value that is of moderate or better data reliability 25
Metric with the earliest peak correlation value within a time interval 25
Metrics with peak interval periodicity (for example, a metric with a peak at 2, 4, and 6 weeks) 25
Metrics that agree with other metrics about peak correlation values 25
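
The report does not state how factor weights are combined into a score. One reading that is arithmetically consistent with the scores listed above (25/13 ≈ 1.923077, 25/13 + 25/3 ≈ 10.256410, and 25 + 25/3 + 25/13 ≈ 35.256410) is that each factor's weight is split equally among the metric pairings that satisfy it. The sketch below illustrates that assumption only; the factor keys, pairing names, and the assignment of pairings to factors are hypothetical.

```python
def figure_of_merit(factor_members, weights):
    """Sum, per pairing, each factor's weight divided equally among the pairings meeting it."""
    scores = {}
    for factor, members in factor_members.items():
        if not members:
            continue  # no pairing satisfied this factor, so its weight goes unallocated
        share = weights[factor] / len(members)
        for pairing in members:
            scores[pairing] = scores.get(pairing, 0.0) + share
    return scores


weights = {"highest": 25, "earliest": 25, "periodic": 25, "agreement": 25}

# Hypothetical membership: which pairings satisfy which factor.
all_pairs = [f"pair{i}" for i in range(1, 14)]
factor_members = {
    "highest": ["pair1"],
    "earliest": ["pair1", "pair2", "pair3"],
    "periodic": [],
    "agreement": all_pairs,
}

scores = figure_of_merit(factor_members, weights)
for pairing in ("pair1", "pair2", "pair4"):
    print(pairing, round(scores[pairing], 6))
```

Under this reading, a pairing's score can never exceed the sum of the weights, and a factor shared by many pairings contributes little to any one of them.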

7 Appendices

7.1 Analytic Parameter Settings

This report can be adjusted by you, the analyst, at report creation time to focus on different aspects of the project’s development environment and culture. The available parameters and the values assigned to them for this version of the report are described below.

7.1.1 Setting Explanations

Parameter Usage Example Default
Start and stop dates Used when the analyst wants to examine change cycle time behavior during a narrower range than the project’s entire Jira and Git-like tool usage history If the project lead decided to start DevOps implementation after a certain date, such as January 1, 2021, the report can be set to only look at activities after that date. Start = earliest detected Jira usage, Stop = last detected Jira usage
The personnel included in the tool usage analysis Used when the analyst wants to examine change cycle time data as reflected by a subset of all project personnel. This parameter enables the analyst to select personnel by name. The project may have a core team that is implementing DevOps practices. An analysis of change cycle time may only be relevant to these personnel. All personnel are automatically included in the analysis.
Fraction of involvement in activities to be considered core team (Low to High) Used when the analyst wants to examine change cycle time data as reflected by a subset of all project personnel. This parameter enables the analyst to select personnel by percentage of activities in which they are involved (rather than by name). The project may have a core team that is implementing DevOps practices. An analysis of change cycle time may only be relevant to these personnel. The 90% most actively involved personnel are automatically included in the analysis.
Acceptable Fraction of Tied Data Points Used in an advanced analysis to clean the input data. The statistics used to estimate the change cycle time are sensitive to a large number of data points with the same value. This parameter allows the analyst to set how many tied data points are permitted. To refine an analysis, the analyst may decide to accept only 10% data point ties. Automatically set to accept 50% of data points as ties.
Acceptable Data Quality Used in advanced analysis to clean the output data. The analytic algorithms calculate a ratio of the value of the computed statistic divided by the magnitude of the standard error for that statistic. Small values of this ratio indicate that the computed statistic contains little information. The analyst has multiple competing estimates for change cycle time and wants to refine the computations to focus on change cycle time estimates supported by more data points. Automatically set to a data quality score of 0.6.
Window width in days An estimated change cycle time could have a value in days, weeks, or months. The day window width controls how much overlap is permitted with estimates in weeks. The estimated change cycle time is computed at 12 days. Setting the day window width parameter will help distinguish whether the change cycle time is really 12 days or is actually 2 weeks. Automatically set to 9 days.
Window width in weeks An estimated change cycle time could have a value in days, weeks, or months. The week window width controls how much overlap is permitted with estimates in days and months. The estimated change cycle time is computed at 5 weeks. Setting the week window width parameter will help distinguish whether the change cycle time is really 5 weeks or is actually 1 month. Automatically set to 5 weeks.
Window width in months An estimated change cycle time could have a value in days, weeks, or months. The month window width controls how much overlap is permitted with estimates in weeks as well as how long a change cycle time can be estimated. The estimated change cycle time is computed at 8 months. Setting the month window width parameter would ignore data implying change cycle times over 13 months as not being a useful change cycle time. Automatically set to 7 months.
Correlation Method Used to set the correlation method used in the algorithm to estimate the change cycle time The analyst chooses Spearman correlation to know if Jira and GitHub usage rise and fall monotonically together. Automatically set to Spearman.
Weight applied to metrics with the highest Jira-to-Git correlation Used to determine the weight applied to the highest single correlation in calculating the figure of merit for estimating change cycle time. The analyst believes that having the highest single correlation is the best indicator of a relationship and sets its weight to 80 (out of 100). Automatically set to 25 (out of 100).
Weight applied to metrics with the earliest maximum correlation within a time interval Used to determine the weight applied to the earliest peak correlation in calculating the figure of merit for estimating change cycle time. The analyst believes that having the earliest peak correlation is the best indicator of a relationship and sets its weight to 70 (out of 100). Automatically set to 25 (out of 100).
Weight applied to metrics with interval periodicity Used to determine the weight applied to cycling re-occurring peak correlations in calculating the figure of merit for estimating change cycle time. The analyst believes that having a cyclically re-occurring peak correlation is the best indicator of a relationship and sets its weight to 60 (out of 100). Automatically set to 25 (out of 100).
Weight applied to metrics that have matching maximum correlations Used to determine the weight applied to correlations that show similar cycle times to other correlations across multiple tool metrics in calculating the figure of merit for estimating change cycle time. The analyst believes that having multiple peak correlations apparently related to each other across many metrics is the weakest indicator of a relationship and sets its weight to 10 (out of 100). Automatically set to 25 (out of 100).


7.1.2 Setting Values

The values of the parameter settings used in this report are:

Parameter Parameter Value
Start and stop dates Listed above in the ‘Fundamental Analytic Parameter Settings Used In This Report’ Section
The personnel included in the tool usage analysis Listed above in the ‘Fundamental Analytic Parameter Settings Used In This Report’ Section
Fraction of involvement in activities to be considered core team (Low to High) 10%
Acceptable Fraction of Tied Data Points 50%
Acceptable Data Quality 0.6
Window width in days 9
Window width in weeks 5
Window width in months 7
Correlation Method Spearman
Weight applied to metrics with the highest Jira-to-Git correlation 25 out of 100 allocated
Weight applied to metrics with the earliest maximum correlation within a time interval 25 out of 100 allocated
Weight applied to metrics with interval periodicity 25 out of 100 allocated
Weight applied to metrics that have matching maximum correlations 25 out of 100 allocated


7.2 Metrics Comparisons that Provide Ambiguous Results

The following data plots also had sufficient data to be computed, but were not distinguished from other data plots in providing the most likely estimates of the DevOps change cycle time metric. However, these data plots should not be dismissed because additional data or adjustments in parameters such as the window width for determining values of interest might cause a re-evaluation of the estimated change cycle time.

Although these metrics comparisons may appear to show a result, some aspect of the underlying calculation, such as data quality or figure of merit scoring, did not identify the result as sufficiently reliable to be used. However, the information associated with these metrics may be of use in understanding the change cycle time estimate that was computed.


7.3 Statistical Analysis Technical Details

7.3.1 Time Series Analysis

A time series is defined in the National Institute of Standards and Technology’s Engineering Statistics Handbook [1] as:

An ordered sequence of values of a variable at equally spaced time intervals. Applications: The usage of time series models is twofold:
  • Obtain an understanding of the underlying forces and structure that produced the observed data.
  • Fit a model and proceed to forecasting, monitoring or even feedback and feedforward control.

7.3.1.1 Displacement Analysis and Cross-Correlation

From Wikipedia [2]:

In signal processing, cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other.

Similarly, if two scalar random vectors X and Y are time series, then

… the cross-correlations of X with Y across time are temporal cross-correlations. In probability and statistics, the definition of correlation always includes a standardizing factor in such a way that correlations have values between −1 and +1.

This report uses the cross-correlation, standardized to values between -1 and +1, between all of the possible pairings of Jira and Git-like tool metrics that have sufficient data points to detect the strongest matching “signal” between the tools. The strongest matches occur where an increase or decrease of activity in Jira corresponds to an increase or decrease in the Git-like tool. The displacement between the two time series is the computed change cycle time for the project.
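
The displacement search can be sketched as below. This is an illustrative sketch rather than the report's implementation: it uses the Pearson coefficient for brevity (the report itself was run with Spearman), assumes both series have the same length and sampling interval, and uses hypothetical names and sample data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def lagged_correlations(jira, git, max_lag):
    """Correlate jira[t] with git[t + lag] for each lag from 0 to max_lag."""
    result = []
    for lag in range(max_lag + 1):
        x = jira[: len(jira) - lag]  # Jira activity at time t
        y = git[lag:]                # Git activity lag steps later
        result.append(pearson(x, y))
    return result


# Hypothetical data: Git activity echoes Jira activity three intervals later.
jira = [1, 4, 2, 8, 5, 7, 3, 9, 6, 2, 5, 1]
git = [0, 0, 0] + jira[:-3]

corrs = lagged_correlations(jira, git, max_lag=5)
best_lag = max(range(len(corrs)), key=corrs.__getitem__)
print(best_lag, round(corrs[best_lag], 3))
```

In this toy example the strongest correlation appears at a lag of three intervals, which would be read as the change cycle time.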

7.3.2 Correlation Statistics

Depending on the options chosen, this report calculates cross-correlation using the Pearson Product-Moment, Spearman, or Kendall correlation coefficients. As explained below, each correlation coefficient has different advantages and disadvantages.

The correlation coefficient chosen for this report is Spearman’s correlation coefficient.

7.3.2.1 Pearson Product-Moment Correlation Coefficient

From Wikipedia [3]:

…the Pearson product-moment correlation coefficient … is a statistic that measures linear correlation between two variables X and Y. It has a value between +1 and −1, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation.

The Pearson Product-Moment correlation coefficient handles tied values well, but calculates the correlation as the linear fit between two variables. However, two variables could be strongly related to each other in some way without that relationship being linear. The Pearson correlation coefficient assumes that the two underlying variables being measured, X and Y, are normally distributed, which may not be true. If the normality assumption is violated, then a non-parametric correlation coefficient, such as Spearman or Kendall, would be more robust.

7.3.2.2 Spearman Correlation Coefficient

From Laerd Statistics [4]:

The Spearman’s rank-order correlation is the nonparametric version of the Pearson product-moment correlation. Spearman’s correlation coefficient measures the strength and direction of association between two ranked variables.
…Spearman’s correlation determines the strength and direction of the monotonic relationship between your two variables rather than the strength and direction of the linear relationship between your two variables. A monotonic relationship is a relationship that does one of the following: (1) as the value of one variable increases, so does the value of the other variable; or (2) as the value of one variable increases, the other variable value decreases.

The Spearman correlation coefficient is useful for determining if two time series are monotonically related, which is key to calculating the change cycle time based on the displacement between metrics generated by Jira and Git-like tools. However, the Spearman correlation coefficient can perform poorly when too many ties exist in the data.
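
To make the difference between a monotonic and a linear relationship concrete, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks and compares it with Pearson on the raw values. The helper names and the sample data are hypothetical, and this is a textbook formulation rather than the report's own code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic in x, but not linear

print(round(spearman(x, y), 3), round(pearson(x, y), 3))
```

Because `y` always rises with `x`, Spearman reports a perfect relationship while Pearson, which measures linear fit, reports a weaker one.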

7.3.2.3 Kendall Correlation Coefficient

From Wikipedia [5]:

…the Kendall rank correlation coefficient … is a statistic used to measure the ordinal association between two measured quantities.

…It is a measure of rank correlation: the similarity of the orderings of the data when ranked by each of the quantities.

…the Kendall correlation between two variables will be high when observations have a similar (or identical for a correlation of 1) rank (i.e. relative position label of the observations within the variable: 1st, 2nd, 3rd, etc.) between the two variables, and low when observations have a dissimilar (or fully different for a correlation of −1) rank between the two variables.

The Kendall correlation coefficient is useful for determining if two time series are monotonically related. It has a specific formulation for addressing data that has many ties, known as Kendall’s \(\tau\)-b. However, the amount of error in any given Kendall correlation coefficient based on the number of observations is hard to calculate. Kendall is therefore a good candidate for comparing time series, but it is difficult to assess how much error might exist in its calculations.
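
For reference, Kendall’s \(\tau\)-b can be sketched with a direct pair-by-pair count, as below. This is the standard textbook formulation written as a simple quadratic-time loop, not the report's implementation; the function name and sample data are hypothetical.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, normalised with a tie correction."""
    concordant = discordant = tied_only_x = tied_only_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: counts toward neither term
            elif dx == 0:
                tied_only_x += 1
            elif dy == 0:
                tied_only_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    # Pairs not tied in x, and pairs not tied in y, respectively.
    untied_x = concordant + discordant + tied_only_y
    untied_y = concordant + discordant + tied_only_x
    denom = sqrt(untied_x * untied_y)
    return (concordant - discordant) / denom if denom else 0.0


x = [1, 2, 3, 4, 5]
y = [1, 2, 2, 4, 5]  # one tie in y

print(round(kendall_tau_b(x, y), 4))
```

The single tie in `y` lowers the coefficient slightly below 1 even though the two series never move in opposite directions.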



  1. Downloaded May 1, 2020.

  2. Downloaded May 1, 2020.

  3. Downloaded May 1, 2020.

  4. Downloaded May 1, 2020.

  5. Downloaded May 1, 2020.