The Pearson product-moment correlation coefficient (or just Pearson correlation coefficient) is a formula used in statistics to evaluate the linear relationship between two continuous and quantitative variables. Pearson’s correlation coefficient (r) reflects the degree, or strength, of that relationship.

The Pearson correlation coefficient is an important element of Six Sigma. It allows you to explore possible cause-and-effect relationships between variables and conduct correlation tests, although a correlation on its own does not prove that one variable causes the other. Correlation tests are used in the first three phases of the DMAIC cycle and help to identify which variable changes in a process or product can be made for improvement.

First, enter your values for X and Y in the table below. The two variables can be measured in different units, but both must be measured on an interval or ratio scale. Your data will then be plotted and the Pearson correlation coefficient, a number between -1 and +1 representing the strength and direction of the correlation, will be calculated.

  • Value of 0: Indicates no linear correlation between the two variables. A nonlinear association may still exist.
  • Positive Correlation: Any value greater than zero is considered a positive correlation, meaning as X increases, Y tends to increase as well. A value of +1 is a perfect, positive correlation between X and Y.
  • Negative Correlation: Any value less than zero indicates a negative correlation, meaning as X increases, Y tends to decrease. A value of -1 is a perfect, negative correlation between X and Y. Even a perfect correlation does not by itself show that the changes in Y are caused by X.

Each pair of X and Y values is plotted as a point on the chart, and the Pearson correlation reflects how closely these points cluster around a line of best fit. The stronger the linear association between the two variables, the closer the coefficient will be to +1 or -1. A value of exactly +1 or -1 means every data point lies exactly on the line of best fit.
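
To make the link between the plotted points and the coefficient concrete, here is a minimal Python sketch. The helper name `pearson_r` and all of the data values are made up for illustration; they are not part of the calculator on this page. It computes r from the deviations of each point from the means, then compares points that sit exactly on a line, points that sit close to a line, and points that follow a symmetric curve.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]

# Every point lies exactly on a rising line y = 2x + 1.
r_up = pearson_r(x, [2 * v + 1 for v in x])

# Every point lies exactly on a falling line y = 10 - 3x.
r_down = pearson_r(x, [10 - 3 * v for v in x])

# Points scattered slightly around a rising line (made-up measurements).
r_noisy = pearson_r(x, [2.1, 3.9, 6.2, 7.8, 10.1])

# Points on the curve y = x^2, symmetric about x = 0: clearly related,
# but with no linear trend for a straight line to capture.
x_sym = [-2, -1, 0, 1, 2]
r_curve = pearson_r(x_sym, [v ** 2 for v in x_sym])

for label, value in [("rising line", r_up), ("falling line", r_down),
                     ("near a line", r_noisy), ("symmetric curve", r_curve)]:
    print(f"{label}: r = {value:.4f}")
```

In this sketch the two exact lines give coefficients at the ends of the range, the slightly scattered points give a value just short of +1, and the symmetric curve gives a value of zero even though Y is completely determined by X.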


Additional Pearson Correlation facts:

  • The Pearson correlation coefficient is used to measure the strength and direction of the linear relationship between two variables, where the value r = 1 means a perfect positive correlation and the value r = -1 means a perfect negative correlation.
  • The Pearson correlation coefficient is also known as Pearson’s r, the product-moment correlation coefficient, or the bivariate correlation coefficient. It was developed by Karl Pearson in the late 19th century as a measure of the degree of linear dependence between two variables.
  • The Pearson correlation coefficient can be calculated by dividing the covariance of the two variables by the product of their standard deviations. Equivalently, it is the sum of the products of the standardized values (z-scores) of each variable, divided by the number of observations n when population standard deviations are used, or by n - 1 when sample standard deviations are used.
  • The Pearson correlation coefficient has some properties that make it useful for statistical analysis. For example, it is invariant to positive linear transformations of the variables, meaning that adding a constant to one or both variables, or multiplying one or both by a positive constant, does not change the value of r. Multiplying one variable by a negative constant reverses the sign of r but not its magnitude. It is also symmetric, meaning that r(X, Y) = r(Y, X) for any two variables X and Y.
  • The Pearson correlation coefficient is often used to test hypotheses about the existence and significance of a linear relationship between two variables. However, it has some limitations and assumptions that should be considered before applying it. For instance, it only detects linear relationships and may miss other types of associations. It is also sensitive to outliers and may not be robust to violations of normality.
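
Several of the facts above can be checked numerically. The following is a rough Python sketch, with hypothetical function names and made-up data, that computes r two ways (covariance divided by the product of the standard deviations, and the average product of z-scores, both using divide-by-n population moments) and then tries out symmetry, rescaling, and the effect of a single outlier.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pop_sd(values):
    """Population standard deviation (divides by n)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def pearson_from_covariance(xs, ys):
    """r = cov(X, Y) / (sd(X) * sd(Y)), with divide-by-n moments."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pop_sd(xs) * pop_sd(ys))

def pearson_from_z_scores(xs, ys):
    """r = average of the products of the standardized values."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pop_sd(xs), pop_sd(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / len(xs)

# Made-up illustrative data.
x = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
y = [1.5, 3.0, 4.5, 4.0, 8.0, 9.5]

r = pearson_from_covariance(x, y)
r_z = pearson_from_z_scores(x, y)            # same value, second formula
r_swapped = pearson_from_covariance(y, x)    # symmetry: r(X, Y) = r(Y, X)

# Shift and rescale both variables by positive constants: r is unchanged.
r_rescaled = pearson_from_covariance([3 * v + 10 for v in x],
                                     [0.5 * v - 2 for v in y])

# Multiply one variable by a negative constant: the sign of r reverses.
r_flipped = pearson_from_covariance([-v for v in x], y)

# Append one extreme point far from the trend: r changes sharply.
r_outlier = pearson_from_covariance(x + [30.0], y + [0.0])

print(f"r = {r:.4f}, via z-scores = {r_z:.4f}, swapped = {r_swapped:.4f}")
print(f"rescaled = {r_rescaled:.4f}, flipped = {r_flipped:.4f}")
print(f"with one outlier = {r_outlier:.4f}")
```

With this particular data set, the single added point is enough to turn a strong positive coefficient into a negative one, which is why it is worth plotting the data and checking for outliers before relying on r.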


Helpful Resources