Overview
Karl Pearson’s Correlation Coefficient is a method in statistics used to measure how strongly two sets of data are related. It tells us whether the change in one variable is connected to the change in another.
This coefficient is also called Pearson’s r, and it’s often used when studying relationships in linear regression.
This method gives a number between -1 and +1, which shows how strong or weak the connection is. It’s a helpful tool in comparing trends and patterns in data.
Karl Pearson’s coefficient of correlation is a linear correlation coefficient that ranges from -1 to +1. A value of -1 signifies a perfect negative correlation, while +1 indicates a perfect positive correlation.
There are 3 assumptions of Karl Pearson’s coefficient of correlation:
1. The relationship between the two variables is linear.
2. There is a cause-and-effect relationship between the variables.
3. Each variable is affected by a large number of independent causes, so that it is approximately normally distributed.
Degree of Correlation
Perfect correlation: When the value is exactly +1 (positive) or -1 (negative).
High, moderate or low correlation: When the value lies strictly between 0 and ±1, judged by how close it is to ±1.
No correlation: When the value is zero.
Karl Pearson’s correlation coefficient is calculated as the covariance of the two variables divided by the product of the standard deviation of each data sample. It is the normalization of the covariance between the two variables to give an interpretable score.
Karl Pearson’s correlation coefficient formula is given below:
\(r = {\sum(X - \bar{X})(Y - \bar{Y})\over{\sqrt{\sum(X - \bar{X})^2}\sqrt{\sum(Y - \bar{Y})^2}}}\)
where \(\bar{X}\) = mean of X variable
\(\bar{Y}\) = mean of Y variable
Covariance Formula: \(Cov (X, Y) = {\sum(X - \bar{X})(Y - \bar{Y})\over{N}} = {\sum{xy}\over{N}}\), where x and y denote the deviations of X and Y from their means.
There are 4 methods to calculate Karl Pearson’s Coefficient of Correlation which are given below:
In the actual mean method, deviations are taken from the actual arithmetic means of the two variables. The actual mean is found by adding up all the values and dividing by their count, and r is then computed directly from the deviations about these means.
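The actual mean method can be sketched in a few lines of Python (a minimal illustration; the function name and sample data are our own, not from the article):

```python
import math

def pearson_actual_mean(x, y):
    """Pearson's r from deviations about the actual means of x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Perfectly linear data: r should be exactly 1
print(pearson_actual_mean([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```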
If the given data is large, the assumed mean method is recommended over the direct method, since it reduces the calculations and keeps the numerical values small. Under this method, the correlation coefficient is calculated from deviations taken about conveniently chosen assumed means rather than the actual means, where dx = deviation of X from its assumed mean and dy = deviation of Y from its assumed mean. The result is unchanged, and Pearson’s coefficient of correlation always lies between +1 and -1.
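A sketch of the assumed mean method (the helper name and the sample data are illustrative assumptions; any choice of assumed means a and b yields the same r):

```python
import math

def pearson_assumed_mean(x, y, a, b):
    """Pearson's r via deviations dx = x - a, dy = y - b from assumed means.
    Any choice of a and b gives the same result as the actual-mean method."""
    n = len(x)
    dx = [xi - a for xi in x]
    dy = [yi - b for yi in y]
    sdx, sdy = sum(dx), sum(dy)
    sdxdy = sum(u * v for u, v in zip(dx, dy))
    sdx2 = sum(u * u for u in dx)
    sdy2 = sum(v * v for v in dy)
    # Assumed-mean form of Pearson's formula
    num = sdxdy - sdx * sdy / n
    den = math.sqrt(sdx2 - sdx ** 2 / n) * math.sqrt(sdy2 - sdy ** 2 / n)
    return num / den

data_x = [12, 9, 8, 10, 11, 13, 7]
data_y = [14, 8, 6, 9, 11, 12, 3]
# Two different choices of assumed means agree
print(round(pearson_assumed_mean(data_x, data_y, 10, 9), 4))  # 0.9485
```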
The step deviation method extends the assumed mean (short-cut) method for large values: the deviations are further divided by a common factor, reducing them to smaller numbers. For this reason it is also called the change of origin and scale method. To calculate the Pearson product-moment correlation by the step deviation method, first determine the covariance of the two (reduced) variables, then calculate each variable’s standard deviation; the correlation coefficient is the covariance divided by the product of the two standard deviations.
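A sketch of the step deviation method (illustrative helper and data; dividing the deviations by common factors h and k leaves r unchanged):

```python
import math

def pearson_step_deviation(x, y, a, b, h, k):
    """Pearson's r via step deviations u = (x - a)/h, v = (y - b)/k.
    Choosing h and k as common factors keeps the arithmetic small;
    r is unaffected by the change of origin and scale."""
    n = len(x)
    u = [(xi - a) / h for xi in x]
    v = [(yi - b) / k for yi in y]
    su, sv = sum(u), sum(v)
    suv = sum(ui * vi for ui, vi in zip(u, v))
    su2 = sum(ui * ui for ui in u)
    sv2 = sum(vi * vi for vi in v)
    num = suv - su * sv / n
    den = math.sqrt(su2 - su ** 2 / n) * math.sqrt(sv2 - sv ** 2 / n)
    return num / den

# Wide-ranging values reduced with a = 150, h = 10 and b = 60, k = 5
x = [110, 120, 130, 140, 150]
y = [45, 55, 50, 60, 65]
print(round(pearson_step_deviation(x, y, 150, 60, 10, 5), 4))  # 0.9
```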
Steps involved in the calculation of Karl Pearson’s coefficient of correlation by the direct method: find N, \(\sum{x}, \sum{y}, \sum{xy}, \sum{x^2}\) and \(\sum{y^2}\) from the raw observations, then substitute them into
\(r = {N\sum{xy} - \sum{x}\sum{y}\over{\sqrt{N\sum{x^2} - (\sum{x})^2}\sqrt{N\sum{y^2} - (\sum{y})^2}}}\)
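The direct method works straight from the raw sums, without computing the means explicitly, as this minimal sketch shows (function name and data are illustrative):

```python
import math

def pearson_direct(x, y):
    """Direct method: r from N, sum x, sum y, sum xy, sum x^2, sum y^2."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sx2 = sum(xi * xi for xi in x)
    sy2 = sum(yi * yi for yi in y)
    num = n * sxy - sx * sy
    den = math.sqrt(n * sx2 - sx ** 2) * math.sqrt(n * sy2 - sy ** 2)
    return num / den

print(pearson_direct([1, 2, 3], [2, 4, 6]))  # 1.0
```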
The correlation coefficient shows how strongly two variables are related and in what direction. Based on this, there are three main types:
Positive Correlation: Both variables move in the same direction.
If one value increases, the other also increases.
Example: The more time you spend exercising, the more calories you burn.
Negative Correlation: The two variables move in opposite directions.
If one value goes up, the other comes down.
Example: As the price of a product increases, the demand for it usually decreases.
Zero Correlation: There is no connection between the two variables.
A change in one does not affect the other.
Example: A person’s height has nothing to do with their intelligence.
The correlation coefficient tells us how strongly two variables are related. A key point to remember is that this value does not change if we change the scale or origin of the data.
For example, suppose we shift each variable by a constant (change of origin) and divide by a positive constant (change of scale), say U = (X - a)/h and V = (Y - b)/k. The correlation coefficient of U and V is exactly the same as that of X and Y, as proved in Property 2 below.
Karl Pearson’s coefficient of correlation shows the following properties with proof:
Property 1: Karl Pearson’s Coefficient of Correlation (r) lies between -1 and 1, i.e. \(-1\leq{r}\leq1\)
Proof: Suppose X and Y are two variables that take values \((x_i, y_i)\), i = 1, 2, 3, …, n, with means \(\bar{x}, \bar{y}\) and standard deviations \(\sigma_x, \sigma_y\) respectively.
\(\begin{matrix}
\text{ Let us consider, }\\
\sum[{x-\bar{x}\over{\sigma_x}} \pm {y-\bar{y}\over{\sigma_y}}]^2\geq{0}\\
\sum[({x-\bar{x}\over{\sigma_x}})^2 + ({y-\bar{y}\over{\sigma_y}})^2 \pm 2{(x-\bar{x})(y-\bar{y})\over{\sigma_x\sigma_y}}]\geq{0}\\
{1\over{\sigma_x^2}}\sum(x-\bar{x})^2 + {1\over{\sigma_y^2}}\sum(y-\bar{y})^2 \pm {2\over{\sigma_x\sigma_y}}{\sum(x-\bar{x})(y-\bar{y})}\geq{0}\\
\text{ Dividing both sides by n, we get }\\
{1\over{\sigma_x^2}}{\sum(x-\bar{x})^2\over{n}} + {1\over{\sigma_y^2}}{\sum(y-\bar{y})^2\over{n}} \pm {2\over{\sigma_x\sigma_y}}{\sum(x-\bar{x})(y-\bar{y})\over{n}}\geq{0}\\
{1\over{\sigma_x^2}}\sigma_x^2 + {1\over{\sigma_y^2}}\sigma_y^2 \pm {2\over{\sigma_x\sigma_y}} cov(x,y)\geq{0}\\
1 + 1 \pm{2r} \geq{0}\\
2 \pm{2r} \geq{0}\\
2 (1 \pm{r}) \geq{0}\\
(1 \pm{r}) \geq{0}\\
\text{ Taking the + sign: } (1 + {r}) \geq{0} \Rightarrow r \geq -1\\
\text{ Taking the - sign: } (1 - {r}) \geq{0} \Rightarrow r \leq 1\\
\therefore -1\leq{r}\leq1
\end{matrix}\)
The least value of r is –1 and the most is +1. If r = +1, there is a perfect positive correlation between the two variables. If r = -1, there is a perfect negative correlation.
If r = 0, then there is no linear relation between the variables. However, there may be a non-linear relationship between the variables.
If r is positive but close to zero, there is a weak positive correlation; if it is close to +1, there is a strong positive correlation.
Property 2: Correlation coefficient is independent of change in origin and scale
Proof: Suppose, X and Y are the original variables and after changing origin and scale, we have
\(\begin{matrix}
U = {X - a \over{h}} \text{ and } V = {Y - b \over{k}} \text{ where a, b, h, k are constants with } h > 0, k > 0.\\
X – a = hU \text{ and } Y – b = kV\\
X = a + hU \text{ and } Y = b + kV\\
\bar{X} = a + h\bar{U} \text{ and } \bar{Y} = b + k\bar{V}\\
X – \bar{X} = h(U – \bar{U}) \text{ and } Y – \bar{Y} = k(V – \bar{V}) \\
\text{ Now, } r_{xy} = {\sum(x - \bar{x})(y - \bar{y})\over{\sqrt{\sum(x - \bar{x})^2}\sqrt{\sum(y - \bar{y})^2}}}\\
r_{xy} = {hk\sum(U - \bar{U})(V - \bar{V})\over{\sqrt{h^2\sum(U - \bar{U})^2}\sqrt{k^2\sum(V - \bar{V})^2}}}\\
r_{xy} = {hk\sum(U - \bar{U})(V - \bar{V})\over{hk\sqrt{\sum(U - \bar{U})^2}\sqrt{\sum(V - \bar{V})^2}}} \text{ (as h, k > 0) }\\
= {\sum(U - \bar{U})(V - \bar{V})\over{\sqrt{\sum(U - \bar{U})^2}\sqrt{\sum(V - \bar{V})^2}}} = r_{u,v}\\
r_{xy} = r_{u,v}
\end{matrix}\)
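A quick numerical check of Property 2 (a minimal sketch with made-up data; u and v are X and Y after a change of origin and scale):

```python
import math

def pearson(x, y):
    """Pearson's r from the definition (actual-mean form)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    num = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    den = math.sqrt(sum((a - xb) ** 2 for a in x) * sum((b - yb) ** 2 for b in y))
    return num / den

x = [2, 4, 6, 8, 11]
y = [3, 5, 4, 9, 10]
# Change of origin and scale: u = (x - 5)/2, v = (y - 4)/3
u = [(xi - 5) / 2 for xi in x]
v = [(yi - 4) / 3 for yi in y]
print(math.isclose(pearson(x, y), pearson(u, v)))  # True
```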
Property 3: Two independent variables are uncorrelated but the converse is not true
Proof: If two variables are independent then their covariance is zero, i.e., cov (X, Y) = 0
\(\therefore r_{xy} = {cov (X, Y)\over{\sigma_x\sigma_y}} = {0\over{\sigma_x\sigma_y}} = 0\)
Thus, if two variables are independent, their coefficient of correlation is zero, i.e., independent variables are uncorrelated.
But the converse is not true. If \(r_{xy} = 0\), it only means there is no linear correlation between the variables, since Karl Pearson’s coefficient \(r_{xy}\) measures only the linear relationship. There may be a strong non-linear or curvilinear relationship even though \(r_{xy} = 0\).
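A small numerical illustration of the converse failing (made-up data): with y = x² over x values symmetric about zero, y is completely determined by x, yet r = 0.

```python
def pearson(x, y):
    """Pearson's r from the definition (actual-mean form)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    num = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    den = (sum((a - xb) ** 2 for a in x) * sum((b - yb) ** 2 for b in y)) ** 0.5
    return num / den

x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]  # y is fully determined by x (non-linear)
print(pearson(x, y))  # 0.0 -- zero linear correlation despite total dependence
```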
A correlation coefficient is a pure number independent of the unit of measurement.
The correlation coefficient is symmetric.
Example 1: Compute the correlation coefficient between x and y from the following data: \(n = 10, \sum{xy} = 220, \sum{x^2} = 200, \sum{y^2} = 262, \sum{x} = 40, \sum{y} = 50\)
Solution: \(\begin{matrix}
\text{ The formula to find the Pearson correlation coefficient is given by }\\
r = r_{xy} = \frac{Cov(x , y)}{S_x\times{S_y}}\\
Cov (x, y) = {\sum{xy}\over{n}} - \bar{x}\times\bar{y}\\
\text{ Mean of “x” }= {\sum{x}\over{n}} = {40\over{10}} = 4\\
\text{ Mean of “y” }= {\sum{y}\over{n}} = {50\over{10}} = 5\\
\text{ Cov (x, y) } = {220\over{10}} - 4 \times 5\\
\text{ Cov (x, y) } = 22 - 20\\
\text{ Cov (x, y) } = 2\\
\text{ SD of “x” } = \sqrt{ (\sum{x^2}/n) - (\bar{x})^2 }\\
\text{ SD of “x” } = \sqrt{ (200/10) - (4)^2 }\\
\text{ SD of “x” } =\sqrt{ 20 - 16 }\\
\text{ SD of “x” } =\sqrt{ 4 }\\
\text{ SD of “x” } = 2\\
\text{ SD of “y” } = \sqrt{ (\sum{y^2}/n) - (\bar{y})^2 }\\
\text{ SD of “y” } = \sqrt{ ({262\over{10}}) - (5)^2 }\\
\text{ SD of “y” } = \sqrt{ 26.2 - 25 }\\
\text{ SD of “y” } = \sqrt{ 1.2 }\\
\text{ SD of “y” } = 1.0954\\
\text{ Pearson correlation coefficient is }\\
r = 2 / (2 \times 1.0954)\\
r = {2\over{2.1908}}\\
r = 0.91
\end{matrix}\)
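The arithmetic in Example 1 can be double-checked with a short script built only from the given sums (a sketch; the variable names are ours):

```python
import math

# Summary values given in Example 1
n, s_xy, s_x2, s_y2, s_x, s_y = 10, 220, 200, 262, 40, 50

cov = s_xy / n - (s_x / n) * (s_y / n)        # 22 - 20 = 2
sd_x = math.sqrt(s_x2 / n - (s_x / n) ** 2)   # sqrt(20 - 16) = 2
sd_y = math.sqrt(s_y2 / n - (s_y / n) ** 2)   # sqrt(26.2 - 25) ~ 1.0954
r = cov / (sd_x * sd_y)
print(round(r, 2))  # 0.91
```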
Example 2: Find Karl Pearson’s Correlation Coefficient by the assumed mean method, given \(N = 8, \sum{dx} = 47, \sum{dy} = 108, \sum{dx^2} = 1475, \sum{dy^2} = 3468, \sum{dxdy} = 2116\).
Solution:
\(\begin{matrix}
r = {{\sum{dxdy} - {\sum{dx}\times\sum{dy}\over{N}}}\over{\sqrt{\sum{dx^2} - {(\sum{dx})^2\over{N}}}\times\sqrt{\sum{dy^2} - {(\sum{dy})^2\over{N}}}}}\\
r = {{2116 - {47\times108\over{8}}}\over{\sqrt{1475 - {47^2\over{8}}}\times\sqrt{3468 - {108^2\over{8}}}}}\\
r = {2116 - 634.5 \over{\sqrt{1475 - 276.125} \times\sqrt{3468 - 1458}}}\\
r = {1481.5 \over{\sqrt{1198.875} \times\sqrt{2010}}}\\
r = {1481.5 \over{34.62\times44.83}}\\
r = {1481.5 \over{1552.0146}}\\
r = 0.955
\end{matrix}\)
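Example 2 can likewise be verified from the given deviation sums (a sketch; at full precision r is about 0.954, and the worked solution's 0.955 reflects rounding the square roots to two decimals):

```python
import math

# Summary values given in Example 2
n = 8
s_dx, s_dy = 47, 108
s_dx2, s_dy2, s_dxdy = 1475, 3468, 2116

num = s_dxdy - s_dx * s_dy / n                                   # 1481.5
den = math.sqrt(s_dx2 - s_dx ** 2 / n) * math.sqrt(s_dy2 - s_dy ** 2 / n)
r = num / den
print(round(r, 2))  # 0.95
```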