Degrees of Freedom
By AKHILESH GANTI Updated October 30, 2021
Reviewed by CHARLES POTTERS
Fact checked by SUZANNE KVILHAUG
What Are Degrees of Freedom?
Degrees of freedom refers to the maximum number of logically independent values in a data sample, meaning the values that are free to vary.
KEY TAKEAWAYS
• Degrees of freedom refers to the maximum number of logically independent values in a data sample, meaning the values that are free to vary.
• Degrees of freedom are commonly discussed in relation to various forms of hypothesis testing in statistics, such as the chi-square test.
• Calculating degrees of freedom is key to judging the significance of a chi-square statistic and deciding whether the null hypothesis can be rejected.
Understanding Degrees of Freedom
The easiest way to understand degrees of freedom conceptually is through an example:
• For simplicity, consider a data sample of five positive integers. The values could be any numbers, with no known relationship between them. In theory, this data sample has five degrees of freedom.
• Four of the numbers in the sample are {3, 8, 5, 4}, and the average of the entire data sample is revealed to be 6.
• An average of 6 across five values means the values must total 30. The four known values total 20, so the fifth number has to be 10. It can be nothing else. It does not have the freedom to vary.
• So the degrees of freedom for this data sample are 4.
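The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the reasoning; the variable names are chosen here and do not come from any particular library.

```python
# Four known values and the reported mean, taken from the example above.
known_values = [3, 8, 5, 4]
sample_size = 5
sample_mean = 6

# A fixed mean fixes the total of all five values.
required_total = sample_mean * sample_size

# Whatever is left after the known values is the only possible fifth value.
fifth_value = required_total - sum(known_values)

print(fifth_value)  # 10
```

Changing any one of the four known values changes the fifth, but once four values and the mean are set, nothing about the fifth is left to choose.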
The formula for degrees of freedom equals the size of the data sample minus one:
$D_f = N - 1$

where:
$D_f$ = degrees of freedom
$N$ = sample size
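One way to see where the "minus one" comes from: the deviations of the values from their own mean always sum to zero, so once N − 1 deviations are known, the last one is determined. The sketch below checks this for the five-number sample; `degrees_of_freedom` is a hypothetical helper written for this illustration, not a library function.

```python
def degrees_of_freedom(sample_size):
    """Return N - 1 for a single sample whose mean is treated as fixed."""
    if sample_size < 1:
        raise ValueError("sample size must be at least 1")
    return sample_size - 1


sample = [3, 8, 5, 4, 10]
mean = sum(sample) / len(sample)

# Deviation of each value from the sample mean.
deviations = [x - mean for x in sample]

# The deviations sum to zero, so the last one follows from the first N - 1.
last_deviation = -sum(deviations[:-1])

print(degrees_of_freedom(len(sample)))  # 4
print(last_deviation == deviations[-1])  # True
```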
Degrees of freedom are commonly discussed in relation to various forms of hypothesis testing in statistics, such as the chi-square test. Calculating degrees of freedom is essential to judging the significance of a chi-square statistic and deciding whether the null hypothesis can be rejected.
Chi-Square Tests
There are two different kinds of chi-square tests: the test of independence, which asks a question of relationship, such as, "Is there a relationship between gender and SAT scores?"; and the goodness-of-fit test, which asks something like "If a coin is tossed 100 times, will it come up heads 50 times and tails 50 times?"
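As a rough sketch of the goodness-of-fit case, the coin example can be worked by hand. The observed counts below (58 heads, 42 tails) are hypothetical, made up for illustration. The code assumes the standard rule that a goodness-of-fit test has one fewer degree of freedom than it has categories, and it uses 3.841, the usual tabulated chi-square critical value for one degree of freedom at the 5% significance level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical result of 100 tosses: [heads, tails].
observed = [58, 42]
# A fair coin is expected to split 50/50.
expected = [50, 50]

chi2 = chi_square_statistic(observed, expected)

# Two categories whose counts must total 100: only one count is free to vary.
df = len(observed) - 1

# Tabulated critical value for df = 1 at the 5% significance level.
CRITICAL_VALUE_DF1 = 3.841
reject_null = chi2 > CRITICAL_VALUE_DF1

print(round(chi2, 2), df, reject_null)  # 2.56 1 False
```

With these made-up counts, the statistic falls below the critical value, so the hypothesis of a fair coin would not be rejected at the 5% level.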
For these tests, degrees of freedom are used to determine whether a given null hypothesis can be rejected, based on the number of variables and samples in the experiment. For example, when considering students and course choice, a sample of 30 or 40 students is likely too small to produce statistically significant results. The same or similar results from a study of 400 or 500 students carry more weight.
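The effect of sample size can be sketched with a test of independence. The two tables below are hypothetical: each splits students into two groups and three course choices, with identical proportions but 40 students in one and 400 in the other. The code assumes the standard rule that degrees of freedom for a contingency table equal (rows − 1) × (columns − 1), and it uses 5.991, the usual tabulated critical value for two degrees of freedom at the 5% level.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows are two student groups, columns are three courses.
small_study = [[8, 6, 6], [5, 7, 8]]        # 40 students
large_study = [[80, 60, 60], [50, 70, 80]]  # 400 students, same proportions

CRITICAL_VALUE_DF2 = 5.991  # df = 2 at the 5% significance level

chi2_small, df_small = chi_square_independence(small_study)
chi2_large, df_large = chi_square_independence(large_study)

print(round(chi2_small, 3), df_small, chi2_small > CRITICAL_VALUE_DF2)
print(round(chi2_large, 3), df_large, chi2_large > CRITICAL_VALUE_DF2)
```

The degrees of freedom are the same for both tables because they depend on the table's shape, not on the number of students. The statistic, however, grows with the sample, so in this made-up data the same pattern clears the critical value only in the larger study.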
History of Degrees of Freedom
The earliest and most basic concept of degrees of freedom appeared in the early 1800s, in the works of mathematician and astronomer Carl Friedrich Gauss. The modern usage and understanding of the term was first expounded by William Sealy Gosset, an English statistician, in his article "The Probable Error of a Mean," published in Biometrika in 1908 under the pen name "Student" to preserve his anonymity. [1]
In his writings, Gosset did not specifically use the term "degrees of freedom." He did, however, explain the concept in the course of developing what would eventually be known as Student's t-distribution. The term itself did not become popular until 1922, when English biologist and statistician Ronald Fisher began using "degrees of freedom" in the reports and data he published on his work developing chi-squares.