Analysis of Variance (ANOVA): what it is and how it is used in statistics
Jul 15, 2021
In statistics, when the means of two or more samples are compared in relation to some variable of interest (for example, anxiety after psychological treatment), tests are used to determine whether or not there are significant differences between the means.
One of them is the Analysis of Variance (ANOVA). In this article we will see what this parametric test consists of and which assumptions must be met in order to use it.
Analysis of Variance (ANOVA): what is it?
In statistics, Analysis of Variance (ANOVA) is a collection of statistical models and their associated procedures in which the variance is partitioned into components attributable to different explanatory variables. The name is an acronym of the English term: ANalysis Of VAriance.
The Analysis of Variance (ANOVA) is a type of parametric test. This means that a series of assumptions must be met to apply it, and that the variable of interest must be measured at a quantitative level, that is, at least on an interval scale (for example IQ, which has a relative zero).
Analysis of variance techniques
The first analysis of variance techniques were developed in the 1920s and 1930s by R.A. Fisher, a statistician and geneticist. That is why analysis of variance (ANOVA) is also known as "Fisher's ANOVA" or "Fisher's analysis of variance"; the name also reflects the use of Fisher's F distribution (a probability distribution) in the hypothesis test.
Analysis of variance (ANOVA) arises from the concepts of linear regression. In statistics, linear regression is a mathematical model used to approximate the dependency relationship between a dependent variable Y (for example, anxiety), the independent variables Xi (for example, different treatments) and a random error term.
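As a sketch of the regression model just described, it can be written as follows. The coefficient symbols \beta and the error symbol \varepsilon are notation assumed here for illustration; they do not appear in the article.

```latex
Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \varepsilon
```

Read in ANOVA terms, each X_i can be taken as an indicator of membership in one treatment group, so the coefficients describe how the group means differ and \varepsilon collects the variability left unexplained.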
Function of this parametric test
Thus, an analysis of variance (ANOVA) serves to determine whether different treatments (e.g. psychological treatments) show significant differences or whether, on the contrary, their population means can be taken not to differ (they are practically the same, or their difference is not significant).
In other words, ANOVA is used to test hypotheses about differences between means (typically among more than two means). ANOVA involves an analysis or decomposition of the total variability, which can be attributed mainly to two sources of variation:
- Intergroup variability
- Intragroup variability or error
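The decomposition into these two sources can be sketched in a few lines of Python. This is a minimal illustration, not a full ANOVA procedure: the function name `one_way_anova` and the anxiety scores are hypothetical, and only the F statistic is computed (obtaining a p-value would additionally require the F distribution).

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)      # number of groups
    n = len(all_scores)  # total number of scores
    # Intergroup variability: how far each group mean lies from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Intragroup variability (error): how far each score lies from its group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ss_between, ss_within, ms_between / ms_within

# Hypothetical anxiety scores after three different treatments
treatment_a = [4, 5, 6]
treatment_b = [6, 7, 8]
treatment_c = [8, 9, 10]

ss_b, ss_w, f_stat = one_way_anova([treatment_a, treatment_b, treatment_c])
print(ss_b, ss_w, f_stat)  # 24 6 12.0
```

The two sums of squares add up to the total variability around the grand mean (here 24 + 6 = 30), and the F statistic is the ratio of the intergroup mean square to the intragroup mean square.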
Types of ANOVA
There are two types of analysis of variance (ANOVA):
1. Anova I
When there is only one classification criterion (independent variable; for example, type of therapeutic technique). In turn, it can be intergroup (there are several experimental groups) or intragroup (there is only one experimental group).
2. Anova II
In this case, there is more than one classification criterion (independent variable). As in the previous case, it can be intergroup or intragroup.
Characteristics and assumptions
When the analysis of variance (ANOVA) is applied in experimental studies, each group consists of a certain number of subjects, and the groups may differ in this number. When every group has the same number of subjects, we speak of a balanced model.
In statistics, in order to apply the analysis of variance (ANOVA), a series of assumptions must be met:
1. Normality
This means that the scores on the dependent variable (for example, anxiety) must follow a normal distribution. This assumption is checked by means of so-called goodness-of-fit tests.
2. Independence
It implies that there is no autocorrelation between the scores, that is, that the scores are independent of each other. To ensure compliance with this assumption, the sample to be studied should be selected by simple random sampling (SRS).
3. Homoscedasticity
This term means "equality of variances of the subpopulations". The variance is a statistic of variability and dispersion; it increases as the variability or dispersion of the scores increases.
The assumption of homoscedasticity is verified using the Levene or Bartlett test. If it is not met, one alternative is to apply a logarithmic transformation to the scores.
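To give an idea of how the Levene test works, the following minimal sketch computes its statistic W in plain Python. It assumes the mean-centered variant of the test, in which W is simply a one-way ANOVA F statistic calculated on the absolute deviations of each score from its own group mean. The function name `levene_w` and the scores are hypothetical.

```python
from statistics import mean

def levene_w(groups):
    """Levene's W (mean-centered variant): a one-way ANOVA F statistic
    computed on absolute deviations from each group's own mean."""
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    all_dev = [z for d in deviations for z in d]
    grand = mean(all_dev)
    k, n = len(groups), len(all_dev)
    between = sum(len(d) * (mean(d) - grand) ** 2 for d in deviations)
    within = sum((z - mean(d)) ** 2 for d in deviations for z in d)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical scores: the first two groups have the same spread,
# the third is far more dispersed.
narrow_1 = [4, 5, 6, 7]
narrow_2 = [10, 11, 12, 13]
wide = [1, 5, 9, 13]

w_equal = levene_w([narrow_1, narrow_2])  # identical spreads give W = 0
w_unequal = levene_w([narrow_1, wide])    # unequal spreads give a larger W
print(w_equal, w_unequal)
```

A large W, judged against the F distribution with k - 1 and n - k degrees of freedom, suggests that the group variances are not equal.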
The above assumptions must be met when intergroup analysis of variance (ANOVA) is used. However, when using an intragroup ANOVA, the above assumptions and two more must be met:
1. Sphericity
If it is not met, this indicates that the different sources of error are correlated with each other. A possible solution in that case is to perform a MANOVA (Multivariate Analysis of Variance).
2. Additivity
It assumes that there is no subject x treatment interaction; if it is violated, the error variance increases.