
Kolmogorov-Smirnov test: what it is and how it is used in statistics

In statistics, both parametric and non-parametric tests are widely used. One common non-parametric test is the Kolmogorov-Smirnov test, which lets us check whether sample scores follow a specified theoretical distribution, most often the normal distribution.

It belongs to the group of so-called goodness-of-fit tests. In this article we will look at its characteristics, what it is for and how it is applied.

  • Related article: "Chi-square (χ²) test: what it is and how it is used in statistics"

Nonparametric tests

The Kolmogorov-Smirnov test is a type of nonparametric test. Nonparametric tests (also called distribution-free tests) are used in inferential statistics and have the following characteristics:

  • They propose hypotheses about goodness of fit, independence...
  • The level of measurement of the variables can be low (ordinal).
  • They do not impose strict assumptions on the data.
  • They are applicable to small samples.
  • They are robust.

Kolmogorov-Smirnov test: characteristics

The Kolmogorov-Smirnov test belongs to statistics, specifically to inferential statistics. Inferential statistics aims to extract information about populations.

It is a goodness-of-fit test: it measures the degree of agreement between the distribution of a data set and a specific theoretical distribution. It is most often used to check whether the scores obtained from a sample follow a normal distribution. Its objective is to indicate whether the data come from a population with the specified theoretical distribution; in other words, it tests whether the observations could reasonably have come from that distribution.

The Kolmogorov-Smirnov test addresses the following question: Do the sample observations come from some hypothesized distribution?

Null hypothesis and alternative hypothesis

As a goodness-of-fit test, it answers the question: "does the (empirical) sample distribution fit the (theoretical) population distribution?". In this case, the null hypothesis (H0) states that the empirical distribution matches the theoretical one (the null hypothesis is the one that is maintained unless the data provide enough evidence against it). In other words, the null hypothesis states that the observed frequency distribution is consistent with the theoretical distribution (and is therefore a good fit).

In contrast, the alternative hypothesis (H1) states that the observed frequency distribution is not consistent with the theoretical distribution (bad fit). As in other hypothesis tests, the symbol α (alpha) indicates the significance level of the test.

  • You may be interested in: "Pearson's correlation coefficient: what it is and how to use it"

How is it calculated?

The result of the Kolmogorov-Smirnov test is represented by the letter Z. Z is calculated from the largest difference (in absolute value), usually denoted D, between the theoretical and the observed (empirical) cumulative distribution functions.
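The calculation described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function name `ks_statistic`, the example grades and the normal parameters (mean 6.0, standard deviation 1.5) are all made up for the example, and the sketch assumes that Z denotes the scaled statistic √n·D, which is how some statistical packages report it.

```python
import math
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Return (D, Z) for a one-sample Kolmogorov-Smirnov comparison.

    D is the largest absolute gap between the empirical cumulative
    distribution of `sample` and the theoretical one given by `cdf`.
    Z = sqrt(n) * D is the scaled form (an assumed convention here).
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x,
        # so the gap is checked on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d, math.sqrt(n) * d


# Hypothetical exam grades, compared with a normal distribution whose
# mean and standard deviation are assumed to be known in advance.
grades = [4.5, 5.0, 5.5, 6.0, 6.0, 6.5, 7.0, 7.5, 8.0, 9.0]
d, z = ks_statistic(grades, NormalDist(mu=6.0, sigma=1.5).cdf)
print(f"D = {d:.3f}, Z = {z:.3f}")
```

A small D means the empirical and theoretical curves stay close together over the whole range of the data; a large D means they separate somewhere.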

Assumptions

To apply the Kolmogorov-Smirnov test correctly, a series of assumptions must be met. First, the test assumes that the parameters of the test distribution have been specified in advance. In practice, however, statistical software often estimates these parameters from the sample itself.

When the parameters are estimated in this way, the sample mean and standard deviation serve as the parameters of the normal distribution, the minimum and maximum values of the sample define the range of the uniform distribution, the sample mean is the parameter of the Poisson distribution and the sample mean is also the parameter of the exponential distribution.
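As a rough illustration of the estimates just listed, the following Python sketch computes them for one data set. The function name `estimate_parameters` and the data are hypothetical, and the standard deviation uses the n − 1 denominator, which is an assumption of this sketch rather than something the text specifies.

```python
from statistics import mean, stdev


def estimate_parameters(sample):
    """Sample-based parameter estimates for four candidate distributions."""
    m = mean(sample)
    return {
        "normal": {"mean": m, "sd": stdev(sample)},  # sd with n - 1 denominator
        "uniform": {"min": min(sample), "max": max(sample)},
        "poisson": {"mean": m},
        "exponential": {"mean": m},
    }


waiting_times = [0.4, 1.1, 1.9, 2.5, 3.0, 4.6]  # hypothetical data
for name, params in estimate_parameters(waiting_times).items():
    print(name, params)
```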

When the parameters are estimated from the sample rather than specified in advance, the ability of the Kolmogorov-Smirnov test to detect deviations from the hypothesized distribution can be severely diminished. To test against a normal distribution with estimated parameters, the Lilliefors version of the K-S test should be considered.
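One way to see why estimated parameters call for a different reference distribution is to simulate it. The Python sketch below is a Monte Carlo illustration of the idea behind the Lilliefors correction, not the published Lilliefors tables: it repeatedly draws normal samples, fits a normal distribution to each one, and records the largest gap D against the fitted curve. The function names, the number of replicates and the seed are arbitrary choices made for this example.

```python
import random
from statistics import NormalDist, mean, stdev


def ks_d(sample, cdf):
    """Largest absolute gap between the empirical CDF and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def simulated_critical_value(n, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo critical value of D when mean and sd are estimated.

    Each replicate draws n standard-normal values, fits a normal
    distribution to them, and records D against that fitted distribution.
    """
    rng = random.Random(seed)
    ds = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        fitted = NormalDist(mean(sample), stdev(sample))
        ds.append(ks_d(sample, fitted.cdf))
    ds.sort()
    # Value exceeded by roughly a fraction alpha of the simulated D values.
    return ds[min(int((1 - alpha) * reps), reps - 1)]


crit = simulated_critical_value(n=20)
print(f"Simulated 5% critical value for n = 20: {crit:.3f}")
```

Because the fitted curve is pulled toward the data, the simulated D values tend to be smaller than they would be against a distribution fixed in advance. The simulated threshold should therefore come out clearly below the commonly tabulated 5% value of about 0.29 for a fully specified distribution at n = 20, which is why using the standard table with estimated parameters makes the test too lenient.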

Application

The Kolmogorov-Smirnov test can be applied to a sample to check whether a variable (for example, academic grades or income in euros) is normally distributed. This is often worth knowing, since many parametric tests require that the variables they use follow a normal distribution.

Advantages

Some of the advantages of the Kolmogorov-Smirnov test are:

  • It is more powerful than the Chi-square (χ²) test (also a goodness-of-fit test).
  • It is easy to calculate and use, and does not require grouping of the data.
  • The distribution of the statistic does not depend on the hypothesized distribution; it depends only on the sample size.
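The last point is what makes a single reference distribution usable for any fully specified continuous hypothesis. As a hedged sketch, the Python function below uses the well-known large-sample (Kolmogorov) series to approximate the p-value from the scaled statistic √n·D and compares it with α. The function name and the example values of D and n are hypothetical, and for small samples exact tables give somewhat different results.

```python
import math


def kolmogorov_p_value(z, terms=100):
    """Large-sample approximation of P(sqrt(n) * D >= z) under H0."""
    if z < 0.2:
        # Below this point the p-value equals 1 to many decimal places,
        # and the series converges slowly, so return 1 directly.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical result: D = 0.20 observed in a sample of n = 50.
alpha = 0.05
z = math.sqrt(50) * 0.20
p = kolmogorov_p_value(z)
decision = "reject H0" if p < alpha else "do not reject H0"
print(f"Z = {z:.3f}, p = {p:.4f}: {decision}")
```

The same function serves whether the hypothesized distribution is normal, uniform or exponential, provided its parameters were fixed in advance rather than estimated from the sample.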

Differences with parametric tests

Parametric tests, unlike non-parametric tests such as the Kolmogorov-Smirnov test, have the following characteristics:

  • They make hypotheses about parameters.
  • The level of measurement of the variables is at least quantitative.
  • There are a number of assumptions that must be met.
  • They do not lose information.
  • They have high statistical power.

Some examples of parametric tests are the t-test for a difference in means and ANOVA.
