Get Knowledge - Information

Chi Square Test on Excel: How to Calculate Distribution

615

Chi-Square Statistic: How to Calculate It / Distribution

Chi Square Test: Definitions|For upon

  1. What is a Chi-Square Test?
  2. What is a Chi-Square Statistic?
  3. Chi-Square P-Values.
  4. The Chi-Square Distribution & Chi Distribution

Calculations:

  1. How to Calculate a Chi-Square Statistic:
    • By Hand (with video)
    • SPSS Instructions.
  2. How To Test a Chi-Square Hypothesis (with video)

See also:

  • Chi-square test for normality.

What is a Chi-Square Test?

Watch the video for an overview of the tests:

Chi-Square Test Intro

There are two types of chi-square tests. Both use the chi-square statistic and distribution for different purposes:

  • The Chi-square goodness of fit test determines if sample data matches a population. For more details on this type, see The goodness of Fit Test.
  • A chi-square test for independence compares two variables in a contingency table to see if they are related. In a more general sense, it tests to see whether distributions of categorical variables differ from each other. Chi-Square Test on Excel.

What is a Chi-Square Statistic?

The formula for the chi-square statistic used in the chi-square test is:

The chi-square formula.

The chi-square formula.

The subscript “c” is the degree of freedom. “O” is your observed value and E is your expected value. You’ll rarely want to use this formula to find a critical chi-square value by hand. The summation symbol means that you’ll have to perform a calculation for every single data item in your data set. As you can probably imagine, the calculations can get very, very, lengthy and tedious. Instead, you’ll probably want to use technology:

  • Chi-Square Test in SPSS.
  • Chi-Square P-Value in Excel.

A chi-square statistic is one way to show a relationship between two categorical variables. In statistics, there are two types of variables: numerical (countable) variables and non-numerical (categorical) variables. A chi-squared statistic is a single number that tells you how much difference exists between your observed counts and the counts you would expect if there were no relationship at all in the population. Chi-Square Test on Excel.

There are a few variations on the chi-square statistic. Which one you use depends upon how you collected the data and which hypothesis is being tested. However, all of the variations use the same idea, which is that you are comparing your expected values with the values you collect. One of the most common forms can be used for contingency tables:

chi-square statistic

Where O is the observed value, E is the expected value, and “i” is the “ith” position in the contingency table.

A low value for chi-square means there is a high correlation between your two sets of data. In theory, if your observed and expected values were equal (“no difference”) then the chi-square would be zero — an event that is unlikely to happen in real life. Deciding whether a chi-square test statistic is large enough to indicate a statistically significant difference isn’t as easy as it seems. It would be nice if we could say a chi-square test statistic >10 means a difference, but unfortunately, that isn’t the case.

You could take your calculated chi-square value and compare it to a critical value from a chi-square table. If the chi-square value is more than the critical value, then there is a significant difference.

You could also use a p-value. First state the null hypothesis and the alternate hypothesis. Then generate a chi-square curve for your results along with a p-value (See: Calculate a chi-square p-value Excel). Small p-values (under 5%) usually indicate that a difference is significant (or “small enough”).

Tip: The Chi-square statistic can only be used on numbers. They can’t be used for percentages, proportions, means, or similar statistical values. For example, if you have 10 percent of 200 people, you would need to convert that to a number (20) before you can run a test statistic. Chi-Square Test on Excel.

Chi-Square P-Values.

A chi-square test will give you a p-value. The p-value will tell you if your test results are significant or not. In order to perform a chi-square test and get the p-value, you need two pieces of information:

  1. Degrees of freedom. That’s just the number of categories minus 1.
  2. The alpha level(α). This is chosen by you, or the researcher. The usual alpha level is 0.05 (5%), but you could also have other levels like 0.01 or 0.10.

In elementary statistics or AP statistics, both the degrees of freedom(df) and the alpha level is usually given to you in a question. You don’t normally have to figure out what they are. You may have to figure out the df yourself, but it’s pretty simple: count the categories and subtract 1.

Degrees of freedom are placed as a subscript after the chi-square (Χ2) symbol. For example, the following chi-square shows 6 df:
Χ26.
And this chi-square shows 4 df:
Χ24.

The Chi-Square Distribution

chi square distribution

By Geek3|Wikimedia Commons GFDL

The chi-square distribution (also called the chi-squared distribution) is a special case of the gamma distribution; A chi-square distribution with n degrees of freedom is equal to a gamma distribution with a = n / 2 and b = 0.5 (or β = 2).

Let’s say you have a random sample taken from a normal distribution. The chi-square distribution is the distribution of the sum of these random samples squared. The degree of freedom (k) is equal to the number of samples being summed. For example, if you have taken 10 samples from the normal distribution, then df = 10. The degrees of freedom in a chi-square distribution are also its mean. In this example, the mean of this particular distribution will be 10. Chi-square distributions are always right-skewed. However, the greater the degrees of freedom, the more the chi-square distribution looks like a normal distribution. Chi-Square Test on Excel.

A chi-bar-squared distribution is a mixture of chi-square distributions, mixed over their degrees of freedom. You’ll often find them when testing a hypothesis with an inequality [1].

Uses

The chi-squared distribution has many uses in statistics, including:

  • Confidence interval estimation for a population standard deviation of a normal distribution from a sample standard deviation [2].
  • Independence of two criteria of classification of qualitative variables.
  • Relationships between categorical variables (contingency tables).
  • Sample variance study when the underlying distribution is normal.
  • Tests of deviations of differences between expected and observed frequencies (one-way tables).
  • The chi-square test (goodness of fit test).

Chi Distribution

A similar distribution is the chi distribution [3]. This distribution describes the square root of a variable distributed according to a chi-square distribution.; with df = n > 0 degrees of freedom has a probability density function of:

f(x) = 2(1-n/2) x(n-1) e(-(x2)/2) / Γ(n/2)

For values where x is positive.
The CDF for this function does not have a closed form, but it can be approximated with a series of integrals, using calculus.

How to Calculate a Chi-Square Statistic

A chi-square statistic is used for testing hypotheses. Watch this video, How to calculate a chi-square.

Chi-Square Statistic (by hand)

chi-square statistic.

The chi-square formula.

The chi-square formula is a difficult formula to deal with. That’s mostly because you’re expected to add a large number of numbers. The easiest way to solve the formula is by making a table.

Example question: 256 visual artists were surveyed to find out their zodiac sign. The results were: Aries (29), Taurus (24), Gemini (22), Cancer (19), Leo (21), Virgo (18), Libra (19), Scorpio (20), Sagittarius (23), Capricorn (18), Aquarius (20), Pisces (23). Test the hypothesis that zodiac signs are evenly distributed across visual artists.

Step 1: Make a table with columns for “Categories,” “Observed,” “Expected,” “Residual (Obs-Exp)”, “(Obs-Exp)2” and “Component (Obs-Exp)2 / Exp.” Don’t worry about what these mean right now; We’ll cover that in the following steps.
 Calculate a Chi Square 1

Step 2: Fill in your categories. Categories should be given to you in the question. There are 12 zodiac signs, so:
 Calculate a Chi Square 2

Step 3: Write your counts. Counts are the number of each item in each category in column 2. You’re given the counts in the question:
 Calculate a Chi Square 3

Step 4: Calculate your expected value for column 3. In this question, we would expect the 12 zodiac signs to be evenly distributed for all 256 people, so 256/12=21.333. Write this in column 3.
 Calculate a Chi Square 4

Step 5: Subtract the expected value (Step 4) from the Observed value (Step 3) and place the result in the “Residual” column. For example, the first row is Aries: 29-21.333=7.667.
 Calculate a Chi Square 5

1Step 6: Square your results from Step 5 and place the amounts in the (Obs-Exp)2 column.
chi square table 6

Step 7: Divide the amounts in Step 6 by the expected value (Step 4) and place those results in the final column.
chi square table 7

Step 8: Add up (sum) all the values in the last column.
chi square table 8
This is the chi-square statistic: 5.094.

Like the explanation? Check out the Practically Cheating Statistics Handbook, which has hundreds more step-by-step explanations, just like this one!

SPSS Instructions.

You’ll find the chi-square test in SPSS under “Crosstabs”.

Example problem: Run a chi-square test in SPSS.

Note: In order to run a chi-square test in SPSS you should already have written a hypothesis statement. See: How to state the null hypothesis.

Watch the video for the steps:

How to run a Chi-Square Test in SPSS

Step 1: Click “Analyze,” then click “Descriptive Statistics,” then click “Crosstabs.”

How to Perform a Chi Square Test in SPSS.

Chi-square in SPSS is found in the Crosstabs command.

Step 2: Click the “Statistics” button. The statistics button is to the right of the Crosstabs window. A new pop-up window will appear.
chi square spss 2

Step 3: Click “Chi-Square” to place a check in the box and then click “Continue” to return to the Crosstabs window.

Step 4: Select the variables you want to run (in other words, choose two variables that you want to compare using the chi-square test). Click one variable in the left window and then click the arrow at the top to move the variable into “Row(s).” Repeat to add a second variable to the “Column(s)” window.

Step 5: Click “cells” and then check “Rows” and “Columns”. Click “Continue.”

Step 6: Click “OK” to run the Chi-Square Test. The Chi-Square tests will be returned at the bottom of the output sheet in the “Chi-Square Tests” box.

Step 7: Compare the p-value returned in the chi-square area (listed in the Asymp Sig column) to your chosen alpha level.

How To Test a Chi-Square Hypothesis (Test for Independence)

Watch the video for how to test a hypothesis with chi-square using the p-value method:

Chi-Square Test (P-Value Method)

A chi-square test for independence shows how categorical variables are related. There are a few variations on the statistic; which one you use depends upon how you collected the data. It also depends on how your hypothesis is worded. All of the variations use the same idea; you are comparing the values you expect to get (expected values) with the values you actually collect (observed values). One of the most common forms can be used in a contingency table.

The chi-square hypothesis test is appropriate if you have:

  • Discrete outcomes (categorical.)
  • Dichotomous variables.
  • Ordinal variables.

For example, you could have a clinical trial with blood sugar outcomes of hypoglycemic, normoglycemic, or hyperglycemic.

Test a Chi-Square Hypothesis: Steps

Sample question: Test the chi-square hypothesis with the following characteristics:

  1. 11 Degrees of Freedom
  2. Chi-square test statistic of 5.094

Note: Degrees of freedom equals the number of categories minus 1.

Step 1: Take the chi-square statistic. Find the p-value in the chi-square table. If you are unfamiliar with chi-square tables, the chi-square table link also includes a short video on how to read the table. The closest value for df=11 and 5.094 is between .900 and .950.
Note: The chi-square table doesn’t offer exact values for every single possibility. If you use a calculator, you can get an exact value. The exact p-value is 0.9265.

Step 2: Use the p-value you found in Step 1. Decide whether to support or reject the null hypothesis. In general, small p-values (1% to 5%) would cause you to reject the null hypothesis. This very large p-value (92.65%) means that the null hypothesis should not be rejected.

Like the explanation? Check out the Practically Cheating Statistics Handbook, which has hundreds more step-by-step explanations, just like this one!

Reference

[1] Chi-Bar-Squared. Retrieved March 9, 2022, from:
[2] Johns Hopkins.
[3] Kenney, J. F., and Keeping, E. S. Mathematics of Statistics, Pt. 2, 2nd ed. Princeton, NJ: Van Nostrand, 1951.
CITE THIS AS:
Stephanie Glen. “Chi-Square Statistic: How to Calculate It / Distribution” From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/probability-and-statistics/chi-square/

Need help with homework or test questions? With Chegg Study, you can get step-by-step solutions to your questions from an expert in the field. Your first 30 minutes with a Chegg tutor is free!

The article was originally published here.

TO GET MORE KNOWLEDGE ABOUT Chi Square Test, PLEASE VISIT OUR SITE: Forupon.com.

Comments are closed.