Understanding Statistics

Introduction to Significance Testing

If you are going to implement a quantitative design for your thesis or dissertation, you will probably be using some form of null hypothesis significance testing. It may have been a while since you took your graduate-level statistics course, so the following is a brief refresher about what a null hypothesis is.

Null Hypothesis Significance Testing

In most quantitative research questions, there are both null hypotheses (noted as H0) and alternative hypotheses (noted as H1). In their simplest form, null hypotheses state that there will be no difference, effect, or relationship (the term you use depends on the type of statistics you run) between key study variables. The opposite of a null hypothesis is an alternative hypothesis, which is basically what you learned about hypotheses in science class: a proposition about what you expect will happen in your study with regard to your research question(s).

Null hypothesis significance testing estimates the probability of obtaining a result at least as extreme as the one you observed if the null hypothesis were true, that is, if the result arose from random sampling alone. Typically in research, the significance level (a.k.a. alpha level) is .05; you may recall seeing p < .05 in your early coursework in statistics. When we use an alpha level of .05, we as researchers are willing to accept a 5% chance of rejecting a null hypothesis that is actually true. If the test statistic that you obtained is more extreme than the critical value (which is the value associated with the alpha level of .05), then you would reject the null hypothesis, which is essentially saying that the data support the alternative hypothesis. If the test statistic that you obtained is less extreme than the critical value, you cannot reject the null hypothesis.
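The relationship between the critical value and the familiar p < .05 can be sketched in a few lines of Python. This is only an illustration, assuming a two-tailed z test (the text above does not name a specific test); the function name decide and the example statistic of 2.31 are hypothetical.

```python
from statistics import NormalDist

def decide(test_statistic, alpha=0.05):
    """Two-tailed z-test decision: compare |z| to the critical value."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    # p value: chance of a statistic at least this extreme if H0 is true
    p_value = 2 * (1 - std_normal.cdf(abs(test_statistic)))
    reject_null = abs(test_statistic) > critical_value
    return critical_value, p_value, reject_null

# Hypothetical z statistic, for illustration only
critical_value, p_value, reject_null = decide(2.31)
print(f"critical value = {critical_value:.2f}")
print(f"p = {p_value:.4f}, reject H0: {reject_null}")
```

Comparing the test statistic to the critical value and comparing the p value to alpha lead to the same decision; most software reports the p value.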

Types of Errors

There are two types of errors that researchers can make when using null hypothesis significance testing: Type I and Type II errors. Type I errors (also referred to as α errors) occur when researchers reject a null hypothesis that is in fact true. Type II errors (also referred to as β errors) occur when researchers fail to reject a null hypothesis that is in fact false.
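A short simulation can make the idea of a Type I error concrete. The sketch below assumes a one-sample z test with a known standard deviation of 1, and it generates every sample from a population in which the null hypothesis is true; the function name, sample size, and number of simulated studies are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(n_studies=2000, n=25, alpha=0.05, seed=1):
    """Simulate studies where H0 is true and count how often it is rejected."""
    rng = random.Random(seed)
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_studies):
        # H0 is true by construction: population mean 0, SD 1
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / n ** 0.5)  # one-sample z statistic
        if abs(z) > critical_value:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / n_studies

print(f"Type I error rate: {type_i_error_rate():.3f}")
```

Because every rejection here is a mistake, the proportion of rejections should land close to the chosen alpha level.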

 

Steps in Null Hypothesis Significance Testing

Below are the five steps of null hypothesis significance testing. It is important for you to know and understand these steps, but there are many statistical software programs that will do most of this work for you when you run the appropriate test.

1. State your hypotheses (both the null and the alternative).

2. Select your desired significance level (typically .05).

3. Calculate the test statistic.

4. Find the critical value.

5. Make your decision about the null hypothesis by determining whether the obtained test statistic is greater or less than the critical value.
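The five steps above can be walked through by hand in Python. This is a sketch under stated assumptions: the scores are made up, the test is a two-tailed one-sample z test, and the population standard deviation of 15 is assumed to be known. With real dissertation data you would more likely use a t test or another procedure from your statistics software.

```python
from statistics import NormalDist, mean

# Hypothetical data: test scores from a sample of 10 students
scores = [108, 112, 97, 121, 105, 118, 110, 104, 115, 109]

# Step 1: state the hypotheses
#   H0: the population mean is 100; H1: the population mean is not 100
null_mean = 100
population_sd = 15  # assumed known, which is what permits a z test

# Step 2: select the significance level
alpha = 0.05

# Step 3: calculate the test statistic
standard_error = population_sd / len(scores) ** 0.5
z = (mean(scores) - null_mean) / standard_error

# Step 4: find the critical value (two-tailed)
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: make the decision
reject_null = abs(z) > critical_value
print(f"z = {z:.2f}, critical value = {critical_value:.2f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```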

Parametric and Nonparametric Statistics

Most statistics courses tend to focus on parametric statistics; however, you might find that as you prepare to analyze your dissertation data, parametric statistics might not be an appropriate choice for your research. The following are some of the differences between parametric and nonparametric statistics.

Parametric Statistics

Parametric statistics are any statistical tests based on underlying assumptions about the distribution of the data. In other words, parametric statistics are based on the parameters of a known distribution, most commonly the normal curve. Because of this, your data must meet certain assumptions, or the results of parametric tests may not be valid. Prior to running any parametric statistics, you should always be sure to test the assumptions for the tests that you are planning to run.

Nonparametric Statistics

As implied by the name, nonparametric statistics are not based on the parameters of the normal curve. Therefore, if your data violate the assumptions of a parametric test and nonparametric statistics might better describe the data, try running the nonparametric equivalent of the parametric test. You should also consider using nonparametric equivalent tests when you have limited sample sizes (e.g., n < 30). Though nonparametric statistical tests have more flexibility than do parametric statistical tests, they are generally less powerful when the parametric assumptions are met; therefore, most statisticians recommend using parametric statistics when appropriate.
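As one illustration of a nonparametric equivalent, the Mann-Whitney U test is commonly used in place of the independent-samples t test. The sketch below computes U and an exact two-sided p value by enumerating every possible regrouping of the pooled scores, which is practical only for small samples. The data and function names are hypothetical, and in practice your statistics software would handle this for you.

```python
from itertools import combinations

def mann_whitney_u(a, b):
    """U for sample a: count of pairs where a's value beats b's (ties = 0.5)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

def exact_two_sided_p(a, b):
    """Exact p value: share of all regroupings with U at least as extreme."""
    pooled = list(a) + list(b)
    center = len(a) * len(b) / 2  # expected U when H0 is true
    observed = abs(mann_whitney_u(a, b) - center)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(group_a, group_b) - center) >= observed:
            extreme += 1
    return extreme / total

# Hypothetical scores; the 90 is an outlier that could distort a t test
control = [12, 14, 15, 17, 18]
treatment = [19, 21, 22, 25, 90]
u = mann_whitney_u(control, treatment)
p = exact_two_sided_p(control, treatment)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```

Because the test works only with the order of the scores, the outlier counts as just one more value that is higher than every control score, no matter how extreme it is.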

Parametric and Nonparametric Equivalencies

The table below outlines some common research designs and their appropriate parametric and nonparametric equivalents.

 
 
 
 
 
 
 
 
 

A Change of Plans: Using the Correct Statistics Test that Fits Your Data

You have successfully defended your dissertation proposal, and now you have your dissertation data. It might seem like the next logical step would be to run your analysis according to the analysis plan from your proposal.

However, it is important to make sure that the statistical test you plan to run is appropriate for the data that you obtained. Sticking rigidly to a choice of test that you made before you collected your data can threaten the validity of your entire project, because that plan may no longer fit your data, especially if the data turn out to be very different from what you expected.

Some common issues that you should consider include a limited number of participants, a limited number of cases across groups, and violations of normality in key variables. To check for these issues, you will want to work closely with your advisor or a statistical consultant and examine the data to determine if the statistical test you want to use is still appropriate. Once you have a sense of the data, you can make a more informed decision about how to proceed with the analysis.

If you happen to find violations of certain assumptions, there is no need for panic or despair; you can always adjust which statistical test you use to fit the data. For example, if the data violate the assumption of normality, a nonparametric test might be an appropriate alternative to your original analysis plan. If you do have to make any alterations to your analysis plan between creating your proposal and running the analysis, you will want to work closely with your advisor and your committee to make sure that you are on the right track and to see your work from someone else's perspective.

 
 

Data Crash Course: Types of Data and Scales of Measurement

Before you begin to collect data for your thesis or dissertation, it may be helpful for you to review the different types of data and scales of measurement available to you. You can use the following cheat sheet as a reference guide as you prepare to collect your dissertation data.

Scales of Measurement

Nominal: (a.k.a. categorical) refers to characteristic data that have no numeric value (e.g., ethnicity)

     Dichotomous: refers to types of nominal data that have only two categories (e.g., alive or dead)

Ordinal: refers to values in rank order (e.g., first, second, third) that have indeterminate intervals between adjacent values

Interval: refers to numeric values that have equal intervals between adjacent values but an arbitrary zero (e.g., temperature in degrees Celsius)

Ratio: refers to numeric values that have equal intervals between adjacent values and a nonarbitrary zero (e.g., weight)

Measures of Central Tendency

Mean: arithmetic average of a variable

Median: midpoint of a distribution (i.e., the same number of scores are above and below the median)

Mode: most frequently occurring value
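The three measures above are available in Python's built-in statistics module. The scores below are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical quiz scores; note the single very high score
scores = [70, 75, 75, 80, 85, 90, 120]

average = mean(scores)     # arithmetic average
midpoint = median(scores)  # middle score once the scores are sorted
most_common = mode(scores) # most frequently occurring score

print("mean:", average)
print("median:", midpoint)
print("mode:", most_common)
```

In this sample the one high score pulls the mean above the median, which is why the three measures can tell different stories about the same data.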

Variability

Variability: quantitative measure of the degree to which data in a distribution are spread out or clustered together, which is used to describe the distribution

Range: difference between the upper limit and lower limit of a variable

Deviation: distance from the mean

Variance: mean of the squared deviation scores

Standard deviation: square root of the variance; a standardized measure of variability expressed in the same units as the variable

Kurtosis: refers to the sharpness of the peak of a distribution (and the heaviness of its tails)

     Mesokurtic: peak similar to that of a normal distribution

     Leptokurtic: sharp peak (i.e., scores clustered tightly around the center)

     Platykurtic: flat peak (i.e., scores spread more evenly)
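The measures of variability defined above can be computed directly in Python. This is a sketch with hypothetical scores. It uses the population formulas (pvariance and pstdev), which match the "mean of the squared deviation scores" definition; the sample versions (variance and stdev) divide by n - 1 instead. The kurtosis line uses one common moment-based formula, and statistical packages may apply small-sample corrections that give slightly different values.

```python
from statistics import mean, pvariance, pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

value_range = max(scores) - min(scores)      # upper limit minus lower limit
center = mean(scores)
deviations = [x - center for x in scores]    # each score's distance from the mean
variance = pvariance(scores)                 # mean of the squared deviations
std_dev = pstdev(scores)                     # square root of the variance

# Excess kurtosis: fourth moment divided by squared variance, minus 3.
# Near 0 suggests mesokurtic, above 0 leptokurtic, below 0 platykurtic.
fourth_moment = mean(d ** 4 for d in deviations)
excess_kurtosis = fourth_moment / variance ** 2 - 3

print("range:", value_range)
print("variance:", variance)
print("standard deviation:", std_dev)
print("excess kurtosis:", excess_kurtosis)
```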

Distribution

Distribution: arrangement of values that variables take in a sample (i.e., the shape of the data)

Types of Distribution

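Normal distribution: (a.k.a. the normal curve) a symmetric, bell-shaped distribution in which the mean, median, and mode are the same value; a normal distribution can have any mean and standard deviation (e.g., IQ scores are commonly scaled to a mean of 100 and a standard deviation of 15)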

Negatively skewed: occurs when the tail of the distribution extends toward the lower values; typically, the mode is greater than the median, and the median is greater than the mean

Positively skewed: occurs when the tail of the distribution extends toward the higher values; typically, the mean is greater than the median, and the median is greater than the mode
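The ordering of the mean, median, and mode described above can serve as a quick check on the direction of skew. The sketch below is a rule of thumb and not a formal test; it can be misleading for small or unusual samples. The function name skew_direction and the data are hypothetical.

```python
from statistics import mean, median, mode

def skew_direction(values):
    """Classify skew from the ordering of the mean, median, and mode."""
    m, md, mo = mean(values), median(values), mode(values)
    if m > md > mo:
        return "positively skewed"
    if m < md < mo:
        return "negatively skewed"
    return "no clear skew by this rule"

# Hypothetical data: most values are low and a few are very high
incomes = [20, 20, 20, 25, 25, 30, 35, 40, 60, 125]
print(skew_direction(incomes))
```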
