CSUF Department of Psychology

Correlational Research

Two Kinds of Research

Most research in psychology falls into one of two categories: correlational and experimental.  Understanding the differences between these two types of research is one of the major goals of any introductory research methods course.

Correlational Research

Correlational research tests for statistical relationships between variables.  The researcher begins with the idea that there might be a relationship between two variables.  She or he then measures both variables for each of a large number of cases and checks to see if they are in fact related.  The relationship of interest could be either a D relationship or an R relationship, so this might involve making a bar graph and computing d or making a line graph or scatterplot and computing r.  It probably also involves null hypothesis testing to see if the observed relationship is statistically significant.

For example, imagine that a health psychologist is interested in testing the claim that people with more friends tend to be healthier.  She surveys 500 people in her community, asking them how many friends they have and getting some measure of their overall health.  Then she makes a scatterplot and sees that there is a positive correlation between these variables.  Specifically, she finds that r = +.3 and concludes that there is a moderate tendency for people with more friends to be healthier.
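
As a rough illustration of the computation behind an R relationship, the sketch below calculates Pearson's r for a handful of people.  The data are invented for illustration (they are not the 500 survey responses described above), and the function name pearson_r is simply a label chosen here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: number of friends and a 1-10 health rating for 8 people
friends = [1, 2, 3, 4, 5, 6, 7, 8]
health = [5, 3, 6, 4, 7, 5, 8, 6]

r = pearson_r(friends, health)
print(round(r, 2))
```

A positive value of r here means only that people with more friends tend to have higher health ratings in this made-up sample.  A real study would also test whether the relationship is statistically significant.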

Correlational Research and D Relationships

Many people mistakenly think that correlational research must involve R relationships.  This is probably because R relationships are more likely to be referred to as correlations than D relationships.  It could also be because correlational research is, in practice, more likely to involve R relationships than D relationships.  Regardless, it is important to remember that correlational research can involve D relationships as easily as R relationships.

For example, imagine that the health psychologist described above surveys two groups of 50 people: hospital patients being treated for chronic diseases and healthy community members.  In other words, one of her variables is categorical with the two values hospital patient and community member.  Then she compares the two groups in terms of the mean number of friends by making a bar graph and computing d.  She finds that d = 0.50 and concludes that there is a moderate tendency for healthy community members to have more friends than hospital patients.  This is also correlational research.
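
The sketch below shows one common way to compute d for a two-group comparison like this one.  It assumes that d means Cohen's d with a pooled standard deviation, which this page does not spell out, and the two small groups are invented for illustration rather than taken from the example above.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d: difference between group means divided by the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    # Pool the two variances, weighting each by its degrees of freedom
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical numbers of friends reported by 6 people in each group
community_members = [6, 8, 7, 9, 5, 7]
hospital_patients = [5, 7, 6, 8, 4, 6]

d = cohens_d(community_members, hospital_patients)
print(round(d, 2))
```

Note that nothing in this calculation involves assigning people to groups.  The researcher simply measures which group each person already belongs to, which is what makes the study correlational.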

A Problem for Correlational Research: Inferring Causal Relationships

Sometimes statistical relationships are interpreted in terms of changes in one variable causing changes in the other. For example, a psychologist might claim that having more friends is not just positively correlated with better health, but that it actually causes people to be healthier.  One problem with correlational research, however, is that it provides only weak evidence for such causal interpretations.  This is because it is possible for two variables—call them X and Y—to be related even though X does not cause Y.  There are essentially two reasons for this. 

The Directionality Problem

The directionality problem refers to the fact that X and Y will be statistically related if X causes Y... or if Y causes X.  Imagine, for example, that there is a positive correlation between number of friends and health.  This could be because having more friends somehow causes one to be healthier, or it could be because being healthier causes one to have more friends.  Maybe healthier people get out of the house more and engage in more social activities, which results in their having more friends.  Or imagine that a correlational study shows that people with pets are happier than people without pets.  This could be because having pets causes people to be happier, or it could be because being happier causes people to have pets.  There is no way to know for sure based on a correlational study.

The directionality problem is not always an issue because sometimes the value of one variable (Y) is determined after the value of the other variable (X) so that Y could not possibly have caused X.  For example, imagine that there is a statistical relationship between having been bullied in school and how strongly one favors mandatory sentences for convicted criminals as an adult.  Note that there is no way for a person’s attitudes as an adult to affect his or her experiences as a child.  So the only plausible causal interpretation here is that being bullied as children causes people to favor mandatory sentences as adults.  However, even this causal interpretation is not well supported because of another problem discussed below.

The Third-Variable Problem

The third-variable problem refers to the fact that X and Y will also be statistically related when there is a third variable that affects both of them, even if X does not cause Y and Y does not cause X.  Imagine again that there is a positive correlation between the number of friends one has and how healthy one is.  This does not have to be because having friends causes one to be healthier, and it does not have to be because being healthy causes one to have more friends.  There could be a third variable that causes people to have more friends and to be healthier.  One possibility is happiness.  Maybe being happier causes a person to have more friends and to be healthier. 
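
A small simulation can make this concrete.  In the sketch below, all of the numbers are invented: happiness is generated at random, and then number of friends and health are each built from happiness plus their own independent noise.  Neither friends nor health appears in the other's formula, so by construction neither one causes the other.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # hypothetical number of simulated people

# The third variable (standardized scores, an assumption of this sketch)
happiness = [rng.gauss(0, 1) for _ in range(n)]

# Each outcome depends on happiness plus its own independent noise
friends = [h + rng.gauss(0, 1) for h in happiness]
health = [h + rng.gauss(0, 1) for h in happiness]

r_observed = pearson_r(friends, health)

# Subtract out happiness and correlate what is left over
friends_left = [f - h for f, h in zip(friends, happiness)]
health_left = [y - h for y, h in zip(health, happiness)]
r_leftover = pearson_r(friends_left, health_left)

print(round(r_observed, 2), round(r_leftover, 2))
```

Under these assumptions, the correlation between friends and health should come out at roughly .5 even though neither affects the other, and it should drop to roughly 0 once happiness is subtracted out.  In a real study, of course, the researcher usually does not know what the third variable is and so cannot remove it this way.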

Note that the third-variable problem is not just that there is a third variable that is related to one or both variables; it is that there is a third variable that causes changes in both of the other variables, producing an incidental or spurious correlation between them. Do not make the following kind of mistake.  Ted says that the friends and health study has the third-variable problem because a person’s health is also related to his or her diet.  This is wrong because for diet to be a third variable, it would have to cause people to have more friends and cause them to be healthier, thus producing the statistical relationship between friends and health.  Although we know that one’s diet has an effect on one’s health, it seems unlikely that it has an effect on the number of friends one has.

A question that students often ask about the third-variable problem goes something like this: “Couldn’t you always think of some third variable that might be causing changes in both X and Y?” The answer is essentially “yes,” and this is why it is such a big problem.  In fact, even if you cannot think of some plausible third variable in a particular situation, there could still be one that you have not thought of, and this will always be the case with correlational research.

Look at Dr. Price's teaching blog on Jumping to Causal Conclusions for more examples on this topic.

"Correlation Does Not Imply Causation"

The thing to remember here is that you have to be very careful about interpreting the results of correlational research as evidence for causal interpretations.  Just because there is a statistical relationship between X and Y does not mean that X causes Y or that Y causes X.  Psychologists and psychology majors often remember it this way: "Correlation does not imply causation."  As Dr. Price discusses in his teaching blog, it is important to be clear about The Meaning of "Imply."

But Don't Be Too Hard on Correlational Research

Sometimes in a research methods course, we spend so much time trying to show that correlational research does not support causal interpretations that we leave the impression that correlational research is not useful.  This is not at all true.  It is useful for at least three reasons.

First, correlational research is fine for demonstrating statistical relationships.  A correlational study that shows a positive relationship between number of friends and health has shown that such a relationship exists, even if it does not show that having more friends causes better health.  Sometimes, just knowing that two variables are related is interesting … regardless of whether either one has a causal influence on the other.  In many situations, after establishing that there is a relationship, researchers will conduct more studies to determine why.

Second, correlational research often shows that there is no relationship between two variables, in which case neither the directionality problem nor the third-variable problem is much of an issue. Imagine, for example, a study showing that number of pets is not related to health. Except under unusual circumstances (which we will not get into here), this pretty much rules out the possibility that pets affect health, that health affects pets, and that there is a third variable that affects both. The two variables are simply unrelated and the researcher can move on to another question.

Third, there are situations in which the major alternative to correlational research (i.e., experimental research) cannot be conducted for practical or ethical reasons.  In such cases, correlational research might be the best approach available.