Hypotheses
As the Corbett and Le Roy text tells you, a hypothesis is a testable statement
of relationships that comes from theory. That is a pretty good
definition, so I will not try to improve it. The text also explains
each part of the definition: its testability, its specification of variables
and the roles that they play, and the fact that it comes from theory, and
is not just a wild guess.
The text elaborates on the theory part by saying that political research should be guided by theory. That statement could be even stronger--we might say that it MUST be guided by theory. Why? The world is filled with variables, and we can think of ingenious ways of measuring many of them. We also have powerful statistical programs that allow us to quickly test possible relationships. There are so many possible relationships that some are bound to appear by chance, with no logical causal link between the variables. These are often called "nonsense" relationships (like the number of Methodist ministers being ordained being related to the production of Jamaican rum). So if we just use a program like MicroCase to test all possible relationships among political variables in a large data set, we will find some relationships that exist either in the real world by mere chance or just in the sample, again by mere chance. That is why we always start with theory. If you remember, theory guides and focuses our inquiry. More precisely, it suggests relationships that should hold if the theory is correct.
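To see how easily chance relationships turn up, here is a minimal sketch in Python. It is not something from the text or from MicroCase. It assumes 20 made-up variables that are pure random noise, 50 cases each, and a cutoff of roughly |r| = 0.279, which is approximately the two-tailed 5 percent critical value for 50 cases. All names and numbers in it are illustrative assumptions.

```python
import itertools
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_chance_relationships(n_vars=20, n_cases=50, r_cutoff=0.279, seed=1):
    """Correlate every pair of unrelated random variables and count the
    pairs whose correlation passes the (approximate) 5 percent cutoff."""
    rng = random.Random(seed)
    # Every variable is independent noise, so no real relationship exists.
    data = [[rng.gauss(0, 1) for _ in range(n_cases)] for _ in range(n_vars)]
    pairs = list(itertools.combinations(range(n_vars), 2))
    hits = sum(
        1 for i, j in pairs if abs(pearson_r(data[i], data[j])) >= r_cutoff
    )
    return hits, len(pairs)


hits, total = count_chance_relationships()
print(f"{hits} of {total} unrelated pairs look 'significant' by chance")
```

With 190 pairs and a 5 percent cutoff, we should expect somewhere around nine or ten "findings" even though nothing is going on. Without theory to tell us which pairs to look at, we have no way to tell these from the real thing.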
Ok, so suppose you have a relationship--a hypothesis--that should be true if the theory is correct. One of two things can happen when you observe data. Either the data are consistent with the hypothesis or they are not. If the hypothesis holds, then we conclude that the data support the theory. Notice I said "support," not "prove." Logically speaking, one can never prove a theory--all one can do is support it. New evidence may arise that undermines the theory. Some third variable may be found that shows the original bivariate relationship to be spurious. Since we can never test for the effects of all possible third variables that could confound the original relationship, we can never fully prove a hypothesis or the theory from which the hypothesis was derived.
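The point about a third variable can be sketched with simulated data. In this hedged example, a made-up variable z drives both x and y, and x has no effect on y at all. The first-order partial correlation formula is a standard one, but the variables, sample size, and seed are assumptions chosen only for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling for z (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(7)
z = [rng.gauss(0, 1) for _ in range(5000)]   # the hidden third variable
x = [zi + rng.gauss(0, 1) for zi in z]       # x depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]       # y depends on z only, not on x

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))
print(f"bivariate r(x, y):       {r_xy:.2f}")
print(f"r(x, y) controlling z:   {r_xy_given_z:.2f}")
```

The bivariate correlation comes out clearly positive, yet it all but vanishes once z is controlled. A researcher who never thought to measure z would have found "support" for a relationship that is spurious.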
The other alternative is that the data do not support the hypothesis. Assuming that we made no mistakes in measurement, research design, or procedures, this means that something is wrong with the theory from which the hypothesis was derived. In this case we can say that the theory was disproved. That is a stronger statement. What it means is that the theory needs to be revised to account for what we found when we tested the hypothesis. The only caveat I would add here is that the assumption I made in the second sentence is often not a good one--measurements can be inadequate and research design can be faulty, as when a sample is too small or poorly chosen. If you think about this, you see why methodology is so important: poor methodology can create mistakes in either direction. (As you will see later, we use statistical rules that minimize the chances of wrongly finding support when in fact none exists--but that is a later topic: significance tests.)
Sometimes you hear people talking about laws, like Michels's iron law of oligarchy. Scientifically speaking, there really are no such things. All we have are theories that have been thoroughly tested over time and that seem to have a great deal of support, so we generally accept them as true until new and convincing evidence arises.
Stating Hypotheses
The Corbett/Le Roy text does a good job of telling you how to state hypotheses, so I have only a couple of comments. You remember, I hope, that when we talked about levels of measurement, we said that the level determines the kinds of statistics we can use. Well, the level also has implications for how we can state a hypothesis. In the case of ordinal, interval, and ratio measurements, we can simply state the relationship in terms of whether it is positive (meaning that as the independent variable increases, so does the dependent) or negative (as the independent increases, the dependent decreases).
In the case of a hypothesis with at least one nominal level variable in it, we have to state things more precisely by naming the categories being compared. For example: "Southerners have more religious intolerance than non-Southerners." Or: "Democrats are more supportive of Social Security than Republicans."
Corbett/Le Roy add an important point in the section on "Stating Hypotheses with Antecedent or Intervening Variables": we can add control or conditioning or intervening variables to the hypothesis. For example, "among whites, Southerners have more religious intolerance than non-Southerners." Here race is controlled for, implying that the relationship only holds for whites. To put it another way, the relationship is conditioned by race. Here's another: "regardless of age, Democrats are more supportive of Social Security than Republicans." In this one we are controlling for age, saying that the relationship holds true for ALL age groups ("regardless of age").
The only thing I would add is that we need to specify at the beginning of the hypothesis the population to which the hypothesis applies, unless that is very obvious. For example, suppose we are doing a study of college students and their political knowledge and skills. We might say that there is a positive relationship between age and political knowledge. But it would be better, in the sense of more precise, to say that "Among college students, there is a positive relationship between age and political knowledge." On the other hand, if in a general population study we hypothesize that age increases the likelihood of voting, we really do not need to say: "Among adult citizens over 18, there is a positive relationship between age and voting." (You will note here that voting is really a dichotomous variable. It can be treated like an ordinal or even a ratio variable, where 0 means no, did not vote, and 1 means "more voting," or even an exact amount of more voting: yes, did vote once. Do you remember what I said about dichotomous variables being a special case? Hope so. If not, remember it now!)
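Here is a small sketch of why the 0/1 coding of a dichotomous variable is so convenient. The ten respondents below are invented purely for illustration, and the age cutoff of 40 is an arbitrary assumption. Because voting is coded 0 or 1, the mean of the variable is simply the proportion who voted, so "more voting" has a direct numerical meaning.

```python
# Hypothetical respondents: (age, voted), with voted coded 0 = no, 1 = yes.
respondents = [
    (19, 0), (22, 0), (25, 1), (31, 0), (38, 1),
    (44, 1), (52, 1), (60, 0), (67, 1), (73, 1),
]


def turnout_rate(rows):
    """Mean of the 0/1 vote variable, which equals the share who voted."""
    votes = [voted for _, voted in rows]
    return sum(votes) / len(votes)


# Arbitrary split at age 40, only to compare a younger and an older group.
younger = [row for row in respondents if row[0] < 40]
older = [row for row in respondents if row[0] >= 40]

print(f"turnout under 40:    {turnout_rate(younger):.1f}")
print(f"turnout 40 and over: {turnout_rate(older):.1f}")
```

In this made-up sample the older group has the higher mean on the vote variable, which is what a positive relationship between age and voting would look like.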
The Null Hypothesis
This is a term that I usually skip, though it is a conventional term that is used in testing hypotheses. Using the definition in Corbett/Le Roy, the null hypothesis (usually labeled H0) is stated as just the opposite of what you hope to find: that there is NO relationship between the two variables in your hypothesis. If you think in terms of the null hypothesis, then we have two possible outcomes after we gather and analyze data. Either we reject the null, which means that there is a relationship, and therefore we found support for the theory. Or we fail to reject the null, which means that we found no evidence of a relationship, and therefore the theory must be changed.
I know all this is confusing, because it seems to be working in the opposite direction from the one we are really interested in. The reason it is done this way reflects the conservative nature of science. Rejecting the null, which is what we hope to do, is a sneaky way of avoiding saying that we accept the real hypothesis (usually labeled H1), which, of course, we never really completely do. I am personally satisfied to say that the data "support H1" rather than saying that they prove H1. But scientific convention here is to say that we reject the alternative explanation in H0, that there is no relationship. This eliminates an alternative theoretical explanation. As we eliminate more and more alternative theoretical explanations (by rejecting more and more H0's), we in effect compile more and more evidence that supports H1 and the theory on which it was based. So be it. I am certainly not enough of an authority to change the conventional terminology of science.
The phrase, "failing to reject," is particularly confusing because it is a kind of double negative. Why not just "accept the null?" Again, it is because of the conservative nature of science--that is too close to saying we proved something, so instead of saying "accept," we say "fail to reject." After a while you will get used to these conventional ways of saying things.
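To make "reject" and "fail to reject" concrete, here is a hedged sketch of one simple way to test a null hypothesis, a permutation test. This is a technique I am swapping in for illustration, not the significance tests we cover later. The scores are invented: imagine support for Social Security on a 0 to 10 scale for ten Democrats and ten Republicans. Under H0 the party labels do not matter, so we shuffle them many times and see how often chance alone produces a difference as large as the one observed.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Share of random relabelings whose difference in means is at least
    as large as the observed difference (two-sided)."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # under H0, group labels are interchangeable
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical support scores (0-10), invented for this example.
democrats = [7, 8, 6, 9, 7, 8, 9, 6, 8, 7]
republicans = [5, 4, 6, 5, 3, 6, 4, 5, 5, 4]

p_different = permutation_p_value(democrats, republicans)
p_same = permutation_p_value(democrats, democrats)

print("reject H0" if p_different < 0.05 else "fail to reject H0")
print("reject H0" if p_same < 0.05 else "fail to reject H0")
```

The first comparison leads us to reject H0, which counts as support for H1. The second compares a group with an identical copy of itself, so we fail to reject H0. Notice that even there we have not shown that no relationship exists; we have only failed to find one.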
Constructing "Good" Hypotheses
By good, I mean hypotheses that are clear, concise, and unambiguous. Here are some guidelines to follow.
1. Unless it is obvious, start with the theoretical population to which the hypothesis applies. We talked about that earlier. This is not a hard and fast rule, but it makes the research easier to read and less likely to be misinterpreted.
2. The hypothesis must have two variables in it. That means both variables must vary--yeah, I know that is a tautology--but you will be surprised how often people throw a constant into a hypothesis. For example, the statement that "high income people are more likely to be Republicans" has no variables in it. High versus what? Moderate income or low income or what? And Republican versus independent or what? The hypothesis must make clear how each variable actually changes. In interval level measurements that may be clear, but it is often not clear in nominal level measurements. It would be better to say that "Among voters, higher income people tend to be Republicans while lower income people tend to be Democrats." This wording is more precise, and that is a big part of science, even if it is tedious and wordy.
3. Avoid compound hypotheses. That means that if you have a third variable in a hypothesis, you should clearly describe the role it plays. Suppose we say "among nations, income and education are positively associated with rates of political participation." This is really two separate hypotheses and should be stated as such. If you have a third variable in a hypothesis, its role relative to the independent and dependent variables should be clearly stated: intervening, conditioning, control. So you might say: "Among nations, regardless of income, education is positively associated with rates of political participation." This makes clear that income is a control variable that you think will have no effect on the simple bivariate relationship. (By the way, I have been using this term a couple of times--all bivariate means is a two variable relationship.)
4. The relationship must be clear. Don't just say that income is related to charitable giving. That does not make clear HOW you think it will work. You probably intend to have a positive relationship. But I could make a case that in fact it is negative: that poor people, who better appreciate the ill fortunes of life, are likely to give higher proportions of their available income to charity than the wealthy, who need to feel that their high standard of living is due to their own individual effort.
5. Avoid value statements. You have already run into these in our discussion of measurement. Research and hypotheses can never fully answer questions of values, though we can shed some light on partial indicators that might be related to values. We can, however, have hypotheses about the values that people hold. So, for example, we should not have the following as a hypothesis: "In comparing schools, Clemson is a better college than USC Aiken." This cannot be tested (though obviously we would fail to reject the null!). On the other hand, we can test the following related hypothesis: "In comparing students, Clemson students are more likely to think they go to a top rank college than USC Aiken students are to think they go to a top rank college." Note that this is only a related hypothesis, because perceptions of students are only a small part of the quality of a college.
6. Just as in operational definitions, avoid tautologies. For example, to say that "Among adult citizens, those who vote less frequently have lower levels of political participation" is to really say nothing, because an important part of political participation is voting frequency--by definition!
7. General is better--avoid proper names if at all possible. For example, to say that "among American citizens, Clinton is more likely to be rated as having been a weaker leader than Jimmy Carter" is not very general at all. It would be better to say that "Among American citizens, presidents with major scandals in their administrations are likely to be rated as weaker leaders than presidents without major scandals."
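If it helps to think of these guidelines as a checklist, here is a minimal sketch that treats a hypothesis as a small record with its parts spelled out. The class and its checks are my own invention for illustration, they cover only a few of the guidelines, and the sentence template fits only hypotheses stated as positive or negative relationships (ordinal level and above).

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    population: str    # guideline 1: who the hypothesis applies to
    independent: str   # guideline 2: must be a variable, not a constant
    dependent: str
    direction: str     # guideline 4: "positive" or "negative"

    def problems(self):
        """Return a list of guideline violations this sketch can detect."""
        issues = []
        if not self.population.strip():
            issues.append("no population stated")
        if self.direction not in ("positive", "negative"):
            issues.append("direction of relationship unclear")
        if self.independent.strip().lower() == self.dependent.strip().lower():
            issues.append("tautology: both variables are the same")
        return issues

    def statement(self):
        return (
            f"Among {self.population}, there is a {self.direction} "
            f"relationship between {self.independent} and {self.dependent}."
        )


good = Hypothesis("college students", "age", "political knowledge", "positive")
vague = Hypothesis("", "income", "charitable giving", "related")

print(good.statement())
print(vague.problems())
```

A check like this cannot catch value statements, hidden tautologies, or proper names. Those still take judgment, which is what the assignment below asks you to practice.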
Multiple Causation
We may reject a lot of null hypotheses and still fall far short of explaining all of the change that takes place in the dependent variable. To put this in statistical terms, we can rarely explain all the variance. In the complex real world, many independent variables affect dependent variables, and it is practically impossible to account for all of these possible independent variables, let alone all the possible control variables. So at best we end up with some support for incomplete theories. But if the question is important, like how to encourage citizens to play an active and informed role in the political life of their communities, it is worth doing.
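A quick simulation shows what "explaining only part of the variance" looks like. In this sketch the dependent variable y is built from three equally important independent variables plus random noise, but the researcher measures only the first. Everything here (the variables, equal weights, sample size, seed) is an assumption for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(3)
n = 5000
x1 = [rng.gauss(0, 1) for _ in range(n)]     # the one cause we measured
x2 = [rng.gauss(0, 1) for _ in range(n)]     # causes we did not measure
x3 = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]

# y has four equally weighted sources of variation.
y = [a + b + c + e for a, b, c, e in zip(x1, x2, x3, noise)]

# In a bivariate relationship, r squared is the share of variance explained.
r_squared = pearson_r(x1, y) ** 2
print(f"share of variance in y explained by x1 alone: {r_squared:.2f}")
```

The relationship between x1 and y is perfectly real, and we would easily reject the null, yet x1 accounts for only about a quarter of the variance. That is the usual situation in political research.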
Where Hypotheses Come From
You should already know that the obvious answer to this is theory: hypotheses come from theory. If you get a hypothesis from a theory, you get it by way of deductive logic. Deduction has a Latin root that means to "draw out." So we draw out hypotheses that are implied by theory. For example, balance of power theory says that nations form alliances in order to defend themselves from other stronger and potentially dangerous nations. We might hypothesize that the stronger a neighboring nation relative to a given nation, the more likely that the given nation will form a military alliance with another nation that is equally strong. We might find this to be true when we apply it to the early 1800s, but then find that it is not true in the Cold War period following WWII. The latter finding--in which we fail to reject the null--suggests that the theory needs to be modified when applied to a bipolar world with two strongly opposing ideologies.
You can also get hypotheses through inductive logic. This is much more common, mainly because political science is not rich in theories. So we have to generate hypotheses from other sources. Here what happens is that we first observe what is happening in the world around us, or read historical accounts, or immerse ourselves in some subject area, or do detailed case studies so that we can explain what is going on in the case studies. Then we try to generalize from that single experience--that is, we induce a more general statement that is a testable hypothesis. So we get a general hypothesis from an explanation that came from one or a few experiences and observations. For example, I might observe that a well financed local candidate who is challenging an incumbent narrowly wins a campaign by appealing to voters on the basis of their hatred of taxes and fees. This places the incumbent on the defensive, even though the fees and taxes are very unlikely to end. This suggests that even though incumbents generally have the advantage, incumbents who are on the defensive with a well financed opponent are in real trouble. That is a more general explanation. From this we can get at least a couple of hypotheses:
H1: In elections involving a challenger and an incumbent, the more a challenger spends relative to the incumbent, the greater the challenger's chances of election.
H2: In elections involving a challenger and an incumbent, the more the incumbent is placed on the defensive relative to the challenger, the more likely the challenger is to win the election.
The second hypothesis involves a variable, relative defensiveness, that
could prove difficult to operationalize, but it is at least in theory measurable.
We might, for example, do "content analyses" of campaign statements or campaign
literature. We might look for the frequency of statements defending existing
policies.
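A very crude content analysis along these lines might look like the sketch below. The list of "defensive" phrases and the campaign statements are all invented for illustration; a real study would need a carefully justified coding scheme, and simple phrase matching is only one of many ways to code text.

```python
# Hypothetical coding scheme: phrases we assume signal a defensive statement.
DEFENSIVE_PHRASES = ("i stand by", "my record", "was necessary", "let me explain")


def defensive_rate(statements):
    """Share of statements containing at least one defensive phrase."""
    if not statements:
        return 0.0
    flagged = sum(
        1 for s in statements
        if any(phrase in s.lower() for phrase in DEFENSIVE_PHRASES)
    )
    return flagged / len(statements)


# Invented campaign statements, for illustration only.
incumbent = [
    "I stand by the fee increase; it was necessary.",
    "My record on taxes speaks for itself.",
    "We will build a new library.",
    "Let me explain why the budget grew.",
]
challenger = [
    "Taxes are too high.",
    "It is time for a change.",
    "My record in business shows I can cut costs.",
    "Fees hurt families.",
]

# Relative defensiveness: incumbent's rate minus challenger's rate.
relative_defensiveness = defensive_rate(incumbent) - defensive_rate(challenger)
print(f"incumbent:  {defensive_rate(incumbent):.2f}")
print(f"challenger: {defensive_rate(challenger):.2f}")
print(f"relative defensiveness: {relative_defensiveness:.2f}")
```

Notice that one of the challenger's statements is flagged even though it is hardly defensive. That is a measurement problem of exactly the kind we discussed earlier, and it is why operationalizing this variable could prove difficult.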
Assignment:
Look at the following hypotheses and answer the following questions
about each one:
a. What is the theoretical population?
b. What is the independent variable?
c. What is the dependent variable?
d. What problem or problems does the hypothesis have, and how might
you correct each problem?
1. The higher the price of housing, the less littering will take place in a neighborhood.
2. Urban areas tend to have higher rates of violent crimes than do rural areas.
3. Incumbents receive a lot of campaign contributions.
4. The Irish are a more agreeable people than the French.
5. The S.C. Education Lottery passed in 2000 because more money was spent on the pro-lottery side.
6. Rural residents are more likely to be conservative because they do not face as many social problems as those who live in urban areas and therefore do not see as much need for government.
7. Citizens pay more attention to politics during periods of national crisis than they do in relatively tranquil periods of time.
8. Young people are more likely to rate education as a very important issue than old people.
Read the following excerpt from Arthur Schlesinger's A Thousand Days (Boston: Houghton Mifflin, 1965), an important history of the Kennedy presidency. a) Write down in a sentence or two the explanation that he offers for our involvement in Vietnam. b) Then write a general hypothesis that you can induce from the explanation and that can, at least in theory, be tested.
"Most intractable of all of President Kennedy's problems
in South East Asia was the problem of Vietnam. In the end this was to consume
more of the President's attention and concern than anything else in Asia.
The American commitment to the Saigon government was now of nearly seven
years' standing. After the Geneva Agreements of 1954 had split Vietnam
along the 17th parallel, President Eisenhower had written Prime Minister
Ngo Dinh Diem...pledging American support 'to assist the Government of
Viet-Nam in developing and maintaining a strong, viable state, capable
of resisting attempted subversion or aggression through military means.'
...
Whether we were right in 1954 to undertake this
commitment will long be of interest to historians, but it had ceased by
1961 to be of interest to policy makers. Whether we had vital interests
in South Vietnam before 1954, the Eisenhower letter created those interests.
Whether we should have drawn the line where we did, once it was drawn we
became every succeeding year yet more imprisoned by it. Whether the domino
theory was valid in 1954, it had acquired validity seven years later, after
neighboring governments had staked their own security on the ability of
the United States to live up to its pledge to Saigon. Kennedy...had no
choice now but to work within the situation he had inherited. The Eisenhower
policy left us no alternative in 1961 but to continue the effort of 1954."