Concepts, Variables, and Measurement
"When I use a word," Humpty Dumpty said in a rather
scornful tone, "it means just what I choose it to mean--neither more nor
less."
"The question is," said Alice, "whether you can
make words mean so many different things."
"The question is," said Humpty Dumpty, "which
is to be the master--that's all."
Lewis Carroll, Through the Looking Glass
Ok, we are now at the third step in the scientific research
process, concepts. But, as you know, in reporting research the concepts
often get mixed in with the theory or the problem statement. If the concepts
are very concrete, the writer may talk about them in the hypotheses or
not distinguish them from variables. Let us start with some definitions,
beginning with the concept of "concept."
Definition of Concept
Another text that I once used defines concepts as "a label we put on a phenomenon that enables us to link separate observations and to make generalizations. A convenience, a name we give to observations and events" (Louise G. White, Political Analysis, 2nd ed. Pacific Grove, California: Brooks Cole, 1990). This is quite a mouthful, but it really says the same thing as the earlier definition--concepts are labels we use for things ("phenomena") that seem to have something in common (the links between those separate observations and the generalizations that tie them together).
If the definition of concept is not clear, let's look at some examples of concepts: alienation, apathy, freedom, efficacy, tolerance, public interest, ideology, justice, voting, presidential behavior, political participation, political trust, political interest, political knowledge, political experience, congressional seniority, campaign contributions, media consumption, newspaper reading, power, influence, modernization, political development, fundamentalism, southernness, and community needs. We could go on, but I hope these examples illustrate the point.
Variables and Measurement (Operational Definitions)
The text gives a definition for a variable: "a measured concept." That is not quite good enough, because if you measure a concept, that is, if you turn it into something that can be observed and all the measurements come out the same, then it is a constant, not a variable. So let us approach the definition of a variable another way.
Every concept has some kinds of properties associated with it. Usually they are implicit in the definition. "Political efficacy," for example, has the property of feeling able to get what you want when you become involved in the political process. So if we try to measure this feeling with a question that asks the degree to which someone agrees with the statement "Elected leaders pay attention to the opinions of people like me," we have turned the concept into something that we can observe--the answers to this question. If the property of thinking that elected officials pay attention varies over individuals, then we can conclude that this varying property is a variable. So a variable is a property associated with a concept that varies when measured. Pretty simple. But if the property does not vary, it is a constant. That is important to know because if we are looking for how some concepts cause change in other concepts (which as you will see later is the essence of hypotheses), we must have variables, not constants.
The process, or the steps we use in measuring a variable, is the operational definition. This is almost precisely the definition the text gives. As we noted in previewing the steps of the scientific process, laying out precisely how these steps are to be performed is critical to the overall process. It has to be done so that someone else can follow the steps and end up with the same measurements if they are performed on the same objects.
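As a minimal sketch of these ideas, the Python below treats an operational definition as an explicit coding rule that anyone could reapply, and then checks whether the resulting measurements vary. The answer labels, the 1-to-4 scoring, the function names, and the respondents' answers are all assumptions made up for illustration; none of them comes from the text or from an actual survey.

```python
# Hypothetical coding rule for the efficacy item
# "Elected leaders pay attention to the opinions of people like me."
# The labels and the 1-4 scores are illustrative assumptions.
AGREEMENT_SCORES = {
    "strongly disagree": 1,
    "disagree": 2,
    "agree": 3,
    "strongly agree": 4,
}


def measure_efficacy(answer):
    """Apply the operational definition: turn one answer into a score."""
    return AGREEMENT_SCORES[answer.strip().lower()]


def is_variable(measurements):
    """A measured property is a variable only if it takes more than one value."""
    return len(set(measurements)) > 1


# Made-up answers from five respondents.
answers = ["Agree", "strongly disagree", "agree", "Strongly agree", "disagree"]
scores = [measure_efficacy(a) for a in answers]

print(scores)                     # [3, 1, 3, 4, 2]
print(is_variable(scores))        # True: the property varies across people
print(is_variable([3, 3, 3, 3]))  # False: identical measurements, a constant
```

Because the rule is written down in full, a second researcher applying it to the same answers would end up with the same scores, which is the point of an operational definition.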
You should realize that this process of measurement is fraught with problems. Operational measures may not capture all of what you are concerned with, especially if the concept is rich in meaning and has a lot of dimensions. Sometimes we try to create compound measures to deal with this (a subject we will add to the course later--it is not in the text). Sometimes we have to settle for an operational measure that is less than perfect, that only measures part of the concept. Cost and time can keep us from doing better. If you measure a concept differently than others, or measure a different aspect of it than someone else, you are likely to get different results. You cannot be sure whether the differences are a result of a different measurement method (or operational definition) or something else, like a change in the population. For example, suppose we measure partisan identification with a different question than was used in the last election and we find that the Democrats have gained 15 percentage points in support. Is that a real gain or is it a function of the change in the operational definition? We simply do not know. This is why in using operational definitions, we have to worry about the quality of those measures. Generally, when we talk about the quality of measures we use two terms: their reliability and their validity.
1. Reliability refers to the consistency of an operationalized measure. A reliable measure will yield the same results over and over again when applied to the same thing. An elastic yardstick is unreliable. Ten people can use it to measure the same object and they will likely get ten different answers. If you have a survey question that can be interpreted several different ways, it is going to be unreliable. One person may interpret it one way and another may interpret it another way. You do not know which interpretation people are taking. Later, when we talk about survey questions, we will go over some rules on how to write reliable questions. Even answers to questions that are clear may be unreliable, depending on how they are interpreted. In a survey on product confusion, respondents reported that the shapes of the two products were the same. The researcher concluded that respondents were confusing the shape of the main body. However, a distinctive part of the overall shape was the two smokestacks on each product. How do we know that the respondents were not referring to the smokestacks instead of the shape of the main body? We do not. This vague answer that is subject to multiple interpretations leads to an unreliable measurement.
2. Validity refers to whether the measure actually measures what it is supposed to measure. If a measure is unreliable, it is also invalid. That is, if you do not know what it is measuring, it certainly cannot be said to be measuring what it is supposed to be measuring. On the other hand, you can have a reliable measure that is invalid. For example, if we measure income level by asking someone how many years of formal education they have completed, we will get consistent results, but education is not income (although they are positively related). If the "trade dress" of a product refers to the total image of a product, then measuring how people perceive the product's color and shape by themselves falls far short of measuring the product's "trade dress." It is an invalid measure.
We can break validity down even further and talk about five kinds of validity: face validity, predictive validity, convergent validity, criterion validity, and content validity.
a. Face validity refers to whether a measure, on its face, seems to be related to the concept that is presumably being measured. Here we are talking about some kind of logical connection, asking does this make sense? If we measure political participation by whether a person has a strong or weak party identification, the relationship is logically tenuous at best. Participation refers to involvement, and although we know that strong identifiers are more likely to vote in general, many weak identifiers also vote. This sounds like more of a hypothesis between two different concepts than a way to measure a single concept. Moreover, identification is a psychological state of being while participation refers to actions that can be directly observed--two different kinds of things. The logical connection is quite weak and only indirect if it exists at all. Good rule of thumb--measure things as directly as possible--indirect measures will have more face validity problems. However, some things, like attitudes, must be measured indirectly because you cannot directly observe them.
b. Predictive validity refers to whether a new measure of something has the same predictive relationship with something else that the old measure had. For example, suppose we have a new way to measure ideology. The new measure should have the same relationship with issue positions (like abortion, government spending, and so on) as the old measure of ideology. If it doesn't, then the measure lacks predictive validity. Similarly, a new SAT that has a weaker relationship to success in college than the old one has weaker predictive validity.
c. Convergent validity refers to whether two different measures of presumably the same thing are consistent with each other--whether they converge to give the same measurement. For example, if SAT scores and ACT scores are convergent, then someone who scores high on one test should also score high on the other. Different measures of ideology should classify the same people the same way. If they do not, then they lack convergent validity.
d. Criterion validity is a test of a measure when the measure has several different parts or indicators in it--compound measures. Each part, or criterion, of the measure should have a relationship with all the parts in the measure for the variable to which the first measure is related in a hypothesis. (Yeah, I know that is a complicated sentence, but it is the best I can do.) For example, suppose we measure strength of partisanship by strength of self-identification and straight ticket voting. Further suppose we have a hypothesis that relates partisanship to political participation. Then each of these two criteria in the partisanship measure (strength of self-identification and straight ticket voting) should be related to each of the parts in a compound political participation measure (like voting turnout and campaign contributions). If straight ticket voting is positively related to voting turnout but not campaign contributions while strength of self-identification is related to both turnout and contributions, then the compound measure for strength of partisanship has a validity problem.
e. Content validity.
This also applies to measures that have different parts or indicators in
them. But here the question is whether they cover all the relevant aspects
of a concept. Remember that you do not always have the time and money to
cover all aspects. But does the measure get at enough of the essential ones?
To put it another way, does the measure have sufficient content in it to be
acceptable? This is sometimes a judgment call, but if you can, err on the
side of covering as much as possible.
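To make reliability and convergent validity a little more concrete, here is a rough Python sketch that uses a correlation coefficient to check two things: whether the same question gives consistent answers when asked twice of the same people, and whether two different measures of ideology line up. Using Pearson's r for this is one common choice and an assumption here, not something the text prescribes, and all of the numbers are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up scores for six respondents.

# Reliability: the same question asked of the same people on two occasions.
first_asking = [1, 2, 2, 3, 4, 4]
second_asking = [1, 2, 3, 3, 4, 4]
retest_r = pearson_r(first_asking, second_asking)

# Convergent validity: two different (hypothetical) ideology measures
# applied to the same people.
ideology_scale_a = [1, 2, 3, 4, 5, 6]
ideology_scale_b = [2, 1, 3, 5, 4, 6]
convergent_r = pearson_r(ideology_scale_a, ideology_scale_b)

print("test-retest correlation:", round(retest_r, 2))
print("correlation between the two ideology measures:", round(convergent_r, 2))
```

A correlation near 1 in the first comparison suggests consistent measurement, and a high correlation in the second suggests the two measures converge. Neither number says anything about face or content validity: years of education would correlate highly with itself on a retest and still be an invalid measure of income.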
Assignment for next class: Answer the following questions.
1. Concept: educational level for a state
Definition: the amount of formal education that a state's population
has completed
Operational Measure: per pupil spending on K-12 education according
to the latest edition of the Book of States
a. Evaluate the quality of the definition (clear? appropriate? defined
using simpler words? not circular?)
b. Is the measure reliable? Why or why not?
c. Is the measure valid? Why or why not?
2. Concept: citizen political knowledge
Definition: the degree to which citizens try to inform themselves about
public affairs and politics
Operational Measure: in a survey, ask the number of days that the individual
read the newspaper the preceding week
a. Evaluate the quality of the definition (clear? appropriate? defined
using simpler words? not circular?)
b. Is the measure reliable? Why or why not?
c. Is the measure valid? Why or why not?
3. Concept: professionalism in state government
Definition: whether or not a state hires people who are professionally
trained in their area of work
Operational Measure: Do a survey of political reporters for the highest
circulation newspaper in all 50 state capitals and ask them if they regard
the bureaucracy in their home state as professional or not.
a. Evaluate the quality of the definition (clear? appropriate? defined
using simpler words? not circular?)
b. Is the measure reliable? Why or why not?
c. Is the measure valid? Why or why not?
4. Concept: racial prejudice among individuals
Definition: the degree to which a person feels that the racial group
in which she/he classifies her/himself is superior to other racial groups
Operational Measure: #1. In a survey, ask whether one favors or opposes
affirmative action programs. #2. Ask whether they feel prejudiced against
groups other than their own. Those who oppose affirmative action and say
they are prejudiced are the most prejudiced. Those who support affirmative
action programs and say they are not prejudiced are not prejudiced. Those
with other combinations of answers are classified as moderately prejudiced.
a. Evaluate the quality of the definition (clear? appropriate? defined
using simpler words? not circular?)
b. Is the measure reliable? Why or why not?
c. Is the measure valid? Why or why not?
5. Concept: freedom of the press for nations
Definition: the extent to which government lets media outlets operate
freely
Operational Measure: the number of privately-owned television stations
per capita, taken from the current edition of the World Handbook
of Political and Social Indicators.
a. Evaluate the quality of the definition (clear? appropriate? defined
using simpler words? not circular?)
b. Is the measure reliable? Why or why not?
c. Is the measure valid? Why or why not?
6. Concept: political participation
Definition: the extent to which an individual citizen actively involves
him or herself in elections.
Operational definition: ask in a survey "Did you vote in the last presidential
election?" Record the score as 0 if they did not vote and 1 if they did
vote.
a. Evaluate the quality of the definition (clear? appropriate? defined
using simpler words? not circular?)
b. Is the measure reliable? Why or why not?
c. Is the measure valid? Why or why not?
7. Concept: community needs
Definition: what is needed by a community.
Operational definition: ask the elected members of the community's major
decision-making body what they think the most important things are that should
be done in their community to improve it.
a. Evaluate the quality of the definition (clear? appropriate? defined
using simpler words? not circular?)
b. Is the measure reliable? Why or why not?
c. Is the measure valid? Why or why not?
8. Find a political science journal article and locate TWO concepts. Use articles
you have not used before. Describe how the researcher defined them. Where
did she/he get the definition? Evaluate the definitions. Then describe
the operational definition of the concept. Evaluate it in terms of reliability
and validity.