Operationalization, Units of Analysis,
and Levels of Measurement
We have already introduced the concept of operationalization and talked
about the reliability and validity of operationalized concepts, which
we call variables if their properties vary when the measurement
rules are applied to them. The material in this section is really a continuation
of the last section. We will talk about exactly what kinds of things we
apply the rules of operationalization to (cases or units of analysis),
where we might find these cases (surveys, etc.), and levels of measurement
(nominal, ordinal, interval, ratio, and dichotomous).
Cases--Units of Analysis and Population
A case, or unit of analysis, is the thing to which we apply the rules of measurement. In social science it is often a person, because social science studies people. However, it could be things other than people: cities, constitutions, groups, countries, laws, and so on. The text says it is the item for which we have data. Each thing that we perform measurements on is a unit of analysis. If you are not sure, just ask "what is the ______'s measurement," where measurement refers to what you are trying to measure, like party identification, or level of democracy, or freedom of the press, or whatever. Whatever you put in the blank is the unit of analysis. So if we are trying to measure the candidate's party identification, the unit of analysis is each candidate. Another term that you will sometimes see in computer science or business is "record." It took me a while to figure this out, but the terms record, unit of analysis, and case all refer to the same thing--they are interchangeable--simply jargon from different fields.
All of the units to which the measurement could be applied (and later we will see that these are all the units to which the hypothesis applies) are together called the population. It is the group of units in which we are ultimately interested. This term is not in the text, but it is an important term nevertheless.
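If it helps to see this concretely, here is a minimal sketch in Python of the candidate example. The class name, field names, and the three candidates are all made up for illustration; the only point is that each record is one case, and the full list of cases is the population.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One case (unit of analysis, record): a single candidate."""
    name: str
    party_id: str  # the variable we measure on each case


# Hypothetical population: every candidate the measurement could apply to.
population = [
    Candidate("Candidate A", "Democrat"),
    Candidate("Candidate B", "Republican"),
    Candidate("Candidate C", "Independent"),
]

# "What is the candidate's party identification?" -> the candidate is the unit.
for case in population:
    print(f"{case.name}: {case.party_id}")

n_cases = len(population)  # number of units in this made-up population
```

If the question were about countries instead ("what is the country's level of democracy?"), each record would be a country rather than a candidate.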
The text also talks about individual and aggregate data.
The text does a pretty good job here, so I will add little. You should
realize that all data are ultimately individual. This is because things
like literacy rates, average income, unemployment, or even percentage in
one party or another ultimately come from individuals. In effect, aggregate
data are often collective data for groups of individuals, like the percentage of
Baptists in each county in the state. This can get
a bit confusing at times, but just think of what is measured and how it
is put together and you should be OK.
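To see how aggregate data are put together from individual data, here is a small sketch in Python. The county names, the people, and the function name are invented for illustration; it simply collapses person-level records into one percentage per county.

```python
from collections import defaultdict

# Hypothetical individual-level data: one row per person (the unit is a person).
people = [
    {"county": "Adams", "religion": "Baptist"},
    {"county": "Adams", "religion": "Catholic"},
    {"county": "Adams", "religion": "Baptist"},
    {"county": "Adams", "religion": "None"},
    {"county": "Baker", "religion": "Baptist"},
    {"county": "Baker", "religion": "Methodist"},
    {"county": "Baker", "religion": "Catholic"},
    {"county": "Baker", "religion": "None"},
]


def percent_baptist_by_county(rows):
    """Collapse person-level rows into one aggregate value per county."""
    totals = defaultdict(int)
    baptists = defaultdict(int)
    for row in rows:
        totals[row["county"]] += 1
        if row["religion"] == "Baptist":
            baptists[row["county"]] += 1
    return {county: 100.0 * baptists[county] / totals[county] for county in totals}


# Aggregate data: the unit of analysis is now the county, not the person.
county_data = percent_baptist_by_county(people)
print(county_data)  # {'Adams': 50.0, 'Baker': 25.0}
```

Notice that the unit of analysis changes along the way: eight people go in, two counties come out.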
Sources for Data
Again, the text does a pretty good job here, so I will add little. If you use data that someone else collected, as we do in using the data sets provided by MicroCase, it is called secondary data. If you collect the data yourself, it is called primary data. In using secondary data, you cannot always get measurements of all the variables in which you are interested, so sometimes you have to make substitutions, such as substituting education for income, two things that you know should be at least moderately associated with each other. These substitutes may not be as valid as we would like. Whether you use surveys, experiments, direct observation, content analysis, or data from public records or other archives depends on the problem you are interested in studying. As a rule, you should use data that are closest to the units of analysis in your problem. So if you are interested in citizen knowledge and participation, your data should come from surveys of citizens. However, you could also compare groups of citizens using aggregate data to see if something like educational spending in a state is associated with higher rates of voting in that state.
Levels of Measurement
Here the text introduces four levels of measurement: nominal, ordinal, interval, and ratio. I have no complaints about the description of these levels, except that it left one out. So let me briefly review them and add one more, a kind of special case. Remember the major point here--the level of measurement is important because it dictates the kind of statistical analysis that we can do. So if you have a choice, higher is better. Let's start with the lowest and proceed to the highest and then go to the special case.
1. Nominal or categorical. These are just categories that fall in no particular order, e.g. religious affiliation. This level does not measure more or less of anything.
2. Ordinal. Here we are measuring more or less of something, but not in exact amounts, e.g. slightly, moderately, or strongly agreeing with some statement. You know that strongly agreeing is more agreement than moderate agreement, but you are not sure exactly how much more.
3. Interval. This level measures more or less in exact amounts but without an absolute zero. The text correctly points out that few, if any, examples exist of this in political science or social science for that matter. So we can just ignore it for all practical purposes.
4. Ratio. Same as interval, measured in exact amounts, but it makes sense to say that a unit has zero of whatever we are measuring. For example, 0 years of age is when one is born, and 0 education means that no years of school were completed. Likewise for 0 income, and so on. You can do a lot of math with this level of measurement. One caveat exists, however. Usually we do not measure things exactly. We do not measure exact age, as it changes every second. Instead, we group ratio measurements to the nearest whole unit, whether it be years of age or the nearest thousand dollars of income. So strictly speaking, most ratio measurements are really ordinal. However, as a rule of thumb, we can treat a measurement as ratio if it has in the neighborhood of 7 or more groups. If you group ratio data into really broad groups (usually fewer than 7), treat the result like ordinal data. So for example, if we took years of education and grouped them into less than high school, high school degree, some college, and college degree or more (4 groups), the result should be treated like ordinal data.
5. Dichotomous. If you have only two groups so that the data
are either one thing or not that thing (male and not male is the same thing
as male or female for this purpose), then the measurement is dichotomous. All yes/no
questions are dichotomous. What is neat about dichotomous measurements
is that we can treat them like ANY of the other levels statistically. You
will see this later when we start doing some statistics.
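Here is a rough sketch in Python of two points from the list above: the rule of thumb for grouped ratio data, and one reason a dichotomous variable can be handled numerically. The function name, the cutoff argument, and the sample values are my own illustration, not anything from the text, and the cutoff of 7 is only the rule of thumb given above, not a hard statistical rule.

```python
from statistics import mean


def treat_grouped_ratio_as(n_groups, threshold=7):
    """Rule of thumb: roughly 7 or more groups -> ratio, otherwise ordinal."""
    return "ratio" if n_groups >= threshold else "ordinal"


# Years of education collapsed into 4 broad groups -> treat as ordinal.
education_groups = [
    "less than high school",
    "high school degree",
    "some college",
    "college degree or more",
]
level_for_education = treat_grouped_ratio_as(len(education_groups))

# Age rounded to whole years; this made-up sample has 9 distinct values.
ages_in_years = [18, 19, 22, 25, 31, 40, 47, 52, 63]
level_for_age = treat_grouped_ratio_as(len(set(ages_in_years)))

# A dichotomous variable coded 1 = yes, 0 = no (hypothetical answers).
# Its mean is simply the proportion of "yes" answers.
voted = [1, 0, 1, 1]
share_voted = mean(voted)

print(level_for_education, level_for_age, share_voted)  # ordinal ratio 0.75
```

The last line is one illustration of the point in item 5: once a yes/no variable is coded 0 and 1, an average computed on it still has a sensible meaning.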
Assignment for next class:
Again go to some journal articles--anything in social science will do
this time. Find TWO articles (different ones than you have used before)
and answer the following:
1. What are the units of analysis?
2. What is the population?
3. Is it primary or secondary data? Is it individual or aggregate?
4. What is the source of the data?
5. What are the variables and what is the level of measurement of each?
Make sure that you find real research articles, not book reviews or something
else!