The problem of respondent attrition: survey methodology is key
Longitudinal surveys will suffer from attrition, and nothing will change that; however, years of lessons learned in the field show that straightforward survey methodology can minimize the impact of losing respondents.
The central problem of longitudinal surveys is attrition. The National Longitudinal Survey of Youth 1979 (NLSY79), which this issue of the Monthly Labor Review features, is the gold standard for sample retention against which longitudinal surveys are usually measured. However, we cannot understand how the NLSY79 has done so well without considering what was done differently in the other cohorts of the NLS and what we have learned from formal evaluations of attrition aversion measures that evolved over a quarter century of field work. The lessons here are hard-won and, to some, unconventional.
The NLS began in 1965 at the urging of an Assistant Secretary of Labor, Daniel Patrick Moynihan. He believed that although the Current Population Survey provided crucial snapshots of the Nation's labor force and labor market, the Nation needed a data source that was more dynamic and capable of tracking the long-run evolution of careers. The task of starting the study went to Howard Rosen at the Department of Labor, who enlisted Herb Parnes of Ohio State University to assemble a team, design the surveys, and analyze the data. This team comprised representatives from the Census Bureau, Ohio State University, and the Department of Labor.
The original plan was to follow the cohorts for 5 years to study some of the pressing questions of the time--the shrinking labor force participation rate of older men, the problem of youth unemployment and the transition from school to work, and the growing labor force participation of women whose children were entering school, leading to steady growth in the number of working mothers. Childcare was an important issue along with the problem of how the family would pay for a college education for the children of the baby boom.
Over time, the project has expanded. (Table 1 shows the various cohorts of the NLS, their start and stop dates, sizes, and age ranges covered.) Because the project began with a 5-year horizon, neither the Census Bureau, Ohio State, nor the Department of Labor had a plan for sample retention over the long run; after all, longitudinal surveys were still quite rare. The studies shortly proved their worth and the project became open-ended in terms of duration. However, the original limitations on intended duration led to some problems with attrition that conflicted with a revised plan to follow the respondents over the balance of their lives. In particular, the "following-rule" (1) that the Census Bureau used specified that when a respondent missed two consecutive interviews, the Census Bureau would drop that respondent from the study.
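The following-rule described above is simple enough to sketch in code. The snippet below is a minimal illustration, not the Census Bureau's actual procedure: the function names and the response history are hypothetical, and it assumes a respondent is dropped at the round of the second consecutive miss and is never contacted again.

```python
from typing import Optional, Sequence


def round_dropped(responded: Sequence[bool], max_misses: int = 2) -> Optional[int]:
    """Return the 0-based round at which the respondent is dropped, or None.

    Assumption: the drop happens at the round of the second consecutive miss.
    """
    consecutive = 0
    for round_index, did_respond in enumerate(responded):
        if did_respond:
            consecutive = 0  # any completed interview resets the count
        else:
            consecutive += 1
            if consecutive >= max_misses:
                return round_index
    return None


def interviews_obtained(responded: Sequence[bool], apply_rule: bool) -> int:
    """Count completed interviews, optionally stopping at the drop round."""
    if apply_rule:
        dropped_at = round_dropped(responded)
        if dropped_at is not None:
            responded = responded[:dropped_at]
    return sum(responded)


# Hypothetical respondent: misses at indices 1, 3 and 4, then would respond again.
history = [True, False, True, False, False, True, True]

print(round_dropped(history))                          # round index of the drop
print(interviews_obtained(history, apply_rule=True))   # interviews kept under the rule
print(interviews_obtained(history, apply_rule=False))  # interviews if still followed
```

In this made-up history the rule discards the two later interviews the respondent would have given, which is the kind of loss that becomes costly once a study is extended beyond its original horizon.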
The following-rule and the original 5-year horizon struck with the greatest force on the Young Men's cohort. In 1981, about two-thirds of the cohort responded to the survey. Some analysts believed that the rate of attrition reflected veterans' refusing to participate in a Government survey. Although the rate of attrition among black veterans was a few percentage points higher than that for nonveterans, for whites, the differential for veterans as a whole was essentially zero. Within the Young Men's cohort, blacks had the highest attrition rates. For whites, attrition in the Young Men's cohort was a bit higher than that for the two women's cohorts, but male respondents have always had higher attrition than females.
The two-and-out following-rule that the Census Bureau employed had serious ramifications, given the attrition pattern for the young men and the high attrition among blacks. By 1981, the Census Bureau had stopped tracking 11 percent of the young men because they had, at some point, missed two consecutive interviews. Blacks make up 28 percent of the young men's sample, but 57 percent of the cases dropped because of the following-rule were black. Our current rule-of-thumb is that in the next round, one can obtain an interview with about 25 percent of the respondents who have missed two interviews in a row. When interviewing began for the NLSY79, performance specifications did not allow respondents to be dropped simply based on consecutive missed interviews.
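The three percentages above imply drop rates by race that the text does not state directly. The sketch below works them out under one loudly stated assumption: that the 11 percent, the 28 percent, and the 57 percent all refer to the same base, the original Young Men's sample. The function name is hypothetical, and the results are back-of-the-envelope figures, not published estimates.

```python
def implied_drop_rates(share_dropped: float,
                       group_share_of_sample: float,
                       group_share_of_dropped: float) -> tuple[float, float]:
    """Return (drop rate within the group, drop rate among everyone else).

    Assumes all three shares are measured against the same original sample.
    """
    group_rate = share_dropped * group_share_of_dropped / group_share_of_sample
    other_rate = (share_dropped * (1 - group_share_of_dropped)
                  / (1 - group_share_of_sample))
    return group_rate, other_rate


# Figures quoted in the text: 11 percent dropped, blacks 28 percent of the
# sample, blacks 57 percent of the dropped cases.
black_rate, other_rate = implied_drop_rates(0.11, 0.28, 0.57)

print(f"implied drop rate, black respondents: {black_rate:.1%}")
print(f"implied drop rate, all others:        {other_rate:.1%}")
print(f"ratio: {black_rate / other_rate:.1f}")
```

Under that assumption, roughly 22 percent of black respondents were dropped by the rule, compared with about 7 percent of all other respondents.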
The original design for the surveys alternated in-person interviews with telephone and mail-out surveys, with the in-person version conducted every 5 years. (2) As a result, the content of the interview was more comprehensive every 5 years, with smaller updates in between. The NLS approach to the Mature Women's cohort is emblematic of the general approach of the survey. The 1967 interview of the women ages 30-44 focused on the longest held jobs between schooling and marriage, between marriage and the birth of her first child, and after the birth of the first child. The survey sought the most important (that is, longest held) job holdings, probed for significant periods not working, and ascertained why the woman did not work. The respondent answered CPS questions about the previous week; these questions accounted for a significant part of the interview.
This approach to collecting labor force behavior data left unanswered questions about work history, especially for women with frequent job transitions and women who missed the in-person interview. There were modules that collected retrospective data about fertility and marriage, but in the 1960s, marriages ending in divorce were less frequent, compared with current divorce rates. The NLS did not attempt to collect an event history on marriage, but nonetheless, the survey probably collected most of the transitions in marital status and cohabitation for the Mature Women and Older Men's cohorts.
The original cohort data collection effort frequently captured data on respondents' behavior by asking retrospective questions, sometimes at wide intervals, to capture particular data domains. For example, rather than collecting pregnancy roster data on the Young Women's cohort as those events occurred, the NLS would ask about many years' experience all at once. As Frank Mott documents, this strategy for data collection opens the way for more measurement error. (3)
With the strategy used then, missing one interview can leave an important part of the data record distressingly incomplete.
It is in this context that we start this article by focusing on the historical record of the completion rates for the various cohorts of the NLS and on how both the strategy for data collection and the rules for continuing to follow nonrespondents generate startling impacts on the completeness of the data coming out of a longitudinal study. This article continues by describing some of the fielding techniques the NLS program has employed to offset the secular trend toward lower completion rates.
The historical record
The remainder of this article describes the two original women's cohorts, the NLSY79, and the NLSY97. The two original men's cohorts were cancelled in the early 1980s. (4) In 1981, the Census Bureau completed interviews of 65 percent of the original respondents for the Young Men and 52.5 percent of the Mature Men. However, corrected for mortality, the numbers are higher: interviews were completed for 66.8 percent of the Young Men respondents still alive and 74.8 percent of the Mature Men still alive. After 15 years, the completion rate for the Mature Women was 69.7 percent (73.5 percent of those still alive), and for the Young Women it was 68.8 percent (69.4 percent of those still alive). As mentioned earlier, the lower completion rate for the Young Men reflects a following-rule that dropped blacks at an unusually high rate.
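The two kinds of completion rate quoted above differ only in their denominator: the raw rate divides completed interviews by the original sample, and the mortality-corrected rate divides by respondents still alive. Dividing one by the other therefore gives the implied share of each cohort still alive. The sketch below does that arithmetic with the figures from the text; the variable and function names are illustrative, and the derived shares are inferences from rounded percentages, not reported statistics.

```python
# (raw completion rate, completion rate among those still alive), in percent,
# as quoted in the text.
completion_rates = {
    "Young Men": (65.0, 66.8),
    "Mature Men": (52.5, 74.8),
    "Mature Women": (69.7, 73.5),
    "Young Women": (68.8, 69.4),
}


def implied_share_alive(raw_pct: float, alive_pct: float) -> float:
    """raw = completed / original and alive_pct = completed / alive,
    so raw / alive_pct = alive / original."""
    return raw_pct / alive_pct


implied = {
    cohort: implied_share_alive(raw, alive)
    for cohort, (raw, alive) in completion_rates.items()
}

for cohort, share in implied.items():
    print(f"{cohort:<13} implied share still alive: {share:.1%}")
```

The correction matters most for the Mature Men, where the implied share still alive is only about 70 percent, so most of the gap between 52.5 and 74.8 percent reflects mortality and not refusal or loss of contact.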