Transitioning to a Mixed-Mode Study Design in a National Household Panel Study: Effects on Fieldwork Outcomes, Sample Composition and Costs
The U.S. Panel Study of Income Dynamics (PSID) made a planned transition to a web-first mixed-mode data collection design in 2021 (web and computer-assisted telephone interviewing [CATI]), following nearly five decades of collecting data primarily using CATI with professional interviewers. To evaluate potential effects of mode on fieldwork outcomes, two sequential mixed-mode protocols were introduced using an experimental design. One protocol randomized sample families to a "web-first" treatment, which encouraged response through an online interview, followed by an offer of telephone to complete the interview; a second protocol randomized sample families to a "CATI-first" treatment until the last phase of fieldwork, when the option to complete a web interview was offered. This paper examines the comparative effects of the two protocols on fieldwork outcomes, including response rates, interviewer contact attempts, fieldwork duration, and cost. Comparisons are also made with fieldwork outcomes and characteristics of non-responding sample members from the prior wave, when a traditional telephone design was used. We found that, compared to the CATI-first design, the web-first design achieved comparably high response rates and faster interview completion with lower effort and cost. With some notable exceptions, compared to the prior wave, the mixed-mode design reduced effort and produced generally similar patterns of non-response among key respondent subgroups. The results provide new empirical evidence on the effects of mixing modes on fieldwork outcomes and costs and contribute to the small body of experimental evidence on the use of mixed-mode designs in household panel studies.
Impact of Mode Switching on Nonresponse and Bias in a Multimode Longitudinal Study of Young Adults
Young adults are generally hard to survey, presenting researchers with numerous difficulties. They are hard to locate and contact due to high mobility, and they are hard to persuade, exhibiting high levels of resistance to survey participation. As a result, they pose a greater challenge for longitudinal surveys. This paper explores the role of mode of data collection in young adults' decisions to stay in a longitudinal panel. We draw on data from the National Young Adult Health Survey (NYAHS), a longitudinal study (three annual waves and two brief between-wave follow-up surveys) of adults aged 18-34 initially recruited in 2019 through RDD sampling of cell phone numbers nationwide. All sampled cell phone numbers were randomly assigned to one of three experimental conditions; the conditions differed in the mode of data collection used in subsequent interviews once respondents were screened in. In the first condition, young adults completed all rounds of interviews by telephone ("telephone only" condition). In the second, they completed one round of interviews by web and the rest by telephone ("telephone mostly" condition). In the third, they were asked to complete three interviews online and two interviews by telephone ("web mostly" condition). We examined the impact of mode switching on young adults' likelihood of participating in later surveys and on nonresponse bias in key survey outcomes. We found that switching young adults from telephone to web had an immediate negative effect on their likelihood of participating in that web survey, but the negative effect did not persist. Switching them from web to telephone increased response rates and reduced nonresponse bias. The findings have important practical implications for how to survey young adults.
Assessing consent for and response to health survey components in an era of falling response rates: National Health and Nutrition Examination Survey, 2011-2018
Response rates for national population-based surveys have declined, including for the National Health and Nutrition Examination Survey (NHANES). Declining response to the initial NHANES interview may impact consent and participation in downstream survey components such as record linkage, physical exams, storage of biological samples, and phlebotomy. Interview response rates for adults aged 18 and older dropped from 68% in 2011-2012 to 53% in 2017-2018. Response was higher for children (1-17 years) but with a similar downward trend (2011-2012, 81%; 2017-2018, 65%). Despite declining interview response rates, changes in consent and response rates for downstream components over time have been mixed. Among those interviewed, the examination response rate was over 93%, consent for record linkage was over 90%, and consent for storage of specimens for future research was over 99%. The availability of a blood sample for storage ranged from 60% to 65% for children and from 78% to 85% for adults.
Effects of the COVID-19 crisis on survey fieldwork: Experience and lessons from two major supplements to the U.S. Panel Study of Income Dynamics
Two major supplements to the Panel Study of Income Dynamics (PSID) were in the field during the COVID-19 outbreak in the United States: the 2019 waves of the PSID Child Development Supplement (CDS-19) and the PSID Transition into Adulthood Supplement (TAS-19). Both CDS-19 and TAS-19 abruptly terminated all face-to-face fieldwork and, for TAS-19, shifted interviewers from working in a centralized call center to working from their homes. Overall, COVID-19 had a net negative effect on response rates in CDS-19 and ended the home visits that represented an important study component. For TAS-19, the overall effect of COVID-19 was uncertain in magnitude but negative. The costs of adapting to COVID-19 and of providing paid time-off benefits to staff affected by the pandemic were high. Longitudinal surveys, such as CDS, TAS, and PSID, that span the pandemic will provide valuable information on its life course and intergenerational consequences, making ongoing data collection of vital importance.
Does Benefit Framing Improve Record Linkage Consent Rates? A Survey Experiment
Survey researchers are increasingly seeking opportunities to link interview data with administrative records. However, obtaining consent from all survey respondents (or certain subgroups) remains a barrier to performing record linkage in many studies. We experimentally investigated whether emphasizing different benefits of record linkage to respondents in a telephone survey of employee working conditions improves respondents' willingness to consent to linkage of employment administrative records relative to a neutral consent request. We found that emphasizing linkage benefits related to "time savings" yielded a small, albeit statistically significant, improvement in the overall linkage consent rate (86.0 percent) relative to the neutral consent request (83.8 percent). The time savings argument was particularly effective among "busy" respondents. A second benefit argument related to "improved study value" did not yield a statistically significant improvement in the linkage consent rate (84.4 percent) relative to the neutral request. This benefit argument was also ineffective among the subgroup of respondents considered to be most likely to have a self-interest in the study outcomes. The article concludes with a brief discussion of the practical implications of these findings and offers suggestions for possible research extensions.
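Whether a difference in consent rates of this size is statistically significant depends on the group sizes, which the abstract does not report. The sketch below, assuming hypothetical groups of 1,000 respondents each, shows how such a comparison might be tested:

```python
# A minimal sketch of comparing two consent rates with a two-proportion z-test.
# The group sizes below are assumptions for illustration only; with groups this
# small, a 2.2-point difference would not reach significance, so the study's
# actual samples were presumably larger.
from statsmodels.stats.proportion import proportions_ztest

n_benefit, n_neutral = 1_000, 1_000           # hypothetical group sizes
consents = [round(0.860 * n_benefit),         # "time savings" framing: 86.0%
            round(0.838 * n_neutral)]         # neutral request: 83.8%
stat, pvalue = proportions_ztest(consents, [n_benefit, n_neutral])
print(f"z = {stat:.2f}, p = {pvalue:.3f}")
```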
Tree-based Machine Learning Methods for Survey Research
Predictive modeling methods from the field of machine learning have become a popular tool across various disciplines for exploring and analyzing diverse data. These methods often do not require specific prior knowledge about the functional form of the relationship under study and are able to adapt to complex non-linear and non-additive interrelations between the outcome and its predictors while focusing specifically on prediction performance. This modeling perspective is beginning to be adopted by survey researchers in order to adjust or improve various aspects of data collection and/or survey management. To facilitate this strand of research, this paper (1) provides an introduction to prominent tree-based machine learning methods, (2) reviews and discusses previous and prospective applications of tree-based supervised learning in survey research, and (3) exemplifies the usage of these techniques in the context of modeling and predicting nonresponse in panel surveys.
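To make the modeling setup concrete, here is a minimal sketch, using simulated data and hypothetical predictor names rather than any actual panel's data, of how a tree-based learner might be fit to predict nonresponse at the next wave:

```python
# A minimal sketch (not the authors' code) of fitting a tree-based learner to
# predict panel nonresponse. All data and predictor names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000
X = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "prior_refusals": rng.poisson(0.3, n),
    "contact_attempts": rng.poisson(4.0, n),
    "years_in_panel": rng.integers(1, 20, n),
})
# Hypothetical outcome: 1 = nonresponse at the next wave.
logit = (-2 + 0.8 * X["prior_refusals"] + 0.1 * X["contact_attempts"]
         - 0.02 * X["years_in_panel"])
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=300, min_samples_leaf=20,
                               random_state=0)
model.fit(X_train, y_train)
print(f"Holdout AUC: {roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]):.3f}")
# Importances suggest which fieldwork indicators drive predicted nonresponse risk.
print(dict(zip(X.columns, model.feature_importances_.round(3))))
```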
Reducing speeding in web surveys by providing immediate feedback
It is well known that some survey respondents reduce the effort they invest in answering questions by taking mental shortcuts - survey satisficing. This is a concern because such shortcuts can reduce the quality of responses and, potentially, the accuracy of survey estimates. This article explores "speeding," an extreme type of satisficing, which we define as answering so quickly that respondents could not have given much, if any, thought to their answers. To reduce speeding among online respondents we implemented an interactive prompting technique. When respondents answered faster than a minimal response time threshold, they received a message encouraging them to answer carefully and take their time. Across six web survey experiments, this prompting technique reduced speeding on subsequent questions compared to a no-prompt control. Prompting slowed response times whether the speeding that triggered the prompt occurred early or late in the questionnaire, in the first or later waves of a longitudinal survey, or among respondents recruited from non-probability or probability panels, and whether the prompt was delivered on only the first or on all speeding episodes. In addition to reducing speeding, the prompts increased response accuracy on simple arithmetic questions for a key subgroup. Prompting also reduced later straightlining in one experiment, suggesting the benefits may generalize to other types of mental shortcuts. Although the prompting could have annoyed respondents, it was not accompanied by a noticeable increase in breakoffs. As an alternative technique, respondents in one experiment were asked to explicitly commit to responding carefully. This global approach complemented the more local, interactive prompting technique on several measures. Taken together, these results suggest that interactive interventions of this sort may be useful for increasing respondents' conscientiousness in online questionnaires, even though these questionnaires are self-administered.
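The core of the intervention is a response-time threshold check. A minimal sketch of that logic follows; the per-word time floor and the prompt wording are illustrative assumptions, not the parameters used in the experiments:

```python
# A minimal sketch of the threshold logic behind interactive speeding prompts.
# The per-word time floor and the prompt wording are illustrative assumptions,
# not the values used in the experiments described above.
MIN_SECONDS_PER_WORD = 0.3  # hypothetical minimum plausible reading speed

def needs_speeding_prompt(response_seconds: float, question_words: int,
                          already_prompted: bool,
                          prompt_once: bool = False) -> bool:
    """Return True when an answer arrived faster than the response-time threshold."""
    if prompt_once and already_prompted:
        return False  # variant that prompts on only the first speeding episode
    return response_seconds < question_words * MIN_SECONDS_PER_WORD

# Example: a 20-word question answered in 2 seconds triggers the prompt.
if needs_speeding_prompt(2.0, 20, already_prompted=False):
    print("You seem to have responded very quickly. "
          "Please be sure you have given the question enough thought.")
```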
Survey Breakoffs in a Computer-Assisted Telephone Interview
Nearly 23% of all telephone interviews in the most recently completed wave of the Panel Study of Income Dynamics break off at least once, requiring multiple sessions to complete the interview. Given this high rate, a study was undertaken to better understand the causes and consequences of temporary breakoffs in a computer-assisted telephone interview setting. The majority of studies examining breakoffs have been conducted in the context of self-administered web surveys. The present study uses new paradata collected on telephone interview breakoffs to describe their prevalence, associated field effort, the instrument sections and questions on which they occur, their source - whether respondent-initiated, interviewer-initiated, or related to telephone problems - and associations with respondent and interviewer characteristics. The results provide information about the survey response process and suggest a set of recommendations for instrument design and interviewer training, as well as additional paradata that should be collected to provide more insight into the breakoff phenomenon.
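The descriptive analyses the study describes - prevalence by source and by instrument section - amount to simple tabulations of breakoff-event paradata. A minimal sketch with hypothetical field names and values:

```python
# A minimal sketch of tabulating temporary breakoffs from call-record paradata.
# Field names and values are hypothetical, not PSID's actual paradata schema.
import pandas as pd

breakoffs = pd.DataFrame({
    "interview_id": [101, 101, 102, 103, 104],
    "section": ["income", "assets", "income", "health", "income"],
    "source": ["respondent", "telephone_problem", "respondent",
               "interviewer", "respondent"],
})
# Share of breakoff events by source, and counts by instrument section.
print(breakoffs["source"].value_counts(normalize=True).round(2))
print(breakoffs.groupby("section").size().sort_values(ascending=False))
```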
The Impact of Survey and Response Modes on Current Smoking Prevalence Estimates Using TUS-CPS: 1992-2003
This study examined whether survey administration mode (telephone or in-person) and respondent type (self or proxy) result in discrepant estimates of current smoking prevalence in the adult U.S. population, while controlling for key sociodemographic characteristics and longitudinal changes in smoking prevalence over the 11-year period from 1992 to 2003. We used a multiple logistic regression analysis with replicate weights to model the current smoking status logit as a function of a number of covariates. The final model included individual- and family-level sociodemographic characteristics, survey attributes, and multiple two-way interactions of survey mode and respondent type with other covariates. The respondent type is a significant predictor of current smoking prevalence, and the magnitude of the difference depends on the age, sex, and education of the person whose smoking status is being reported. Furthermore, the survey mode has significant interactions with survey year, sex, and age. We conclude that using an overall unadjusted estimate of the current smoking prevalence may result in underestimating the current smoking rate when conducting proxy or telephone interviews, especially for some subpopulations, such as young adults. We propose that estimates could be improved if more detailed information regarding the respondent type and survey administration mode characteristics were considered in addition to commonly used survey year and sociodemographic characteristics. This information is critical given that future surveillance is moving toward more complex designs. Thus, adjustment of estimates should be contemplated when comparing current smoking prevalence results within a given survey series with major changes in methodology over time and between different surveys using various modes and respondent types.
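A minimal sketch of the kind of model described, fit on simulated data rather than TUS-CPS, with mode-by-covariate and respondent-type interactions; the study's replicate-weight variance estimation is omitted here:

```python
# A minimal sketch on simulated data (not TUS-CPS) of a logistic model for
# current smoking with mode and respondent-type two-way interactions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2_000
df = pd.DataFrame({
    "smoker": rng.integers(0, 2, n),
    "mode": rng.choice(["telephone", "in_person"], n),
    "respondent": rng.choice(["self", "proxy"], n),
    "age": rng.integers(18, 80, n),
    "female": rng.integers(0, 2, n),
})
model = smf.logit(
    "smoker ~ C(mode) * age + C(mode) * female + C(respondent) * age",
    data=df,
).fit(disp=0)
print(model.summary())
```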
Informed Consent for Web Paradata Use
Survey researchers are making increasing use of paradata - such as keystrokes, clicks, and timestamps - not only to evaluate and improve survey instruments but also to understand respondents and how they answer surveys. Since the introduction of paradata, researchers have been asking whether and how respondents should be informed about the capture and use of their paradata while completing a survey. In a series of three vignette-based experiments, we examine alternative ways of informing respondents about capture of paradata and seeking consent for their use. In all three experiments, any mention of paradata lowers stated willingness to participate in the hypothetical surveys. Even the condition where respondents were asked to consent to the use of paradata at the end of an actual survey resulted in a significant proportion declining. Our research shows that requiring such explicit consent may reduce survey participation without adequately informing survey respondents about what paradata are and why they are being used.
Assessing Quality of Answers to a Global Subjective Well-being Question Through Response Times
Many large-scale surveys measure subjective well-being (SWB) through a single survey item. This paper takes advantage of response-time data to explore the relationship between the time taken to answer a single SWB item and the reliability and validity of answers to this SWB item. We found that the reliability and validity of answers to the SWB item are low for fast respondents aged 70 and above and for slow respondents between the ages of 50 and 70. The findings indicate that longer time spent answering the single SWB item is associated with data of lower quality for respondents aged between 50 and 70, but with data of higher quality for respondents aged 70 and above. This paper speaks to the importance of capitalizing on response times, which are readily available from computerized interviews, to evaluate answers provided by respondents, and calls survey researchers' attention to differences in the time taken to answer a survey question across respondent subgroups.
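A minimal sketch of the kind of check described: split respondents into fast and slow answerers within age groups, then compare the SWB item's correlation with a criterion measure in each cell. Data and field names are simulated:

```python
# A minimal sketch on simulated data of assessing answer quality by response
# time: split respondents into fast and slow answerers within age groups, then
# compare the SWB item's correlation with a criterion measure in each cell.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 1_000
df = pd.DataFrame({
    "age": rng.integers(50, 90, n),
    "swb": rng.normal(7, 1.5, n),             # single SWB item
    "criterion": rng.normal(0, 1, n),         # e.g., a multi-item SWB scale
    "response_ms": rng.lognormal(8, 0.5, n),  # time spent on the SWB item
})
df["age_group"] = np.where(df["age"] >= 70, "70+", "50-69")
df["speed"] = df.groupby("age_group")["response_ms"].transform(
    lambda t: np.where(t < t.median(), "fast", "slow"))
# With real data, low correlations in a cell would flag lower-quality answers.
for (group, speed), cell in df.groupby(["age_group", "speed"]):
    print(group, speed, round(cell["swb"].corr(cell["criterion"]), 3))
```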
Age and Sex Effects in Anchoring Vignette Studies: Methodological and Empirical Contributions
Anchoring vignettes are an increasingly popular tool for identifying and correcting for group differences in the use of subjective ordered response categories. However, existing techniques to maximize response consistency (use of the same standards for self-ratings as for vignette-ratings), which center on matching vignette characters' demographic characteristics to respondents' own characteristics, appear at times to be ineffective or to pose interpretive difficulties. Specifically, respondents often appear to neglect instructions to treat vignette characters as age peers. Furthermore, when vignette characters' sex is matched to respondents' sex, interpretation of sex differences in rating style is rendered problematic. This study applies two experimental manipulations to a national American sample (n=1,765) to clarify best practices for enhancing response consistency. First, an analysis of two methods of highlighting vignette characters' age suggests that both yield better response consistency than previous, less prominent means. Second, a comparison of ratings of same- and opposite-sex vignette characters suggests that, with avoidable exceptions, the sex of the respondent rather than of the vignette character drives observed sex differences in rating style. Implications for the interpretation and design of anchoring vignette studies are discussed. In addition, this study clarifies the importance of two additional measurement assumptions. It also presents empirical findings of significant sex, educational, and racial/ethnic differences in styles of rating health, and racial/ethnic differences in styles of rating political efficacy. These findings underscore the incomparability of unadjusted subjective self-ratings across demographic groups, and thus support the potential utility of the anchoring vignette method.
Evaluating a Modular Design Approach to Collecting Survey Data Using Text Messages
This article presents analyses of data from a pilot study in Nepal that was designed to provide an initial examination of the errors and costs associated with an innovative methodology for survey data collection. We embedded a randomized experiment within a long-standing panel survey, collecting data on a small number of items with varying sensitivity from a probability sample of 450 young Nepalese adults. Survey items ranged from simple demographics to indicators of substance abuse and mental health problems. Sampled adults were randomly assigned to one of three different modes of data collection: 1) a standard one-time telephone interview, 2) a "single sitting" back-and-forth interview with an interviewer using text messaging, and 3) an interview using text messages within a modular design framework (which generally involves breaking the survey response task into distinct parts over a short period of time). Respondents in the modular group were asked to respond (via text message exchanges with an interviewer) to only one question on a given day, rather than complete the entire survey. Both bivariate and multivariate analyses demonstrate that the two text messaging modes increased the probability of disclosing sensitive information relative to the telephone mode, and that respondents in the modular design group, while responding less frequently, found the survey to be significantly easier. Further, those who responded in the modular group were not unique in terms of available covariates, suggesting that the reduced item response rates introduced only limited nonresponse bias. Future research should consider enhancing this methodology, applying it with other modes of data collection (e.g., web surveys), and continuously evaluating its effectiveness from a total survey error perspective.
Helping Respondents Provide Good Answers in Web Surveys
This paper reports on a series of experiments to explore ways to use the technology of Web surveys to help respondents provide well-formed answers to questions that may be difficult to answer. Specifically, we focus on the use of drop-down or select lists and JavaScript lookup tables as alternatives to open text fields for the collection of information on prescription drugs. The first two experiments were conducted among members of opt-in panels in the U.S. The third experiment was conducted in the 2013 Health and Retirement Study Internet Survey. Respondents in each of the studies were randomly assigned to one of three input methods: text field, drop box, or JavaScript lookup, and asked to provide the names of prescription drugs they were taking. We compare both the quality of answers obtained using the three methods, and the effort (time) taken to provide such answers. We examine differences in performance on the three input format types by key respondent demographics and Internet experience. We discuss some of the technical challenges of implementing complex question types and offer some recommendations for the use of such tools in Web surveys.
Comparing Paper and Tablet Modes of Retrospective Activity Space Data Collection
Individual actions are both constrained and facilitated by the social context in which individuals are embedded. But research to test specific hypotheses about the role of space in human behaviors and well-being is limited by the difficulty of collecting accurate and personally relevant social context data. We report on a project in Chitwan, Nepal, that directly addresses the challenges of collecting accurate activity space data. We test whether a computer-assisted interviewing (CAI) tablet-based approach to collecting activity space data was more accurate than a paper map-based approach; we also examine which subgroups of respondents provided more accurate data with the tablet mode compared to paper. Results show that the tablet approach yielded more accurate data when comparing respondent-indicated locations to known locations verified by on-the-ground staff. In addition, the accuracy of the data provided by older and less healthy respondents benefited more from the tablet mode.
Heaping at Round Numbers on Financial Questions: The Role of Satisficing
Survey responses to quantitative financial questions frequently display strong patterns of heaping at round numbers. This paper uses two studies to examine variation in rounding across questions and by individual characteristics. Rounding was more common for respondents low in ability, for respondents low in motivation, and for more difficult questions, all consistent with theories of satisficing. Questions that require more difficult information retrieval and integration of information exhibit more heaping. The use of records, which lowers task difficulty, reduces rounding as well. Higher episodic memory is associated with less rounding, and standard measures of motivation are negatively associated with rounding. These relationships, along with the fact that longer response latencies are associated with less rounding, all support the idea that rounding is a manifestation of satisficing on open-ended financial questions. Rounding patterns also appear remarkably similar across the two studies, despite being fielded in different modes and employing different question order and wording.
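Detecting heaping of this kind reduces to classifying each report by the largest round base that divides it. A minimal sketch, with illustrative cutoffs not taken from the studies above:

```python
# A minimal sketch of measuring heaping: classify each reported amount by the
# largest round base that evenly divides it. Cutoffs are illustrative only.
def rounding_level(amount: int) -> int:
    """Largest power of 10 (up to 10,000) dividing a nonzero report evenly."""
    for base in (10_000, 1_000, 100, 10):
        if amount != 0 and amount % base == 0:
            return base
    return 1  # an exact, non-heaped report

reports = [12_000, 4_350, 500, 873, 25_000, 60]
for r in reports:
    print(f"{r:>6}: rounded to nearest {rounding_level(r)}")
# Comparing the share of reports heaped at 100 or more across low- and
# high-motivation respondents would operationalize the satisficing account.
```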