JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS

Bayesian Multilevel Latent Class Models for the Multiple Imputation of Nested Categorical Data
Vidotto D, Vermunt JK and van Deun K
With this article, we propose using a Bayesian multilevel latent class (BMLC; or mixture) model for the multiple imputation of nested categorical data. Unlike recently developed methods that can capture only associations between pairs of variables, the multilevel mixture model we propose is flexible enough to automatically deal with complex interactions in the joint distribution of the variables to be estimated. After formally introducing the model and showing how it can be implemented, we carry out a simulation study and a real-data study to assess its performance and compare it with the commonly used listwise deletion and an available R routine. Results indicate that the BMLC model recovers unbiased parameter estimates of the analysis models considered in our studies and correctly reflects the uncertainty due to missing data, outperforming the competing methods.
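As a toy illustration of the imputation step such a mixture model supports (not the article's Bayesian multilevel estimator), the sketch below draws multiple imputations for one categorical response vector from an already-fitted single-level latent class model. The function name and the array layout of `pi` and `theta` are illustrative assumptions.

```python
import numpy as np

def impute_draws(y, pi, theta, m=5, rng=None):
    """Draw m imputations for one categorical response vector y
    (np.nan marks missing entries) from a fitted latent class model
    with class weights pi[k] and item-response probabilities
    theta[k, j, c]. A single-level toy sketch, not the article's
    Bayesian multilevel (BMLC) estimator."""
    if rng is None:
        rng = np.random.default_rng(0)
    K, J, C = theta.shape
    obs = ~np.isnan(y)
    # posterior class probabilities given the observed items
    logp = np.log(pi)
    for k in range(K):
        for j in np.where(obs)[0]:
            logp = logp.copy()
            logp[k] += np.log(theta[k, j, int(y[j])])
    post = np.exp(logp - logp.max())
    post /= post.sum()
    draws = []
    for _ in range(m):
        k = rng.choice(K, p=post)          # sample a class membership
        z = y.copy()
        for j in np.where(~obs)[0]:
            z[j] = rng.choice(C, p=theta[k, j])  # impute from that class
        draws.append(z)
    return draws
```

Each completed data set would then be analyzed separately and the results pooled, as in any multiple-imputation workflow.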
Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level
Savalei V and Rhemtulla M
In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately handle missing data at the item level. Item-level multiple imputation (MI), however, can handle such missing data straightforwardly. In this article, we develop an analytic approach for dealing with item-level missing data, that is, one that obtains a unique set of parameter estimates directly from the incomplete data set and does not require imputations. The proposed approach is a variant of the two-stage maximum likelihood (TSML) methodology, and it is the analytic equivalent of item-level MI. We compare the new TSML approach to three existing alternatives for handling item-level missing data: scale-level full information maximum likelihood, available-case maximum likelihood, and item-level MI. We find that the TSML approach is the best analytic approach, and its performance is similar to item-level MI. We recommend its implementation in popular software and its further study.
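A minimal sketch of the first stage of a two-stage approach like this one: EM estimation of the saturated mean and covariance from item-level incomplete data, assuming multivariate normality. The function name and test setup are illustrative; composite-model parameters then follow by linearity (a scale score a'y has mean a'mu and variance a'Sa under the stage-1 estimates).

```python
import numpy as np

def em_mvn(X, n_iter=50):
    """EM estimates of the saturated mean vector and covariance matrix
    of a multivariate normal from data with np.nan missing entries
    (stage 1 of a two-stage approach; a toy sketch)."""
    n, p = X.shape
    mu = np.nanmean(X, axis=0)
    S = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        EX = np.zeros((n, p))
        ES = np.zeros((p, p))
        for i in range(n):
            o = ~np.isnan(X[i])
            m = ~o
            xi = X[i].copy()
            Ci = np.zeros((p, p))
            if m.any():
                # conditional mean and covariance of missing given observed
                Soo = S[np.ix_(o, o)]
                Smo = S[np.ix_(m, o)]
                B = Smo @ np.linalg.inv(Soo)
                xi[m] = mu[m] + B @ (X[i, o] - mu[o])
                Ci[np.ix_(m, m)] = S[np.ix_(m, m)] - B @ Smo.T
            EX[i] = xi
            ES += np.outer(xi, xi) + Ci
        # M-step: update the saturated parameters
        mu = EX.mean(axis=0)
        S = ES / n - np.outer(mu, mu)
    return mu, S
```

Stage 2 would fit the composite-level model to these saturated estimates, with standard errors corrected for the two-stage estimation.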
Solutions for Determining the Significance Region Using the Johnson-Neyman Type Procedure in Generalized Linear (Mixed) Models
Lazar AA and Zerbe GO
Researchers often compare the relationship between an outcome and covariate for two or more groups by evaluating whether the fitted regression curves differ significantly. When they do, researchers need to determine the "significance region," or the values of the covariate where the curves significantly differ. In analysis of covariance (ANCOVA), the Johnson-Neyman procedure can be used to determine the significance region; for the hierarchical linear model (HLM), the Miyazaki and Maier (M-M) procedure has been suggested. However, neither procedure can accommodate nonnormally distributed data. Furthermore, the M-M procedure produces downwardly biased results because it uses the Wald test, does not control the Type I error rate inflated by multiple testing, and requires implementing multiple software packages to determine the significance region. In this article, we address these limitations by proposing solutions for determining the significance region that are suitable for generalized linear (mixed) models (GLMs or GLMMs). These proposed solutions incorporate test statistics that resolve the biased results, control the Type I error rate using Scheffé's method, and use a single statistical software package to determine the significance region.
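In the classical ANCOVA setting, the Johnson-Neyman procedure reduces to solving a quadratic inequality: the between-group difference d(x) = Δa + Δb·x is significant wherever d(x)² exceeds t²·Var[d(x)], with Var[d(x)] = v_a + 2x·cov + x²·v_b. A minimal sketch under that setup (the function name and inputs are illustrative, and it assumes the quadratic's leading coefficient is nonzero):

```python
import numpy as np
from scipy import stats

def jn_region(da, db, var_a, var_b, cov_ab, df, alpha=0.05):
    """Johnson-Neyman boundaries for the group difference
    d(x) = da + db*x with Var[d(x)] = var_a + 2*x*cov_ab + x**2*var_b.
    Returns the roots of d(x)**2 = t**2 * Var[d(x)]; the difference is
    significant where the quadratic A*x**2 + B*x + C is positive.
    Assumes the leading coefficient A is nonzero."""
    t2 = stats.t.ppf(1 - alpha / 2, df) ** 2
    A = db ** 2 - t2 * var_b
    B = 2 * (da * db - t2 * cov_ab)
    C = da ** 2 - t2 * var_a
    disc = B ** 2 - 4 * A * C
    if disc < 0:
        return None  # no real boundary: significant everywhere or nowhere
    r1 = (-B - np.sqrt(disc)) / (2 * A)
    r2 = (-B + np.sqrt(disc)) / (2 * A)
    return (min(r1, r2), max(r1, r2))
```

When A > 0 the difference is significant outside the two boundaries; when A < 0 it is significant between them. The article's GLM/GLMM extensions replace this normal-theory quadratic with suitable test statistics and Scheffé-adjusted critical values.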
Sensitivity Analysis and Bounding of Causal Effects With Alternative Identifying Assumptions
Jo B and Vinokur AD
When identification of causal effects relies on untestable assumptions regarding nonidentified parameters, sensitivity of causal effect estimates is often questioned. For proper interpretation of causal effect estimates in this situation, deriving bounds on causal parameters or exploring the sensitivity of estimates to scientifically plausible alternative assumptions can be critical. In this paper, we propose a practical way of bounding and sensitivity analysis, where multiple identifying assumptions are combined to construct tighter common bounds. In particular, we focus on the use of competing identifying assumptions that impose different restrictions on the same non-identified parameter. Since these assumptions are connected through the same parameter, direct translation across them is possible. Based on this cross-translatability, various information in the data, carried by alternative assumptions, can be effectively combined to construct tighter bounds on causal effects. Flexibility of the suggested approach is demonstrated focusing on the estimation of the complier average causal effect (CACE) in a randomized job search intervention trial that suffers from noncompliance and subsequent missing outcomes.
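The bound-combining idea admits a compact illustration: if each identifying assumption implies its own interval for the same causal parameter, the common bound is the intersection of those intervals. A hedged sketch (the function name and the numeric bounds in the usage note are made up):

```python
def combine_bounds(bounds):
    """Intersect the [lo, hi] bounds implied by several identifying
    assumptions for the same causal parameter. Because the assumptions
    restrict a shared nonidentified parameter, they jointly tighten
    the common bound; an empty intersection signals incompatibility."""
    lo = max(b[0] for b in bounds)
    hi = min(b[1] for b in bounds)
    if lo > hi:
        raise ValueError("the combined assumptions are incompatible")
    return lo, hi
```

For example, `combine_bounds([(-0.2, 0.5), (0.0, 0.8)])` returns `(0.0, 0.5)`, a tighter bound than either assumption yields alone.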
Bias Mechanisms in Intention-to-Treat Analysis With Data Subject to Treatment Noncompliance and Missing Outcomes
Jo B
An analytical approach was employed to compare sensitivity of causal effect estimates with different assumptions on treatment noncompliance and nonresponse behaviors. The core of this approach is to fully clarify bias mechanisms of considered models and to connect these models based on common parameters. Focusing on intention-to-treat analysis, systematic model comparisons are performed on the basis of explicit bias mechanisms and connectivity between models. The method is applied to the Johns Hopkins school intervention trial, where assessment of the intention-to-treat effect on school children's mental health is likely to be affected by assumptions about intervention noncompliance and nonresponse at follow-up assessments. The example calls attention to the importance of focusing on each case in investigating relative sensitivity of causal effect estimates with different identifying assumptions, instead of pursuing a general conclusion that applies to every occasion.
Uncertainty in Rank Estimation: Implications for Value-Added Modeling Accountability Systems
Lockwood JR, Louis TA and McCaffrey DF
Accountability for public education often requires estimating and ranking the quality of individual teachers or schools on the basis of student test scores. Although the properties of estimators of teacher or school effects are well established, less is known about the properties of rank estimators. We investigate performance of rank (percentile) estimators in a basic, two-stage hierarchical model capturing the essential features of the more complicated models that are commonly used to estimate effects. We use simulation to study mean squared error (MSE) performance of percentile estimates and to find the operating characteristics of decision rules based on estimated percentiles. Each depends on the signal-to-noise ratio (the ratio of the teacher or school variance component to the variance of the direct, teacher- or school-specific estimator) and only moderately on the number of teachers or schools. Results show that even when using optimal procedures, MSE is large for the commonly encountered variance ratios, with an unrealistically large ratio required for ideal performance. Percentile-specific MSE results reveal interesting interactions between variance ratios and estimators, especially for extreme percentiles, which are of considerable practical import. These interactions are apparent in the performance of decision rules for the identification of extreme percentiles, underscoring the statistical and practical complexity of the multiple-goal inferences faced in value-added modeling. Our results highlight the need to assess whether even optimal percentile estimators perform sufficiently well to be used in evaluating teachers or schools.
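The simulation design described above can be sketched in a few lines: draw unit effects, add estimation noise at a chosen signal-to-noise (variance) ratio, and compare the percentiles implied by the noisy direct estimates with the true percentiles. (With equal sampling variances, ranking the direct estimates coincides with ranking the posterior means, since shrinkage is then a monotone map.) The function name and defaults are illustrative:

```python
import numpy as np

def percentile_mse(n_units=100, var_ratio=1.0, n_reps=200, seed=0):
    """Monte Carlo MSE of estimated percentiles in the two-stage model
    theta_j ~ N(0, tau2), y_j | theta_j ~ N(theta_j, sigma2),
    where var_ratio = tau2 / sigma2 (the signal-to-noise ratio)."""
    rng = np.random.default_rng(seed)
    tau, sigma = np.sqrt(var_ratio), 1.0
    total = 0.0
    for _ in range(n_reps):
        theta = rng.normal(0.0, tau, n_units)       # true unit effects
        y = theta + rng.normal(0.0, sigma, n_units)  # noisy direct estimates
        # percentile of each unit: (rank - 0.5) / n
        rank = lambda v: np.argsort(np.argsort(v)) + 1
        p_true = (rank(theta) - 0.5) / n_units
        p_est = (rank(y) - 0.5) / n_units
        total += np.mean((p_est - p_true) ** 2)
    return total / n_reps
```

Running this at several variance ratios reproduces the qualitative pattern the abstract reports: percentile MSE shrinks as the signal-to-noise ratio grows and remains substantial at the moderate ratios typical of value-added settings.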
Models for Value-Added Modeling of Teacher Effects
McCaffrey DF, Lockwood JR, Koretz D, Louis TA and Hamilton L
The use of complex value-added models that attempt to isolate the contributions of teachers or schools to student development is increasing. Several variations on these models are being applied in the research literature, and policy makers have expressed interest in using these models for evaluating teachers and schools. In this article, we present a general multivariate longitudinal mixed model that incorporates the complex grouping structures inherent to longitudinal student data linked to teachers. We summarize the principal existing modeling approaches, show how these approaches are special cases of the proposed model, and discuss possible extensions to model more complex data structures. We present simulation and analytical results that clarify the interplay between estimated teacher effects and repeated outcomes on students over time. We also explore the potential impact of model misspecifications, including missing student covariates and assumptions about the accumulation of teacher effects over time, on key inferences made from the models. We conclude that mixed models that account for student correlation over time are reasonably robust to such misspecifications when all the schools in the sample serve similar student populations. However, student characteristics are likely to confound estimated teacher effects when schools serve distinctly different populations.