I am accepting graduate students for Fall of 2022. Please contact me to learn about opportunities in my research group.
Growth curve modeling is commonly used in psychological, educational, and social science research. The mainstream estimators for growth curve modeling are based on normal theory, but real data are unlikely to be exactly normally distributed. To improve estimation and inference with non-normal data, various estimators have been proposed. Among these estimators, the asymptotically distribution free (ADF) estimator does not need to rely on any distribution assumption but it is not efficient with small and modest sample sizes. We propose a distributionally weighted least squares (DLS) estimator in the growth curve modeling framework. DLS combines normal theory based and ADF based generalized least squares estimation to balance the information from the data and the normality assumption. Computer simulation results suggest that model-implied covariance based DLS (DLS_M) generally provides more accurate and efficient estimates than the examined alternative methods regardless of the distribution. In addition, the relative biases of standard error estimates and the Type I error rates of the Satorra–Bentler test statistic (T_SB) in DLS_M were competitive with the classical methods including maximum likelihood and generalized least squares estimation. We illustrate how to implement DLS_M and select the optimal tuning parameter by a bootstrap procedure in a real data example.
Du, H., Bentler, P.M., & Rosseel, Y. (In press). Distributionally-weighted least squares in structural equation modeling. Structural Equation Modeling.
Stefany Mena was awarded the National Science Foundation Graduate Research Fellowship (NSF GRFP) in 2020. The NSF GRFP is a three-year fellowship awarded to doctoral students in STEM fields.
Missing data are exceedingly common across a variety of disciplines, such as the educational, social, and behavioral sciences. The missing not at random (MNAR) mechanism, in which missingness is related to unobserved data, is widespread in real data and has detrimental consequences. However, the existing MNAR-based methods have potential problems such as leaving the data incomplete and failing to accommodate incomplete covariates with interactions, non-linear terms, and random slopes. We propose a Bayesian latent variable imputation approach to impute missing data due to MNAR (and other missingness mechanisms) and estimate the model of substantive interest simultaneously. In addition, even when the incomplete covariates involve interactions, non-linear terms, and random slopes, the proposed method can handle missingness appropriately. Computer simulation results suggested that the proposed Bayesian latent variable selection model (BLVSM) was quite effective when the outcome and/or covariates were MNAR. Except when the sample size was small, estimates from the proposed BLVSM tracked closely with those from the complete data analysis. With a small sample size, when the outcome was less predictable from the covariates, the missingness proportions of the covariates and the outcome were larger, and the missingness selection processes of the covariates and the outcome were more MNAR and MAR, the performance of BLVSM was less satisfactory. When the sample size was large, BLVSM always performed well. In contrast, the method with an MAR assumption provided biased estimates and confidence intervals with undercoverage when the missingness was MNAR. The robustness and the implementation of BLVSM in real data were also illustrated. The proposed method is available in the Blimp software application, and the paper includes a data analysis example illustrating its use.
Du, H., Enders, C. K., Keller, B. T., Bradbury, T., & Karney, B. (In press). A Bayesian latent variable selection model for nonignorable missingness. Multivariate Behavioral Research.
In real data analysis with structural equation modeling, data are unlikely to be exactly normally distributed. If we ignore the non-normality, the parameter estimates, standard error estimates, and model fit statistics from normal theory based methods such as maximum likelihood (ML) and normal theory based generalized least squares estimation (GLS) are unreliable. On the other hand, the asymptotically distribution free (ADF) estimator does not rely on any distribution assumption but cannot demonstrate its efficiency advantage with small and modest sample sizes. Methods that adopt misspecified loss functions, including ridge GLS (RGLS), can provide better estimates and inferences than the normal theory based methods and the ADF estimator in some cases. We propose a distributionally-weighted least squares (DLS) estimator, and expect that it can perform better than the existing generalized least squares estimators, because it combines normal theory based and ADF based generalized least squares estimation. Computer simulation results suggest that model-implied covariance based DLS (DLS_M) provided relatively accurate and efficient estimates in terms of RMSE. In addition, the empirical standard errors, the relative biases of standard error estimates, and the Type I error rates of the Jiang-Yuan rank adjusted model fit test statistic (T_JY) in DLS_M were competitive with the classical methods including ML, GLS, and RGLS. The performance of DLS_M depends on its tuning parameter a. We illustrate how to implement DLS_M and select the optimal a by a bootstrap procedure in a real data example.
Du, H., & Bentler, P.M. (In press). Distributionally-weighted least squares in structural equation modeling. Psychological Methods.
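The abstract above says that DLS combines normal theory based and ADF based generalized least squares estimation through a tuning parameter a. The Python sketch below illustrates one way such a combination could look. It rests on an assumption that the abstract does not state: that the two asymptotic covariance matrices of the sample covariances are mixed as (1 - a) * Gamma_N + a * Gamma_ADF and then inverted to give the weight matrix. All function names are hypothetical and this is not the authors' implementation. For DLS_M, the `sigma` argument would be the model-implied covariance matrix.

```python
import numpy as np


def vech_pairs(p):
    """Index pairs (i, j) of the lower triangle, ordered column by column."""
    return [(i, j) for j in range(p) for i in range(j, p)]


def gamma_adf(data):
    """Distribution-free estimate of the covariance of sqrt(n) * vech(S),
    built from fourth-order sample moments of the centered data."""
    x = np.asarray(data, dtype=float)
    z = x - x.mean(axis=0)
    pairs = vech_pairs(x.shape[1])
    # One column per unique product z_i * z_j.
    d = np.column_stack([z[:, i] * z[:, j] for i, j in pairs])
    dc = d - d.mean(axis=0)
    return dc.T @ dc / x.shape[0]


def gamma_normal(sigma):
    """Normal-theory counterpart: entry for ((i, j), (k, l)) is
    sigma_ik * sigma_jl + sigma_il * sigma_jk."""
    sigma = np.asarray(sigma, dtype=float)
    pairs = vech_pairs(sigma.shape[0])
    g = np.empty((len(pairs), len(pairs)))
    for r, (i, j) in enumerate(pairs):
        for c, (k, l) in enumerate(pairs):
            g[r, c] = sigma[i, k] * sigma[j, l] + sigma[i, l] * sigma[j, k]
    return g


def dls_weight(data, sigma, a):
    """Weight matrix under the ASSUMED mixing rule:
    inverse of (1 - a) * Gamma_N + a * Gamma_ADF, with 0 <= a <= 1."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    mixed = (1.0 - a) * gamma_normal(sigma) + a * gamma_adf(data)
    return np.linalg.inv(mixed)


def dls_discrepancy(s_vech, sigma_vech, weight):
    """Generalized least squares discrepancy (s - sigma)' W (s - sigma)."""
    r = np.asarray(s_vech, dtype=float) - np.asarray(sigma_vech, dtype=float)
    return float(r @ weight @ r)
```

Under this assumed form, a = 0 gives the normal-theory GLS weight and a = 1 gives the ADF weight, so intermediate values of a trade off the normality assumption against the information in the data.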