STATISTICA NEERLANDICA

Estimating random effects in a finite Markov chain with absorbing states: Application to cognitive data
Wang P, Abner EL, Liu C, Fardo DW, Schmitt FA, Jicha GA, Van Eldik LJ and Kryscio RJ
Finite Markov chains with absorbing states are popular tools for analyzing longitudinal data with categorical responses. The one-step transition probabilities can be defined in terms of fixed and random effects, but these effects are difficult to estimate because of the many unknown parameters. In this article we propose a three-step estimation method. In the first step, the fixed effects are estimated using a marginal likelihood function. In the second step, the random effects are estimated after substituting the estimated fixed effects into a joint likelihood function defined as an h-likelihood. In the third step, the covariance matrix for the vector of random effects is estimated using the Hessian matrix of this likelihood function. An analysis of longitudinal cognitive data illustrates the method.
A phenomenological model for COVID-19 data taking into account neighboring-provinces effect and random noise
Calatayud J, Jornet M and Mateu J
We model the incidence of the COVID-19 disease during the first wave of the epidemic in Castilla-Leon (Spain). Within-province dynamics may be governed by a generalized logistic map, but this lacks spatial structure. To couple the provinces, we relate the daily new infections through a density-independent parameter that entails positive spatial correlation. Pointwise values of the input parameters are fitted by an optimization procedure. To accommodate the significant variability in the daily data, with abruptly increasing and decreasing magnitudes, random noise is incorporated into the model, and its parameters are calibrated by maximum likelihood estimation. The calculated paths of the stochastic response and the probabilistic regions are in good agreement with the data.
Rank correlation inferences for clustered data with small sample size
Hunsberger S, Long L, Reese SE, Hong GH, Myles IA, Zerbe CS, Chetchotisakd P and Shih JH
This paper develops methods to test for associations between two variables with clustered data using a U-statistic approach with a second-order approximation to the variance of the parameter estimate for the test statistic. The tests presented are clustered versions of Pearson's test, the Spearman rank correlation, and Kendall's τ for continuous or ordinal data, together with alternative measures of Kendall's τ that allow for ties in the data. Shih and Fay use the U-statistic approach but consider only a first-order approximation, which has an inflated significance level in scenarios with small sample sizes. We derive the test statistics using the second-order approximations, aiming to improve the type I error rates. The method applies to data where clusters have the same number of measurements for each variable, or where one of the variables may be measured once per cluster while the other may be measured multiple times. We evaluate the performance of the test statistics through simulation with small sample sizes. The methods are all available in the R package cluscor.
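As background for the U-statistic formulation mentioned above, the following is a minimal sketch of Kendall's rank correlation written as an average of a pairwise kernel. It deliberately ignores clustering and ties and shows only the textbook tau-a for independent observations; it is not the clustered test from the paper, and the function names are illustrative assumptions.

```python
from itertools import combinations


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau_a(x, y):
    """Kendall's tau-a as a U-statistic of order 2.

    The kernel is sign(x_i - x_j) * sign(y_i - y_j), averaged over all
    unordered pairs. Assumes independent observations (no clustering).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    pairs = list(combinations(range(len(x)), 2))
    total = sum(sign(x[i] - x[j]) * sign(y[i] - y[j]) for i, j in pairs)
    return total / len(pairs)
```

A perfectly concordant sample gives 1 and a perfectly discordant one gives -1. With clustered data the pairs are no longer independent, which is why the variance of such a statistic needs the kind of correction the abstract describes.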
Change-point analysis through integer-valued autoregressive process with application to some COVID-19 data
Chattopadhyay S, Maiti R, Das S and Biswas A
In this article, we consider the problem of change-point analysis for count time series data through an integer-valued autoregressive process of order 1 (INAR(1)) with time-varying covariates. We observe these features in many real-life scenarios, especially in COVID-19 data sets, where the number of active cases falls over time and then increases again. To capture those features, we use a Poisson INAR(1) process with a time-varying smoothing covariate. With such a model, we can describe both components of the active cases at time-point t, namely (i) the number of nonrecovery cases from the previous time-point and (ii) the number of new cases at time-point t. We study some theoretical properties of the proposed model, along with forecasting. Simulation studies are performed to assess the effectiveness of the proposed method. Finally, we analyze two COVID-19 data sets and compare our proposed model with another PINAR(1) process, which has a time-varying covariate but no change-point, to demonstrate the overall performance of our proposed model.
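The two components described above can be illustrated with a minimal simulation sketch of a Poisson INAR(1) process built on binomial thinning. The survival probability `alpha`, the arrival-rate function `lam`, and the abrupt rate change at t = 50 are hypothetical choices made for illustration only; they are not the authors' covariate specification or change-point model.

```python
import math
import random


def poisson(rate, rng):
    """Draw a Poisson(rate) variate by Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def thin(count, alpha, rng):
    """Binomial thinning: each of `count` cases is retained with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, x0=0, seed=0):
    """Simulate X_t = alpha o X_{t-1} + eps_t with eps_t ~ Poisson(lam(t))."""
    rng = random.Random(seed)
    path = [x0]
    for t in range(1, n):
        carried_over = thin(path[-1], alpha, rng)  # (i) nonrecovered cases
        new_cases = poisson(lam(t), rng)           # (ii) new cases at time t
        path.append(carried_over + new_cases)
    return path


def lam(t):
    """Hypothetical arrival rate with an abrupt change at t = 50."""
    return 5.0 if t < 50 else 12.0


path = simulate_inar1(100, alpha=0.6, lam=lam)
```

Plotting `path` should show the level of the series shifting after the assumed change in the arrival rate, which is the kind of feature a change-point analysis aims to detect.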
Mixed-effects models for health care longitudinal data with an informative visiting process: A Monte Carlo simulation study
Gasparini A, Abrams KR, Barrett JK, Major RW, Sweeting MJ, Brunskill NJ and Crowther MJ
Electronic health records are being increasingly used in medical research to answer more relevant and detailed clinical questions; however, they pose new and significant methodological challenges. For instance, observation times are likely correlated with the underlying disease severity: patients with worse conditions utilise health care more and may have worse biomarker values recorded. Traditional methods for analysing longitudinal data assume independence between observation times and disease severity; yet, with health care data, such assumptions are unlikely to hold. Through Monte Carlo simulation, we compare different analytical approaches proposed to account for an informative visiting process to assess whether they lead to unbiased results. Furthermore, we formalise a joint model for the observation process and the longitudinal outcome within an extended joint modelling framework. We illustrate our results using data from a pragmatic trial on enhanced care for individuals with chronic kidney disease, and we introduce user-friendly software that can be used to fit the joint model for the observation process and a longitudinal outcome.
Bayesian estimation of explained variance in ANOVA designs
Marsman M, Waldorp L, Dablander F and Wagenmakers EJ
We propose to use the squared multiple correlation coefficient as an effect size measure for experimental analysis-of-variance designs and to use Bayesian methods to estimate its posterior distribution. We provide the expressions for the squared multiple, semipartial, and partial correlation coefficients corresponding to four commonly used analysis-of-variance designs and illustrate our contribution with two worked examples.
Application of one-step method to parameter estimation in ODE models
Dattner I and Gugushvili S
In this paper, we study the application of Le Cam's one-step method to parameter estimation in ordinary differential equation models. This computationally simple technique can serve as an alternative to numerical evaluation of the popular non-linear least squares estimator, which typically requires a multistep iterative algorithm and repeated numerical integration of the ordinary differential equation system. The one-step method starts from a preliminary √n-consistent estimator of the parameter of interest and then turns it into an asymptotic (as the sample size n→∞) equivalent of the least squares estimator through a numerically straightforward procedure. We demonstrate the performance of the one-step estimator via extensive simulations and real data examples. The method enables the researcher to obtain both point and interval estimates. The preliminary √n-consistent estimator that we use depends on non-parametric smoothing, and we provide a data-driven methodology for choosing its tuning parameter and support it by theory. An easy implementation scheme of the one-step method for practical use is pointed out.
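The idea of refining a preliminary estimate with a single update can be sketched as follows, under strong simplifying assumptions. The toy model x'(t) = -θx(t) with x(0) = 1 is a hypothetical example with the closed-form solution exp(-θt), so no numerical integration is needed. The preliminary value `theta0` is simply taken as given rather than obtained by smoothing, and the update shown is one Gauss-Newton step on the least squares criterion, which is one common form of a one-step estimator and not necessarily the authors' exact scheme.

```python
import math


def model(t, theta):
    """Closed-form solution of the toy ODE x' = -theta * x with x(0) = 1."""
    return math.exp(-theta * t)


def sensitivity(t, theta):
    """Derivative of the solution with respect to theta."""
    return -t * math.exp(-theta * t)


def one_step(times, obs, theta0):
    """Apply a single Gauss-Newton update to the preliminary estimate theta0."""
    num = 0.0
    den = 0.0
    for t, y in zip(times, obs):
        j = sensitivity(t, theta0)
        num += j * (y - model(t, theta0))  # sensitivity times residual
        den += j * j                       # Gauss-Newton curvature term
    return theta0 + num / den


# Noise-free illustration: data generated at theta = 1.5, preliminary guess 1.3.
times = [0.1 * i for i in range(51)]
true_theta = 1.5
obs = [model(t, true_theta) for t in times]
theta0 = 1.3
theta1 = one_step(times, obs, theta0)
```

In this noise-free setting the single update moves the estimate most of the way from the preliminary value toward the generating value, without iterating to convergence.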
Analytic posteriors for Pearson's correlation coefficient
Ly A, Marsman M and Wagenmakers EJ
Pearson's correlation is one of the most common measures of linear dependence. Recently, Bernardo (11th International Workshop on Objective Bayes Methodology, 2015) introduced a flexible class of priors to study this measure in a Bayesian setting. For this large class of priors, we show that the (marginal) posterior for Pearson's correlation coefficient and all of the posterior moments are analytic. Our results are available in the open-source software package JASP.
Non-parametric regression in clustered multistate current status data with informative cluster size
Lan L, Bandyopadhyay D and Datta S
Datasets examining periodontal disease (PD) record current (disease) status information of tooth-sites, whose stochastic behavior can be attributed to a multistate system with state occupation determined at a single inspection time. In addition, the tooth-sites remain clustered within a subject, and the number of available tooth-sites may be representative of the true PD status of that subject, leading to an 'informative cluster size' scenario. To provide insulation against incorrect model assumptions, we propose a nonparametric regression framework to estimate state occupation probabilities at a given time and state exit/entry distributions, utilizing weighted monotonic regression and smoothing techniques. We demonstrate the superior performance of our proposed weighted estimators over their unweighted counterparts via a simulation study, and illustrate the methodology using a dataset on periodontal disease.
Semiparametric regression models and sensitivity analysis of longitudinal data with nonrandom dropouts
Todem D, Kim K, Fine J and Peng L
We propose a family of regression models to adjust for nonrandom dropouts in the analysis of longitudinal outcomes with fully observed covariates. The approach conceptually focuses on generalized linear models with random effects. A novel formulation of a shared random effects model is presented and shown to provide a dropout selection parameter with a meaningful interpretation. The proposed semiparametric and parametric models are made part of a sensitivity analysis to delineate the range of inferences consistent with observed data. Concerns about model identifiability are addressed by fixing some model parameters to construct functional estimators that are used as the basis of a global sensitivity test for parameter contrasts. Our simulation studies demonstrate a large reduction of bias for the semiparametric model relative to the parametric model at times where the dropout rate is high or the dropout model is misspecified. The methodology's practical utility is illustrated in a data analysis.
Estimation of a k-monotone density: characterizations, consistency and minimax lower bounds
Balabdaoui F and Wellner JA
The classes of monotone or convex (and necessarily monotone) densities on ℝ₊ can be viewed as special cases of the classes of k-monotone densities on ℝ₊. These classes bridge the gap between the classes of monotone (1-monotone) and convex decreasing (2-monotone) densities, for which asymptotic results are known, and the class of completely monotone (∞-monotone) densities on ℝ₊. In this paper we consider non-parametric maximum likelihood and least squares estimators of a k-monotone density g₀. We prove existence of the estimators and give characterizations. We also establish consistency properties, and show that the estimators are splines of degree k - 1 with simple knots. We further provide asymptotic minimax risk lower bounds for estimating the derivatives [Formula: see text], at a fixed point x₀ under the assumption that [Formula: see text].
M-estimation in the presence of unequal scale
Cressie N