Correlated GMM logistic regression models with time-dependent covariates and valid estimating equations

Description
When analyzing longitudinal data it is essential to account both for the correlation inherent from the repeated measures of the responses as well as the correlation realized on account of the feedback created between the responses at a particular time

When analyzing longitudinal data it is essential to account both for the correlation inherent from the repeated measures of the responses as well as the correlation realized on account of the feedback created between the responses at a particular time and the predictors at other times. A generalized method of moments (GMM) for estimating the coefficients in longitudinal data is presented. The appropriate and valid estimating equations associated with the time-dependent covariates are identified, thus providing substantial gains in efficiency over generalized estimating equations (GEE) with the independent working correlation. Identifying the estimating equations for computation is of utmost importance. This paper provides a technique for identifying the relevant estimating equations through a general method of moments. I develop an approach that makes use of all the valid estimating equations necessary with each time-dependent and time-independent covariate. Moreover, my approach does not assume that feedback is always present over time, or present at the same degree. I fit the GMM correlated logistic regression model in SAS with PROC IML. I examine two datasets for illustrative purposes. I look at rehospitalization in a Medicare database. I revisit data regarding the relationship between the body mass index and future morbidity among children in the Philippines. These datasets allow us to compare my results with some earlier methods of analyses.
Date Created
2012
Agent

Multivariate generalization of reduced major axis regression

150996-Thumbnail Image.png
Description
A least total area of triangle method was proposed by Teissier (1948) for fitting a straight line to data from a pair of variables without treating either variable as the dependent variable while allowing each of the variables to have

A least total area of triangle method was proposed by Teissier (1948) for fitting a straight line to data from a pair of variables without treating either variable as the dependent variable while allowing each of the variables to have measurement errors. This method is commonly called Reduced Major Axis (RMA) regression and is often used instead of Ordinary Least Squares (OLS) regression. Results for confidence intervals, hypothesis testing and asymptotic distributions of coefficient estimates in the bivariate case are reviewed. A generalization of RMA to more than two variables for fitting a plane to data is obtained by minimizing the sum of a function of the volumes obtained by drawing, from each data point, lines parallel to each coordinate axis to the fitted plane (Draper and Yang 1997; Goodman and Tofallis 2003). Generalized RMA results for the multivariate case obtained by Draper and Yang (1997) are reviewed and some investigations of multivariate RMA are given. A linear model is proposed that does not specify a dependent variable and allows for errors in the measurement of each variable. Coefficients in the model are estimated by minimization of the function of the volumes previously mentioned. Methods for obtaining coefficient estimates are discussed and simulations are used to investigate the distribution of coefficient estimates. The effects of sample size, sampling error and correlation among variables on the estimates are studied. Bootstrap methods are used to obtain confidence intervals for model coefficients. Residual analysis is considered for assessing model assumptions. Outlier and influential case diagnostics are developed and a forward selection method is proposed for subset selection of model variables. A real data example is provided that uses the methods developed. Topics for further research are discussed.
Date Created
2012
Agent

A correlated random effects model for nonignorable missing data in value-added assessment of teacher effects

150494-Thumbnail Image.png
Description
Value-added models (VAMs) are used by many states to assess contributions of individual teachers and schools to students' academic growth. The generalized persistence VAM, one of the most flexible in the literature, estimates the ``value added'' by individual teachers to

Value-added models (VAMs) are used by many states to assess contributions of individual teachers and schools to students' academic growth. The generalized persistence VAM, one of the most flexible in the literature, estimates the ``value added'' by individual teachers to their students' current and future test scores by employing a mixed model with a longitudinal database of test scores. There is concern, however, that missing values that are common in the longitudinal student scores can bias value-added assessments, especially when the models serve as a basis for personnel decisions -- such as promoting or dismissing teachers -- as they are being used in some states. Certain types of missing data require that the VAM be modeled jointly with the missingness process in order to obtain unbiased parameter estimates. This dissertation studies two problems. First, the flexibility and multimembership random effects structure of the generalized persistence model lead to computational challenges that have limited the model's availability. To this point, no methods have been developed for scalable maximum likelihood estimation of the model. An EM algorithm to compute maximum likelihood estimates efficiently is developed, making use of the sparse structure of the random effects and error covariance matrices. The algorithm is implemented in the package GPvam in R statistical software. Illustrations of the gains in computational efficiency achieved by the estimation procedure are given. Furthermore, to address the presence of potentially nonignorable missing data, a flexible correlated random effects model is developed that extends the generalized persistence model to jointly model the test scores and the missingness process, allowing the process to depend on both students and teachers. The joint model gives the ability to test the sensitivity of the VAM to the presence of nonignorable missing data. Estimation of the model is challenging due to the non-hierarchical dependence structure and the resulting intractable high-dimensional integrals. Maximum likelihood estimation of the model is performed using an EM algorithm with fully exponential Laplace approximations for the E step. The methods are illustrated with data from university calculus classes and with data from standardized test scores from an urban school district.
Date Created
2012
Agent

Saddle squares in random two person zero sum games with finitely many strategies

149960-Thumbnail Image.png
Description
By the von Neumann min-max theorem, a two person zero sum game with finitely many pure strategies has a unique value for each player (summing to zero) and each player has a non-empty set of optimal mixed strategies. If

By the von Neumann min-max theorem, a two person zero sum game with finitely many pure strategies has a unique value for each player (summing to zero) and each player has a non-empty set of optimal mixed strategies. If the payoffs are independent, identically distributed (iid) uniform (0,1) random variables, then with probability one, both players have unique optimal mixed strategies utilizing the same number of pure strategies with positive probability (Jonasson 2004). The pure strategies with positive probability in the unique optimal mixed strategies are called saddle squares. In 1957, Goldman evaluated the probability of a saddle point (a 1 by 1 saddle square), which was rediscovered by many authors including Thorp (1979). Thorp gave two proofs of the probability of a saddle point, one using combinatorics and one using a beta integral. In 1965, Falk and Thrall investigated the integrals required for the probabilities of a 2 by 2 saddle square for 2 × n and m × 2 games with iid uniform (0,1) payoffs, but they were not able to evaluate the integrals. This dissertation generalizes Thorp's beta integral proof of Goldman's probability of a saddle point, establishing an integral formula for the probability that a m × n game with iid uniform (0,1) payoffs has a k by k saddle square (k ≤ m,n). Additionally, the probabilities of a 2 by 2 and a 3 by 3 saddle square for a 3 × 3 game with iid uniform(0,1) payoffs are found. For these, the 14 integrals observed by Falk and Thrall are dissected into 38 disjoint domains, and the integrals are evaluated using the basic properties of the dilogarithm function. The final results for the probabilities of a 2 by 2 and a 3 by 3 saddle square in a 3 × 3 game are linear combinations of 1, π2, and ln(2) with rational coefficients.
Date Created
2011
Agent