Questions about Missing Data

 

Questions about Full Information Maximum Likelihood (FIML)

What is Amos's Full Information Maximum Likelihood (FIML) method for missing data analysis?

Is it possible to obtain MLE using LISREL or EQS? Are there drawbacks?

If your data really is Missing Completely at Random, and you can assume multivariate normality, then are methods such as pairwise and listwise deletion acceptable?

 

Questions about why you can't get the usual fit indices with missing data

I am running a CFA model, but I fail to get fit indices in the text output. What am I doing wrong?

What should a user expect as far as output (test statistics such as GFI) with missing data?

I have inserted the CMIN value in the caption, but no value for df or p is provided and the output simply echoes the \df and \p legend. Is there some other command I need to provide?

I have had little success in obtaining any fit indicator, apart from the CMIN value for the FIML default analysis. What's wrong?

My reading shows some support for using RMSEA, but I could not get a value using \rmsea. Why?

Why can't I get modification indices with missing data?

 

Questions about non-normal data and multiple imputation approaches

Do you plan to implement any non-normal theory EM algorithms for such things as skewed variables or categorical variables?

Do you plan to implement a multiple imputation approach?

 

Questions about bugs, glitches, and Amos workarounds with missing data

Your description says what your missing data approach is not, but doesn't say what you do.

I am getting an error message when I run Amos with an SPSS file. What's wrong?

 

Questions about missing data analysis by the EM algorithm

What is the optimal replacement method for missing data? Is it Maximum Likelihood Estimation?

Are you doing a single imputation normal theory EM algorithm?

Amos will not provide goodness-of-fit indices unless there is no missing data in the input matrix, so we used the EMCOV program. Is this the most efficient way to handle this situation?

When we entered our raw data into EMCOV, we standardized it. Is standardization the recommended practice?

 

 

Questions about Full Information Maximum Likelihood (FIML)

 

Q. What is Amos's Full Information Maximum Likelihood (FIML) method for missing data analysis?

A. Unlike many other methods, Amos's full information maximum likelihood (FIML) estimation uses all information of the observed data. The likelihood is computed for the observed portion of each case's data and then accumulated and maximized. Amos's ML approach usually yields results equivalent to Don Rubin's EM approach, except that Amos also incorporates constrained moment matrix estimation. In addition, FIML requires no imputation (or E-step) and typically converges faster. Introductions to FIML estimation are given by Jim Arbuckle in the edited volume by Marcoulides and Schumacker (1996) and by Werner Wothke in the white paper on Longitudinal and multi-group modeling with missing data.
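The casewise accumulation described above can be sketched in a few lines of Python. This is a minimal illustration, not Amos's implementation: the function names (`cholesky`, `case_loglik`, `fiml_loglik`), the use of `None` to mark a missing value, and the example numbers are all assumptions made for the sketch. For a given mean vector and covariance matrix (in an SEM these would be the model-implied moments), each case contributes the multivariate normal log-density of only the variables it actually has.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def case_loglik(row, mu, sigma):
    """Normal log-density of one case, using only its observed (non-None) variables."""
    obs = [i for i, v in enumerate(row) if v is not None]
    if not obs:
        return 0.0  # a case with nothing observed carries no information
    dev = [row[i] - mu[i] for i in obs]
    sub = [[sigma[i][j] for j in obs] for i in obs]
    low = cholesky(sub)
    # Forward substitution: solve low * z = dev, so that z'z is the quadratic form.
    z = []
    for i in range(len(obs)):
        z.append((dev[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(obs)))
    quad = sum(v * v for v in z)
    return -0.5 * (len(obs) * math.log(2.0 * math.pi) + log_det + quad)

def fiml_loglik(data, mu, sigma):
    """Accumulate the casewise log-likelihoods over all cases."""
    return sum(case_loglik(row, mu, sigma) for row in data)

# Hypothetical example: two variables, None marks a missing value.
data = [[1.0, 2.0], [0.5, None], [None, 1.5]]
mu = [0.0, 1.0]
sigma = [[1.0, 0.5], [0.5, 2.0]]
print(fiml_loglik(data, mu, sigma))
```

An optimizer would then search over the model parameters that generate `mu` and `sigma` to maximize this sum; no missing value is ever filled in.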

 

Q. From reading Marcoulides and Schumacker (1996) Chapter 9, we found that it is possible to obtain MLE using LISREL or EQS. It looks (from a cursory review) like you can enter your raw data directly into LISREL, and use a specified procedure which will then replace your missing data using MLE. LISREL offers the listwise deletion method. And EQS also offers regression imputation methods. We would recommend that you check out the appropriate manuals for verification. You could also use various methods of missing data imputation and compare the resulting models. Are there any drawbacks to obtaining MLE in this manner?

A. The LISREL/EQS method referred to in Chapter 9 is Paul Allison's multigroup approach. Data are sorted into patterns of missingness, and each pattern is entered as a separate group in a multi-group model specification. You may enter (sorted) raw data or means and covariance matrices as input. This approach does not replace missing data, nor does it use listwise deletion.

If there are just a few patterns of missingness, Paul Allison's parameterization produces FIML estimates with most SEM programs that can handle multi-group analyses. However, when there are many missingness patterns, Paul's approach quickly becomes rather laborious, if not impossible to implement. One colleague we know wasted $10,000 in real grant money for mainframe CPU time with a 20-group matrix-sampling model and never even obtained any starting values!
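The bookkeeping behind this multigroup setup can be sketched as follows. The function name `split_by_pattern` and the use of `None` for a missing value are assumptions of this illustration; it only shows how cases fall into groups, not how any SEM program reads them.

```python
from collections import defaultdict

def split_by_pattern(data):
    """Group cases by which variables are observed (True) or missing (False)."""
    groups = defaultdict(list)
    for row in data:
        pattern = tuple(value is not None for value in row)
        groups[pattern].append(row)
    return dict(groups)

# Hypothetical data: 3 variables, None marks a missing value.
data = [
    [1.2, 3.4, 5.6],
    [0.7, None, 4.1],
    [1.9, 2.2, 6.3],
    [None, 2.8, 5.0],
    [1.1, None, 4.7],
]

groups = split_by_pattern(data)
for pattern, cases in sorted(groups.items(), reverse=True):
    print(pattern, len(cases))
```

Each key of `groups` would become one group in the multi-group specification. With p variables there can be up to 2^p - 1 patterns that contain at least one observed value, which is why the setup grows laborious so quickly.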

 

Q. If your data really is Missing Completely At Random (MCAR), and you can assume multivariate normality, then are methods such as pairwise and listwise deletion acceptable?

A. When the data are MCAR, then listwise or pairwise deletion methods are asymptotically equivalent to full information ML, but the standard errors of their parameter estimates can be considerably larger. In other words, FIML makes more efficient use of the data at hand. With pairwise and, even more so, with listwise deletion methods, considerably larger samples can be needed to achieve the same statistical power.
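One rough way to see the efficiency difference is to count how much of the observed data each method can use. The sketch below is an illustration only: the function name `observed_values_used` and the toy data are assumptions, and the count by itself says nothing about the size of the standard errors.

```python
def observed_values_used(data):
    """Return (values kept by listwise deletion, values available to FIML)."""
    n_vars = len(data[0])
    complete_cases = [row for row in data if all(v is not None for v in row)]
    listwise = len(complete_cases) * n_vars
    fiml = sum(1 for row in data for v in row if v is not None)
    return listwise, fiml

# Hypothetical data: 6 cases, 3 variables, None marks a missing value.
data = [
    [1.0, 2.0, 3.0],
    [1.5, None, 2.5],
    [0.8, 1.9, None],
    [1.2, 2.2, 3.1],
    [None, 2.4, 2.9],
    [1.1, 2.1, 3.3],
]

listwise, fiml = observed_values_used(data)
print(f"listwise deletion uses {listwise} values; FIML uses {fiml}")
```

In this toy data set, listwise deletion discards half of the cases even though only three individual values are missing.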

  

 

Questions about why you can't get the usual fit indices with missing data

 

Q. I am running a CFA model in Amos 3.6 which seems to run well, but I fail to get fit indices (any!) in the text output. My model seems similar to the Amos Users' Guide Example 8, except that I have 4 latent factors, where 2 latent factors are correlated. Two pairs of latent variables are assumed to correlate, with no higher-order correlation between the two pairs. What am I doing wrong?

A. Most likely you are analyzing a problem with missing data. Many fit indices are only defined for the complete data case, some are just not defined for means- and multi-group models, and others need the fit of the saturated model that Amos 3.6 does not compute automatically when there are missing data. However, you can use the AIC statistic that is printed out, and you can compute LR chi-square statistics from the CMIN values of hierarchically nested models. This latter procedure is demonstrated in worked Examples 17 and 18 of the Amos Users' Guide.

Amos's FIML analysis of incomplete data is leading-edge technology in statistical estimation, but we realize now that Amos users need more fit statistics in the incomplete-data case. The next major Amos release will be able to compute many more fit statistics than Version 3.6.

 

Q. What should a user expect as far as output (test statistics such as GFI) with missing data? I know it will not provide a chi square (from the manual), hence "missing" modification indices make sense (they are not produced; an error message says as much). I have also noticed that the output is rather sparse (no GFI, AGFI, etc.). It seems there may be a problem with the text macro in this case as I cannot get it to display the degrees of freedom \df.

A. Computing the chi square statistic requires fitting the saturated model. When some data values are missing, Amos 3.6 does not do this because it is time consuming. With complete data, it takes practically no time at all. Most other fit measures depend on the chi square statistic, and so they are not reported either. Also, as you noted, modification indices are not reported. Besides the various fit measures that depend on chi-square, and the modification indices, there is no other output that is unavailable with missing data.

As for the text macro for displaying degrees of freedom (\df), you are right that degrees of freedom is defined (and could be easily calculated), although it is not displayable with the text macro. It did not occur to us that a person might want to display the degrees of freedom when the chi square statistic is not available. However, now that you bring it up, it probably would be a good idea to let the user display it if he/she wants to.

 

Q. Using your example in the Amos Users' Guide page 112, I have managed to insert the CMIN value in the caption, but no value for df or p is provided. The output simply echoes the \df and \p legend. Is that because my analysis is Maximum likelihood, or is there some other command I need to provide beyond the $Smc and $Standardized?

A. It appears that you have missing data. In that case CMIN is a "Function of the log-likelihood," and not a chi-square statistic. If you have several nested models, you can construct LR chi-square statistics from the respective CMIN values. This is demonstrated to some degree in Examples 17 and 18 of the Amos Users' Guide. Jim Arbuckle's chapter in Marcoulides and Schumacker (1996) provides the statistical background of FIML estimation with incomplete data.

The "Function of the log-likelihood" does not come with a p-value. Hence, the \p text macro is deactivated for missing data. Also, when you have missing data, it is no longer guaranteed that all pairs of variables, i.e., all t [=p(p+3)/2] first and second order moments, were observed. In the case of complete data, DF = t - (no. free parameters). With incomplete data, the first and second order moments cease to be sufficient statistics, and may be incomplete, so Amos 3.6 takes the conservative option of not printing the DF at all.

However, you can put the number of free parameters onto the path diagram, using the \npar text macro. The DF for an LR chi-square test among nested models can be calculated as the difference between the corresponding \npar values.
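As a sketch of this hand calculation, the following Python snippet forms the LR chi-square between two nested models from their CMIN values and \npar counts. The numbers are those of the 1-factor and 2-factor models in the fit_miss.zip example discussed later in this FAQ; the function names and the closed-form chi-square tail probability are choices made for the illustration, not part of Amos.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum of h^i / i! for i < df/2.
        term = math.exp(-h)
        total = term
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return total
    # Odd df: start from erfc and add the half-integer gamma terms.
    total = math.erfc(math.sqrt(h))
    term = math.exp(-h) * math.sqrt(h) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= h / (i + 0.5)
    return total

def lr_test(cmin_restricted, npar_restricted, cmin_general, npar_general):
    """LR chi-square, df, and p-value for two nested models fit by FIML."""
    chi_square = cmin_restricted - cmin_general
    df = npar_general - npar_restricted
    return chi_square, df, chi2_sf(chi_square, df)

# 1-factor model (18 parameters) against 2-factor model (19 parameters).
chi_square, df, p = lr_test(1390.256, 18, 1375.133, 19)
print(f"LR chi-square = {chi_square:.3f}, df = {df}, p = {p:.4f}")
```

The more restricted model always has the larger CMIN and the smaller \npar, so both differences come out positive when the models are truly nested.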

 

Q. I have had little success in obtaining any fit indicator, apart from the CMIN value for the Maximum Likelihood default analysis. What's wrong?

A. Note that the CMIN value can be one of various things, depending on the estimation method used. For instance, it is SS_residual for ULS, Wishart chi-square for complete-data ML, and "Function of the log-likelihood" for FIML with incomplete data.

 

Q. My reading shows some support for using RMSEA, but I could not get a value using \rmsea. Is that because most of the fit indices are indicators of the fit between two models, rather than a fit between the model and the observed data (all my variables are Observed)? I did warn you that my understanding of SEM is very patchy!

A. RMSEA is not available because its calculation requires the DF, which Amos does not define with missing data. To compare several models, you might want to use the AIC statistic.

 

Q. Why can't I get modification indices with missing data?

A. Theoretically, one should be able to derive more exact MI's from the first and second derivatives of the FIML solution, but this is not yet implemented in Amos 3.6, nor in other SEM programs such as Mx. One reason this feature has not been added is that it is typically better to have the investigator tell the model what to do than vice versa.

 

 

Questions about non-normal data and multiple imputation approaches

 

Q. Do you have plans to implement any non-normal theory EM algorithms for such things as skewed variables or categorical variables? This would seem to be a very important selling point since existing implementations require something other than a generic PC.

A. With skewed and categorical data, multivariate ML solutions are really difficult to compute even with complete data (this is why no SEM program uses ML for categorical data). Variants of the EM algorithm may be useful as computational tricks for evaluating high-dimensional integration problems. So chances are that you will be seeing some type of EM algorithm in future releases of Amos, but it might first be restricted to "complete data" scenarios. Either way, SEM analyses of non-normal and categorical data will be compute intensive. You will need a fast machine and reasonably large datasets.

By the way, Don Rubin's EM approach is just one of several algorithms that can be used for obtaining ML estimates. The FIML algorithm used by Amos is also ML, but more closely related to that proposed by Hartley & Hocking (1971) than that by Dempster, Laird & Rubin (1977). FIML is harder to program, because it involves both 1st and 2nd-order derivatives, but it runs much faster and gives standard error estimates. The EM algorithm only uses 1st order derivatives and does not, per se, produce standard errors.

 

Q. Do you have plans to implement a multiple imputation approach?

A. While multiple imputations would be nice for estimating the missing data, they are not needed in Amos's FIML approach for model estimation and would actually slow down the calculations considerably. In our view, imputation of missing data should be performed after model estimation, because you gain efficiency for the imputations by using the implied moments of a parsimonious model instead of those of a saturated model.

 

 

Questions about bugs, glitches, and Amos workarounds with missing data

 

Q. I'm using AMOS 3.6 with an SPSS.sav file that has -1's coded as missing data. I did a simple regression model adding only a click of the Means button. Your description says what your missing data approach is not and, except for a reference to Rubin and Little, doesn't say what you do.

A. First, please note that the -1's should be recoded to SPSS's "system missing" code before calling Amos. If you leave the missing values at "-1" in an SPSS *.SAV file, there is some chance that Amos will interpret them as observed, not missing, values. Now to the second part of your question. Amos could improve upon its description of the missing value technique. That notwithstanding, the formulas and some simulation results are given in Arbuckle, J.L. (1996), "Full Information Estimation in the Presence of Incomplete Data," in Marcoulides and Schumacker (1996).

 

Q. When I run Amos with an SPSS file, the program displays the message:

Error: Could not read file.

The text output of the problem says:

The analysis will not continue because an error was discovered after reading line number 11 of the file ...

This was the last line read:

$Missing = 1,7E+308

A. This error occurs because your Windows system uses a comma as the decimal symbol. Amos 3.6 is not fully internationalized: its missing data parser requires decimal points, not decimal commas. To work around this problem, set up Windows to use a period as the decimal separator. For instance, you can change the Windows 95 default and country environment to "English (United States)." To make the change, follow this command path:

Start => Settings => Control Panel => Regional Settings

Then, click on the Regional Settings tab, pick English (United States) as the local setting, click <ok>, and restart the computer when prompted.

There are similar routines for changing the decimal symbol on Windows 3.1 and NT systems. Please check your system manual about specifics.

 

 

 

Questions about missing data analysis by the EM algorithm

 

Q. What is the optimal replacement method for missing data? I was taught that it is Maximum Likelihood Estimation.

A. Maximum likelihood estimation does not have to "replace" the missing values. For instance, Mx and Amos assume that the data are missing at random (MAR), and then compute the likelihood of the parameter values given the observed data of each case. The ML estimates are obtained from the point at which the likelihood has its maximum. A different, admittedly rather popular approach to ML with missing data is Don Rubin's EM algorithm. EM does impute the unobserved data.

Both maximum likelihood methods typically yield identical point estimates of the model parameters, except that computational differences are sometimes encountered. EM does not produce standard error estimates, at least not in the original formulation by Dempster, Laird and Rubin (1977).

 

Q. I ran the same data using Graham's EM program (freeware) that is a normal theory implementation of the EM algorithm. I get virtually the same results you obtain using a single imputation with Graham's program. Are you doing a single imputation normal theory EM algorithm?

A. No. Amos does not use imputations. Instead, the model is estimated by full information maximum likelihood (FIML) from the observed portion of the data.

 

Q. Though Amos performs its own missing data replacement when estimating values for path coefficients, etc., Amos will not provide goodness-of-fit indices (like CFI, NNFI etc. - though it does provide log likelihood functions) unless there is no missing data in the input matrix. We encountered this problem with our own raw data matrix. So we used a program called EMCOV to generate an imputed covariance matrix which we could enter into Amos and which was then used to calculate goodness-of-fit indices for our model. Is this the most efficient way to handle this issue?

A. The question touches upon several issues that we address in turn:

  1. Replacement of missing data: Amos does not "replace" the missing data [see question: What is Amos's FIML method for missing data analysis? for a more complete explanation]. The fit statistic that should be used for assessing model fit is the old-fashioned chi-square statistic. When missing data are encountered, Amos 3.6 prints out a value labeled "Function of log likelihood." In addition, the Amos Graphics text macro \CMIN and the column labeled CMIN in the table of summary fit statistics at the end of the text output also display "Function of log likelihood" values (and not fit chi-square statistics) when there are missing data.
  2. Fit chi-square statistics can easily be obtained from "Function of log likelihood" values by simple hand calculations. Examples 17 and 18 of the Amos Users' Guide give detailed how-to instructions.
  3. As a short example, consider the Amos models in fit_miss.zip. There are four nested models: baseline (independence and zero means), single-factor, two-factor (spatial and verbal), and saturated models. Differences in their "Function of log likelihood" values are likelihood ratio chi-square statistics. One possibility would be to subtract the "Function of log likelihood" value of the saturated model in order to arrive at fit chi-square statistics for each entertained model. This should be possible in the majority of cases where all models converge to an admissible and (hopefully) global solution. The df are computed as the difference in number of parameters between saturated and working models:

    Model       CMIN        No. parms   Chi-square   df   p
    Baseline    2251.604     6          888.02       21   0.000
    1 Factor    1390.256    18           26.67        9   <.005
    2 Factors   1375.133    19           11.55        8   >.05
    Saturated   1363.583    27            0.00        0   n/a
     

  4. Some secondary fit statistics (e.g., NFI, NFI2, TLI, PFI, PFI2, RNI) work by comparison between the fitted model and the null (or baseline) model. The null model is typically a diagonal matrix of estimated variances, which normally fits easily enough with MLE raw data methods in Amos and Mx. When computing fit indices, just remember to use chi-square statistics in the formulas, not the CMIN values.
  5. For example, NFI = (Fb-Ft)/Fb = (888.02 - 11.55)/888.02 = .987 for the 2-factor model, and NFI = 0.970 for the 1-factor model. Formulas for other fit statistics are listed in the Amos Users' Guide, Appendix C.

  6. An alternative to NFI, CFI, etc. indices is Akaike's Information Criterion (AIC), which Williams & Holahan's (1994) SEM paper showed to work rather well at picking the true model. The AIC statistic is provided automatically by Amos. In cases for which a usable solution for the saturated model cannot be obtained (which can happen, of course), the AIC statistics can still be used to evaluate the relative fit among several models.
  7. As Stan Mulaik and others have pointed out, use of AIC in a large sample size favors saturated models. While AIC worked quite well for the Williams and Holahan cases, the same may not be the case for other sample sizes or other models. The power of a particular design and sample size to estimate a parameter of interest seems critical here. What is too large a sample for some parsimony comparisons may be just right for others and vice versa. This subtle issue is a difficult one to convey and it is difficult to recommend strategies that are generally practical.

  8. With nested models, the LR chi-square statistic can be used even without a saturated model.
  9. It is usually possible to fit the saturated model, and obtain a chi-squared goodness of fit of the model, and then to obtain fit indices such as NFI etc., if they are of interest. There are just a few cases where the saturated model is not well behaved.
  10. Because the saturated model has many parameters, convergence to its ML solution may be slow. This is why Amos 3.6 does not automatically compute the chi-square statistic with missing data. We will attempt to make this a bit easier in future releases of Amos. We're open to suggestions from users on the best way to do this.

  11. As far as "completing" a covariance matrix first with EMCOV and then computing "complete-data" fit statistics from that matrix, this would be an ad-hoc method for obtaining some exploratory guidance but not a stringent approach for serious model testing. For instance, it is far from clear how expectations and confidence intervals of the CFI, NNFI etc. statistics might be affected by varying patterns and frequencies of missing data. Most of these statistics are only defined for the complete-data case; this is why Amos makes no effort to compute them with missing data.
  12. So, EMCOV can be used to replace missing raw data using its Maximum Likelihood Estimation procedure. It should be noted that SEM analyses usually underestimate standard errors, and therefore inflate critical ratios, when using an EMCOV-generated covariance matrix.

    This relationship between standard errors resulting from EMCOV-ML, versus those obtained by FIML, makes perfect sense. EMCOV estimates the means and covariance matrix of a saturated model, and substitutes the conditional expectations for missing data, given the observed data of each case. While these expectations are "best" estimates in a sense, the completed data matrix lacks some of the residual variation (or uncertainty) usually found in empirical data. In other words, EMCOV (like many other EM methods) shrinks the error variance, and subsequently the standard errors obtained by EMCOV-ML can be somewhat too small.

    In summary, matrices completed by EMCOV should only be used for exploratory purposes. An Amos analysis based on raw data (without imputations) is still necessary to obtain exact solutions with likelihood-based standard errors, critical ratios, and statistical tests of model fit.

  13. Two final comments about missing data estimation: First, based on our simulations (which admittedly are not extensive yet), the Amos standard errors are pretty good. However, be careful with non-normal data (see Peter Bentler's work). Second, even if the data are MCAR, using pairwise deletion procedures gives you no good way to estimate standard errors. Estimating standard errors becomes the bigger part of the problem. You might as well use EMCOV to get better estimates, and deal with getting standard errors by bootstrapping or multiple imputations.
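The hand calculations in points 3 to 5 can be reproduced with a short script. This is a sketch that uses only the CMIN values and parameter counts from the table above; the variable and function names are illustrative assumptions.

```python
# "Function of log likelihood" (CMIN) and number of parameters, from the table.
models = {
    "Baseline": (2251.604, 6),
    "1 Factor": (1390.256, 18),
    "2 Factors": (1375.133, 19),
    "Saturated": (1363.583, 27),
}

cmin_sat, npar_sat = models["Saturated"]

def fit_chi_square(name):
    """Chi-square and df of a model relative to the saturated model."""
    cmin, npar = models[name]
    return cmin - cmin_sat, npar_sat - npar

def nfi(name):
    """Normed fit index: (chi2_baseline - chi2_model) / chi2_baseline."""
    chi2_base, _ = fit_chi_square("Baseline")
    chi2_model, _ = fit_chi_square(name)
    return (chi2_base - chi2_model) / chi2_base

for name in models:
    chi2, df = fit_chi_square(name)
    print(f"{name:<10} chi-square = {chi2:7.2f}  df = {df:2d}")

print(f"NFI (1 factor)  = {nfi('1 Factor'):.3f}")
print(f"NFI (2 factors) = {nfi('2 Factors'):.3f}")
```

Note that the NFI is computed from the chi-square differences, not from the raw CMIN values, as point 4 warns.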

 

Q. When we entered our raw data into EMCOV, we standardized it to: 1) equate the variances of the variables and, 2) to reduce the iterations needed to attain convergence. What is the effect of this standardization? The problem was that we had Achievement scores (with ranges of 100 - 800) along with simple, dichotomous variables. The variance for the Achievement variable was over 150,000. Should we have simply scaled down the Achievement scores using a factor of 10 instead of standardizing? What is the recommended practice and why?

A. Our colleague John Graham has done a great job of fielding this question. The following is his response.

"It is a good idea to standardize your variables before running EMCOV (or any EM algorithm program). The convergence criterion is dependent upon the variance of the items. If you have variances larger than 1, the default EMCOV convergence criterion (.00001) will be conservative. However, if you have many variances less than 1, the criterion will be too liberal.

My solution is to standardize before running EMCOV, and then back-standardize to the original scale when you are finished. If you want to go one step further, you could also transform any variables that have skewed distributions, and then back transform after running EMCOV (this makes most sense if you are using the multiple imputation option).

By the way, Joe Schafer's Norm program does this standardizing and back standardizing automatically. His programs are currently available as a stand-alone PC program."
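The standardize-then-back-standardize step described in the response can be sketched as follows. This is not EMCOV or NORM code: the helper names are hypothetical, `None` marks a missing value, and the means and covariance matrix on the z-scale are assumed to come from whatever EM program was run on the standardized data.

```python
import statistics

def standardize(columns):
    """Z-score each column using its observed values; None stays None."""
    scaled, params = [], []
    for col in columns:
        observed = [v for v in col if v is not None]
        mean, sd = statistics.mean(observed), statistics.stdev(observed)
        params.append((mean, sd))
        scaled.append([None if v is None else (v - mean) / sd for v in col])
    return scaled, params

def back_standardize(z_means, z_cov, params):
    """Map means and covariances estimated on the z-scale back to raw units."""
    sds = [sd for _, sd in params]
    means = [m + sd * zm for (m, sd), zm in zip(params, z_means)]
    cov = [[z_cov[i][j] * sds[i] * sds[j] for j in range(len(sds))]
           for i in range(len(sds))]
    return means, cov

# Hypothetical columns: an achievement score and a dichotomous item.
achievement = [520.0, 610.0, None, 480.0, 700.0]
item = [0.0, 1.0, 1.0, None, 1.0]
scaled, params = standardize([achievement, item])

# Placeholder z-scale estimates standing in for the EM program's output.
z_means = [0.0, 0.0]
z_cov = [[1.0, 0.3], [0.3, 1.0]]
means, cov = back_standardize(z_means, z_cov, params)
print(means)
print(cov)
```

Because every variable has a variance near 1 on the z-scale, a single convergence criterion means roughly the same thing for the achievement score and for the dichotomous item.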

 
