Questions about Missing Data
Questions about Full Information Maximum Likelihood (FIML)
What is Amos's Full Information Maximum Likelihood (FIML) method for missing data analysis?
Is it possible to obtain MLE using LISREL or EQS? Are there drawbacks?
If your data really is Missing Completely at Random, and you can assume multivariate normality, then are methods such as pairwise and listwise deletion acceptable?
Questions about why you can't get the usual fit indices with missing data
I am running a CFA model, but I fail to get fit indices in the text output. What am I doing wrong?
What should a user expect as far as output (test statistics such as GFI) with missing data?
I have inserted the CMIN value in the caption, but no value for df or p is provided and the output simply echoes the \df and \p legend. Is there some other command I need to provide?
I have had little success in obtaining any fit indicator, apart from the CMIN value for the FIML default analysis. What's wrong?
My reading shows some support for using RMSEA, but I could not get a value using \rmsea. Why?
Why can't I get modification indices with missing data?
Questions about non-normal data and multiple imputation approaches
Do you plan to implement any non-normal theory EM algorithms for such things as skewed variables or categorical variables?
Do you plan to implement a multiple imputation approach?
Questions about bugs, glitches, and Amos workarounds with missing data
Your description says what your missing data approach is not, but doesn't say what you do.
I am getting an error message when I run Amos with an SPSS file. What's wrong?
Questions about missing data analysis by the EM algorithm
What is the optimal replacement method for missing data? Is it Maximum Likelihood Estimation?
Are you doing a single imputation normal theory EM algorithm?
Amos will not provide goodness-of-fit indices unless there is no missing data in the input matrix, so we used the EMCOV program. Is this the most efficient way to handle this situation?
When we entered our raw data into EMCOV, we standardized it. Is standardization the recommended practice?
Questions about Full Information Maximum Likelihood (FIML)
Q. What is Amos's Full Information Maximum Likelihood (FIML) method for missing data analysis?
A. Unlike many other methods, Amos's full information maximum likelihood (FIML) estimation uses all of the information in the observed data. The likelihood is computed for the observed portion of each case's data, then accumulated over cases and maximized. Amos's ML approach usually yields results equivalent to Don Rubin's EM approach, except that Amos also incorporates constrained moment matrix estimation. In addition, FIML requires no imputation (or E-step) and typically converges faster. Introductions to FIML estimation are given by Jim Arbuckle in the volume edited by Marcoulides and Schumacker (1996) and by Werner Wothke in the white paper on Longitudinal and multigroup modeling with missing data.
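To make the casewise idea concrete, here is a minimal Python sketch. It is not Amos's code: it assumes multivariate normal data, marks missing values with None, and uses illustrative function names (cholesky, case_loglik, fiml_loglik) that do not come from Amos. Each case contributes the normal log-density of its observed variables only, evaluated with the matching sub-vector of means and sub-matrix of covariances, and the contributions are summed. A FIML program would then maximize this sum over the model parameters that generate mu and sigma.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def case_loglik(row, mu, sigma):
    """Normal log-likelihood of one case, using only its observed variables."""
    obs = [i for i, v in enumerate(row) if v is not None]
    if not obs:
        return 0.0  # a case with nothing observed carries no information
    dev = [row[i] - mu[i] for i in obs]
    sub = [[sigma[i][j] for j in obs] for i in obs]
    l = cholesky(sub)
    # Forward substitution solves L z = dev, so z'z equals dev' inv(sub) dev.
    z = []
    for i in range(len(obs)):
        z.append((dev[i] - sum(l[i][k] * z[k] for k in range(i))) / l[i][i])
    logdet = 2.0 * sum(math.log(l[i][i]) for i in range(len(obs)))
    quad = sum(v * v for v in z)
    return -0.5 * (len(obs) * math.log(2.0 * math.pi) + logdet + quad)

def fiml_loglik(data, mu, sigma):
    """Accumulate the casewise log-likelihoods over all cases."""
    return sum(case_loglik(row, mu, sigma) for row in data)

# Hypothetical data: three cases on two variables, two of them incomplete.
data = [[1.0, 2.0], [0.5, None], [None, 1.5]]
mu = [0.0, 1.0]
sigma = [[1.0, 0.5], [0.5, 2.0]]
print(fiml_loglik(data, mu, sigma))
```

Note that no value is ever filled in for a missing entry; incomplete cases simply contribute a lower-dimensional density.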
Q. From reading Marcoulides and Schumacker (1996) Chapter 9, we found that it is possible to obtain MLE using LISREL or EQS. It looks (from a cursory review) like you can enter your raw data directly into LISREL, and use a specified procedure which will then replace your missing data using MLE. LISREL offers the listwise deletion method. And EQS also offers regression imputation methods. We would recommend that you check out the appropriate manuals for verification. You could also use various methods of missing data imputation and compare the resulting models. Are there any drawbacks to obtaining MLE in this manner?
A. The LISREL/EQS method referred to in Chapter 9 is Paul Allison's multigroup approach. Data are sorted into patterns of missingness, and each pattern is entered as a separate group in a multigroup model specification. You may enter (sorted) raw data or means and covariance matrices as input. This approach does not replace missing data, nor does it use listwise deletion.
If there are just a few patterns of missingness, Paul Allison's parameterization produces FIML estimates with most SEM programs that can handle multigroup analyses. However, when there are many missingness patterns, Paul's approach quickly becomes rather laborious, if not impossible, to implement. One colleague we know wasted $10,000 in real grant money for mainframe CPU time with a 20-group matrix-sampling model and never even obtained any starting values!
Q. If your data really is Missing Completely At Random (MCAR), and you can assume multivariate normality, then are methods such as pairwise and listwise deletion acceptable?
A. When the data are MCAR, then listwise or pairwise deletion methods are asymptotically equivalent to full information ML, but the standard errors of their parameter estimates can be considerably larger. In other words, FIML makes more efficient use of the data at hand. With pairwise and, even more so, with listwise deletion methods, considerably larger samples can be needed to achieve the same statistical power.
Questions about why you can't get the usual fit indices with missing data
Q. I am running a CFA model in Amos 3.6 which seems to run well, but I fail to get fit indices (any!) in the text output. My model seems similar to the Amos Users' Guide Example 8, except that I have 4 latent factors, where 2 latent factors are correlated. Two pairs of latent variables are assumed to correlate, with no higher-order correlation between the two pairs. What am I doing wrong?
A. Most likely you are analyzing a problem with missing data. Many fit indices are only defined for the complete data case, some are just not defined for means and multigroup models, and others need the fit of the saturated model that Amos 3.6 does not compute automatically when there are missing data. However, you can use the AIC statistic that is printed out, and you can compute LR chi-square statistics from the CMIN values of hierarchically nested models. This latter procedure is demonstrated in worked Examples 17 and 18 of the Amos Users' Guide.
Amos's FIML analysis of incomplete data is leading-edge technology in statistical estimation, but we realize now that Amos users need more fit statistics in the incomplete-data case. The next major Amos release will be able to compute many more fit statistics than Version 3.6.
Q. What should a user expect as far as output (test statistics such as GFI) with missing data? I know it will not provide a chi-square (from the manual), hence "missing" modification indices make sense (they are not produced; an error message says as much). I have also noticed that the output is rather sparse (no GFI, AGFI, etc.). It seems there may be a problem with the text macro in this case, as I cannot get it to display the degrees of freedom (\df).
A. Computing the chi-square statistic requires fitting the saturated model. When some data values are missing, Amos 3.6 does not do this because it is time consuming. With complete data, it takes practically no time at all. Most other fit measures depend on the chi-square statistic, and so they are not reported either. Also, as you noted, modification indices are not reported. Apart from the modification indices and the various fit measures that depend on chi-square, no other output is unavailable with missing data.
As for the text macro for displaying degrees of freedom (\df), you are right that degrees of freedom is defined (and could be easily calculated), although it is not displayable with the text macro. It did not occur to us that a person might want to display the degrees of freedom when the chi square statistic is not available. However, now that you bring it up, it probably would be a good idea to let the user display it if he/she wants to.
Q. Using your example in the Amos Users' Guide page 112, I have managed to insert the CMIN value in the caption, but no value for df or p is provided. The output simply echoes the \df and \p legend. Is that because my analysis is Maximum likelihood, or is there some other command I need to provide beyond the $Smc and $Standardized?
A. It appears that you have missing data. In that case CMIN is a "Function of the log-likelihood," and not a chi-square statistic. If you have several nested models, you can construct LR chi-square statistics from the respective CMIN values. This is demonstrated to some degree in Examples 17 and 18 of the Amos Users' Guide. Jim Arbuckle's chapter in Marcoulides and Schumacker (1996) provides the statistical background of FIML estimation with incomplete data.
The "Function of the log-likelihood" does not come with a p-value. Hence, the \p text macro is deactivated for missing data. Also, when you have missing data, it is no longer guaranteed that all pairs of variables, i.e., all t [= p(p+3)/2] first- and second-order moments, were observed. In the case of complete data, DF = t - (no. of free parameters). With incomplete data, the first- and second-order moments cease to be sufficient statistics, and may be incomplete, so Amos 3.6 takes the conservative option of not printing the DF at all.
However, you can put the number of free parameters onto the path diagram, using the \npar text macro. The DF for an LR chi-square test among nested models can be calculated as the difference between the corresponding \npar values.
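The arithmetic for such a nested-model test can be sketched in Python. This is an illustration only: lr_test and chi2_sf are hypothetical helper names, not Amos commands, and the p-value is computed here with a standard incomplete-gamma routine because Amos 3.6 does not report one with missing data. The example values are the CMIN values and parameter counts of the single-factor and two-factor models from the fit_miss.zip example discussed later in this FAQ.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of freedom."""
    if x <= 0.0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    scale = math.exp(-h + a * math.log(h) - math.lgamma(a))
    if h < a + 1.0:
        # Series expansion of the lower tail; return its complement.
        term = total = 1.0 / a
        n = a
        for _ in range(10000):
            n += 1.0
            term *= h / n
            total += term
            if term < total * 1e-16:
                break
        return 1.0 - scale * total
    # Continued fraction (modified Lentz method) for the upper tail.
    tiny = 1e-300
    b = h + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    f = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        f *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return scale * f

def lr_test(cmin_restricted, npar_restricted, cmin_general, npar_general):
    """LR chi-square, df, and p-value for two nested models."""
    chi2 = cmin_restricted - cmin_general   # difference of the CMIN values
    df = npar_general - npar_restricted     # difference of the \npar values
    return chi2, df, chi2_sf(chi2, df)

# Single-factor model (more restricted) against the two-factor model.
chi2, df, p = lr_test(1390.256, 18, 1375.133, 19)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.5f}")
```

The more restricted model always goes first, so both the chi-square value and the df come out non-negative.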
Q. I have had little success in obtaining any fit indicator, apart from the CMIN value for the Maximum Likelihood default analysis. What's wrong?
A. Note that the CMIN value can be one of various things, depending on the estimation method used. For instance, it is SS_residual for ULS, Wishart chi-square for complete-data ML, and "Function of the log-likelihood" for FIML with incomplete data.
Q. My reading shows some support for using RMSEA, but I could not get a value using \rmsea. Is that because most of the fit indices are indicators of the fit between two models, rather than a fit between the model and the observed data (all my variables are Observed)? I did warn you that my understanding of SEM is very patchy!
A. RMSEA is not available because its calculation requires the DF, which Amos does not define with missing data. To compare several models, you might want to use the AIC statistic.
Q. Why can't I get modification indices with missing data?
A. Theoretically, one should be able to derive more exact MI's from the first and second derivatives of the FIML solution, but this is not yet implemented in Amos 3.6, nor in other SEM programs such as Mx. One reason this feature has not been added is that it is typically better to have the investigator tell the model what to do than vice versa.
Questions about non-normal data and multiple imputation approaches
Q. Do you have plans to implement any non-normal theory EM algorithms for such things as skewed variables or categorical variables? This would seem to be a very important selling point since existing implementations require something other than a generic PC.
A. With skewed and categorical data, multivariate ML solutions are really difficult to compute even with complete data (this is why no SEM program uses ML for categorical data). Variants of the EM algorithm may be useful as computational tricks for evaluating high-dimensional integration problems. So chances are that you will be seeing some type of EM algorithm in future releases of Amos, but it might first be restricted to "complete data" scenarios. Either way, SEM analyses of non-normal and categorical data will be compute intensive. You will need a fast machine and reasonably large datasets.
By the way, Don Rubin's EM approach is just one of several algorithms that can be used for obtaining ML estimates. The FIML algorithm used by Amos is also ML, but more closely related to that proposed by Hartley & Hocking (1971) than that by Dempster, Laird & Rubin (1977). FIML is harder to program, because it involves both 1st- and 2nd-order derivatives, but it runs much faster and gives standard error estimates. The EM algorithm only uses 1st-order derivatives and does not, per se, produce standard errors.
Q. Do you have plans to implement a multiple imputation approach?
A. While multiple imputations would be nice for estimating the missing data, they are not needed in Amos's FIML approach for model estimation and would actually slow down the calculations considerably. In our view, imputation of missing data should be performed after model estimation, because the imputations gain efficiency when they use the implied moments of a parsimonious model instead of those of a saturated model.
Questions about bugs, glitches, and Amos workarounds with missing data
Q. I'm using AMOS 3.6 with an SPSS.sav file that has 1's coded as missing data. I did a simple regression model adding only a click of the Means button. Your description says what your missing data approach is not and, except for a reference to Rubin and Little, doesn't say what you do.
A. First, please note that the 1's should be recoded to SPSS's "system missing" code before calling Amos. If you leave the missing values at "1" in an SPSS *.SAV file, there is some chance that Amos will interpret them as observed, not missing values. Now to the second part of your question. Amos could improve upon its description of the missing value technique. That notwithstanding, the formulas and some simulation results are given in Arbuckle, J.L. (1996) Full Information Estimation in the Presence of Incomplete Data in Marcoulides and Schumacker (1996).
Q. When I run Amos with an SPSS file, the program displays the message:
Error: Could not read file.
The text output of the problem says:
The analysis will not continue because an error was discovered after reading line number 11 of the file ...
This was the last line read:
$Missing = 1,7E+308
A. This error message appears because your Windows system uses a comma as the decimal symbol. Amos 3.6 is not fully internationalized. Its missing data parser requires decimal points, not decimal commas. To work around this problem, set up Windows to use a period as the decimal separator. For instance, you can change the Windows 95 default and country environment to "English (United States)." To make the change, follow this command path:
Start => Settings => Control Panel => Regional Settings
Then, click on the Regional Settings tab, pick English (United States) as the local setting, click <ok>, and restart the computer when prompted.
There are similar routines for changing the decimal symbol on Windows 3.1 and NT systems. Please check your system manual for specifics.
Questions about missing data analysis by the EM algorithm
Q. What is the optimal replacement method for missing data? I was taught that it is Maximum Likelihood Estimation.
A. Maximum likelihood estimation does not have to "replace" the missing values. For instance, Mx and Amos assume that the data are missing at random (MAR), and then compute the likelihood of the parameter values given the observed data of each case. The ML estimates are the parameter values at which the likelihood has its maximum. A different, admittedly rather popular approach to ML with missing data is Don Rubin's EM algorithm. EM does impute the unobserved data.
Both maximum likelihood methods typically yield identical point estimates of the model parameters, except that computational differences are sometimes encountered. EM does not produce standard error estimates, at least not in the original formulation by Dempster, Laird and Rubin (1977).
Q. I ran the same data using Graham's EM program (freeware) that is a normal theory implementation of the EM algorithm. I get virtually the same results you obtain using a single imputation with Graham's program. Are you doing a single imputation normal theory EM algorithm?
A. No. Amos does not use imputations. Instead, the model is estimated by full information maximum likelihood (FIML) from the observed portion of the data.
Q. Though Amos performs its own missing data replacement when estimating values for path coefficients, etc., Amos will not provide goodness-of-fit indices (like CFI, NNFI, etc., though it does provide log likelihood functions) unless there is no missing data in the input matrix. We encountered this problem with our own raw data matrix. So we used a program called EMCOV to generate an imputed covariance matrix which we could enter into Amos and which was then used to calculate goodness-of-fit indices for our model. Is this the most efficient way to handle this issue?
A. The question touches upon several issues that we address in turn:
As a short example, consider the Amos models in fit_miss.zip. There are four nested models: baseline (independence and zero means), single-factor, two-factor (spatial and verbal), and saturated models. Differences in their "Function of log likelihood" values are likelihood ratio chi-square statistics. One possibility would be to subtract the "Function of log likelihood" value of the saturated model in order to arrive at fit chi-square statistics for each entertained model. This should be possible in the majority of cases where all models converge to an admissible and (hopefully) global solution. The df are computed as the difference in number of parameters between saturated and working models:
Model            CMIN        No. parms   Chi-square   df   p
Baseline         2251.604     6          888.02       21   0.000
Single-factor    1390.256    18           26.67        9   <.005
Two-factor       1375.133    19           11.55        8   >.05
Saturated        1363.583    27            0.00        0   n/a
For example, NFI = (Fb - Ft)/Fb = (888.02 - 11.55)/888.02 = .987 for the 2-factor model, and NFI = 0.970 for the 1-factor model. Formulas for other fit statistics are listed in the Amos Users' Guide, Appendix C.
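The chi-square, df, and NFI figures for this example can be recomputed from the CMIN values and parameter counts alone. The short Python sketch below does so; the function name fit_summary and the data layout are our own illustrative choices and are not part of Amos.

```python
# (model name, "Function of log likelihood" as reported in CMIN, no. of parameters)
MODELS = [
    ("baseline", 2251.604, 6),
    ("single-factor", 1390.256, 18),
    ("two-factor", 1375.133, 19),
    ("saturated", 1363.583, 27),
]

def fit_summary(models, baseline="baseline", saturated="saturated"):
    """Chi-square and df against the saturated model, NFI against the baseline."""
    cmin = {name: c for name, c, _ in models}
    npar = {name: k for name, _, k in models}
    chi2_baseline = cmin[baseline] - cmin[saturated]
    summary = {}
    for name, c, k in models:
        chi2 = c - cmin[saturated]
        df = npar[saturated] - k
        nfi = (chi2_baseline - chi2) / chi2_baseline
        summary[name] = (chi2, df, nfi)
    return summary

summary = fit_summary(MODELS)
for name, (chi2, df, nfi) in summary.items():
    print(f"{name:14s} chi-square = {chi2:7.2f}  df = {df:2d}  NFI = {nfi:.3f}")
```

By construction the baseline model has NFI = 0 and the saturated model has NFI = 1, which is a quick check that the two reference models were entered correctly.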
As Stan Mulaik and others have pointed out, use of AIC in a large sample size favors saturated models. While AIC worked quite well for the Williams and Holahan cases, the same may not be the case for other sample sizes or other models. The power of a particular design and sample size to estimate a parameter of interest seems critical here. What is too large a sample for some parsimony comparisons may be just right for others and vice versa. This subtle issue is a difficult one to convey and it is difficult to recommend strategies that are generally practical.
Because the saturated model has many parameters, convergence to its ML solution may be slow. This is why Amos 3.6 does not automatically compute the chi-square statistic with missing data. We will attempt to make this a bit easier in future releases of Amos. We're open to suggestions from users on the best way to do this.
So, EMCOV can be used to replace missing raw data using its Maximum Likelihood Estimation procedure. It should be noted, however, that SEM analyses of an EMCOV-generated covariance matrix usually underestimate the standard errors and therefore inflate the critical ratios.
This relationship between standard errors resulting from EMCOV-ML, versus those obtained by FIML, makes perfect sense. EMCOV estimates the means and covariance matrix of a saturated model, and substitutes the conditional expectations for missing data, given the observed data of each case. While these expectations are "best" estimates in a sense, the completed data matrix lacks some of the residual variation (or uncertainty) usually found in empirical data. In other words, EMCOV (like many other EM methods) shrinks the error variance, and subsequently the standard errors obtained by EMCOV-ML can be somewhat too small.
In summary, matrices completed by EMCOV should only be used for exploratory purposes. An Amos analysis based on raw data (without imputations) is still necessary to obtain exact solutions with likelihood-based standard errors, critical ratios, and statistical tests of model fit.
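The loss of residual variation is easy to see in a small simulation. The Python sketch below is not EMCOV. It assumes a simpler stand-in, single regression imputation of one variable under MCAR, and every number in it (sample size, slope, residual SD, missingness rate, random seed) is made up for illustration. Missing values of y are replaced by their predicted values given x, and the variance of the completed variable is compared with the variance of the values that were actually observed.

```python
import random
import statistics

rng = random.Random(0)           # fixed seed so the run is repeatable
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.6 * xi + rng.gauss(0.0, 0.8) for xi in x]   # population variance of y is 1.0
missing = [rng.random() < 0.5 for _ in range(n)]   # about half of y is MCAR-missing

# Regression of y on x, estimated from the complete cases only.
xo = [xi for xi, m in zip(x, missing) if not m]
yo = [yi for yi, m in zip(y, missing) if not m]
mx, my = statistics.fmean(xo), statistics.fmean(yo)
slope = sum((a - mx) * (b - my) for a, b in zip(xo, yo)) / sum((a - mx) ** 2 for a in xo)
intercept = my - slope * mx

# Fill each missing y with its conditional expectation given x.
y_filled = [intercept + slope * xi if m else yi for xi, yi, m in zip(x, y, missing)]

var_observed = statistics.variance(yo)
var_filled = statistics.variance(y_filled)
print(f"variance of observed y:  {var_observed:.3f}")
print(f"variance of completed y: {var_filled:.3f}")
```

The imputed values lie exactly on the regression line, so they carry no residual scatter and the completed variable has a visibly smaller variance. Treating the completed data as if all n values had been observed then understates the uncertainty in the estimates.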
Q. When we entered our raw data into EMCOV, we standardized it 1) to equate the variances of the variables and 2) to reduce the iterations needed to attain convergence. What is the effect of this standardization? The problem was that we had Achievement scores (with ranges of 100 to 800) along with simple, dichotomous variables. The variance for the Achievement variable was over 150,000. Should we have simply scaled down the Achievement scores using a factor of 10 instead of standardizing? What is the recommended practice and why?
A. Our colleague John Graham has done a great job of fielding this question. The following is his response.
"It is a good idea to standardize your variables before running EMCOV (or any EM algorithm program). The convergence criterion is dependent upon the variance of the items. If you have variances larger than 1, the default EMCOV convergence criterion (.00001) will be conservative. However, if you have many variances less than 1, the criterion will be too liberal.
My solution is to standardize before running EMCOV, and then back-standardize to the original scale when you are finished. If you want to go one step further, you could also transform any variables that have skewed distributions, and then back-transform after running EMCOV (this makes most sense if you are using the multiple imputation option).
By the way, Joe Schafer's Norm program does this standardizing and back standardizing automatically. His programs are currently available as a standalone PC program."
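For readers who want to do this by hand, here is a minimal Python sketch of the standardize and back-standardize steps. The function names and the example scores are hypothetical, and the sketch does not call EMCOV or Norm; it only shows the bookkeeping that has to happen before and after such a program is run.

```python
import statistics

def standardize(values):
    """Return z-scores together with the mean and SD needed to undo the scaling."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values], mean, sd

def back_standardize(z_values, mean, sd):
    """Map z-scores (for example, imputed values) back to the original scale."""
    return [z * sd + mean for z in z_values]

def rescale_cov(cov_z, sds):
    """Convert a covariance matrix of standardized variables to the original scale."""
    k = len(sds)
    return [[cov_z[i][j] * sds[i] * sds[j] for j in range(k)] for i in range(k)]

# Hypothetical achievement scores on a 100 to 800 scale (observed values only).
scores = [100.0, 450.0, 800.0, 620.0]
z, mean, sd = standardize(scores)
restored = back_standardize(z, mean, sd)
print(restored)
```

The mean and SD should be computed from the observed values of each variable and kept, because the same two numbers are needed to return estimated means, covariances, or imputed values to the original metric.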