closes #71: Function EFA_POOLED() #72

Open
andreassoteriadesmoj wants to merge 4 commits into mdsteiner:master from moj-analytical-services:efa-pooled

Conversation

@andreassoteriadesmoj

EFA_POOLED() is similar to psych::fa.pooled() and can be used to perform EFA on multiple data imputations using the EFA() function. This PR creates EFA_POOLED() (55fcab5) and a few helper functions (2e022b8) that EFA_POOLED() uses behind the scenes.

EFA_POOLED() returns some of the standard objects that EFA() also returns, including communalities, unrotated and rotated pattern loadings, structure loadings and interfactor correlations (for oblique solutions), and fit indices. Note, however, that in the case of EFA_POOLED(), these results are based on the pooled estimates (see function documentation for details). In addition, EFA_POOLED() also returns confidence intervals for the pooled estimates, as well as the individual EFA() fits for each imputation.

@mdsteiner
Owner

Thanks very much! Just so you know what to expect: it will probably take a couple of weeks until I find the time to really go through it and check it, but I will get back to you.

@andreassoteriadesmoj
Author

> Thanks very much! Just so you know what to expect: it will probably take a couple of weeks until I find the time to really go through it and check it, but I will get back to you.

That's fine, thanks for letting me know!

@mdsteiner
Owner

Quick question: in psych::fa.pooled the CI is calculated differently (i.e., without dividing by sqrt(m); see https://github.com/cran/psych/blob/6c1403fe9c2911552e6bdbabae4730679b3ffea6/R/fa.pooled.r#L59). So the psych intervals will be wider than the CIs from EFA_POOLED. Do you have a rationale for doing it differently?

@andreassoteriadesmoj
Author

> Quick question: in psych::fa.pooled the CI is calculated differently (i.e., without dividing by sqrt(m); see https://github.com/cran/psych/blob/6c1403fe9c2911552e6bdbabae4730679b3ffea6/R/fa.pooled.r#L59). So the psych intervals will be wider than the CIs from EFA_POOLED. Do you have a rationale for doing it differently?

It's not clear to me why psych uses the SD instead of the SE. The function .calc_cis() divides the SD by the square root of the sample size (here m, the number of imputations) to get the SE, which is the definition of CIs that I'm familiar with from the literature.

@mdsteiner
Owner

If we divide by sqrt(m), the SE shrinks as m grows, so the assumption would be that we gain precision simply by adding more and more imputations. I'm not sure that's correct. In Hayes and Enders (2022), eq. 11, they use a more complicated expression for deriving the SEs, but I have no expertise at all in this area. A quick chat with ChatGPT (which by no means has to be correct, but that's what I could do in the time given) tells me that probably only the implementation used in Hayes and Enders is correct. Briefly, it states that:

  • Dividing by sqrt(m) gives the standard error of the mean of the imputation-specific loadings. That is not the goal of MI; rather, we want the sampling uncertainty of the loading estimate.
  • The psych version estimates how much solutions vary across imputations and thus gives a typical range of plausible loadings across imputations, but again not the sampling uncertainty of the loading estimate.
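
To make the three candidates concrete, here is a minimal R sketch that computes each interval for a single loading. It is an illustration under stated assumptions, not the code in this PR or in psych: the loadings and within-imputation variances are made-up numbers, the third variant assumes that the expression referenced in Hayes and Enders (2022) follows the usual Rubin's-rules form (total variance = within + (1 + 1/m) * between; the equation itself is not reproduced in this thread), and a normal quantile is used for simplicity where Rubin's rules would use a t reference distribution.

```r
# Hypothetical loadings of one item on one factor across m = 5 imputations
loadings <- c(0.62, 0.58, 0.65, 0.60, 0.61)

# Hypothetical within-imputation sampling variances (squared SEs) of that
# loading. These have to be obtained separately for each imputed data set,
# e.g. via a bootstrap; the values here are invented for illustration.
within_var <- c(0.0040, 0.0042, 0.0039, 0.0041, 0.0040)

m      <- length(loadings)
pooled <- mean(loadings)   # pooled point estimate
z      <- qnorm(0.975)     # normal quantile for a 95% interval (simplification)

# (a) SD across imputations: how much solutions vary between imputations
#     (the variant attributed to psych::fa.pooled in this thread)
sd_between <- sd(loadings)
ci_sd <- pooled + c(-1, 1) * z * sd_between

# (b) SD / sqrt(m): standard error of the mean across imputations
#     (the variant described for .calc_cis() in this thread)
ci_se_mean <- pooled + c(-1, 1) * z * sd_between / sqrt(m)

# (c) Rubin's rules: total variance combines within- and between-imputation
#     variance, so it does not shrink to zero as m grows
total_var <- mean(within_var) + (1 + 1 / m) * var(loadings)
ci_rubin  <- pooled + c(-1, 1) * z * sqrt(total_var)

print(rbind(sd = ci_sd, se_mean = ci_se_mean, rubin = ci_rubin))
```

With these made-up inputs, variant (b) is the narrowest and keeps narrowing as m increases, whereas variant (c) is bounded below by the within-imputation variance. Note that variant (c) needs a sampling variance for each loading in each imputation, which a plain EFA fit may not provide.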

Hayes, T., & Enders, C. K. (2022). Maximum likelihood and multiple imputation missing data handling: How they work, and how to make them work in practice. In H. Cooper (Ed.), APA Handbook of research methods in psychology (2nd ed., pp. 27-51). Washington, DC: American Psychological Association.
