
Potential sign of bug in confidence calculations #415

Open
curlette opened this issue May 14, 2016 · 4 comments
@curlette (Contributor) commented May 14, 2016

During my analysis of the College Scorecard data I came across the following:

The plot below is of 500 simulated scores for a school whose tuition was inferred with 94% confidence (Everest Univ. Jacksonville).

The distribution is noticeably bimodal, which we would expect to lower the confidence.

@fsaad @vkmvkmvkmvkm @raxraxraxraxrax @gregory-marton

[Plot of the 500 simulated scores for Everest Univ. Jacksonville (collegescorecardanalysis-copy1_211_0)]

[Screenshot from 2016-05-14 01:26:52]

@alxempirical (Contributor) commented May 14, 2016

Assuming you are using a crosscat metamodel (not gpmcc), I happened upon a possible explanation for this behavior earlier this week.

TL;DR:

            # TODO: multistate impute doesn't exist yet
            # e,confidence = su.impute_and_confidence_multistate(M_c, X_L, X_D, Y, Q, n,
            #                                                    self.get_next_seed)

INFER draws 100 approximate posterior samples for an observed row by sampling from the category distributions ("cluster_model" in the crosscat source-code nomenclature) of the latent categories assigned to the rows in the last ANALYZE iteration of each model. Each such category distribution is a univariate Gaussian.
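
For concreteness, here is a minimal numpy sketch of what a single model contributes (the parameter values are made up, and none of this is the crosscat API):

    import numpy as np

    rng = np.random.default_rng(0)

    # Suppose the last ANALYZE iteration left the target row in a latent
    # category whose cluster model for the tuition column is this Gaussian
    # (hypothetical parameters):
    cluster_mu, cluster_sigma = 12_000.0, 500.0

    # The ~100 approximate posterior samples INFER draws from this model
    # all come from that one univariate Gaussian, so the per-model sample
    # is unimodal by construction.
    posterior_samples = rng.normal(cluster_mu, cluster_sigma, size=100)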

Then the confidence estimate builds an entirely new crosscat state from the posterior sample, trains it for 100 iterations, and returns the mean frequency of the maximum-likelihood category over those training iterations. Since that state is trained on a sample from a single Gaussian, it is not surprising that the maximum-likelihood category has very high frequency. Essentially, the confidence-estimate code never gets to see the other mode of the posterior sample.
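
A rough, self-contained illustration of that failure mode, with sklearn's BayesianGaussianMixture standing in for the freshly trained crosscat state and "confidence" taken to be the frequency of the most populous category (a toy stand-in under those assumptions, not the real crosscat code):

    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    def toy_confidence(samples, seed=0):
        # Stand-in for "train a fresh state on the sample and report the
        # frequency of the maximum-likelihood category": fit a DP-style
        # mixture and measure the share of its most populous component.
        mix = BayesianGaussianMixture(
            n_components=5,
            weight_concentration_prior_type="dirichlet_process",
            random_state=seed,
        )
        labels = mix.fit_predict(samples.reshape(-1, 1))
        return np.bincount(labels, minlength=5).max() / len(samples)

    rng = np.random.default_rng(0)

    # The unimodal sample the confidence code actually sees (one model):
    one_model = rng.normal(12_000.0, 500.0, size=100)
    print(toy_confidence(one_model))   # typically close to 1.0

    # The pooled, bimodal picture it never sees:
    pooled = np.concatenate([rng.normal(12_000.0, 500.0, size=50),
                             rng.normal(20_000.0, 500.0, size=50)])
    print(toy_confidence(pooled))      # typically close to 0.5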

SIMULATE draws samples given observed-row conditions using the same code as INFER, but it draws them from all models in the generator unless you specify otherwise. So SIMULATEd samples can have multiple modes; the modes just come from different models. The confidence estimate in INFER (and the inference itself) is based only on the first model in the generator.
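
Schematically, with made-up parameters for a hypothetical two-model generator:

    import numpy as np

    rng = np.random.default_rng(0)

    # Each model's cluster model for the target cell, as (mu, sigma) pairs.
    # Suppose the models disagree about which mode the row sits in:
    model_clusters = [(12_000.0, 500.0), (20_000.0, 500.0)]

    # SIMULATE-style draws pool over all models, so the two modes show up
    # together in one sample...
    simulated = np.concatenate(
        [rng.normal(mu, sigma, size=100) for mu, sigma in model_clusters])

    # ...while INFER's confidence path only ever looks at models[0]:
    first_model_only = rng.normal(*model_clusters[0], size=100)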

@curlette (Contributor, Author)

Makes sense, thanks!

@alxempirical (Contributor)

If you don't mind, I think it would be good to keep this issue open. It looks like you have brought a serious bug to light.

alxempirical reopened this May 14, 2016
@alxempirical (Contributor) commented May 15, 2016

@curlette, can you send the bdb file to alx@<the rest of my github name>.com, please?

> SIMULATE draws samples given observed-row conditions using the same code as INFER, but it draws them from all models in the generator unless you specify otherwise. So SIMULATEd samples can have multiple modes; the modes just come from different models. The confidence estimate in INFER (and the inference itself) is based only on the first model in the generator.

I misread the code in the link. The first model is used only for the confidence calculation. The imputation sampling is done over all models, so you would expect the two modes to appear in the samples generated by impute, which are then passed to continuous_imputation_confidence.
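
In toy form (same made-up parameters as above; continuous_imputation_confidence is the real crosscat name, but the commented call is only schematic):

    import numpy as np

    rng = np.random.default_rng(0)

    # With the corrected reading, impute pools its draws over ALL models,
    # so both modes show up in the sample that gets passed along...
    imputed_samples = np.concatenate(
        [rng.normal(mu, 500.0, size=100) for mu in (12_000.0, 20_000.0)])

    # ...and only the confidence calculation itself is pinned to the first
    # model. Schematic, not the real signature:
    # confidence = continuous_imputation_confidence(imputed_samples, ...)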
