Skip to content

Conversation

@kpedro88
Copy link
Contributor

@kpedro88 kpedro88 commented Mar 6, 2025

New feature: evaluating the transfer factor polynomial, after performing a fit in RooFit, can propagate the parameter uncertainty via the covariance matrix to provide an error band (confidence interval) on the resulting fit. This is enabled by the errorband boolean option when calling the polynomial.

Bug fixes:

  1. Remove sigmascale from the fit parametrizations, as it was applied in the wrong part of the formula and moved the parameter error away from unity.
  2. Recognize newer Python representation of ROOT TH1 in _to_numpy() utility function.
  3. Avoid wrong error bars when converting histograms from numpy to ROOT. When TH1::SetDefaultSumw2() is set to true, calling SetBinContent() (rather than Fill()) sets the bin error to 0. RooFit has a weird default where it sets the bin error to be equal to the bin content if the input TH1 bin error is 0: https://root.cern.ch/doc/v622/RooDataHist_8cxx_source.html#l01300.

Minor improvements:

  1. Add a method to print the decorrelated parameter representations as strings.
  2. Add more information to some error messages and reprs.

@rkansal47
Copy link
Collaborator

Thanks a lot for the errorband feature and fix 3 (that likely explains why I never saw any change to the fit results when specifying errors...).

One question - on sigmascale, what do you mean by "it was applied in the wrong part of the formula and moved the parameter error away from unity."?

@kpedro88
Copy link
Contributor Author

kpedro88 commented Mar 6, 2025

sigmascale applied in the correct place would look like this:

scaledparams = failObs * (1 + 1.0 / np.maximum(1.0, np.sqrt(failObs))) ** (qcdparams/sigmascale)

but when checking the behavior in some different cases, the uncertainty on the per-bin yields is already 1 as desired, so it seemed simpler just to remove it.

@rkansal47
Copy link
Collaborator

rkansal47 commented Mar 6, 2025

Do you mean qcdparams * sigmascale? I believe the idea is to loosen the constraint on the fail QCD bin yields to help the fit. In that case, applying it in the exponent vs. as it was before in the sum in the parentheses is ~equivalent to first order.

I still use it in my fits so I would be in favour of leaving the test as it is, but I haven't checked in a long time if it's still necessary so I leave it up to @nsmith- and others.

@kpedro88
Copy link
Contributor Author

kpedro88 commented Mar 6, 2025

I discussed this with @nsmith- when I was confused by the output from the test. He explained it as follows:
n*(1+1/√n)^p = {n for p = 0, n + sqrt(n) for p = 1}
which makes p act like the number of standard deviations from the initial value.

Whether you scale p as p*sigmascale or p/sigmascale just depends on how you define sigmascale. But having sigmascale in the base of the expression is just wrong as far as I can tell.

@nsmith-
Copy link
Owner

nsmith- commented Mar 7, 2025

Right, having sigmascale in the base of the expression was just a longstanding bug. I suppose to be consistent with the current usage we would want p * sigmascale.

@nsmith-
Copy link
Owner

nsmith- commented Mar 7, 2025

For reference

import numpy as np
import matplotlib.pyplot as plt

param = np.linspace(-1, 1, 101)
n = 100
sigmascale = 10
oldfactor = n * (1 + sigmascale / np.sqrt(n)) ** param
newfactor = n * (1 + 1 / np.sqrt(n)) ** (param * sigmascale)

fig, ax = plt.subplots()
ax.plot(param, oldfactor, label="old")
ax.plot(param, newfactor, label="new")
ax.axhline(n, color="grey", linestyle="--")
ax.axhline(n + np.sqrt(n), color="red", linestyle="--")
ax.axhline(n - np.sqrt(n), color="blue", linestyle="--")

ax.set_yscale("log")
ax.legend()
ax.set_title(f"{n=}, {sigmascale=}")
ax.set_xlabel("parameter")
ax.set_ylabel("n expected")

image

@kpedro88
Copy link
Contributor Author

kpedro88 commented Mar 7, 2025

I can update the example to use p * sigmascale if desired. It wasn't clear to me that this really improved the condition of the example fit, though...

@nsmith-
Copy link
Owner

nsmith- commented Mar 7, 2025

Yeah for the purpose it serves (getting the error on qcdparam in the ballpark of 1), 10 vs 0.1 won't make a difference.

@kpedro88
Copy link
Contributor Author

kpedro88 commented Mar 7, 2025

The pytest workflow failure seems to be upstream in conda

@nsmith-
Copy link
Owner

nsmith- commented Mar 7, 2025

Yeah it's been quite flaky lately. Anyway you'll probably want to rebase onto #46 which fixes the python 3.8 incompatibilities

@kpedro88
Copy link
Contributor Author

kpedro88 commented Mar 7, 2025

rebased now (thanks for the py3.8 fixes!)

@kpedro88
Copy link
Contributor Author

kpedro88 commented Mar 7, 2025

Found one more py3.8 item to fix, committed here.

Copy link
Owner

@nsmith- nsmith- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding these!

@nsmith- nsmith- merged commit 36ab6d3 into nsmith-:master Mar 11, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants