
Update captum requirement from <0.8.0,>=0.5.0 to >=0.5.0,<0.9.0 #570


Open
dependabot[bot] wants to merge 1 commit into main

Conversation

dependabot[bot] commented on behalf of GitHub on Jul 1, 2025

Updates the requirements on captum to permit the latest version.

Release notes

Sourced from captum's releases.

Captum v0.8.0 Release

The v0.8.0 release of Captum offers new influence functions for data attribution, improvements to feature attribution methods (including LLM prompt attribution), enhanced type annotations for modern Python type checking, and a variety of other small changes. Note that support for Python 3.8 and PyTorch 1.10 has been dropped, and Captum Insights will be deprecated in the next major release.

Data Attribution: New Influence Functions

This version offers two different implementations that both calculate the "infinitesimal" influence score as defined in the paper "Understanding Black-box Predictions via Influence Functions".

Example:

from captum.influence._core.influence_function import NaiveInfluenceFunction
from torch import nn
from torch.utils.data import DataLoader

train_dl = DataLoader(your_dataset, batch_size=8)  # your dataloader
criterion = nn.MSELoss(reduction="none")
influence = NaiveInfluenceFunction(
    net,               # your trained model (an nn.Module)
    train_dl,          # your training data
    checkpoint_path,   # path to your model checkpoint
    loss_fn=criterion,
    batch_size=batch_size,
)
# compute pairwise influences using the influence implementation
influence_train_test_influences = influence.influence(
    (test_samples, test_labels)  # your test data (Tensors)
)

What is the "infinitesimal" influence score

More details on the "infinitesimal" influence score: it approximately answers the question, if a given training example were infinitesimally down-weighted and the model re-trained to optimality, how much would the loss on a given test example change? Mathematically, this influence score is given by $\nabla_\theta L(x)' H^{-1} \nabla_\theta L(z)$, where $\nabla_\theta L(x)$ is the gradient of the loss on training example $x$ with respect to (a subset of) the model parameters $\theta$, $\nabla_\theta L(z)$ is the analogous quantity for a test example $z$, and $H$ is the Hessian of the training loss with respect to the (subset of) model parameters at a given model checkpoint.
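
As a rough illustration of this formula (not Captum's implementation; the tiny linear model, the synthetic data, and the damping term are assumptions made only for the sketch), the score can be computed directly for a small model by forming the per-example gradients and the Hessian of the training loss explicitly:

import torch
from torch import nn
from torch.autograd.functional import hessian, jacobian

model = nn.Linear(3, 1)
theta0 = torch.cat([p.detach().flatten() for p in model.parameters()])  # flat parameter vector

def per_example_loss(theta, inp, target):
    # MSE loss of the linear model, with parameters packed into a flat vector
    w, b = theta[:3].view(1, 3), theta[3:]
    return ((inp @ w.T + b - target) ** 2).mean()

x_train, y_train = torch.randn(8, 3), torch.randn(8, 1)  # toy training batch
x, y = x_train[:1], y_train[:1]                          # one training example
z, z_y = torch.randn(1, 3), torch.randn(1, 1)            # one test example

grad_x = jacobian(lambda t: per_example_loss(t, x, y), theta0)
grad_z = jacobian(lambda t: per_example_loss(t, z, z_y), theta0)
H = hessian(lambda t: per_example_loss(t, x_train, y_train), theta0)  # Hessian of the training loss

# grad_x' H^{-1} grad_z, with a small damping term to keep the inverse well conditioned
influence_score = grad_x @ torch.linalg.solve(H + 1e-3 * torch.eye(H.shape[0]), grad_z)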

What the two implementations have in common

Both implementations compute a low-rank approximation of the inverse Hessian, i.e. a tall and skinny matrix $R$ with width $k$ such that $H^{-1} \approx RR'$, where $k$ is small. In particular, let $L$ be the matrix of width $k$ whose columns contain the top-$k$ eigenvectors of $H$, and let $V$ be the $k \times k$ diagonal matrix whose diagonal contains the corresponding eigenvalues. Both implementations let $R = LV^{-1/2}$, so that $RR' = LV^{-1}L' \approx H^{-1}$. Thus, the core computational step is computing the top-$k$ eigenvalues / eigenvectors. This approximation is useful for several reasons (a short sketch follows the list below):

  • It avoids numerical issues associated with inverting small eigenvalues
  • Since the influence score is given by $\nabla_\theta L(x)' H^{-1} \nabla_\theta L(z)$, which is approximated by $(\nabla_\theta L(x)' R)(\nabla_\theta L(z)' R)'$, we can compute an "influence embedding" for a given example $x$, namely $\nabla_\theta L(x)' R$, such that the influence score of one example on another is approximately the dot-product of their respective embeddings. Because $k$ is small, e.g. 50, these influence embeddings are low-dimensional.
  • Even for large models, we can store $R$ in memory, provided $k$ is small. This means influence embeddings (and thus influence scores) can be efficiently computed by doing a backwards pass to compute $\nabla_\theta L(x)$ and then multiplying by $R'$. This is orders of magnitude faster than the previous LISSA approach of Koh et al., which, to compute the influence score involving a given example, needs to compute Hessian-vector products involving on the order of $10^4$ examples.
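
A minimal sketch of this low-rank trick (illustrative only, not Captum's code; the random stand-in Hessian, the stand-in gradients, and the dimensions d and k are assumptions):

import torch

d, k = 1000, 50
A = torch.randn(d, d)
H = A @ A.T / d                          # stand-in for a positive semi-definite Hessian

eigvals, eigvecs = torch.linalg.eigh(H)  # eigenpairs, in ascending order of eigenvalue
L_top = eigvecs[:, -k:]                  # top-k eigenvectors of H
V_top = eigvals[-k:]                     # corresponding eigenvalues
R = L_top / V_top.sqrt()                 # R = L V^{-1/2}, so RR' = L V^{-1} L' ~ H^{-1}

grad_x = torch.randn(d)                  # stand-in for a per-example loss gradient
grad_z = torch.randn(d)
emb_x, emb_z = grad_x @ R, grad_z @ R    # k-dimensional "influence embeddings"
approx_influence = emb_x @ emb_z         # ~ grad_x' H^{-1} grad_z via the rank-k approximation

Because the embeddings are only $k$-dimensional, they can be precomputed and stored for an entire dataset, which is what makes pairwise influence scores cheap to evaluate.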

The implementations differ in how they compute the top-k eigenvalues / eigenvectors.

How NaiveInfluenceFunction computes the top-k eigenvalues / eigenvectors

It is "naive" in that it computes the top-k eigenvalues / eigenvectors by explicitly forming the Hessian, converting it to a 2D tensor, computing its eigenvectors / eigenvalues, and then sorting. See documentation of the _set_projections_naive_influence_function method for more details.

How ArnoldiInfluenceFunction computes the top-k eigenvalues / eigenvectors

... (truncated)

Commits
  • fc17d50 Add necessary deps to environment.yml (#1510)
  • 7b1e649 Update meta.yaml with description and BSD version (#1509)
  • bde33ee Update setup.py classifier (Beta -> Stable) (#1508)
  • 00325c8 Update LLM attr tutorial notebook with note on required Captum version (#1496)
  • fcc1933 Update conda build file with required build and run dependencies (#1506)
  • 2bf8729 Update version to 0.8.0 (#1504)
  • c196342 Add Captum Insights deprecation message to README (#1498)
  • 8a61aaf Remove dated "beta" text in README (#1499)
  • 649d82f Reduce redundant major.minor specification in setup.py (#1500)
  • 0e70a12 Add additional keywords to setup.py (#1501)
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

📚 Documentation preview 📚: https://pytorch-tabular--570.org.readthedocs.build/en/570/

Updates the requirements on [captum](https://github.com/pytorch/captum) to permit the latest version.
- [Release notes](https://github.com/pytorch/captum/releases)
- [Commits](pytorch/captum@v0.5.0...v0.8.0)

---
updated-dependencies:
- dependency-name: captum
  dependency-version: 0.8.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot[bot] requested a review from manujosephv on July 1, 2025 at 11:37
dependabot[bot] added the dependencies (Pull requests that update a dependency file) label on Jul 1, 2025
dependabot[bot] commented on behalf of GitHub on Jul 1, 2025

The reviewers field in the dependabot.yml file will be removed soon. Please use the code owners file to specify reviewers for Dependabot PRs. For more information, see this blog post.

dosubot[bot] added the size:XS (This PR changes 0-9 lines, ignoring generated files.) label on Jul 1, 2025