base repository: NVIDIA/cudnn-frontend
base: f6266a9e2a4f699ca7714b99aa76bd9fea7862c3
head repository: NVIDIA/cudnn-frontend
compare: 91b7532f3386768bba4f444ee7672b497f34da8a
  • 1 commit
  • 112 files changed
  • 1 contributor

Commits on Jan 28, 2025

  1. # cudnn frontend v1.10 release notes (#126)

    cudnn frontend v1.10 is the preferred cudnn frontend to be used for
    cudnn backend 9.7.0 and later as it adds to the Blackwell specific
    ## New API
    - cudnn Frontend v1.10 introduces two new operators,
    `block_scale_quantize` and `block_scale_dequantize`, to specify the
    scaling and de-scaling of low precision datatypes supported from Blackwell GPU
    - `create_execution_plan(int64_t const engine_id,
    std::unordered_map<KnobType_t, int64_t> const &knobs)` allows creation
    of a custom execution plan with hardcoded engine and knobs. Added a
    sample in `samples/cpp/misc/custom_plan.cpp` to showcase how to work
    with different `Engine` and `Knobs`.
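
The idea of block scaling behind `block_scale_quantize` and `block_scale_dequantize` can be illustrated with a minimal, standalone sketch. This is not the cudnn frontend API: the `BlockScaled` struct, the `*_sketch` function names, the int8 payload, and the float per-block scale are all hypothetical stand-ins chosen for readability, and the actual low precision formats and scale types used on Blackwell may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: quantized payload plus one scale per block.
struct BlockScaled {
    std::vector<std::int8_t> values;  // stand-in for a low precision datatype
    std::vector<float> scales;        // one scale factor per block
    std::size_t block_size;
};

// Quantize: each block of `block_size` elements gets its own scale, chosen so
// that the largest magnitude in the block maps to the int8 limit 127.
BlockScaled block_scale_quantize_sketch(const std::vector<float>& x,
                                        std::size_t block_size) {
    assert(block_size > 0);
    BlockScaled out{std::vector<std::int8_t>(x.size()), {}, block_size};
    for (std::size_t start = 0; start < x.size(); start += block_size) {
        const std::size_t end = std::min(start + block_size, x.size());
        float amax = 0.0f;
        for (std::size_t i = start; i < end; ++i) {
            amax = std::max(amax, std::fabs(x[i]));
        }
        // An all-zero block keeps scale 1 to avoid dividing by zero.
        const float scale = amax > 0.0f ? amax / 127.0f : 1.0f;
        out.scales.push_back(scale);
        for (std::size_t i = start; i < end; ++i) {
            out.values[i] = static_cast<std::int8_t>(std::lround(x[i] / scale));
        }
    }
    return out;
}

// Dequantize: multiply each element by the scale of the block it belongs to.
std::vector<float> block_scale_dequantize_sketch(const BlockScaled& q) {
    std::vector<float> x(q.values.size());
    for (std::size_t i = 0; i < q.values.size(); ++i) {
        x[i] = static_cast<float>(q.values[i]) * q.scales[i / q.block_size];
    }
    return x;
}
```

Because every block carries its own scale, a block of small values is not crushed by a large value elsewhere in the tensor; the round-trip error of each element is bounded by roughly half of its block's scale.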
    ## Improvements
    - Users can now query behavior notes of a particular execution plan
    using `get_behavior_notes(std::vector<BehaviorNote_t> &notes) const` and
    `get_behavior_notes_for_plan_at_index(int64_t const index,
    std::vector<BehaviorNote_t> &notes) const` functions.
    - SDPA operations now accept both left window and right window size with
    respect to diagonal. See for more details.
    - SDPA operations now accept a diagonal alignment for the attention
    score matrix, used to describe the above window. When `s_q != s_kv`
    and the causal mask is on, this can be used to specify whether the
    diagonal is top left or bottom right.
    - Bottom right causal masking can now be enabled on the sdpa_fp8
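
To make the window and diagonal alignment notes above concrete, here is a small standalone sketch of which score-matrix entries stay unmasked. It does not use the cudnn frontend API: the `DiagonalAlignment` enum and the `is_attended` function are hypothetical names, and the inclusive treatment of the `left` and `right` bounds is an assumption, so the library's exact boundary convention may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum: where the reference diagonal of the score matrix sits.
enum class DiagonalAlignment { TopLeft, BottomRight };

// Returns true if query row i may attend to key column j.
// `left` and `right` are window sizes measured from the diagonal and are
// treated as inclusive bounds here (an assumption of this sketch).
bool is_attended(std::int64_t i, std::int64_t j,
                 std::int64_t s_q, std::int64_t s_kv,
                 std::int64_t left, std::int64_t right,
                 DiagonalAlignment align) {
    assert(i >= 0 && i < s_q);
    assert(j >= 0 && j < s_kv);
    // Top left: the diagonal passes through (0, 0).
    // Bottom right: it passes through (s_q - 1, s_kv - 1), which shifts
    // every row's diagonal column by s_kv - s_q.
    const std::int64_t shift =
        (align == DiagonalAlignment::BottomRight) ? (s_kv - s_q) : 0;
    const std::int64_t diag = i + shift;
    return j >= diag - left && j <= diag + right;
}
```

With `s_q = 2`, `s_kv = 4`, and a causal mask expressed as `left = s_kv`, `right = 0`, top-left alignment lets row 0 see only column 0, while bottom-right alignment lets row 0 see columns 0 through 2 and the last row see every column. When `s_q == s_kv` the two alignments coincide.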
    ## Bug fixes
    - Fixed a regression in cuDNN FrontEnd v1.9.0 where the softmax node
    would override user-set dims and strides for `softmax_stats` and `m_zinv`.
    This also affected the `sdpa_forward` and `sdpa_fp8_forward` nodes.
    ## New samples
    - Added an example to showcase how native cuda graphs can be constructed
    from the SDPA operation graph.
    Anerudhan authored Jan 28, 2025