Add CUDA attention kernels, gradient norms, and CI improvements#69

Open
Eamon2009 wants to merge 31 commits into codeaddict-master from master

@Eamon2009
Owner

No description provided.

Eamon2009 and others added 18 commits June 1, 2026 01:00
* docs: report [run_20260530_165216] (~791 tok/s)

Includes metrics for the generalization gap, throughput (~791 tok/s), and gradient norms.
Parameters: 6.68M | lr: 1e-3 | batch: 16 | steps: 6000. Best validation loss: 4.1319 at step 3900.

* docs: report [run_20260530_165216] (~791 tok/s) (#61)

Includes metrics for the generalization gap, throughput (~791 tok/s), and gradient norms.
Parameters: 6.68M | lr: 1e-3 | batch: 16 | steps: 6000. Best validation loss: 4.1319 at step 3900.

Co-authored-by: Max <eamon5174@gmail.com>

* feat(cuda): add attention forward and backward kernel declarations

Introduces the header declarations for `attention_forward` and
`attention_backward` operations inside the `quadtrix::cuda` namespace.
Configured with support for custom CUDA streams and head partitioning.
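
For reviewers who want a numerical baseline for the new kernels, here is a minimal CPU reference for the forward pass. It is a sketch, not the PR's CUDA code: the name `attention_forward_ref`, the row-major `[seq_len, n_heads * head_dim]` layout, and the `causal` flag are all assumptions made for illustration, and the real `quadtrix::cuda::attention_forward` signature (including its stream argument) may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical CPU reference for scaled dot-product attention.
// q, k, v, out are row-major [seq_len, n_heads * head_dim] (assumed layout).
// "Head partitioning" here means each head only reads its own
// head_dim-wide column slice of q, k and v.
void attention_forward_ref(const std::vector<float>& q,
                           const std::vector<float>& k,
                           const std::vector<float>& v,
                           std::vector<float>& out,
                           std::size_t seq_len,
                           std::size_t n_heads,
                           std::size_t head_dim,
                           bool causal) {
    const std::size_t width = n_heads * head_dim;
    const float scale = 1.0f / std::sqrt(static_cast<float>(head_dim));
    out.assign(seq_len * width, 0.0f);
    std::vector<float> scores(seq_len);

    for (std::size_t h = 0; h < n_heads; ++h) {
        const std::size_t off = h * head_dim;  // column offset of this head
        for (std::size_t i = 0; i < seq_len; ++i) {
            // With the (assumed) causal mask, query i sees keys 0..i only.
            const std::size_t n_keys = causal ? i + 1 : seq_len;

            // Scaled dot products, tracking the max for a stable softmax.
            float max_score = std::numeric_limits<float>::lowest();
            for (std::size_t j = 0; j < n_keys; ++j) {
                float dot = 0.0f;
                for (std::size_t d = 0; d < head_dim; ++d) {
                    dot += q[i * width + off + d] * k[j * width + off + d];
                }
                scores[j] = dot * scale;
                if (scores[j] > max_score) max_score = scores[j];
            }

            // Softmax over the visible keys.
            float denom = 0.0f;
            for (std::size_t j = 0; j < n_keys; ++j) {
                scores[j] = std::exp(scores[j] - max_score);
                denom += scores[j];
            }

            // Weighted sum of value rows for this head.
            for (std::size_t j = 0; j < n_keys; ++j) {
                const float w = scores[j] / denom;
                for (std::size_t d = 0; d < head_dim; ++d) {
                    out[i * width + off + d] += w * v[j * width + off + d];
                }
            }
        }
    }
}
```

A reference like this is slow, but it gives a value to compare the kernel output against within a small tolerance.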

---------

Co-authored-by: Max <eamon5174@gmail.com>
- Defines `DType` and `DeviceKind` enums supporting standard types (F32, F16, BF16, I32, U8).
- Implements `dtype_name` and `dtype_size` metadata helper functions.
- Adds an explicit `Status` struct for non-throwing error propagation alongside `checked_mul` for safe allocation size computation.
- Introduces `check_cuda` and `abort_on_cuda` error macros and handling mechanisms, exposed via the `QUADTRIX_CUDA_CHECK` macro.
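
To make the shape of these helpers concrete, the following host-only sketch shows one way `dtype_size`, `Status`, and `checked_mul` could fit together. Only the identifier names come from the commit message. The signatures, the `sketch` namespace, the string names returned by `dtype_name`, and the `ok_status`/`error_status` helpers are hypothetical, and the CUDA-specific parts (`check_cuda`, `QUADTRIX_CUDA_CHECK`) are left out because they need the CUDA runtime.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <utility>

namespace sketch {  // hypothetical namespace, not the PR's

enum class DType { F32, F16, BF16, I32, U8 };

// Human-readable name for logging (the exact strings are an assumption).
inline const char* dtype_name(DType t) {
    switch (t) {
        case DType::F32:  return "f32";
        case DType::F16:  return "f16";
        case DType::BF16: return "bf16";
        case DType::I32:  return "i32";
        case DType::U8:   return "u8";
    }
    return "unknown";
}

// Size of one element in bytes.
inline std::size_t dtype_size(DType t) {
    switch (t) {
        case DType::F32:  return 4;
        case DType::F16:  return 2;
        case DType::BF16: return 2;
        case DType::I32:  return 4;
        case DType::U8:   return 1;
    }
    return 0;
}

// Non-throwing error carrier: callers test `ok` instead of catching.
struct Status {
    bool ok;
    std::string message;
};

inline Status ok_status() { return Status{true, ""}; }
inline Status error_status(std::string msg) {
    return Status{false, std::move(msg)};
}

// Multiplies a * b into `out`, reporting overflow instead of wrapping.
inline Status checked_mul(std::size_t a, std::size_t b, std::size_t& out) {
    if (a != 0 && b > std::numeric_limits<std::size_t>::max() / a) {
        return error_status("size overflow in checked_mul");
    }
    out = a * b;
    return ok_status();
}

}  // namespace sketch
```

A typical use would be `checked_mul(numel, dtype_size(dtype), bytes)` before an allocation, returning the `Status` to the caller if the product overflows.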
- Introduces the `GeluMode` enum to toggle between `Exact` and `Approximate` mathematical variants.
- Declares the `gelu_forward` and `gelu_backward` kernel entrypoints.
- Configures both signatures with optional stream execution and a default mode of `GeluMode::Approximate`.
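
The two modes can be compared with a scalar reference such as the one below. It assumes that `Approximate` refers to the widely used tanh approximation of GELU, which the commit message does not confirm. The function names `gelu_ref` and `gelu_grad_ref` are hypothetical, and the stream parameter of the real entrypoints is omitted because this sketch runs on the host.

```cpp
#include <cmath>

enum class GeluMode { Exact, Approximate };

// Forward: Exact uses the Gaussian CDF via erf; Approximate uses the
// tanh formulation (assumed to be what the PR means by "Approximate").
inline float gelu_ref(float x, GeluMode mode = GeluMode::Approximate) {
    if (mode == GeluMode::Exact) {
        return 0.5f * x * (1.0f + std::erf(x * 0.7071067812f));  // x / sqrt(2)
    }
    const float c = 0.7978845608f;  // sqrt(2 / pi)
    const float u = c * (x + 0.044715f * x * x * x);
    return 0.5f * x * (1.0f + std::tanh(u));
}

// Derivative d gelu / d x for the same two modes. A backward kernel
// would multiply this by the incoming gradient element-wise.
inline float gelu_grad_ref(float x, GeluMode mode = GeluMode::Approximate) {
    if (mode == GeluMode::Exact) {
        const float cdf = 0.5f * (1.0f + std::erf(x * 0.7071067812f));
        const float pdf = 0.3989422804f * std::exp(-0.5f * x * x);  // 1/sqrt(2*pi)
        return cdf + x * pdf;
    }
    const float c = 0.7978845608f;
    const float u = c * (x + 0.044715f * x * x * x);
    const float t = std::tanh(u);
    const float du = c * (1.0f + 0.134145f * x * x);  // 3 * 0.044715 = 0.134145
    return 0.5f * (1.0f + t) + 0.5f * x * (1.0f - t * t) * du;
}
```

The two variants agree closely for typical activations, so a test for the kernels would need to pick its tolerance according to which mode was requested.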
…ker builds

Updated CI workflow to restrict branches for push events and improved input descriptions for image selection and push options.
Added macOS binary build and release steps to CI workflow.
Removed dependency on build-macos-x64 for the release job.
@Eamon2009 Eamon2009 requested a review from codeaddict-119 June 3, 2026 05:29
@Eamon2009 Eamon2009 self-assigned this Jun 3, 2026
@Eamon2009 Eamon2009 added the cuda label Jun 3, 2026
@Eamon2009
Owner Author

/run-checks

@github-actions

github-actions Bot commented Jun 3, 2026

✅ All checks passed!

Co-Authored-By: codeenthusiasm23 <273188204+codeenthusiasm23@users.noreply.github.com>
Co-Authored-By: Eamon Sippy <eamon112009@gmail.com>
Removed s390x build configurations and added a step to write detailed release notes.
Introduces a central Python execution script to concurrently manage and
orchestrate the development environment for both the frontend and backend.
- Detects system OS to invoke correct `npm` and `python` (virtualenv) binary variants.
- Verifies existence of the local PyTorch `.pt` model checkpoint before starting.
- Configures environment variables dynamically for Uvicorn (FastAPI) and Vite.
- Handles cross-origin setups (CORS) linking ports interactively.
- Gracefully handles process termination (`Ctrl+C`) by forwarding termination signals.
- Automatically launches the frontend application in the system web browser.
@Eamon2009
Owner Author

/run-checks

@github-actions

github-actions Bot commented Jun 3, 2026

✅ All checks passed!

Bumps [actions/github-script](https://github.com/actions/github-script) from 7 to 9.
- [Release notes](https://github.com/actions/github-script/releases)
- [Commits](actions/github-script@v7...v9)

---
updated-dependencies:
- dependency-name: actions/github-script
  dependency-version: '9'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2 participants