Feature Request: DFLASH support (from 40 tok/sec to 400 tok/sec) #21978

@KotPrezesOfficial

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

DFLASH allows tokens to be generated much faster, without any loss in quality.
More here: https://github.com/z-lab/dflash

Motivation

Because it allows faster token generation.

Possible Implementation

No response

Labels

enhancement (New feature or request)
