Update utils.py #41

Open
wants to merge 2 commits into
base: main

jiyzhang

In the function lm_stream_generator, the token returned by mlx_lm.lm_stream_generate is a mlx_lm.GenerationResponse, which includes an mx.array field:

    @dataclass
    class GenerationResponse:
        """
        The output of :func:`stream_generate`.

        Args:
            text (str): The next segment of decoded text. This can be an empty string.
            token (int): The next token.
            logprobs (mx.array): A vector of log probabilities.
            prompt_tokens (int): The number of tokens in the prompt.
            prompt_tps (float): The prompt processing tokens-per-second.
            generation_tokens (int): The number of generated tokens.
            generation_tps (float): The tokens-per-second for generation.
            peak_memory (float): The peak memory used so far in GB.
            finish_reason (str): The reason the response is being sent: "length", "stop" or `None`
        """

        text: str
        token: int
        logprobs: mx.array
        prompt_tokens: int
        prompt_tps: float
        generation_tokens: int
        generation_tps: float
        peak_memory: float
        finish_reason: Optional[str] = None

Because pydantic doesn't support mx.array and an mx.array is not JSON serializable, dumping the chunk fails at the line below:

    yield f"data: {json.dumps(chunk.model_dump())}\n\n"
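For readers without mlx installed, the failure can be reproduced with the standard library alone. This is only a sketch under assumptions: `FakeArray` and `FakeResponse` are hypothetical stand-ins for mx.array and a trimmed GenerationResponse, and `try_dump` is a helper invented for illustration. None of them are part of mlx_lm or fastmlx.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


class FakeArray:
    """Hypothetical stand-in for mx.array: an object json.dumps cannot encode."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


@dataclass
class FakeResponse:
    """Trimmed, hypothetical mirror of the GenerationResponse fields quoted above."""

    text: str
    token: int
    logprobs: FakeArray
    finish_reason: Optional[str] = None


def try_dump(payload: dict) -> str:
    """Return the JSON string, or the TypeError message if serialization fails."""
    try:
        return json.dumps(payload)
    except TypeError as exc:
        return f"TypeError: {exc}"


response = FakeResponse(text="Hi", token=42, logprobs=FakeArray([-0.1, -2.3]))
payload = asdict(response)  # plain dict, but logprobs is still a FakeArray
result = try_dump(payload)
print(result)
```

The dict itself is built without complaint; the error only appears once `json.dumps` reaches the array-typed value, which matches the traceback reported in this thread.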

@Blaizzy
Contributor

Blaizzy commented Dec 20, 2024

Thanks for the fix @jiyzhang!

once the tests clear I will merge

@jiyzhang
Author

> Thanks for the fix @jiyzhang!
>
> once the tests clear I will merge

My update changed the interface of the function lm_stream_generator, which may be why the tests fail. I'll push a new version that keeps the output structure.

@Blaizzy
Contributor

Blaizzy commented Dec 20, 2024

You need to run

pre-commit run --all

The token returned by mlx_lm.lm_stream_generate includes an mx.array field, which makes the Pydantic dump fail with the error "TypeError: Object of type array is not JSON serializable"

@jiyzhang
Author

I just added a new commit that changes the mx.array to [], so the server no longer fails with this error:

    File "/Users/macsmith/miniconda3/envs/fastmlx310/lib/python3.10/site-packages/fastmlx/utils.py", line 406, in lm_stream_generator
        yield f"data: {json.dumps(chunk.model_dump())}\n\n"
    TypeError: Object of type array is not JSON serializable

This commit doesn't change the response format or values.
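The idea of swapping the array for [] before dumping can be sketched as below. This is not the code from the commit: `FakeArray` stands in for mx.array, and `make_json_safe` and `sse_event` are hypothetical helper names chosen for illustration.

```python
import json


class FakeArray:
    """Hypothetical stand-in for mx.array (not JSON serializable)."""

    def __init__(self, values):
        self._values = list(values)


def make_json_safe(payload: dict) -> dict:
    """Return a shallow copy where array-typed values are replaced by []."""
    return {
        key: [] if isinstance(value, FakeArray) else value
        for key, value in payload.items()
    }


def sse_event(payload: dict) -> str:
    """Format one server-sent-event line, as in the failing yield statement."""
    return f"data: {json.dumps(make_json_safe(payload))}\n\n"


chunk = {"text": "Hi", "token": 42, "logprobs": FakeArray([-0.1, -2.3])}
event = sse_event(chunk)
print(event, end="")
```

Replacing the value with [] keeps the key in the payload, so the shape of the response stays the same. If the log probabilities were needed by clients, converting the array to a plain list instead of discarding it would be the alternative to consider.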
