When I"m trying the same SLM model, e.g. Qwen/Qwen3-0.6B, from llama-cli, the result is great; but when I use the same GGUF, the same prompt and temperature of 0, the response from iOS is bad, sometime the answers are long. For example, this is my llama-cli query:
llama-cli -m models-custom/Qwen3-0.6B-Q4_K_M.gguf \
  --temp 0 \
  --repeat_penalty 1.1 \
  -p "You are a helpful assistant with access to the following tools:

tools:
- name: get_current_weather
  description: Get the current weather in a given location.
  parameters:
    location: (string) The city and state, e.g., San Francisco, CA.
    unit: (string, optional) The unit of temperature, either \"celsius\" or \"fahrenheit\".

You can use these tools to answer user questions. When you need to use a tool, respond with a JSON object in the following format:

{ \"tool_name\": \"name of the tool\", \"tool_args\": { \"argument1\": \"value1\", \"argument2\": \"value2\" } }

Do not use any other format. If you can answer without a tool, respond directly to the user.

User: what is the weather in Seattle? Answer with unit of fahrenheit
Assistant: "
The result from llama-cli is great:
<think>
Okay, the user is asking for the weather in Seattle and wants the unit as Fahrenheit. I need to use the get_current_weather tool. The location parameter should be "Seattle" and the unit set to "fahrenheit". Let me make sure I format the JSON correctly with those arguments.
</think>
{
  "tool_name": "get_current_weather",
  "tool_args": {
    "location": "Seattle",
    "unit": "fahrenheit"
  }
}
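For comparison, this is roughly the sampler setup I would expect on the iOS side. It is only a minimal sketch against llama.cpp's C sampler API as imported into Swift (the 4-argument llama_sampler_init_penalties call below matches recent checkouts, but that signature has changed across versions, so check it against your build):

import llama  // llama.cpp's C API, as in the llama.swiftui example

// Sampler chain intended to mirror `--temp 0 --repeat_penalty 1.1`.
let chain = llama_sampler_chain_init(llama_sampler_chain_default_params())

// repeat_penalty 1.1 over a 64-token window, no frequency/presence penalty.
llama_sampler_chain_add(chain, llama_sampler_init_penalties(64, 1.1, 0.0, 0.0))

// temp 0 in llama-cli falls back to greedy (argmax) decoding, so add the
// greedy sampler explicitly rather than llama_sampler_init_temp(0).
llama_sampler_chain_add(chain, llama_sampler_init_greedy())

// per decoded token:
// let token = llama_sampler_sample(chain, ctx, -1)

If the iOS wrapper builds its chain differently (e.g. top-k/top-p plus llama_sampler_init_dist), or applies a chat template where llama-cli is doing raw completion, could that alone explain the divergence?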