Hi all!
First of all, thank you for the amazing work on making MLflow accessible from Node.js! It's a great tool, and I'm excited to see how it can be applied in more projects.
I'm currently working on a project that leverages OpenAI extensively: I use it to summarize my portfolio, turn job descriptions (JDs) into contextual information, and then tailor several materials that I use during interviews.
However, I'm facing a challenge: since my prompts are also being generated with the help of AI, it's become really difficult to track the versions of these prompts. Additionally, I've noticed that the AI occasionally "forgets" the context or rules I set for it, leading to inconsistent results.
For example:
- A prompt might apply all of my rules correctly, but after one or two refinements it drops one or two rules that were still being applied correctly three or four iterations earlier.
This makes it really important to have a UI or dashboard—like MLflow's interface—to access prompt metrics and properly evaluate whether my prompts are working as intended.
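To make the goal concrete, here is a rough sketch of the kind of tracking I have in mind. It talks to the MLflow tracking server's REST API directly (I'm not sure what mlflow.js exposes for this), and the experiment id, param keys, and the `rules_followed_ratio` metric are just placeholders for illustration:

```typescript
// Sketch: log a prompt version and a hand-computed quality metric to MLflow
// via its REST API. Assumes a tracking server at MLFLOW_URI; the experiment id,
// param keys, and metric name are placeholders, not an official schema.
const MLFLOW_URI = process.env.MLFLOW_URI ?? "http://localhost:5000";

async function mlflowPost(path: string, body: unknown): Promise<any> {
  const res = await fetch(`${MLFLOW_URI}/api/2.0/mlflow/${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`MLflow ${path} failed: ${res.status}`);
  return res.json();
}

async function logPromptRun(promptVersion: string, promptText: string, rulesFollowedRatio: number) {
  // Create a run in the default experiment ("0").
  const { run } = await mlflowPost("runs/create", {
    experiment_id: "0",
    start_time: Date.now(),
    run_name: `prompt-${promptVersion}`,
  });
  const runId = run.info.run_id;

  // Record which prompt version produced this output.
  await mlflowPost("runs/log-parameter", {
    run_id: runId,
    key: "prompt_version",
    value: promptVersion,
  });
  await mlflowPost("runs/log-parameter", {
    run_id: runId,
    key: "prompt_preview",
    value: promptText.slice(0, 250), // params have a length limit; full text would go in an artifact
  });

  // Record how many of my rules the output actually followed (0..1).
  await mlflowPost("runs/log-metric", {
    run_id: runId,
    key: "rules_followed_ratio",
    value: rulesFollowedRatio,
    timestamp: Date.now(),
    step: 0,
  });

  await mlflowPost("runs/update", { run_id: runId, status: "FINISHED", end_time: Date.now() });
}

// e.g. await logPromptRun("v7", jdSummaryPrompt, 0.8);
```

That covers the prompt-versioning side in the MLflow UI, but the scoring itself is still something I have to produce by hand.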
That said, I noticed that MLflow's Python implementation includes an [LLM evaluation feature](https://mlflow.org/docs/latest/llms/llm-evaluate/index.html#quickstart), but I'm not able to find documentation or examples for using this feature in mlflow.js.
Could anyone guide me on how to use the LLM evaluation features with mlflow.js? Is this currently supported or planned for future development? Any help would be greatly appreciated!
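In the meantime, the closest workaround I can think of is scoring outputs myself with an LLM-as-a-judge call and logging the score as a metric, as in the sketch above. A minimal sketch, assuming the official `openai` Node SDK; the grading rubric and 0–1 scale are my own placeholders, not MLflow's built-in evaluation metrics:

```typescript
// Sketch: a DIY "LLM-as-a-judge" score in place of Python's mlflow.evaluate(),
// using the official openai Node SDK (reads OPENAI_API_KEY from the environment).
import OpenAI from "openai";

const openai = new OpenAI();

async function judgeOutput(rules: string[], output: string): Promise<number> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content:
          "You are a strict grader. Given a list of rules and a candidate output, " +
          "reply with only a number between 0 and 1: the fraction of rules followed.",
      },
      { role: "user", content: `Rules:\n${rules.join("\n")}\n\nOutput:\n${output}` },
    ],
  });
  const score = parseFloat(completion.choices[0].message.content ?? "0");
  return Number.isFinite(score) ? score : 0;
}

// const ratio = await judgeOutput(myRules, customizedAnswer);
// await logPromptRun("v7", jdSummaryPrompt, ratio); // from the sketch above
```

If mlflow.js already has (or plans) a first-class equivalent of the Python LLM evaluation API, I'd much rather use that than this workaround.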
Thanks in advance!