diff --git a/.gitignore b/.gitignore
index 3501d70e6..072bd184a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -125,3 +125,12 @@ uv.lock
 # venvstacks build artifact: regenerated from venvstacks.toml + pyproject.toml
 # by packaging/build.py:_generate_venvstacks_toml() on every Swift app build.
 packaging/_venvstacks_resolved.toml
+.claude/
+
+# Video generation artifacts (job outputs live under {base_path}, never in-repo;
+# these guard against measurement/test runs writing into the worktree)
+*.mp4
+*.mp4.metadata.json
+video-jobs/
+video-artifacts/
+p0_video/
diff --git a/docs/upstream-sync.md b/docs/upstream-sync.md
index 810ee4226..33f2763e8 100644
--- a/docs/upstream-sync.md
+++ b/docs/upstream-sync.md
@@ -427,3 +427,29 @@ relevance to Qwen-Gemma / oQ).
 breaks some HTTP clients / Copilot CLI
 
 > On the next review of upstream open PRs, backfill the conclusion (adopt / skip) into the corresponding subsection.
+
+---
+
+## 2026-06-11 divergence marker: video generation engine (fmlx-owned, never upstreamed)
+
+feat/video-engine introduces a text-to-video engine (Wan2.2 T2V A14B via mlx-gen; design in
+docs/video-generation-engine-spec.md). This is a deliberate divergence of fmlx from upstream
+and is not PR'd upstream. Patch surface on files shared with upstream (for reference when a cherry-pick conflicts):
+
+- model_discovery.py: ModelType/EngineType Literal + model_index.json
+  detection branch + _register_model video arm and skip filter
+- engine_pool.py: Literal + mapping + video rejection arm at get_engine entry +
+  _load_engine defensive arm
+- server.py: video router mount / pre-pool 400 / chat-capable filter for the default model /
+  ModelInfo.model_type / VideoJobManager construction and shutdown in lifespan
+- process_memory_enforcer.py: video memory lease (acquire/set pid/release +
+  ceiling deduction + dynamic-ceiling add-back)
+- settings.py: VideoSettings section + huggingface.disable_xet
+- admin/routes.py: valid_types/type_to_engine + relaxed list and delete gates +
+  global-settings video fields
+- cli.py: HF_HUB_DISABLE_XET injection
+- exceptions.py: ModelTypeNotLoadableError
+
+Brand-new files (no conflict surface): omlx/video/*, omlx/api/video_models.py,
+omlx/api/video_routes.py, tests/test_video_*.py,
+scripts/video_p0_measure.py.
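diff --git a/docs/video-generation-engine-spec.md b/docs/video-generation-engine-spec.md
new file mode 100644
index 000000000..2c37cfc27
--- /dev/null
+++ b/docs/video-generation-engine-spec.md
@@ -0,0 +1,627 @@
+# fmlx video generation engine spec (Wan2.2 T2V, mlx-gen runtime)
+
+Status: design draft v2 (2026-06-10), not implemented, awaiting sign-off.
+v2 = v1 revised after a six-perspective adversarial review: all 22 blocker/major findings were confirmed and absorbed
+(rejection-arm placement, OpenAI SDK multipart compatibility, two-process Metal wired governance, lease double
+counting, venv pollution, arithmetic errors in the A/B protocol, etc.). The review record is in the session ticket.
+Positioning: a Flyto MLX (fmlx) owned feature, never upstreamed (soft-fork divergence, see §10).
+All code facts in this document were verified line by line by sub-agents (file:line verifiable); mlx-gen facts were verified
+against its source and pyproject (2026-06-10, v0.18.14).
+
+## 0. Background and positioning
+
+fmlx is currently an LLM/VLM/audio inference engine. Strategic shift: single-machine unified memory on Apple Silicon
+(128GB class) is a structural advantage for local multimedia generation (text-to-video / text-to-image) -- large weights + large activations
+all live in UMA, with no multi-GPU sharding needed. fmlx aims to make "single-machine multimedia" its core difference from upstream oMLX.
+
+First landing point: Wan2.2 T2V A14B (MLX 8-bit quantized, diffusers layout, 42.4GB) is fully downloaded
+and verified file by file, at `~/.fmlx/models/AbstractFramework/wan2.2-t2v-a14b-diffusers-8bit` on m5max.
+These weights were made for the pure-MLX runtime mlx-gen (safetensors dtype = U32+scales+biases,
+i.e. the mx.quantize format); the example command in the mlx-gen docs references this repo verbatim.
+
+## 1. Goals and non-goals
+
+Goals (MVP, P1):
+
+1. fmlx discovers diffusers-layout video models (model_index.json), types them as `video`,
+   shows them correctly in /v1/models and the admin list, allows deletion, keeps them out of the chat model list,
+   and never lets one become the implicit default model.
+2. A new OpenAI-shaped async job API: POST /v1/videos to submit, GET to poll, list to enumerate,
+   content to download. The official openai SDK's client.videos.* works directly (including its
+   multipart submission form).
+3. Generation runs in a subprocess worker in a separate venv, isolated from the LLM server process; the worker
+   itself is pinned inside its lease by the Metal wired limit (preventive, not reactive).
+4. A video job holds a memory lease, propagated through the existing single choke point in ProcessMemoryEnforcer.
+   It truly co-resides with small and mid-size LLMs (weights + working set fit alongside the lease under the ceiling);
+   with very large LLMs (e.g. glm4.5 at 85GB) it is mutually exclusive by design -- the job queues for memory rather than squeezing in,
+   so the m5max kernel panic is not repeated.
+5. The full path may merge only after real-machine A/B validation on m5max (project iron rule: unit tests passing != real machine passing).
+
+Non-goals (not in the MVP, some move to P2):
+
+- SSE progress streaming (polling suffices), image-to-video (I2V) input upload, text-to-image (FLUX family), TI2V-5B,
+  a dedicated admin video UI page, concurrent generation (Semaphore(1), one at a time), a distributed queue,
+  ModelScope downloads of video models (flat-symlink trap, see §4.1), training/LoRA,
+  actively evicting loaded LLMs for video jobs (the MVP only queues passively; eviction policy is P2).
+
+## 2. Key facts (design basis)
+
+The two subsections below are the verified findings on the external runtime and on this repo's code; all of them drive the architecture trade-offs in §3.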
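+
+### 2.1 mlx-gen runtime
+
+| Dimension | Fact |
+|---|---|
+| Identity | A fork of filipstrand/mflux; the Python package name is `mflux`, and `import mlxgen` is only a sys.modules alias |
+| Video class | A single class, `mflux.models.wan.variants.Wan2_2_TI2V`, covers all Wan variants; `ModelConfig.wan2_2_t2v_a14b()` + `model_path=<local dir>` loads the directory we already downloaded (path-resolution rule 1, "exists_locally", short-circuits everything else) |
+| Generation API | `generate_video(seed, prompt, steps, height, width, num_frames, fps, ..., progress_callback)` is blocking and synchronous, batch=1; ProgressEvent carries phase/step/total_steps |
+| Cancellation | No first-class cancel API; raising inside the callback interrupts but leaves the instance unusable -- robust cancellation = kill the subprocess |
+| Dependencies | Not pure MLX: torch is a hard dependency (the UMT5 text encoder runs on torch/CPU), plus transformers>=5, huggingface-hub>=1.1.6,<2, opencv, matplotlib, av (PyAV ships its own ffmpeg wheel); twine is mixed into the runtime deps (a supply-chain hygiene signal, counted in the §9 risk rating) |
+| Output | GeneratedVideo (PIL frames + metadata); .save() writes the MP4 + health check + metadata sidecar |
+| License / version | MIT; v0.18.14 (2026-06-08); 15 releases in two weeks, bus factor 1 (lpalbou) |
+| Measured memory (official, M5 Max 128GB) | T2V A14B q8: physical peak 20.7 GiB, MLX peak 15.5 GiB, 154.8s @ 384x224, 33 frames, 12 steps; production resolution (480x240, 101 frames, 25 steps) takes about 30 minutes. Note: this is the only public data point, and it is a small profile |
+| Threading | The docs state explicitly that a model instance is stateful and must be accessed serially |
+
+Dependency conclusion: co-installing mlx-gen's transformers>=5 / hf-hub<2 / torch in the main fmlx venv carries high conflict
+risk and is unnecessary -- this is the first driver of the subprocess + separate-venv design.
+
+### 2.2 Facts on the fmlx code side (all verified at file:line)
+
+- Discovery only recognizes a root config.json (`_is_model_dir`, model_discovery.py:697-699).
+  Under the two-level owner/repo layout the Wan2.2 directory is entirely invisible; but under the FLAT layout (exactly the
+  symlink form the ModelScope downloader produces, ms_downloader.py:665) it is treated as an org
+  folder and descended into; transformer/ transformer_2/ vae/ text_encoder/ each carry a config.json,
+  so they register as 4 phantom "llm" models and may even become the default model (server.py:1279-1290).
+  This is an existing hazard; the discovery rework must land first.
+- pool.get_engine runs the memory admission loop before calling _load_engine (engine_pool.py:359-396):
+  projected = current + entry.estimated_size, and if that does not fit it LRU-evicts loaded
+  models. Any "reject inside _load_engine" scheme is too late -- rejection must come before the admission
+  loop (§3, §4.1).
+- Implicit default model = available_models[0] (server.py:1279-1292), regardless of type; model
+  fallback (server.py:698-711) retries the default model. Video entries must be excluded from both.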
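+- The global MLX executor is a single-thread pool with max_workers=1 (engine_core.py:106-120);
+  GPU work for all non-batched engines is serialized on it (Metal command-buffer race, #85).
+  Running minutes-long diffusion in-process would head-of-line block audio/embedding/unload. Second driver.
+- mlx-gen has no cancel API, and the audio template has no timeout anywhere along its path (zero asyncio.wait_for in audio_routes.py).
+  Third driver: subprocess kill is the only reliable means of cancellation/timeout.
+- The only live memory protection is the phys-based chain: ProcessMemoryEnforcer (1s tick) ->
+  `_get_hard_limit_bytes` (process_memory_enforcer.py:495-517) is the single choke point;
+  pool admission, soft/hard watermarks and the prefill gate cap all derive from it. The estimate/monitor path is
+  inert in production (memory_monitor is always None) and must not be relied on.
+- In the safe/balanced/aggressive tiers (production default: balanced) the dynamic ceiling is
+  system-aware (own_phys + free + inactive + active*ratio,
+  process_memory_enforcer.py:483-493): the worker's real usage automatically lowers the parent's
+  ceiling via the drop in free -- stacked on an explicit lease that is double counting and must be corrected (§4.4).
+- `get_phys_footprint(pid)` accepts any pid (utils/proc_memory.py:94-118), so the parent
+  can measure the worker's footprint; it returns 0 on failure, which must be handled as an error. Line :63 of the same file already declares
+  `ri_lifetime_max_phys_footprint` (the process-lifetime peak ledger) but nothing in the repo reads it --
+  the right instrument for the P0 measurement (§7).
+- The Metal wired limit is settable per process (mx.set_wired_limit; the enforcer already does this for
+  its own process, process_memory_enforcer.py:407-446). A worker subprocess inherits the machine-level
+  cap (~107.5GB) by default; without setting its own limit it has no Metal-level constraint at all -- this is
+  the root of the two-process wired-sum panic, and the basis for the preventive scheme in §4.4.
+- Mature background-job patterns live on the admin side: HFDownloader (task dict + status enum +
+  asyncio.create_task + cooperative cancel) and OQManager (Semaphore(1) + is_quantizing,
+  which inference endpoints turn into a 503). The `background` field of /v1/responses is dead (zero consumers),
+  and ResponseStore stores only terminal state with no lifecycle; neither can be reused directly.
+- Pattern for plugging in a non-LLM engine: the audio trio (BaseNonStreamingEngine + api/audio_routes.py
+  + direct pool access). Note that the audio router's conditional mount (server.py:439-448) happens at module import
+  time, when settings are not yet initialized (init_server injects them later) -- it works only because its gate
+  is "mlx_audio is importable". A settings-driven gate cannot live there (§4.3).
+
+### 2.3 Co-residency memory risk (the m5max lesson)
+
+On a 128GB machine the Metal wired cap is about 107.5GB; crossing it = a whole-machine kernel panic (this has happened).
+The danger is not steady state but transient spikes; neither 1s polling nor chunk-boundary readings can see sub-poll transients.
+M5 Max memory bandwidth is on the order of ~0.5TB/s, so a single mx.eval can materialize tens of GB in a window
+far shorter than 2s -- no reactive "poll + kill the process" measure is a panic backstop; it can only serve as
+secondary cleanup. There are only two kinds of preventive measure: the Metal wired limit (per process; over the limit it degrades to
+non-resident pages or allocation failure, without crossing the machine cap) and a headroom constant (the methodology behind the
+12GB prefill-side margin, settings.py:404-412). The video side needs both (§4.4).
+
+## 3. Overall architecture
+
+Decision: subprocess worker + separate venv + job manager; video models register in the pool roster
+but are rejected by type outside the load path (the rejection point is before the admission loop).
+
+```
+fmlx server process (main venv, no mlx-gen dependency)
+  |- model_discovery: recognizes video models (model_index.json); list/delete/settings
+  |- server.get_engine: after alias resolution, entry.model_type=="video" -> 400 + guidance
+  |- pool.get_engine: at entry (before the admission loop) video -> ModelTypeNotLoadableError
+  |- /v1/videos router (api/video_routes.py, mounted unconditionally, gated by settings inside the handlers)
+  |- VideoJobManager (omlx/video/manager.py, constructed in lifespan, enforcer injected)
+  |    |- queue + Semaphore(1) + job persistence (one JSON per job)
+  |    |- memory lease: enforcer.acquire_video_lease(bytes) / set_video_worker_pid
+  |    |    / release_video_lease
+  |    |- spawn: /bin/python -I /video/worker.py --spec job.json
+  |    |- monitoring: stdout JSONL (progress + phase heartbeats) + footprint watchdog + stall timeout
+  |    |- cancel/timeout: SIGTERM -> 5s -> SIGKILL
+  |- ProcessMemoryEnforcer: ceiling -= lease; dynamic ceiling adds back min(worker, lease)
+  |
+  +-- worker subprocess (video venv: locked dependency set, does not import omlx)
+        |- on entry, mx.set_wired_limit(value within the lease) -- Metal-level self-restraint (preventive)
+        |- mflux Wan2_2_TI2V(model_config=..., model_path=local dir supplied by the registry)
+        |- generate_video(progress_callback -> stdout JSONL)
+        |- video.save(//output.mp4) -> exit 0
+```
+
+Why not in-process (ordered by strength of the veto):
+
+1. Dependency conflicts: installing torch/transformers>=5/hf-hub<2 into the main venv is an uncontrollable risk.
+2. Cancellation and timeout: mlx-gen has no cancel API; a hung in-process denoise would hold the global
+   MLX executor forever with no way to kill it; with a subprocess, killing the process reclaims everything (including Metal memory).
+3. Executor head-of-line blocking: minutes-long jobs serialized on the max_workers=1 global executor
+   block all audio/embedding work.
+4. Crash isolation: NaN/Metal errors in the video pipeline do not take down the LLM service.
+5. Deterministic memory reclamation: process exit returns everything to zero, with no accumulated fragmentation or leaks.
+
+Costs and countermeasures:
+
+- Worker memory is invisible to the parent's phys_footprint but visible to the dynamic ceiling (via the drop in
+  free) -> lease + add-back correction, counted once rather than twice (§4.4).
+- Every job cold-loads the weights (42GB read from disk) -> accepted for the MVP; a resident worker + idle TTL is a P2 consideration.
+- Two-process wired-sum -> worker self-imposed wired limit (prevention) + watchdog (cleanup), §4.4.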
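The cancel/timeout path in the diagram above (SIGTERM -> 5s -> SIGKILL) could look roughly like the sketch below. It is a minimal illustration, not the manager's actual code: the helper name `terminate_with_grace` and the use of a synchronous `subprocess.Popen` handle (rather than an asyncio subprocess) are assumptions made for the example.

```python
import signal
import subprocess


def terminate_with_grace(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask a worker to exit with SIGTERM, then force SIGKILL after a grace period.

    Hypothetical helper: returns the worker's exit code. On POSIX a negative
    code -N means the process was terminated by signal N.
    """
    if proc.poll() is not None:
        # Worker already exited on its own; nothing to signal.
        return proc.returncode
    proc.send_signal(signal.SIGTERM)
    try:
        # Cooperative path: the worker exits within the grace window.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Escalation path: the worker ignored SIGTERM or is hung.
        proc.kill()
        return proc.wait()
```

Because the weights live only in the worker process, either exit path returns its memory to the system; the grace period only decides whether the worker gets a chance to write a failure manifest first.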
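+
+Relationship between video models and the engine pool: discovery registers an entry (model_type=engine_type=
+"video") so that listing/settings/deletion/type guards all take effect. But video entries are never loadable:
+
+- pool.get_engine, after the entry lookup and before the already-loaded fast path and the admission loop,
+  raises a new typed exception `ModelTypeNotLoadableError` for video (subclass of EnginePoolError,
+  message carries "use POST /v1/videos"). This guarantees zero evictions / zero settle barrier / zero 507
+  side effects -- if the rejection sat in _load_engine, a single chat request mistakenly pointed at a video model
+  would first run admission at 42GB, evict the resident production LLM, and only then be rejected (review blocker, verified).
+- server.get_engine, after alias resolution and before entering the pool, checks entry.model_type ==
+  "video" -> HTTPException 400 + /v1/videos guidance (chat/embeddings/rerank all
+  flow through this function, so one check covers them all). In the exception mapping chain, a
+  ModelTypeNotLoadableError->400 arm is added before EnginePoolError->500. The hint that v1 planned to add to
+  _suggest_endpoint_for_engine is dead code (that function only does isinstance on successfully returned engine instances, and video never
+  has one), so it is withdrawn.
+- /v1/models/{id}/load and the admin load endpoint each add a pre-pool type check -> 400
+  (the public load endpoint's blanket except Exception->500 would swallow the typed exception, so the check must
+  happen before entering the pool).
+- A defensive raise (same typed exception) stays in the _load_engine dispatch chain to protect other
+  pool.get_engine callers.
+- Default-model hygiene: implicit default selection (server.py:1279-1292) filters to model_type in
+  {"llm","vlm"}, and with no candidate default=None (falling through to the existing clean 400); model_fallback
+  (server.py:698-711) validates the default model's type before retrying; the admin default-model setter
+  (admin/routes.py:2171-2173) rejects video entries.
+
+The weight lifecycle belongs entirely to the worker subprocess; the pool's 42GB admission and the unload settle barrier
+never fire for video entries because of the up-front rejection.
+
+## 4. Module design
+
+Ordered by change surface: from the discovery layer to the API layer, then memory and configuration.
+
+### 4.1 Model discovery and the type system
+
+Change points (all small and localized):
+
+- `_is_model_dir` (model_discovery.py:697-699): either `config.json exists` or
+  `model_index.json exists` counts as a model root. The latter must be checked before the org-folder descent decision,
+  which also fixes the phantom-component hazard from §2.2 (the flat layout no longer descends into transformer/ and similar subdirectories).
+- `detect_model_type` (model_discovery.py:385-549): before the "config.json missing ->
+  llm" early return (404-406), add a model_index.json branch: read `_class_name`;
+  if it is in the allowlist (MVP = {"WanPipeline"}) -> "video"; if not -> a skip sentinel,
+  on which `_register_model` skips and logs a warning (no unrunnable pipeline is registered and no
+  phantoms are produced). Contract note: all config.json paths keep the existing str return contract unchanged; the sentinel
+  appears only in the new "model_index.json exists and _class_name is not in the allowlist" branch --
+  zero breakage for existing tests.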
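+- Literal and mapping changed together in five places + a consistency test: model_discovery.py:26-27,
+  engine_pool.py:56-57, `_MODEL_TYPE_TO_ENGINE` (engine_pool.py:203-211),
+  the `_register_model` if/elif (model_discovery.py:737-751), admin valid_types +
+  type_to_engine (admin/routes.py:1860, 1870-1878). The three duplicated mappings are existing
+  debt; add an assertion test to prevent a silent downgrade to "batched".
+- Load-path rejection (placement is the crux, see §3): typed rejection at pool.get_engine entry +
+  server.get_engine pre-pool 400 + pre-pool 400 on both load endpoints +
+  _load_engine defensive arm. The new exception class joins the engine_pool exception family.
+- Default-model and fallback hygiene (see the end of §3): implicit default filter / fallback validation /
+  admin setter rejection, paired with a discovery-fixture unit test where "a video model sorts first lexicographically".
+  This also fixes the same existing hazard of an embedding/audio model becoming the default.
+- `estimate_model_size`: the recursive **/*.safetensors branch (679-681) already covers the diffusers
+  layout (42GB), unchanged; for video the value is display-only -- precisely: because of the up-front
+  rejection, video entries never enter the admission loop that consumes estimated_size.
+- /v1/models hygiene: ModelInfo (api/openai_models.py:409-415) gains a `model_type`
+  field, populated at server.py:1717-1722 (additive for OpenAI clients);
+  this also activates the existing but inert llm/vlm filter at cli.py:349. The admin chat picker
+  (dashboard.js:2081) already excludes unknown types naturally.
+- The config.json gate on admin DELETE / local listing (admin/routes.py:4538/4547/4495/4511)
+  is relaxed to config.json|model_index.json; otherwise the 42GB model is invisible
+  and undeletable in the UI. They share one is_model_root() helper instead of diverging across three places.
+
+### 4.2 VideoJobManager and the worker protocol
+
+Construction and wiring: VideoJobManager is constructed in the lifespan startup sequence, right after the enforcer block
+(after server.py:367), with `enforcer: ProcessMemoryEnforcer | None` injected via the constructor
+(mirroring the precedent at server.py:365-366 that injects the enforcer into the pool); the instance is stored at
+`_server_state.video_job_manager`, and routes fetch it through a lazy accessor `_get_video_job_manager()`
+(the audio_routes.py:68-80 pattern, patchable in unit tests). It must not be constructed in init_server
+(init_server runs before lifespan, when the enforcer does not exist yet); "modeled on OQManager" refers only to the
+job/queue/persistence shape, not the construction site.
+
+Job model:
+
+- id (uuid4, prefix "video_"); the external status is strictly four-valued `queued|in_progress|
+  completed|failed` (identical to the openai SDK's Video.status Literal; cancellation is not a
+  wire status, internal logs/metrics suffice, and to_dict() never emits cancelled).
+- progress 0-100, a phase string, created_at / started_at / completed_at,
+  `expires_at` (nullable; set to the purge time when the retention policy purges the artifact, while the record itself is kept
+  and status is unchanged), an echo of the request parameters, the artifact path, error.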
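+- error: null or a structured `{code, message}` (matching the openai SDK's Video.error shape).
+  Stable code set: `worker_crashed` (non-zero exit), `worker_stalled` (stall timeout),
+  `job_timeout` (per-run timeout), `memory_lease_exceeded` (watchdog footprint over the lease),
+  `monitor_failed` (3 consecutive footprint reads of 0), `server_restarted` (startup replay),
+  `output_invalid` (exit 0 but the mp4 health check failed). The worker's failure manifest
+  uses the same {code, message, detail?} schema and the manager passes it through.
+
+Queue and clocks:
+
+- FIFO + asyncio.Semaphore(1); queue depth cap (settings, default 4), submissions over the cap
+  get an immediate 503. Only one worker subprocess at a time.
+- Memory admission is evaluated only at dispatch (before spawn) (criterion in §4.4): if unmet, the job stays at
+  the head of the queue and is rechecked every ~5s on the enforcer's 1s tick cadence; an already accepted job is never 503'd.
+  A saturated LLM service can make a video job wait a long time -- an accepted trade-off (§9); users can DELETE
+  to cancel a queued job.
+- The job_timeout_seconds (default 7200) clock starts at worker spawn (per run);
+  queue wait is not counted. Stall timeout is covered below.
+
+Persistence and artifacts:
+
+- One JSON per job, written atomically (tmp+replace, modeled on responses_utils.py:447-454),
+  in {base_path}/video-jobs/; artifacts in {base_path}/video-artifacts/{job_id}/.
+  Startup replay: jobs in in_progress/queued are marked failed (code=server_restarted).
+- Retention policy: dual caps on count + total bytes, LRU purge of artifacts; a purge deletes only the blob and sets expires_at,
+  the job record is kept (history stays visible to list and GET).
+
+Worker (omlx/video/worker.py, imports only mflux + mlx + the standard library, not omlx):
+
+- Spawn form: `/bin/python -I /video/worker.py --spec
+  `. `-I` (isolated) isolates sys.path/PYTHONPATH/user site,
+  preventing the worker from accidentally importing the main repo's omlx; the env is built by the manager from an allowlist (PATH, HOME,
+  TMPDIR + deliberately chosen HF variables), not inherited wholesale.
+- The model dir in the spec may only come from the registry's entry.model_path (a discovery scan result,
+  under the server's own root directory); the request.model string must not take part in path construction on any branch,
+  and any resolve failure is a 404.
+- Entry order: first `mx.set_wired_limit(lease_bytes - wired_margin)` (the lease arrives via
+  the spec; mlx is already an mflux dependency, so this does not break the "no omlx import" rule), then load
+  the model. This is the load-bearing wall of wired-sum governance (§4.4).
+- Progress protocol: one JSON per stdout line. Two kinds of line: phase-transition heartbeats
+  `{"phase": "loading"|"text_encoding"|"denoise"|"vae_decode"|"saving"}`
+  (so silent long phases -- 42GB weight load / torch text encoding / VAE decode -- also give a liveness signal)
+  and step lines `{"phase": "denoise", "step": n, "total_steps": m}` (wired to
+  ProgressCallback).
+- Finish: video.save(output path, validate_health=True) + metadata sidecar;
+  on exception, write a failure manifest JSON and exit non-zero.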
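The two JSONL line kinds in the progress protocol above can be sketched as a small encode/parse pair. This is only an illustration under stated assumptions: the names `WorkerEvent`, `format_event`, `parse_progress_line` and `denoise_fraction` are hypothetical, and how the manager maps events onto the 0-100 `progress` field is not specified by the spec.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Phase names as listed in the spec's heartbeat line.
PHASES = ("loading", "text_encoding", "denoise", "vae_decode", "saving")


@dataclass
class WorkerEvent:
    phase: str
    step: Optional[int] = None
    total_steps: Optional[int] = None


def format_event(event: WorkerEvent) -> str:
    """Worker side: serialize one event as a single JSON line (no newline)."""
    payload = {"phase": event.phase}
    if event.step is not None and event.total_steps is not None:
        payload["step"] = event.step
        payload["total_steps"] = event.total_steps
    return json.dumps(payload)


def parse_progress_line(line: str) -> Optional[WorkerEvent]:
    """Manager side: parse one stdout line; None means 'not a protocol line'."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict) or obj.get("phase") not in PHASES:
        return None
    step, total = obj.get("step"), obj.get("total_steps")
    if step is None and total is None:
        return WorkerEvent(obj["phase"])  # phase-transition heartbeat
    if isinstance(step, int) and isinstance(total, int) and total > 0 and 0 <= step <= total:
        return WorkerEvent(obj["phase"], step, total)  # step line
    return None


def denoise_fraction(event: WorkerEvent) -> Optional[float]:
    """Fraction of denoise steps done, or None for a bare heartbeat."""
    if event.step is None or event.total_steps is None:
        return None
    return event.step / event.total_steps
```

Treating unparseable lines as "not a protocol line" rather than an error keeps stray library output on stdout from failing a job; whether such lines should still reset the stall timer is a design choice the sketch leaves open.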
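+
+Monitoring and termination (manager side, unified in the 2s watchdog tick):
+
+- Footprint: get_phys_footprint(worker_pid); 3 consecutive reads of 0 -> kill,
+  code=monitor_failed; footprint > lease -> SIGKILL, code=memory_lease_exceeded.
+  Note the watchdog's role is secondary cleanup / leak detection, not a panic backstop (§2.3, §4.4).
+- Stall: track last_jsonl_line_at; in_progress and silent for longer than
+  progress_stall_timeout_seconds (settings, default 600) -> SIGTERM -> 5s ->
+  SIGKILL, code=worker_stalled. The phase heartbeats keep this threshold from either false-killing or
+  becoming ineffective at production resolution (a single step can take ~70s+).
+- Per-run timeout job_timeout_seconds follows the same termination path, code=job_timeout.
+- DELETE cancel: SIGTERM -> 5s -> SIGKILL, release the lease, delete the record and artifact.
+
+The real mitigation order for mlx-gen evolution risk (v1's "CLI fallback" was downgraded in review): first line = exact pins
+on a locked dependency set (§4.5 lockfile) -- once dependencies are frozen neither API nor CLI can break at runtime;
+breakage can only enter through an explicit upgrade PR, which is a reviewable code change; second line = vendoring the wan
+subtree (MIT permits it; the true size is about 130 files covering models/wan + models/common + utils +
+callbacks, and the torch/transformers dependencies do not disappear with vendoring -- honest cost in §9);
+switching to the CLI form is only a third line and would break the JSONL progress protocol, so the design does not depend on it.
+
+### 4.3 /v1/videos API (OpenAI-shaped)
+
+Router file api/video_routes.py. Mounting: unconditional include_router (mcp_router
+precedent, server.py:435-437) -- the audio conditional-mount pattern cannot be used because it happens at
+import time, when settings are not yet initialized (§2.2). All gating is inside the handlers:
+settings.video.enabled false or manager not initialized -> 503 + guidance;
+venv probe failure -> 503 + install guidance (the guidance text uses the corrected commands from §4.5).
+Router-level Depends(verify_api_key).
+
+| Endpoint | Behavior |
+|---|---|
+| POST /v1/videos | See submission semantics below |
+| GET /v1/videos | Required in the MVP (after LRU purges artifacts this is the only way to enumerate). Parameters: limit (default 20, max 100) / after (cursor = job id) / order (asc or desc, default desc, by created_at). Response {"object": "list", "data": [...], "has_more": bool, "last_id": ...} -- the fields openai SDK cursor pagination needs |
+| GET /v1/videos/{id} | Job object (status, progress, phase, error, expires_at, ...) |
+| GET /v1/videos/{id}/content | FileResponse mp4 (media_type=video/mp4, Range supported); not finished -> 409; completed but artifact purged by the retention policy -> 404 + code=artifact_expired (response body points to expires_at); the handler must check that the file exists first (FileResponse returns 500 for a missing path) |
+| DELETE /v1/videos/{id} | queued/in_progress: kill the worker (SIGTERM->5s->SIGKILL) + release the lease + delete the record and artifact; completed/failed: delete the record and artifact. Returns {"id", "object": "video.deleted", "deleted": true} (openai SDK VideoDeleteResponse shape); a subsequent GET -> 404 |
+
+Submission semantics (POST /v1/videos):
+
+- Compatibility crux (review blocker): the openai SDK's client.videos.create sends
+  multipart/form-data (for the input_reference file field), so a pure-JSON pydantic body
+  would 422 every request from the official SDK. The handler takes the raw Request and branches on Content-Type:
+  multipart -> await request.form(); JSON/missing -> await request.json();
+  both paths normalize into the same internal pydantic model (video_models.py is kept). FastAPI cannot
+  dispatch two handlers on the same path by content type, so a single handler must parse by hand --
+  deliberately different from the pydantic-body pattern in audio_routes, with the reason noted here.
+- Fields: model, prompt, optional size "WxH", seconds (the SDK sends the string literals
+  "4"|"8"|"12"; under multipart every field is a string, so numeric fields must go through pydantic
+  lax coercion), plus fmlx extensions negative_prompt/steps/fps/seed/guidance/
+  guidance_2 (extension collision policy: if OpenAI later claims a field of the same name, the fmlx semantics yield and
+  the extension moves to an fmlx_ prefix; the MVP does not prefix preemptively).
+- seconds converts to a frame count via fps, forced to 4n+1; size is rounded up to a multiple of 16.
+- Reject at admission (400/413): parameters exceed the static caps (max_frames/max_steps/max_pixels,
+  settings); or, per the per-request peak predictor of §4.4, predicted_peak(W,H,frames) +
+  margin > lease -- the response body carries the predicted value vs the lease figure. The static caps are the UX boundary;
+  the predictor is the memory boundary.
+- Accepted -> return the job object immediately (status=queued).
+- 503 in only three cases: queue full / venv missing / memory guard disabled (all persistent conditions at
+  submission time, with an actionable reason). Memory pressure is not a 503; the job queues and waits (§4.2).
+
+Error mapping: model not found 404 (with an available hint); model not of video type 400.
+Every request is counted in metrics (record_request_complete, 0 tokens, modeled on
+audio_routes.py:426-436).
+
+### 4.4 Memory co-residency: three-layer governance (wired self-restraint / lease / watchdog)
+
+Layer one, prevention (load-bearing wall): the Metal wired limit pins each of the two processes.
+
+- On entry the worker calls `mx.set_wired_limit(lease_bytes - wired_margin)` (§4.2).
+  Over the limit it degrades to non-resident pages (slower) or allocation failure (the job fails, the manager reports failed) --
+  it never grows toward the machine cap. Same mechanism as the enforcer's existing practice for the parent process
+  (process_memory_enforcer.py:407-446).
+- On acquire_video_lease the parent resets its own wired limit to (static_ceiling -
+  lease) and restores it on release. If the parent's resident wired memory already exceeds the new limit (e.g. an 85GB model is loaded),
+  MLX degrades to non-resident pages (slower decode) rather than panicking -- acceptable, and the §4.4 admission criterion
+  makes this case rare.
+
+Layer two, budget (lease, about 50 changed lines in process_memory_enforcer.py):
+
+- `_video_lease_bytes` is deducted at the end of `_get_hard_limit_bytes` (495-517):
+  `ceiling = max(0, ceiling - lease)`. A single choke point: pool admission / soft and hard watermarks /
+  admission_paused / the prefill gate cap all tighten automatically on the next 1s tick, with zero
+  scheduler changes.
+- Double-counting correction (review major): the dynamic ceiling (483-493, non-custom tiers) drops a second time
+  as worker usage lowers free. Fix: the non-custom branch adds back
+  `min(get_phys_footprint(worker_pid), lease)` -- the worker is counted exactly once.
+  Clamping to the lease ensures a runaway worker (in the window before the watchdog kills it) cannot push the
+  parent's ceiling up; when the footprint reads 0 this degrades to today's double counting, i.e. fail-conservative.
+- API: `acquire_video_lease(bytes)` (before spawn; no pid yet, so the add-back term is 0,
+  which is correct -- nothing is allocated yet), `set_video_worker_pid(pid)` (immediately after spawn),
+  `release_video_lease()` (clears both). Any value change calls `_propagate_memory_limit()`
+  (the existing runtime-setter pattern, 372-400).
+- Guard disabled (get_final_ceiling()==0) -> reject video jobs (503 at submission, §4.3);
+  do not introduce a panic source on an unprotected machine.
+
+Dispatch admission criterion (review fix: must not admit inside a window where "landing the lease immediately triggers hard pressure"):
+
+- The enforcer exists and the guard is enabled, and `recent_peak_bytes() <= min(
+  soft_ratio * (ceiling - lease), (ceiling - lease) - prefill_transient_margin)`
+  -- using the rolling peak rather than the instantaneous value, and requiring that once the lease lands the system is directly in a state of "ok pressure +
+  resident load does not trip the prefill gate". If unmet -> stay queued and recheck (§4.2).
+- Residual case of an in-flight long prefill (a growing load that arrives after the criterion passes but before the lease lands):
+  landing the lease tightens the gate cap, and that prefill's next chunk is cleanly rejected by the gate
+  (a 503-class error, no panic) -- this is in-design behavior, recorded in the §9 trade-off table. The MVP does no
+  drain (waiting for prefill to empty before landing the lease); P2 will revisit based on measurements.
+- Mutual-exclusion arithmetic with very large models (fix for a review blocker): 107.5 ceiling - 28 lease =
+  79.5, so glm4.5 (85GB weights) simply does not fit -- i.e. by design video is mutually exclusive with >=80GB LLMs,
+  and the job queues until the large model is unloaded by TTL or by hand. The domain of true co-residency is "LLM weights +
+  working set <= ceiling - lease - headroom", roughly models of <=50GB class on a 128GB machine. Goal 4 in §1
+  and the A/B protocol in §7 are both phrased accordingly.
+
+Layer three, cleanup (watchdog, §4.2): 2s footprint polling with over-lease kill + stall kill + timeout kill.
+Its role is leak detection and secondary cleanup -- a sub-2s wired sprint is stopped by layer one, not by it.
+
+Lease size: settings.video.memory_lease_gb, initial design-time default 28, calibrated by P0 measurement
+(§7: the lifetime-max ledger measures the true peak + the worst single-step transient; the P0-calibrated default recorded in §4.5 and §6.1 is 36); the calibrated value is bound to the dependency lock
+digest (§4.5, §9.1).
+
+### 4.5 settings (new VideoSettings section)
+
+Wired in following the four-piece pattern (settings.py:789-817 / 879-912 / 1136-1154 / 1376-1397) +
+admin GET/POST + flat GlobalSettingsRequest fields + the _settings.html form.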
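+
+| Field | Default | Description |
+|---|---|---|
+| enabled | false | Master switch; when false every handler returns 503 (the router stays mounted, §4.3) |
+| worker_python | "" | Python path of the video venv; empty = {base_path}/venvs/video/bin/python |
+| memory_lease_gb | 36 | P0-calibrated (low-RAM upper-limit corner 27.9 + 6 headroom, §6), bound to the lock digest |
+| max_queued_jobs | 4 | Submissions over the cap get 503 |
+| job_timeout_seconds | 7200 | Per-run timeout (from spawn); queue time not counted |
+| progress_stall_timeout_seconds | 600 | Kill on JSONL silence (§4.2) |
+| default_steps / default_fps | 20 / 16 | Generation defaults when not given explicitly (P0-calibrated) |
+| max_frames / max_steps / max_pixels_per_frame | 121 / 50 / 1280x720 | Request UX caps; the memory boundary is guarded by the peak predictor (§4.3/§4.4) |
+| artifacts_max_count / artifacts_max_gb | 50 / 50 | Artifact retention (LRU purges blobs, records kept) |
+
+venv management (review blocker fix: v1's bare command would install into the production main venv when run from the repo cwd):
+
+- Locking: the repo commits `omlx/video/requirements.in` (one line, `mlx-gen==0.18.14`) and
+  `omlx/video/requirements.lock` (`uv pip compile --generate-hashes`; must be generated on
+  macOS arm64 with the same Python minor as the worker venv -- mlx only has darwin
+  wheels, ideally generated on m5max).
+- Creation (the documented commands, also the 503 guidance text):
+
+```
+uv venv -p 3.12 {base_path}/venvs/video
+uv pip sync --python {base_path}/venvs/video/bin/python omlx/video/requirements.lock
+```
+
+- Warning: a bare `uv pip install` resolves its target from VIRTUAL_ENV / the nearest .venv, which from the repo root
+  is the production fmlx venv -- that form must never be used for this purpose.
+- Startup probe: run ` -c "import mflux"` and assert that worker_python
+  and the main process's sys.executable are not the same interpreter (guards against misconfiguration); on failure -> every submission gets 503
+  with install guidance. One-click install from admin is P2.
+
+### 4.6 admin surface (MVP minimum)
+
+- The model list gains video entries automatically (get_status passes model_type through, zero changes);
+  the type dropdown (_modal_model_settings.html:272-280) gains a video option; deletion works
+  (the relaxed gate from §4.1).
+- Job visibility in the MVP relies on GET /v1/videos (now required) and logs; the admin video page is P2.
+
+### 4.7 Relationship to the download path (incidental fixes, suggested as a separate small PR)
+
+- The HF downloader works for diffusers repos with zero changes (snapshot_download lands the whole tree in
+  //, and on_complete triggers rediscovery).
+- China-network Xet wall: `HF_HUB_DISABLE_XET` is frozen at huggingface_hub import time,
+  so it can only be injected at process level -- added to the serve startup env block at cli.py:115-140, driven by
+  settings.huggingface.disable_xet (default false; the docs recommend enabling it in mainland China).
+  This very 42GB download was stuck on Xet for 6.5 hours and finished at 8.8MB/s after switching to the LFS path.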
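+- ModelScope downloads of video models are explicitly unsupported in the MVP (the flat symlink triggers phantom components;
+  the discovery fix in §4.1 stops the phantoms, but formal support for the MS path waits for P2).
+
+## 5. File list
+
+| Path | New/changed | Est. LOC | Contents |
+|---|---|---|---|
+| omlx/video/__init__.py | New | 10 | Exports |
+| omlx/video/manager.py | New | ~650 | Job model / queue / persistence / spawn / watchdog / stall / lease / retention policy |
+| omlx/video/worker.py | New | ~180 | Subprocess script (wired self-restraint + JSONL + manifest), depends only on mflux/mlx |
+| omlx/video/requirements.in + .lock | New | -- | Dependency lock (§4.5) |
+| omlx/api/video_routes.py | New | ~300 | 5 endpoints + dual content-type parsing + gating |
+| omlx/api/video_models.py | New | ~110 | Internal pydantic models / responses / error code enum |
+| omlx/model_discovery.py | Changed | ~60 | _is_model_dir / detect_model_type / registration arm / Literal |
+| omlx/engine_pool.py | Changed | ~30 | Literal / mapping / get_engine entry rejection + new exception / _load_engine defensive arm |
+| omlx/server.py | Changed | ~40 | Router mount / pre-pool 400 / exception mapping arm / default-model and fallback hygiene / ModelInfo.model_type / manager construction wiring |
+| omlx/process_memory_enforcer.py | Changed | ~50 | Three lease APIs + deduction + dynamic-ceiling add-back + parent wired reset |
+| omlx/settings.py | Changed | ~120 | VideoSettings four-piece set |
+| omlx/admin/routes.py | Changed | ~45 | valid_types / mapping / DELETE and list gates / global-settings / default setter rejects video |
+| omlx/cli.py | Changed | ~6 | disable_xet env injection |
+| templates/static | Changed | ~20 | Type dropdown + settings form |
+| tests/ (multiple files) | New | ~1500 | See §7 |
+
+Total: about 1.25k new (business) + 1.5k (tests), about 370 modified, spread as small patches over 8 files
+shared with upstream (§10).
+
+## 6. Initial defaults and P0 measurement record
+
+Generation parameter defaults (server-side fallback, client-overridable, UX caps clamped by settings, memory
+boundary guarded by the predictor): size 480x272 (multiple of 16), seconds 3 (49 frames at fps=16,
+4n+1), steps 20, guidance 4.0 / guidance_2 3.0 (mlx-gen A14B model defaults),
+random seed. The measured default profile renders in about 8 minutes (including ~60s cold load).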
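The default arithmetic above (3 seconds at fps=16 giving 49 frames, sizes as multiples of 16) can be worked through with a small sketch. The function names are hypothetical, and rounding the raw frame count up to the next 4n+1 value is an assumption: the spec only states the 4n+1 constraint and the 48 -> 49 example.

```python
def frames_for(seconds: int, fps: int) -> int:
    """Convert a duration to a frame count of the form 4n+1.

    Assumption: round the raw count (seconds * fps) UP to the nearest 4n+1.
    """
    raw = max(seconds * fps, 1)
    # ceil((raw - 1) / 4) * 4 + 1, using integer arithmetic only.
    return ((raw - 1 + 3) // 4) * 4 + 1


def round_up_16(value: int) -> int:
    """Round a pixel dimension up to the next multiple of 16."""
    return -(-value // 16) * 16


def normalize_size(size: str) -> tuple[int, int]:
    """Parse a "WxH" string and snap both dimensions up to multiples of 16."""
    width_text, height_text = size.lower().split("x")
    return round_up_16(int(width_text)), round_up_16(int(height_text))
```

With these helpers, `frames_for(3, 16)` gives 49 (48 raw frames plus one), and `normalize_size("480x270")` gives `(480, 272)`, matching the default profile.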
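+
+### 6.1 P0 measurements (m5max, M5 Max 128GB, mlx-gen==0.18.14, A14B q8, 2026-06-11)
+
+| Profile | Parameters | True peak (lifetime ledger) | Duration | Worst 0.5s transient |
+|---|---|---|---|---|
+| default (natural) | 480x272, 49f, 20 steps | 49.32 GB | 537s | 10.98 GB |
+| steps40 (natural) | 480x272, 49f, 40 steps | 49.32 GB | 861s | -- |
+| mid_spatial (natural) | 832x480, 49f, 20 steps | 75.46 GB | 2560s | -- |
+| frames101 (natural) | 480x272, 101f, 20 steps | 49.44 GB | 1278s | -- |
+| default (low RAM) | 480x272, 49f, 20 steps | 18.83 GB | 491s | 3.15 GB |
+| mid_spatial (low RAM) | 832x480, 49f, 20 steps | 21.88 GB | 2566s | 5.29 GB |
+
+Conclusions (all carried into the implementation):
+1. The peak is independent of step count (49.32 == 49.32, byte for byte) and of frame count (49.32 vs 49.44);
+   it grows linearly only with per-frame spatial tokens (W/16 x H/16). The predictor formula is therefore
+   peak = BASE + COEF x spatial_tokens, with frame count excluded.
+2. Low-RAM mode (worker default): memory drops 62% with no slowdown (491s vs 537s), and spatial scaling
+   is also flattened (3.06x tokens costs only +3GB). Calibration: BASE=17.5, COEF=0.0029 GB/token,
+   margin=6 (worst transient 5.29, padded). The upper-limit corner 1280x720 predicts 27.9+6=33.9,
+   which the default lease of 36 accommodates.
+3. Natural mode is a pointless luxury (75GB already at mid resolution); kept only as a worker option.
+4. Co-residency arithmetic: 107.5 ceiling - 36 lease = 71.5GB left for LLMs; on a 128GB machine
+   true co-residency with <=50GB-class models holds; with the 85GB class (glm4.5) it is mutually exclusive and the job queues.
+
+## 7. Test plan
+
+Unit tests (CI has no GPU; none touch real weights):
+
+- discovery: diffusers-layout fixture (empty weight files) -> recognized as video / no phantom
+  components / unknown pipeline skipped + logged / both flat and owner-repo layouts / a video
+  model that sorts first lexicographically does not become the default model.
+- Type-mapping consistency assertion (three mappings + valid_types in sync).
+- Load rejection: pool.get_engine raises the typed exception for a video entry directly, with zero evictions and zero loads;
+  server-side chat/embeddings/load endpoints return 400 + guidance.
+- Manager state machine: submit/queue/cancel/timeout/stall/worker non-zero exit/manifest pass-through/
+  startup replay (code=server_restarted), with a fake script (emits JSONL + touches
+  an mp4) standing in for the worker; the enforcer is a fake implementation injected via the constructor (the §4.2 wiring is the test seam).
+- Concurrency and races: asyncio.gather multiple submissions -> exactly one running + 503 at the queue cap, never
+  two workers; the watchdog footprint-reads-0 path; the race between worker exit and a watchdog tick.
+- Lease: effect of acquire/release on the ceiling, dynamic-ceiling add-back clamp,
+  rejection when the guard is disabled, the admission criterion (rolling-peak basis).
+- API: dual content-type parsing (multipart string-field coercion + JSON), schema /
+  error codes / Range download / over-limit 400 / predictor 413 / list pagination cursor / expires_at and
+  artifact_expired / DELETE semantics in every state.
+- Retention policy: count and byte dual-cap LRU; after blob deletion the record stays visible and expires_at is set.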
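+- Regression protection: the added /v1/models payload field does not break existing assertions (audit existing exact-match
+  tests); settings section-enumeration tests updated in step.
+
+P0 real-machine measurement (m5max, no fmlx code, run after the user's go-ahead):
+
+- Instruments: an external footprint polling curve (phase attribution, hooked to the JSONL stream) + the kernel lifetime-max
+  ledger (`ri_lifetime_max_phys_footprint`, declared but unread at proc_memory.py:63;
+  add a get_phys_footprint_lifetime_max variant that the process reads on itself after video.save returns and
+  before it exits) -- with one process per job, lifetime max == the true peak including load/VAE/all
+  sub-poll spikes. The polling curve only establishes the phase shape; the true peak comes from the ledger.
+- Measurement matrix: default (480x272, 49f) / mid (832x480, 81f) / upper-limit corner
+  (1280x720, 121f) / one steps variant (to verify that steps do not move the peak). Fit
+  peak ~= W + a*latent_tokens (adding a quadratic term if attention is not fused SDPA), and record
+  each profile's worst single-step transient (sub-poll delta) -- it, not the steady-state peak, determines the margin inside the lease
+  (settings.py:404 methodology).
+- Outputs: lease default + peak-predictor coefficients (backfilled into §4.3/§4.4/§4.5) + default parameter
+  profile + lock-digest binding record.
+
+P1 real-machine A/B (review fix: v1's "glm4.5 co-residency" arithmetic is impossible, 85+28 > 107.5):
+
+- Scenario A, mutual-exclusion semantics (glm4.5 85GB): the large model is loaded and kept active (pinned or under continuous traffic,
+  to prevent a TTL unload midway), then POST /v1/videos. Assert: the job stays queued and GET shows
+  the memory reason; no worker process (verified with ps); zero OVER_HARD; the machine survives. Then unload
+  glm4.5 and assert the job moves to in_progress within ~2 enforcer ticks. The test uses a short timeout,
+  not the 2h default.
+- Scenario B, true co-residency (<=50GB-class model, e.g. a quantized gemma4-26b): send a long LLM prefill
+  while a video job runs. Assert: the prefill gate, under the cap tightened by (ceiling - lease),
+  either rejects cleanly or completes normally (as the budget arithmetic predicts), admission-pause behavior matches the watermarks,
+  zero OVER_HARD, zero panic; the video job completes normally and the mp4 is healthy.
+- Scenario C, release and recovery: after the job ends (both the completion and DELETE paths), assert the lease is released,
+  the parent's wired limit is restored, full LLM service resumes, and the artifact is Range-downloadable.
+- Regression: the full pytest suite with zero regressions (baseline in docs/upstream-sync.md).
+
+## 8. Phases
+
+- P0 real-machine measurement (first, zero integration code): §7 P0. Outputs calibration data backfilled into this spec.
+- P1 MVP: all of §4 + §7 unit tests + the three §7 A/B scenarios. Single branch feat/video-engine,
+  human-reviewed and human-merged.
+- P2 (scheduled as needed): admin video page + SSE progress, I2V (image upload), TI2V-5B and bf16
+  variants, text-to-image (FLUX family on the same runtime, /v1/images), resident worker + idle TTL,
+  per-model generation defaults, formal ModelScope support, admin one-click venv install, a policy for video jobs
+  actively evicting LLMs, drain-style lease landing.
+
+## 9. Risks and countermeasures
+
+| Risk | Level | Countermeasure |
+|---|---|---|
+| mlx-gen's fast 0.x evolution + bus factor 1 (hygiene signal: twine mixed into runtime deps) | High | First line: hash-lock the full dependency set (§4.5), so breakage can only enter via an upgrade PR; second line: vendor the wan subtree (honest size ~130 files; torch/transformers do not disappear with vendoring); switching to the CLI is only a third line. Upgrade procedure in §9.1 |
+| Production-resolution memory unmeasured (the official figure is a small profile; the upper-limit corner has about 39x the latent volume of the measured point) | High | P0 measurement matrix + lifetime-max ledger establish the true peak; the per-request peak predictor decouples the memory boundary from static caps (§4.3); worker wired self-restraint as the floor |
+| Two-process Metal wired-sum crosses the machine cap | Medium | Prevention layer: worker wired self-restraint on entry + parent wired reset on acquire (§4.4 layer one); the watchdog is secondary cleanup only; dedicated validation in A/B scenario B |
+| Landing the lease momentarily triggers hard pressure and hurts in-flight LLM requests | Medium | Admission criterion uses the rolling peak and requires ok pressure immediately after landing (§4.4); residual: an in-flight prefill that grows only after landing is cleanly rejected by the gate (in-design, no panic) |
+| Mutual exclusion with >=80GB LLMs makes video jobs wait a long time | Low | In-design trade-off; the queue reason is visible via GET, and DELETE is available; active eviction policy is P2 |
+| Worker hangs (emits no steps and does not exit) | Medium | Phase heartbeats + progress_stall_timeout_seconds stall kill + per-run timeout, two layers of kill (§4.2, already in settings and the state machine) |
+| Queued jobs lost across restarts | Low | Persistence + marked failed at startup (code=server_restarted), never silently dropped; the MVP has no resume |
+| Artifact disk usage | Low | Dual-cap LRU purge of blobs; records kept + expires_at |
+| Fields lost when settings are downgraded to an older version | Low | Known from_dict behavior, noted in docs |
+
+### 9.1 Upgrade and dependency-drift procedure
+
+1. Lock the whole venv rather than the top-level package: requirements.lock (hashed) is committed, and venv creation/rebuild
+   always uses `uv pip sync`. Every dependency change (top-level bump or transitive drift) must go through a PR.
+2. Merge gate for every lock change: rerun the P0 measurement matrix on m5max (at least the default profile + the upper-limit corner),
+   with the numbers in the PR body (true peak / worst transient); if the new peak + margin approaches the configured lease,
+   recalibrate memory_lease_gb and the predictor coefficients in the same PR.
+3. The validity of the lease default and predictor coefficients is bound to the lock digest (the digest is recorded in both the spec and the settings
+   comments); a digest mismatch logs a warning at startup.
+4. Output-quality regression: the upgrade PR attaches a fixed-seed golden clip comparison (manual visual inspection suffices;
+   the MVP has no automated metric).
+
+## 10. Relationship to the upstream soft-fork
+
+This feature is fmlx's own divergence and is never upstreamed. Conflict-surface control strategy: all business logic lives in new files
+(omlx/video/, api/video_*), and files shared with upstream get only small, greppable patches
+(the 8 "changed" files in the §5 list, about 370 lines, with cli.py/templates each <=20 lines).
+When an upstream cherry-pick hits these files, the conflict hunks are small and semantically independent, so resolution cost is manageable.
+docs/upstream-sync.md records a divergence marker.
+
+## 11. Open questions awaiting sign-off
+
+1. Lease default / predictor coefficients / default parameter profile -- backfilled from the P0 measurement (§6.1), closed.
+2. settings.video.enabled default false (manual opt-in) vs default true (503 guidance when the venv
+   is missing) -- leaning false, for a gradual-rollout mindset.
+3. Whether the Xet fix (§4.7) should split into a separate small PR that lands first (no coupling to video, immediate ops value) --
+   leaning split.
+4. Product wording of the true co-residency domain: should the docs give explicit guidance such as "on a 128GB machine, pair video
+   with an LLM of <=50GB" -- leaning yes, in the README video section.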
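As a closing illustration of the calibration that item 1 of §11 marks as closed, the §6.1 predictor (peak = BASE + COEF x spatial_tokens, low-RAM mode) and the submit-time check from §4.3 can be sketched as follows. The constants come from §6.1; the function names and the exact comparison are assumptions for the example, not the shipped implementation.

```python
# Calibration values quoted in §6.1 (low-RAM mode, mlx-gen==0.18.14).
BASE_GB = 17.5
COEF_GB_PER_TOKEN = 0.0029
MARGIN_GB = 6.0


def spatial_tokens(width: int, height: int) -> int:
    """Per-frame spatial tokens, W/16 x H/16 (dimensions are multiples of 16)."""
    return (width // 16) * (height // 16)


def predicted_peak_gb(width: int, height: int) -> float:
    """Predicted worker peak in GB; frame count and steps do not enter (§6.1)."""
    return BASE_GB + COEF_GB_PER_TOKEN * spatial_tokens(width, height)


def fits_lease(width: int, height: int, lease_gb: float = 36.0) -> bool:
    """Hypothetical submit-time check: predicted peak + margin must fit the lease."""
    return predicted_peak_gb(width, height) + MARGIN_GB <= lease_gb
```

For the upper-limit corner 1280x720 this gives 3600 tokens and a predicted peak of about 27.9 GB, so 27.9 + 6 fits the default 36 GB lease but would not fit a 28 GB one.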
diff --git a/omlx/admin/i18n/en.json b/omlx/admin/i18n/en.json index 512804277..811753370 100644 --- a/omlx/admin/i18n/en.json +++ b/omlx/admin/i18n/en.json @@ -767,5 +767,22 @@ "cluster.router.unknown": "unknown", "cluster.healthy.yes": "yes", "cluster.healthy.no": "no", - "cluster.save.failed": "Save failed" + "cluster.save.failed": "Save failed", + "settings": { + "video": { + "title": "Video Generation", + "enabled": "Enable video generation", + "enabled_desc": "Serve POST /v1/videos via the subprocess worker. Requires the video worker venv.", + "memory_lease": "Memory lease (GB)", + "memory_lease_desc": "Reserved against the memory ceiling while a job runs; co-resident LLMs throttle accordingly.", + "default_steps": "Default denoise steps", + "default_fps": "Default FPS", + "max_queued_jobs": "Max queued jobs", + "job_timeout": "Job timeout (seconds)", + "artifacts_max_gb": "Artifact storage cap (GB)", + "artifacts_max_gb_desc": "Oldest video files are purged beyond this; job records are kept.", + "worker_python": "Worker python path", + "worker_python_desc": "Python of the isolated video venv. Empty = default path." + } + } } diff --git a/omlx/admin/i18n/ru.json b/omlx/admin/i18n/ru.json index 6d68461be..061dd7d40 100644 --- a/omlx/admin/i18n/ru.json +++ b/omlx/admin/i18n/ru.json @@ -742,5 +742,22 @@ "cluster.router.unknown": "неизвестно", "cluster.healthy.yes": "да", "cluster.healthy.no": "нет", - "cluster.save.failed": "Не удалось сохранить" + "cluster.save.failed": "Не удалось сохранить", + "settings": { + "video": { + "title": "Генерация видео", + "enabled": "Включить генерацию видео", + "enabled_desc": "Обслуживать POST /v1/videos через подпроцесс-воркер. Требуется venv видео-воркера.", + "memory_lease": "Резерв памяти (ГБ)", + "memory_lease_desc": "Резервируется из лимита памяти на время задачи; LLM соответственно замедляются.", + "default_steps": "Шаги диффузии по умолчанию", + "default_fps": "FPS по умолчанию", + "max_queued_jobs": "Макс. 
задач в очереди", + "job_timeout": "Таймаут задачи (сек)", + "artifacts_max_gb": "Лимит хранения (ГБ)", + "artifacts_max_gb_desc": "Старые видеофайлы удаляются сверх лимита; записи задач сохраняются.", + "worker_python": "Путь к python воркера", + "worker_python_desc": "Python изолированного видео-venv. Пусто = путь по умолчанию." + } + } } diff --git a/omlx/admin/i18n/zh.json b/omlx/admin/i18n/zh.json index 964cfe02b..2719f4d46 100644 --- a/omlx/admin/i18n/zh.json +++ b/omlx/admin/i18n/zh.json @@ -765,5 +765,22 @@ "cluster.router.unknown": "未知", "cluster.healthy.yes": "是", "cluster.healthy.no": "否", - "cluster.save.failed": "保存失败" + "cluster.save.failed": "保存失败", + "settings": { + "video": { + "title": "视频生成", + "enabled": "启用视频生成", + "enabled_desc": "通过子进程 worker 提供 POST /v1/videos。需要先安装视频 worker venv。", + "memory_lease": "内存租约 (GB)", + "memory_lease_desc": "任务运行期间从内存上限中预留;共驻的 LLM 会相应限流。", + "default_steps": "默认去噪步数", + "default_fps": "默认帧率", + "max_queued_jobs": "最大排队任务数", + "job_timeout": "任务超时 (秒)", + "artifacts_max_gb": "产物存储上限 (GB)", + "artifacts_max_gb_desc": "超限时清除最旧的视频文件;任务记录保留。", + "worker_python": "Worker python 路径", + "worker_python_desc": "独立视频 venv 的 python。留空用默认路径。" + } + } } diff --git a/omlx/admin/routes.py b/omlx/admin/routes.py index a0c5d8e8d..97145037e 100644 --- a/omlx/admin/routes.py +++ b/omlx/admin/routes.py @@ -281,6 +281,21 @@ class GlobalSettingsRequest(BaseModel): # Idle timeout settings. null disables the global fallback. 
idle_timeout_seconds: int | None = Field(default=None, ge=60) + # Video generation settings (docs/video-generation-engine-spec.md 4.5) + video_enabled: bool | None = None + video_worker_python: str | None = None + video_memory_lease_gb: float | None = Field(default=None, gt=0) + video_max_queued_jobs: int | None = Field(default=None, ge=1) + video_job_timeout_seconds: int | None = Field(default=None, ge=60) + video_progress_stall_timeout_seconds: int | None = Field(default=None, ge=30) + video_default_steps: int | None = Field(default=None, ge=1) + video_default_fps: int | None = Field(default=None, ge=1) + video_max_frames: int | None = Field(default=None, ge=5) + video_max_steps: int | None = Field(default=None, ge=1) + video_max_pixels_per_frame: int | None = Field(default=None, ge=256) + video_artifacts_max_count: int | None = Field(default=None, ge=1) + video_artifacts_max_gb: float | None = Field(default=None, gt=0) + # Auth settings api_key: str | None = None skip_api_key_verification: bool | None = None @@ -1857,7 +1872,7 @@ async def update_model_settings( ) current_settings.model_alias = alias_value if "model_type_override" in sent: - valid_types = {"llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts"} + valid_types = {"llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts", "video"} # Treat empty string as None (auto-detect) override_value = request.model_type_override or None if override_value is not None and override_value not in valid_types: @@ -1875,6 +1890,7 @@ async def update_model_settings( "audio_stt": "audio_stt", "audio_tts": "audio_tts", "audio_sts": "audio_sts", + "video": "video", } if override_value: entry.model_type = override_value @@ -2849,6 +2865,7 @@ async def get_global_settings(is_admin: bool = Depends(require_admin)): "idle_timeout": { "idle_timeout_seconds": global_settings.idle_timeout.idle_timeout_seconds, }, + "video": global_settings.video.to_dict(), } @@ -3268,6 +3285,30 @@ async 
def update_global_settings( else: logger.info("Idle timeout disabled") + # Apply video settings (Live for caps/timeouts; enabled flips per-request + # gating immediately because handlers read settings.video each call. + # worker_python/memory_lease affect the NEXT job dispatch.) + _video_fields = { + "video_enabled": "enabled", + "video_worker_python": "worker_python", + "video_memory_lease_gb": "memory_lease_gb", + "video_max_queued_jobs": "max_queued_jobs", + "video_job_timeout_seconds": "job_timeout_seconds", + "video_progress_stall_timeout_seconds": "progress_stall_timeout_seconds", + "video_default_steps": "default_steps", + "video_default_fps": "default_fps", + "video_max_frames": "max_frames", + "video_max_steps": "max_steps", + "video_max_pixels_per_frame": "max_pixels_per_frame", + "video_artifacts_max_count": "artifacts_max_count", + "video_artifacts_max_gb": "artifacts_max_gb", + } + for req_field, attr in _video_fields.items(): + value = getattr(request, req_field, None) + if value is not None: + setattr(global_settings.video, attr, value) + runtime_applied.append(req_field) + # Apply auth settings (API key change) if request.api_key is not None: from ..server import _server_state @@ -4465,7 +4506,7 @@ async def list_hf_models(is_admin: bool = Depends(require_admin)): model_dirs = global_settings.model.get_model_dirs(global_settings.base_path) - from ..model_discovery import _resolve_hf_cache_entry + from ..model_discovery import _is_model_dir, _resolve_hf_cache_entry def _add_model(model_path: Path, model_name: str) -> None: if model_name in seen_names: @@ -4492,7 +4533,9 @@ def _add_model(model_path: Path, model_name: str) -> None: if not subdir.is_dir() or subdir.name.startswith("."): continue - if (subdir / "config.json").exists(): + # _is_model_dir accepts config.json or model_index.json roots + # (diffusers-layout video models) and excludes adapters. 
+ if _is_model_dir(subdir): # Level 1: direct model folder _add_model(subdir, subdir.name) else: @@ -4500,7 +4543,7 @@ def _add_model(model_path: Path, model_name: str) -> None: hf_resolved = _resolve_hf_cache_entry(subdir) if hf_resolved is not None: snapshot_path, model_name = hf_resolved - if (snapshot_path / "config.json").exists(): + if _is_model_dir(snapshot_path): _add_model(snapshot_path, model_name) continue @@ -4508,7 +4551,7 @@ def _add_model(model_path: Path, model_name: str) -> None: for child in sorted(subdir.iterdir()): if not child.is_dir() or child.name.startswith("."): continue - if (child / "config.json").exists(): + if _is_model_dir(child): _add_model(child, child.name) return {"models": models} @@ -4528,14 +4571,18 @@ async def delete_hf_model( model_dirs = global_settings.model.get_model_dirs(global_settings.base_path) - # Search for model across all directories in both flat and org-folder layouts + # Search for model across all directories in both flat and org-folder + # layouts. _is_model_dir accepts config.json or model_index.json roots + # (diffusers-layout video models must be deletable too). 
+ from ..model_discovery import _is_model_dir + model_path = None parent_model_dir = None for model_dir in model_dirs: if not model_dir.exists(): continue candidate = model_dir / model_name - if candidate.is_dir() and (candidate / "config.json").exists(): + if candidate.is_dir() and _is_model_dir(candidate): model_path = candidate parent_model_dir = model_dir break @@ -4544,7 +4591,7 @@ async def delete_hf_model( if not subdir.is_dir() or subdir.name.startswith("."): continue candidate = subdir / model_name - if candidate.is_dir() and (candidate / "config.json").exists(): + if candidate.is_dir() and _is_model_dir(candidate): model_path = candidate parent_model_dir = model_dir break diff --git a/omlx/admin/static/js/dashboard.js b/omlx/admin/static/js/dashboard.js index a493b9d04..d6baf91b4 100644 --- a/omlx/admin/static/js/dashboard.js +++ b/omlx/admin/static/js/dashboard.js @@ -44,6 +44,7 @@ integrations: { copilot_model: null, codex_model: null, opencode_model: null, openclaw_model: null, pi_model: null, openclaw_tools_profile: 'full' }, ui: { language: 'en' }, idle_timeout: { idle_timeout_seconds: null }, + video: { enabled: false, worker_python: '', memory_lease_gb: 36, max_queued_jobs: 4, job_timeout_seconds: 7200, progress_stall_timeout_seconds: 600, default_steps: 20, default_fps: 16, max_frames: 121, max_steps: 50, max_pixels_per_frame: 921600, artifacts_max_count: 50, artifacts_max_gb: 50 }, system: { total_memory_bytes: 0, total_memory: '', auto_model_memory: '', ssd_total_bytes: 0, ssd_total: '' }, }, @@ -781,6 +782,7 @@ claude_code: { ...this.globalSettings.claude_code, ...data.claude_code }, integrations: { ...this.globalSettings.integrations, ...data.integrations }, idle_timeout: { ...this.globalSettings.idle_timeout, ...data.idle_timeout }, + video: { ...this.globalSettings.video, ...data.video }, system: { ...this.globalSettings.system, ...data.system }, }; this.globalSettings.ui = data.ui || { language: 'en' }; @@ -884,6 +886,14 @@ 
...(this.globalSettings.auth.api_key ? { api_key: this.globalSettings.auth.api_key } : {}), skip_api_key_verification: this.globalSettings.auth.skip_api_key_verification, idle_timeout_seconds: this.globalSettings.idle_timeout?.idle_timeout_seconds ?? null, + video_enabled: this.globalSettings.video?.enabled ?? null, + video_worker_python: this.globalSettings.video?.worker_python || null, + video_memory_lease_gb: this.globalSettings.video?.memory_lease_gb ?? null, + video_max_queued_jobs: this.globalSettings.video?.max_queued_jobs ?? null, + video_job_timeout_seconds: this.globalSettings.video?.job_timeout_seconds ?? null, + video_default_steps: this.globalSettings.video?.default_steps ?? null, + video_default_fps: this.globalSettings.video?.default_fps ?? null, + video_artifacts_max_gb: this.globalSettings.video?.artifacts_max_gb ?? null, }), }); diff --git a/omlx/admin/templates/dashboard/_modal_model_settings.html b/omlx/admin/templates/dashboard/_modal_model_settings.html index 0121801ac..4fdad8b8c 100644 --- a/omlx/admin/templates/dashboard/_modal_model_settings.html +++ b/omlx/admin/templates/dashboard/_modal_model_settings.html @@ -277,6 +277,7 @@

{{ +
diff --git a/omlx/admin/templates/dashboard/_settings.html b/omlx/admin/templates/dashboard/_settings.html index 87ba1d5ae..b192ca715 100644 --- a/omlx/admin/templates/dashboard/_settings.html +++ b/omlx/admin/templates/dashboard/_settings.html @@ -575,6 +575,83 @@

{{ t('settings.gl

+ {{ t('settings.video.title') }}
+ {{ t('settings.video.enabled_desc') }}
+ {{ t('settings.video.memory_lease_desc') }}
+ {{ t('settings.video.artifacts_max_gb_desc') }}
+ {{ t('settings.video.worker_python_desc') }}
diff --git a/omlx/api/openai_models.py b/omlx/api/openai_models.py index 900d3a50f..3e32003a1 100644 --- a/omlx/api/openai_models.py +++ b/omlx/api/openai_models.py @@ -413,6 +413,10 @@ class ModelInfo(BaseModel): object: str = "model" created: int = Field(default_factory=get_unix_timestamp) owned_by: str = "omlx" + # fmlx extension (additive; OpenAI clients ignore unknown fields). + # Lets clients filter non-chat models (video/embedding/audio) out of + # chat pickers; the CLI's llm/vlm filter consumes it. + model_type: str = "llm" class ModelsResponse(BaseModel): diff --git a/omlx/api/video_models.py b/omlx/api/video_models.py new file mode 100644 index 000000000..e64cd3443 --- /dev/null +++ b/omlx/api/video_models.py @@ -0,0 +1,40 @@ +# SPDX-License-Identifier: Apache-2.0 +"""Request/response models for the /v1/videos API. + +POST /v1/videos accepts BOTH application/json and multipart/form-data -- +the official openai SDK sends multipart (all fields as strings), so the +route normalizes either body into VideoCreateParams here. Pydantic v2 lax +coercion handles the string-to-number conversion ("4" -> 4). +Design: docs/video-generation-engine-spec.md section 4.3. +""" + +from __future__ import annotations + +from typing import Optional + +from pydantic import BaseModel, Field + + +class VideoCreateParams(BaseModel): + """Normalized create-video parameters (JSON or multipart source). + + OpenAI-compatible core: model, prompt, size ("WxH"), seconds (the SDK + sends string literals like "4"). fmlx extensions: negative_prompt, + frames/steps/fps/seed/guidance/guidance_2. Extension collision policy: + if OpenAI later claims an extension name, fmlx semantics yield and the + extension moves to an fmlx_ prefix (spec 4.3). + """ + + model: str + prompt: str = Field(min_length=1) + size: Optional[str] = None # "WxH", e.g. 
"480x272" + seconds: Optional[float] = None + negative_prompt: Optional[str] = None + width: Optional[int] = None # Explicit override beats size + height: Optional[int] = None + frames: Optional[int] = None # Explicit override beats seconds*fps + steps: Optional[int] = None + fps: Optional[int] = None + seed: Optional[int] = None + guidance: Optional[float] = None + guidance_2: Optional[float] = None diff --git a/omlx/api/video_routes.py b/omlx/api/video_routes.py new file mode 100644 index 000000000..a9a56a77f --- /dev/null +++ b/omlx/api/video_routes.py @@ -0,0 +1,331 @@ +# SPDX-License-Identifier: Apache-2.0 +"""/v1/videos -- OpenAI-style async video generation job API. + +Endpoints (design: docs/video-generation-engine-spec.md section 4.3): +- POST /v1/videos submit, returns job object immediately +- GET /v1/videos cursor-paginated list +- GET /v1/videos/{id} poll job object +- GET /v1/videos/{id}/content download the mp4 (Range supported) +- DELETE /v1/videos/{id} cancel/delete job + artifacts + +The router is mounted UNCONDITIONALLY at import time (settings are not +initialized yet at that point); all gating happens per-request: +settings.video.enabled off -> 503, manager missing -> 503, worker venv +unusable -> 503 with install guidance. +""" + +from __future__ import annotations + +import logging +import math +import random +import uuid +from pathlib import Path +from typing import Any + +from fastapi import APIRouter, HTTPException, Request +from fastapi.responses import FileResponse +from pydantic import ValidationError + +from .video_models import VideoCreateParams + +logger = logging.getLogger(__name__) + +router = APIRouter() + +GB = 1024**3 + +# Per-request peak predictor, calibrated from the P0 low-RAM measurement +# matrix on m5max (2026-06-11, mlx-gen==0.18.14 lock; recalibrate on every +# lock bump, spec 9.1). 
Empirical findings: peak scales with PER-FRAME +# spatial latent tokens (W/16 * H/16) and is invariant to frame count and +# step count (measured: 480x272 49f==101f within 0.2GB; 20 vs 40 steps +# byte-identical). Low-RAM mode (the worker default): 510 tok -> 18.83GB, +# 1560 tok -> 21.88GB => BASE 17.5, COEF 0.0029 GB/token. Margin covers +# the worst observed sub-poll transient (5.29GB per 0.5s) padded. +_PEAK_BASE_GB = 17.5 +_PEAK_COEF_GB_PER_SPATIAL_TOKEN = 0.0029 +_PEAK_MARGIN_GB = 6.0 + + +def _get_video_manager(): + """Active VideoJobManager from server state (test-patchable).""" + from omlx.server import _server_state + + settings = getattr(_server_state, "global_settings", None) + video_settings = getattr(settings, "video", None) if settings else None + if video_settings is None or not video_settings.enabled: + raise HTTPException( + status_code=503, + detail=( + "Video generation is disabled. Enable settings.video.enabled " + "and configure the worker venv " + "(docs/video-generation-engine-spec.md)." 
+ ), + ) + manager = getattr(_server_state, "video_job_manager", None) + if manager is None: + raise HTTPException( + status_code=503, detail="Video job manager not initialized" + ) + return manager + + +def _get_engine_pool(): + from omlx.server import _server_state + + pool = _server_state.engine_pool + if pool is None: + raise HTTPException(status_code=503, detail="Server not initialized") + return pool + + +def _resolve_model(model_id: str) -> str: + from omlx.server import resolve_model_id + + return resolve_model_id(model_id) or model_id + + +def _record_video_request(model_id: str) -> None: + """Record request count without treating anything as tokens.""" + try: + from omlx.server import get_server_metrics + + get_server_metrics().record_request_complete( + prompt_tokens=0, + completion_tokens=0, + cached_tokens=0, + model_id=model_id, + ) + except Exception as exc: # noqa: BLE001 + logger.warning("Failed to record video metrics for %s: %s", model_id, exc) + + +def _round_up(value: int, multiple: int) -> int: + return ((value + multiple - 1) // multiple) * multiple + + +def _normalize_params( + params: VideoCreateParams, video_settings: Any +) -> dict[str, Any]: + """Apply defaults, dimension rules (W/H multiples of 16, frames 4n+1) + and UX caps. 
Raises HTTPException 400 on violations.""" + width = params.width + height = params.height + if (width is None or height is None) and params.size: + try: + w_str, h_str = params.size.lower().split("x", 1) + width = width or int(w_str) + height = height or int(h_str) + except ValueError: + raise HTTPException( + status_code=400, + detail=f"Invalid size '{params.size}', expected 'WxH'", + ) + width = width or 480 + height = height or 272 + if width <= 0 or height <= 0: + raise HTTPException(status_code=400, detail="size must be positive") + width = _round_up(width, 16) + height = _round_up(height, 16) + + fps = params.fps or int(video_settings.default_fps) + steps = params.steps or int(video_settings.default_steps) + + frames = params.frames + if frames is None: + seconds = params.seconds if params.seconds is not None else 3.0 + if seconds <= 0: + raise HTTPException(status_code=400, detail="seconds must be positive") + frames = int(round(seconds * fps)) + # Wan requires 4n+1 frames + frames = max(5, 4 * math.ceil((frames - 1) / 4) + 1) + + if frames > int(video_settings.max_frames): + raise HTTPException( + status_code=400, + detail=f"frames {frames} exceeds max_frames " + f"{video_settings.max_frames}", + ) + if steps > int(video_settings.max_steps): + raise HTTPException( + status_code=400, + detail=f"steps {steps} exceeds max_steps {video_settings.max_steps}", + ) + if width * height > int(video_settings.max_pixels_per_frame): + raise HTTPException( + status_code=400, + detail=f"{width}x{height} exceeds max_pixels_per_frame " + f"{video_settings.max_pixels_per_frame}", + ) + + # Memory bound: predicted peak must fit the lease (spec 4.3/4.4). The + # static caps above are UX bounds only. Peak is frame-count-invariant + # (P0 measured), so only per-frame spatial tokens enter the formula. 
+    spatial_tokens = (width / 16) * (height / 16)
+    predicted_gb = _PEAK_BASE_GB + _PEAK_COEF_GB_PER_SPATIAL_TOKEN * spatial_tokens
+    lease_gb = float(video_settings.memory_lease_gb)
+    if predicted_gb + _PEAK_MARGIN_GB > lease_gb:
+        raise HTTPException(
+            status_code=413,
+            detail=(
+                f"Predicted memory peak {predicted_gb:.1f}GB (+{_PEAK_MARGIN_GB}GB "
+                f"margin) exceeds video.memory_lease_gb {lease_gb:.0f}GB. "
+                "Reduce resolution or raise the lease."
+            ),
+        )
+
+    seed = params.seed if params.seed is not None else random.randint(0, 2**31 - 1)
+    normalized: dict[str, Any] = {
+        "prompt": params.prompt,
+        "width": width,
+        "height": height,
+        "frames": frames,
+        "steps": steps,
+        "fps": fps,
+        "seed": int(seed),
+        "seconds": round(frames / fps, 2),
+    }
+    if params.negative_prompt:
+        normalized["negative_prompt"] = params.negative_prompt
+    if params.guidance is not None:
+        normalized["guidance"] = float(params.guidance)
+    if params.guidance_2 is not None:
+        normalized["guidance_2"] = float(params.guidance_2)
+    return normalized
+
+
+async def _parse_create_body(request: Request) -> VideoCreateParams:
+    """Accept JSON or multipart (openai SDK sends multipart, all-string
+    fields; pydantic lax coercion converts them)."""
+    content_type = (request.headers.get("content-type") or "").lower()
+    try:
+        if "multipart/form-data" in content_type:
+            form = await request.form()
+            data = {k: v for k, v in form.items() if isinstance(v, str)}
+        else:
+            data = await request.json()
+    except Exception:
+        raise HTTPException(status_code=400, detail="Malformed request body")
+    try:
+        return VideoCreateParams.model_validate(data)
+    except ValidationError as e:
+        raise HTTPException(status_code=400, detail=str(e))
+
+
+@router.post("/v1/videos")
+async def create_video(request: Request):
+    manager = _get_video_manager()
+    params = await _parse_create_body(request)
+
+    pool = _get_engine_pool()
+    resolved = _resolve_model(params.model)
+    entry = pool.get_entry(resolved) if
hasattr(pool, "get_entry") else None + if entry is None: + raise HTTPException( + status_code=404, + detail=f"Model '{params.model}' not found", + ) + if getattr(entry, "model_type", "") != "video": + raise HTTPException( + status_code=400, + detail=( + f"Model '{params.model}' is not a video generation model " + f"(model_type={getattr(entry, 'model_type', '?')})" + ), + ) + + ok, reason = manager.guard_available() + if not ok: + raise HTTPException(status_code=503, detail=reason) + venv_ok, venv_reason = await manager.probe_worker_venv() + if not venv_ok: + raise HTTPException(status_code=503, detail=venv_reason) + + from omlx.server import _server_state + + video_settings = _server_state.global_settings.video + normalized = _normalize_params(params, video_settings) + + from omlx.video.manager import QueueFullError, VideoJob + + job = VideoJob( + id=f"video_{uuid.uuid4().hex}", + model_id=resolved, + model_dir=str(entry.model_path), + params=normalized, + ) + try: + await manager.submit(job) + except QueueFullError as e: + raise HTTPException(status_code=503, detail=str(e)) + _record_video_request(resolved) + return job.to_dict() + + +@router.get("/v1/videos") +async def list_videos( + limit: int = 20, after: str | None = None, order: str = "desc" +): + manager = _get_video_manager() + limit = max(1, min(int(limit), 100)) + if order not in ("asc", "desc"): + raise HTTPException(status_code=400, detail="order must be asc|desc") + page, has_more = manager.list_jobs(limit=limit, after=after, order=order) + data = [j.to_dict() for j in page] + return { + "object": "list", + "data": data, + "has_more": has_more, + "first_id": data[0]["id"] if data else None, + "last_id": data[-1]["id"] if data else None, + } + + +@router.get("/v1/videos/{video_id}") +async def get_video(video_id: str): + manager = _get_video_manager() + job = manager.get(video_id) + if job is None: + raise HTTPException(status_code=404, detail=f"Video '{video_id}' not found") + return job.to_dict() + + 
+@router.get("/v1/videos/{video_id}/content") +async def get_video_content(video_id: str): + manager = _get_video_manager() + job = manager.get(video_id) + if job is None: + raise HTTPException(status_code=404, detail=f"Video '{video_id}' not found") + if job.status != "completed": + raise HTTPException( + status_code=409, + detail=f"Video '{video_id}' is {job.status}, content not available", + ) + if not job.artifact_path or not Path(job.artifact_path).exists(): + # Artifact purged by retention (spec 4.3): record outlives the blob. + raise HTTPException( + status_code=404, + detail={ + "code": "artifact_expired", + "message": ( + f"The artifact for '{video_id}' was purged by the " + "retention policy" + ), + "expires_at": int(job.expires_at) if job.expires_at else None, + }, + ) + return FileResponse( + job.artifact_path, + media_type="video/mp4", + filename=f"{video_id}.mp4", + ) + + +@router.delete("/v1/videos/{video_id}") +async def delete_video(video_id: str): + manager = _get_video_manager() + deleted = await manager.delete(video_id) + if not deleted: + raise HTTPException(status_code=404, detail=f"Video '{video_id}' not found") + return {"id": video_id, "object": "video.deleted", "deleted": True} diff --git a/omlx/cli.py b/omlx/cli.py index 75669ff6b..e4d5aea5d 100644 --- a/omlx/cli.py +++ b/omlx/cli.py @@ -116,6 +116,14 @@ def serve_command(args): if settings.huggingface.endpoint: os.environ["HF_ENDPOINT"] = settings.huggingface.endpoint + # Disable the Xet transfer backend if configured. huggingface_hub reads + # HF_HUB_DISABLE_XET into constants at import time, so it must be set + # here -- before any huggingface_hub import -- and cannot be toggled + # per-download. Xet (cas-bridge.xethub.hf.co) is unreachable from some + # networks (observed: mainland China); the plain LFS path works. + if settings.huggingface.disable_xet: + os.environ["HF_HUB_DISABLE_XET"] = "1" + # Apply ModelScope endpoint if configured. 
The modelscope SDK builds its URL # as https://, so this must be a BARE host -- a full URL # like "https://modelscope.cn" becomes "https://https://modelscope.cn" and diff --git a/omlx/engine_pool.py b/omlx/engine_pool.py index 02719a633..ef89bde74 100644 --- a/omlx/engine_pool.py +++ b/omlx/engine_pool.py @@ -38,6 +38,7 @@ ModelLoadingError, ModelNotFoundError, ModelTooLargeError, + ModelTypeNotLoadableError, ) from .model_discovery import DiscoveredModel, discover_models, format_size from .engine_core import get_mlx_executor @@ -53,8 +54,8 @@ class EngineEntry: model_id: str # Directory name (e.g., "llama-3b") model_path: str # Full path to model directory - model_type: Literal["llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts"] # Model type - engine_type: Literal["batched", "simple", "embedding", "reranker", "vlm", "audio_stt", "audio_tts", "audio_sts"] # Engine type to use + model_type: Literal["llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts", "video"] # Model type + engine_type: Literal["batched", "simple", "embedding", "reranker", "vlm", "audio_stt", "audio_tts", "audio_sts", "video"] # Engine type to use estimated_size: int # Pre-calculated from safetensors (bytes) config_model_type: str = "" # Raw model_type from config.json (e.g., "deepseekocr_2") thinking_default: bool | None = None # True if model thinks by default, False if not, None if unknown @@ -208,6 +209,7 @@ def discover_models( "audio_stt": "audio_stt", "audio_tts": "audio_tts", "audio_sts": "audio_sts", + "video": "video", } def apply_settings_overrides( @@ -336,6 +338,14 @@ async def get_engine( if not entry: raise ModelNotFoundError(model_id, list(self._entries.keys())) + # Video models are job-managed (POST /v1/videos) and never + # pool-loaded. Reject BEFORE the admission loop below -- letting + # a 42GB video entry into admission would evict resident LLM + # engines before failing (docs/video-generation-engine-spec.md + # section 3). 
+ if entry.model_type == "video": + raise ModelTypeNotLoadableError(model_id, entry.model_type) + # Already loaded - just update access time if entry.engine is not None: # If force_lm requested but current engine is VLM, unload and reload @@ -661,6 +671,11 @@ async def _load_engine(self, model_id: str, force_lm: bool = False) -> None: model_name=entry.model_path, config_model_type=entry.config_model_type, ) + elif entry.engine_type == "video": + # Defense in depth: get_engine rejects video entries + # before admission; this arm catches any other caller + # so a diffusers dir never falls into BatchedEngine. + raise ModelTypeNotLoadableError(model_id, entry.model_type) else: engine = BatchedEngine( model_name=entry.model_path, diff --git a/omlx/exceptions.py b/omlx/exceptions.py index 71aeea083..160e8e1c8 100644 --- a/omlx/exceptions.py +++ b/omlx/exceptions.py @@ -426,6 +426,25 @@ def __init__(self, model_id: str): super().__init__(f"Model '{model_id}' is already being loaded") +class ModelTypeNotLoadableError(EnginePoolError): + """Raised when a model type is not pool-loadable (e.g. video models). + + Video generation models are job-managed by the VideoJobManager and are + never loaded into the engine pool. Raised by EnginePool.get_engine + BEFORE the memory-admission loop so a misrouted request cannot evict + resident LLM engines (docs/video-generation-engine-spec.md section 3). + The server layer maps this to HTTP 400 with an endpoint hint. + """ + + def __init__(self, model_id: str, model_type: str): + self.model_id = model_id + self.model_type = model_type + super().__init__( + f"Model '{model_id}' is a {model_type} generation model and " + "cannot be loaded as an inference engine. Use POST /v1/videos." 
+ ) + + # ============================================================================= # MCP Errors # ============================================================================= diff --git a/omlx/model_discovery.py b/omlx/model_discovery.py index 731ecf2f5..6637cbb3d 100644 --- a/omlx/model_discovery.py +++ b/omlx/model_discovery.py @@ -23,8 +23,15 @@ logger = logging.getLogger(__name__) -ModelType = Literal["llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts"] -EngineType = Literal["batched", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts"] +ModelType = Literal["llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts", "video"] +EngineType = Literal["batched", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts", "video"] + +# Diffusers pipeline classes (model_index.json "_class_name") that fmlx can +# serve via the video engine (docs/video-generation-engine-spec.md). Unknown +# pipeline classes are skipped at discovery -- registering them would produce +# unloadable entries, and historically the org-folder descent turned their +# component subdirs into phantom "llm" models. +VIDEO_PIPELINE_CLASSES = {"WanPipeline"} # Known VLM (Vision-Language Model) types from mlx-vlm VLM_MODEL_TYPES = { @@ -401,6 +408,14 @@ def detect_model_type(model_path: Path) -> ModelType: Returns: Model type: "llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", or "audio_sts" """ + # Diffusers-layout video models: model_index.json at the root, no root + # config.json. Must run before the missing-config.json fallback below. + # Unknown pipeline classes never reach this point for registration -- + # _register_model skips them outright. 
+ pipeline_class = read_model_index_pipeline_class(model_path) + if pipeline_class in VIDEO_PIPELINE_CLASSES: + return "video" + config_path = model_path / "config.json" if not config_path.exists(): return "llm" @@ -694,9 +709,35 @@ def _is_adapter_dir(path: Path) -> bool: return (path / "adapter_config.json").exists() +def read_model_index_pipeline_class(path: Path) -> str | None: + """Return the "_class_name" from a diffusers model_index.json, else None. + + Diffusers-layout models (e.g. Wan2.2 T2V) have model_index.json at the + root and no root config.json. + """ + index_path = path / "model_index.json" + if not index_path.exists(): + return None + try: + with open(index_path) as f: + value = json.load(f).get("_class_name") + return value if isinstance(value, str) else None + except (json.JSONDecodeError, OSError): + return None + + def _is_model_dir(path: Path) -> bool: - """Check if a directory contains a valid model (has config.json).""" - return (path / "config.json").exists() and not _is_adapter_dir(path) + """Check if a directory contains a valid model. + + A model root has either config.json (transformers layout) or + model_index.json (diffusers layout). The model_index.json check must + live here -- it is what stops the org-folder descent in + discover_models() from registering diffusers component subdirs + (transformer/, vae/, ...) as phantom standalone models. + """ + if _is_adapter_dir(path): + return False + return (path / "config.json").exists() or (path / "model_index.json").exists() def _resolve_hf_cache_entry(path: Path) -> tuple[Path, str] | None: @@ -734,6 +775,24 @@ def _register_model( logger.info(f"Skipping unsupported model: {model_id}") return + # Diffusers-layout dirs whose pipeline class fmlx cannot serve are + # skipped outright -- registering them would produce unloadable + # entries (docs/video-generation-engine-spec.md section 4.1). 
This + # includes model_index.json files with a missing/unreadable + # _class_name (pipeline_class None): without a root config.json + # such a dir would otherwise register as an unloadable llm entry. + pipeline_class = read_model_index_pipeline_class(model_dir) + if ( + (model_dir / "model_index.json").exists() + and not (model_dir / "config.json").exists() + and pipeline_class not in VIDEO_PIPELINE_CLASSES + ): + logger.warning( + f"Skipping unsupported diffusers pipeline " + f"'{pipeline_class}': {model_id}" + ) + return + model_type = detect_model_type(model_dir) if model_type == "embedding": engine_type: EngineType = "embedding" @@ -747,18 +806,25 @@ def _register_model( engine_type = "audio_tts" elif model_type == "audio_sts": engine_type = "audio_sts" + elif model_type == "video": + engine_type = "video" else: engine_type = "batched" estimated_size = estimate_model_size(model_dir) - # Read raw config model_type for sub-type detection (e.g., OCR models) + # Read raw config model_type for sub-type detection (e.g., OCR models). + # Video models have no root config.json; surface the diffusers + # pipeline class instead so the admin UI shows something meaningful. 
config_model_type = "" - try: - import json - with open(model_dir / "config.json") as f: - config_model_type = json.load(f).get("model_type", "") - except Exception: - pass + if model_type == "video": + config_model_type = pipeline_class or "" + else: + try: + import json + with open(model_dir / "config.json") as f: + config_model_type = json.load(f).get("model_type", "") + except Exception: + pass thinking_default = detect_thinking_default(model_dir) preserve_thinking_default = detect_preserve_thinking(model_dir) diff --git a/omlx/process_memory_enforcer.py b/omlx/process_memory_enforcer.py index 7d11fee31..6b9cdc781 100644 --- a/omlx/process_memory_enforcer.py +++ b/omlx/process_memory_enforcer.py @@ -33,6 +33,7 @@ import logging import subprocess import sys +from collections import deque from typing import TYPE_CHECKING, Any import mlx.core as mx @@ -291,6 +292,7 @@ def __init__( hard_threshold: float = 0.95, prefill_safe_zone_ratio: float = 0.80, prefill_min_chunk_tokens: int = 32, + prefill_transient_margin_gb: float = 0.0, ): """ Initialize the process memory enforcer. @@ -317,6 +319,11 @@ def __init__( prefill_safe_zone_ratio: Fraction of hard cap below which prefill runs at full chunk size; above triggers adaptive shrink. prefill_min_chunk_tokens: Floor for adaptive shrink. + prefill_transient_margin_gb: Conservative margin added to the + modelled per-chunk prefill peak by the scheduler's + forward-front gate, covering the MoE expert-dequant activation + spike that estimate_prefill_peak_bytes does not model. + Propagated to each scheduler. 0 = no extra margin. 
""" self._engine_pool = engine_pool self._memory_guard_tier = self._normalize_tier(memory_guard_tier) @@ -331,6 +338,9 @@ def __init__( self._hard_threshold = hard_threshold self._prefill_safe_zone_ratio = prefill_safe_zone_ratio self._prefill_min_chunk_tokens = prefill_min_chunk_tokens + self._prefill_transient_margin_bytes = max( + 0, int(prefill_transient_margin_gb * 1024**3) + ) self._task: asyncio.Task | None = None self._running = False # Most recently observed pressure level, consumed by scheduler / @@ -340,6 +350,21 @@ def __init__( # or the call failed). Used by the admin dashboard to surface a # warning when the kernel iogpu.wired_limit_mb is below this. self._metal_wired_limit_request: int = 0 + # Rolling window of recent usage readings + their high-water mark. + # Prefill memory dips into a trough between chunks, so the instant + # reading can read low mid-prefill; preflight admission consults this + # peak instead so it does not wave through a request that will wall + # the next chunk. Updated on every poll iteration. + self._usage_window: deque[int] = deque(maxlen=5) + self._recent_peak_bytes: int = 0 + # Video job memory lease (docs/video-generation-engine-spec.md 4.4). + # While held, the lease is subtracted from the final ceiling so pool + # admission, watermarks and the prefill gate all tighten coherently. + # The worker pid lets the dynamic ceiling count the subprocess + # exactly once (its real usage drains system free pages, which + # would otherwise stack on top of the explicit lease). + self._video_lease_bytes: int = 0 + self._video_worker_pid: int | None = None @staticmethod def _normalize_tier(tier: str) -> str: @@ -463,17 +488,31 @@ def _get_dynamic_ceiling(self) -> int: if self._memory_guard_tier == "custom": return max(0, self._memory_guard_custom_ceiling_bytes) + # Video worker correction: the worker's real usage drains system + # free pages, shrinking this ceiling -- but the lease is ALREADY + # subtracted in _get_hard_limit_bytes. 
Add the worker's footprint + # back (clamped to the lease) so it is counted exactly once. A + # footprint read of 0 (failure) degrades to double-counting, which + # is fail-conservative. + worker_extra = 0 + if self._video_worker_pid is not None and self._video_lease_bytes > 0: + worker = get_phys_footprint(self._video_worker_pid) + if worker > 0: + worker_extra = min(worker, self._video_lease_bytes) + omlx_usage = get_phys_footprint() stats = get_macos_vm_stats() if stats is None: - return max(0, omlx_usage + psutil.virtual_memory().available) + return max( + 0, omlx_usage + worker_extra + psutil.virtual_memory().available + ) ratio = _ACTIVE_RECLAIM_RATIO[self._memory_guard_tier] reclaimable = ( stats["free"] + stats["inactive"] + int(stats["active"] * ratio) ) - return max(0, omlx_usage + reclaimable) + return max(0, omlx_usage + worker_extra + reclaimable) def _get_hard_limit_bytes(self) -> int: """Final hard ceiling = min(static, dynamic, metal_cap). @@ -497,12 +536,85 @@ def _get_hard_limit_bytes(self) -> int: metal_cap = get_effective_metal_cap_bytes() if metal_cap > 0: candidates.append(metal_cap) - return min(candidates) + ceiling = min(candidates) + if self._video_lease_bytes > 0: + # Clamp to >= 1, never 0: every consumer treats ceiling 0 as + # "guard disabled", which would drop all protection exactly + # while a video job holds memory. A 1-byte ceiling instead + # pauses admission and trips the gate -- the safe direction. 
+ return max(1, ceiling - self._video_lease_bytes) + return ceiling def get_final_ceiling(self) -> int: """Public accessor used by engine_pool pre-load admission.""" return self._get_hard_limit_bytes() + def recent_peak_bytes(self) -> int: + """Recent high-water memory usage over the last few poll ticks.""" + return self._recent_peak_bytes + + @property + def video_lease_bytes(self) -> int: + """Currently held video memory lease (0 when none).""" + return self._video_lease_bytes + + def acquire_video_lease(self, lease_bytes: int) -> None: + """Reserve memory for a video worker job. + + Subtracts the lease from the final ceiling (single choke point: + pool admission, soft/hard watermarks, admission_paused and the + prefill gate cap all derive from it) and lowers this process's + Metal wired limit so parent + worker wired sets cannot stack + toward the machine cap. One lease at a time -- the VideoJobManager + serializes jobs (docs/video-generation-engine-spec.md 4.4). + + Raises: + RuntimeError: If a lease is already held. + ValueError: If lease_bytes is not positive. 
+ """ + if lease_bytes <= 0: + raise ValueError(f"lease_bytes must be positive, got {lease_bytes}") + if self._video_lease_bytes > 0: + raise RuntimeError( + "A video memory lease is already held " + f"({_format_gb(self._video_lease_bytes)})" + ) + self._video_lease_bytes = int(lease_bytes) + if self._prefill_memory_guard: + target = max(1, self._get_static_ceiling() - self._video_lease_bytes) + _apply_metal_wired_limit(target) + self._metal_wired_limit_request = target + if self._running: + self._propagate_memory_limit() + logger.info( + "[videolease] acquired %s (ceiling now %s)", + _format_gb(self._video_lease_bytes), + _format_gb(self._get_hard_limit_bytes()), + ) + + def set_video_worker_pid(self, pid: int | None) -> None: + """Bind the running video worker pid for dynamic-ceiling correction.""" + self._video_worker_pid = pid + + def release_video_lease(self) -> None: + """Release the video memory lease and restore the Metal wired limit.""" + if self._video_lease_bytes <= 0: + return + released = self._video_lease_bytes + self._video_lease_bytes = 0 + self._video_worker_pid = None + if self._prefill_memory_guard: + static_ceiling = self._get_static_ceiling() + _apply_metal_wired_limit(static_ceiling) + self._metal_wired_limit_request = static_ceiling + if self._running: + self._propagate_memory_limit() + logger.info( + "[videolease] released %s (ceiling now %s)", + _format_gb(released), + _format_gb(self._get_hard_limit_bytes()), + ) + def _soft_bytes(self) -> int: """Soft watermark: ceiling * soft_threshold.""" ceiling = self._get_hard_limit_bytes() @@ -589,6 +701,10 @@ def _propagate_memory_limit(self) -> None: scheduler._admission_paused = admission_paused scheduler._prefill_safe_zone_ratio = self._prefill_safe_zone_ratio scheduler._prefill_min_chunk_tokens = self._prefill_min_chunk_tokens + scheduler._prefill_transient_margin_bytes = ( + self._prefill_transient_margin_bytes + ) + scheduler._memory_recent_peak_bytes = self._recent_peak_bytes bg = 
getattr(scheduler, "batch_generator", None) if bg is not None and hasattr(bg, "_memory_limit_bytes"): bg._memory_limit_bytes = soft_limit @@ -671,6 +787,8 @@ async def _check_and_enforce(self) -> None: return current = self._current_usage_bytes() + self._usage_window.append(current) + self._recent_peak_bytes = max(self._usage_window) if self._usage_window else current soft = int(ceiling * self._soft_threshold) hard = int(ceiling * self._hard_threshold) prev_level = self._pressure_level diff --git a/omlx/scheduler.py b/omlx/scheduler.py index 253845bf9..148986c12 100644 --- a/omlx/scheduler.py +++ b/omlx/scheduler.py @@ -796,10 +796,26 @@ def __init__( # soft_threshold. Schedulers stop admitting new prefills while this is # set; in-flight requests proceed. self._admission_paused: bool = False + # Recent high-water memory usage, propagated from ProcessMemoryEnforcer. + # Preflight admission maxes the instant reading against this so it does + # not wave through a request during a prefill trough that would wall + # the next chunk. 0 until the enforcer sets it. + self._memory_recent_peak_bytes: int = 0 # Adaptive prefill throttle params, propagated from enforcer. # Until set, _adaptive_chunk_size is a no-op (returns requested as-is). self._prefill_safe_zone_ratio: float = 0.80 self._prefill_min_chunk_tokens: int = 32 + # Conservative transient margin (bytes) added to the modelled per-chunk + # prefill peak by the forward-front gate (_prefill_forward_gate). + # estimate_prefill_peak_bytes only models KV + SDPA; it does NOT model + # the MoE expert-dequant activation spike, which on glm4.5-air-106b + # (MoE) is the dominant single-step transient. Sized from the observed + # worst-case single-step current jump on m5max (see MemorySettings + # .prefill_transient_margin_gb). 0 until the enforcer propagates it. 
+ self._prefill_transient_margin_bytes: int = 0 + # One-shot guard for _log_prefill_gate_state_once (loud resolved-state + # log so a mis-propagated margin can't ship silently inert again). + self._prefill_gate_state_logged: bool = False # EWMA estimator of per-token chunk transient bytes, used by # _adaptive_chunk_size in the caution zone. Owned per-scheduler. _tracker_model_id = "" @@ -1729,6 +1745,153 @@ def _apply_turboquant_kv_convert(self, prompt_cache: list[Any]) -> None: f"cache layers to {bits}-bit{skip_msg}" ) + def _prefill_forward_gate( + self, + chunk_tokens: int, + *, + request_id: str, + loop_label: str, + ) -> None: + """Forward-FRONT memory gate: refuse a prefill chunk BEFORE it runs. + + PRIMARY protection against a single request's prefill transient breaching + the Metal cap and kernel-panicking the box. The legacy chunk-END check + (after self.model(...) + mx.eval) only fires once the transient has + already been allocated -- on Apple Silicon a chunk that overshoots the + cap panics the whole machine, so the after-the-fact check never runs. + This predicts the next chunk's peak and raises BEFORE the forward when it + would exceed the cap; the call-site handler in _schedule_waiting catches + the RuntimeError, _sync_and_clear_cache()s the accumulated KV, and emits + a finish_reason="error" output instead of crashing. + + predicted_peak = current(phys high-water) + estimate(optional) + margin + - current: max(active, phys_footprint, recent_peak) -- ALL three are + LIVE production signals (the same readings [memcheck:external] and + the enforcer use). recent_peak is the enforcer's rolling high-water, + so a mid-prefill trough in the instant reading does not mask the + real footprint. + - estimate: OPTIONAL. memory_monitor.estimate_prefill_peak_bytes models + this chunk's KV + SDPA. 
In production scheduler.memory_monitor is + never wired (see estimate-guards-inert finding), so estimate is 0 and + the gate is phys+margin only -- which is correct, because at chunk + granularity the KV+SDPA term is tiny and the margin dominates anyway. + - margin: _prefill_transient_margin_bytes is the real safety mechanism. + It covers the un-modelled MoE expert-dequant spike (the dominant + single-step transient on glm4.5-air). CRITICALLY it is propagated from + the ENFORCER (live), not the memory_monitor (dead), so unlike the + estimate this gate's safety actually fires in production. Sized so + margin >= worst-case single-step transient (see settings). + + Trip point: with estimate~0, the gate fires once current > cap - margin. + Functional residual: on a model that fills most of the cap (glm4.5-air, + 85GB on 128GB), a long prompt is refused cleanly (503-class) once its + accumulated KV approaches the headroom -- the correct behaviour (refuse + the request, do not crash the box); fit a longer context by using a + smaller quant. See MemorySettings.prefill_transient_margin_gb. + + No-op (returns) only when the guard is off, the hard limit is unset, or + chunk_tokens <= 0. NOT a no-op when the monitor/estimate is missing -- + that is the whole point of being phys-based. + + Raises: + RuntimeError: when the predicted peak exceeds the hard limit. + """ + if not self._prefill_memory_guard: + return + if self._memory_hard_limit_bytes <= 0: + return + if chunk_tokens <= 0: + return + + # Emit the resolved gate state ONCE, before it can matter, so a + # mis-propagated margin (the exact silent failure that made the prior + # monitor-based gate inert) is visible in the log instead of shipping + # blind. See _log_prefill_gate_state_once. + self._log_prefill_gate_state_once() + + # Estimate is OPTIONAL (monitor is unwired in production). Phys reading + # + the enforcer-propagated margin carry the guarantee.
+ estimate = 0 + if self.memory_monitor is not None: + estimate = self.memory_monitor.estimate_prefill_peak_bytes( + chunk_tokens, self.config.prefill_step_size + ) + + predicted_transient = estimate + self._prefill_transient_margin_bytes + current = max( + mx.get_active_memory(), + get_phys_footprint(), + self._memory_recent_peak_bytes, + ) + predicted_peak = current + predicted_transient + + if predicted_peak > self._memory_hard_limit_bytes: + logger.warning( + "[memgate:%s] rid=%s refusing prefill chunk (n=%d) BEFORE " + "forward: predicted peak %.3fGB = current %.3fGB + transient " + "%.3fGB (estimate %.3fGB + margin %.3fGB) exceeds hard cap " + "%.3fGB. Aborting request to avoid a Metal-cap kernel panic.", + loop_label, + request_id, + chunk_tokens, + predicted_peak / 1024**3, + current / 1024**3, + predicted_transient / 1024**3, + estimate / 1024**3, + self._prefill_transient_margin_bytes / 1024**3, + self._memory_hard_limit_bytes / 1024**3, + ) + raise RuntimeError( + "Prefill refused before forward: predicted peak " + f"{predicted_peak / 1024**3:.1f}GB (current " + f"{current / 1024**3:.1f}GB + transient " + f"{predicted_transient / 1024**3:.1f}GB) would exceed the " + f"memory ceiling {self._memory_hard_limit_bytes / 1024**3:.1f}GB. " + "Reduce context length or increase --max-process-memory." + ) + + def _log_prefill_gate_state_once(self) -> None: + """Log the resolved prefill-gate configuration exactly once. + + The prior monitor-based gate shipped INERT and SILENT (its memory_monitor + was never wired, so it no-op'd with no signal -- found only on hardware). + This makes the resolved state loud, the first time the gate runs, so the + one dependency that still matters -- the margin propagated from the + enforcer -- is visible instead of shipping blind. A margin of 0 degrades + the gate to the bare cap check; that is surfaced as a WARNING, not a + silent no-op. 
+ """ + if getattr(self, "_prefill_gate_state_logged", False): + return + self._prefill_gate_state_logged = True + + estimator_live = False + if self.memory_monitor is not None: + try: + estimator_live = ( + self.memory_monitor.estimate_prefill_peak_bytes( + self.config.prefill_step_size, + self.config.prefill_step_size, + ) + > 0 + ) + except Exception: + estimator_live = False + + margin_bytes = self._prefill_transient_margin_bytes + emit = logger.warning if margin_bytes <= 0 else logger.info + emit( + "[memgate] prefill forward gate ACTIVE (phys-based): " + "margin=%.1fGB, cap=%.1fGB, model-dim estimator=%s%s", + margin_bytes / 1024**3, + self._memory_hard_limit_bytes / 1024**3, + "active" if estimator_live + else "DISABLED (phys+margin only)", + " -- WARNING: margin=0, gate degraded to the bare cap check" + if margin_bytes <= 0 + else "", + ) + def _do_external_prefill( self, request: "Request", @@ -1885,6 +2048,15 @@ def _do_external_prefill( extra_kwargs, n_to_process ) + # Forward-FRONT gate: predict this chunk's peak and refuse BEFORE + # the forward if it would breach the Metal cap (post-forward checks + # cannot save us -- the overshoot kernel-panics the machine). + self._prefill_forward_gate( + n_to_process, + request_id=request.request_id, + loop_label="external", + ) + _throttle_pre = get_phys_footprint() self.model(input_arr[:, :n_to_process], cache=prompt_cache, **model_kwargs) mx.eval([c.state for c in prompt_cache]) @@ -2223,6 +2395,17 @@ def _step_prefill_chunk(self, state: _PrefillState) -> bool: chunk = state.tokens_remaining[:, :n] state.tokens_remaining = state.tokens_remaining[:, n:] + + # Forward-FRONT gate: predict this chunk's peak and refuse BEFORE the + # forward if it would breach the Metal cap. Mirrors the external loop; + # raises RuntimeError that _advance_chunked_prefills converts into a + # finish_reason="error" output without crashing the machine. 
+ self._prefill_forward_gate( + n, + request_id=state.request.request_id, + loop_label="chunked_step", + ) + _throttle_pre = get_phys_footprint() self.model(chunk, cache=state.cache) mx.eval([c.state for c in state.cache]) @@ -4510,6 +4693,16 @@ def _preflight_memory_check(self, request: "Request") -> str | None: """ Estimate whether prefill would exceed memory limits. + NOTE: this guard is monitor-DEPENDENT and therefore INERT in production + -- scheduler.memory_monitor is never wired, so estimate==0 and this + returns None (no rejection) for every request (see the + estimate-guards-inert finding). It is kept for the test-injected-monitor + path and as documentation; the live single-request protection is the + phys-based _prefill_forward_gate. A phys-only version here would buy + nothing: at admission time current is the idle baseline (~weights), well + below cap - margin, so it would never reject. Do not mistake this for an + active guard. + Computes worst-case peak memory for the last prefill chunk (model weights + KV cache + SDPA attention matrix) and rejects if it would exceed the hard limit. 
@@ -4541,7 +4734,11 @@ def _preflight_memory_check(self, request: "Request") -> str | None: if peak == 0: return None # can't estimate, skip - current = max(mx.get_active_memory(), get_phys_footprint()) + current = max( + mx.get_active_memory(), + get_phys_footprint(), + self._memory_recent_peak_bytes, + ) if current + peak > self._memory_hard_limit_bytes: from .utils.hardware import format_bytes diff --git a/omlx/server.py b/omlx/server.py index 0d5df6745..2db9f33dd 100644 --- a/omlx/server.py +++ b/omlx/server.py @@ -165,6 +165,7 @@ ModelLoadingError, ModelNotFoundError, ModelTooLargeError, + ModelTypeNotLoadableError, SchedulerQueueFullError, ) from .model_discovery import format_size @@ -227,6 +228,7 @@ class ServerState: responses_store: ResponseStore = field(default_factory=ResponseStore) oq_manager: Optional[object] = None # OQManager hf_uploader: Optional[object] = None # HFUploader + video_job_manager: Optional[object] = None # VideoJobManager # Global server state instance @@ -359,12 +361,29 @@ async def lifespan(app: FastAPI): hard_threshold=mem_cfg.hard_threshold, prefill_safe_zone_ratio=mem_cfg.prefill_safe_zone_ratio, prefill_min_chunk_tokens=mem_cfg.prefill_min_chunk_tokens, + prefill_transient_margin_gb=mem_cfg.prefill_transient_margin_gb, ) _server_state.process_memory_enforcer = enforcer _server_state.engine_pool._process_memory_enforcer = enforcer _server_state.engine_pool._get_final_ceiling = enforcer.get_final_ceiling enforcer.start() + # Video job manager -- constructed AFTER the enforcer so the memory + # lease can be constructor-injected (testability seam, spec 4.2). + # Cheap when video is disabled: no worker spawns until a job arrives. 
+ if _server_state.global_settings is not None: + from .video.manager import VideoJobManager + + try: + _server_state.video_job_manager = VideoJobManager( + settings=_server_state.global_settings.video, + base_path=_server_state.global_settings.base_path, + enforcer=_server_state.process_memory_enforcer, + ) + except Exception as e: # noqa: BLE001 -- never block serving on video + logger.warning(f"Video job manager unavailable: {e}") + _server_state.video_job_manager = None + # Start TTL-only checker if process memory enforcer is not running # (enforcer already includes TTL checks in its polling loop) ttl_task = None @@ -398,6 +417,13 @@ async def _ttl_check_loop(): # Shutdown: Save all-time stats, stop TTL task, process memory enforcer, etc. get_server_metrics().save_alltime() + # isinstance (not None-check): tests patch _server_state wholesale and + # auto-created mock attributes are not awaitable. + from .video.manager import VideoJobManager as _VideoJobManager + if isinstance(_server_state.video_job_manager, _VideoJobManager): + await _server_state.video_job_manager.shutdown() + _server_state.video_job_manager = None + logger.info("Video job manager stopped") if ttl_task is not None: ttl_task.cancel() try: @@ -446,6 +472,13 @@ async def _ttl_check_loop(): except ImportError: pass +# Video routes are mounted unconditionally -- a settings-driven gate cannot +# live here because settings are not initialized at import time (the audio +# gate above only works because it tests import availability). Each handler +# gates on settings.video.enabled / manager presence / worker venv instead. 
+from .api.video_routes import router as video_router +app.include_router(video_router, dependencies=[Depends(verify_api_key)]) + # Include admin routes from .admin.routes import router as admin_router, set_admin_getters from .admin.auth import _RedirectToLogin @@ -690,15 +723,41 @@ async def get_engine( # Resolve alias to real model_id model_id = pool.resolve_model_id(model_id, _server_state.settings_manager) + # Video models are job-managed; reject BEFORE pool.get_engine so the + # 42GB entry never enters the admission/eviction loop (a misrouted + # chat request must not evict resident LLMs). Spec section 3. + _entry = pool.get_entry(model_id) + if _entry is not None and getattr(_entry, "model_type", "") == "video": + raise HTTPException( + status_code=400, + detail=( + f"Model '{model_id}' is a video generation model. " + "Use POST /v1/videos." + ), + ) + try: engine = await pool.get_engine(model_id) except ModelNotFoundError as e: - # Fallback to default model if enabled (LLM only) + # Fallback to default model if enabled (LLM only). The default can + # still be set to a non-chat model via admin/settings; verify its + # type before retrying (spec 4.1 fallback hygiene). + _default_entry = ( + pool.get_entry(_server_state.default_model) + if _server_state.default_model + else None + ) + _default_type = getattr(_default_entry, "model_type", None) if ( engine_type == EngineType.LLM and _server_state.global_settings and _server_state.global_settings.model.model_fallback and _server_state.default_model + # Block fallback only onto a KNOWN non-chat type; unknown + # entries (or non-string types from test doubles) preserve + # the old fallback behavior. 
+ and (not isinstance(_default_type, str) + or _default_type in ("llm", "vlm")) ): logger.info( f"Model '{model_id}' not found, falling back to " @@ -729,6 +788,9 @@ async def get_engine( raise HTTPException(status_code=507, detail=str(e)) except ModelLoadingError as e: raise HTTPException(status_code=409, detail=str(e)) + except ModelTypeNotLoadableError as e: + # Defense in depth: the pre-pool check above normally catches this + raise HTTPException(status_code=400, detail=str(e)) except EnginePoolError as e: raise HTTPException(status_code=500, detail=str(e)) @@ -1274,8 +1336,16 @@ def init_server( f"No models found in {', '.join(dir_list)}. Add models to serve them." ) - # Set default model (from settings file, fallback to first model) + # Set default model (from settings file, fallback to first model). + # Implicit selection filters to chat-capable types so a video (or + # embedding/audio) model that sorts first never becomes the target of + # model-less chat requests (spec 4.1 default-model hygiene). 
available_models = _server_state.engine_pool.get_model_ids() + + def _chat_capable(mid: str) -> bool: + entry = _server_state.engine_pool.get_entry(mid) + return entry is not None and entry.model_type in ("llm", "vlm") + if available_models: if settings_default: if settings_default in available_models: @@ -1284,9 +1354,13 @@ def init_server( logger.warning( f"Default model '{settings_default}' not found, using first model" ) - _server_state.default_model = available_models[0] + _server_state.default_model = next( + (m for m in available_models if _chat_capable(m)), None + ) else: - _server_state.default_model = available_models[0] + _server_state.default_model = next( + (m for m in available_models if _chat_capable(m)), None + ) else: _server_state.default_model = None @@ -1717,6 +1791,7 @@ async def list_models(_: bool = Depends(verify_api_key)) -> ModelsResponse: ModelInfo( id=display_id, owned_by="omlx", + model_type=m.get("model_type", "llm"), ) ) @@ -1773,6 +1848,16 @@ async def load_model_public(model_id: str, _: bool = Depends(verify_api_key)): entry = _server_state.engine_pool.get_entry(model_id) if entry is None: raise HTTPException(status_code=404, detail=f"Model not found: {model_id}") + if entry.model_type == "video": + # Pre-pool check: the blanket except below would swallow the typed + # rejection into a 500 (spec 4.1). + raise HTTPException( + status_code=400, + detail=( + f"Model '{model_id}' is a video generation model and is " + "not pool-loaded. Use POST /v1/videos." + ), + ) if entry.engine is not None: return {"status": "ok", "model_id": model_id, "message": f"Already loaded: {model_id}"} diff --git a/omlx/settings.py b/omlx/settings.py index adba2a6d8..b818d835e 100644 --- a/omlx/settings.py +++ b/omlx/settings.py @@ -391,6 +391,33 @@ class MemorySettings: # aborted via the same cleanup path the hard-limit RuntimeError uses. 
prefill_safe_zone_ratio: float = 0.80 prefill_min_chunk_tokens: int = 32 + # Conservative transient margin used by the scheduler's forward-FRONT memory + # gate (_prefill_forward_gate). The gate is PHYS-based: it refuses a prefill + # chunk before its forward when current(max active/phys/recent_peak) + this + # margin would breach the hard cap, so the transient never lands on the Metal + # ceiling (which would kernel-panic the whole machine -- an after-the-fact + # Python check cannot catch it). The model-dim estimate is optional and, in + # production, absent (scheduler.memory_monitor is never wired), so this margin + # IS the safety mechanism -- it is propagated from the ENFORCER (live), unlike + # the dead monitor. + # + # The load-bearing guarantee: margin >= the worst-case single-step transient. + # That transient (chiefly MoE expert-dequant on glm4.5-air) is SUB-POLL -- + # faster than the enforcer's 1s sample -- so it is invisible to every memory + # read and MUST be carried here, not by reading the footprint more cleverly. + # On 2026-06-06 m5max a single glm4.5-air-106b prefill peaked at 110.4GB vs a + # 107.5GB cap, an effective transient up to ~10.6GB above the pre-step + # baseline; margin 10 was too small. 12 = ceil(10.6) padded. Extra cushion: + # the box only actually panics nearer ~110, so an admitted chunk needs a + # transient > ~14.5GB above the trip point to crash -- margin 12 clears that. + # + # Trip point: gate fires once current > cap - margin. Functional residual: a + # model that fills most of the cap (85GB on 128GB) gets long prompts refused + # cleanly (503-class) -- correct (refuse the request, do not crash the box); + # fit longer contexts with a smaller quant. Watch [memgate]/[memcheck] on + # hardware. Set to 0 only to disable the margin (gate degrades to the bare cap + # check; logged as a WARNING the first time the gate runs).
+ prefill_transient_margin_gb: float = 12.0 def to_dict(self) -> dict[str, Any]: """Convert to dictionary.""" @@ -402,6 +429,7 @@ def to_dict(self) -> dict[str, Any]: "hard_threshold": self.hard_threshold, "prefill_safe_zone_ratio": self.prefill_safe_zone_ratio, "prefill_min_chunk_tokens": self.prefill_min_chunk_tokens, + "prefill_transient_margin_gb": self.prefill_transient_margin_gb, } @classmethod @@ -440,6 +468,9 @@ def from_dict(cls, data: dict[str, Any]) -> MemorySettings: prefill_min_chunk_tokens=int( data.get("prefill_min_chunk_tokens", 32) ), + prefill_transient_margin_gb=float( + data.get("prefill_transient_margin_gb", 12.0) + ), ) @@ -539,15 +570,24 @@ class HuggingFaceSettings: """HuggingFace Hub configuration settings.""" endpoint: str = "" # Empty string = use HF default (https://huggingface.co) + # Disable the Xet chunk-CAS transfer backend (cas-bridge.xethub.hf.co). + # huggingface_hub freezes HF_HUB_DISABLE_XET at import time, so this can + # only be applied process-wide at serve startup (cli.py env block) -- + # never per-download. Xet is unreachable from some networks (observed: + # mainland China); the plain LFS path works. + disable_xet: bool = False def to_dict(self) -> dict[str, Any]: """Convert to dictionary.""" - return {"endpoint": self.endpoint} + return {"endpoint": self.endpoint, "disable_xet": self.disable_xet} @classmethod def from_dict(cls, data: dict[str, Any]) -> HuggingFaceSettings: """Create from dictionary.""" - return cls(endpoint=data.get("endpoint", "")) + return cls( + endpoint=data.get("endpoint", ""), + disable_xet=bool(data.get("disable_xet", False)), + ) @dataclass @@ -566,6 +606,77 @@ def from_dict(cls, data: dict[str, Any]) -> ModelScopeSettings: return cls(endpoint=data.get("endpoint", "")) +@dataclass +class VideoSettings: + """Video generation engine settings (docs/video-generation-engine-spec.md). 
+ + The video engine runs mlx-gen in a subprocess worker from its own venv + ({base_path}/venvs/video by default); these settings gate the /v1/videos + API and bound its resource use. memory_lease_gb is reserved against the + process memory enforcer ceiling for the duration of a job so co-resident + LLM serving throttles instead of stacking toward the Metal cap. + """ + + enabled: bool = False # Master switch; handlers return 503 when off + worker_python: str = "" # Empty = {base_path}/venvs/video/bin/python + memory_lease_gb: float = 36.0 # Reserved against the enforcer ceiling per job (P0-calibrated) + max_queued_jobs: int = 4 # Submissions beyond this 503 + job_timeout_seconds: int = 7200 # Per-run clock, starts at worker spawn + progress_stall_timeout_seconds: int = 600 # Kill when worker JSONL goes silent + default_steps: int = 20 + default_fps: int = 16 + max_frames: int = 121 # UX bound; memory bound is the peak predictor + max_steps: int = 50 + max_pixels_per_frame: int = 1280 * 720 + artifacts_max_count: int = 50 # LRU-purge artifact blobs beyond this + artifacts_max_gb: float = 50.0 + + def get_worker_python(self, base_path: Path) -> Path: + """Resolve the worker venv python path.""" + if self.worker_python: + return Path(self.worker_python).expanduser() + return base_path / "venvs" / "video" / "bin" / "python" + + def to_dict(self) -> dict[str, Any]: + """Convert to dictionary.""" + return { + "enabled": self.enabled, + "worker_python": self.worker_python, + "memory_lease_gb": self.memory_lease_gb, + "max_queued_jobs": self.max_queued_jobs, + "job_timeout_seconds": self.job_timeout_seconds, + "progress_stall_timeout_seconds": self.progress_stall_timeout_seconds, + "default_steps": self.default_steps, + "default_fps": self.default_fps, + "max_frames": self.max_frames, + "max_steps": self.max_steps, + "max_pixels_per_frame": self.max_pixels_per_frame, + "artifacts_max_count": self.artifacts_max_count, + "artifacts_max_gb": self.artifacts_max_gb, + } + + 
@classmethod + def from_dict(cls, data: dict[str, Any]) -> VideoSettings: + """Create from dictionary.""" + return cls( + enabled=bool(data.get("enabled", False)), + worker_python=data.get("worker_python", ""), + memory_lease_gb=float(data.get("memory_lease_gb", 36.0)), + max_queued_jobs=int(data.get("max_queued_jobs", 4)), + job_timeout_seconds=int(data.get("job_timeout_seconds", 7200)), + progress_stall_timeout_seconds=int( + data.get("progress_stall_timeout_seconds", 600) + ), + default_steps=int(data.get("default_steps", 20)), + default_fps=int(data.get("default_fps", 16)), + max_frames=int(data.get("max_frames", 121)), + max_steps=int(data.get("max_steps", 50)), + max_pixels_per_frame=int(data.get("max_pixels_per_frame", 1280 * 720)), + artifacts_max_count=int(data.get("artifacts_max_count", 50)), + artifacts_max_gb=float(data.get("artifacts_max_gb", 50.0)), + ) + + @dataclass class NetworkSettings: """Network proxy and TLS trust settings.""" @@ -784,6 +895,7 @@ class GlobalSettings: integrations: IntegrationSettings = field(default_factory=IntegrationSettings) ui: UISettings = field(default_factory=UISettings) idle_timeout: ModelIdleTimeoutSettings = field(default_factory=ModelIdleTimeoutSettings) + video: VideoSettings = field(default_factory=VideoSettings) @classmethod def load( @@ -879,6 +991,8 @@ def _load_from_file(self, path: Path) -> None: self.ui = UISettings.from_dict(data["ui"]) if "idle_timeout" in data: self.idle_timeout = ModelIdleTimeoutSettings.from_dict(data["idle_timeout"]) + if "video" in data: + self.video = VideoSettings.from_dict(data["video"]) except json.JSONDecodeError as e: logger.warning(f"Failed to parse settings file {path}: {e}") @@ -1120,6 +1234,7 @@ def save(self) -> None: "integrations": self.integrations.to_dict(), "ui": self.ui.to_dict(), "idle_timeout": self.idle_timeout.to_dict(), + "video": self.video.to_dict(), } try: @@ -1363,6 +1478,7 @@ def to_dict(self) -> dict[str, Any]: "integrations": self.integrations.to_dict(), 
"ui": self.ui.to_dict(), "idle_timeout": self.idle_timeout.to_dict(), + "video": self.video.to_dict(), } diff --git a/omlx/video/__init__.py b/omlx/video/__init__.py new file mode 100644 index 000000000..f50b05583 --- /dev/null +++ b/omlx/video/__init__.py @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: Apache-2.0 +"""Video generation engine: job manager + subprocess worker. + +The video engine runs mlx-gen (Wan2.2 text-to-video) in a subprocess worker +from its own venv, coordinated by VideoJobManager with a memory lease held +against the ProcessMemoryEnforcer ceiling. Design: +docs/video-generation-engine-spec.md. + +Note: worker.py is NOT imported here -- it runs under the video venv python +and must stay importable without omlx on sys.path. +""" + +from .manager import VideoJob, VideoJobManager + +__all__ = ["VideoJob", "VideoJobManager"] diff --git a/omlx/video/manager.py b/omlx/video/manager.py new file mode 100644 index 000000000..e1065d8b7 --- /dev/null +++ b/omlx/video/manager.py @@ -0,0 +1,666 @@ +# SPDX-License-Identifier: Apache-2.0 +"""VideoJobManager: async job queue for subprocess video generation. + +Job shape follows the admin downloader/OQ patterns (task dict + status enum ++ cooperative cancel) with persistence (one JSON per job, atomic write) and +a memory lease held against the ProcessMemoryEnforcer for the duration of +each run. Design: docs/video-generation-engine-spec.md sections 4.2/4.4. + +Wire status is exactly the OpenAI four-value enum: queued | in_progress | +completed | failed. Cancellation is not a wire state -- DELETE kills the +worker and removes the record entirely. 
+""" + +from __future__ import annotations + +import asyncio +import json +import logging +import os +import shutil +import time +from dataclasses import dataclass, field +from pathlib import Path +from typing import Any, Optional + +from ..utils.proc_memory import get_phys_footprint + +logger = logging.getLogger(__name__) + +GB = 1024**3 + +# Stable error codes (spec 4.2). The worker failure manifest uses the same +# {code, message, detail?} schema and is passed through. +ERR_WORKER_CRASHED = "worker_crashed" +ERR_WORKER_STALLED = "worker_stalled" +ERR_JOB_TIMEOUT = "job_timeout" +ERR_LEASE_EXCEEDED = "memory_lease_exceeded" +ERR_MONITOR_FAILED = "monitor_failed" +ERR_SERVER_RESTARTED = "server_restarted" +ERR_OUTPUT_INVALID = "output_invalid" + +_WATCHDOG_INTERVAL_S = 2.0 +_ADMISSION_RECHECK_S = 5.0 +_SIGTERM_GRACE_S = 5.0 + + +@dataclass +class VideoJob: + """One video generation job.""" + + id: str + model_id: str + model_dir: str + params: dict[str, Any] # prompt, width, height, frames, steps, fps, seed, ... 
+ status: str = "queued" # queued | in_progress | completed | failed + progress: int = 0 # 0-100 + phase: str = "" + error: Optional[dict[str, str]] = None # {code, message} when failed + created_at: float = field(default_factory=time.time) + started_at: Optional[float] = None + completed_at: Optional[float] = None + expires_at: Optional[float] = None # Set when the artifact blob is purged + artifact_path: Optional[str] = None + wall_seconds: Optional[float] = None + peak_memory_gb: Optional[float] = None # Worker lifetime-max, for records + + def to_dict(self) -> dict[str, Any]: + """Wire shape: OpenAI video object fields + fmlx extensions.""" + return { + "id": self.id, + "object": "video", + "model": self.model_id, + "status": self.status, + "progress": self.progress, + "created_at": int(self.created_at), + "completed_at": int(self.completed_at) if self.completed_at else None, + "expires_at": int(self.expires_at) if self.expires_at else None, + "error": self.error, + "seconds": str(self.params.get("seconds", "")), + "size": f"{self.params.get('width')}x{self.params.get('height')}", + # fmlx extensions + "phase": self.phase, + "prompt": self.params.get("prompt", ""), + "frames": self.params.get("frames"), + "fps": self.params.get("fps"), + "steps": self.params.get("steps"), + "seed": self.params.get("seed"), + "wall_seconds": self.wall_seconds, + } + + def to_persist(self) -> dict[str, Any]: + return { + "id": self.id, + "model_id": self.model_id, + "model_dir": self.model_dir, + "params": self.params, + "status": self.status, + "progress": self.progress, + "phase": self.phase, + "error": self.error, + "created_at": self.created_at, + "started_at": self.started_at, + "completed_at": self.completed_at, + "expires_at": self.expires_at, + "artifact_path": self.artifact_path, + "wall_seconds": self.wall_seconds, + "peak_memory_gb": self.peak_memory_gb, + } + + @classmethod + def from_persist(cls, data: dict[str, Any]) -> "VideoJob": + return cls( + 
id=str(data["id"]), + model_id=str(data.get("model_id", "")), + model_dir=str(data.get("model_dir", "")), + params=dict(data.get("params") or {}), + status=str(data.get("status", "failed")), + progress=int(data.get("progress") or 0), + phase=str(data.get("phase", "") or ""), + error=data.get("error"), + created_at=float(data.get("created_at") or time.time()), + started_at=data.get("started_at"), + completed_at=data.get("completed_at"), + expires_at=data.get("expires_at"), + artifact_path=data.get("artifact_path"), + wall_seconds=data.get("wall_seconds"), + peak_memory_gb=data.get("peak_memory_gb"), + ) + + +class VideoJobManager: + """Serializes video generation jobs against a memory lease. + + Constructed in server lifespan AFTER the ProcessMemoryEnforcer so the + enforcer can be constructor-injected (testability seam, spec 4.2). + """ + + def __init__( + self, + *, + settings: Any, # VideoSettings + base_path: Path, + enforcer: Any | None, # ProcessMemoryEnforcer | None + worker_script: Path | None = None, + ): + self._settings = settings + self._base_path = Path(base_path) + self._enforcer = enforcer + self._worker_script = worker_script or ( + Path(__file__).parent / "worker.py" + ) + self._jobs: dict[str, VideoJob] = {} + self._queue: list[str] = [] # FIFO of queued job ids + self._dispatcher: asyncio.Task | None = None + self._current_proc: asyncio.subprocess.Process | None = None + self._current_job_id: str | None = None + self._wake = asyncio.Event() + self._shutdown = False + self._venv_probe_result: tuple[bool, str] | None = None + + self.jobs_dir.mkdir(parents=True, exist_ok=True) + self.artifacts_dir.mkdir(parents=True, exist_ok=True) + self._replay_persisted() + + # -- paths --------------------------------------------------------------- + + @property + def jobs_dir(self) -> Path: + return self._base_path / "video-jobs" + + @property + def artifacts_dir(self) -> Path: + return self._base_path / "video-artifacts" + + def worker_python(self) -> Path: + 
return self._settings.get_worker_python(self._base_path) + + # -- persistence --------------------------------------------------------- + + def _persist(self, job: VideoJob) -> None: + path = self.jobs_dir / f"{job.id}.json" + tmp = path.with_suffix(".tmp") + try: + with open(tmp, "w") as f: + json.dump(job.to_persist(), f, indent=1) + os.replace(tmp, path) + except OSError as e: + logger.error(f"[video] failed to persist job {job.id}: {e}") + + def _replay_persisted(self) -> None: + """Reload job records at startup; in-flight jobs become failed.""" + for path in sorted(self.jobs_dir.glob("video_*.json")): + try: + with open(path) as f: + job = VideoJob.from_persist(json.load(f)) + except Exception as e: + logger.warning(f"[video] skipping unreadable job file {path}: {e}") + continue + if job.status in ("queued", "in_progress"): + job.status = "failed" + job.error = { + "code": ERR_SERVER_RESTARTED, + "message": "Server restarted while the job was active", + } + job.completed_at = time.time() + self._persist(job) + self._jobs[job.id] = job + + # -- venv probe ---------------------------------------------------------- + + async def probe_worker_venv(self, force: bool = False) -> tuple[bool, str]: + """Check the worker venv is usable (cached after first success).""" + if self._venv_probe_result and self._venv_probe_result[0] and not force: + return self._venv_probe_result + py = self.worker_python() + install_hint = ( + "Install with: uv venv -p 3.12 {base}/venvs/video && " + "uv pip sync --python {base}/venvs/video/bin/python " + "omlx/video/requirements.lock".format(base=self._base_path) + ) + if not py.exists(): + self._venv_probe_result = ( + False, f"Video worker python not found at {py}. 
{install_hint}" + ) + return self._venv_probe_result + try: + proc = await asyncio.create_subprocess_exec( + str(py), "-c", "import mflux", + stdout=asyncio.subprocess.DEVNULL, + stderr=asyncio.subprocess.PIPE, + ) + _, stderr = await asyncio.wait_for(proc.communicate(), timeout=60) + if proc.returncode != 0: + self._venv_probe_result = ( + False, + f"Video worker venv at {py} cannot import mflux: " + f"{(stderr or b'').decode()[-300:]}. {install_hint}", + ) + return self._venv_probe_result + except Exception as e: + self._venv_probe_result = (False, f"Video worker venv probe failed: {e}") + return self._venv_probe_result + self._venv_probe_result = (True, "") + return self._venv_probe_result + + # -- memory admission ---------------------------------------------------- + + def _lease_bytes(self) -> int: + return int(float(self._settings.memory_lease_gb) * GB) + + def guard_available(self) -> tuple[bool, str]: + """Submission-time check: refuse jobs without a live memory guard.""" + enf = self._enforcer + if enf is None or not getattr(enf, "is_running", False): + return False, ( + "Video jobs require the process memory guard, which is not " + "running on this server" + ) + if enf.get_final_ceiling() <= 0: + return False, ( + "Video jobs require memory.prefill_memory_guard to be " + "enabled (the guard is currently disabled)" + ) + return True, "" + + def _memory_admission(self) -> tuple[bool, str]: + """Dispatch-time predicate (spec 4.4): the lease must land with the + system already at ok pressure and resident load below both the + post-lease soft watermark and the post-lease prefill-gate trip.""" + enf = self._enforcer + ok, reason = self.guard_available() + if not ok: + return False, reason + assert enf is not None # guard_available() established this + ceiling = enf.get_final_ceiling() + lease = self._lease_bytes() + post = ceiling - lease + if post <= 0: + return False, ( + f"memory lease {lease / GB:.0f}GB does not fit under the " + f"ceiling {ceiling / 
GB:.1f}GB" + ) + soft_ratio = float(getattr(enf, "_soft_threshold", 0.85) or 0.85) + margin = int(getattr(enf, "_prefill_transient_margin_bytes", 12 * GB) + or 12 * GB) + budget = min(int(post * soft_ratio), post - margin) + if budget <= 0: + budget = int(post * soft_ratio) + peak = int(enf.recent_peak_bytes() or 0) + if peak > budget: + return False, ( + f"waiting for memory: resident usage {peak / GB:.1f}GB above " + f"post-lease budget {budget / GB:.1f}GB " + f"(ceiling {ceiling / GB:.1f}GB, lease {lease / GB:.0f}GB)" + ) + return True, "" + + # -- public API ---------------------------------------------------------- + + def get(self, job_id: str) -> VideoJob | None: + return self._jobs.get(job_id) + + def list_jobs( + self, limit: int = 20, after: str | None = None, order: str = "desc" + ) -> tuple[list[VideoJob], bool]: + jobs = sorted( + self._jobs.values(), + key=lambda j: j.created_at, + reverse=(order != "asc"), + ) + if after: + ids = [j.id for j in jobs] + try: + start = ids.index(after) + 1 + jobs = jobs[start:] + except ValueError: + pass + page = jobs[:limit] + return page, len(jobs) > limit + + def queue_depth(self) -> int: + return len(self._queue) + + async def submit(self, job: VideoJob) -> VideoJob: + """Accept a job into the queue (caller validates params + caps).""" + if len(self._queue) >= int(self._settings.max_queued_jobs): + raise QueueFullError( + f"Video queue is full ({len(self._queue)}/" + f"{self._settings.max_queued_jobs})" + ) + self._jobs[job.id] = job + self._queue.append(job.id) + self._persist(job) + self._ensure_dispatcher() + self._wake.set() + logger.info(f"[video] queued {job.id} ({job.model_id})") + return job + + async def delete(self, job_id: str) -> bool: + """DELETE semantics (spec 4.3): kill if running, drop record+blobs.""" + job = self._jobs.get(job_id) + if job is None: + return False + if job_id in self._queue: + self._queue.remove(job_id) + if self._current_job_id == job_id and self._current_proc is not None: + 
await self._terminate_proc(self._current_proc) + self._jobs.pop(job_id, None) + try: + (self.jobs_dir / f"{job_id}.json").unlink(missing_ok=True) + except OSError: + pass + blob_dir = self.artifacts_dir / job_id + if blob_dir.exists(): + shutil.rmtree(blob_dir, ignore_errors=True) + logger.info(f"[video] deleted {job_id}") + return True + + async def shutdown(self) -> None: + self._shutdown = True + self._wake.set() + if self._current_proc is not None: + await self._terminate_proc(self._current_proc) + if self._dispatcher is not None: + self._dispatcher.cancel() + try: + await self._dispatcher + except (asyncio.CancelledError, Exception): + pass + + # -- dispatcher ---------------------------------------------------------- + + def _ensure_dispatcher(self) -> None: + if self._dispatcher is None or self._dispatcher.done(): + self._dispatcher = asyncio.create_task(self._dispatch_loop()) + + async def _dispatch_loop(self) -> None: + while not self._shutdown: + if not self._queue: + self._wake.clear() + try: + await asyncio.wait_for(self._wake.wait(), timeout=60) + except asyncio.TimeoutError: + continue + continue + job_id = self._queue[0] + job = self._jobs.get(job_id) + if job is None or job.status != "queued": + self._queue.pop(0) + continue + ok, reason = self._memory_admission() + if not ok: + if job.phase != reason: + job.phase = reason + self._persist(job) + await asyncio.sleep(_ADMISSION_RECHECK_S) + continue + self._queue.pop(0) + try: + await self._run_job(job) + except Exception as e: # noqa: BLE001 -- dispatcher must survive + logger.exception(f"[video] job {job.id} runner crashed: {e}") + if job.status == "in_progress": + self._finish(job, "failed", ERR_WORKER_CRASHED, str(e)) + + # -- job execution ------------------------------------------------------- + + def _finish( + self, job: VideoJob, status: str, code: str | None = None, + message: str | None = None, + ) -> None: + job.status = status + job.completed_at = time.time() + if job.started_at: + 
job.wall_seconds = round(job.completed_at - job.started_at, 1) + if status == "failed": + job.error = {"code": code or ERR_WORKER_CRASHED, + "message": message or ""} + else: + job.progress = 100 + job.phase = "done" + self._persist(job) + logger.info( + f"[video] {job.id} -> {status}" + + (f" ({code}: {message})" if code else "") + ) + + async def _terminate_proc(self, proc: asyncio.subprocess.Process) -> None: + if proc.returncode is not None: + return + try: + proc.terminate() + except ProcessLookupError: + return + try: + await asyncio.wait_for(proc.wait(), timeout=_SIGTERM_GRACE_S) + except asyncio.TimeoutError: + try: + proc.kill() + except ProcessLookupError: + pass + await proc.wait() + + def _worker_env(self) -> dict[str, str]: + """Explicit whitelist -- never inherit the full server env.""" + env = {} + for key in ("PATH", "HOME", "TMPDIR", "USER", "LANG"): + if key in os.environ: + env[key] = os.environ[key] + # The worker loads everything from the local model dir; forbid + # accidental network fetches. 
+ env["HF_HUB_OFFLINE"] = "1" + env["HF_HUB_DISABLE_TELEMETRY"] = "1" + return env + + async def _run_job(self, job: VideoJob) -> None: + enf = self._enforcer + lease = self._lease_bytes() + blob_dir = self.artifacts_dir / job.id + blob_dir.mkdir(parents=True, exist_ok=True) + output_path = blob_dir / "output.mp4" + manifest_path = blob_dir / "manifest.json" + spec_path = blob_dir / "spec.json" + + spec = dict(job.params) + spec.update( + model_dir=job.model_dir, + output_path=str(output_path), + manifest_path=str(manifest_path), + lease_bytes=lease, + ) + with open(spec_path, "w") as f: + json.dump(spec, f, indent=1) + + job.status = "in_progress" + job.started_at = time.time() + job.phase = "starting" + self._persist(job) + + if enf is not None: + enf.acquire_video_lease(lease) + try: + proc = await asyncio.create_subprocess_exec( + str(self.worker_python()), "-I", str(self._worker_script), + "--spec", str(spec_path), + stdout=asyncio.subprocess.PIPE, + stderr=asyncio.subprocess.DEVNULL, + env=self._worker_env(), + ) + self._current_proc = proc + self._current_job_id = job.id + if enf is not None: + enf.set_video_worker_pid(proc.pid) + + kill_reason: list[tuple[str, str]] = [] + last_line_at = time.time() + + async def watchdog() -> None: + zero_reads = 0 + while proc.returncode is None: + await asyncio.sleep(_WATCHDOG_INTERVAL_S) + if proc.returncode is not None: + return + now = time.time() + # Per-run timeout (clock starts at spawn, spec 4.2) + if (now - (job.started_at or now) + > int(self._settings.job_timeout_seconds)): + kill_reason.append(( + ERR_JOB_TIMEOUT, + f"exceeded job_timeout_seconds=" + f"{self._settings.job_timeout_seconds}", + )) + await self._terminate_proc(proc) + return + # Stall detection: no JSONL line for too long + if (now - last_line_at + > int(self._settings.progress_stall_timeout_seconds)): + kill_reason.append(( + ERR_WORKER_STALLED, + "no progress output for " + f"{int(now - last_line_at)}s", + )) + await self._terminate_proc(proc) 
+ return + # Footprint vs lease (secondary cleanup; layer 1 is the + # worker's own Metal wired limit) + footprint = get_phys_footprint(proc.pid) + if footprint <= 0: + zero_reads += 1 + if zero_reads >= 3: + kill_reason.append(( + ERR_MONITOR_FAILED, + "cannot read worker memory footprint", + )) + await self._terminate_proc(proc) + return + continue + zero_reads = 0 + if footprint > lease: + kill_reason.append(( + ERR_LEASE_EXCEEDED, + f"worker footprint {footprint / GB:.1f}GB " + f"exceeded lease {lease / GB:.1f}GB", + )) + try: + proc.kill() # immediate, no grace + except ProcessLookupError: + pass + return + + wd_task = asyncio.create_task(watchdog()) + try: + assert proc.stdout is not None + async for raw in proc.stdout: + last_line_at = time.time() + try: + ev = json.loads(raw.decode().strip()) + except (ValueError, UnicodeDecodeError): + continue + self._apply_progress(job, ev) + await proc.wait() + finally: + wd_task.cancel() + try: + await wd_task + except (asyncio.CancelledError, Exception): + pass + + self._conclude(job, proc.returncode or 0, kill_reason, + output_path, manifest_path) + finally: + self._current_proc = None + self._current_job_id = None + if enf is not None: + enf.set_video_worker_pid(None) + enf.release_video_lease() + self._retention_sweep() + + def _apply_progress(self, job: VideoJob, ev: dict[str, Any]) -> None: + phase = str(ev.get("phase", "") or "") + if phase: + job.phase = phase + total = int(ev.get("total_steps") or 0) + step = int(ev.get("step") or 0) + if phase == "loading": + job.progress = max(job.progress, 2) + elif phase == "loaded": + job.progress = max(job.progress, 5) + elif total > 0 and step > 0: + job.progress = max(job.progress, 5 + int(90 * step / total)) + elif phase == "saving": + job.progress = max(job.progress, 97) + # Persist sparsely: every phase change and ~every 5 progress points + if phase in ("loading", "loaded", "saving", "done", "failed") or ( + job.progress % 5 == 0 + ): + self._persist(job) + + def 
_conclude( + self, job: VideoJob, returncode: int, + kill_reason: list[tuple[str, str]], + output_path: Path, manifest_path: Path, + ) -> None: + # Job may have been deleted mid-run (DELETE endpoint kills the proc) + if job.id not in self._jobs: + return + if kill_reason: + code, message = kill_reason[0] + self._finish(job, "failed", code, message) + return + manifest: dict[str, Any] = {} + try: + with open(manifest_path) as f: + manifest = json.load(f) + except (OSError, ValueError): + pass + if returncode != 0: + self._finish( + job, "failed", + str(manifest.get("code", ERR_WORKER_CRASHED)), + str(manifest.get("message", + f"worker exited with code {returncode}")), + ) + return + if not output_path.exists() or output_path.stat().st_size == 0: + self._finish(job, "failed", ERR_OUTPUT_INVALID, + "worker exited 0 but produced no output file") + return + job.artifact_path = str(output_path) + if isinstance(manifest.get("lifetime_max_phys_gb"), (int, float)): + job.peak_memory_gb = float(manifest["lifetime_max_phys_gb"]) + self._finish(job, "completed") + + # -- retention ----------------------------------------------------------- + + def _retention_sweep(self) -> None: + """LRU-purge artifact blobs beyond count/bytes caps. 
Records stay; + expires_at marks the purge time (spec 4.2).""" + max_count = int(self._settings.artifacts_max_count) + max_bytes = int(float(self._settings.artifacts_max_gb) * GB) + holders = [ + j for j in self._jobs.values() + if j.artifact_path and j.expires_at is None + and Path(j.artifact_path).exists() + ] + holders.sort(key=lambda j: j.completed_at or j.created_at) # oldest first + total = sum( + Path(j.artifact_path).stat().st_size for j in holders # type: ignore[arg-type] + ) + while holders and (len(holders) > max_count or total > max_bytes): + victim = holders.pop(0) + blob_dir = self.artifacts_dir / victim.id + try: + size = Path(victim.artifact_path).stat().st_size # type: ignore[arg-type] + except OSError: + size = 0 + shutil.rmtree(blob_dir, ignore_errors=True) + victim.artifact_path = None + victim.expires_at = time.time() + self._persist(victim) + total -= size + logger.info(f"[video] purged artifact of {victim.id} (retention)") + + +class QueueFullError(Exception): + """Submission rejected: queue depth cap reached (HTTP 503).""" diff --git a/omlx/video/requirements.in b/omlx/video/requirements.in new file mode 100644 index 000000000..338b0f41b --- /dev/null +++ b/omlx/video/requirements.in @@ -0,0 +1 @@ +mlx-gen==0.18.14 diff --git a/omlx/video/requirements.lock b/omlx/video/requirements.lock new file mode 100644 index 000000000..5c8dcb21f --- /dev/null +++ b/omlx/video/requirements.lock @@ -0,0 +1,1437 @@ +# This file was autogenerated by uv via the following command: +# uv pip compile --python /Users/yuanwei/.fmlx/venvs/video/bin/python --generate-hashes -o /tmp/video-req.lock /tmp/video-req.in +annotated-doc==0.0.4 \ + --hash=sha256:571ac1dc6991c450b25a9c2d84a3705e2ae7a53467b5d111c24fa8baabbed320 \ + --hash=sha256:fbcda96e87e9c92ad167c2e53839e57503ecfda18804ea28102353485033faa4 + # via typer +anyio==4.13.0 \ + --hash=sha256:08b310f9e24a9594186fd75b4f73f4a4152069e3853f1ed8bfbf58369f4ad708 \ + 
--hash=sha256:334b70e641fd2221c1505b3890c69882fe4a2df910cba14d97019b90b24439dc + # via httpx +av==17.1.0 \ + --hash=sha256:1284addf3c0dd939887a9722dc30df2241a97471ad52c3c507e31583ae22ff02 \ + --hash=sha256:1370b11a697eb3f2555906f8ab3519b0cfe48425d7830a3996ad42e6bffafda5 \ + --hash=sha256:19264c9bb4bee404accc7ce9ec461f2044b7f577a70234d29aafde31ed17de46 \ + --hash=sha256:19c84fd72af5ef81a20f18fbc6f9aedff9e1455e53a7062c1d4c95926d73da4e \ + --hash=sha256:22dff0ae582d10ef08c75c2150a4fd27cfc26653b54930c7c27b9f7b3aa20723 \ + --hash=sha256:3453b06075c7bb973fdb6de52563f7692ff05cbc64c0bb45f4fd6e8709131f2f \ + --hash=sha256:3dcd41e53f53f9a3260751d9c3c11d34e93d70d61e506c81f13dbc1e3606e07b \ + --hash=sha256:43ebbe977f19a7f2d2bd1a4e119675a0b15e05852cf7309846b6ab922ba7ffe9 \ + --hash=sha256:5327807c1219293803ef0c5d1578ff3ae1cf638c09e5998962026e1a554ec240 \ + --hash=sha256:58f7593726437cda5bd19793027e027768450b5c4a594777bf487798a33db702 \ + --hash=sha256:5df5c1172ef1cf65a1529d612f7da7798ce2cf82c1ff7212466b538a6cc7214c \ + --hash=sha256:6a20658ec7d96a70e14b1196eff00b7cdd8831ac3b99868e16b8ba8b24090847 \ + --hash=sha256:6c9b71fe5c0c5a8d303b1588d4d8ce9397d6b023f467cfef95000ba1f75507fa \ + --hash=sha256:7f1e71ff621b66253333926f948e00faae11d855b2442133c65128bca64cdeb3 \ + --hash=sha256:90c49bc9608377d01e82e747377505419a229464873341db18202d5dddecce5a \ + --hash=sha256:9514cfda85180554c430695282faf4be3ffdf95775d8519733821244eecb58e0 \ + --hash=sha256:ad7b4aa011093324b7118245f50ac6db244cfe9900d4072508a5245a2b0d3f41 \ + --hash=sha256:b41647e42884bf543b8e8d0a1dabd4d1b006c99183eb1a2d7afc5b01f73eeff4 \ + --hash=sha256:bbab058bd965309f39962e53caac8126987c68c0be094fc4f9427e5615b0218f \ + --hash=sha256:bff8896454b38fcb785a70e5ae0485d7021cb776303a5849393128a30b8f850b \ + --hash=sha256:cc5a5247622cb77e24c342364eb68f88c1442ddfaab60c1f1f483359d3cc7879 \ + --hash=sha256:e1c90f85cd7431ede95b11e8e711571a896ebea433f298849c2c0f1594c8d86e \ + 
--hash=sha256:ec630be6321b04e317862f6082e84812bbd801e55a3c2298312e3fc8a0a4af4f \ + --hash=sha256:ee98534242a74da847af78624779ac5a3177dc7c69f956a4da9e6f0fdb37d7f6 \ + --hash=sha256:efe9b1397300b67b644ad220c89df4892a76f2debe70f16bae1749fa20526e63 \ + --hash=sha256:f997e3351bdf51127c07a74e21741a2996e9230cbeb2d81c14acde761b116c9c \ + --hash=sha256:f9a65d1f48b818323fb411e80358f89d77dec340b01d27c6b2dfbb9cbf4b779f \ + --hash=sha256:fa64e1f1500d01c4a98e7a41dc1a9a35fb4dfe71f5de0389264ec1192200c76a \ + --hash=sha256:ff457ed419348e5b8e8c811d341389b052c5e4d5839da3794d019b125b9fe830 \ + --hash=sha256:ffbd78d73d2c9bf31e9a007c992faec3991428b2941a3b085b84fb82e8c32d19 + # via mlx-gen +certifi==2026.5.20 \ + --hash=sha256:3c52e209ba0a4ad7aebe60436a4ab349c39e1e602e8c134221e546902ad25897 \ + --hash=sha256:69dea482ab64caa7b9f6aba1c6bf48bb6a5448d1c0f1b17ab42ad8c763a5344d + # via + # httpcore + # httpx + # requests +charset-normalizer==3.4.7 \ + --hash=sha256:007d05ec7321d12a40227aae9e2bc6dca73f3cb21058999a1df9e193555a9dcc \ + --hash=sha256:03853ed82eeebbce3c2abfdbc98c96dc205f32a79627688ac9a27370ea61a49c \ + --hash=sha256:07d9e39b01743c3717745f4c530a6349eadbfa043c7577eef86c502c15df2c67 \ + --hash=sha256:08e721811161356f97b4059a9ba7bafb23ea5ee2255402c42881c214e173c6b4 \ + --hash=sha256:0c96c3b819b5c3e9e165495db84d41914d6894d55181d2d108cc1a69bfc9cce0 \ + --hash=sha256:0ea948db76d31190bf08bd371623927ee1339d5f2a0b4b1b4a4439a65298703c \ + --hash=sha256:0f7eb884681e3938906ed0434f20c63046eacd0111c4ba96f27b76084cd679f5 \ + --hash=sha256:12a6fff75f6bc66711b73a2f0addfc4c8c15a20e805146a02d147a318962c444 \ + --hash=sha256:12d8baf840cc7889b37c7c770f478adea7adce3dcb3944d02ec87508e2dcf153 \ + --hash=sha256:14265bfe1f09498b9d8ec91e9ec9fa52775edf90fcbde092b25f4a33d444fea9 \ + --hash=sha256:16d971e29578a5e97d7117866d15889a4a07befe0e87e703ed63cd90cb348c01 \ + --hash=sha256:177a0ba5f0211d488e295aaf82707237e331c24788d8d76c96c5a41594723217 \ + 
--hash=sha256:1a87ca9d5df6fe460483d9a5bbf2b18f620cbed41b432e2bddb686228282d10b \ + --hash=sha256:1c2a768fdd44ee4a9339a9b0b130049139b8ce3c01d2ce09f67f5a68048d477c \ + --hash=sha256:1c2aed2e5e41f24ea8ef1590b8e848a79b56f3a5564a65ceec43c9d692dc7d8a \ + --hash=sha256:1dc8b0ea451d6e69735094606991f32867807881400f808a106ee1d963c46a83 \ + --hash=sha256:1efde3cae86c8c273f1eb3b287be7d8499420cf2fe7585c41d370d3e790054a5 \ + --hash=sha256:202389074300232baeb53ae2569a60901f7efadd4245cf3a3bf0617d60b439d7 \ + --hash=sha256:203104ed3e428044fd943bc4bf45fa73c0730391f9621e37fe39ecf477b128cb \ + --hash=sha256:2257141f39fe65a3fdf38aeccae4b953e5f3b3324f4ff0daf9f15b8518666a2c \ + --hash=sha256:298930cec56029e05497a76988377cbd7457ba864beeea92ad7e844fe74cd1f1 \ + --hash=sha256:2cd4a60d0e2fb04537162c62bbbb4182f53541fe0ede35cdf270a1c1e723cc42 \ + --hash=sha256:2d6eb928e13016cea4f1f21d1e10c1cebd5a421bc57ddf5b1142ae3f86824fab \ + --hash=sha256:2fe249cb4651fd12605b7288b24751d8bfd46d35f12a20b1ba33dea122e690df \ + --hash=sha256:30b8d1d8c52a48c2c5690e152c169b673487a2a58de1ec7393196753063fcd5e \ + --hash=sha256:320ade88cfb846b8cd6b4ddf5ee9e80ee0c1f52401f2456b84ae1ae6a1a5f207 \ + --hash=sha256:3534e7dcbdcf757da6b85a0bbf5b6868786d5982dd959b065e65481644817a18 \ + --hash=sha256:36836d6ff945a00b88ba1e4572d721e60b5b8c98c155d465f56ad19d68f23734 \ + --hash=sha256:38c0109396c4cfc574d502df99742a45c72c08eff0a36158b6f04000043dbf38 \ + --hash=sha256:3946fa46a0cf3e4c8cb1cc52f56bb536310d34f25f01ca9b6c16afa767dab110 \ + --hash=sha256:3bec022aec2c514d9cf199522a802bd007cd588ab17ab2525f20f9c34d067c18 \ + --hash=sha256:3c9a494bc5ec77d43cea229c4f6db1e4d8fe7e1bbffa8b6f0f0032430ff8ab44 \ + --hash=sha256:3dce51d0f5e7951f8bb4900c257dad282f49190fdbebecd4ba99bcc41fef404d \ + --hash=sha256:3dedcc22d73ec993f42055eff4fcfed9318d1eeb9a6606c55892a26964964e48 \ + --hash=sha256:4042d5c8f957e15221d423ba781e85d553722fc4113f523f2feb7b188cc34c5e \ + --hash=sha256:481551899c856c704d58119b5025793fa6730adda3571971af568f66d2424bb5 \ + 
--hash=sha256:4dc1e73c36828f982bfe79fadf5919923f8a6f4df2860804db9a98c48824ce8d \ + --hash=sha256:4e5163c14bffd570ef2affbfdd77bba66383890797df43dc8b4cc7d6f500bf53 \ + --hash=sha256:511ef87c8aec0783e08ac18565a16d435372bc1ac25a91e6ac7f5ef2b0bff790 \ + --hash=sha256:532bc9bf33a68613fd7d65e4b1c71a6a38d7d42604ecf239c77392e9b4e8998c \ + --hash=sha256:54523e136b8948060c0fa0bc7b1b50c32c186f2fceee897a495406bb6e311d2b \ + --hash=sha256:5649fd1c7bade02f320a462fdefd0b4bd3ce036065836d4f42e0de958038e116 \ + --hash=sha256:56be790f86bfb2c98fb742ce566dfb4816e5a83384616ab59c49e0604d49c51d \ + --hash=sha256:5b77459df20e08151cd6f8b9ef8ef1f961ef73d85c21a555c7eed5b79410ec10 \ + --hash=sha256:5ed6ab538499c8644b8a3e18debabcd7ce684f3fa91cf867521a7a0279cab2d6 \ + --hash=sha256:6178f72c5508bfc5fd446a5905e698c6212932f25bcdd4b47a757a50605a90e2 \ + --hash=sha256:6370e8686f662e6a3941ee48ed4742317cafbe5707e36406e9df792cdb535776 \ + --hash=sha256:64f02c6841d7d83f832cd97ccf8eb8a906d06eb95d5276069175c696b024b60a \ + --hash=sha256:65bcd23054beab4d166035cabbc868a09c1a49d1efe458fe8e4361215df40265 \ + --hash=sha256:66671f93accb62ed07da56613636f3641f1a12c13046ce91ffc923721f23c008 \ + --hash=sha256:6696b7688f54f5af4462118f0bfa7c1621eeb87154f77fa04b9295ce7a8f2943 \ + --hash=sha256:6785f414ae0f3c733c437e0f3929197934f526d19dfaa75e18fdb4f94c6fb374 \ + --hash=sha256:67f6279d125ca0046a7fd386d01b311c6363844deac3e5b069b514ba3e63c246 \ + --hash=sha256:6c114670c45346afedc0d947faf3c7f701051d2518b943679c8ff88befe14f8e \ + --hash=sha256:6e0d51f618228538a3e8f46bd246f87a6cd030565e015803691603f55e12afb5 \ + --hash=sha256:6ed74185b2db44f41ef35fd1617c5888e59792da9bbc9190d6c7300617182616 \ + --hash=sha256:708838739abf24b2ceb208d0e22403dd018faeef86ddac04319a62ae884c4f15 \ + --hash=sha256:715479b9a2802ecac752a3b0efa2b0b60285cf962ee38414211abdfccc233b41 \ + --hash=sha256:733784b6d6def852c814bce5f318d25da2ee65dd4839a0718641c696e09a2960 \ + --hash=sha256:750e02e074872a3fad7f233b47734166440af3cdea0add3e95163110816d6752 \ + 
--hash=sha256:752a45dc4a6934060b3b0dab47e04edc3326575f82be64bc4fc293914566503e \ + --hash=sha256:7579e913a5339fb8fa133f6bbcfd8e6749696206cf05acdbdca71a1b436d8e72 \ + --hash=sha256:7641bb8895e77f921102f72833904dcd9901df5d6d72a2ab8f31d04b7e51e4e7 \ + --hash=sha256:7804338df6fcc08105c7745f1502ba68d900f45fd770d5bdd5288ddccb8a42d8 \ + --hash=sha256:80d04837f55fc81da168b98de4f4b797ef007fc8a79ab71c6ec9bc4dd662b15b \ + --hash=sha256:813c0e0132266c08eb87469a642cb30aaff57c5f426255419572aaeceeaa7bf4 \ + --hash=sha256:82b271f5137d07749f7bf32f70b17ab6eaabedd297e75dce75081a24f76eb545 \ + --hash=sha256:84c018e49c3bf790f9c2771c45e9313a08c2c2a6342b162cd650258b57817706 \ + --hash=sha256:8751d2787c9131302398b11e6c8068053dcb55d5a8964e114b6e196cf16cb366 \ + --hash=sha256:8778f0c7a52e56f75d12dae53ae320fae900a8b9b4164b981b9c5ce059cd1fcb \ + --hash=sha256:87fad7d9ba98c86bcb41b2dc8dbb326619be2562af1f8ff50776a39e55721c5a \ + --hash=sha256:8d828b6667a32a728a1ad1d93957cdf37489c57b97ae6c4de2860fa749b8fc1e \ + --hash=sha256:8e385e4267ab76874ae30db04c627faaaf0b509e1ccc11a95b3fc3e83f855c00 \ + --hash=sha256:92a0a01ead5e668468e952e4238cccd7c537364eb7d851ab144ab6627dbbe12f \ + --hash=sha256:94e1885b270625a9a828c9793b4d52a64445299baa1fea5a173bf1d3dd9a1a5a \ + --hash=sha256:a180c5e59792af262bf263b21a3c49353f25945d8d9f70628e73de370d55e1e1 \ + --hash=sha256:a277ab8928b9f299723bc1a2dabb1265911b1a76341f90a510368ca44ad9ab66 \ + --hash=sha256:a5fe03b42827c13cdccd08e6c0247b6a6d4b5e3cdc53fd1749f5896adcdc2356 \ + --hash=sha256:a6c5863edfbe888d9eff9c8b8087354e27618d9da76425c119293f11712a6319 \ + --hash=sha256:a89c23ef8d2c6b27fd200a42aa4ac72786e7c60d40efdc76e6011260b6e949c4 \ + --hash=sha256:adb2597b428735679446b46c8badf467b4ca5f5056aae4d51a19f9570301b1ad \ + --hash=sha256:ae196f021b5e7c78e918242d217db021ed2a6ace2bc6ae94c0fc596221c7f58d \ + --hash=sha256:ae89db9e5f98a11a4bf50407d4363e7b09b31e55bc117b4f7d80aab97ba009e5 \ + --hash=sha256:aed52fea0513bac0ccde438c188c8a471c4e0f457c2dd20cdbf6ea7a450046c7 \ + 
--hash=sha256:aef65cd602a6d0e0ff6f9930fcb1c8fec60dd2cfcb6facaf4bdb0e5873042db0 \ + --hash=sha256:af21eb4409a119e365397b2adbaca4c9ccab56543a65d5dbd9f920d6ac29f686 \ + --hash=sha256:b14b2d9dac08e28bb8046a1a0434b1750eb221c8f5b87a68f4fa11a6f97b5e34 \ + --hash=sha256:bb6d88045545b26da47aa879dd4a89a71d1dce0f0e549b1abcb31dfe4a8eac49 \ + --hash=sha256:bb8cc7534f51d9a017b93e3e85b260924f909601c3df002bcdb58ddb4dc41a5c \ + --hash=sha256:bc17a677b21b3502a21f66a8cc64f5bfad4df8a0b8434d661666f8ce90ac3af1 \ + --hash=sha256:bd6c2a1c7573c64738d716488d2cdd3c00e340e4835707d8fdb8dc1a66ef164e \ + --hash=sha256:bd9b23791fe793e4968dba0c447e12f78e425c59fc0e3b97f6450f4781f3ee60 \ + --hash=sha256:c03a41a8784091e67a39648f70c5f97b5b6a37f216896d44d2cdcb82615339a0 \ + --hash=sha256:c0f081d69a6e58272819b70288d3221a6ee64b98df852631c80f293514d3b274 \ + --hash=sha256:c35abb8bfff0185efac5878da64c45dafd2b37fb0383add1be155a763c1f083d \ + --hash=sha256:c36c333c39be2dbca264d7803333c896ab8fa7d4d6f0ab7edb7dfd7aea6e98c0 \ + --hash=sha256:c45e9440fb78f8ddabcf714b68f936737a121355bf59f3907f4e17721b9d1aae \ + --hash=sha256:c593052c465475e64bbfe5dbd81680f64a67fdc752c56d7a0ae205dc8aeefe0f \ + --hash=sha256:cdd68a1fb318e290a2077696b7eb7a21a49163c455979c639bf5a5dcdc46617d \ + --hash=sha256:ce3412fbe1e31eb81ea42f4169ed94861c56e643189e1e75f0041f3fe7020abe \ + --hash=sha256:cf1493cd8607bec4d8a7b9b004e699fcf8f9103a9284cc94962cb73d20f9d4a3 \ + --hash=sha256:cf29836da5119f3c8a8a70667b0ef5fdca3bb12f80fd06487cfa575b3909b393 \ + --hash=sha256:d4a48e5b3c2a489fae013b7589308a40146ee081f6f509e047e0e096084ceca1 \ + --hash=sha256:d560742f3c0d62afaccf9f41fe485ed69bd7661a241f86a3ef0f0fb8b1a397af \ + --hash=sha256:d6038d37043bced98a66e68d3aa2b6a35505dc01328cd65217cefe82f25def44 \ + --hash=sha256:d61f00a0869d77422d9b2aba989e2d24afa6ffd552af442e0e58de4f35ea6d00 \ + --hash=sha256:d635aab80466bc95771bb78d5370e74d36d1fe31467b6b29b8b57b2a3cd7d22c \ + --hash=sha256:dca4bbc466a95ba9c0234ef56d7dd9509f63da22274589ebd4ed7f1f4d4c54e3 \ + 
--hash=sha256:dd915403e231e6b1809fe9b6d9fc55cf8fb5e02765ac625d9cd623342a7905d7 \ + --hash=sha256:e044c39e41b92c845bc815e5ae4230804e8e7bc29e399b0437d64222d92809dd \ + --hash=sha256:e060d01aec0a910bdccb8be71faf34e7799ce36950f8294c8bf612cba65a2c9e \ + --hash=sha256:e1421b502d83040e6d7fb2fb18dff63957f720da3d77b2fbd3187ceb63755d7b \ + --hash=sha256:e17b8d5d6a8c47c85e68ca8379def1303fd360c3e22093a807cd34a71cd082b8 \ + --hash=sha256:e5f4d355f0a2b1a31bc3edec6795b46324349c9cb25eed068049e4f472fb4259 \ + --hash=sha256:e712b419df8ba5e42b226c510472b37bd57b38e897d3eca5e8cfd410a29fa859 \ + --hash=sha256:e74327fb75de8986940def6e8dee4f127cc9752bee7355bb323cc5b2659b6d46 \ + --hash=sha256:e80c8378d8f3d83cd3164da1ad2df9e37a666cdde7b1cb2298ed0b558064be30 \ + --hash=sha256:e8ac484bf18ce6975760921bb6148041faa8fef0547200386ea0b52b5d27bf7b \ + --hash=sha256:eca9705049ad3c7345d574e3510665cb2cf844c2f2dcfe675332677f081cbd46 \ + --hash=sha256:ed065083d0898c9d5b4bbec7b026fd755ff7454e6e8b73a67f8c744b13986e24 \ + --hash=sha256:edac0f1ab77644605be2cbba52e6b7f630731fc42b34cb0f634be1a6eface56a \ + --hash=sha256:effc3f449787117233702311a1b7d8f59cba9ced946ba727bdc329ec69028e24 \ + --hash=sha256:f22dec1690b584cea26fade98b2435c132c1b5f68e39f5a0b7627cd7ae31f1dc \ + --hash=sha256:f495a1652cf3fbab2eb0639776dad966c2fb874d79d87ca07f9d5f059b8bd215 \ + --hash=sha256:f496c9c3cc02230093d8330875c4c3cdfc3b73612a5fd921c65d39cbcef08063 \ + --hash=sha256:f59099f9b66f0d7145115e6f80dd8b1d847176df89b234a5a6b3f00437aa0832 \ + --hash=sha256:f59ad4c0e8f6bba240a9bb85504faa1ab438237199d4cce5f622761507b8f6a6 \ + --hash=sha256:fbccdc05410c9ee21bbf16a35f4c1d16123dcdeb8a1d38f33654fa21d0234f79 \ + --hash=sha256:fea24543955a6a729c45a73fe90e08c743f0b3334bbf3201e6c4bc1b0c7fa464 + # via requests +click==8.4.1 \ + --hash=sha256:482be17c6991b8c19c5429a1e995d9b0efdbb63172824c41f99965dc0ade8ec2 \ + --hash=sha256:918b5633eddf6b41c32d4f454bf0de810065c74e3f7dbf8ee5452f8be88d3e96 + # via + # huggingface-hub + # typer +contourpy==1.3.3 \ + 
--hash=sha256:023b44101dfe49d7d53932be418477dba359649246075c996866106da069af69 \ + --hash=sha256:07ce5ed73ecdc4a03ffe3e1b3e3c1166db35ae7584be76f65dbbe28a7791b0cc \ + --hash=sha256:083e12155b210502d0bca491432bb04d56dc3432f95a979b429f2848c3dbe880 \ + --hash=sha256:0bf67e0e3f482cb69779dd3061b534eb35ac9b17f163d851e2a547d56dba0a3a \ + --hash=sha256:0c1fc238306b35f246d61a1d416a627348b5cf0648648a031e14bb8705fcdfe8 \ + --hash=sha256:13b68d6a62db8eafaebb8039218921399baf6e47bf85006fd8529f2a08ef33fc \ + --hash=sha256:15ff10bfada4bf92ec8b31c62bf7c1834c244019b4a33095a68000d7075df470 \ + --hash=sha256:177fb367556747a686509d6fef71d221a4b198a3905fe824430e5ea0fda54eb5 \ + --hash=sha256:1cadd8b8969f060ba45ed7c1b714fe69185812ab43bd6b86a9123fe8f99c3263 \ + --hash=sha256:1fd43c3be4c8e5fd6e4f2baeae35ae18176cf2e5cced681cca908addf1cdd53b \ + --hash=sha256:22e9b1bd7a9b1d652cd77388465dc358dafcd2e217d35552424aa4f996f524f5 \ + --hash=sha256:23416f38bfd74d5d28ab8429cc4d63fa67d5068bd711a85edb1c3fb0c3e2f381 \ + --hash=sha256:283edd842a01e3dcd435b1c5116798d661378d83d36d337b8dde1d16a5fc9ba3 \ + --hash=sha256:2a2a8b627d5cc6b7c41a4beff6c5ad5eb848c88255fda4a8745f7e901b32d8e4 \ + --hash=sha256:2b7e9480ffe2b0cd2e787e4df64270e3a0440d9db8dc823312e2c940c167df7e \ + --hash=sha256:322ab1c99b008dad206d406bb61d014cf0174df491ae9d9d0fac6a6fda4f977f \ + --hash=sha256:33c82d0138c0a062380332c861387650c82e4cf1747aaa6938b9b6516762e772 \ + --hash=sha256:348ac1f5d4f1d66d3322420f01d42e43122f43616e0f194fc1c9f5d830c5b286 \ + --hash=sha256:3519428f6be58431c56581f1694ba8e50626f2dd550af225f82fb5f5814d2a42 \ + --hash=sha256:3c30273eb2a55024ff31ba7d052dde990d7d8e5450f4bbb6e913558b3d6c2301 \ + --hash=sha256:3d1a3799d62d45c18bafd41c5fa05120b96a28079f2393af559b843d1a966a77 \ + --hash=sha256:451e71b5a7d597379ef572de31eeb909a87246974d960049a9848c3bc6c41bf7 \ + --hash=sha256:459c1f020cd59fcfe6650180678a9993932d80d44ccde1fa1868977438f0b411 \ + --hash=sha256:4d00e655fcef08aba35ec9610536bfe90267d7ab5ba944f7032549c55a146da1 \ + 
--hash=sha256:4debd64f124ca62069f313a9cb86656ff087786016d76927ae2cf37846b006c9 \ + --hash=sha256:4feffb6537d64b84877da813a5c30f1422ea5739566abf0bd18065ac040e120a \ + --hash=sha256:50ed930df7289ff2a8d7afeb9603f8289e5704755c7e5c3bbd929c90c817164b \ + --hash=sha256:51e79c1f7470158e838808d4a996fa9bac72c498e93d8ebe5119bc1e6becb0db \ + --hash=sha256:556dba8fb6f5d8742f2923fe9457dbdd51e1049c4a43fd3986a0b14a1d815fc6 \ + --hash=sha256:598c3aaece21c503615fd59c92a3598b428b2f01bfb4b8ca9c4edeecc2438620 \ + --hash=sha256:5ed3657edf08512fc3fe81b510e35c2012fbd3081d2e26160f27ca28affec989 \ + --hash=sha256:626d60935cf668e70a5ce6ff184fd713e9683fb458898e4249b63be9e28286ea \ + --hash=sha256:644a6853d15b2512d67881586bd03f462c7ab755db95f16f14d7e238f2852c67 \ + --hash=sha256:655456777ff65c2c548b7c454af9c6f33f16c8884f11083244b5819cc214f1b5 \ + --hash=sha256:66c8a43a4f7b8df8b71ee1840e4211a3c8d93b214b213f590e18a1beca458f7d \ + --hash=sha256:6afc576f7b33cf00996e5c1102dc2a8f7cc89e39c0b55df93a0b78c1bd992b36 \ + --hash=sha256:6c3d53c796f8647d6deb1abe867daeb66dcc8a97e8455efa729516b997b8ed99 \ + --hash=sha256:709a48ef9a690e1343202916450bc48b9e51c049b089c7f79a267b46cffcdaa1 \ + --hash=sha256:70f9aad7de812d6541d29d2bbf8feb22ff7e1c299523db288004e3157ff4674e \ + --hash=sha256:8153b8bfc11e1e4d75bcb0bff1db232f9e10b274e0929de9d608027e0d34ff8b \ + --hash=sha256:87acf5963fc2b34825e5b6b048f40e3635dd547f590b04d2ab317c2619ef7ae8 \ + --hash=sha256:88df9880d507169449d434c293467418b9f6cbe82edd19284aa0409e7fdb933d \ + --hash=sha256:929ddf8c4c7f348e4c0a5a3a714b5c8542ffaa8c22954862a46ca1813b667ee7 \ + --hash=sha256:92d9abc807cf7d0e047b95ca5d957cf4792fcd04e920ca70d48add15c1a90ea7 \ + --hash=sha256:95b181891b4c71de4bb404c6621e7e2390745f887f2a026b2d99e92c17892339 \ + --hash=sha256:9e999574eddae35f1312c2b4b717b7885d4edd6cb46700e04f7f02db454e67c1 \ + --hash=sha256:a15459b0f4615b00bbd1e91f1b9e19b7e63aea7483d03d804186f278c0af2659 \ + --hash=sha256:a22738912262aa3e254e4f3cb079a95a67132fc5a063890e224393596902f5a4 \ + 
--hash=sha256:ab2fd90904c503739a75b7c8c5c01160130ba67944a7b77bbf36ef8054576e7f \ + --hash=sha256:ab3074b48c4e2cf1a960e6bbeb7f04566bf36b1861d5c9d4d8ac04b82e38ba20 \ + --hash=sha256:afe5a512f31ee6bd7d0dda52ec9864c984ca3d66664444f2d72e0dc4eb832e36 \ + --hash=sha256:b08a32ea2f8e42cf1d4be3169a98dd4be32bafe4f22b6c4cb4ba810fa9e5d2cb \ + --hash=sha256:b20c7c9a3bf701366556e1b1984ed2d0cedf999903c51311417cf5f591d8c78d \ + --hash=sha256:b2e8faa0ed68cb29af51edd8e24798bb661eac3bd9f65420c1887b6ca89987c8 \ + --hash=sha256:b7301b89040075c30e5768810bc96a8e8d78085b47d8be6e4c3f5a0b4ed478a0 \ + --hash=sha256:b7448cb5a725bb1e35ce88771b86fba35ef418952474492cf7c764059933ff8b \ + --hash=sha256:ca0fdcd73925568ca027e0b17ab07aad764be4706d0a925b89227e447d9737b7 \ + --hash=sha256:ca658cd1a680a5c9ea96dc61cdbae1e85c8f25849843aa799dfd3cb370ad4fbe \ + --hash=sha256:cbedb772ed74ff5be440fa8eee9bd49f64f6e3fc09436d9c7d8f1c287b121d77 \ + --hash=sha256:cd5dfcaeb10f7b7f9dc8941717c6c2ade08f587be2226222c12b25f0483ed497 \ + --hash=sha256:cf9022ef053f2694e31d630feaacb21ea24224be1c3ad0520b13d844274614fd \ + --hash=sha256:d002b6f00d73d69333dac9d0b8d5e84d9724ff9ef044fd63c5986e62b7c9e1b1 \ + --hash=sha256:d06bb1f751ba5d417047db62bca3c8fde202b8c11fb50742ab3ab962c81e8216 \ + --hash=sha256:d304906ecc71672e9c89e87c4675dc5c2645e1f4269a5063b99b0bb29f232d13 \ + --hash=sha256:e4e6b05a45525357e382909a4c1600444e2a45b4795163d3b22669285591c1ae \ + --hash=sha256:e74a9a0f5e3fff48fb5a7f2fd2b9b70a3fe014a67522f79b7cca4c0c7e43c9ae \ + --hash=sha256:ea37e7b45949df430fe649e5de8351c423430046a2af20b1c1961cae3afcda77 \ + --hash=sha256:f64836de09927cba6f79dcd00fdd7d5329f3fccc633468507079c829ca4db4e3 \ + --hash=sha256:fd6ec6be509c787f1caf6b247f0b1ca598bef13f4ddeaa126b7658215529ba0f \ + --hash=sha256:fd907ae12cd483cd83e414b12941c632a969171bf90fc937d0c9f268a31cafff \ + --hash=sha256:fd914713266421b7536de2bfa8181aa8c699432b6763a0ea64195ebe28bff6a9 \ + --hash=sha256:fde6c716d51c04b1c25d0b90364d0be954624a0ee9d60e23e850e8d48353d07a + # via 
matplotlib +cycler==0.12.1 \ + --hash=sha256:85cef7cff222d8644161529808465972e51340599459b8ac3ccbac5a854e0d30 \ + --hash=sha256:88bb128f02ba341da8ef447245a9e138fae777f6a23943da4540077d3601eb1c + # via matplotlib +docutils==0.23 \ + --hash=sha256:25d013af9bf23bc1c7b2b093dff4208166c53a94786c9e447808335ef1185fea \ + --hash=sha256:746f5060322511280a1e50eb76846ed6bf2342984b2ac04dc42caa1a8d78799e + # via readme-renderer +filelock==3.29.2 \ + --hash=sha256:779d2f5443b584750c6b90457abffd49235bfb0e66ce82ef5a680867e518ca1c \ + --hash=sha256:f5d3feb44b2b8824832587543af5226822fe86baf086678ede47aa177fe47ca5 + # via + # huggingface-hub + # mlx-gen + # torch +fonttools==4.63.0 \ + --hash=sha256:032038247a96c1690f9f31e377c389383c902531b085aa4e4dabd6f57f870e69 \ + --hash=sha256:063e08bd17bd5a90127a14123de0d6a952dbc847695fd98b63c043d58057f90c \ + --hash=sha256:0c18358a155d75034911c5ee397a5b44cd19dd325dbb8b35fb60bf421d6a72ac \ + --hash=sha256:0eac00b9118c3c2f87d272e45341871c5b3066baa3c86897fa634a7c3fb59096 \ + --hash=sha256:1e874792a8212b44583ea02189d9e693906b2f78b261f372f95d6c563210ac1d \ + --hash=sha256:22135da48a348785c5e2d5d2d9d6bec5ed44adacbaeb9db12d9493bf6c6bfa68 \ + --hash=sha256:22693918177bd9ceabec4736d338045f357769416fc6b0b2508eefef75b08616 \ + --hash=sha256:27fdc65af8da6f88b9c6121c47a464cbe359fcfff7ff6fc2d37a1f395d755b78 \ + --hash=sha256:2b8ae05d9eacf6081414d759c0a352769ac28ce31280d6bb8e77b03f9e3c449f \ + --hash=sha256:2c14b4fd138c4bafcca294765c547914e1aa431ae1ca94ab99d8db08c958bd3b \ + --hash=sha256:308f957cdeaf8abe4e5f2f124902ef405448af92c90f80e302a3b771c2e6116b \ + --hash=sha256:37dd23e621e3b0aef1baa70a303b80aaf38449632cfc8fd2a55fb285bbccfc02 \ + --hash=sha256:445af2eab030a16b9171ea8bdda7ebf7d96bda2df88ee182a464252f6e05e20d \ + --hash=sha256:51394295f1a51de8b5f30bdb1e1b9a4231536c7064ef5c6e211eec19fa36036f \ + --hash=sha256:58dc6bb86a78d782f00f9190ca02c119cf5bbe2807536e361e18d42019f877d8 \ + --hash=sha256:59ac449f8cca9b4ffa08d2e7bbadad87ce710d69d1eda5c3c1ce579baa987272 
\ + --hash=sha256:6b2248c5decb223562f7902ff6325077a073f608ee8e33e88ad88db734eb9f49 \ + --hash=sha256:6d4741eb179121cab9eea4cb2393d24492373a260d7945006358c08cfbf45419 \ + --hash=sha256:6db5140a60a5d731d21ec076745b40a310607731b0a565b50776393188649001 \ + --hash=sha256:6e528da43bc3791085f8cb6141b1d13e459226790240340fcbb4625649238b03 \ + --hash=sha256:796f27556dbe094c4824f75ca85267e4df776c79036c8441469a4df37038c196 \ + --hash=sha256:79cdc9f567aec74a72918fd060283911406750cbc9fd28c1316023deb6ce31a9 \ + --hash=sha256:7d76edbff9014094dbf03bd2d074709dfa6ec7aba13d838c937a2b33d2d6a86e \ + --hash=sha256:7d782fac32985914c351556f68ac0855391572bcd87de50e05970d3cd4c96fc5 \ + --hash=sha256:7dd683fef0663e9f0f45cf541d788d24caa3ec9db50796b588e1757d8b3bc007 \ + --hash=sha256:85be818f5506e8a7753153def2c9550178f0ecae6a47b5e0e8dbb23f7cc90380 \ + --hash=sha256:948428a275741f0b64b113c955425a953314f4b9ab9997f73a72c83e68e569c8 \ + --hash=sha256:9ced0bd02ac751dd6319b0da88aaef24414e3b0dbc32bb4f24944821a3741a27 \ + --hash=sha256:9e12f105d2b6342c559c298afb674006bb2893afc7102dcf8a1b55b0486b4e40 \ + --hash=sha256:a8b33a82979e0a6a34ff435cc81317be1f95ec1ebb7a3a2d1c8a6a54f02ae44e \ + --hash=sha256:a9faff9e0c1f76f9fd55899d2ce785832efebab37eb8ae13995853aef178bef0 \ + --hash=sha256:af2fd1664d00a397d75f806985ddb36282091c2131a73a6485c23b4a34722263 \ + --hash=sha256:afefc1ed0a59785a7fb06ea7e1678e849c193e1e387db783579bc7b3056fcfcb \ + --hash=sha256:b1cd75a03ad8cb5bc40c90bfde68c0c47de423aa19e5c0f362b43520645eea94 \ + --hash=sha256:ba04cb5891d4c0c21b6da95eda8d7b090021508a294fff33464fc7d241e0856b \ + --hash=sha256:bf00f21eb5fb721dbaf73d1e9da6d02a1af7768f2ebcf9798be98beab8ba90f6 \ + --hash=sha256:c0425b277a59cff3d80ca42162a8de360f318438a2ac83570842a678d826d579 \ + --hash=sha256:c1aaa4b9c75798400ac043ce04d74e7830376c85095a5a6ed7cba2f17a266bf4 \ + --hash=sha256:c2a2a42198b696a6f48fad91709afb55176e66a5e566131219dba372fb7f8c59 \ + --hash=sha256:caeb583deeb5168e694b65cda8b4ee62abedfa66cf88488734466f2366b9c4e0 \ + 
--hash=sha256:cb014d58140a38135f16064c74c652ed57aa0b75cbf8bb59cac821f7edb5334e \ + --hash=sha256:ccf41f2efdf56994d22d73bef4ced1052161958169428d06ba9724ea9e9a64be \ + --hash=sha256:cd7e9857e5e63738b9d9fd707bc1f59c8b09e5177726d23664db393c59bb08bd \ + --hash=sha256:d76ac49f929aecaf82d83250b8347e099d7aecba0f4726c1d9b6df3b8bb5fe18 \ + --hash=sha256:d7e5c9973aa04c95650c96e5f5ad865fbf42d62079163ecfab1e01cbc2504c22 \ + --hash=sha256:dcf076a4474fe0d7367e5bbf5b052c7284fa1feca729c04176ce513521afd8a0 \ + --hash=sha256:e3297a6a4059b4acc3a1e9a8b04741f240a80044eef08ebd32e8b5bcdddce75b \ + --hash=sha256:ee08ebfa58f6e1aeff5697ab9582105bb620008c1caafb681e4c557e7483027b \ + --hash=sha256:ef3048ef05dbb552b89817713d9cac912e00d0fde4a3105c00d29e52e10c89af \ + --hash=sha256:fd1e3094f42d806d3d7c79162fc59e5910fcbe3a7360c385b8da969bc4493745 + # via + # matplotlib + # mlx-gen +fsspec==2026.4.0 \ + --hash=sha256:11ef7bb35dab8a394fde6e608221d5cf3e8499401c249bebaeaad760a1a8dec2 \ + --hash=sha256:301d8ac70ae90ef3ad05dcf94d6c3754a097f9b5fe4667d2787aa359ec7df7e4 + # via + # huggingface-hub + # torch +h11==0.16.0 \ + --hash=sha256:4e35b956cf45792e4caa5885e69fba00bdbc6ffafbfa020300e549b208ee5ff1 \ + --hash=sha256:63cf8bbe7522de3bf65932fda1d9c2772064ffb3dae62d55932da54b31cb6c86 + # via httpcore +hf-transfer==0.1.9 \ + --hash=sha256:035572865dab29d17e783fbf1e84cf1cb24f3fcf8f1b17db1cfc7fdf139f02bf \ + --hash=sha256:0d991376f0eac70a60f0cbc95602aa708a6f7c8617f28b4945c1431d67b8e3c8 \ + --hash=sha256:16f208fc678911c37e11aa7b586bc66a37d02e636208f18b6bc53d29b5df40ad \ + --hash=sha256:1a6bd16c667ebe89a069ca163060127a794fa3a3525292c900b8c8cc47985b0d \ + --hash=sha256:2c7fc1b85f4d0f76e452765d7648c9f4bfd0aedb9ced2ae1ebfece2d8cfaf8e2 \ + --hash=sha256:3a736dfbb2c84f5a2c975478ad200c0c8bfcb58a25a35db402678fb87ce17fa4 \ + --hash=sha256:3ebc4ab9023414880c8b1d3c38174d1c9989eb5022d37e814fa91a3060123eb0 \ + --hash=sha256:435cc3cdc8524ce57b074032b8fd76eed70a4224d2091232fa6a8cef8fd6803e \ + 
--hash=sha256:504b8427fd785dd8546d53b9fafe6e436bd7a3adf76b9dce556507650a7b4567 \ + --hash=sha256:57fd9880da1ee0f47250f735f791fab788f0aa1ee36afc49f761349869c8b4d9 \ + --hash=sha256:5828057e313de59300dd1abb489444bc452efe3f479d3c55b31a8f680936ba42 \ + --hash=sha256:5d561f0520f493c66b016d99ceabe69c23289aa90be38dd802d2aef279f15751 \ + --hash=sha256:6e94e8822da79573c9b6ae4d6b2f847c59a7a06c5327d7db20751b68538dc4f6 \ + --hash=sha256:8669dbcc7a3e2e8d61d42cd24da9c50d57770bd74b445c65123291ca842a7e7a \ + --hash=sha256:8674026f21ed369aa2a0a4b46000aca850fc44cd2b54af33a172ce5325b4fc82 \ + --hash=sha256:89a23f58b7b7effbc047b8ca286f131b17728c99a9f972723323003ffd1bb916 \ + --hash=sha256:8fd0167c4407a3bc4cdd0307e65ada2294ec04f1813d8a69a5243e379b22e9d8 \ + --hash=sha256:a5b366d34cd449fe9b20ef25941e6eef0460a2f74e7389f02e673e1f88ebd538 \ + --hash=sha256:cdca9bfb89e6f8f281890cc61a8aff2d3cecaff7e1a4d275574d96ca70098557 \ + --hash=sha256:d2fde99d502093ade3ab1b53f80da18480e9902aa960dab7f74fb1b9e5bc5746 \ + --hash=sha256:dc7fff1345980d6c0ebb92c811d24afa4b98b3e07ed070c8e38cc91fd80478c5 \ + --hash=sha256:e66acf91df4a8b72f60223059df3003062a5ae111757187ed1a06750a30e911b \ + --hash=sha256:e6ac4eddcd99575ed3735ed911ddf9d1697e2bd13aa3f0ad7e3904dd4863842e \ + --hash=sha256:ee8b10afedcb75f71091bcc197c526a6ebf5c58bbbadb34fdeee6160f55f619f \ + --hash=sha256:fc6bd19e1cc177c66bdef15ef8636ad3bde79d5a4f608c158021153b4573509d + # via mlx-gen +hf-xet==1.5.1 \ + --hash=sha256:0c97106032ef70467b4f6bc2d0ccc266d7613ee076afc56516c502f87ce1c4a6 \ + --hash=sha256:3474760d10e3bb6f92ff3f024fcb00c0b3e4001e9b035c7483e49a5dd17aa70f \ + --hash=sha256:4f561cbbb92f80960772059864b7fb07eae879adde1b2e781ec6f86f6ac26c59 \ + --hash=sha256:51ef4500dab3764b41135ee1381a4b62ce56fc54d4c92b719b59e597d6df5bf6 \ + --hash=sha256:6071d5ccb4d8d2cbd5fea5cc798da4f0ba3f44e25369591c4e89a4987050e61d \ + --hash=sha256:6208adb15d192b90e4c2ad2a27ed864359b2cb0f2494eb6d7c7f3699ac02e2bf \ + 
--hash=sha256:6762d89b9e3267dfd502b29b2a327b4525f33b17e7b509a78d94e2151a30ce30 \ + --hash=sha256:6abd35c3221eff63836618ddfb954dcf84798603f71d8e33e3ed7b04acfdbe6e \ + --hash=sha256:6f7a04a8ad962422e225bc49fbbac99dc1806764b1f3e54dbd154bffa7593947 \ + --hash=sha256:8298485c1e36e7e67cbd01eeb1376619b7af43d4f1ec245caae306f890a8a32d \ + --hash=sha256:892e3a3a3aecc12aded8b93cf4f9cd059282c7de0732f7d55026f3abdf474350 \ + --hash=sha256:93d090b57b211133f6c0dab0205ef5cb6d89162979ba75a74845045cc3063b8e \ + --hash=sha256:94e761bbd266bf4c03cee73753916062665ce8365aa40ed321f45afcb934b41e \ + --hash=sha256:97f212a88d14bbf573619a74b7fecb238de77d08fc702e54dec6f78276ca3283 \ + --hash=sha256:a93df2039190502835b1db8cd7e178b0b7b889fe9ab51299d5ced26e0dd879a4 \ + --hash=sha256:bf67e6ed10260cef62e852789dc91ebb03f382d5bdc4b1dbeb64763ea275e7d6 \ + --hash=sha256:c6b6cd08ca095058780b50b8ce4d6cbf6787bcf27841705d58a9d32246e3e47a \ + --hash=sha256:d48199c2bf4f8df0adc55d31d1368b6ec0e4d4f45bc86b08038089c23db0bed8 \ + --hash=sha256:dbf48c0d02cf0b2e568944330c60d9120c272dabe013bd892d48e25bc6797577 \ + --hash=sha256:e1af0de8ca6f190d4294a28b88023db64a1e2d1d719cab044baf75bec569e7a9 \ + --hash=sha256:e78e4e5192ad2b674c2e1160b651cb9134db974f8ae1835bdfbfb0166b894a43 \ + --hash=sha256:e7dbb40617410f432182d918e37c12303fe6700fd6aa6c5964e30a535a4461d6 \ + --hash=sha256:f4ad3ebd4c32dd2b27099d69dc7b2df821e30767e46fb6ee6a0713778243b8ff \ + --hash=sha256:f61e3665892a6c8c5e765395838b8ddf36185da835253d4bc4509a81e49fb342 \ + --hash=sha256:f7b3002f95d1c13e24bcb4537baa8f0eb3838957067c91bb4959bc004a6435f5 + # via huggingface-hub +httpcore==1.0.9 \ + --hash=sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55 \ + --hash=sha256:6e34463af53fd2ab5d807f399a9b45ea31c3dfa2276f15a2c3f00afff6e176e8 + # via httpx +httpx==0.28.1 \ + --hash=sha256:75e98c5f16b0f35b567856f597f06ff2270a374470a5c2392242528e3e3e42fc \ + --hash=sha256:d909fcccc110f8c7faf814ca82a9a4d816bc5a6dbfea25d6591d6985b8ba59ad + # via huggingface-hub 
+huggingface-hub==1.18.0 \ + --hash=sha256:729be4a976fb706dcc02d176bcda8a3f32bdf21a294e8f4b3dda6fbcbc9c1ab1 \ + --hash=sha256:f0c5ecd1ef8c6a60f86f61ee278f2c1570ba9e279c9f54de9094210723b3613b + # via + # mlx-gen + # tokenizers + # transformers +id==1.6.1 \ + --hash=sha256:d0732d624fb46fd4e7bc4e5152f00214450953b9e772c182c1c22964def1a069 \ + --hash=sha256:f5ec41ed2629a508f5d0988eda142e190c9c6da971100612c4de9ad9f9b237ca + # via twine +idna==3.18 \ + --hash=sha256:7f952cbe720b688055e3f87de14f5c3e5fdaa8bc3928985c4077ca689de849a2 \ + --hash=sha256:ffb385a7e039654cef1ab9ef32c6fafe283c0c0467bba1d9029738ce4a14a848 + # via + # anyio + # httpx + # requests +jaraco-classes==3.4.0 \ + --hash=sha256:47a024b51d0239c0dd8c8540c6c7f484be3b8fcf0b2d85c13825780d3b3f3acd \ + --hash=sha256:f662826b6bed8cace05e7ff873ce0f9283b5c924470fe664fff1c2f00f581790 + # via keyring +jaraco-context==6.1.2 \ + --hash=sha256:bf8150b79a2d5d91ae48629d8b427a8f7ba0e1097dd6202a9059f29a36379535 \ + --hash=sha256:f1a6c9d391e661cc5b8d39861ff077a7dc24dc23833ccee564b234b81c82dfe3 + # via keyring +jaraco-functools==4.5.0 \ + --hash=sha256:3bb5665ea4a020cf78a7040e89154c77edadb3ca74f366479669c5999aa70b03 \ + --hash=sha256:79ce39246eddbde4b3a03b77ea5f0f7878dc669b166a66cf3fa8e266aa3fa2f4 + # via keyring +jinja2==3.1.6 \ + --hash=sha256:0137fb05990d35f1275a587e9aee6d56da821fc83491a0fb838183be43f66d6d \ + --hash=sha256:85ece4451f492d0c13c5dd7c13a64681a86afae63a5f347908daf103ce6d2f67 + # via torch +keyring==25.7.0 \ + --hash=sha256:be4a0b195f149690c166e850609a477c532ddbfbaed96a404d4e43f8d5e2689f \ + --hash=sha256:fe01bd85eb3f8fb3dd0405defdeac9a5b4f6f0439edbb3149577f244a2e8245b + # via twine +kiwisolver==1.5.0 \ + --hash=sha256:012b1eb16e28718fa782b5e61dc6f2da1f0792ca73bd05d54de6cb9561665fc9 \ + --hash=sha256:01808c6d15f4c3e8559595d6d1fe6411c68e4a3822b4b9972b44473b24f4e679 \ + --hash=sha256:0255a027391d52944eae1dbb5d4cc5903f57092f3674e8e544cdd2622826b3f0 \ + 
--hash=sha256:0b85aad90cea8ac6797a53b5d5f2e967334fa4d1149f031c4537569972596cb8 \ + --hash=sha256:0bf3acf1419fa93064a4c2189ac0b58e3be7872bf6ee6177b0d4c63dc4cea276 \ + --hash=sha256:0c50b89ffd3e1a911c69a1dd3de7173c0cd10b130f56222e57898683841e4f96 \ + --hash=sha256:0cbe94b69b819209a62cb27bdfa5dc2a8977d8de2f89dfd97ba4f53ed3af754e \ + --hash=sha256:0df54df7e686afa55e6f21fb86195224a6d9beb71d637e8d7920c95cf0f89aac \ + --hash=sha256:0e3aafb33aed7479377e5e9a82e9d4bf87063741fc99fc7ae48b0f16e32bdd6f \ + --hash=sha256:12e91c215a96e39f57989c8912ae761286ac5a9584d04030ceb3368a357f017a \ + --hash=sha256:1465387ac63576c3e125e5337a6892b9e99e0627d52317f3ca79e6930d889d15 \ + --hash=sha256:16b85d37c2cbb3253226d26e64663f755d88a03439a9c47df6246b35defbdfb7 \ + --hash=sha256:1b0feb50971481a2cc44d94e88bdb02cdd497618252ae226b8eb1201b957e368 \ + --hash=sha256:1d49a49ac4cbfb7c1375301cd1ec90169dfeae55ff84710d782260ce77a75a02 \ + --hash=sha256:1d9daea4ea6b9be74fe2f01f7fbade8d6ffab263e781274cffca0dba9be9eec9 \ + --hash=sha256:1dd9b0b119a350976a6d781e7278ec7aca0b201e1a9e2d23d9804afecb6ca681 \ + --hash=sha256:1f1489f769582498610e015a8ef2d36f28f505ab3096d0e16b4858a9ec214f57 \ + --hash=sha256:2517e24d7315eb51c10664cdb865195df38ab74456c677df67bb47f12d088a27 \ + --hash=sha256:295d9ffe712caa9f8a3081de8d32fc60191b4b51c76f02f951fd8407253528f4 \ + --hash=sha256:2a075bd7bd19c70cf67c8badfa36cf7c5d8de3c9ddb8420c51e10d9c50e94920 \ + --hash=sha256:32cc0a5365239a6ea0c6ed461e8838d053b57e397443c0ca894dcc8e388d4374 \ + --hash=sha256:332b4f0145c30b5f5ad9374881133e5aa64320428a57c2c2b61e9d891a51c2f3 \ + --hash=sha256:377815a8616074cabbf3f53354e1d040c35815a134e01d7614b7692e4bf8acfa \ + --hash=sha256:38f4a703656f493b0ad185211ccfca7f0386120f022066b018eb5296d8613e23 \ + --hash=sha256:3ac2360e93cb41be81121755c6462cff3beaa9967188c866e5fce5cf13170859 \ + --hash=sha256:3c4923e404d6bcd91b6779c009542e5647fef32e4a5d75e115e3bbac6f2335eb \ + --hash=sha256:3cdcb35dc9d807259c981a85531048ede628eabcffb3239adf3d17463518992d \ + 
--hash=sha256:41024ed50e44ab1a60d3fe0a9d15a4ccc9f5f2b1d814ff283c8d01134d5b81bc \ + --hash=sha256:413b820229730d358efd838ecbab79902fe97094565fdc80ddb6b0a18c18a581 \ + --hash=sha256:4432b835675f0ea7414aab3d37d119f7226d24869b7a829caeab49ebda407b0c \ + --hash=sha256:4db576bb8c3ef9365f8b40fe0f671644de6736ae2c27a2c62d7d8a1b4329f099 \ + --hash=sha256:4e7f886f47ab881692f278ae901039a234e4025a68e6dfab514263a0b1c4ae05 \ + --hash=sha256:4e9750bc21b886308024f8a54ccb9a2cc38ac9fa813bf4348434e3d54f337ff9 \ + --hash=sha256:5060731cc3ed12ca3a8b57acd4aeca5bbc2f49216dd0bec1650a1acd89486bcd \ + --hash=sha256:50847dca5d197fcbd389c805aa1a1cf32f25d2e7273dc47ab181a517666b68cc \ + --hash=sha256:5092eb5b1172947f57d6ea7d89b2f29650414e4293c47707eb499ec07a0ac796 \ + --hash=sha256:5124d1ea754509b09e53738ec185584cc609aae4a3b510aaf4ed6aa047ef9303 \ + --hash=sha256:51e8c4084897de9f05898c2c2a39af6318044ae969d46ff7a34ed3f96274adca \ + --hash=sha256:530a3fd64c87cffa844d4b6b9768774763d9caa299e9b75d8eca6a4423b31314 \ + --hash=sha256:56fa888f10d0f367155e76ce849fa1166fc9730d13bd2d65a2aa13b6f5424489 \ + --hash=sha256:58f812017cd2985c21fbffb4864d59174d4903dd66fa23815e74bbc7a0e2dd57 \ + --hash=sha256:59cd8683f575d96df5bb48f6add94afc055012c29e28124fcae2b63661b9efb1 \ + --hash=sha256:5ae8e62c147495b01a0f4765c878e9bfdf843412446a247e28df59936e99e797 \ + --hash=sha256:5b233ea3e165e43e35dba1d2b8ecc21cf070b45b65ae17dd2747d2713d942021 \ + --hash=sha256:6176c1811d9d5a04fa391c490cc44f451e240697a16977f11c6f722efb9041db \ + --hash=sha256:62f59da443c4f4849f73a51a193b1d9d258dcad0c41bc4d1b8fb2bcc04bfeb22 \ + --hash=sha256:6783e069732715ad0c3ce96dbf21dbc2235ab0593f2baf6338101f70371f4028 \ + --hash=sha256:6ab8ba9152203feec73758dad83af9a0bbe05001eb4639e547207c40cfb52083 \ + --hash=sha256:70d593af6a6ca332d1df73d519fddb5148edb15cd90d5f0155e3746a6d4fcc65 \ + --hash=sha256:72ec46b7eba5b395e0a7b63025490d3214c11013f4aacb4f5e8d6c3041829588 \ + --hash=sha256:7a32f72973f0f950c1920475d5c5ea3d971b81b6f0ec53b8d0a956cc965f22e0 \ + 
--hash=sha256:7a4aa69609f40fce3cbc3f87b2061f042eee32f94b8f11db707b66a26461591a \ + --hash=sha256:7c60d3c9b06fb23bd9c6139281ccbdc384297579ae037f08ae90c69f6845c0b1 \ + --hash=sha256:800ee55980c18545af444d93fdd60c56b580db5cc54867d8cbf8a1dc0829938c \ + --hash=sha256:80aa065ffd378ff784822a6d7c3212f2d5f5e9c3589614b5c228b311fd3063ac \ + --hash=sha256:86e0287879f75621ae85197b0877ed2f8b7aa57b511c7331dce2eb6f4de7d476 \ + --hash=sha256:893ff3a711d1b515ba9da14ee090519bad4610ed1962fbe298a434e8c5f8db53 \ + --hash=sha256:89fc958c702ee9a745e4700378f5d23fddbc46ff89e8fdbf5395c24d5c1452a3 \ + --hash=sha256:8c63c91f95173f9c2a67c7c526b2cea976828a0e7fced9cdcead2802dc10f8a4 \ + --hash=sha256:8df31fe574b8b3993cc61764f40941111b25c2d9fea13d3ce24a49907cd2d615 \ + --hash=sha256:8f9baf6f0a6e7571c45c8863010b45e837c3ee1c2c77fcd6ef423be91b21fedb \ + --hash=sha256:9027d773c4ff81487181a925945743413f6069634d0b122d0b37684ccf4f1e18 \ + --hash=sha256:9190426b7aa26c5229501fa297b8d0653cfd3f5a36f7990c264e157cbf886b3b \ + --hash=sha256:940dda65d5e764406b9fb92761cbf462e4e63f712ab60ed98f70552e496f3bf1 \ + --hash=sha256:94eff26096eb5395136634622515b234ecb6c9979824c1f5004c6e3c3c85ccd2 \ + --hash=sha256:9eed0f7edbb274413b6ee781cca50541c8c0facd3d6fd289779e494340a2b85c \ + --hash=sha256:ad4ae4ffd1ee9cd11357b4c66b612da9888f4f4daf2f36995eda64bd45370cac \ + --hash=sha256:b0f172dc8ffaccb8522d7c5d899de00133f2f1ca7b0a49b7da98e901de87bf2d \ + --hash=sha256:b2af221f268f5af85e776a73d62b0845fc8baf8ef0abfae79d29c77d0e776aaf \ + --hash=sha256:b7d335370ae48a780c6e6a6bbfa97342f563744c39c35562f3f367665f5c1de2 \ + --hash=sha256:b83af57bdddef03c01a9138034c6ff03181a3028d9a1003b301eb1a55e161a3f \ + --hash=sha256:bb5136fb5352d3f422df33f0c879a1b0c204004324150cc3b5e3c4f310c9049f \ + --hash=sha256:bc4d8e252f532ab46a1de9349e2d27b91fce46736a9eedaa37beaca66f574ed4 \ + --hash=sha256:bdd3e53429ff02aa319ba59dfe4ceeec345bf46cf180ec2cf6fd5b942e7975e9 \ + --hash=sha256:be12f931839a3bdfe28b584db0e640a65a8bcbc24560ae3fdb025a449b3d754e \ + 
--hash=sha256:be4a51a55833dc29ab5d7503e7bcb3b3af3402d266018137127450005cdfe737 \ + --hash=sha256:beb7f344487cdcb9e1efe4b7a29681b74d34c08f0043a327a74da852a6749e7b \ + --hash=sha256:bf4679a3d71012a7c2bf360e5cd878fbd5e4fcac0896b56393dec239d81529ed \ + --hash=sha256:c0e1403fd7c26d77c1f03e096dc58a5c726503fa0db0456678b8668f76f521e3 \ + --hash=sha256:c31c13da98624f957b0fb1b5bae5383b2333c2c3f6793d9825dd5ce79b525cb7 \ + --hash=sha256:c438f6ca858697c9ab67eb28246c92508af972e114cac34e57a6d4ba17a3ac08 \ + --hash=sha256:c8277104ded0a51e699c8c3aff63ce2c56d4ed5519a5f73e0fd7057f959a2b9e \ + --hash=sha256:c95cab08d1965db3d84a121f1c7ce7479bdd4072c9b3dafd8fecce48a2e6b902 \ + --hash=sha256:cc0b66c1eec9021353a4b4483afb12dfd50e3669ffbb9152d6842eb34c7e29fd \ + --hash=sha256:cdee07c4d7f6d72008d3f73b9bf027f4e11550224c7c50d8df1ae4a37c1402a6 \ + --hash=sha256:ce9bf03dad3b46408c08649c6fbd6ca28a9fce0eb32fdfffa6775a13103b5310 \ + --hash=sha256:cff8e5383db4989311f99e814feeb90c4723eb4edca425b9d5d9c3fefcdd9537 \ + --hash=sha256:d168fda2dbff7b9b5f38e693182d792a938c31db4dac3a80a4888de603c99554 \ + --hash=sha256:d1ffeb80b5676463d7a7d56acbe8e37a20ce725570e09549fe738e02ca6b7e1e \ + --hash=sha256:d36ca54cb4c6c4686f7cbb7b817f66f5911c12ddb519450bbe86707155028f87 \ + --hash=sha256:d4193f3d9dc3f6f79aaed0e5637f45d98850ebf01f7ca20e69457f3e8946b66a \ + --hash=sha256:d5cd5189fc2b6a538b75ae45433140c4823463918f7b1617c31e68b085c0022c \ + --hash=sha256:d618fd27420381a4f6044faa71f46d8bfd911bd077c555f7138ed88729bfbe79 \ + --hash=sha256:d76e2d8c75051d58177e762164d2e9ab92886534e3a12e795f103524f221dd8e \ + --hash=sha256:daae526907e262de627d8f70058a0f64acc9e2641c164c99c8f594b34a799a16 \ + --hash=sha256:db485b3847d182b908b483b2ed133c66d88d49cacf98fd278fadafe11b4478d1 \ + --hash=sha256:dd952e03bfbb096cfe2dd35cd9e00f269969b67536cb4370994afc20ff2d0875 \ + --hash=sha256:dda366d548e89a90d88a86c692377d18d8bd64b39c1fb2b92cb31370e2896bbd \ + --hash=sha256:e315e5ec90d88e140f57696ff85b484ff68bb311e36f2c414aa4286293e6dee0 \ + 
--hash=sha256:e4415a8db000bf49a6dd1c478bf70062eaacff0f462b92b0ba68791a905861f9 \ + --hash=sha256:e7a116ae737f0000343218c4edf5bd45893bfeaff0993c0b215d7124c9f77646 \ + --hash=sha256:e7c4c09a490dc4d4a7f8cbee56c606a320f9dc28cf92a7157a39d1ce7676a657 \ + --hash=sha256:ebae99ed6764f2b5771c522477b311be313e8841d2e0376db2b10922daebbba4 \ + --hash=sha256:ec4c85dc4b687c7f7f15f553ff26a98bfe8c58f5f7f0ac8905f0ba4c7be60232 \ + --hash=sha256:ed3a984b31da7481b103f68776f7128a89ef26ed40f4dc41a2223cda7fb24819 \ + --hash=sha256:f18c2d9782259a6dc132fdc7a63c168cbc74b35284b6d75c673958982a378384 \ + --hash=sha256:f1f9f4121ec58628c96baa3de1a55a4e3a333c5102c8e94b64e23bf7b2083309 \ + --hash=sha256:f42c23db5d1521218a3276bb08666dcb662896a0be7347cba864eca45ff64ede \ + --hash=sha256:f443b4825c50a51ee68585522ab4a1d1257fac65896f282b4c6763337ac9f5d2 \ + --hash=sha256:f6764a4ccab3078db14a632420930f6186058750df066b8ea2a7106df91d3203 \ + --hash=sha256:f7c7553b13f69c1b29a5bde08ddc6d9d0c8bfb84f9ed01c30db25944aeb852a7 \ + --hash=sha256:fa6248cd194edff41d7ea9425ced8ca3a6f838bfb295f6f1d6e6bb694a8518df \ + --hash=sha256:fa8eb9ecdb7efb0b226acec134e0d709e87a909fa4971a54c0c4f6e88635484c \ + --hash=sha256:fc20894c3d21194d8041a28b65622d5b86db786da6e3cfe73f0c762951a61167 \ + --hash=sha256:fc4d3f1fb9ca0ae9f97b095963bc6326f1dbfd3779d6679a1e016b9baaa153d3 \ + --hash=sha256:fd40bb9cd0891c4c3cb1ddf83f8bbfa15731a248fdc8162669405451e2724b09 \ + --hash=sha256:ff710414307fefa903e0d9bdf300972f892c23477829f49504e59834f4195398 + # via matplotlib +markdown-it-py==4.2.0 \ + --hash=sha256:04a21681d6fbb623de53f6f364d352309d4094dd4194040a10fd51833e418d49 \ + --hash=sha256:9f7ebbcd14fe59494226453aed97c1070d83f8d24b6fc3a3bcf9a38092641c4a + # via rich +markupsafe==3.0.3 \ + --hash=sha256:0303439a41979d9e74d18ff5e2dd8c43ed6c6001fd40e5bf2e43f7bd9bbc523f \ + --hash=sha256:068f375c472b3e7acbe2d5318dea141359e6900156b5b2ba06a30b169086b91a \ + --hash=sha256:0bf2a864d67e76e5c9a34dc26ec616a66b9888e25e7b9460e1c76d3293bd9dbf \ + 
--hash=sha256:0db14f5dafddbb6d9208827849fad01f1a2609380add406671a26386cdf15a19 \ + --hash=sha256:0eb9ff8191e8498cca014656ae6b8d61f39da5f95b488805da4bb029cccbfbaf \ + --hash=sha256:0f4b68347f8c5eab4a13419215bdfd7f8c9b19f2b25520968adfad23eb0ce60c \ + --hash=sha256:1085e7fbddd3be5f89cc898938f42c0b3c711fdcb37d75221de2666af647c175 \ + --hash=sha256:116bb52f642a37c115f517494ea5feb03889e04df47eeff5b130b1808ce7c219 \ + --hash=sha256:12c63dfb4a98206f045aa9563db46507995f7ef6d83b2f68eda65c307c6829eb \ + --hash=sha256:133a43e73a802c5562be9bbcd03d090aa5a1fe899db609c29e8c8d815c5f6de6 \ + --hash=sha256:1353ef0c1b138e1907ae78e2f6c63ff67501122006b0f9abad68fda5f4ffc6ab \ + --hash=sha256:15d939a21d546304880945ca1ecb8a039db6b4dc49b2c5a400387cdae6a62e26 \ + --hash=sha256:177b5253b2834fe3678cb4a5f0059808258584c559193998be2601324fdeafb1 \ + --hash=sha256:1872df69a4de6aead3491198eaf13810b565bdbeec3ae2dc8780f14458ec73ce \ + --hash=sha256:1b4b79e8ebf6b55351f0d91fe80f893b4743f104bff22e90697db1590e47a218 \ + --hash=sha256:1b52b4fb9df4eb9ae465f8d0c228a00624de2334f216f178a995ccdcf82c4634 \ + --hash=sha256:1ba88449deb3de88bd40044603fafffb7bc2b055d626a330323a9ed736661695 \ + --hash=sha256:1cc7ea17a6824959616c525620e387f6dd30fec8cb44f649e31712db02123dad \ + --hash=sha256:218551f6df4868a8d527e3062d0fb968682fe92054e89978594c28e642c43a73 \ + --hash=sha256:26a5784ded40c9e318cfc2bdb30fe164bdb8665ded9cd64d500a34fb42067b1c \ + --hash=sha256:2713baf880df847f2bece4230d4d094280f4e67b1e813eec43b4c0e144a34ffe \ + --hash=sha256:2a15a08b17dd94c53a1da0438822d70ebcd13f8c3a95abe3a9ef9f11a94830aa \ + --hash=sha256:2f981d352f04553a7171b8e44369f2af4055f888dfb147d55e42d29e29e74559 \ + --hash=sha256:32001d6a8fc98c8cb5c947787c5d08b0a50663d139f1305bac5885d98d9b40fa \ + --hash=sha256:3524b778fe5cfb3452a09d31e7b5adefeea8c5be1d43c4f810ba09f2ceb29d37 \ + --hash=sha256:3537e01efc9d4dccdf77221fb1cb3b8e1a38d5428920e0657ce299b20324d758 \ + --hash=sha256:35add3b638a5d900e807944a078b51922212fb3dedb01633a8defc4b01a3c85f \ + 
--hash=sha256:38664109c14ffc9e7437e86b4dceb442b0096dfe3541d7864d9cbe1da4cf36c8 \ + --hash=sha256:3a7e8ae81ae39e62a41ec302f972ba6ae23a5c5396c8e60113e9066ef893da0d \ + --hash=sha256:3b562dd9e9ea93f13d53989d23a7e775fdfd1066c33494ff43f5418bc8c58a5c \ + --hash=sha256:457a69a9577064c05a97c41f4e65148652db078a3a509039e64d3467b9e7ef97 \ + --hash=sha256:4bd4cd07944443f5a265608cc6aab442e4f74dff8088b0dfc8238647b8f6ae9a \ + --hash=sha256:4e885a3d1efa2eadc93c894a21770e4bc67899e3543680313b09f139e149ab19 \ + --hash=sha256:4faffd047e07c38848ce017e8725090413cd80cbc23d86e55c587bf979e579c9 \ + --hash=sha256:509fa21c6deb7a7a273d629cf5ec029bc209d1a51178615ddf718f5918992ab9 \ + --hash=sha256:5678211cb9333a6468fb8d8be0305520aa073f50d17f089b5b4b477ea6e67fdc \ + --hash=sha256:591ae9f2a647529ca990bc681daebdd52c8791ff06c2bfa05b65163e28102ef2 \ + --hash=sha256:5a7d5dc5140555cf21a6fefbdbf8723f06fcd2f63ef108f2854de715e4422cb4 \ + --hash=sha256:69c0b73548bc525c8cb9a251cddf1931d1db4d2258e9599c28c07ef3580ef354 \ + --hash=sha256:6b5420a1d9450023228968e7e6a9ce57f65d148ab56d2313fcd589eee96a7a50 \ + --hash=sha256:722695808f4b6457b320fdc131280796bdceb04ab50fe1795cd540799ebe1698 \ + --hash=sha256:729586769a26dbceff69f7a7dbbf59ab6572b99d94576a5592625d5b411576b9 \ + --hash=sha256:77f0643abe7495da77fb436f50f8dab76dbc6e5fd25d39589a0f1fe6548bfa2b \ + --hash=sha256:795e7751525cae078558e679d646ae45574b47ed6e7771863fcc079a6171a0fc \ + --hash=sha256:7be7b61bb172e1ed687f1754f8e7484f1c8019780f6f6b0786e76bb01c2ae115 \ + --hash=sha256:7c3fb7d25180895632e5d3148dbdc29ea38ccb7fd210aa27acbd1201a1902c6e \ + --hash=sha256:7e68f88e5b8799aa49c85cd116c932a1ac15caaa3f5db09087854d218359e485 \ + --hash=sha256:83891d0e9fb81a825d9a6d61e3f07550ca70a076484292a70fde82c4b807286f \ + --hash=sha256:8485f406a96febb5140bfeca44a73e3ce5116b2501ac54fe953e488fb1d03b12 \ + --hash=sha256:8709b08f4a89aa7586de0aadc8da56180242ee0ada3999749b183aa23df95025 \ + --hash=sha256:8f71bc33915be5186016f675cd83a1e08523649b0e33efdb898db577ef5bb009 \ + 
--hash=sha256:915c04ba3851909ce68ccc2b8e2cd691618c4dc4c4232fb7982bca3f41fd8c3d \ + --hash=sha256:949b8d66bc381ee8b007cd945914c721d9aba8e27f71959d750a46f7c282b20b \ + --hash=sha256:94c6f0bb423f739146aec64595853541634bde58b2135f27f61c1ffd1cd4d16a \ + --hash=sha256:9a1abfdc021a164803f4d485104931fb8f8c1efd55bc6b748d2f5774e78b62c5 \ + --hash=sha256:9b79b7a16f7fedff2495d684f2b59b0457c3b493778c9eed31111be64d58279f \ + --hash=sha256:a320721ab5a1aba0a233739394eb907f8c8da5c98c9181d1161e77a0c8e36f2d \ + --hash=sha256:a4afe79fb3de0b7097d81da19090f4df4f8d3a2b3adaa8764138aac2e44f3af1 \ + --hash=sha256:ad2cf8aa28b8c020ab2fc8287b0f823d0a7d8630784c31e9ee5edea20f406287 \ + --hash=sha256:b8512a91625c9b3da6f127803b166b629725e68af71f8184ae7e7d54686a56d6 \ + --hash=sha256:bc51efed119bc9cfdf792cdeaa4d67e8f6fcccab66ed4bfdd6bde3e59bfcbb2f \ + --hash=sha256:bdc919ead48f234740ad807933cdf545180bfbe9342c2bb451556db2ed958581 \ + --hash=sha256:bdd37121970bfd8be76c5fb069c7751683bdf373db1ed6c010162b2a130248ed \ + --hash=sha256:be8813b57049a7dc738189df53d69395eba14fb99345e0a5994914a3864c8a4b \ + --hash=sha256:c0c0b3ade1c0b13b936d7970b1d37a57acde9199dc2aecc4c336773e1d86049c \ + --hash=sha256:c47a551199eb8eb2121d4f0f15ae0f923d31350ab9280078d1e5f12b249e0026 \ + --hash=sha256:c4ffb7ebf07cfe8931028e3e4c85f0357459a3f9f9490886198848f4fa002ec8 \ + --hash=sha256:ccfcd093f13f0f0b7fdd0f198b90053bf7b2f02a3927a30e63f3ccc9df56b676 \ + --hash=sha256:d2ee202e79d8ed691ceebae8e0486bd9a2cd4794cec4824e1c99b6f5009502f6 \ + --hash=sha256:d53197da72cc091b024dd97249dfc7794d6a56530370992a5e1a08983ad9230e \ + --hash=sha256:d6dd0be5b5b189d31db7cda48b91d7e0a9795f31430b7f271219ab30f1d3ac9d \ + --hash=sha256:d88b440e37a16e651bda4c7c2b930eb586fd15ca7406cb39e211fcff3bf3017d \ + --hash=sha256:de8a88e63464af587c950061a5e6a67d3632e36df62b986892331d4620a35c01 \ + --hash=sha256:df2449253ef108a379b8b5d6b43f4b1a8e81a061d6537becd5582fba5f9196d7 \ + --hash=sha256:e1c1493fb6e50ab01d20a22826e57520f1284df32f2d8601fdd90b6304601419 \ + 
--hash=sha256:e1cf1972137e83c5d4c136c43ced9ac51d0e124706ee1c8aa8532c1287fa8795 \ + --hash=sha256:e2103a929dfa2fcaf9bb4e7c091983a49c9ac3b19c9061b6d5427dd7d14d81a1 \ + --hash=sha256:e56b7d45a839a697b5eb268c82a71bd8c7f6c94d6fd50c3d577fa39a9f1409f5 \ + --hash=sha256:e8afc3f2ccfa24215f8cb28dcf43f0113ac3c37c2f0f0806d8c70e4228c5cf4d \ + --hash=sha256:e8fc20152abba6b83724d7ff268c249fa196d8259ff481f3b1476383f8f24e42 \ + --hash=sha256:eaa9599de571d72e2daf60164784109f19978b327a3910d3e9de8c97b5b70cfe \ + --hash=sha256:ec15a59cf5af7be74194f7ab02d0f59a62bdcf1a537677ce67a2537c9b87fcda \ + --hash=sha256:f190daf01f13c72eac4efd5c430a8de82489d9cff23c364c3ea822545032993e \ + --hash=sha256:f34c41761022dd093b4b6896d4810782ffbabe30f2d443ff5f083e0cbbb8c737 \ + --hash=sha256:f3e98bb3798ead92273dc0e5fd0f31ade220f59a266ffd8a4f6065e0a3ce0523 \ + --hash=sha256:f42d0984e947b8adf7dd6dde396e720934d12c506ce84eea8476409563607591 \ + --hash=sha256:f71a396b3bf33ecaa1626c255855702aca4d3d9fea5e051b41ac59a9c1c41edc \ + --hash=sha256:f9e130248f4462aaa8e2552d547f36ddadbeaa573879158d721bbd33dfe4743a \ + --hash=sha256:fed51ac40f757d41b7c48425901843666a6677e3e8eb0abcff09e4ba6e664f50 + # via jinja2 +matplotlib==3.10.9 \ + --hash=sha256:09218df8a93712bd6ea133e83a153c755448cf7868316c531cffcc43f69d1cc9 \ + --hash=sha256:10cc5ce06d10231c36f40e875f3c7e8050362a4ee8f0ee5d29a6b3277d57bb42 \ + --hash=sha256:172db52c9e683f5d12eaf57f0f54834190e12581fe1cc2a19595a8f5acb4e77d \ + --hash=sha256:1872fb212a05b729e649754a72d5da61d03e0554d76e80303b6f83d1d2c0552b \ + --hash=sha256:1aa972116abb4c9d201bf245620b433726cb6856f3bef6a78f776a00f5c92d37 \ + --hash=sha256:1e7698ac9868428e84d2c967424803b2472ff7167d9d6590d4204ed775343c3b \ + --hash=sha256:2dc9477819ffd78ad12a20df1d9d6a6bd4fec6aaa9072681465fddca052f1456 \ + --hash=sha256:3225f4e1edcb8c86c884ddf79ebe20ecd0a67d30188f279897554ccd8fded4dc \ + --hash=sha256:336b9acc64d309063126edcdaca00db9373af3c476bb94388fe9c5a53ad13e6f \ + 
--hash=sha256:345f6f68ecc8da0ca56fad2ea08fde1a115eda530079eca185d50a7bc3e146c6 \ + --hash=sha256:34cf8167e023ad956c15f36302911d5406bd99a9862c1a8499ea6f7c0e015dc2 \ + --hash=sha256:3fc0364dfbe1d07f6d15c5ebd0c5bf89e126916e5a8667dd4a7a6e84c36653d4 \ + --hash=sha256:41cb28c2bd769aa3e98322c6ab09854cbcc52ab69d2759d681bba3e327b2b320 \ + --hash=sha256:42fb814efabe95c06c1994d8ab5a8385f43a249e23badd3ba931d4308e5bca20 \ + --hash=sha256:4e42042d54db34fda4e95a7bd3e5789c2a995d2dad3eb8850232ee534092fbbf \ + --hash=sha256:4edcfbd8565339aa62f1cd4012f7180926fdbe71850f7b0d3c379c175cd6b66c \ + --hash=sha256:51bf0ddbdc598e060d46c16b5590708f81a1624cefbaaf62f6a81bf9285b8c80 \ + --hash=sha256:56fc0bd271b00025c6edfdc7c2dcd247372c8e1544971d62e1dc7c17367e8bf9 \ + --hash=sha256:59476c6d29d612b8e9bb6ce8c5b631be6ba8f9e3a2421f22a02b192c7dd28716 \ + --hash=sha256:6640f75af2c6148293caa0a2b39dd806a492dd66c8a8b04035813e33d0fd2585 \ + --hash=sha256:68cfdcede415f7c8f5577b03303dd94526cdb6d11036cecdc205e08733b2d2bb \ + --hash=sha256:6b63d9c7c769b88ab81e10dc86e4e0607cf56817b9f9e6cf24b2a5f1693b8e38 \ + --hash=sha256:6be157fe17fc37cb95ac1d7374cf717ce9259616edec911a78d9d26dae8522d4 \ + --hash=sha256:6c63ebcd8b4b169eb2f5c200552ae6b8be8999a005b6b507ed76fb8d7d674fe2 \ + --hash=sha256:77210dce9cb8153dffc967efaae990543392563d5a376d4dd8539bebcb0ed217 \ + --hash=sha256:7a8d66a55def891c33147ba3ba9bfcabf0b526a43764c818acbb4525e5ed0838 \ + --hash=sha256:82368699727bfb7b0182e1aa13082e3c08e092fa1a25d3e1fd92405bff96f6d4 \ + --hash=sha256:82834c3c292d24d3a8aae77cd2d20019de69d692a34a970e4fdb8d33e2ea3dda \ + --hash=sha256:8e436d155fa8a3399dc62683f8f5d0e2e50d25d0144a73edd73f82eec8f4abfb \ + --hash=sha256:8f3bcac1ca5ed000a6f4337d47ba67dfddf37ed6a46c15fd7f014997f7bf865f \ + --hash=sha256:97e35e8d39ccc85859095e01a53847432ba9a53ddf7986f7a54a11b73d0e143f \ + --hash=sha256:985f2238880e2e69093f588f5fe2e46771747febf0649f3cf7f7b7480875317f \ + --hash=sha256:a49f1eadc84ca85fd72fa4e89e70e61bf86452df6f971af04b12c60761a0772c \ + 
--hash=sha256:a5a6104ed666402ba5106d7f36e0e0cdca4e8d7fa4d39708ca88019e2835a2eb \ + --hash=sha256:aba1615dabe83188e19d4f75a253c6a08423e04c1425e64039f800050a69de6b \ + --hash=sha256:ae20801130378b82d647ff5047c07316295b68dc054ca6b3c13519d0ea624285 \ + --hash=sha256:ae2f11957b27ce53497dd4d7b235c4d4f1faf383dfb39d0c5beb833bff883294 \ + --hash=sha256:b049278ddce116aaa1c1377ebf58adea909132dfce0281cf7e3a1ea9fc2e2c65 \ + --hash=sha256:b1b745c489cd1a77a0dc1120a05dc87af9798faebc913601feb8c73d89bf2d1e \ + --hash=sha256:b2b9516251cb89ff618d757daec0e2ed1bf21248013844a853d87ef85ab3081d \ + --hash=sha256:b580440f1ff81a0e34122051a3dfabb7e4b7f9e380629929bde0eff9af72165f \ + --hash=sha256:ba7b3b8ef09eab7df0e86e9ae086faa433efbfbdb46afcb3aa16aabf779469a8 \ + --hash=sha256:c27df8b3848f32a83d1767566595e43cfaa4460380974da06f4279a7ec143c39 \ + --hash=sha256:d091f9d758b34aaaaa6331d13574bf01891d903b3dec59bfff458ef7551de5d6 \ + --hash=sha256:d730e984eddf56974c3e72b6129c7ca462ac38dc624338f4b0b23eb23ecba00f \ + --hash=sha256:d75d11c949914165976c621b2324f9ef162af7ebf4b057ddf95dd1dba7e5edcf \ + --hash=sha256:d843374407c4017a6403b59c6c81606773d136f3259d5b6da3131bc814542cc2 \ + --hash=sha256:da4e09638420548f31c354032a6250e473c68e5a4e96899b4844cf39ddea23fe \ + --hash=sha256:de2445a0c6690d21b7eb6ce071cebad6d40a2e9bdf10d039074a96ba19797b99 \ + --hash=sha256:dfca0129678bd56379db26c52b5d77ed7de314c047492fbdc763aa7501710cfb \ + --hash=sha256:e9fae004b941b23ff2edcf1567a857ed77bafc8086ffa258190462328434faf8 \ + --hash=sha256:f0c3c28d9fbcc1fe7a03be236d73430cf6409c41fb2383a7ac52fe932b072cb1 \ + --hash=sha256:f4399f64b3e94cd500195490972ae1ee81170df1636fa15364d157d5bdd7b921 \ + --hash=sha256:f76e640a5268850bfda54b5131b1b1941cc685e42c5fa98ed9f2d64038308cba \ + --hash=sha256:fd66508e8c6877d98e586654b608a0456db8d7e8a546eb1e2600efd957302358 + # via mlx-gen +mdurl==0.1.2 \ + --hash=sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8 \ + 
--hash=sha256:bb413d29f5eea38f31dd4754dd7377d4465116fb207585f97bf925588687c1ba + # via markdown-it-py +mlx==0.31.2 \ + --hash=sha256:117c7583cae0ca107cd53c591cc34f8e75f97a505aa47088844b7dc0fc69dc67 \ + --hash=sha256:1b3fb0dda955b0d552ce57bdd6f42b3309ab21b067e40587d6848443d307e91f \ + --hash=sha256:2a64db61b2840f28bae08354e6f999698e30381af201cc12354290673c96213b \ + --hash=sha256:34b0171cd9eb5c43fdd82091f6135d6ccc5a065363a4a3e68fac64fb4e53d37c \ + --hash=sha256:4a3f181b367d404e44a6bd68ef5eb573930809ac60cacd51d0c851c629b1b651 \ + --hash=sha256:51ca102db641b01e7cb083ce8ecb580e281530a141a7ca12544bb370641630ae \ + --hash=sha256:53c8d57ffa9ce77f8355663be05014c0dd37280e57f19126fb0a24389a30684b \ + --hash=sha256:59ccbd0f0044d4f97f11ebcbf0c480bc9e962935fd96275f120954afea65be8a \ + --hash=sha256:69fbc94bf53607a75af9eb3e22c354738a6fe4e25aa4e2b20934b009a4bba1f3 \ + --hash=sha256:70297cbef7479429f69c966bfed10da20a6f0c2aa997eec2b4f6ba1a07caf2ef \ + --hash=sha256:99572133181481640a8bf8d449daf083816d0af3ee050c8adfc5bf45ceca91c6 \ + --hash=sha256:a13c9ce23c3deef6aa5a09315e7953e1a5dc311e851fa16fc74c81fb2509c0b9 \ + --hash=sha256:b0764bf11fc3a71dee988e19275eef67775cab63112d8bb7ef173ca8b2a1247c \ + --hash=sha256:b29cf940f34205f09bb552ac60465ae833c4ae640b52777c6d725ddbad8461ca \ + --hash=sha256:b368f7ede4238cc44076e4843820338c453c21ee50bd3ee26d4b182c179fd8e1 \ + --hash=sha256:c05981684279a8935d58b0dde3ea5b02d210c3bad3319aa0e9934ec2df165752 \ + --hash=sha256:c0ff158b7ac93a4b5659adbc70053498b30a5964fc45f78596398e056a96c36a \ + --hash=sha256:c71dff00cc1b363d542f111d9e8b7b59dadb65b29d027f798b71ea34da75b665 \ + --hash=sha256:cd1f4189e5f1bc68735f44eb63ce98ae09d66ac75d7ab5b15a41afae7e9f0513 \ + --hash=sha256:cd5d42b0b2bee7efe1b0680a7e302943dd33b92c879cffa0358ffdb5a4a8d27b \ + --hash=sha256:e3e2818157371501de097887f371784227f9dd9c91e177f986db7b25319c55d7 \ + --hash=sha256:e5067aaf2be1f3d7bba5be52348775804f111173c1ed04639618fd713b1a530f \ + 
--hash=sha256:e81798c610f95a09c642c89214ba5c23b72ce18ce4728184aceabe7eddca33d7 \ + --hash=sha256:ebdc47b87b4b0216ceab3b5961716804bba3107c16454b65ae51d0e0c059f298 \ + --hash=sha256:edb9797db7d852477ca1c99708058654ee860d4148fe5765f0d55528e2b1aa22 + # via mlx-gen +mlx-gen==0.18.14 \ + --hash=sha256:0f1e9f473e712f740e8f082af33facc16c398d221dfb4de49d200906cb537a62 \ + --hash=sha256:f5400df55fe6a611cc47df638816a29228f9fe6c2e1ef5ad2224f4d16dfe77ae + # via -r /tmp/video-req.in +mlx-metal==0.31.2 \ + --hash=sha256:84ffb60ee503f03eb684f5fb168d5cff31e2a16b7f27c1731eaf7662bd6e9b46 \ + --hash=sha256:b25385bcee18fc194092255b8b53b9a3d8489eb650e59160f1b57aadd07aa2dc \ + --hash=sha256:e9d4e5fce6ca10a87a0e388597f99519ad594d09e674708b5312bd8bd4f5997d + # via mlx +more-itertools==11.1.0 \ + --hash=sha256:48e8f4d9e7e5878571ecf6f2b4e57634f93cd474cc8cfbd2376f2d11b396e30d \ + --hash=sha256:4b65538ae22f6fed0ce4874efd317463a7489796a0939fa66824dd542125a192 + # via + # jaraco-classes + # jaraco-functools +mpmath==1.3.0 \ + --hash=sha256:7a28eb2a9774d00c7bc92411c19a89209d5da7c4c9a9e227be8330a23a25b91f \ + --hash=sha256:a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c + # via sympy +networkx==3.6.1 \ + --hash=sha256:26b7c357accc0c8cde558ad486283728b65b6a95d85ee1cd66bafab4c8168509 \ + --hash=sha256:d47fbf302e7d9cbbb9e2555a0d267983d2aa476bac30e90dfbe5669bd57f3762 + # via torch +nh3==0.3.5 \ + --hash=sha256:0a09f51806fd51b4fedbf9ea2b61fef388f19aef0d62fe51199d41648be14588 \ + --hash=sha256:207c01801d3e9bb8ec08f08689346bdd30ce15b8bf60013a925d08b5388962a4 \ + --hash=sha256:23a312224875f72cd16bde417f49071451877e29ef646a60e50fcb69407cc18a \ + --hash=sha256:2c069570b06aa848457713ad7af4a9905691291548c4466a9ad78ee95808382b \ + --hash=sha256:38748140bf76383ab7ce2dce0ad4cb663855d8fbc9098f7f3483673d09616a17 \ + --hash=sha256:387abd011e81959d5a35151a11350a0795c6edeb53ebfa02d2e882dc01299263 \ + --hash=sha256:3bb854485c9b33e5bb143ff3e49e577073bc6bc320f0ff8fc316dd89c0d3c101 \ + 
--hash=sha256:45855e14ff056064fec77133bfcf7cd691838168e5e17bbef075394954dc9dc8 \ + --hash=sha256:45e6a65dc88a300a2e3502cb9c8e6d1d6b831d6fba7470643333609c6aab1f30 \ + --hash=sha256:488928988caad25ba14b1eb5bc74e25e21f3b5e40341d956f3ce4a8bc19460dc \ + --hash=sha256:48f45e3e914be93a596431aa143dedf1582557bf41a58153c296048d6e3798c9 \ + --hash=sha256:50d401ab2d8e86d59e2126e3ab2a2f45840c405842b626d9a51624b3a33b6878 \ + --hash=sha256:52d877980d7ca01dc3baf3936bf844828bc6f332962227a684ed79c18cce14c3 \ + --hash=sha256:559e4c73b689e9a7aa97ac9760b1bc488038d7c1a575aa4ab5a0e19ee9630c0f \ + --hash=sha256:6ea58cc44d274c643b83547ca9654a0b1a817609b160601356f76a2b744c49ad \ + --hash=sha256:72c5bdedec27fa33de6a5326346ea8aa3fe54f6ac294d54c4b204fb66a9f1e79 \ + --hash=sha256:84bdeb082544fbcb77a12c034dd77d7da0556fdc0727b787eb6214b958c15e29 \ + --hash=sha256:8f85285700a18e9f3fc5bff41fe573fa84f81542ef13b48a89f9fecca0474d3b \ + --hash=sha256:acfd354e61accbe4c74f8017c6e397a776916dfe47c48643cf7fd84ade826f93 \ + --hash=sha256:c357f1d042c67f135a5e6babb2b0e3b9d9224ff4a3543240f597767b01384ffd \ + --hash=sha256:c3aae321f67ae66cff2a627115f106a377d4475d10b0e13d97959a13486b9a88 \ + --hash=sha256:c88605d8d468f7fc1b31e06129bc91d6c96f6c621776c9b504a0da9beac9df5f \ + --hash=sha256:de8e8621853b6470fe928c684ee0d3f39ea8086cebafe4c416486488dea7b68d \ + --hash=sha256:e49c9b564e6bcb03ecd2f057213df9a0de15a95812ac9db9600b590db23d3ae9 \ + --hash=sha256:ea232933394d1d58bf7c4bb348dc4660eae6604e1ae81cd2ba6d9ed80d390f3b \ + --hash=sha256:eeedc90ed8c42c327e8e10e621ccfa314fc6cce35d5929f4297ff1cdb89667c4 \ + --hash=sha256:fe3a787dc76b50de6bee54ef242f26c41dfe47654428e3e94f0fae5bb6dd2cc1 + # via readme-renderer +numpy==2.4.6 \ + --hash=sha256:001fbb8e08d942dd57599e781f2472269ee7f2755fae407b4f67b2f0b17da3f1 \ + --hash=sha256:0280e0356c0829a18d9de1cb7eee50ec22ca639878d7240307ca0943d73cd2c4 \ + --hash=sha256:043191bfa8eab18c776647b62723ac9dddece59743b13f49b2016094129c2b3f \ + 
--hash=sha256:06ca2f61ec4385a07a6977c55ba998a4466c123642b4a32694d3128fce18c079 \ + --hash=sha256:0a041d3d761dc3c35cc56ce0351506a02bcbc25f7b169f652435141a17db9096 \ + --hash=sha256:0ab0a9c4ffb1a6d95ef519fe4247dba8eb6b18ad93999f76b7f657039acabd47 \ + --hash=sha256:0c9136e14ed34a9e343a31c533d78a9813a69a3148332bce5e9821cb2f996e66 \ + --hash=sha256:110f8b71aacb688ec69062bb7f6938a0f8acb01b7c1c4beb453c65b6d234584d \ + --hash=sha256:112b06a867b235ef466ed3508ddf0238050df9c727cafb5301ac385b899189a1 \ + --hash=sha256:17f9ade344e7d9b464a084d69bcf18fc691cb1db67c62ed80820bf4926d78f0e \ + --hash=sha256:1e254a00cdf42b1e4d5b3d68d33af63268d41340d8885df2ab6470f2e1500147 \ + --hash=sha256:1e978ec1e8bd0e0e4de6bb75de9d30cbb74db6b6a2bb727618613703ca0167dd \ + --hash=sha256:25c692919ac5a01f170a3bfcd62d745b24fd095c353d50812637d6fcab442e75 \ + --hash=sha256:260a5d70215b61ab4fadf5c7baacd64821842975eea312125ed3c39a6391b063 \ + --hash=sha256:2803abfebfc990042cd494d8ce2d5f82e9d847af6d35ec486923aa19dbad5e73 \ + --hash=sha256:29a287e0cf63ff528da061de6b9f64a4618da591ca1046aafc54062e40ca7eab \ + --hash=sha256:29cb7f67d10b479ff07c17d33e39f78c07f71c40ef30d63c153d340e96cd3fb4 \ + --hash=sha256:3213d622a0283a39a93d188f3cf72b26862df52fbb4ca3697f51705016523d41 \ + --hash=sha256:33111801a01c12a8a1e3721f0a9232f8cfc8ae2c6b7098167e6f623c6073f402 \ + --hash=sha256:357cc07a6d7b0b182ff02249616a03742827ebb1277546b5c7cd7f7620a45698 \ + --hash=sha256:38efbc8de75c7a0fc1ac190162d892787f3f47b57cc291231aafee36b80982b7 \ + --hash=sha256:4081eb135ac24158bd51cdfbef16f1c64df7063b1143f24731387137c092bec8 \ + --hash=sha256:40fdc1ae7125e518ea98e53e69a4ebc27e1fd50510c47b7ea130cf21e5e1d42b \ + --hash=sha256:4cfe66903cc32a9921a6733d96b19bb6abf310397581bbad89c228f5abaf0ee8 \ + --hash=sha256:511dbaf848decaaaf4b4ca48032619fb3138710c4bf7da7617765edad1ef96b0 \ + --hash=sha256:55cced7c52e981362f708ad635198e97a752dfba412cc03c23bbf3bd8d5cd662 \ + --hash=sha256:56b39e5e0622a09a25bf5baf62f4bcf0cb8a41ae6e2819cf49bbc5a74c083f91 \ + 
--hash=sha256:5dbbdb29840ca3d91ee0fece42fc29278886d908280bfec0a5846c6f901a3eb0 \ + --hash=sha256:5f9fb9157b4ce2971008323afe46053787b526ef624fea915b261468a8421a0f \ + --hash=sha256:6180d8b35af935aed8ece3a85e0a43f87393ae0ac87c8d2c8bd2c993f7270ef3 \ + --hash=sha256:68a5124b13fa6cc2086764a20005d30bc0548146f7f5322f02fce212ca14317f \ + --hash=sha256:68bb27509ac1b9a3443094260f6326150663b06abe40b73a2f81160623da5b67 \ + --hash=sha256:6f41ae150c4e32db4f3310cdaf64b1593a03dbabe29eec77fc9b50fe64061df6 \ + --hash=sha256:7265a2f3d436e54ef9f2b52b5c937e6be778781bd97a590319d7348f1c1ca997 \ + --hash=sha256:72fbe16c6fac95aedf5937fa873445cec2110be35d8a4e9433d7501fd98dae6b \ + --hash=sha256:7d92c3819208a60205a12a245c91ad70cb0a85336659b19b834205573ac8456e \ + --hash=sha256:8155154c7c691289fe18f510b5d4657c68c67989f293f0535a91360392ff6538 \ + --hash=sha256:81a1cca95ed5bb92aa8b10dd2cdc9a0d3853a50fad926c28b5d7e8ea54389627 \ + --hash=sha256:89cd468399cfd2504718f0ba50e410dca55a170b61a02ad92bb18c8a65186e93 \ + --hash=sha256:8ad03c0965fb3c692200e74d458ca28c1dbb4ce96f9a479a8aa041ad5fabca02 \ + --hash=sha256:90f9849678c75fe7afa2d348ac842c168b0a4d3d61919687216dfc547976d853 \ + --hash=sha256:948424b06129ce883307e8cff868c31396d8dc7630a59c61d70d98dbe70f222c \ + --hash=sha256:9cd5ffd25db4e7ba6a375693b3fc0fc1791ec636c17db3720da19bde7180ec43 \ + --hash=sha256:a0df0043bdb289bde1f62da130d20df23d58b45429f752bc7a8fc5325a225ecd \ + --hash=sha256:a2c306dea656c12c68f51f4cea133cbe78ca7435eb28c735eac1d3ebe73be6e8 \ + --hash=sha256:a7830bab239b79cda9c08c2da014761cafb48da6150e1da17ac06283f43b6089 \ + --hash=sha256:a7c711e21628b52034bb5ab8d1bce291f752fcc5e92accc615778acee1ff4778 \ + --hash=sha256:aaf159caa35993cb1f56fb9b8e4610d35758e7ca005412eb1daa856a78c9c4b1 \ + --hash=sha256:ae506e6902902557576a26ff33eda8695e7ecb3cb36c3b573a0765dee114ebdb \ + --hash=sha256:b507f5c4c1d508876d1819b6bf9a49d365b96320b5d4993426b33a23ca4b8261 \ + --hash=sha256:bf162abab1c1a736333192707cef898e735a5ca00f38f27eeedf44b39d9e85eb \ + 
--hash=sha256:c1a2af6c6ef86344a6b0db6b97834208bf598db514f2b155042439b62605601a \ + --hash=sha256:c2d37ab77531417474168eb79d6d80b14f821a966818505d03013d0833edb7a8 \ + --hash=sha256:c4fc99836233ea196540b17ab0983aff60ed07941751930f5f4d05bc3b3b7359 \ + --hash=sha256:d581b735e177fdcdce6fed8e7e8880a3fb6ee4e3653a3ac6af01c6f4c03effc5 \ + --hash=sha256:d6da64deb6b8ed903e7560180a92f2d804ee1ba5eeb849ac2748b8c1aba1f6d7 \ + --hash=sha256:d8e8286dd7cea7895157318d1b91cdacac64c479f3cbc8dce548331728484751 \ + --hash=sha256:ddea102b48f9e339f3948bf22040944184627a30fdf7f858667673b9c5f033c8 \ + --hash=sha256:dfa20cc6ca228e6b155b11da03825975ce66aea520985dbbddf0f2a5a495c605 \ + --hash=sha256:e3e5193ef5a3dc73bceee50f7fdc2c90dbb76c42df8d8fae3d1067a583df579e \ + --hash=sha256:e3eeb0aabd6bd5ce64faae67e9935203a6991b4bc2a485a767fbafb2c5125f45 \ + --hash=sha256:e5805d5a22fd19c8ccff10a9561f9df94436b0545619ea579db2d3c35294bce2 \ + --hash=sha256:e85b752a1e912b70eaad4fafbd4d1238007ab221de2009b9a2f5ae7461239895 \ + --hash=sha256:eaf7fa2de5c0be8ae6ff8e9bea2ccd725e980541244521d8d4b5f3354a27babe \ + --hash=sha256:ebfb099f8dcf083deef3ac1ca4c1503f387cf76296fcb3816b66f5ecb5f54fdb \ + --hash=sha256:ece3d2cfe132e7d51f44a832b303895e6f2d499c5e74dfbdb06ee246147a304a \ + --hash=sha256:ed9749eef4cbd126da3dc1d6bcb3a57f5eb7ac6a6484146bdbf743f552dfc577 \ + --hash=sha256:ede83e07a75dd06bc501566c1eca2afc0d61677c1472ac9ad93fdee6e638a48d \ + --hash=sha256:ef4aea96ce4d3b074422cb4f2f64e216bf9e213004bb58ecfdf50ea02ea8eb9a \ + --hash=sha256:f3a3570c4a2a16746ac2c31a7c7c7b0c186b95ce902e33db6f28094ed7387dda \ + --hash=sha256:f407cb6b8e9d6d8c626bc73c945db1706035af8fd632295547bf1c9e46d092d6 \ + --hash=sha256:f74a575920ab21fe304421a3fc28793d82e299cae9eccb37084e9fc7f3617c20 + # via + # contourpy + # matplotlib + # mlx-gen + # opencv-python + # transformers +opencv-python==4.13.0.92 \ + --hash=sha256:0bc2596e68f972ca452d80f444bc404e08807d021fbba40df26b61b18e01838a \ + 
--hash=sha256:372fe164a3148ac1ca51e5f3ad0541a4a276452273f503441d718fab9c5e5f59 \ + --hash=sha256:402033cddf9d294693094de5ef532339f14ce821da3ad7df7c9f6e8316da32cf \ + --hash=sha256:423d934c9fafb91aad38edf26efb46da91ffbc05f3f59c4b0c72e699720706f5 \ + --hash=sha256:5868a8c028a0b37561579bfb8ac1875babdc69546d236249fff296a8c010ccf9 \ + --hash=sha256:620d602b8f7d8b8dab5f4b99c6eb353e78d3fb8b0f53db1bd258bb1aa001c1d5 \ + --hash=sha256:bccaabf9eb7f897ca61880ce2869dcd9b25b72129c28478e7f2a5e8dee945616 \ + --hash=sha256:caf60c071ec391ba51ed00a4a920f996d0b64e3e46068aac1f646b5de0326a19 + # via mlx-gen +packaging==26.2 \ + --hash=sha256:5fc45236b9446107ff2415ce77c807cee2862cb6fac22b8a73826d0693b0980e \ + --hash=sha256:ff452ff5a3e828ce110190feff1178bb1f2ea2281fa2075aadb987c2fb221661 + # via + # huggingface-hub + # matplotlib + # transformers + # twine +piexif==1.1.3 \ + --hash=sha256:3bc435d171720150b81b15d27e05e54b8abbde7b4242cddd81ef160d283108b6 \ + --hash=sha256:83cb35c606bf3a1ea1a8f0a25cb42cf17e24353fd82e87ae3884e74a302a5f1b + # via mlx-gen +pillow==12.2.0 \ + --hash=sha256:00a2865911330191c0b818c59103b58a5e697cae67042366970a6b6f1b20b7f9 \ + --hash=sha256:01afa7cf67f74f09523699b4e88c73fb55c13346d212a59a2db1f86b0a63e8c5 \ + --hash=sha256:03e7e372d5240cc23e9f07deca4d775c0817bffc641b01e9c3af208dbd300987 \ + --hash=sha256:03f6fab9219220f041c74aeaa2939ff0062bd5c364ba9ce037197f4c6d498cd9 \ + --hash=sha256:042db20a421b9bafecc4b84a8b6e444686bd9d836c7fd24542db3e7df7baad9b \ + --hash=sha256:0538bd5e05efec03ae613fd89c4ce0368ecd2ba239cc25b9f9be7ed426b0af1f \ + --hash=sha256:0a34329707af4f73cf1782a36cd2289c0368880654a2c11f027bcee9052d35dd \ + --hash=sha256:0c838a5125cee37e68edec915651521191cef1e6aa336b855f495766e77a366e \ + --hash=sha256:144748b3af2d1b358d41286056d0003f47cb339b8c43a9ea42f5fea4d8c66b6e \ + --hash=sha256:1610dd6c61621ae1cf811bef44d77e149ce3f7b95afe66a4512f8c59f25d9ebe \ + --hash=sha256:1e1757442ed87f4912397c6d35a0db6a7b52592156014706f17658ff58bbf795 \ + 
--hash=sha256:22db17c68434de69d8ecfc2fe821569195c0c373b25cccb9cbdacf2c6e53c601 \ + --hash=sha256:25373b66e0dd5905ed63fa3cae13c82fbddf3079f2c8bf15c6fb6a35586324c1 \ + --hash=sha256:2bb4a8d594eacdfc59d9e5ad972aa8afdd48d584ffd5f13a937a664c3e7db0ed \ + --hash=sha256:2c727a6d53cb0018aadd8018c2b938376af27914a68a492f59dfcaca650d5eea \ + --hash=sha256:2d192a155bbcec180f8564f693e6fd9bccff5a7af9b32e2e4bf8c9c69dbad6b5 \ + --hash=sha256:2e589959f10d9824d39b350472b92f0ce3b443c0a3442ebf41c40cb8361c5b97 \ + --hash=sha256:2e5a76d03a6c6dcef67edabda7a52494afa4035021a79c8558e14af25313d453 \ + --hash=sha256:325ca0528c6788d2a6c3d40e3568639398137346c3d6e66bb61db96b96511c98 \ + --hash=sha256:34c0d99ecccea270c04882cb3b86e7b57296079c9a4aff88cb3b33563d95afaa \ + --hash=sha256:390ede346628ccc626e5730107cde16c42d3836b89662a115a921f28440e6a3b \ + --hash=sha256:394167b21da716608eac917c60aa9b969421b5dcbbe02ae7f013e7b85811c69d \ + --hash=sha256:3997232e10d2920a68d25191392e3a4487d8183039e1c74c2297f00ed1c50705 \ + --hash=sha256:3adc9215e8be0448ed6e814966ecf3d9952f0ea40eb14e89a102b87f450660d8 \ + --hash=sha256:3e080565d8d7c671db5802eedfb438e5565ffa40115216eabb8cd52d0ecce024 \ + --hash=sha256:4a6c9fa44005fa37a91ebfc95d081e8079757d2e904b27103f4f5fa6f0bf78c0 \ + --hash=sha256:4bfd07bc812fbd20395212969e41931001fd59eb55a60658b0e5710872e95286 \ + --hash=sha256:4e6c62e9d237e9b65fac06857d511e90d8461a32adcc1b9065ea0c0fa3a28150 \ + --hash=sha256:50d8520da2a6ce0af445fa6d648c4273c3eeefbc32d7ce049f22e8b5c3daecc2 \ + --hash=sha256:51c4167c34b0d8ba05b547a3bb23578d0ba17b80a5593f93bd8ecb123dd336a3 \ + --hash=sha256:56a3f9c60a13133a98ecff6197af34d7824de9b7b38c3654861a725c970c197b \ + --hash=sha256:56b25336f502b6ed02e889f4ece894a72612fe885889a6e8c4c80239ff6e5f5f \ + --hash=sha256:57850958fe9c751670e49b2cecf6294acc99e562531f4bd317fa5ddee2068463 \ + --hash=sha256:58f62cc0f00fd29e64b29f4fd923ffdb3859c9f9e6105bfc37ba1d08994e8940 \ + --hash=sha256:5c0a9f29ca8e79f09de89293f82fc9b0270bb4af1d58bc98f540cc4aedf03166 \ + 
--hash=sha256:5cdfebd752ec52bf5bb4e35d9c64b40826bc5b40a13df7c3cda20a2c03a0f5ed \ + --hash=sha256:5d04bfa02cc2d23b497d1e90a0f927070043f6cbf303e738300532379a4b4e0f \ + --hash=sha256:5d2fd0fa6b5d9d1de415060363433f28da8b1526c1c129020435e186794b3795 \ + --hash=sha256:62f5409336adb0663b7caa0da5c7d9e7bdbaae9ce761d34669420c2a801b2780 \ + --hash=sha256:632ff19b2778e43162304d50da0181ce24ac5bb8180122cbe1bf4673428328c7 \ + --hash=sha256:6562ace0d3fb5f20ed7290f1f929cae41b25ae29528f2af1722966a0a02e2aa1 \ + --hash=sha256:673aa32138f3e7531ccdbca7b3901dba9b70940a19ccecc6a37c77d5fdeb05b5 \ + --hash=sha256:6a6e67ea2e6feda684ed370f9a1c52e7a243631c025ba42149a2cc5934dec295 \ + --hash=sha256:6a9adfc6d24b10f89588096364cc726174118c62130c817c2837c60cf08a392b \ + --hash=sha256:6bb77b2dcb06b20f9f4b4a8454caa581cd4dd0643a08bacf821216a16d9c8354 \ + --hash=sha256:6e6b2a0c538fc200b38ff9eb6628228b77908c319a005815f2dde585a0664b60 \ + --hash=sha256:71cde9a1e1551df7d34a25462fc60325e8a11a82cc2e2f54578e5e9a1e153d65 \ + --hash=sha256:7371b48c4fa448d20d2714c9a1f775a81155050d383333e0a6c15b1123dda005 \ + --hash=sha256:766cef22385fa1091258ad7e6216792b156dc16d8d3fa607e7545b2b72061f1c \ + --hash=sha256:7b14cc0106cd9aecda615dd6903840a058b4700fcb817687d0ee4fc8b6e389be \ + --hash=sha256:7f84204dee22a783350679a0333981df803dac21a0190d706a50475e361c93f5 \ + --hash=sha256:8023abc91fba39036dbce14a7d6535632f99c0b857807cbbbf21ecc9f4717f06 \ + --hash=sha256:80b2da48193b2f33ed0c32c38140f9d3186583ce7d516526d462645fd98660ae \ + --hash=sha256:8297651f5b5679c19968abefd6bb84d95fe30ef712eb1b2d9b2d31ca61267f4c \ + --hash=sha256:88d387ff40b3ff7c274947ed3125dedf5262ec6919d83946753b5f3d7c67ea4c \ + --hash=sha256:88ddbc66737e277852913bd1e07c150cc7bb124539f94c4e2df5344494e0a612 \ + --hash=sha256:8bd7903a5f2a4545f6fd5935c90058b89d30045568985a71c79f5fd6edf9b91e \ + --hash=sha256:8be29e59487a79f173507c30ddf57e733a357f67881430449bb32614075a40ab \ + --hash=sha256:8c984051042858021a54926eb597d6ee3012393ce9c181814115df4c60b9a808 \ + 
--hash=sha256:8cbeb542b2ebc6fcdacabf8aca8c1a97c9b3ad3927d46b8723f9d4f033288a0f \ + --hash=sha256:8e9c4f5b3c546fa3458a29ab22646c1c6c787ea8f5ef51300e5a60300736905e \ + --hash=sha256:90e6f81de50ad6b534cab6e5aef77ff6e37722b2f5d908686f4a5c9eba17a909 \ + --hash=sha256:975385f4776fafde056abb318f612ef6285b10a1f12b8570f3647ad0d74b48ec \ + --hash=sha256:9a8a34cc89c67a65ea7437ce257cea81a9dad65b29805f3ecee8c8fe8ff25ffe \ + --hash=sha256:9aba9a17b623ef750a4d11b742cbafffeb48a869821252b30ee21b5e91392c50 \ + --hash=sha256:9f08483a632889536b8139663db60f6724bfcb443c96f1b18855860d7d5c0fd4 \ + --hash=sha256:a4e8f36e677d3336f35089648c8955c51c6d386a13cf6ee9c189c5f5bd713a9f \ + --hash=sha256:a52edc8bfff4429aaabdf4d9ee0daadbbf8562364f940937b941f87a4290f5ff \ + --hash=sha256:a830b1a40919539d07806aa58e1b114df53ddd43213d9c8b75847eee6c0182b5 \ + --hash=sha256:aa88ccfe4e32d362816319ed727a004423aab09c5cea43c01a4b435643fa34eb \ + --hash=sha256:af73337013e0b3b46f175e79492d96845b16126ddf79c438d7ea7ff27783a414 \ + --hash=sha256:b1c1fbd8a5a1af3412a0810d060a78b5136ec0836c8a4ef9aa11807f2a22f4e1 \ + --hash=sha256:b85f66ae9eb53e860a873b858b789217ba505e5e405a24b85c0464822fe88032 \ + --hash=sha256:b86024e52a1b269467a802258c25521e6d742349d760728092e1bc2d135b4d76 \ + --hash=sha256:bd9c0c7a0c681a347b3194c500cb1e6ca9cab053ea4d82a5cf45b6b754560136 \ + --hash=sha256:bfa9c230d2fe991bed5318a5f119bd6780cda2915cca595393649fc118ab895e \ + --hash=sha256:d362d1878f00c142b7e1a16e6e5e780f02be8195123f164edf7eddd911eefe7c \ + --hash=sha256:d5d38f1411c0ed9f97bcb49b7bd59b6b7c314e0e27420e34d99d844b9ce3b6f3 \ + --hash=sha256:dac8d77255a37e81a2efcbd1fc05f1c15ee82200e6c240d7e127e25e365c39ea \ + --hash=sha256:dd025009355c926a84a612fecf58bb315a3f6814b17ead51a8e48d3823d9087f \ + --hash=sha256:deede7c263feb25dba4e82ea23058a235dcc2fe1f6021025dc71f2b618e26104 \ + --hash=sha256:e74473c875d78b8e9d5da2a70f7099549f9eb37ded4e2f6a463e60125bccd176 \ + --hash=sha256:ee3120ae9dff32f121610bb08e4313be87e03efeadfc6c0d18f89127e24d0c24 \ + 
--hash=sha256:eedf4b74eda2b5a4b2b2fb4c006d6295df3bf29e459e198c90ea48e130dc75c3 \ + --hash=sha256:efd8c21c98c5cc60653bcb311bef2ce0401642b7ce9d09e03a7da87c878289d4 \ + --hash=sha256:f1c943e96e85df3d3478f7b691f229887e143f81fedab9b20205349ab04d73ed \ + --hash=sha256:f278f034eb75b4e8a13a54a876cc4a5ab39173d2cdd93a638e1b467fc545ac43 \ + --hash=sha256:f3f40b3c5a968281fd507d519e444c35f0ff171237f4fdde090dd60699458421 \ + --hash=sha256:f490f9368b6fc026f021db16d7ec2fbf7d89e2edb42e8ec09d2c60505f5729c7 \ + --hash=sha256:fb043ee2f06b41473269765c2feae53fc2e2fbf96e5e22ca94fb5ad677856f06 \ + --hash=sha256:fc3d34d4a8fbec3e88a79b92e5465e0f9b842b628675850d860b8bd300b159f5 + # via + # matplotlib + # mlx-gen +platformdirs==4.10.0 \ + --hash=sha256:31e761a6a0ca04faf7353ea759bdba55652be214725111e5aac52dfa29d4bef7 \ + --hash=sha256:fb516cdb12eb0d857d0cd85a7c57cea4d060bee4578d6cf5a14dfdf8cbf8784a + # via mlx-gen +protobuf==7.35.0 \ + --hash=sha256:4c4617b83ade0e279d1d2bfe04025a1adb87f9ed657de038620dc0ff959357f6 \ + --hash=sha256:4cbf5cc286130e06a6c9bbefac442431173906dfcc979712183d4adcc01b37ee \ + --hash=sha256:66be6c513931c794fa92c080ffee41671390da3d79da219cf9c0c0907f035dda \ + --hash=sha256:6c0f98f10c8a05ea30f8993dfef2de093d27b490fdae78bb60c8343795d55011 \ + --hash=sha256:a2efd84605f41e559f1881b0912b44099d0a2ac9bf46b3474823f10fb393b0e6 \ + --hash=sha256:c13f325cf242bad135c350629eeb5d54b24228eb472fb3e2e9ebbd4c5dc20ca0 \ + --hash=sha256:f05bcadf9a2a6b8dda047007075135fb7d08c73d9177aabc067e1be46881a201 \ + --hash=sha256:fcbe42a4ac09d3ec9c987ddfcd956afd0b15f1ff613bd8371bde9405ffd5c8e5 + # via mlx-gen +pygments==2.20.0 \ + --hash=sha256:6757cd03768053ff99f3039c1a36d6c0aa0b263438fcab17520b30a303a82b5f \ + --hash=sha256:81a9e26dd42fd28a23a2d169d86d7ac03b46e2f8b59ed4698fb4785f946d0176 + # via + # readme-renderer + # rich +pyparsing==3.3.2 \ + --hash=sha256:850ba148bd908d7e2411587e247a1e4f0327839c40e2e5e6d05a007ecc69911d \ + 
--hash=sha256:c777f4d763f140633dcb6d8a3eda953bf7a214dc4eff598413c070bcdc117cbc + # via matplotlib +python-dateutil==2.9.0.post0 \ + --hash=sha256:37dd54208da7e1cd875388217d5e00ebd4179249f90fb72437e91a35459a0ad3 \ + --hash=sha256:a8b2bc7bffae282281c8140a97d3aa9c14da0b136dfe83f850eea9a5f7470427 + # via matplotlib +pyyaml==6.0.3 \ + --hash=sha256:00c4bdeba853cc34e7dd471f16b4114f4162dc03e6b7afcc2128711f0eca823c \ + --hash=sha256:0150219816b6a1fa26fb4699fb7daa9caf09eb1999f3b70fb6e786805e80375a \ + --hash=sha256:02893d100e99e03eda1c8fd5c441d8c60103fd175728e23e431db1b589cf5ab3 \ + --hash=sha256:02ea2dfa234451bbb8772601d7b8e426c2bfa197136796224e50e35a78777956 \ + --hash=sha256:0f29edc409a6392443abf94b9cf89ce99889a1dd5376d94316ae5145dfedd5d6 \ + --hash=sha256:10892704fc220243f5305762e276552a0395f7beb4dbf9b14ec8fd43b57f126c \ + --hash=sha256:16249ee61e95f858e83976573de0f5b2893b3677ba71c9dd36b9cf8be9ac6d65 \ + --hash=sha256:1d37d57ad971609cf3c53ba6a7e365e40660e3be0e5175fa9f2365a379d6095a \ + --hash=sha256:1ebe39cb5fc479422b83de611d14e2c0d3bb2a18bbcb01f229ab3cfbd8fee7a0 \ + --hash=sha256:214ed4befebe12df36bcc8bc2b64b396ca31be9304b8f59e25c11cf94a4c033b \ + --hash=sha256:2283a07e2c21a2aa78d9c4442724ec1eb15f5e42a723b99cb3d822d48f5f7ad1 \ + --hash=sha256:22ba7cfcad58ef3ecddc7ed1db3409af68d023b7f940da23c6c2a1890976eda6 \ + --hash=sha256:27c0abcb4a5dac13684a37f76e701e054692a9b2d3064b70f5e4eb54810553d7 \ + --hash=sha256:28c8d926f98f432f88adc23edf2e6d4921ac26fb084b028c733d01868d19007e \ + --hash=sha256:2e71d11abed7344e42a8849600193d15b6def118602c4c176f748e4583246007 \ + --hash=sha256:34d5fcd24b8445fadc33f9cf348c1047101756fd760b4dacb5c3e99755703310 \ + --hash=sha256:37503bfbfc9d2c40b344d06b2199cf0e96e97957ab1c1b546fd4f87e53e5d3e4 \ + --hash=sha256:3c5677e12444c15717b902a5798264fa7909e41153cdf9ef7ad571b704a63dd9 \ + --hash=sha256:3ff07ec89bae51176c0549bc4c63aa6202991da2d9a6129d7aef7f1407d3f295 \ + --hash=sha256:41715c910c881bc081f1e8872880d3c650acf13dfa8214bad49ed4cede7c34ea \ + 
--hash=sha256:418cf3f2111bc80e0933b2cd8cd04f286338bb88bdc7bc8e6dd775ebde60b5e0 \ + --hash=sha256:44edc647873928551a01e7a563d7452ccdebee747728c1080d881d68af7b997e \ + --hash=sha256:4a2e8cebe2ff6ab7d1050ecd59c25d4c8bd7e6f400f5f82b96557ac0abafd0ac \ + --hash=sha256:4ad1906908f2f5ae4e5a8ddfce73c320c2a1429ec52eafd27138b7f1cbe341c9 \ + --hash=sha256:501a031947e3a9025ed4405a168e6ef5ae3126c59f90ce0cd6f2bfc477be31b7 \ + --hash=sha256:5190d403f121660ce8d1d2c1bb2ef1bd05b5f68533fc5c2ea899bd15f4399b35 \ + --hash=sha256:5498cd1645aa724a7c71c8f378eb29ebe23da2fc0d7a08071d89469bf1d2defb \ + --hash=sha256:5cf4e27da7e3fbed4d6c3d8e797387aaad68102272f8f9752883bc32d61cb87b \ + --hash=sha256:5e0b74767e5f8c593e8c9b5912019159ed0533c70051e9cce3e8b6aa699fcd69 \ + --hash=sha256:5ed875a24292240029e4483f9d4a4b8a1ae08843b9c54f43fcc11e404532a8a5 \ + --hash=sha256:5fcd34e47f6e0b794d17de1b4ff496c00986e1c83f7ab2fb8fcfe9616ff7477b \ + --hash=sha256:5fdec68f91a0c6739b380c83b951e2c72ac0197ace422360e6d5a959d8d97b2c \ + --hash=sha256:6344df0d5755a2c9a276d4473ae6b90647e216ab4757f8426893b5dd2ac3f369 \ + --hash=sha256:64386e5e707d03a7e172c0701abfb7e10f0fb753ee1d773128192742712a98fd \ + --hash=sha256:652cb6edd41e718550aad172851962662ff2681490a8a711af6a4d288dd96824 \ + --hash=sha256:66291b10affd76d76f54fad28e22e51719ef9ba22b29e1d7d03d6777a9174198 \ + --hash=sha256:66e1674c3ef6f541c35191caae2d429b967b99e02040f5ba928632d9a7f0f065 \ + --hash=sha256:6adc77889b628398debc7b65c073bcb99c4a0237b248cacaf3fe8a557563ef6c \ + --hash=sha256:79005a0d97d5ddabfeeea4cf676af11e647e41d81c9a7722a193022accdb6b7c \ + --hash=sha256:7c6610def4f163542a622a73fb39f534f8c101d690126992300bf3207eab9764 \ + --hash=sha256:7f047e29dcae44602496db43be01ad42fc6f1cc0d8cd6c83d342306c32270196 \ + --hash=sha256:8098f252adfa6c80ab48096053f512f2321f0b998f98150cea9bd23d83e1467b \ + --hash=sha256:850774a7879607d3a6f50d36d04f00ee69e7fc816450e5f7e58d7f17f1ae5c00 \ + --hash=sha256:8d1fab6bb153a416f9aeb4b8763bc0f22a5586065f86f7664fc23339fc1c1fac \ + 
--hash=sha256:8da9669d359f02c0b91ccc01cac4a67f16afec0dac22c2ad09f46bee0697eba8 \ + --hash=sha256:8dc52c23056b9ddd46818a57b78404882310fb473d63f17b07d5c40421e47f8e \ + --hash=sha256:9149cad251584d5fb4981be1ecde53a1ca46c891a79788c0df828d2f166bda28 \ + --hash=sha256:93dda82c9c22deb0a405ea4dc5f2d0cda384168e466364dec6255b293923b2f3 \ + --hash=sha256:96b533f0e99f6579b3d4d4995707cf36df9100d67e0c8303a0c55b27b5f99bc5 \ + --hash=sha256:9c57bb8c96f6d1808c030b1687b9b5fb476abaa47f0db9c0101f5e9f394e97f4 \ + --hash=sha256:9c7708761fccb9397fe64bbc0395abcae8c4bf7b0eac081e12b809bf47700d0b \ + --hash=sha256:9f3bfb4965eb874431221a3ff3fdcddc7e74e3b07799e0e84ca4a0f867d449bf \ + --hash=sha256:a33284e20b78bd4a18c8c2282d549d10bc8408a2a7ff57653c0cf0b9be0afce5 \ + --hash=sha256:a80cb027f6b349846a3bf6d73b5e95e782175e52f22108cfa17876aaeff93702 \ + --hash=sha256:b30236e45cf30d2b8e7b3e85881719e98507abed1011bf463a8fa23e9c3e98a8 \ + --hash=sha256:b3bc83488de33889877a0f2543ade9f70c67d66d9ebb4ac959502e12de895788 \ + --hash=sha256:b865addae83924361678b652338317d1bd7e79b1f4596f96b96c77a5a34b34da \ + --hash=sha256:b8bb0864c5a28024fac8a632c443c87c5aa6f215c0b126c449ae1a150412f31d \ + --hash=sha256:ba1cc08a7ccde2d2ec775841541641e4548226580ab850948cbfda66a1befcdc \ + --hash=sha256:bdb2c67c6c1390b63c6ff89f210c8fd09d9a1217a465701eac7316313c915e4c \ + --hash=sha256:c1ff362665ae507275af2853520967820d9124984e0f7466736aea23d8611fba \ + --hash=sha256:c2514fceb77bc5e7a2f7adfaa1feb2fb311607c9cb518dbc378688ec73d8292f \ + --hash=sha256:c3355370a2c156cffb25e876646f149d5d68f5e0a3ce86a5084dd0b64a994917 \ + --hash=sha256:c458b6d084f9b935061bc36216e8a69a7e293a2f1e68bf956dcd9e6cbcd143f5 \ + --hash=sha256:d0eae10f8159e8fdad514efdc92d74fd8d682c933a6dd088030f3834bc8e6b26 \ + --hash=sha256:d76623373421df22fb4cf8817020cbb7ef15c725b9d5e45f17e189bfc384190f \ + --hash=sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b \ + --hash=sha256:eda16858a3cab07b80edaf74336ece1f986ba330fdb8ee0d6c0d68fe82bc96be \ + 
--hash=sha256:ee2922902c45ae8ccada2c5b501ab86c36525b883eff4255313a253a3160861c \ + --hash=sha256:efd7b85f94a6f21e4932043973a7ba2613b059c4a000551892ac9f1d11f5baf3 \ + --hash=sha256:f7057c9a337546edc7973c0d3ba84ddcdf0daa14533c2065749c9075001090e6 \ + --hash=sha256:fa160448684b4e94d80416c0fa4aac48967a969efe22931448d853ada8baf926 \ + --hash=sha256:fc09d0aa354569bc501d4e787133afc08552722d3ab34836a80547331bb5d4a0 + # via + # huggingface-hub + # transformers +readme-renderer==45.0 \ + --hash=sha256:030a8fac74904f8fba11ad1bb6964e3f76e896dc7e5e71f16af190c9056696d1 \ + --hash=sha256:3385ed220117104a2bceb4a9dac8c5fdf6d1f96890d7ea2a9c7174fd5c84091f + # via twine +regex==2026.5.9 \ + --hash=sha256:002205cafd2a9e78c6290c7d1df277bf3277b3b7a30e0b4bb0dac2e2e3f7cb2d \ + --hash=sha256:01f0f5f55f4b64dacec85dc116d3c05fd23ad3ff037bbc73a2085775953c2611 \ + --hash=sha256:01f28d868834624c934b8d2e0aa1c8341337e37831f4a012f18a5afcba4cbaf3 \ + --hash=sha256:075160bf16658e16d35233300b8453aac25de4cbea808d22348b6979668e924d \ + --hash=sha256:0de5cf193997384ed2ca6f1cd4f78055b255d93d82d5a8cd6ba0d11c10b167e4 \ + --hash=sha256:0e1b1b4e496afbb24f4a62aba855ee4f88f25578927697b340702e48c9ee6bc2 \ + --hash=sha256:0f03aa6898aaaac4592479821df16e68e8d0e29e903e65d8f2dfb2f19028a989 \ + --hash=sha256:0f9eede6a5cbdc02d4978090186390936e1776a7d1359b21e41014c609880bcf \ + --hash=sha256:1268eddd8486dc561d08eee1156e40aa3a8fe10f4bdec8fa653b455fcbffd12c \ + --hash=sha256:15ee42209947f4ca045412eae98416317238163618ace2a8e54f99586a466733 \ + --hash=sha256:164eba9b755ea6f244b0d881196fbc1fac09714e9782c9e2732b813142033c8e \ + --hash=sha256:19c16ceb4a267a8789e25733e583983eeab9f0f8664e66b0bd1c5d21f14c2d4b \ + --hash=sha256:1bd7587a2948b4085195d5a3374eaf4a425dc3e55784c038175355ecf3bbbf8a \ + --hash=sha256:1e6da47d679b7010ef27556b6e0f99771b744936db1792a10ceac6547ae1503e \ + --hash=sha256:205109e96b3cf5adf8f4cd62bedde9487feb282b9497a3535451e5a24cd706a0 \ + 
--hash=sha256:2099f7e7ff7b6aa3192312650a56e91cc091e49d50b04e4f6f8b6e28b3b27f1c \ + --hash=sha256:246de9d60aa3f8538b519834dd95cbf276ea263d6a7bd5a3666dc3fa0230505b \ + --hash=sha256:24b2355ef5cc9aa5b8f07d17704face1c166fdcc2290fa7bd6e6c925655a8346 \ + --hash=sha256:2a661a7d270a61f7cf460caee8b9fa2d5ef9e5c681234bcb9e0fe14f488e7dfc \ + --hash=sha256:2acfb48634f64996b57f90f39afa692ff362162722581921fe92239a59960f3c \ + --hash=sha256:2efa205e6d98b24d1f3ab395c11aa15cdf10935bca283d0285e0499c284fba21 \ + --hash=sha256:31037c82eccb44b7ea2e9e221d7c01429430e989a1f4b91ea5a855f6017b509a \ + --hash=sha256:3527bb4942d2c14552155406cdedd906567456821848aed1cb4933a391bf5eca \ + --hash=sha256:39617fb0cde9c0e6306dc70e3bfc096f3da793219879f7ae7aa341a69fbdcf6d \ + --hash=sha256:398c521292f4c7fb807001dcd54694d3a1fcafc179a36ad9cc56f98df85930b6 \ + --hash=sha256:3b1e39888c5e0c7d92cea4fc777396c4a90363b05de75d02eb459a4752200808 \ + --hash=sha256:3dd4a3ff360dfb836fecdb93a4598f9d6e2ac81e3e397125145c6221bf58cf4c \ + --hash=sha256:3ddd90103f9e5c471c49c7852ecc1fe27c7e45eb99e977aefe7caa4e779f4f58 \ + --hash=sha256:446ddd671e43ab535810c4b21cff7104945c701d4a14d1e6d1cd6f4e445a8bea \ + --hash=sha256:45375819235558a4ff1c4971dc32881f022613abdb180128f5cb4768c1765a1c \ + --hash=sha256:46f1326ca6e65b0879d23ca302c0f2415aad42ff0309b9c818e7949fe19a41d8 \ + --hash=sha256:48036f6374aaa79eb3b754ec29c61d1c6b1606749d705a13f8854fa2539671f6 \ + --hash=sha256:4ebe8f0b5ec5a5024dc4a4c59f444c4e9afc5f2abdbb8962065b75d27fb971f9 \ + --hash=sha256:4eeb011098fcb77af513dcef521a3dbecbf8849b1e38940759d293b7a93f5026 \ + --hash=sha256:508f56a89ba9cb26e4168cbc37dbd60a28d82430a9e18ad1d25fe0883c314ca2 \ + --hash=sha256:5604dfd046dc37eca90250fc3be938b076c8059fa772ac0ed6f499b0f0fb0415 \ + --hash=sha256:56a33f191f17d8c417f99945ebdc1e691d3af9605d86ec68c7e54a57e3e17af6 \ + --hash=sha256:57e8915c7986aa33d25e4d3629cef711cd2863f2961b10409f0c04cb8b7d9020 \ + --hash=sha256:57eeeb05db7979413dec5438f2db21d7ecbba787cde7a711df1a6f6df672aa06 \ + 
--hash=sha256:5b73ab8afcf66c622db143d1c6fda4e58e4d537ee4f125229ad47b1ab80f34c0 \ + --hash=sha256:5e41809d2683fcde7d5a8c87a6567ba1fb1ce0de9f31bff578de00a4b2d76daa \ + --hash=sha256:6351571c8a42b505eb555c0dc47d740d0fb66977dc142919eea6f4325b7c56a0 \ + --hash=sha256:6441cc660d76107934a09c22167200839a0e89604a6297f78a974e66e931d2c0 \ + --hash=sha256:65c8c8c37377794bd5b2f3ebe51919042bf17aec802e23c833d89782ed0c78af \ + --hash=sha256:6ba42b2e7e7f46cf68cc6a5ca36fa07959f9bbd9c6bdcc47b6ee76549a590248 \ + --hash=sha256:71b61c5bfe1c806332defc42ad6c780b3c55f661986d7f40283a3a88274b4c00 \ + --hash=sha256:728d8bfd28a8845c8b6bc5dc7ce010453d206396786c0765c2740cb65f37791e \ + --hash=sha256:7b92817338591505f282cf3864c145244b1edcf5381d237038df955001091538 \ + --hash=sha256:7e30b874d341fac767d7df5a0870540541c2c054b80cfaac116e8d367a8a7ff2 \ + --hash=sha256:7e87577720152d2caae19fe2baaf1f8d5ca12091e9e229f03915c37d1e4b9178 \ + --hash=sha256:83d0ee4a57d1c87cb549e195ec300b8f0ec3a82eba66d835e4e2ed8634fe4499 \ + --hash=sha256:8676474c07469d6f33dd1085ca2cd45f65785f32518f2b20e36d9953ca07f994 \ + --hash=sha256:86f40a5d6444db30a125c9c9177e6b25dad981cbc37451fd838f145e6edac92e \ + --hash=sha256:872acc074bd29ffc9913ecdfedf6ea77502312ca44a4aa0d3779089c6069d8de \ + --hash=sha256:8abd33fef90b2a9efac5557d6033ca82d1195ed3a15fea5af15ba7b463c6a63b \ + --hash=sha256:8c6e4218fbdfbcd4f6c19efca40930d24a621bf4b48cb76bc6640543bd28ef20 \ + --hash=sha256:8e76e8161ad00694cfce6767d5dea860c6391ac5b83e5c3a39661e696f11fc7e \ + --hash=sha256:8f3af7a4903c5c04a11a196a5aa75cdd7dd3f8508132f9fb3259d9f5908e3b88 \ + --hash=sha256:91328f1c23d47595ca3ef0a7557fa129c5a23404b775c770697d2f35b33e0107 \ + --hash=sha256:916714069da19329ef7de197dcbc77bb3104145c7c2c864dbfbe318f46b88b14 \ + --hash=sha256:93a7860539414dddaefba2b40f8771765ae17949d4c7182b876ce429e11a8309 \ + --hash=sha256:954cc214c04663ee6d266fc61739cad83054683048de65c5bd1d640ad28098ac \ + --hash=sha256:96f5f58b54a063d7ea9dca08e1cf57bfe10499c4d579ee672da284f57f5f0070 \ + 
--hash=sha256:97cf3bc1b7d7d2306772ec07366c80d9df00ff79e79cea32898883a646d2fae2 \ + --hash=sha256:98bd73080e8756255137e1bd3f3f00295bbc5aa383c0e0f973920e9134d7c4ad \ + --hash=sha256:992604d02e6d9c6d786c24a706a71ecffe1020fc1ef264044474cd81fa2c3919 \ + --hash=sha256:a24852d3c29ad9e47593593d8a247c44ccc3d0548ef12c822d6ed0810affe676 \ + --hash=sha256:a6a563446a41adc451393dc6b8e6ad87979efaee3c8738690a8d1b08ebead1b4 \ + --hash=sha256:a8234aa23ec39894bfe4a3f1b85616a7032481964a13ac6fc9f10de4f6fca270 \ + --hash=sha256:a8820737949116ffff55fe18f9fc644530063ba6ebfcb8314239416e78f1347c \ + --hash=sha256:a9e1328e17c84c1a5d22ec9f785ecef4a967fab9a42b6a8dc3bcbebd0a0c9e44 \ + --hash=sha256:aa0fbdbac82cb3e4450d0ccde7d7a35607f4cb2dd9fba4b8b69bfaf8c9fa6aed \ + --hash=sha256:b310768746dd314ea6e2ff4cc89ef215426813396ff4e94ee8e6f7096c8b6e03 \ + --hash=sha256:b46b0f094dc1d3b90356c85a0bd2c9bafc4a6a190b9d6f8ddd5a033b6e088ed4 \ + --hash=sha256:b4bb445ff3f725f59df8f6014edb547ee928ec7023a774f6a39a3f953038cbb2 \ + --hash=sha256:b6d189041f15691cfa2b6c4290448ec221244d225b3f5fe9e7771b34ffcdf6e2 \ + --hash=sha256:b96350aa424e79d4fd6b567b344dcbe2b2d6bfc48dfe7717587e1fa6d43da6ff \ + --hash=sha256:be3372b9df6ddecff6486d37e19095a7b4973137caf5512407a89f4455361f41 \ + --hash=sha256:bfe1ce50cbfb569d74e1e4337da6468961f31dbea55fd85aa5de59c0947a805a \ + --hash=sha256:c010eb8caca74bdb40c07498d7ece26b4428fd3f04aa8a72c9ac6f79e8faaac6 \ + --hash=sha256:c8b9b9d294cfea3cd19c718ade7cc93492b2c4991abd9a68d0b3477ae6d8e100 \ + --hash=sha256:c9411dd64ca95477225734a93dfc8583b51916b8d5942f99d6cac21e09965451 \ + --hash=sha256:ca518ed29c46eecba6010b15f1b9a479314d2de409536e71b6a13aa04e3b8a77 \ + --hash=sha256:ccf5249114cc3e772ecdd88a98a86eca0fd74c61ce32a94743758c083fc05d48 \ + --hash=sha256:cd2846168eb9ee3c513902bc8225409cb1caab31d04728b145171fa1625d9621 \ + --hash=sha256:d29eebfc9525db68cad3c97eedd7f754fa265aa5cd0cf4f863b2421e1b48fc9f \ + --hash=sha256:d3d7eb5c9a7f6df82ed3cfac9beb93882a5cbcb5b8b157b56cb2b3b276574ac1 \ + 
--hash=sha256:d626b84406444b165fc0ba981604edea39f0588ff1f92baa23fe50799ea9afdb \ + --hash=sha256:d641a8c9a61618047796d572a39a79b26167b0411d2c3031937b2fe2d081e2cf \ + --hash=sha256:d659eee77986549c9ea45b861c7567e44d6287c3dc9a4565478853f7b9fe2ff6 \ + --hash=sha256:d6b8a143aca6c39b446ea8092cde25cc8fe9304d4f5fecfbc1a9dbb0282703c2 \ + --hash=sha256:d726ca3f0d76969bf1e8e477d160d3d666bbf999f6860bd314889e5345782046 \ + --hash=sha256:d7bdc0ab8f3dd7e1b4f9ab88634e13374669db86bb3c72e8292f07ae313f539f \ + --hash=sha256:daff2bdbaf1d23e52fdff7c0b7bc2048b68f978df6a4d107ac981f94caef2e66 \ + --hash=sha256:dd2810d22146b6d838acc5ec15602cb6b47920aa4e33015df3868eedfd20bab8 \ + --hash=sha256:ddda5340e6c01a293027dd46232fa79eaff1b48058ce7a98f572b6445b088041 \ + --hash=sha256:dea2e88e1cce4522496cce630e11e67b98b7076620bc4336c3f674bc21a375f4 \ + --hash=sha256:debb893095e944091c16e641a6e33c1b0f4cb61ab945ec5afbf53ce7068834d8 \ + --hash=sha256:dfbe4579b9f08036aa7d101d1835437a20783574ac66327e6b29b4018a138081 \ + --hash=sha256:e1d93bf647916292e8edcec150c07ddf3dc50179ccaf770c04a7f9e452155372 \ + --hash=sha256:e82db382b44d0111b22601c509c89f64434816c9e0eef9d1989cda8cc6ff1c04 \ + --hash=sha256:ea9c8ecfa1b73c73b626534d6626e5340d429630943672b8480724f44e84b962 \ + --hash=sha256:ead4b163ac30a29574510cd4b3e2e985ac5290c05fc7095557d6a5f403fc31b5 \ + --hash=sha256:ecd353045824e4477562a2ac718c25799cdaaa41f7aa925a806a8a3e6848a5b9 \ + --hash=sha256:ed2c9e8068b614c574d8d30e543d617cf5379b0535d46f97ef00e904745a08b5 \ + --hash=sha256:ed457d8e98ae812ed7732bef7bf78de78e834eae0372a74e23ca90ef21d910f9 \ + --hash=sha256:ef31cbfe458e21c6122ba8150ff060e0c7789ed0d26eb423f25472584920b555 \ + --hash=sha256:f079e50a0d3cc3cd5091fa9ff45869a2e6b2cd35895731edafb0327901a8d86d \ + --hash=sha256:f3844f134e834076677dd369976e9f5068679fcb8e50102fdf6b7ac96a3ec127 \ + --hash=sha256:f7a7c26137296beba7784de6eba69c6a93a63ccebc385e4962fe67e267a91225 \ + --hash=sha256:fa411799ca8da32a8d38d020a88faa5b6f91657d284761352940ecf9f7c3bbdd \ + 
--hash=sha256:fd03c4f0e33280d15cae17159b899245d6b7c53d21def19b263b39655061f5ce \ + --hash=sha256:fd190e88a895a8901325fad284a3f74ea52b1da8525b76cc811fa9b1edf0ce2b \ + --hash=sha256:ff8d372ac2acdc048d1c19916f27ee61bc5722728458ba6ca5052f2c72d51763 + # via + # mlx-gen + # transformers +requests==2.34.2 \ + --hash=sha256:2a0d60c172f83ac6ab31e4554906c0f3b3588d37b5cb939b1c061f4907e278e0 \ + --hash=sha256:f288924cae4e29463698d6d60bc6a4da69c89185ad1e0bcc4104f584e960b9ed + # via + # mlx-gen + # requests-toolbelt + # twine +requests-toolbelt==1.0.0 \ + --hash=sha256:7681a0a3d047012b5bdc0ee37d7f8f07ebe76ab08caeccfc3921ce23c88d5bc6 \ + --hash=sha256:cccfdd665f0a24fcf4726e690f65639d272bb0637b9b92dfd91a5568ccf6bd06 + # via twine +rfc3986==2.0.0 \ + --hash=sha256:50b1502b60e289cb37883f3dfd34532b8873c7de9f49bb546641ce9cbd256ebd \ + --hash=sha256:97aacf9dbd4bfd829baad6e6309fa6573aaf1be3f6fa735c8ab05e46cecb261c + # via twine +rich==15.0.0 \ + --hash=sha256:33bd4ef74232fb73fe9279a257718407f169c09b78a87ad3d296f548e27de0bb \ + --hash=sha256:edd07a4824c6b40189fb7ac9bc4c52536e9780fbbfbddf6f1e2502c31b068c36 + # via + # twine + # typer +safetensors==0.8.0 \ + --hash=sha256:040070828e36dc8e122178bbbd5830ff9e97920affb84cbe0f46442497bed358 \ + --hash=sha256:096ec1a98435df7beb08853bb5aa9081a84f23d0adc67ed1a0a10550f608373f \ + --hash=sha256:2ddf52eac562eda224f99acfa7889d02968c1fd59a5b011ae7d8137c37e9c02d \ + --hash=sha256:3ae091f16662658bdc019a4ff6cb4c085bb7d725eb5978b183ffd265863b6d2d \ + --hash=sha256:4124502b78f03534117c848f87a39b8f31e577b15eff423bf8bfb95f2a8c30d0 \ + --hash=sha256:4a95ae2b05d7726d751da4ebf626a2ca782b706e101bd894c95bc2450b1cffcc \ + --hash=sha256:7a46e5ff292c356d6991e60942ba7f79817682d3a2cef0702136448cb9c4d235 \ + --hash=sha256:7bc0a787ba8a35be368ee3574edfa2b1ad389eebd0a72e482ae275490e3f6c98 \ + --hash=sha256:87eec7ffed2b809f05a398a8becb7d013f19f7837cd15d9748580d6cf30dbaf4 \ + --hash=sha256:8e080062fcde23be189565e1c3305d16751a218ecf9412c8601e64204eb6f846 \ + 
--hash=sha256:8e9f537aa183a38ace122d27303dcd986b26bd2a7591f9181d7f0c396f4677ca \ + --hash=sha256:c554f85858e05226d3c2828e32395e677434685d6d94594a41643361c5e837f0 \ + --hash=sha256:c80201d22cbf405b80647a60ada77bba06c8fba2da2743ba1e89cdcc39a81f25 \ + --hash=sha256:f7838e5135a406ad3e02efdcb8cf2e5397d368b0154537c4fec682dbc544d452 \ + --hash=sha256:fabaf3e0f18a6618d9b36560682562157f77c2b71fcffc7b432be2baed9d753d \ + --hash=sha256:fcdd41ec4628fee5799f807c73c353629130fbd942aa23d83c623dd6c9d52d78 \ + --hash=sha256:fd6f3f93c9a0a7cc2788ee63fb763353d4bd2e89b0751bc78fcf7dda00bea774 + # via + # mlx-gen + # transformers +sentencepiece==0.2.1 \ + --hash=sha256:010f025a544ef770bb395091d57cb94deb9652d8972e0d09f71d85d5a0816c8c \ + --hash=sha256:017f97b274d4b0baa84b2dc743bf4517be81156f413bb24f12aacacde378e5ab \ + --hash=sha256:01e6912125cb45d3792f530a4d38f8e21bf884d6b4d4ade1b2de5cf7a8d2a52b \ + --hash=sha256:02593eca45440ef39247cee8c47322a34bdcc1d8ae83ad28ba5a899a2cf8d79a \ + --hash=sha256:097f3394e99456e9e4efba1737c3749d7e23563dd1588ce71a3d007f25475fff \ + --hash=sha256:0a0d15781a171d188b661ae4bde1d998c303f6bd8621498c50c671bd45a4798e \ + --hash=sha256:0a81799d0a68d618e89063fb423c3001a034c893069135ffe51fee439ae474d6 \ + --hash=sha256:0c0f672da370cc490e4c59d89e12289778310a0e71d176c541e4834759e1ae07 \ + --hash=sha256:0cdfecef430d985f1c2bcbfff3defd1d95dae876fbd0173376012d2d7d24044b \ + --hash=sha256:105e36e75cbac1292642045458e8da677b2342dcd33df503e640f0b457cb6751 \ + --hash=sha256:10ed3dab2044c47f7a2e7b4969b0c430420cdd45735d78c8f853191fa0e3148b \ + --hash=sha256:1855f57db07b51fb51ed6c9c452f570624d2b169b36f0f79ef71a6e6c618cd8b \ + --hash=sha256:2005242a16d2dc3ac5fe18aa7667549134d37854823df4c4db244752453b78a8 \ + --hash=sha256:22c4ebcb3c6ab1496ab1c37c79ef7bb563b8726f29548c30773b7a4cb152df1a \ + --hash=sha256:251874d720ac7f28024a168501f3c7bb15d1802245f6e66de565f18bbb9b5eaa \ + --hash=sha256:27e38eee653abc3d387862e67bc5c8b6f428cd604e688b85d29170b7e725c26c \ + 
--hash=sha256:2af5a1fb05013332ad94343b8b5f3973e006a2dde2dfba55a819549e054e2f0f \ + --hash=sha256:2f27ae6deea72efdb6f361750c92f6c21fd0ad087445082770cc34015213c526 \ + --hash=sha256:33f068c9382dc2e7c228eedfd8163b52baa86bb92f50d0488bf2b7da7032e484 \ + --hash=sha256:39f8651bd10974eafb9834ce30d9bcf5b73e1fc798a7f7d2528f9820ca86e119 \ + --hash=sha256:3d165fbb9bf8fba35f1946ba2617c3f9995679f07438325f07c026d53f33e746 \ + --hash=sha256:477c81505db072b3ab627e7eab972ea1025331bd3a92bacbf798df2b75ea86ec \ + --hash=sha256:4cdc7c36234fda305e85c32949c5211faaf8dd886096c7cea289ddc12a2d02de \ + --hash=sha256:4f5a3e0d9f445ed9d66c0fec47d4b23d12cfc858b407a03c194c1b26c2ac2a63 \ + --hash=sha256:56dd39a3c4d6493db3cdca7e8cc68c6b633f0d4195495cbadfcf5af8a22d05a6 \ + --hash=sha256:57cae326c8727de58c85977b175af132a7138d84c764635d7e71bbee7e774133 \ + --hash=sha256:5d0350b686c320068702116276cfb26c066dc7e65cfef173980b11bb4d606719 \ + --hash=sha256:5e4366c97b68218fd30ea72d70c525e6e78a6c0a88650f57ac4c43c63b234a9d \ + --hash=sha256:60937c959e6f44159fdd9f56fbdd302501f96114a5ba436829496d5f32d8de3f \ + --hash=sha256:6356d0986b8b8dc351b943150fcd81a1c6e6e4d439772e8584c64230e58ca987 \ + --hash=sha256:6d297a1748d429ba8534eebe5535448d78b8acc32d00a29b49acf28102eeb094 \ + --hash=sha256:733e59ff1794d26db706cd41fc2d7ca5f6c64a820709cb801dc0ea31780d64ab \ + --hash=sha256:8138cec27c2f2282f4a34d9a016e3374cd40e5c6e9cb335063db66a0a3b71fad \ + --hash=sha256:814978ac05130dd5812b4b03215c766bc6abaef13e7bd72bc534e4d1e12e9a4c \ + --hash=sha256:82d9ead6591015f009cb1be1cb1c015d5e6f04046dbb8c9588b931e869a29728 \ + --hash=sha256:881b2e44b14fc19feade3cbed314be37de639fc415375cefaa5bc81a4be137fd \ + --hash=sha256:891ade6503dd93d418c03993f7d6a8aa20260c422cefff5096b9068185e67642 \ + --hash=sha256:89a3ea015517c42c0341d0d962f3e6aaf2cf10d71b1932d475c44ba48d00aa2b \ + --hash=sha256:8dd4b477a7b069648d19363aad0cab9bad2f4e83b2d179be668efa672500dc94 \ + --hash=sha256:8f8ba89a3acb3dc1ae90f65ec1894b0b9596fdb98ab003ff38e058f898b39bc7 \ + 
--hash=sha256:9076430ac25dfa7147d9d05751dbc66a04bc1aaac371c07f84952979ea59f0d0 \ + --hash=sha256:92b3816aa2339355fda2c8c4e021a5de92180b00aaccaf5e2808972e77a4b22f \ + --hash=sha256:99f955df238021bf11f0fc37cdb54fd5e5b5f7fd30ecc3d93fb48b6815437167 \ + --hash=sha256:a19adcec27c524cb7069a1c741060add95f942d1cbf7ad0d104dffa0a7d28a2b \ + --hash=sha256:a483fd29a34c3e34c39ac5556b0a90942bec253d260235729e50976f5dba1068 \ + --hash=sha256:ac650534e2251083c5f75dde4ff28896ce7c8904133dc8fef42780f4d5588fcd \ + --hash=sha256:ad8493bea8432dae8d6830365352350f3b4144415a1d09c4c8cb8d30cf3b6c3c \ + --hash=sha256:afefe50a0cdcb4f2fd9733cb52001a2c164181ee2d82c32d38f5b1b326a8528c \ + --hash=sha256:b3616ad246f360e52c85781e47682d31abfb6554c779e42b65333d4b5f44ecc0 \ + --hash=sha256:b81a24733726e3678d2db63619acc5a8dccd074f7aa7a54ecd5ca33ca6d2d596 \ + --hash=sha256:c415c9de1447e0a74ae3fdb2e52f967cb544113a3a5ce3a194df185cbc1f962f \ + --hash=sha256:c6c8f42949f419ff8c7e9960dbadcfbc982d7b5efc2f6748210d3dd53a7de062 \ + --hash=sha256:c7f0fd2f2693309e6628aeeb2e2faf6edd221134dfccac3308ca0de01f8dab47 \ + --hash=sha256:c7f54a31cde6fa5cb030370566f68152a742f433f8d2be458463d06c208aef33 \ + --hash=sha256:c83b85ab2d6576607f31df77ff86f28182be4a8de6d175d2c33ca609925f5da1 \ + --hash=sha256:caa4e560c72c151da80036aecc2159e51a7fd8ae9efebefd96860460ce6bd025 \ + --hash=sha256:d3233770f78e637dc8b1fda2cd7c3b99ec77e7505041934188a4e7fe751de3b0 \ + --hash=sha256:d7b670879c370d350557edabadbad1f6561a9e6968126e6debca4029e5547820 \ + --hash=sha256:d8b1d91545578852f128650b8cce4ec20f93d39b378ff554ebe66290f2dabb92 \ + --hash=sha256:d9381351182ff9888cc80e41c632e7e274b106f450de33d67a9e8f6043da6f76 \ + --hash=sha256:daeb5e9e9fcad012324807856113708614d534f596d5008638eb9b40112cd9e4 \ + --hash=sha256:dcd8161eee7b41aae57ded06272905dbd680a0a04b91edd0f64790c796b2f706 \ + --hash=sha256:e10fa50bdbaa5e2445dbd387979980d391760faf0ec99a09bd7780ff37eaec44 \ + --hash=sha256:e37e4b4c4a11662b5db521def4e44d4d30ae69a1743241412a93ae40fdcab4bb \ + 
--hash=sha256:e52144670738b4b477fade6c2a9b6af71a8d0094514c9853ac9f6fc1fcfabae7 + # via mlx-gen +setuptools==81.0.0 \ + --hash=sha256:487b53915f52501f0a79ccfd0c02c165ffe06631443a886740b91af4b7a5845a \ + --hash=sha256:fdd925d5c5d9f62e4b74b30d6dd7828ce236fd6ed998a08d81de62ce5a6310d6 + # via torch +shellingham==1.5.4 \ + --hash=sha256:7ecfff8f2fd72616f7481040475a65b2bf8af90a56c89140852d1120324e8686 \ + --hash=sha256:8dbca0739d487e5bd35ab3ca4b36e11c4078f3a234bfce294b0a0291363404de + # via typer +six==1.17.0 \ + --hash=sha256:4721f391ed90541fddacab5acf947aa0d3dc7d27b2e1e8eda2be8970586c3274 \ + --hash=sha256:ff70335d468e7eb6ec65b95b99d3a2836546063f63acc5171de367e834932a81 + # via python-dateutil +sympy==1.14.0 \ + --hash=sha256:d3d3fe8df1e5a0b42f0e7bdf50541697dbe7d23746e894990c030e2b05e72517 \ + --hash=sha256:e091cc3e99d2141a0ba2847328f5479b05d94a6635cb96148ccb3f34671bd8f5 + # via torch +tokenizers==0.22.2 \ + --hash=sha256:143b999bdc46d10febb15cbffb4207ddd1f410e2c755857b5a0797961bbdc113 \ + --hash=sha256:1a62ba2c5faa2dd175aaeed7b15abf18d20266189fb3406c5d0550dd34dd5f37 \ + --hash=sha256:1c774b1276f71e1ef716e5486f21e76333464f47bece56bbd554485982a9e03e \ + --hash=sha256:1e418a55456beedca4621dbab65a318981467a2b188e982a23e117f115ce5001 \ + --hash=sha256:1e50f8554d504f617d9e9d6e4c2c2884a12b388a97c5c77f0bc6cf4cd032feee \ + --hash=sha256:2249487018adec45d6e3554c71d46eb39fa8ea67156c640f7513eb26f318cec7 \ + --hash=sha256:25b85325d0815e86e0bac263506dd114578953b7b53d7de09a6485e4a160a7dd \ + --hash=sha256:29c30b83d8dcd061078b05ae0cb94d3c710555fbb44861139f9f83dcca3dc3e4 \ + --hash=sha256:319f659ee992222f04e58f84cbf407cfa66a65fe3a8de44e8ad2bc53e7d99012 \ + --hash=sha256:369cc9fc8cc10cb24143873a0d95438bb8ee257bb80c71989e3ee290e8d72c67 \ + --hash=sha256:37ae80a28c1d3265bb1f22464c856bd23c02a05bb211e56d0c5301a435be6c1a \ + --hash=sha256:38337540fbbddff8e999d59970f3c6f35a82de10053206a7562f1ea02d046fa5 \ + --hash=sha256:473b83b915e547aa366d1eee11806deaf419e17be16310ac0a14077f1e28f917 \ + 
--hash=sha256:544dd704ae7238755d790de45ba8da072e9af3eea688f698b137915ae959281c \ + --hash=sha256:64d94e84f6660764e64e7e0b22baa72f6cd942279fdbb21d46abd70d179f0195 \ + --hash=sha256:753d47ebd4542742ef9261d9da92cd545b2cacbb48349a1225466745bb866ec4 \ + --hash=sha256:791135ee325f2336f498590eb2f11dc5c295232f288e75c99a36c5dbce63088a \ + --hash=sha256:9ce725d22864a1e965217204946f830c37876eee3b2ba6fc6255e8e903d5fcbc \ + --hash=sha256:a6bf3f88c554a2b653af81f3204491c818ae2ac6fbc09e76ef4773351292bc92 \ + --hash=sha256:bfb88f22a209ff7b40a576d5324bf8286b519d7358663db21d6246fb17eea2d5 \ + --hash=sha256:c9ea31edff2968b44a88f97d784c2f16dc0729b8b143ed004699ebca91f05c48 \ + --hash=sha256:df6c4265b289083bf710dff49bc51ef252f9d5be33a45ee2bed151114a56207b \ + --hash=sha256:e10bf9113d209be7cd046d40fbabbaf3278ff6d18eb4da4c500443185dc1896c \ + --hash=sha256:f01a9c019878532f98927d2bacb79bbb404b43d3437455522a00a30718cdedb5 + # via transformers +toml==0.10.2 \ + --hash=sha256:806143ae5bfb6a3c6e736a764057db0e6a0e05e338b5630894a5f779cabb4f9b \ + --hash=sha256:b3bda1d108d5dd99f4a20d24d9c348e91c4db7ab1b749200bded2f839ccbe68f + # via mlx-gen +torch==2.12.0 \ + --hash=sha256:10802fd383bbfed646212e765a72c37d2185205d4f26eb197a254e8ac7ddcb25 \ + --hash=sha256:10ee1448a9f304d3b987eb4656f664ba6e4d7b410ca7a5a7c642199777a2cf88 \ + --hash=sha256:1834bd984f8a2f4f16bdfbeecca9146184b220aa46276bf5756735b5dae12812 \ + --hash=sha256:2140e373e9a51a3e22ef62e8d14366d0b470d18f0adf19fdc757368077133a34 \ + --hash=sha256:3fee918902090ade827643e758e98363278815de583c75d111fdd665ebffde9f \ + --hash=sha256:415c1b8d0412f67551c8e89a2daca0fb3e56694af0281ba155eaa9da481f58b4 \ + --hash=sha256:4b4f64c2c2b11f7510d93dd6412b87025ff6eddd6bb61c3b5a3d892ea20c4756 \ + --hash=sha256:5d6b560dfa7d56291c07d615c3bb73e8d9943d9b6d87f76cd0d9d570c4797fa6 \ + --hash=sha256:5f96b63f8287f66a005dd1b5a6abba2920f11156c5e5c4d815f3e2050fd1aa16 \ + --hash=sha256:6a7512adfdd7f6732e40de1c620831e3c75b39b98cef60b11d0c5f0a76473ec5 \ + 
--hash=sha256:864392c73b7654f4d2b3ae712f607937d0dbb1101c4555fbb41848106b297f39 \ + --hash=sha256:891c769072637c74e9a5a77a3bc782894696d8ffec83b938df8536dee7f0ba78 \ + --hash=sha256:8b958caff4a14d3a3b0b2dfc6a378f64dda9728a9dad28c08a0db9ce4dafb549 \ + --hash=sha256:8fbef9f108a863e7722a73740998967e3b074742a834fc5be3a535a2befa7057 \ + --hash=sha256:90dd587a5f61bfe1307148b581e2084fc5bc4a06e2b90a20e9a36b81087ff16b \ + --hash=sha256:a43ac605a5e13116c72b64c359644cce0229f213dde48d2ae0ae5eb5becf7feb \ + --hash=sha256:a6a2eebb237d3b1d9ad3b378e86d9b9e0782afdea8b1e0eba6a13646b9b49c07 \ + --hash=sha256:af68dbf403439cae9ceaeaaf92f8352b460787dcd27b92aa05c40dd4a19c0f1e \ + --hash=sha256:b41339df93d491435e790ff8bcbae1c0ce777175889bfd1281d119862793e6a2 \ + --hash=sha256:b4556715c8572758625d62b6e0ae3b1f76c440221913a6fb5e100f321fb4fb02 \ + --hash=sha256:c12592630aef72feaf18bd3f197ef587bbfa21131b31c38b23ab2e55fce92e36 \ + --hash=sha256:c66696857e987efb8bc1777a37357ec4f60ab5e8af6250b83d6034437fa2d8f3 \ + --hash=sha256:cf9839790285dd472e7a16aafcb4a4e6bf58ec1b494045044b0eefb0eb4bd1f2 \ + --hash=sha256:d47e7dee68ac4cd7a068b26bcd6b989935427709fae1c8f7bd0019978f829e15 \ + --hash=sha256:d4d029801cb7b6df858804a2a21b00cc2aa0bf0ee5d2ab18d343c9e9e5681f35 \ + --hash=sha256:dd37188ea325042cb1f6cafa56822b11ada2520c04791a52629b0af25bdfbfd9 \ + --hash=sha256:e2ad3eb85d39c3cab62dfa93ed5a73516e6a53c6713cb97d004004fe089f0f1f \ + --hash=sha256:f7dfae4a519197dfa050e98d8e36378a0fb5899625a875c2b54445005a2e404e + # via mlx-gen +tqdm==4.68.2 \ + --hash=sha256:89c230e8dbc67c7615c142487111222f878c77427ea09549960f62389e258add \ + --hash=sha256:d4240441fb5353290b87d6a85968c9decc131a99b8c7faa28269d829de669ede + # via + # huggingface-hub + # mlx-gen + # transformers +transformers==5.10.2 \ + --hash=sha256:8a669db546f82c7c3618cb46ceb0f0afd89292bc70f319c058f8332ec63e268d \ + --hash=sha256:f9a44b9c8ca9ab1156b467f574d832ea066284299c2fd0ed84641ccb592751fc + # via mlx-gen +twine==6.2.0 \ + 
--hash=sha256:418ebf08ccda9a8caaebe414433b0ba5e25eb5e4a927667122fbe8f829f985d8 \ + --hash=sha256:e5ed0d2fd70c9959770dce51c8f39c8945c574e18173a7b81802dab51b4b75cf + # via mlx-gen +typer==0.25.1 \ + --hash=sha256:75caa44ed46a03fb2dab8808753ffacdbfea88495e74c85a28c5eefcf5f39c89 \ + --hash=sha256:9616eb8853a09ffeabab1698952f33c6f29ffdbceb4eaeecf571880e8d7664cc + # via + # huggingface-hub + # transformers +typing-extensions==4.15.0 \ + --hash=sha256:0cea48d173cc12fa28ecabc3b837ea3cf6f38c6d1136f85cbaaf598984861466 \ + --hash=sha256:f0fa19c6845758ab08074a0cfa8b7aecb71c999ca73d62883bc25cc018c4e548 + # via + # anyio + # huggingface-hub + # torch +urllib3==2.7.0 \ + --hash=sha256:231e0ec3b63ceb14667c67be60f2f2c40a518cb38b03af60abc813da26505f4c \ + --hash=sha256:9fb4c81ebbb1ce9531cce37674bbc6f1360472bc18ca9a553ede278ef7276897 + # via + # id + # mlx-gen + # requests + # twine diff --git a/omlx/video/worker.py b/omlx/video/worker.py new file mode 100644 index 000000000..9e6ce3dfa --- /dev/null +++ b/omlx/video/worker.py @@ -0,0 +1,209 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: Apache-2.0 +"""Video generation subprocess worker. + +Runs ONE generation job and exits. Spawned by VideoJobManager as: + + /bin/python -I /omlx/video/worker.py --spec job_spec.json + +HARD RULE: this script must not import omlx. It runs under the video venv +(mlx-gen + its deps); only mflux, mlx and the standard library are +available. See docs/video-generation-engine-spec.md section 4.2. + +Protocol: +- stdout: one JSON object per line. Phase heartbeats ({"phase": ...}) are + emitted on every phase transition so silent long phases (42GB weight + load, torch text encoding, VAE decode) still show liveness; denoise + steps additionally carry step/total_steps. The manager tracks the last + line timestamp for stall detection. +- Exit 0 + the output mp4 present and healthy = success. 
A result manifest + with timings and the kernel lifetime-max memory peak is written next to + the output for calibration records. +- Any failure: a failure manifest {code, message, detail} is written at + spec["manifest_path"] and the exit code is non-zero. + +Memory: before loading anything the worker pins its own Metal wired limit +inside the lease (spec 4.4 layer 1) -- overshoot degrades to non-resident +pages or an in-process allocation failure, never wired-sum growth toward +the machine cap. +""" + +from __future__ import annotations + +import argparse +import ctypes +import json +import os +import sys +import time +import traceback + +GB = 1024**3 +_T0 = time.time() + + +def _emit(**kw) -> None: + kw["t"] = round(time.time() - _T0, 1) + try: + print(json.dumps(kw), flush=True) + except Exception: + # Never let progress reporting kill the generation (a raising + # progress callback aborts mlx-gen's denoise loop). + pass + + +def _lifetime_max_phys() -> int: + """Own-process lifetime-max phys_footprint via libproc (best effort). + + rusage_info_v4 layout from sys/resource.h: ri_uuid (16 bytes), then 28 + c_uint64 fields, then ri_lifetime_max_phys_footprint. Standalone copy -- + this script cannot import omlx/utils/proc_memory.py. 
+ """ + try: + class _RusageInfoV4(ctypes.Structure): + _fields_ = ( + [("ri_uuid", ctypes.c_uint8 * 16)] + + [(f"_u{i}", ctypes.c_uint64) for i in range(28)] + + [("ri_lifetime_max_phys_footprint", ctypes.c_uint64)] + + [("_tail", ctypes.c_uint64 * 6)] + ) + + libproc = ctypes.CDLL("/usr/lib/libproc.dylib", use_errno=True) + fn = libproc.proc_pid_rusage + fn.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p] + fn.restype = ctypes.c_int + info = _RusageInfoV4() + if fn(os.getpid(), 4, ctypes.byref(info)) != 0: + return 0 + return int(info.ri_lifetime_max_phys_footprint) + except Exception: + return 0 + + +def _write_manifest(path: str, payload: dict) -> None: + try: + tmp = path + ".tmp" + with open(tmp, "w") as f: + json.dump(payload, f, indent=1) + os.replace(tmp, path) + except Exception: + pass + + +def run(spec: dict) -> int: + manifest_path = spec["manifest_path"] + output_path = spec["output_path"] + + # Layer-1 memory containment: pin our Metal wired limit inside the + # lease BEFORE any weights load. + lease = int(spec.get("lease_bytes", 0)) + margin = int(spec.get("wired_margin_bytes", 2 * GB)) + if lease > 0: + import mlx.core as mx + + limit = max(1 * GB, lease - margin) + try: + mx.set_wired_limit(limit) + _emit(phase="wired_limit_set", limit_gb=round(limit / GB, 1)) + except Exception as e: + _emit(phase="wired_limit_failed", error=str(e)) + + # Low-RAM mode (default ON): release the inactive/high-noise denoiser + # after the boundary step, free both transformers before VAE decode and + # clear the MLX cache per step. P0 measurement showed the natural-mode + # peak at ~49GB even for small profiles; the low-RAM knobs are what the + # official benchmarks (20.7GB) use. Cost: the model instance is dead + # after one generation -- irrelevant here, one process per job. 
+ low_ram = bool(spec.get("low_ram", True)) + if low_ram: + import mlx.core as mx + + try: + mx.set_cache_limit(1 * GB) + except Exception: + pass + + _emit(phase="loading") + from mflux.models.common.config.model_config import ModelConfig + from mflux.models.wan.variants import Wan2_2_TI2V + + model = Wan2_2_TI2V( + model_config=ModelConfig.wan2_2_t2v_a14b(), + model_path=spec["model_dir"], + ) + _emit(phase="loaded") + + def cb(ev) -> None: + _emit( + phase=str(getattr(ev, "phase", "denoise")), + step=int(getattr(ev, "step", 0) or 0), + total_steps=int(getattr(ev, "total_steps", 0) or 0), + ) + + kwargs = dict( + seed=int(spec["seed"]), + prompt=spec["prompt"], + num_inference_steps=int(spec["steps"]), + height=int(spec["height"]), + width=int(spec["width"]), + num_frames=int(spec["frames"]), + fps=int(spec["fps"]), + progress_callback=cb, + ) + if low_ram: + kwargs["release_inactive_denoiser"] = True + kwargs["release_denoisers_before_decode"] = True + kwargs["clear_cache_each_step"] = True + if spec.get("negative_prompt"): + kwargs["negative_prompt"] = spec["negative_prompt"] + if spec.get("guidance") is not None: + kwargs["guidance"] = float(spec["guidance"]) + if spec.get("guidance_2") is not None: + kwargs["guidance_2"] = float(spec["guidance_2"]) + + video = model.generate_video(**kwargs) + + _emit(phase="saving") + os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True) + video.save(output_path) + + wall = round(time.time() - _T0, 1) + _write_manifest( + manifest_path, + { + "status": "completed", + "wall_seconds": wall, + "lifetime_max_phys_gb": round(_lifetime_max_phys() / GB, 2), + "output_bytes": ( + os.path.getsize(output_path) if os.path.exists(output_path) else 0 + ), + }, + ) + _emit(phase="done", wall_seconds=wall) + return 0 + + +def main() -> int: + ap = argparse.ArgumentParser() + ap.add_argument("--spec", required=True) + args = ap.parse_args() + with open(args.spec) as f: + spec = json.load(f) + try: + return run(spec) + except Exception as
e: + _write_manifest( + spec.get("manifest_path", args.spec + ".manifest.json"), + { + "status": "failed", + "code": "worker_crashed", + "message": f"{type(e).__name__}: {e}", + "detail": traceback.format_exc()[-4000:], + }, + ) + _emit(phase="failed", error=f"{type(e).__name__}: {e}") + return 1 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/scripts/video_p0_measure.py b/scripts/video_p0_measure.py new file mode 100644 index 000000000..5bd3728c3 --- /dev/null +++ b/scripts/video_p0_measure.py @@ -0,0 +1,302 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: Apache-2.0 +"""Standalone P0 measurement harness for the fmlx video engine spec. + +Runs Wan2.2 T2V generation profiles under the video venv and measures the +true per-run memory peak via the kernel lifetime-max phys_footprint ledger +(ri_lifetime_max_phys_footprint). Each profile runs in a fresh child process +so the lifetime max is exact for that run: model load + text encoding + +denoise + VAE decode + every sub-poll spike. + +Must run under the video venv python (needs mflux). Does NOT import omlx +(see docs/video-generation-engine-spec.md section 4.2: worker venv isolation). + +Parent mode (default): spawns one child per profile, samples the child's +phys_footprint every 0.5s, writes per-profile samples + results and a +summary.json. + +Child mode (--single): loads the model, generates, saves the mp4, then reads +its OWN lifetime-max ledger and writes a result JSON. + +Usage: + video_p0_measure.py --model DIR --out DIR [--profiles default,steps40,...] 
+""" + +from __future__ import annotations + +import argparse +import ctypes +import json +import os +import subprocess +import sys +import threading +import time + +# --------------------------------------------------------------------------- +# phys_footprint via libproc (standalone copy of omlx/utils/proc_memory.py +# layout; this script must not import omlx) +# --------------------------------------------------------------------------- + + +class _RusageInfoV4(ctypes.Structure): + _fields_ = [ + ("ri_uuid", ctypes.c_uint8 * 16), + ("ri_user_time", ctypes.c_uint64), + ("ri_system_time", ctypes.c_uint64), + ("ri_pkg_idle_wkups", ctypes.c_uint64), + ("ri_interrupt_wkups", ctypes.c_uint64), + ("ri_pageins", ctypes.c_uint64), + ("ri_wired_size", ctypes.c_uint64), + ("ri_resident_size", ctypes.c_uint64), + ("ri_phys_footprint", ctypes.c_uint64), + ("ri_proc_start_abstime", ctypes.c_uint64), + ("ri_proc_exit_abstime", ctypes.c_uint64), + ("ri_child_user_time", ctypes.c_uint64), + ("ri_child_system_time", ctypes.c_uint64), + ("ri_child_pkg_idle_wkups", ctypes.c_uint64), + ("ri_child_interrupt_wkups", ctypes.c_uint64), + ("ri_child_pageins", ctypes.c_uint64), + ("ri_child_elapsed_abstime", ctypes.c_uint64), + ("ri_diskio_bytesread", ctypes.c_uint64), + ("ri_diskio_byteswritten", ctypes.c_uint64), + ("ri_cpu_time_qos_default", ctypes.c_uint64), + ("ri_cpu_time_qos_maintenance", ctypes.c_uint64), + ("ri_cpu_time_qos_background", ctypes.c_uint64), + ("ri_cpu_time_qos_utility", ctypes.c_uint64), + ("ri_cpu_time_qos_legacy", ctypes.c_uint64), + ("ri_cpu_time_qos_user_initiated", ctypes.c_uint64), + ("ri_cpu_time_qos_user_interactive", ctypes.c_uint64), + ("ri_billed_system_time", ctypes.c_uint64), + ("ri_serviced_system_time", ctypes.c_uint64), + ("ri_logical_writes", ctypes.c_uint64), + ("ri_lifetime_max_phys_footprint", ctypes.c_uint64), + ("ri_instructions", ctypes.c_uint64), + ("ri_cycles", ctypes.c_uint64), + ("ri_billed_energy", ctypes.c_uint64), + 
("ri_serviced_energy", ctypes.c_uint64), + ("ri_interval_max_phys_footprint", ctypes.c_uint64), + ("ri_runnable_time", ctypes.c_uint64), + ] + + +_RUSAGE_INFO_V4 = 4 +_libproc = ctypes.CDLL("/usr/lib/libproc.dylib", use_errno=True) +_proc_pid_rusage = _libproc.proc_pid_rusage +_proc_pid_rusage.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p] +_proc_pid_rusage.restype = ctypes.c_int + + +def _rusage(pid: int) -> _RusageInfoV4 | None: + info = _RusageInfoV4() + if _proc_pid_rusage(pid, _RUSAGE_INFO_V4, ctypes.byref(info)) != 0: + return None + return info + + +def phys_footprint(pid: int) -> int: + info = _rusage(pid) + return info.ri_phys_footprint if info else 0 + + +def lifetime_max_phys(pid: int) -> int: + info = _rusage(pid) + return info.ri_lifetime_max_phys_footprint if info else 0 + + +# --------------------------------------------------------------------------- +# profiles +# --------------------------------------------------------------------------- + +PROMPT = "A red fox running through a snowy forest at dawn, cinematic, soft light" +SEED = 42 + +PROFILES: dict[str, dict] = { + # name: width height frames steps fps (frames must be 4n+1, dims /16). + # lowram=True mirrors the production worker defaults (mx cache limit + # 1GB + release denoisers + clear cache per step) -- the numbers that + # calibrate the shipped lease/predictor. Natural-mode profiles measure + # the unconstrained envelope. 
+ "default": dict(width=480, height=272, frames=49, steps=20, fps=16), + "steps40": dict(width=480, height=272, frames=49, steps=40, fps=16), + "mid_spatial": dict(width=832, height=480, frames=49, steps=20, fps=16), + "frames101": dict(width=480, height=272, frames=101, steps=20, fps=16), + "default_lowram": dict( + width=480, height=272, frames=49, steps=20, fps=16, lowram=True + ), + "mid_spatial_lowram": dict( + width=832, height=480, frames=49, steps=20, fps=16, lowram=True + ), + "frames101_lowram": dict( + width=480, height=272, frames=101, steps=20, fps=16, lowram=True + ), +} + +GB = 1024**3 + + +# --------------------------------------------------------------------------- +# child mode: run one profile, report own lifetime max +# --------------------------------------------------------------------------- + + +def run_single(model_dir: str, out_dir: str, name: str) -> int: + p = PROFILES[name] + t0 = time.time() + lowram = bool(p.get("lowram", False)) + + def emit(**kw): + kw["t"] = round(time.time() - t0, 1) + print(json.dumps(kw), flush=True) + + if lowram: + import mlx.core as mx + + try: + mx.set_cache_limit(1 * GB) + except Exception: + pass + + emit(phase="loading") + from mflux.models.common.config.model_config import ModelConfig + from mflux.models.wan.variants import Wan2_2_TI2V + + model = Wan2_2_TI2V( + model_config=ModelConfig.wan2_2_t2v_a14b(), model_path=model_dir + ) + emit(phase="loaded") + + def cb(ev): + emit( + phase=getattr(ev, "phase", "?"), + step=getattr(ev, "step", 0), + total_steps=getattr(ev, "total_steps", 0), + ) + + gen_kwargs = dict( + seed=SEED, + prompt=PROMPT, + num_inference_steps=p["steps"], + height=p["height"], + width=p["width"], + num_frames=p["frames"], + fps=p["fps"], + progress_callback=cb, + ) + if lowram: + gen_kwargs.update( + release_inactive_denoiser=True, + release_denoisers_before_decode=True, + clear_cache_each_step=True, + ) + video = model.generate_video(**gen_kwargs) + emit(phase="saving") + out_mp4 = 
os.path.join(out_dir, f"{name}.mp4") + video.save(out_mp4) + wall = time.time() - t0 + # read own ledger BEFORE exit (proc_pid_rusage fails on a reaped pid) + result = { + "profile": name, + "params": p, + "wall_seconds": round(wall, 1), + "lifetime_max_phys_gb": round(lifetime_max_phys(os.getpid()) / GB, 2), + "final_phys_gb": round(phys_footprint(os.getpid()) / GB, 2), + "output": out_mp4, + "output_bytes": os.path.getsize(out_mp4) if os.path.exists(out_mp4) else 0, + "seed": SEED, + } + with open(os.path.join(out_dir, f"{name}.result.json"), "w") as f: + json.dump(result, f, indent=1) + emit(phase="done", wall_seconds=result["wall_seconds"]) + return 0 + + +# --------------------------------------------------------------------------- +# parent mode: spawn child per profile, sample its footprint +# --------------------------------------------------------------------------- + + +def run_parent(model_dir: str, out_dir: str, names: list[str], timeout_s: int) -> int: + os.makedirs(out_dir, exist_ok=True) + summary = {"profiles": {}, "started_at": time.strftime("%Y-%m-%d %H:%M:%S")} + for name in names: + print(f"=== profile {name} ===", flush=True) + log_path = os.path.join(out_dir, f"{name}.events.jsonl") + samples_path = os.path.join(out_dir, f"{name}.samples.jsonl") + child = subprocess.Popen( + [sys.executable, os.path.abspath(__file__), "--single", name, + "--model", model_dir, "--out", out_dir], + stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True, + ) + stop = threading.Event() + peak = {"sampled_max": 0, "max_delta_per_sample": 0} + + def sampler(): + last = 0 + with open(samples_path, "w") as sf: + while not stop.is_set(): + b = phys_footprint(child.pid) + if b: + t = round(time.time(), 1) + sf.write(json.dumps({"t": t, "gb": round(b / GB, 3)}) + "\n") + sf.flush() + peak["sampled_max"] = max(peak["sampled_max"], b) + if last: + peak["max_delta_per_sample"] = max( + peak["max_delta_per_sample"], b - last + ) + last = b + stop.wait(0.5) + + th = 
threading.Thread(target=sampler, daemon=True) + th.start() + deadline = time.time() + timeout_s + with open(log_path, "w") as lf: + for line in child.stdout: # type: ignore[union-attr] + lf.write(line) + lf.flush() + print(f" [{name}] {line.rstrip()}", flush=True) + if time.time() > deadline: + child.kill() + print(f" [{name}] TIMEOUT after {timeout_s}s, killed", flush=True) + break + rc = child.wait() + stop.set() + th.join(timeout=2) + entry = { + "exit_code": rc, + "sampled_max_gb": round(peak["sampled_max"] / GB, 2), + "max_delta_per_0p5s_gb": round(peak["max_delta_per_sample"] / GB, 2), + } + rpath = os.path.join(out_dir, f"{name}.result.json") + if os.path.exists(rpath): + with open(rpath) as f: + entry.update(json.load(f)) + summary["profiles"][name] = entry + with open(os.path.join(out_dir, "summary.json"), "w") as f: + json.dump(summary, f, indent=1) + print(f"=== {name} done: {json.dumps(entry)} ===", flush=True) + print("=== ALL DONE ===", flush=True) + return 0 + + +def main() -> int: + ap = argparse.ArgumentParser() + ap.add_argument("--model", required=True) + ap.add_argument("--out", required=True) + ap.add_argument("--profiles", default="default,steps40,mid_spatial,frames101") + ap.add_argument("--single", default=None, help="internal: run one profile in-process") + ap.add_argument("--timeout", type=int, default=10800) + args = ap.parse_args() + if args.single: + return run_single(args.model, args.out, args.single) + names = [n.strip() for n in args.profiles.split(",") if n.strip()] + for n in names: + if n not in PROFILES: + print(f"unknown profile {n}; known: {list(PROFILES)}", file=sys.stderr) + return 2 + return run_parent(args.model, args.out, names, args.timeout) + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tests/test_process_memory_enforcer.py b/tests/test_process_memory_enforcer.py index b456ecc66..64ae81c0b 100644 --- a/tests/test_process_memory_enforcer.py +++ b/tests/test_process_memory_enforcer.py @@ -965,6 +965,63 @@ 
async def test_check_and_enforce_walks_caps_on_soft(self, enforcer): scheduler.adjust_store_cache_cap.assert_called_with("soft") +class TestRecentPeakTracking: + """Tests for recent-peak high-water tracking across poll ticks.""" + + @pytest.mark.asyncio + async def test_recent_peak_is_window_max(self, enforcer): + """After several poll ticks, recent_peak == max over the window. + + The window update lives after the ceiling > 0 early return in + _check_and_enforce, so the fixture's positive ceiling (10 GB) is + required for the update to run at all. + """ + readings = [3 * 1024**3, 5 * 1024**3, 2 * 1024**3, 4 * 1024**3] + with patch("omlx.process_memory_enforcer.mx") as mock_mx, patch( + "omlx.process_memory_enforcer.get_phys_footprint", return_value=0 + ): + mock_mx.get_active_memory.side_effect = _cycling(readings) + for _ in readings: + await enforcer._check_and_enforce() + + assert enforcer.recent_peak_bytes() == 5 * 1024**3 + + @pytest.mark.asyncio + async def test_recent_peak_drops_after_window_slides(self, enforcer): + """Old high readings age out once they leave the maxlen=5 window.""" + # Feed one big reading, then enough small ones to push it out of the + # 5-slot window. + big = 9 * 1024**3 + small = 1 * 1024**3 + readings = [big, small, small, small, small, small] + with patch("omlx.process_memory_enforcer.mx") as mock_mx, patch( + "omlx.process_memory_enforcer.get_phys_footprint", return_value=0 + ): + mock_mx.get_active_memory.side_effect = _cycling(readings) + for _ in readings: + await enforcer._check_and_enforce() + + # After 6 ticks the first (big) reading has slid out of the window, + # leaving only small readings. 
+ assert enforcer.recent_peak_bytes() == small + + def test_propagates_recent_peak_to_scheduler(self, enforcer): + """_propagate_memory_limit pushes recent_peak onto each scheduler.""" + scheduler = MagicMock(spec=[]) + scheduler._memory_limit_bytes = 0 + scheduler._memory_hard_limit_bytes = 0 + scheduler._memory_recent_peak_bytes = 0 + engine = MagicMock(spec=[]) + engine.scheduler = scheduler + entry = _make_entry("model-a", engine=engine) + enforcer._engine_pool._entries = {"model-a": entry} + + enforcer._recent_peak_bytes = 7 * 1024**3 + enforcer._propagate_memory_limit() + + assert scheduler._memory_recent_peak_bytes == 7 * 1024**3 + + class TestProperties: """Tests for enforcer properties.""" diff --git a/tests/test_scheduler_admission.py b/tests/test_scheduler_admission.py index 0ed0c6458..1640ef8d3 100644 --- a/tests/test_scheduler_admission.py +++ b/tests/test_scheduler_admission.py @@ -2,13 +2,15 @@ """Tests for scheduler admission control (queue depth cap + admission_paused).""" from collections import deque -from unittest.mock import MagicMock +from unittest.mock import MagicMock, patch import pytest from omlx.exceptions import SchedulerQueueFullError from omlx.scheduler import Scheduler +GB = 1024**3 + @pytest.fixture def scheduler(): @@ -100,3 +102,77 @@ def test_default_false(self): s._prefill_memory_guard = False s._admission_paused = False assert s._admission_paused is False + + +def _preflight_scheduler(hard_limit: int, recent_peak: int, peak: int): + """Build a bare Scheduler wired for _preflight_memory_check. + + `peak` is the value the (mocked) memory_monitor estimates for the + prefill chunk; `recent_peak` is the propagated high-water mark. 
+ """ + s = Scheduler.__new__(Scheduler) + s._prefill_memory_guard = True + s._memory_hard_limit_bytes = hard_limit + s._memory_recent_peak_bytes = recent_peak + s.config = MagicMock(prefill_step_size=2048) + s.memory_monitor = MagicMock() + s.memory_monitor.estimate_prefill_peak_bytes = MagicMock(return_value=peak) + return s + + +def _preflight_request(): + r = MagicMock() + r.num_prompt_tokens = 8192 + r.cached_tokens = 0 + return r + + +class TestPreflightRecentPeak: + """_preflight_memory_check uses the recent high-water mark, not just the + instant reading, so it does not wave through a request during a prefill + trough that would wall the next chunk.""" + + def test_rejects_on_recent_peak_when_instant_is_low(self): + """Instant active/phys low but recent_peak high -> reject. + + Picks numbers so that low + peak fits (pre-change behaviour would + admit) but recent_peak + peak exceeds the hard limit. This pins the + fix. + """ + hard_limit = 100 * GB + peak = 20 * GB + low = 10 * GB + high = 85 * GB + # Sanity: old code (low + peak) would have passed. + assert low + peak <= hard_limit + # New code (high + peak) must exceed the limit. 
+ assert high + peak > hard_limit + + s = _preflight_scheduler( + hard_limit=hard_limit, recent_peak=high, peak=peak + ) + with patch("omlx.scheduler.mx") as mock_mx, patch( + "omlx.scheduler.get_phys_footprint", return_value=low + ): + mock_mx.get_active_memory.return_value = low + result = s._preflight_memory_check(_preflight_request()) + + assert result is not None + assert "Prefill would require" in result + + def test_admits_when_recent_peak_also_low(self): + """Control: when recent_peak is low too, the request passes.""" + hard_limit = 100 * GB + peak = 20 * GB + low = 10 * GB + + s = _preflight_scheduler( + hard_limit=hard_limit, recent_peak=low, peak=peak + ) + with patch("omlx.scheduler.mx") as mock_mx, patch( + "omlx.scheduler.get_phys_footprint", return_value=low + ): + mock_mx.get_active_memory.return_value = low + result = s._preflight_memory_check(_preflight_request()) + + assert result is None diff --git a/tests/test_scheduler_prefill_forward_gate.py b/tests/test_scheduler_prefill_forward_gate.py new file mode 100644 index 000000000..ff6932645 --- /dev/null +++ b/tests/test_scheduler_prefill_forward_gate.py @@ -0,0 +1,427 @@ +# SPDX-License-Identifier: Apache-2.0 +"""Tests for the forward-FRONT prefill memory gate (P0c). + +The gate (_prefill_forward_gate) predicts a prefill chunk's peak memory +BEFORE running self.model(...) and raises RuntimeError when it would breach +the hard cap, so the request is aborted cleanly instead of the transient +landing on the Metal ceiling and kernel-panicking the machine. The legacy +chunk-END check only fires after the allocation has already happened, which +on Apple Silicon is too late. + +Strategy: pure mocks, no model load. The discriminating assertion is that +when the predicted peak exceeds the cap the model forward is NOT called -- +on pre-change code (no forward-front gate) the forward WOULD run. 
+""" + +import logging +from unittest.mock import MagicMock, patch + +import mlx.core as mx +import pytest + +from omlx.request import Request, RequestStatus, SamplingParams +from omlx.scheduler import Scheduler, SchedulerConfig, _PrefillState + +GB = 1024**3 + + +# --------------------------------------------------------------------------- +# Direct unit tests of _prefill_forward_gate +# --------------------------------------------------------------------------- + + +def _gate_scheduler( + *, + hard_limit: int, + recent_peak: int, + estimate: int, + margin: int, + guard: bool = True, + monitor: bool = True, +): + """Build a bare Scheduler wired only for _prefill_forward_gate.""" + s = Scheduler.__new__(Scheduler) + s._prefill_memory_guard = guard + s._memory_hard_limit_bytes = hard_limit + s._memory_recent_peak_bytes = recent_peak + s._prefill_transient_margin_bytes = margin + s.config = MagicMock(prefill_step_size=2048) + if monitor: + s.memory_monitor = MagicMock() + s.memory_monitor.estimate_prefill_peak_bytes = MagicMock( + return_value=estimate + ) + else: + s.memory_monitor = None + return s + + +def _call_gate(s, chunk_tokens, *, instant): + """Invoke the gate with patched instant memory probes.""" + with patch("omlx.scheduler.mx") as mock_mx, patch( + "omlx.scheduler.get_phys_footprint", return_value=instant + ): + mock_mx.get_active_memory.return_value = instant + s._prefill_forward_gate( + chunk_tokens, request_id="rid-1", loop_label="external" + ) + + +class TestPrefillForwardGateUnit: + """Direct tests of the gate predicate.""" + + def test_raises_when_predicted_peak_exceeds_cap(self): + """current(high-water) + estimate + margin > cap -> RuntimeError. + + Numbers chosen so the instant reading alone (low) + estimate would + fit, but the high-water recent_peak + estimate + margin overflow. 
+ """ + hard = 107 * GB + estimate = 2 * GB + margin = 10 * GB + instant = 50 * GB + recent_peak = 96 * GB + # Instant + estimate (no margin) fits; this is the trough the legacy + # check could read. + assert instant + estimate <= hard + # High-water + estimate + margin overflows -> must refuse. + assert recent_peak + estimate + margin > hard + + s = _gate_scheduler( + hard_limit=hard, + recent_peak=recent_peak, + estimate=estimate, + margin=margin, + ) + with pytest.raises(RuntimeError, match="refused before forward"): + _call_gate(s, 256, instant=instant) + + def test_passes_when_predicted_peak_fits(self): + """current + estimate + margin <= cap -> no raise.""" + hard = 107 * GB + s = _gate_scheduler( + hard_limit=hard, + recent_peak=80 * GB, + estimate=2 * GB, + margin=10 * GB, + ) + # 80 + 2 + 10 = 92 < 107. + _call_gate(s, 256, instant=80 * GB) # must not raise + + def test_margin_is_what_tips_it_over(self): + """Without the margin it would pass; the margin alone forces refusal. + + Pins that the margin term is actually applied (not dropped). + """ + hard = 100 * GB + estimate = 1 * GB + instant = 90 * GB + recent_peak = 90 * GB + # current + estimate (no margin) = 91 < 100 -> would pass. + assert recent_peak + estimate < hard + # current + estimate + margin = 101 > 100 -> must refuse. + margin = 10 * GB + assert recent_peak + estimate + margin > hard + + s = _gate_scheduler( + hard_limit=hard, + recent_peak=recent_peak, + estimate=estimate, + margin=margin, + ) + with pytest.raises(RuntimeError): + _call_gate(s, 256, instant=instant) + + # Same setup, margin=0 -> passes (control). 
+ s0 = _gate_scheduler( + hard_limit=hard, + recent_peak=recent_peak, + estimate=estimate, + margin=0, + ) + _call_gate(s0, 256, instant=instant) # must not raise + + def test_uses_recent_peak_high_water_not_just_instant(self): + """A mid-prefill trough in the instant reading must not mask the + real footprint: recent_peak high + low instant still refuses.""" + hard = 107 * GB + s = _gate_scheduler( + hard_limit=hard, + recent_peak=100 * GB, # real footprint + estimate=2 * GB, + margin=10 * GB, + ) + # Instant reads a trough at 50GB; without recent_peak it would pass. + assert 50 * GB + 2 * GB + 10 * GB < hard + with pytest.raises(RuntimeError): + _call_gate(s, 256, instant=50 * GB) + + def test_noop_when_guard_off(self): + s = _gate_scheduler( + hard_limit=107 * GB, + recent_peak=200 * GB, + estimate=200 * GB, + margin=10 * GB, + guard=False, + ) + _call_gate(s, 256, instant=200 * GB) # guard off -> never raises + + def test_noop_when_hard_limit_unset(self): + s = _gate_scheduler( + hard_limit=0, + recent_peak=200 * GB, + estimate=200 * GB, + margin=10 * GB, + ) + _call_gate(s, 256, instant=200 * GB) # no limit -> never raises + + def test_fires_without_monitor_phys_based(self): + """THE fix: in production scheduler.memory_monitor is never wired, so + the gate must still fire on current(phys) + margin. estimate is treated + as 0 and the margin carries the guarantee.""" + s = _gate_scheduler( + hard_limit=107 * GB, + recent_peak=100 * GB, + estimate=0, + margin=10 * GB, + monitor=False, + ) + # current 100 + estimate 0 + margin 10 = 110 > cap 107 -> refuse. 
+ with pytest.raises(RuntimeError, match="refused before forward"): + _call_gate(s, 256, instant=100 * GB) + + def test_passes_without_monitor_when_fits(self): + """Phys-based gate does not false-fire: current + margin <= cap passes + even with no monitor.""" + s = _gate_scheduler( + hard_limit=107 * GB, + recent_peak=90 * GB, + estimate=0, + margin=10 * GB, + monitor=False, + ) + # current 90 + margin 10 = 100 <= cap 107 -> no raise. + _call_gate(s, 256, instant=90 * GB) + + def test_fires_with_zero_estimate_margin_carries(self): + """Monitor present but estimate==0 (model can't be dim-estimated): the + gate still fires on current + margin -- the margin, not the estimate, is + the safety mechanism.""" + s = _gate_scheduler( + hard_limit=107 * GB, + recent_peak=100 * GB, + estimate=0, + margin=10 * GB, + ) + with pytest.raises(RuntimeError, match="refused before forward"): + _call_gate(s, 256, instant=100 * GB) + + def test_noop_when_chunk_zero(self): + s = _gate_scheduler( + hard_limit=107 * GB, + recent_peak=200 * GB, + estimate=2 * GB, + margin=10 * GB, + ) + _call_gate(s, 0, instant=200 * GB) # nothing to process -> never raises + + +# --------------------------------------------------------------------------- +# Integration: gate fires BEFORE the model forward in the real chunked loop +# --------------------------------------------------------------------------- + + +def _integration_scheduler(*, hard_gb: float, estimate_bytes: int, margin_gb: float): + """Scheduler with a mock model, hard cap on but soft off (so the adaptive + throttle passes through and only the forward-front gate can fire).""" + model = MagicMock() + model.layers = [] + tokenizer = MagicMock() + tokenizer.eos_token_id = 2 + config = SchedulerConfig( + max_num_seqs=8, + prefill_step_size=256, + chunked_prefill=True, + paged_cache_block_size=0, + ) + s = Scheduler(model=model, tokenizer=tokenizer, config=config) + s.batch_generator = MagicMock() + # Soft limit 0 -> _adaptive_chunk_size is a 
pure passthrough. + s._memory_limit_bytes = 0 + s._memory_hard_limit_bytes = int(hard_gb * GB) + s._prefill_memory_guard = True + s._prefill_transient_margin_bytes = int(margin_gb * GB) + s.memory_monitor = MagicMock() + s.memory_monitor.estimate_prefill_peak_bytes = MagicMock( + return_value=estimate_bytes + ) + return s, model + + +def _prefill_state(n_tokens: int) -> _PrefillState: + req = Request( + request_id="rid-int", + prompt=list(range(n_tokens + 1)), + sampling_params=SamplingParams(max_tokens=8), + ) + req.prompt_token_ids = list(range(n_tokens + 1)) + req.num_prompt_tokens = n_tokens + 1 + req.status = RequestStatus.WAITING + return _PrefillState( + request=req, + cache=[], + tokens_remaining=mx.array(list(range(n_tokens)))[None], + last_token=[n_tokens], + tokens_processed=0, + base_size=0, + emitted_boundaries={}, + boundary_enabled=False, + block_size=0, + total_length=n_tokens + 1, + ) + + +class TestForwardGateBlocksForward: + """The gate must abort the chunk BEFORE self.model(...) runs.""" + + def test_over_cap_does_not_call_model_forward(self): + """Predicted peak over cap -> RuntimeError raised and model NOT called. + + This is the discriminating assertion that pins the fix: pre-change + code (no forward-front gate) reaches self.model(chunk, ...) and the + transient lands on the cap (kernel panic on real hardware). With the + gate, the forward never runs. + """ + # recent_peak high (set via instant probes) + estimate + margin > cap. + s, model = _integration_scheduler( + hard_gb=107.0, estimate_bytes=2 * GB, margin_gb=10 * 1.0 + ) + state = _prefill_state(n_tokens=200) + + high = int(100 * GB) + with patch( + "omlx.scheduler.mx.get_active_memory", return_value=high + ), patch("omlx.scheduler.get_phys_footprint", return_value=high), patch( + "omlx.scheduler.mx.eval" + ) as mock_eval: + with pytest.raises(RuntimeError, match="refused before forward"): + s._step_prefill_chunk(state) + + # The whole point: the model forward must not have executed. 
+ model.assert_not_called() + mock_eval.assert_not_called() + + def test_under_cap_runs_model_forward(self): + """Predicted peak under cap -> forward runs as normal (control).""" + s, model = _integration_scheduler( + hard_gb=107.0, estimate_bytes=1 * GB, margin_gb=2.0 + ) + state = _prefill_state(n_tokens=200) + + low = int(50 * GB) # 50 + 1 + 2 = 53 < 107 + with patch( + "omlx.scheduler.mx.get_active_memory", return_value=low + ), patch("omlx.scheduler.get_phys_footprint", return_value=low), patch( + "omlx.scheduler.mx.eval" + ), patch("omlx.scheduler._sync_and_clear_cache"), patch( + "omlx.scheduler.get_prefill_tracker" + ): + done = s._step_prefill_chunk(state) + + # Forward ran exactly once; prefill consumed the only chunk. + assert model.call_count == 1 + assert done is True + + +class TestForwardGateExternalLoopWiring: + """Sanity that the external loop wiring calls the gate before the forward. + + Patch _prefill_forward_gate to raise; the model forward must not run. + Uses a tiny text-only request through _do_external_prefill. + """ + + def test_external_loop_calls_gate_before_forward(self): + model = MagicMock() + model.layers = [] + tokenizer = MagicMock() + tokenizer.eos_token_id = 2 + config = SchedulerConfig( + max_num_seqs=8, + prefill_step_size=256, + chunked_prefill=False, + paged_cache_block_size=0, + ) + s = Scheduler(model=model, tokenizer=tokenizer, config=config) + + req = Request( + request_id="rid-ext", + prompt=[1, 2, 3, 4, 5], + sampling_params=SamplingParams(max_tokens=8), + ) + req.prompt_token_ids = [1, 2, 3, 4, 5] + req.num_prompt_tokens = 5 + + with patch.object( + s, + "_prefill_forward_gate", + side_effect=RuntimeError("Prefill refused before forward"), + ) as mock_gate, patch( + "omlx.scheduler.make_prompt_cache", return_value=[] + ): + with pytest.raises(RuntimeError, match="refused before forward"): + s._do_external_prefill(req, [1, 2, 3, 4, 5], None) + + mock_gate.assert_called_once() + # Gate raised -> forward must not have run. 
+ model.assert_not_called() + + +class TestGateStateLog: + """_log_prefill_gate_state_once surfaces the resolved gate config loudly -- + the prior monitor-based gate shipped inert and SILENT, found only on metal.""" + + def test_logs_resolved_margin_once(self, caplog): + s = _gate_scheduler( + hard_limit=107 * GB, recent_peak=0, estimate=2 * GB, margin=12 * GB + ) + with caplog.at_level(logging.INFO, logger="omlx.scheduler"): + s._log_prefill_gate_state_once() + s._log_prefill_gate_state_once() # second call must be a no-op + hits = [ + r for r in caplog.records + if "prefill forward gate ACTIVE" in r.getMessage() + ] + assert len(hits) == 1 + msg = hits[0].getMessage() + assert "margin=12.0GB" in msg + assert "cap=107.0GB" in msg + assert "estimator=active" in msg # monitor returns >0 here + + def test_warns_when_margin_zero(self, caplog): + s = _gate_scheduler( + hard_limit=107 * GB, recent_peak=0, estimate=2 * GB, margin=0 + ) + with caplog.at_level(logging.INFO, logger="omlx.scheduler"): + s._log_prefill_gate_state_once() + rec = [ + r for r in caplog.records + if "prefill forward gate" in r.getMessage() + ][0] + assert rec.levelno == logging.WARNING + assert "margin=0" in rec.getMessage().lower() + + def test_reports_estimator_disabled_without_monitor(self, caplog): + s = _gate_scheduler( + hard_limit=107 * GB, + recent_peak=0, + estimate=0, + margin=12 * GB, + monitor=False, + ) + with caplog.at_level(logging.INFO, logger="omlx.scheduler"): + s._log_prefill_gate_state_once() + rec = [ + r for r in caplog.records + if "prefill forward gate ACTIVE" in r.getMessage() + ][0] + assert "DISABLED" in rec.getMessage() diff --git a/tests/test_settings.py b/tests/test_settings.py index a16cde3e1..665cda404 100644 --- a/tests/test_settings.py +++ b/tests/test_settings.py @@ -444,13 +444,13 @@ def test_to_dict(self): """Test conversion to dictionary.""" settings = HuggingFaceSettings(endpoint="https://hf-mirror.com") result = settings.to_dict() - assert result == 
{"endpoint": "https://hf-mirror.com"} + assert result == {"endpoint": "https://hf-mirror.com", "disable_xet": False} def test_to_dict_empty(self): """Test conversion to dictionary with empty endpoint.""" settings = HuggingFaceSettings() result = settings.to_dict() - assert result == {"endpoint": ""} + assert result == {"endpoint": "", "disable_xet": False} def test_from_dict(self): """Test creation from dictionary.""" diff --git a/tests/test_video_discovery.py b/tests/test_video_discovery.py new file mode 100644 index 000000000..125cdef04 --- /dev/null +++ b/tests/test_video_discovery.py @@ -0,0 +1,409 @@ +# SPDX-License-Identifier: Apache-2.0 +"""Tests for video (diffusers-layout) model discovery. + +Covers the discovery-layer changes from docs/video-generation-engine-spec.md +section 4.1: model_index.json as a model root, WanPipeline -> "video", +unknown-pipeline skip, no phantom component entries from org-folder descent, +and regression guards for the existing config.json (LLM) path. +""" + +import json +from pathlib import Path + +import pytest + +from omlx.model_discovery import ( + VIDEO_PIPELINE_CLASSES, + _is_model_dir, + detect_model_type, + discover_models, + estimate_model_size, + read_model_index_pipeline_class, +) + +# Component subdirs of a Wan2.2-style diffusers repo. Each carries its own +# config.json + weights, so pre-fix org-folder descent would have registered +# them as phantom standalone "llm" models. 
+_WAN_COMPONENTS = ("transformer", "transformer_2", "vae", "text_encoder") + + +def make_diffusers_dir( + parent: Path, + name: str = "Wan2.2-T2V-A14B", + class_name: str | None = "WanPipeline", + component_weight_bytes: int = 1024, +) -> Path: + """Create a diffusers-layout model dir with fake component weights.""" + model_dir = parent / name + model_dir.mkdir(parents=True) + + index: dict = {"_diffusers_version": "0.35.0"} + if class_name is not None: + index["_class_name"] = class_name + (model_dir / "model_index.json").write_text(json.dumps(index)) + + for comp in _WAN_COMPONENTS: + comp_dir = model_dir / comp + comp_dir.mkdir() + # Component config.json is registerable on its own (would detect as + # "llm") -- exactly what made the phantom-entry bug dangerous. + (comp_dir / "config.json").write_text( + json.dumps({"model_type": "llama", "architectures": ["LlamaForCausalLM"]}) + ) + (comp_dir / "model.safetensors").write_bytes(b"\0" * component_weight_bytes) + + return model_dir + + +def make_llm_dir(parent: Path, name: str = "llama-3b", weight_bytes: int = 512) -> Path: + """Create a plain transformers-layout LLM model dir.""" + model_dir = parent / name + model_dir.mkdir(parents=True) + (model_dir / "config.json").write_text( + json.dumps({"model_type": "llama", "architectures": ["LlamaForCausalLM"]}) + ) + (model_dir / "model.safetensors").write_bytes(b"\0" * weight_bytes) + return model_dir + + +class TestReadModelIndexPipelineClass: + """Unit tests for read_model_index_pipeline_class.""" + + def test_valid_wan_pipeline(self, tmp_path): + (tmp_path / "model_index.json").write_text( + json.dumps({"_class_name": "WanPipeline", "_diffusers_version": "0.35.0"}) + ) + assert read_model_index_pipeline_class(tmp_path) == "WanPipeline" + + def test_unknown_pipeline_class_still_returned(self, tmp_path): + """The reader returns the raw class; the allowlist filter lives elsewhere.""" + (tmp_path / "model_index.json").write_text( + json.dumps({"_class_name": 
"FluxPipeline"}) + ) + assert read_model_index_pipeline_class(tmp_path) == "FluxPipeline" + + def test_missing_model_index(self, tmp_path): + assert read_model_index_pipeline_class(tmp_path) is None + + def test_missing_class_name_key(self, tmp_path): + (tmp_path / "model_index.json").write_text( + json.dumps({"_diffusers_version": "0.35.0"}) + ) + assert read_model_index_pipeline_class(tmp_path) is None + + def test_non_string_class_name(self, tmp_path): + (tmp_path / "model_index.json").write_text(json.dumps({"_class_name": 123})) + assert read_model_index_pipeline_class(tmp_path) is None + + def test_invalid_json(self, tmp_path): + (tmp_path / "model_index.json").write_text("{not valid json") + assert read_model_index_pipeline_class(tmp_path) is None + + def test_wan_pipeline_in_allowlist(self): + assert "WanPipeline" in VIDEO_PIPELINE_CLASSES + + +class TestDetectModelTypeVideo: + """Tests for detect_model_type video branch + LLM regression.""" + + def test_wan_pipeline_dir_is_video(self, tmp_path): + model_dir = make_diffusers_dir(tmp_path) + assert detect_model_type(model_dir) == "video" + + def test_wan_pipeline_index_alone_is_video(self, tmp_path): + """model_index.json alone (no components yet) already types as video.""" + (tmp_path / "model_index.json").write_text( + json.dumps({"_class_name": "WanPipeline"}) + ) + assert detect_model_type(tmp_path) == "video" + + def test_unknown_pipeline_falls_through_to_llm(self, tmp_path): + """Unknown pipeline class does not type as video. 
detect_model_type + falls back to "llm" (the skip happens in _register_model).""" + model_dir = make_diffusers_dir(tmp_path, class_name="FluxPipeline") + assert detect_model_type(model_dir) == "llm" + + def test_config_json_llm_unchanged(self, tmp_path): + """Regression guard: plain config.json LLM detection is unaffected.""" + model_dir = make_llm_dir(tmp_path) + assert detect_model_type(model_dir) == "llm" + + def test_video_branch_runs_before_missing_config_fallback(self, tmp_path): + """A WanPipeline dir has no root config.json; without the video branch + the missing-config early-exit would have returned "llm".""" + model_dir = make_diffusers_dir(tmp_path) + assert not (model_dir / "config.json").exists() + assert detect_model_type(model_dir) == "video" + + +class TestIsModelDir: + """_is_model_dir accepts model_index.json as a model root.""" + + def test_model_index_json_is_model_root(self, tmp_path): + (tmp_path / "model_index.json").write_text( + json.dumps({"_class_name": "WanPipeline"}) + ) + assert _is_model_dir(tmp_path) is True + + def test_config_json_is_model_root(self, tmp_path): + (tmp_path / "config.json").write_text("{}") + assert _is_model_dir(tmp_path) is True + + def test_empty_dir_is_not_model_root(self, tmp_path): + assert _is_model_dir(tmp_path) is False + + def test_adapter_wins_over_model_index(self, tmp_path): + """adapter_config.json + model_index.json -> adapter check wins.""" + (tmp_path / "model_index.json").write_text( + json.dumps({"_class_name": "WanPipeline"}) + ) + (tmp_path / "adapter_config.json").write_text("{}") + assert _is_model_dir(tmp_path) is False + + +class TestDiscoverVideoOwnerRepoLayout: + """Owner/repo (organized two-level) layout.""" + + def test_single_video_entry_no_phantoms(self, tmp_path): + make_diffusers_dir(tmp_path / "Wan-AI", name="Wan2.2-T2V-A14B") + + models = discover_models(tmp_path) + + assert set(models.keys()) == {"Wan2.2-T2V-A14B"} + entry = models["Wan2.2-T2V-A14B"] + assert entry.model_type == 
"video" + assert entry.engine_type == "video" + assert entry.config_model_type == "WanPipeline" + # No phantom component entries + for comp in _WAN_COMPONENTS: + assert comp not in models + + def test_entry_paths_and_size(self, tmp_path): + model_dir = make_diffusers_dir( + tmp_path / "Wan-AI", component_weight_bytes=1000 + ) + + models = discover_models(tmp_path) + entry = models["Wan2.2-T2V-A14B"] + assert Path(entry.model_path) == model_dir + # 4 components x 1000 bytes, 5% runtime overhead + assert entry.estimated_size == int(4 * 1000 * 1.05) + + def test_video_alongside_llm_in_same_org(self, tmp_path): + org = tmp_path / "Wan-AI" + make_diffusers_dir(org) + make_llm_dir(org, name="some-llm") + + models = discover_models(tmp_path) + assert set(models.keys()) == {"Wan2.2-T2V-A14B", "some-llm"} + assert models["Wan2.2-T2V-A14B"].model_type == "video" + assert models["some-llm"].model_type == "llm" + + +class TestDiscoverVideoFlatLayout: + """Flat layout: the diffusers dir sits directly under model_dir. + + This is the org-folder-descent fix: pre-fix, a dir without root + config.json was treated as an organization folder and its component + subdirs (transformer/, vae/, ...) were registered as phantom llm models. 
+ """ + + def test_single_video_entry_no_phantoms(self, tmp_path): + make_diffusers_dir(tmp_path, name="Wan2.2-T2V-A14B") + + models = discover_models(tmp_path) + + assert set(models.keys()) == {"Wan2.2-T2V-A14B"} + entry = models["Wan2.2-T2V-A14B"] + assert entry.model_type == "video" + assert entry.engine_type == "video" + assert entry.config_model_type == "WanPipeline" + for comp in _WAN_COMPONENTS: + assert comp not in models + + def test_flat_and_owner_repo_give_same_result(self, tmp_path): + flat_root = tmp_path / "flat" + flat_root.mkdir() + make_diffusers_dir(flat_root) + + org_root = tmp_path / "org" + org_root.mkdir() + make_diffusers_dir(org_root / "Wan-AI") + + flat = discover_models(flat_root) + org = discover_models(org_root) + + assert set(flat.keys()) == set(org.keys()) == {"Wan2.2-T2V-A14B"} + for key in ("model_type", "engine_type", "config_model_type", "estimated_size"): + assert getattr(flat["Wan2.2-T2V-A14B"], key) == getattr( + org["Wan2.2-T2V-A14B"], key + ) + + +class TestUnknownPipelineSkipped: + """Unknown diffusers pipelines are skipped at registration -- no entry, + no phantom component entries.""" + + def test_flux_pipeline_flat_not_registered(self, tmp_path): + make_diffusers_dir(tmp_path, name="FLUX.2-dev", class_name="FluxPipeline") + + models = discover_models(tmp_path) + assert models == {} + + def test_flux_pipeline_owner_repo_not_registered(self, tmp_path): + make_diffusers_dir( + tmp_path / "black-forest-labs", name="FLUX.2-dev", class_name="FluxPipeline" + ) + + models = discover_models(tmp_path) + assert models == {} + + def test_flux_skip_logs_warning(self, tmp_path, caplog): + make_diffusers_dir(tmp_path, name="FLUX.2-dev", class_name="FluxPipeline") + + with caplog.at_level("WARNING", logger="omlx.model_discovery"): + discover_models(tmp_path) + + assert any( + "FluxPipeline" in rec.message and "FLUX.2-dev" in rec.message + for rec in caplog.records + ) + + def test_flux_does_not_block_sibling_models(self, tmp_path): + 
make_diffusers_dir(tmp_path, name="FLUX.2-dev", class_name="FluxPipeline") + make_llm_dir(tmp_path, name="llama-3b") + make_diffusers_dir(tmp_path, name="Wan2.2-T2V-A14B") + + models = discover_models(tmp_path) + assert set(models.keys()) == {"Wan2.2-T2V-A14B", "llama-3b"} + + +class TestMalformedModelIndex: + """model_index.json with no _class_name or broken JSON: never video, + discovery does not crash.""" + + def test_missing_class_name_not_video(self, tmp_path): + make_diffusers_dir(tmp_path, name="no-class-name", class_name=None) + + models = discover_models(tmp_path) + + # A model_index.json without a readable _class_name (and no root + # config.json) is skipped entirely: registering it would produce + # an unloadable llm entry. No phantoms either. + assert "no-class-name" not in models + for comp in _WAN_COMPONENTS: + assert comp not in models + + def test_invalid_json_not_video(self, tmp_path): + model_dir = make_diffusers_dir(tmp_path, name="bad-json", class_name=None) + (model_dir / "model_index.json").write_text("{definitely not json") + + models = discover_models(tmp_path) + + for entry in models.values(): + assert entry.model_type != "video" + for comp in _WAN_COMPONENTS: + assert comp not in models + + def test_malformed_index_without_weights_not_registered(self, tmp_path): + """No weights anywhere -> estimate_model_size raises -> entry dropped + gracefully (no exception escapes discover_models).""" + model_dir = tmp_path / "empty-index" + model_dir.mkdir() + (model_dir / "model_index.json").write_text(json.dumps({"foo": "bar"})) + + models = discover_models(tmp_path) + assert models == {} + + +class TestAdapterExclusion: + """adapter_config.json wins over model_index.json.""" + + def test_adapter_with_model_index_excluded_flat(self, tmp_path): + model_dir = make_diffusers_dir(tmp_path, name="wan-lora") + (model_dir / "adapter_config.json").write_text("{}") + + models = discover_models(tmp_path) + + assert "wan-lora" not in models + # Adapter dirs are 
skipped wholesale -- no descent, no phantoms. + for comp in _WAN_COMPONENTS: + assert comp not in models + + def test_adapter_with_model_index_excluded_in_org(self, tmp_path): + model_dir = make_diffusers_dir(tmp_path / "Wan-AI", name="wan-lora") + (model_dir / "adapter_config.json").write_text("{}") + + models = discover_models(tmp_path) + assert models == {} + + +class TestEstimateModelSizeDiffusers: + """estimate_model_size sums recursive **/*.safetensors for diffusers + layouts (no root-level weight files).""" + + def test_recursive_sum_with_overhead(self, tmp_path): + model_dir = tmp_path / "Wan2.2-T2V-A14B" + model_dir.mkdir() + (model_dir / "model_index.json").write_text( + json.dumps({"_class_name": "WanPipeline"}) + ) + sizes = { + "transformer/diffusion_pytorch_model-00001-of-00002.safetensors": 3000, + "transformer/diffusion_pytorch_model-00002-of-00002.safetensors": 2000, + "transformer_2/diffusion_pytorch_model.safetensors": 1500, + "vae/diffusion_pytorch_model.safetensors": 700, + "text_encoder/model.safetensors": 300, + } + for rel, size in sizes.items(): + f = model_dir / rel + f.parent.mkdir(exist_ok=True) + f.write_bytes(b"\0" * size) + + expected = int(sum(sizes.values()) * 1.05) + assert estimate_model_size(model_dir) == expected + + def test_no_weights_raises(self, tmp_path): + model_dir = tmp_path / "wan-empty" + model_dir.mkdir() + (model_dir / "model_index.json").write_text( + json.dumps({"_class_name": "WanPipeline"}) + ) + with pytest.raises(ValueError): + estimate_model_size(model_dir) + + +class TestLLMRegressionGuard: + """Normal LLM dirs (config.json) still discover exactly as before.""" + + def test_flat_llm(self, tmp_path): + make_llm_dir(tmp_path, name="llama-3b", weight_bytes=2048) + + models = discover_models(tmp_path) + + assert set(models.keys()) == {"llama-3b"} + entry = models["llama-3b"] + assert entry.model_type == "llm" + assert entry.engine_type == "batched" + assert entry.config_model_type == "llama" + assert 
entry.estimated_size == int(2048 * 1.05) + + def test_org_folder_llm_descent_still_works(self, tmp_path): + org = tmp_path / "mlx-community" + make_llm_dir(org, name="llama-3b") + make_llm_dir(org, name="qwen-7b") + + models = discover_models(tmp_path) + assert set(models.keys()) == {"llama-3b", "qwen-7b"} + assert all(m.model_type == "llm" for m in models.values()) + + def test_mixed_llm_and_video(self, tmp_path): + make_llm_dir(tmp_path, name="llama-3b") + make_diffusers_dir(tmp_path / "Wan-AI") + + models = discover_models(tmp_path) + assert set(models.keys()) == {"llama-3b", "Wan2.2-T2V-A14B"} + assert models["llama-3b"].model_type == "llm" + assert models["llama-3b"].engine_type == "batched" + assert models["Wan2.2-T2V-A14B"].model_type == "video" + assert models["Wan2.2-T2V-A14B"].engine_type == "video" diff --git a/tests/test_video_manager.py b/tests/test_video_manager.py new file mode 100644 index 000000000..a71e7b7c2 --- /dev/null +++ b/tests/test_video_manager.py @@ -0,0 +1,513 @@ +# SPDX-License-Identifier: Apache-2.0 +"""Tests for VideoJobManager (omlx/video/manager.py) with a fake worker. + +The manager spawns [worker_python, -I, worker_script, --spec, spec.json]; +these tests point worker_python at sys.executable and worker_script at a +tiny stdlib-only script written into tmp_path, so no model / mflux / venv +is needed. Spec reference: docs/video-generation-engine-spec.md section 4.2. 
+""" + +import asyncio +import json +import sys +import time +from pathlib import Path + +import pytest + +import omlx.video.manager as vm +from omlx.settings import VideoSettings +from omlx.video.manager import QueueFullError, VideoJob, VideoJobManager + +GB = 1024**3 + + +# --------------------------------------------------------------------------- +# Fake worker scripts (stdlib only -- they run under python -I) +# --------------------------------------------------------------------------- + +_PRELUDE = """\ +import json, sys, time + +def emit(obj): + sys.stdout.write(json.dumps(obj) + "\\n") + sys.stdout.flush() + +spec_path = sys.argv[sys.argv.index("--spec") + 1] +with open(spec_path) as f: + spec = json.load(f) +""" + +_SUCCESS_BODY = """\ +emit({"phase": "loading"}) +emit({"phase": "loaded"}) +emit({"phase": "denoise", "step": 1, "total_steps": 2}) +emit({"phase": "denoise", "step": 2, "total_steps": 2}) +emit({"phase": "saving"}) +with open(spec["output_path"], "wb") as f: + f.write(b"FAKE-MP4-BYTES") +with open(spec["manifest_path"], "w") as f: + json.dump({"status": "completed", "lifetime_max_phys_gb": 1.5}, f) +sys.exit(0) +""" + +_CRASH_BODY = """\ +emit({"phase": "loading"}) +with open(spec["manifest_path"], "w") as f: + json.dump({"status": "failed", "code": "worker_crashed", + "message": "boom"}, f) +sys.exit(1) +""" + +_NO_OUTPUT_BODY = """\ +emit({"phase": "loading"}) +emit({"phase": "saving"}) +sys.exit(0) +""" + +_STALL_BODY = """\ +emit({"phase": "loading"}) +time.sleep(60) +sys.exit(0) +""" + +# Prints a heartbeat every 0.5s "forever" (bounded so a leaked process +# cannot outlive the test session by much) +_CHATTY_BODY = """\ +for _ in range(240): + emit({"phase": "denoise"}) + time.sleep(0.5) +sys.exit(0) +""" + + +def _write_worker(tmp_path: Path, name: str, body: str) -> Path: + script = tmp_path / name + script.write_text(_PRELUDE + body) + return script + + +# --------------------------------------------------------------------------- +# 
Fake enforcer +# --------------------------------------------------------------------------- + + +class FakeEnforcer: + """Records lease-related calls so tests can assert order + release.""" + + def __init__(self, ceiling_gb: float = 100.0, peak_bytes: int = 0): + self.is_running = True + self._ceiling = int(ceiling_gb * GB) + self.peak = peak_bytes + self._soft_threshold = 0.85 + self._prefill_transient_margin_bytes = 0 + self.calls: list[tuple] = [] + + def get_final_ceiling(self) -> int: + return self._ceiling + + def recent_peak_bytes(self) -> int: + return self.peak + + def acquire_video_lease(self, lease_bytes: int) -> None: + self.calls.append(("acquire", lease_bytes)) + + def set_video_worker_pid(self, pid) -> None: + self.calls.append(("set_pid", pid)) + + def release_video_lease(self) -> None: + self.calls.append(("release",)) + + # assertion helpers ---------------------------------------------------- + + def call_names(self) -> list[str]: + return [c[0] for c in self.calls] + + def assert_lease_cycle(self, lease_bytes: int) -> None: + """One acquire -> set_pid(real) -> set_pid(None) -> release cycle.""" + names = self.call_names() + assert names.count("acquire") == names.count("release") == 1 + assert ("acquire", lease_bytes) in self.calls + assert names.index("acquire") < names.index("release") + pids = [c[1] for c in self.calls if c[0] == "set_pid"] + assert pids[-1] is None # cleared before release + assert isinstance(pids[0], int) and pids[0] > 0 + # acquire happens before the pid is registered + assert names.index("acquire") < names.index("set_pid") + # release is the very last lease call + assert names[-1] == "release" + + +# --------------------------------------------------------------------------- +# Construction helpers +# --------------------------------------------------------------------------- + + +def _make_settings(**overrides) -> VideoSettings: + kwargs = dict( + enabled=True, + worker_python=sys.executable, + memory_lease_gb=1.0, + 
max_queued_jobs=4, + job_timeout_seconds=60, + progress_stall_timeout_seconds=30, + artifacts_max_count=50, + artifacts_max_gb=50.0, + ) + kwargs.update(overrides) + return VideoSettings(**kwargs) + + +def _make_job(job_id: str = "video_t1", **param_overrides) -> VideoJob: + params = dict(prompt="a cat", width=256, height=256, frames=5, + steps=2, fps=16, seed=7) + params.update(param_overrides) + return VideoJob(id=job_id, model_id="wan-test", + model_dir="/nonexistent/model", params=params) + + +def _make_manager(tmp_path: Path, worker_body: str, + settings: VideoSettings | None = None, + enforcer: FakeEnforcer | None = None, + ) -> tuple[VideoJobManager, FakeEnforcer]: + enforcer = enforcer or FakeEnforcer() + script = _write_worker(tmp_path, "fake_worker.py", worker_body) + manager = VideoJobManager( + settings=settings or _make_settings(), + base_path=tmp_path, + enforcer=enforcer, + worker_script=script, + ) + return manager, enforcer + + +async def _wait_until(cond, timeout: float = 12.0, interval: float = 0.05): + deadline = time.monotonic() + timeout + while time.monotonic() < deadline: + if cond(): + return True + await asyncio.sleep(interval) + return False + + +async def _wait_terminal(job: VideoJob, timeout: float = 12.0) -> None: + ok = await _wait_until( + lambda: job.status in ("completed", "failed"), timeout=timeout + ) + assert ok, ( + f"job did not reach a terminal state within {timeout}s " + f"(status={job.status}, phase={job.phase!r})" + ) + + +# --------------------------------------------------------------------------- +# (1) success path +# --------------------------------------------------------------------------- + + +async def test_success_completes_with_artifact_and_lease_cycle(tmp_path): + manager, enforcer = _make_manager(tmp_path, _SUCCESS_BODY) + try: + job = await manager.submit(_make_job("video_ok1")) + await _wait_terminal(job) + + assert job.status == "completed" + assert job.error is None + assert job.progress == 100 + assert 
job.phase == "done" + assert job.artifact_path is not None + artifact = Path(job.artifact_path) + assert artifact.exists() and artifact.stat().st_size > 0 + assert artifact == manager.artifacts_dir / job.id / "output.mp4" + assert job.peak_memory_gb == 1.5 + assert job.wall_seconds is not None + + # wire shape + wire = job.to_dict() + assert wire["object"] == "video" + assert wire["status"] == "completed" + assert wire["progress"] == 100 + assert wire["error"] is None + assert wire["size"] == "256x256" + + # lease acquired AND released, in order, pid registered then cleared + enforcer.assert_lease_cycle(lease_bytes=1 * GB) + + # persisted record reflects completion + with open(manager.jobs_dir / f"{job.id}.json") as f: + persisted = json.load(f) + assert persisted["status"] == "completed" + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (2) crash with failure manifest +# --------------------------------------------------------------------------- + + +async def test_crash_propagates_manifest_error_and_releases_lease(tmp_path): + manager, enforcer = _make_manager(tmp_path, _CRASH_BODY) + try: + job = await manager.submit(_make_job("video_crash1")) + await _wait_terminal(job) + + assert job.status == "failed" + assert job.error == {"code": "worker_crashed", "message": "boom"} + assert job.artifact_path is None + # lease released even on failure + enforcer.assert_lease_cycle(lease_bytes=1 * GB) + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (3) exit 0 but no output file +# --------------------------------------------------------------------------- + + +async def test_exit_zero_without_output_is_output_invalid(tmp_path): + manager, enforcer = _make_manager(tmp_path, _NO_OUTPUT_BODY) + try: + job = await manager.submit(_make_job("video_noout1")) + await _wait_terminal(job) + + assert job.status == "failed" + assert 
job.error is not None + assert job.error["code"] == vm.ERR_OUTPUT_INVALID + enforcer.assert_lease_cycle(lease_bytes=1 * GB) + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (4) stall: silent worker killed by the watchdog +# --------------------------------------------------------------------------- + + +async def test_stalled_worker_is_killed(tmp_path): + settings = _make_settings(progress_stall_timeout_seconds=2) + manager, enforcer = _make_manager(tmp_path, _STALL_BODY, + settings=settings) + try: + job = await manager.submit(_make_job("video_stall1")) + # one heartbeat then 60s of silence; watchdog ticks every 2s so the + # kill should land well within ~8s + await _wait_terminal(job, timeout=12.0) + + assert job.status == "failed" + assert job.error is not None + assert job.error["code"] == vm.ERR_WORKER_STALLED + enforcer.assert_lease_cycle(lease_bytes=1 * GB) + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (5) per-run timeout +# --------------------------------------------------------------------------- + + +async def test_job_timeout_kills_chatty_worker(tmp_path): + settings = _make_settings(job_timeout_seconds=2) + manager, enforcer = _make_manager(tmp_path, _CHATTY_BODY, + settings=settings) + try: + job = await manager.submit(_make_job("video_timeout1")) + await _wait_terminal(job, timeout=12.0) + + assert job.status == "failed" + assert job.error is not None + assert job.error["code"] == vm.ERR_JOB_TIMEOUT + enforcer.assert_lease_cycle(lease_bytes=1 * GB) + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (6) queue depth cap +# --------------------------------------------------------------------------- + + +async def test_queue_full_raises_when_cap_reached(tmp_path): + settings = _make_settings(max_queued_jobs=1) + manager, _ = 
_make_manager(tmp_path, _CHATTY_BODY, settings=settings) + try: + job_a = await manager.submit(_make_job("video_qa")) + # wait until the dispatcher picks A up (queue drains) + ok = await _wait_until( + lambda: job_a.status == "in_progress" and manager.queue_depth() == 0 + ) + assert ok, "first job never started" + + await manager.submit(_make_job("video_qb")) # fills the queue + assert manager.queue_depth() == 1 + with pytest.raises(QueueFullError): + await manager.submit(_make_job("video_qc")) + assert manager.get("video_qc") is None + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (7) DELETE a running job +# --------------------------------------------------------------------------- + + +async def test_delete_running_job_kills_worker_and_removes_record(tmp_path): + manager, _ = _make_manager(tmp_path, _CHATTY_BODY) + try: + job = await manager.submit(_make_job("video_del1")) + ok = await _wait_until( + lambda: job.status == "in_progress" + and manager._current_proc is not None + ) + assert ok, "job never started" + proc = manager._current_proc + + assert await manager.delete(job.id) is True + + assert proc.returncode is not None # worker terminated + assert manager.get(job.id) is None + assert not (manager.jobs_dir / f"{job.id}.json").exists() + assert not (manager.artifacts_dir / job.id).exists() + # deleting again reports not found + assert await manager.delete(job.id) is False + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (8) startup replay marks in-flight jobs as failed +# --------------------------------------------------------------------------- + + +async def test_restart_replay_fails_inflight_jobs(tmp_path): + jobs_dir = tmp_path / "video-jobs" + jobs_dir.mkdir(parents=True) + inflight = _make_job("video_replay1") + inflight.status = "in_progress" + inflight.started_at = time.time() + with open(jobs_dir / 
"video_replay1.json", "w") as f: + json.dump(inflight.to_persist(), f) + done = _make_job("video_replay2") + done.status = "completed" + done.progress = 100 + done.completed_at = time.time() + with open(jobs_dir / "video_replay2.json", "w") as f: + json.dump(done.to_persist(), f) + + manager, _ = _make_manager(tmp_path, _SUCCESS_BODY) + try: + replayed = manager.get("video_replay1") + assert replayed is not None + assert replayed.status == "failed" + assert replayed.error is not None + assert replayed.error["code"] == vm.ERR_SERVER_RESTARTED + assert replayed.completed_at is not None + # the failure is persisted back to disk + with open(jobs_dir / "video_replay1.json") as f: + assert json.load(f)["status"] == "failed" + # terminal jobs replay unchanged + survivor = manager.get("video_replay2") + assert survivor is not None and survivor.status == "completed" + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (9) retention: LRU purge beyond artifacts_max_count +# --------------------------------------------------------------------------- + + +async def test_retention_purges_oldest_artifact_but_keeps_record(tmp_path): + settings = _make_settings(artifacts_max_count=1) + manager, _ = _make_manager(tmp_path, _SUCCESS_BODY, settings=settings) + try: + job1 = await manager.submit(_make_job("video_ret1")) + await _wait_terminal(job1) + assert job1.status == "completed" + assert job1.artifact_path is not None + + job2 = await manager.submit(_make_job("video_ret2")) + await _wait_terminal(job2) + assert job2.status == "completed" + + ok = await _wait_until(lambda: job1.artifact_path is None) + assert ok, "retention sweep did not purge the older artifact" + assert job1.expires_at is not None + assert job1.status == "completed" # record kept, status unchanged + assert manager.get(job1.id) is not None + assert not (manager.artifacts_dir / job1.id).exists() + # newest artifact survives + assert 
job2.artifact_path is not None + assert Path(job2.artifact_path).exists() + assert job2.expires_at is None + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (10) memory admission deferral +# --------------------------------------------------------------------------- + + +async def test_admission_defers_then_proceeds(tmp_path, monkeypatch): + monkeypatch.setattr(vm, "_ADMISSION_RECHECK_S", 0.2) + enforcer = FakeEnforcer(ceiling_gb=100.0, peak_bytes=200 * GB) + manager, _ = _make_manager(tmp_path, _SUCCESS_BODY, enforcer=enforcer) + try: + job = await manager.submit(_make_job("video_adm1")) + ok = await _wait_until(lambda: "waiting for memory" in job.phase) + assert ok, f"job never reported memory wait (phase={job.phase!r})" + assert job.status == "queued" + assert enforcer.call_names() == [] # no lease while deferred + + enforcer.peak = 0 # pressure clears + await _wait_terminal(job) + assert job.status == "completed" + enforcer.assert_lease_cycle(lease_bytes=1 * GB) + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (11) watchdog: footprint over lease +# --------------------------------------------------------------------------- + + +async def test_watchdog_kills_worker_over_lease(tmp_path, monkeypatch): + lease = 1 * GB + monkeypatch.setattr(vm, "get_phys_footprint", lambda pid=None: lease + GB) + manager, enforcer = _make_manager(tmp_path, _CHATTY_BODY) + try: + job = await manager.submit(_make_job("video_lease1")) + await _wait_terminal(job, timeout=12.0) + + assert job.status == "failed" + assert job.error is not None + assert job.error["code"] == vm.ERR_LEASE_EXCEEDED + enforcer.assert_lease_cycle(lease_bytes=lease) + finally: + await manager.shutdown() + + +# --------------------------------------------------------------------------- +# (12) watchdog: footprint monitor failure (3x zero reads) +# 
--------------------------------------------------------------------------- + + +async def test_watchdog_kills_worker_when_monitor_fails(tmp_path, monkeypatch): + monkeypatch.setattr(vm, "get_phys_footprint", lambda pid=None: 0) + manager, enforcer = _make_manager(tmp_path, _CHATTY_BODY) + try: + job = await manager.submit(_make_job("video_mon1")) + # 3 zero reads at 2s watchdog cadence -> killed around t=6s + await _wait_terminal(job, timeout=14.0) + + assert job.status == "failed" + assert job.error is not None + assert job.error["code"] == vm.ERR_MONITOR_FAILED + enforcer.assert_lease_cycle(lease_bytes=1 * GB) + finally: + await manager.shutdown() diff --git a/tests/test_video_pool_and_lease.py b/tests/test_video_pool_and_lease.py new file mode 100644 index 000000000..6f8e1012e --- /dev/null +++ b/tests/test_video_pool_and_lease.py @@ -0,0 +1,332 @@ +# SPDX-License-Identifier: Apache-2.0 +"""Tests for video-model pool rejection and the enforcer video memory lease. + +Part A: EnginePool.get_engine must reject model_type == "video" entries with +ModelTypeNotLoadableError BEFORE the memory-admission loop, so a misrouted +chat request can never evict resident LLM engines +(docs/video-generation-engine-spec.md section 3). + +Part B: ProcessMemoryEnforcer video lease (spec section 4.4): the lease is +subtracted from the final ceiling at a single choke point, the dynamic +ceiling adds back min(worker_footprint, lease) so the worker is counted +exactly once, and acquire/release move the Metal wired-limit request. +""" + +import asyncio +import time +from unittest.mock import MagicMock + +import pytest + +import omlx.process_memory_enforcer as pme +from omlx.engine_pool import EngineEntry, EnginePool +from omlx.exceptions import EnginePoolError, ModelTypeNotLoadableError + +GB = 1024**3 + +# Deterministic static ceiling patched onto enforcer instances so the +# wired-limit math does not depend on the host machine's RAM. 
+STATIC_CEILING = 100 * GB +CUSTOM_GB = 20.0 + + +# ========================================================================= +# Part A -- pool rejection of video entries +# ========================================================================= + + +class FakeLLMEngine: + """Loaded-engine stand-in that records eviction attempts.""" + + def __init__(self): + self.stop_called = False + + def has_active_requests(self) -> bool: + return False + + async def stop(self) -> None: + self.stop_called = True + + +def _make_pool_with_video_and_llm(): + pool = EnginePool(scheduler_config=None) + fake_engine = FakeLLMEngine() + pool._entries["llm-id"] = EngineEntry( + model_id="llm-id", + model_path="/nonexistent/llm-id", + model_type="llm", + engine_type="batched", + estimated_size=4 * GB, + engine=fake_engine, + last_access=time.time(), + ) + pool._entries["video-id"] = EngineEntry( + model_id="video-id", + model_path="/nonexistent/video-id", + model_type="video", + engine_type="video", + estimated_size=42 * GB, + ) + return pool, fake_engine + + +class TestVideoPoolRejection: + def test_model_type_not_loadable_is_engine_pool_error(self): + assert issubclass(ModelTypeNotLoadableError, EnginePoolError) + exc = ModelTypeNotLoadableError("video-id", "video") + assert exc.model_id == "video-id" + assert exc.model_type == "video" + assert "/v1/videos" in str(exc) + + def test_model_type_map_has_video_engine(self): + assert EnginePool._MODEL_TYPE_TO_ENGINE["video"] == "video" + + async def test_get_engine_rejects_video_before_admission(self, monkeypatch): + """Video rejection fires before admission: no LLM eviction happens. + + Memory is mocked so that, had the 42GB video entry reached the + admission loop, projected (20 + 42 GB) > ceiling (50 GB) would + have evicted the idle llm entry. The rejection must fire first. 
+ """ + pool, fake_engine = _make_pool_with_video_and_llm() + pool._get_final_ceiling = lambda: 50 * GB + # Make current usage high enough that admission WOULD evict. + monkeypatch.setattr( + "omlx.engine_pool.get_phys_footprint", lambda pid=None: 20 * GB + ) + + with pytest.raises(ModelTypeNotLoadableError) as excinfo: + await pool.get_engine("video-id") + + assert excinfo.value.model_id == "video-id" + assert excinfo.value.model_type == "video" + assert "/v1/videos" in str(excinfo.value) + # The resident llm engine must be untouched -- not stopped, not + # unloaded. + assert pool._entries["llm-id"].engine is fake_engine + assert fake_engine.stop_called is False + + +# ========================================================================= +# Part B -- enforcer video memory lease +# ========================================================================= + + +@pytest.fixture +def wired_calls(monkeypatch): + """Replace _apply_metal_wired_limit with a recorder; no mx side effects.""" + calls: list[int] = [] + + def _recorder(desired_bytes): + calls.append(desired_bytes) + return desired_bytes, None + + monkeypatch.setattr(pme, "_apply_metal_wired_limit", _recorder) + return calls + + +def _make_pool_stub(): + pool = MagicMock() + pool._entries = {} + pool._lock = asyncio.Lock() + return pool + + +def _make_enforcer(monkeypatch, tier="custom", custom_gb=CUSTOM_GB, **kwargs): + """Enforcer with deterministic ceilings; never started (no loop). + + custom tier -> dynamic ceiling == custom_gb verbatim. The static + ceiling is pinned to STATIC_CEILING and the Metal cap mocked away so + get_final_ceiling() == min(STATIC_CEILING, custom) == custom on any + machine. 
+ """ + monkeypatch.setattr(pme, "get_effective_metal_cap_bytes", lambda: 0) + enforcer = pme.ProcessMemoryEnforcer( + engine_pool=_make_pool_stub(), + memory_guard_tier=tier, + memory_guard_custom_ceiling_gb=custom_gb, + **kwargs, + ) + enforcer._get_static_ceiling = lambda: STATIC_CEILING + return enforcer + + +class TestVideoLeaseCeiling: + def test_acquire_reduces_final_ceiling_by_lease(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + base = enforcer.get_final_ceiling() + assert base == int(CUSTOM_GB * GB) + + enforcer.acquire_video_lease(8 * GB) + assert enforcer.get_final_ceiling() == base - 8 * GB + + def test_huge_lease_clamps_ceiling_to_one(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + enforcer.acquire_video_lease(1024 * GB) + # Never 0: consumers treat ceiling 0 as "guard disabled". + assert enforcer.get_final_ceiling() == 1 + + def test_lease_equal_to_ceiling_clamps_to_one(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + enforcer.acquire_video_lease(int(CUSTOM_GB * GB)) + assert enforcer.get_final_ceiling() == 1 + + def test_double_acquire_raises_runtime_error(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + enforcer.acquire_video_lease(8 * GB) + with pytest.raises(RuntimeError): + enforcer.acquire_video_lease(1 * GB) + + def test_non_positive_lease_raises_value_error(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + with pytest.raises(ValueError): + enforcer.acquire_video_lease(0) + with pytest.raises(ValueError): + enforcer.acquire_video_lease(-5) + # Failed acquires must not leave a partial lease behind. 
+ assert enforcer.video_lease_bytes == 0 + + def test_release_restores_ceiling(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + base = enforcer.get_final_ceiling() + enforcer.acquire_video_lease(8 * GB) + assert enforcer.get_final_ceiling() == base - 8 * GB + enforcer.release_video_lease() + assert enforcer.get_final_ceiling() == base + + def test_release_when_not_held_is_noop(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + before = list(wired_calls) + enforcer.release_video_lease() # must not raise + assert enforcer.video_lease_bytes == 0 + # Early return: no Metal wired-limit churn either. + assert wired_calls == before + + def test_video_lease_bytes_property_tracks(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + assert enforcer.video_lease_bytes == 0 + enforcer.acquire_video_lease(8 * GB) + assert enforcer.video_lease_bytes == 8 * GB + enforcer.release_video_lease() + assert enforcer.video_lease_bytes == 0 + + def test_release_clears_worker_pid(self, monkeypatch, wired_calls): + enforcer = _make_enforcer(monkeypatch) + enforcer.acquire_video_lease(8 * GB) + enforcer.set_video_worker_pid(12345) + assert enforcer._video_worker_pid == 12345 + enforcer.release_video_lease() + assert enforcer._video_worker_pid is None + + def test_guard_disabled_ceiling_stays_zero(self, monkeypatch, wired_calls): + """Guard off: ceiling is 0 (= disabled) and acquire skips Metal calls.""" + enforcer = _make_enforcer(monkeypatch, prefill_memory_guard=False) + assert enforcer.get_final_ceiling() == 0 + enforcer.acquire_video_lease(8 * GB) + assert enforcer.get_final_ceiling() == 0 + assert wired_calls == [] + + +class TestVideoLeaseWiredLimit: + def test_acquire_and_release_move_wired_limit_request( + self, monkeypatch, wired_calls + ): + enforcer = _make_enforcer(monkeypatch) + assert enforcer._metal_wired_limit_request == 0 + assert wired_calls == [] + + enforcer.acquire_video_lease(8 * GB) + assert 
wired_calls[-1] == STATIC_CEILING - 8 * GB + assert enforcer._metal_wired_limit_request == STATIC_CEILING - 8 * GB + + enforcer.release_video_lease() + assert wired_calls[-1] == STATIC_CEILING + assert enforcer._metal_wired_limit_request == STATIC_CEILING + + def test_oversized_lease_clamps_wired_target_to_one( + self, monkeypatch, wired_calls + ): + enforcer = _make_enforcer(monkeypatch) + enforcer.acquire_video_lease(STATIC_CEILING + 5 * GB) + assert wired_calls[-1] == 1 + assert enforcer._metal_wired_limit_request == 1 + + +class TestDynamicCeilingWorkerAddBack: + """Non-custom tier: dynamic ceiling adds back min(worker_footprint, lease). + + Inputs are fully mocked: get_macos_vm_stats returns fixed numbers and + get_phys_footprint is a per-pid fake. balanced tier -> active ratio 0.5, + so the base dynamic ceiling is own + free + inactive + active * 0.5. + """ + + OWN = 5 * GB + WORKER_PID = 4242 + VM = {"free": 10 * GB, "inactive": 4 * GB, "active": 8 * GB, "wired": 0} + # own 5 + free 10 + inactive 4 + active 8 * 0.5 = 23 GB + BASE = 23 * GB + + def _setup(self, monkeypatch, worker_footprint): + monkeypatch.setattr(pme, "get_macos_vm_stats", lambda: dict(self.VM)) + + def fake_phys(pid=None): + if pid is None: + return self.OWN + if pid == self.WORKER_PID: + return worker_footprint + return 0 + + monkeypatch.setattr(pme, "get_phys_footprint", fake_phys) + return _make_enforcer(monkeypatch, tier="balanced") + + def test_no_pid_no_add_back(self, monkeypatch, wired_calls): + enforcer = self._setup(monkeypatch, worker_footprint=3 * GB) + enforcer.acquire_video_lease(8 * GB) + # Lease held but no worker pid bound yet (pre-spawn): add-back 0. 
+ assert enforcer._get_dynamic_ceiling() == self.BASE + + def test_add_back_equals_worker_footprint_under_lease( + self, monkeypatch, wired_calls + ): + enforcer = self._setup(monkeypatch, worker_footprint=3 * GB) + enforcer.acquire_video_lease(8 * GB) + enforcer.set_video_worker_pid(self.WORKER_PID) + assert enforcer._get_dynamic_ceiling() == self.BASE + 3 * GB + + def test_add_back_clamped_to_lease(self, monkeypatch, wired_calls): + # Runaway worker (50 GB footprint) must not raise the parent + # ceiling beyond the 8 GB lease. + enforcer = self._setup(monkeypatch, worker_footprint=50 * GB) + enforcer.acquire_video_lease(8 * GB) + enforcer.set_video_worker_pid(self.WORKER_PID) + assert enforcer._get_dynamic_ceiling() == self.BASE + 8 * GB + + def test_zero_footprint_read_no_add_back(self, monkeypatch, wired_calls): + # Footprint read failure (0) degrades to double-counting, which + # is fail-conservative. + enforcer = self._setup(monkeypatch, worker_footprint=0) + enforcer.acquire_video_lease(8 * GB) + enforcer.set_video_worker_pid(self.WORKER_PID) + assert enforcer._get_dynamic_ceiling() == self.BASE + + def test_no_lease_no_add_back_even_with_pid(self, monkeypatch, wired_calls): + enforcer = self._setup(monkeypatch, worker_footprint=3 * GB) + enforcer.set_video_worker_pid(self.WORKER_PID) + assert enforcer._get_dynamic_ceiling() == self.BASE + + def test_lease_then_release_round_trip_final_ceiling( + self, monkeypatch, wired_calls + ): + """End to end on a non-custom tier: final ceiling tightens by the + lease minus the worker add-back, then restores after release.""" + enforcer = self._setup(monkeypatch, worker_footprint=3 * GB) + base = enforcer.get_final_ceiling() + assert base == self.BASE # min(static 100 GB, dynamic 23 GB) + + enforcer.acquire_video_lease(8 * GB) + enforcer.set_video_worker_pid(self.WORKER_PID) + # dynamic = BASE + 3 GB add-back; final = dynamic - 8 GB lease. 
+ assert enforcer.get_final_ceiling() == self.BASE + 3 * GB - 8 * GB + + enforcer.release_video_lease() + assert enforcer.get_final_ceiling() == base diff --git a/tests/test_video_routes.py b/tests/test_video_routes.py new file mode 100644 index 000000000..26bffff65 --- /dev/null +++ b/tests/test_video_routes.py @@ -0,0 +1,608 @@ +# SPDX-License-Identifier: Apache-2.0 +"""Tests for the /v1/videos API routes (omlx/api/video_routes.py). + +A minimal FastAPI app mounts the video router; the module-level accessors +(_get_video_manager / _get_engine_pool / _resolve_model) are monkeypatched. +get/list/delete semantics run against a REAL VideoJobManager constructed on +tmp_path with enforcer=None; only submit and the guard/venv probes are +stubbed per test. create_video also reads omlx.server._server_state +.global_settings.video inside the handler, so a settings stub is patched +onto the real ServerState instance (monkeypatch restores it afterwards). + +No real model dirs, no ~/.fmlx, no worker subprocess is ever spawned. +""" + +from __future__ import annotations + +from pathlib import Path +from types import SimpleNamespace + +import pytest +from fastapi import FastAPI +from fastapi.testclient import TestClient + +import omlx.api.video_routes as video_routes +import omlx.server as omlx_server +from omlx.settings import VideoSettings +from omlx.video.manager import QueueFullError, VideoJob, VideoJobManager + + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + +VIDEO_MODEL = "wan-t2v" +LLM_MODEL = "llama-llm" + + +def _video_settings(**overrides) -> VideoSettings: + """Enabled settings; lease defaults to the dataclass default (36GB, + P0-calibrated to admit the caps corner under the spatial-token + predictor). 
Tests that exercise the 413 boundary pass a smaller + explicit lease.""" + params = dict(enabled=True) + params.update(overrides) + return VideoSettings(**params) + + +def _make_manager( + tmp_path: Path, settings: VideoSettings, stub_submit: bool = True +) -> VideoJobManager: + """Real manager (real get/list_jobs/delete) with probe seams stubbed.""" + manager = VideoJobManager( + settings=settings, base_path=tmp_path, enforcer=None + ) + manager.guard_available = lambda: (True, "") # type: ignore[method-assign] + + async def _probe(force: bool = False): + return True, "" + + manager.probe_worker_venv = _probe # type: ignore[method-assign] + + if stub_submit: + submitted: list[VideoJob] = [] + + async def _submit(job: VideoJob) -> VideoJob: + # Record without waking the real dispatcher (no admission loop, + # no subprocess) + manager._jobs[job.id] = job + submitted.append(job) + return job + + manager.submit = _submit # type: ignore[method-assign] + manager.test_submitted = submitted # type: ignore[attr-defined] + return manager + + +def _seed_job( + manager: VideoJobManager, + job_id: str, + created_at: float = 100.0, + status: str = "queued", + **kwargs, +) -> VideoJob: + job = VideoJob( + id=job_id, + model_id=VIDEO_MODEL, + model_dir="/nonexistent/model-dir", + params={ + "prompt": "a cat", + "width": 480, + "height": 272, + "frames": 49, + "steps": 20, + "fps": 16, + "seed": 7, + "seconds": 3.06, + }, + status=status, + created_at=created_at, + **kwargs, + ) + manager._jobs[job_id] = job + return job + + +@pytest.fixture +def video_env(monkeypatch, tmp_path): + """Builder returning (TestClient, manager) with accessors patched.""" + + def build( + settings: VideoSettings | None = None, + stub_submit: bool = True, + patch_manager_accessor: bool = True, + ): + vs = settings or _video_settings() + manager = _make_manager(tmp_path, vs, stub_submit=stub_submit) + + entries = { + VIDEO_MODEL: SimpleNamespace( + model_path=tmp_path / "models" / "wan", 
model_type="video" + ), + LLM_MODEL: SimpleNamespace( + model_path=tmp_path / "models" / "llama", model_type="llm" + ), + } + pool = SimpleNamespace(get_entry=lambda mid: entries.get(mid)) + + if patch_manager_accessor: + monkeypatch.setattr( + video_routes, "_get_video_manager", lambda: manager + ) + monkeypatch.setattr(video_routes, "_get_engine_pool", lambda: pool) + monkeypatch.setattr(video_routes, "_resolve_model", lambda m: m) + # create_video reads _server_state.global_settings.video directly + monkeypatch.setattr( + omlx_server._server_state, + "global_settings", + SimpleNamespace(video=vs), + ) + + app = FastAPI() + app.include_router(video_routes.router) + return TestClient(app), manager + + return build + + +def _post(client: TestClient, **fields): + body = {"model": VIDEO_MODEL, "prompt": "a cat"} + body.update(fields) + return client.post("/v1/videos", json=body) + + +# --------------------------------------------------------------------------- +# POST /v1/videos -- happy paths +# --------------------------------------------------------------------------- + + +class TestCreateVideo: + def test_post_json_happy_path(self, video_env): + client, manager = video_env() + r = _post(client, size="480x272", seconds=3) + assert r.status_code == 200 + body = r.json() + assert body["id"].startswith("video_") + assert body["object"] == "video" + assert body["status"] == "queued" + assert body["model"] == VIDEO_MODEL + assert body["size"] == "480x272" + # seconds=3 * default_fps=16 = 48 frames -> 4n+1 -> 49 + assert body["frames"] == 49 + # Derived seconds string = round(49/16, 2) + assert body["seconds"] == "3.06" + assert body["progress"] == 0 + assert body["error"] is None + # Job actually reached the manager + assert manager.get(body["id"]) is not None + assert len(manager.test_submitted) == 1 + + def test_post_multipart_all_string_fields(self, video_env): + """openai SDK shape: multipart/form-data, every field a string.""" + client, manager = video_env() + r = 
client.post( + "/v1/videos", + data={ + "model": VIDEO_MODEL, + "prompt": "a cat", + "seconds": "4", + "steps": "10", + }, + # File part forces multipart encoding; non-str form values are + # filtered out by the handler + files={"input_reference": ("ref.png", b"\x89PNG", "image/png")}, + ) + assert r.status_code == 200 + body = r.json() + assert body["status"] == "queued" + assert body["steps"] == 10 + # "4" * fps 16 = 64 -> 4n+1 -> 65 + assert body["frames"] == 65 + assert body["seconds"] == str(round(65 / 16, 2)) + # Defaults applied when size omitted + assert body["size"] == "480x272" + + def test_seed_and_explicit_params_pass_through(self, video_env): + client, manager = video_env() + r = _post(client, width=480, height=272, frames=49, seed=1234, fps=8) + assert r.status_code == 200 + body = r.json() + assert body["seed"] == 1234 + assert body["fps"] == 8 + job = manager.get(body["id"]) + assert job.params["seed"] == 1234 + + +# --------------------------------------------------------------------------- +# POST /v1/videos -- model resolution errors +# --------------------------------------------------------------------------- + + +class TestCreateVideoModelErrors: + def test_unknown_model_404(self, video_env): + client, _ = video_env() + r = _post(client, model="no-such-model") + assert r.status_code == 404 + assert "not found" in r.json()["detail"] + + def test_non_video_model_400(self, video_env): + client, _ = video_env() + r = _post(client, model=LLM_MODEL) + assert r.status_code == 400 + detail = r.json()["detail"] + assert "not a video generation model" in detail + assert "model_type=llm" in detail + + def test_missing_prompt_400(self, video_env): + client, _ = video_env() + r = client.post("/v1/videos", json={"model": VIDEO_MODEL}) + assert r.status_code == 400 + + def test_malformed_body_400(self, video_env): + client, _ = video_env() + r = client.post( + "/v1/videos", + content=b"not json", + headers={"content-type": "application/json"}, + ) + assert 
r.status_code == 400 + assert "Malformed request body" in r.json()["detail"] + + +# --------------------------------------------------------------------------- +# POST /v1/videos -- normalization +# --------------------------------------------------------------------------- + + +class TestNormalization: + def test_dimensions_round_up_to_multiple_of_16(self, video_env): + client, _ = video_env() + r = _post(client, width=470, height=270) + assert r.status_code == 200 + assert r.json()["size"] == "480x272" + + def test_frames_from_seconds_times_fps(self, video_env): + client, _ = video_env() + r = _post(client, seconds=3, fps=16) + assert r.status_code == 200 + assert r.json()["frames"] == 49 # round(3*16)=48 -> 4n+1 -> 49 + + def test_explicit_frames_rounded_to_4n_plus_1(self, video_env): + client, _ = video_env() + r = _post(client, frames=50) + assert r.status_code == 200 + body = r.json() + assert body["frames"] == 53 # 4*ceil(49/4)+1 + assert body["seconds"] == str(round(53 / 16, 2)) + + def test_invalid_size_string_400(self, video_env): + client, _ = video_env() + r = _post(client, size="480by272") + assert r.status_code == 400 + assert "Invalid size" in r.json()["detail"] + + def test_nonpositive_seconds_400(self, video_env): + client, _ = video_env() + r = _post(client, seconds=0) + assert r.status_code == 400 + assert "seconds must be positive" in r.json()["detail"] + + +# --------------------------------------------------------------------------- +# POST /v1/videos -- static caps (400) and peak predictor (413) +# --------------------------------------------------------------------------- + + +class TestCapsAndPredictor: + def test_steps_over_max_400(self, video_env): + client, _ = video_env() # default max_steps=50 + r = _post(client, steps=51) + assert r.status_code == 400 + assert "max_steps" in r.json()["detail"] + + def test_pixels_over_max_400(self, video_env): + client, _ = video_env() # default cap 1280*720 + r = _post(client, width=1280, height=736) 
+ assert r.status_code == 400 + assert "max_pixels_per_frame" in r.json()["detail"] + + def test_frames_over_max_400(self, video_env): + client, _ = video_env() # default max_frames=121 + r = _post(client, frames=125) + assert r.status_code == 400 + assert "max_frames" in r.json()["detail"] + + def test_peak_predictor_413_when_over_lease(self, video_env): + # P0-calibrated formula: predicted = 17.5 + 0.0029 * (W/16 * H/16), + # frame-count-invariant. 1280x720 -> 3600 spatial tokens -> + # 17.5 + 10.44 = 27.94GB, +6 margin = 33.94 > lease 30 -> 413 + client, _ = video_env(settings=_video_settings(memory_lease_gb=30.0)) + r = _post(client, width=1280, height=720, frames=81) + assert r.status_code == 413 + detail = r.json()["detail"] + assert "memory_lease_gb" in detail + assert "Predicted memory peak" in detail + + def test_peak_predictor_small_request_fits_same_lease(self, video_env): + # Same 30GB lease: 480x272 -> 510 tokens -> 17.5 + 1.48 = 18.98GB, + # +6 margin = 24.98 < 30 -> ok + client, _ = video_env(settings=_video_settings(memory_lease_gb=30.0)) + r = _post(client, width=480, height=272, frames=49) + assert r.status_code == 200 + + def test_peak_predictor_frame_count_invariant(self, video_env): + # Frames do not enter the memory formula (P0: 49f == 101f peaks); + # a long video at modest resolution must NOT 413. + client, _ = video_env(settings=_video_settings(memory_lease_gb=30.0)) + r = _post(client, width=480, height=272, frames=121) + assert r.status_code == 200 + + def test_default_lease_admits_cap_corner(self, video_env): + # Out-of-the-box settings must admit the caps corner (the v1 bug: + # default lease below the predictor floor 413'd everything). 
+ client, _ = video_env(settings=_video_settings()) + r = _post(client, width=1280, height=720) + assert r.status_code == 200 + + +# --------------------------------------------------------------------------- +# POST /v1/videos -- 503 gates +# --------------------------------------------------------------------------- + + +class TestServiceGates: + def test_queue_full_503(self, video_env): + # Real submit with max_queued_jobs=0 raises QueueFullError before + # the dispatcher would start + client, _ = video_env( + settings=_video_settings(max_queued_jobs=0), stub_submit=False + ) + r = _post(client) + assert r.status_code == 503 + assert "queue is full" in r.json()["detail"].lower() + + def test_queue_full_error_importable_and_raised_by_submit( + self, video_env + ): + _, manager = video_env( + settings=_video_settings(max_queued_jobs=0), stub_submit=False + ) + job = VideoJob(id="video_x", model_id="m", model_dir="d", params={}) + import asyncio + + with pytest.raises(QueueFullError): + asyncio.run(manager.submit(job)) + + def test_guard_unavailable_503(self, video_env): + client, manager = video_env() + manager.guard_available = lambda: (False, "guard is not running") + r = _post(client) + assert r.status_code == 503 + assert r.json()["detail"] == "guard is not running" + + def test_venv_probe_failure_503(self, video_env): + client, manager = video_env() + + async def _probe(force: bool = False): + return False, "Video worker python not found at /x" + + manager.probe_worker_venv = _probe + r = _post(client) + assert r.status_code == 503 + assert "worker python not found" in r.json()["detail"] + + def test_video_disabled_503(self, video_env, monkeypatch): + # Do NOT patch _get_video_manager: the real accessor must gate on + # settings.video.enabled via _server_state.global_settings + client, _ = video_env( + settings=_video_settings(enabled=False), + patch_manager_accessor=False, + ) + monkeypatch.setattr( + omlx_server._server_state, "video_job_manager", None + ) 
+ r = _post(client) + assert r.status_code == 503 + assert "disabled" in r.json()["detail"] + # Every endpoint shares the gate + assert client.get("/v1/videos").status_code == 503 + assert client.get("/v1/videos/video_x").status_code == 503 + assert client.delete("/v1/videos/video_x").status_code == 503 + + def test_manager_missing_503(self, video_env, monkeypatch): + # Enabled but lifespan never built the manager -> 503 + client, _ = video_env(patch_manager_accessor=False) + monkeypatch.setattr( + omlx_server._server_state, "video_job_manager", None + ) + r = _post(client) + assert r.status_code == 503 + assert "not initialized" in r.json()["detail"] + + +# --------------------------------------------------------------------------- +# GET /v1/videos/{id} +# --------------------------------------------------------------------------- + + +class TestGetVideo: + def test_get_unknown_404(self, video_env): + client, _ = video_env() + r = client.get("/v1/videos/video_doesnotexist") + assert r.status_code == 404 + + def test_get_known_returns_wire_shape(self, video_env): + client, manager = video_env() + job = _seed_job(manager, "video_aaa", status="in_progress") + job.progress = 42 + job.phase = "denoising" + r = client.get("/v1/videos/video_aaa") + assert r.status_code == 200 + assert r.json() == job.to_dict() + body = r.json() + assert body["object"] == "video" + assert body["status"] == "in_progress" + assert body["progress"] == 42 + assert body["phase"] == "denoising" + assert body["size"] == "480x272" + + +# --------------------------------------------------------------------------- +# GET /v1/videos/{id}/content +# --------------------------------------------------------------------------- + + +class TestGetContent: + def test_content_not_completed_409(self, video_env): + client, manager = video_env() + _seed_job(manager, "video_q", status="queued") + r = client.get("/v1/videos/video_q/content") + assert r.status_code == 409 + assert "queued" in r.json()["detail"] 
+ + def test_content_unknown_404(self, video_env): + client, _ = video_env() + assert client.get("/v1/videos/video_nope/content").status_code == 404 + + def test_content_artifact_expired_404_detail_dict(self, video_env): + client, manager = video_env() + job = _seed_job(manager, "video_purged", status="completed") + job.artifact_path = None + job.expires_at = 1750000000.5 + r = client.get("/v1/videos/video_purged/content") + assert r.status_code == 404 + detail = r.json()["detail"] + assert isinstance(detail, dict) + assert detail["code"] == "artifact_expired" + assert detail["expires_at"] == 1750000000 + assert "purged" in detail["message"] + + def test_content_completed_serves_mp4(self, video_env, tmp_path): + client, manager = video_env() + payload = b"\x00\x00\x00\x18ftypmp42" + b"\x00" * 64 + mp4 = tmp_path / "out.mp4" + mp4.write_bytes(payload) + job = _seed_job(manager, "video_done", status="completed") + job.artifact_path = str(mp4) + r = client.get("/v1/videos/video_done/content") + assert r.status_code == 200 + assert r.headers["content-type"].startswith("video/mp4") + assert r.content == payload + assert "video_done.mp4" in r.headers.get("content-disposition", "") + + +# --------------------------------------------------------------------------- +# DELETE /v1/videos/{id} +# --------------------------------------------------------------------------- + + +class TestDeleteVideo: + def test_delete_known(self, video_env): + client, manager = video_env() + _seed_job(manager, "video_del") + r = client.delete("/v1/videos/video_del") + assert r.status_code == 200 + assert r.json() == { + "id": "video_del", + "object": "video.deleted", + "deleted": True, + } + # Record is gone afterwards + assert manager.get("video_del") is None + assert client.get("/v1/videos/video_del").status_code == 404 + + def test_delete_unknown_404(self, video_env): + client, _ = video_env() + assert client.delete("/v1/videos/video_nope").status_code == 404 + + +# 
--------------------------------------------------------------------------- +# GET /v1/videos -- list envelope + pagination (real list_jobs semantics) +# --------------------------------------------------------------------------- + + +@pytest.fixture +def listing_env(video_env): + client, manager = video_env() + _seed_job(manager, "video_a", created_at=100.0) + _seed_job(manager, "video_b", created_at=200.0) + _seed_job(manager, "video_c", created_at=300.0) + return client, manager + + +class TestListVideos: + def test_envelope_default_desc(self, listing_env): + client, _ = listing_env + r = client.get("/v1/videos") + assert r.status_code == 200 + body = r.json() + assert body["object"] == "list" + assert [j["id"] for j in body["data"]] == [ + "video_c", "video_b", "video_a", + ] + assert body["has_more"] is False + assert body["first_id"] == "video_c" + assert body["last_id"] == "video_a" + + def test_limit_and_has_more(self, listing_env): + client, _ = listing_env + r = client.get("/v1/videos", params={"limit": 2}) + body = r.json() + assert [j["id"] for j in body["data"]] == ["video_c", "video_b"] + assert body["has_more"] is True + assert body["first_id"] == "video_c" + assert body["last_id"] == "video_b" + + def test_after_cursor(self, listing_env): + client, _ = listing_env + r = client.get("/v1/videos", params={"after": "video_c"}) + body = r.json() + assert [j["id"] for j in body["data"]] == ["video_b", "video_a"] + assert body["has_more"] is False + + def test_after_cursor_with_limit(self, listing_env): + client, _ = listing_env + r = client.get("/v1/videos", params={"after": "video_c", "limit": 1}) + body = r.json() + assert [j["id"] for j in body["data"]] == ["video_b"] + assert body["has_more"] is True + + def test_order_asc(self, listing_env): + client, _ = listing_env + r = client.get("/v1/videos", params={"order": "asc"}) + body = r.json() + assert [j["id"] for j in body["data"]] == [ + "video_a", "video_b", "video_c", + ] + + def 
test_bad_order_400(self, listing_env): + client, _ = listing_env + assert client.get( + "/v1/videos", params={"order": "sideways"} + ).status_code == 400 + + def test_limit_clamped_to_minimum_1(self, listing_env): + client, _ = listing_env + r = client.get("/v1/videos", params={"limit": 0}) + body = r.json() + assert len(body["data"]) == 1 + assert body["has_more"] is True + + def test_unknown_after_cursor_ignored(self, listing_env): + # Manager semantics: unknown cursor falls through to the full list + client, _ = listing_env + r = client.get("/v1/videos", params={"after": "video_ghost"}) + body = r.json() + assert len(body["data"]) == 3 + + def test_empty_list_envelope(self, video_env): + client, _ = video_env() + r = client.get("/v1/videos") + body = r.json() + assert body == { + "object": "list", + "data": [], + "has_more": False, + "first_id": None, + "last_id": None, + }
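Review note on the `TestListVideos` cases above: the pagination contract they pin down (descending order by default, an `after` cursor, `limit` clamped to at least 1, an unknown cursor ignored, `first_id`/`last_id` set to `None` on an empty page) can be sketched in a few lines. This is a minimal illustration inferred from the assertions only, not the actual `VideoJobManager.list_jobs` or route code. The function name `list_page`, the dict-shaped jobs, and the default `limit` of 20 are assumptions.

```python
from __future__ import annotations

from typing import Any


def list_page(
    jobs: list[dict[str, Any]],
    after: str | None = None,
    limit: int = 20,  # assumed default; the diff does not show the real one
    order: str = "desc",
) -> dict[str, Any]:
    """Hypothetical stand-in for the list semantics the tests assert."""
    if order not in ("asc", "desc"):
        # The tests expect the route to answer HTTP 400 in this case.
        raise ValueError(f"order must be 'asc' or 'desc', got {order!r}")
    ordered = sorted(
        jobs, key=lambda j: j["created_at"], reverse=(order == "desc")
    )

    # A known cursor drops everything up to and including that job;
    # an unknown cursor falls through to the full list.
    ids = [j["id"] for j in ordered]
    if after in ids:
        ordered = ordered[ids.index(after) + 1:]

    limit = max(1, limit)  # limit=0 is clamped up to 1
    page = ordered[:limit]
    return {
        "object": "list",
        "data": page,
        "has_more": len(ordered) > limit,
        "first_id": page[0]["id"] if page else None,
        "last_id": page[-1]["id"] if page else None,
    }


jobs = [
    {"id": "video_a", "created_at": 100.0},
    {"id": "video_b", "created_at": 200.0},
    {"id": "video_c", "created_at": 300.0},
]
page = list_page(jobs, after="video_c", limit=1)
print([j["id"] for j in page["data"]], page["has_more"])  # ['video_b'] True
```

Computing `has_more` from what remains after the cursor, rather than from the total job count, is what lets `after="video_c"` with no limit report `has_more` as `False` while the same cursor with `limit=1` reports `True`.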