Feat: About to Introduce Service Mode #430
Conversation
Feature/service
This branch was previously reverted due to a PR misoperation. It is now restored so development can continue. - Sincere apologies from Kiramei.
Feat: Log Beautification with Rich
Replaces custom logging and console output with a centralized rich-based logging format. Moves log handler and formatter setup to service/__init__.py via set_log_format(), and updates main.service.py and core/utils.py to use standard logging. Adds logging for config changes in config_manager.py and suppresses specific warnings for cleaner output.
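For illustration, here is a minimal sketch of what a centralized rich-based setup like set_log_format() could look like; the body below is an assumption, not the PR's actual code:

```python
# Hypothetical sketch of a centralized rich-based logging setup; the body of
# set_log_format() is an assumption for illustration, not the PR's actual code.
import logging
from rich.logging import RichHandler

def set_log_format(level: int = logging.INFO) -> None:
    # Route all standard logging through a single rich handler.
    logging.basicConfig(
        level=level,
        format="%(message)s",
        datefmt="[%X]",
        handlers=[RichHandler(rich_tracebacks=True, markup=True)],
    )

set_log_format()
logging.getLogger(__name__).info("service started")
```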
Introduces a new update_to_latest command to the websocket API and implements a robust update mechanism in service/utils/_update.py, supporting both MirrorC and Git-based updates. Refactors update logic in service/utils/update.py to use a new GitOperationHandler abstraction, improving reliability and maintainability. Adds PID file management to main.service.py for process tracking, and updates service/runtime.py and service/app.py to support the new update command.
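As a rough illustration of the PID-file idea (the file name and helper names are assumptions, not the PR's code):

```python
# Hypothetical sketch of PID-file based process tracking; the file name and
# helper names are assumptions for illustration.
import os
from pathlib import Path
from typing import Optional

PID_FILE = Path("baas_service.pid")  # assumed location

def write_pid_file() -> None:
    # Record the current process id so external tools can find the service.
    PID_FILE.write_text(str(os.getpid()))

def read_service_pid() -> Optional[int]:
    try:
        return int(PID_FILE.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None
```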
This track makes Docker available for building and deployment.
Feat/docker support
Introduces deploy/installer/installer.py with comprehensive installer logic, including environment setup, package installation, update mechanisms, and error handling. Updates build.bat to include --hidden-import=_cffi_backend for PyInstaller. Refactors deploy/installer/installer.py to streamline imports, configuration management, and utility classes, improving maintainability and modularity.
Refactor: installer logic
Introduces the DotPrinter class to provide visual progress indication during git operations. Updates all usages of TemporaryDirectory to specify the parent directory for better file management. Adds calls to gc.collect() after major file operations to improve resource cleanup.
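A small sketch of the two patterns mentioned above; the parent directory path is an assumption:

```python
# Sketch of the two patterns above; WORK_ROOT is an assumed parent directory.
import gc
import tempfile
from pathlib import Path

WORK_ROOT = Path("deploy/tmp")
WORK_ROOT.mkdir(parents=True, exist_ok=True)

# Pin temporary files under a known parent directory for easier management.
with tempfile.TemporaryDirectory(dir=WORK_ROOT) as tmp:
    (Path(tmp) / "clone_target").mkdir()  # illustrative git/file work

# Encourage prompt release of freed resources after heavy file operations.
gc.collect()
```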
Removes unused APT packages and deletes some cached files.
WoW, excited to see that Service Mode has been integrated. Thanks for your amazing work! After reading the PR description and checking the code, I have a few questions and suggestions regarding engineering best practices, especially since I'm a newcomer to this project:
Thanks for your reply. Regarding the points you mentioned:
Thanks for the detailed explanation. I fully understand the difficulties you mentioned around distribution through domestic mirror sources and the user experience. However, regarding storing build artifacts in …
Taking engineering practices, the domestic network environment, and the current state of the project into account, my personal suggestions are:
1. Even based on shallow clones, distributing binaries or build artifacts this way is still a stopgap, because the size will keep growing as new code is pulled. It is a long-term liability, but to keep the project moving I think this compromise is acceptable for now.
2. The concern about build time and cost is very responsible, but modern CI/CD pipelines have mature solutions for all of this. We do not need to rely on Docker Hub at all. GitHub's own GHCR (GitHub Container Registry) is completely free for public repositories and integrates seamlessly with GitHub Actions: no cost and no strict upload limits. Dependencies change only once every few months, so CI reuses the cached layers directly; Docker only repackages the code layer. Even with 1,000 commits you merely add 1,000 layers, which puts negligible pressure on storage and transfer.
3. Users only need to blindly run docker pull <image>:latest. The Docker daemon automatically handles incremental downloads and tag switching. As for old layers, Docker manages them automatically; users never have to delete layers from the underlying filesystem. Docker provides the standard command docker image prune, which identifies and cleans up all dangling images no longer used by any container in one command. Many NAS systems (such as Synology) even have a cleanup button right in their Docker management panel.
4. From my personal point of view, the Docker image work can proceed entirely in parallel: it is essentially just CI/CD packaging logic, unrelated to core code functionality. If a concrete implementation (GitHub Actions) or test verification is needed, I would be happy to help set this part up.
Kiramei ***@***.***> wrote on Friday, January 2, 2026, 8:56 PM:
… *Kiramei* left a comment (pur1fying/blue_archive_auto_script#430) <#430 (comment)>
1. Regarding git clone: in the installer I modified, I suggest using a shallow clone directly, i.e. cloning only the latest revision. From the user's perspective the past history is rarely useful; users do not manually roll back updates, and we need not treat them as developers. So I suggest adding the --depth 1 parameter. I also use this clone method frequently in my own development; it is simple and fast. As for pull speed, I do not think it will be affected much even with frequent use and updates;
2. I can look into this mode; thanks for the idea;
3 & 4. You are quite right that Docker generally does not expect users to perform git hot-updates, and your suggestion to build an environment layer is spot on: we do not update the environment frequently, and we only introduce new dependencies as needed, so that is fine. But I still have concerns about the code layer. The most representative one is that users must manually run docker pull to fetch the latest code and then clean up the old layers. Also, whether frequent Docker updates create heavy storage demands is unclear: by that account a lot of build time could be spent, and 1,000 commits would mean 1,000 images, so storage-wise I would expect roughly git-like size. Moreover, whether platforms like Docker Hub will accept such frequent updates, and whether fees would be incurred (our development must guarantee zero cost), is also debatable.
Thanks for your patient reply; I roughly understand now. One last question about cost: as far as I know, GitHub Actions gives ordinary users a monthly limit of 2,000 minutes and 500 MB of Artifact storage. Given the project's current development and update frequency, could you assess whether we might hit those limits? Ref: https://docs.github.com/zh/actions/reference/limits On the development side, I am currently pushing other projects forward, so I may not be able to work on this PR for now, but in roughly one to two weeks pur1fy will start verifying and testing this PR's functionality and compatibility, and I will prioritize his suggestions at that point. Once the functionality and compatibility issues are handled and my own project is finished, I will implement it along the lines you proposed; I would appreciate your help moving this PR forward then.
Thanks for your reply.
Regarding the limits, I have some good news: GitHub Actions usage is actually free for public repositories, so the 2,000-minute limit typically applies only to private ones. Even regarding efficiency, our estimates are quite safe: building a runtime base image (containing the environment and libraries) typically takes only 2-5 minutes since we utilize pre-built wheels; daily updates would simply pull this cached base and add the code layer, which is extremely fast and likely takes less than 2 minutes.
Furthermore, regarding storage, we would use GitHub Container Registry (GHCR) rather than Actions Artifacts. GHCR is free for public repositories and designed for production distribution, avoiding the 500 MB artifact limit and expiration policy entirely.
I look forward to collaborating with you after the testing phase.
MC-ALL left a comment
I’ve done a quick sweep of the areas I’m familiar with. Apart from the download logic—which is critical—the other issues aren't blockers. However, I believe the last two points are quite important for the project's long-term engineering health.
```python
from core.device import emulator_manager
from main import Main
from .broadcast import BroadcastChannel
from .utils import *
```
Could we change this to explicit imports? Using `import *` impacts readability because it's hard to tell where functions come from. It also messes up static analysis tools.
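For instance, something along these lines; the imported names below are hypothetical placeholders, just to show the shape:

```python
# Hypothetical explicit imports; these names are placeholders for whatever
# service/utils actually exports.
from .utils import set_log_format, resolve_config_path
```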
You're right. As this is an initial version introducing service mode, every module is designed to keep the code structure simple. As you can see, there is potential later to split each module out by function for clarity.
```python
elif cmd.command == "update_to_latest":
    result = await context.runtime.update_to_latest()
    response_payload = {"status": "ok", "data": result}
```
I've reviewed the underlying code for the update logic. It appears to work by spawning a new process to replace the current one. Do the backend and the scripts currently share the same interpreter?
I noticed the frontend instructs users to stop all tasks. Should we also implement a check on the backend to verify and stop any running tasks before executing this command?
Also, regarding the implementation in service/runtime.py, it seems the return statement in that function is unreachable. Could we modify the flow to send a 'restart initiated' response to the frontend before the actual restart? This would also allow us to introduce a delay window where the operation could be manually cancelled.
For the first question, I'd say no. As you can see from deploy/installer/installer.py, the virtual env is spawned by the installer script, which itself runs as the PyInstaller-packaged executable. You may have doubts about the updating process: since the project already uses pygit2 for OCR updates, we reuse that module for dynamic updates, which requires all running user tasks to be stopped to avoid unexpected issues. For the final question, my implementation is not yet prepared for that, so your proposal could be useful, thanks.
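A minimal sketch of the "acknowledge first, restart after a cancellable delay" idea discussed above; all helper names here (send_response, perform_restart, restart_cancelled) are hypothetical:

```python
# Hedged sketch: reply before the restart so the frontend gets feedback, and
# leave a delay window for manual cancellation. All helper names here
# (send_response, perform_restart, restart_cancelled) are hypothetical.
import asyncio

RESTART_DELAY_S = 10

async def update_to_latest(context) -> dict:
    await context.send_response({"status": "ok", "data": "restart initiated"})
    context.restart_cancelled = False
    await asyncio.sleep(RESTART_DELAY_S)  # cancellation window
    if context.restart_cancelled:
        return {"status": "cancelled"}
    await context.perform_restart()  # replaces the current process
    return {"status": "restarting"}  # effectively unreachable after replacement
```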
```python
with open(file_path, "wb") as download_f:
    for chunk in response.iter_content(chunk_size=1024 * 64):
        if not chunk:
            continue
        download_f.write(chunk)
```
If a network interruption occurs during the streaming phase (inside the for chunk in response.iter_content loop), requests will raise exceptions like ChunkedEncodingError or IncompleteRead. Since the loop is outside the try-block, these errors will be unhandled and cause the entire application to crash.
Below are the results of an ugly AI-generated black-box test for your reference.

```
[INFO] [BadServer] Serving on port 9991
==================================================
>>> Starting blackbox test <<<
==================================================
Client requesting download: http://localhost:9991/incomplete_file.zip ...
[INFO] Prepare for downloading incomplete_file.zip
[INFO] [BadServer] Sent 10 bytes and stopped.
[INFO] Server claims file size is: 1000 bytes
==================================================
>>> Test result: program crash <<<
==================================================
Uncaught exception: ChunkedEncodingError: ('Connection broken: IncompleteRead(10 bytes read, 990 more expected)', IncompleteRead(10 bytes read, 990 more expected))
Test finished.
```
Oh, I see it. Would wrapping it in a try-except block resolve this problem?
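A minimal sketch of such a fix, reusing the variable names from the snippet above; the cleanup behaviour is an assumption:

```python
# Sketch: keep the streaming loop inside a try-block so mid-stream network
# failures are handled instead of crashing. Variable names follow the snippet
# above; the cleanup behaviour is an assumption.
import os
import requests

try:
    with open(file_path, "wb") as download_f:
        for chunk in response.iter_content(chunk_size=1024 * 64):
            if not chunk:
                continue
            download_f.write(chunk)
except (requests.exceptions.ChunkedEncodingError,
        requests.exceptions.ConnectionError) as exc:
    # Remove the partial file and surface a recoverable error.
    if os.path.exists(file_path):
        os.remove(file_path)
    raise RuntimeError(f"Download interrupted: {exc}") from exc
```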
```python
try:
    if cmd.command == "start_scheduler":
        if not cmd.config_id:
            raise ValueError("config_id is required for start_scheduler")
        result = await context.runtime.start_scheduler(cmd.config_id,
                                                       set_log=context.ensure_runtime_logger_attached)
        response_payload = {"status": "ok", "data": result}
    elif cmd.command == "stop_scheduler":
        if not cmd.config_id:
            raise ValueError("config_id is required for stop_scheduler")
        result = await context.runtime.stop_scheduler(cmd.config_id)
        response_payload = {"status": "ok", "data": result}
    elif cmd.command == "solve":
        if not cmd.config_id:
            raise ValueError("config_id is required for solve")
        task = cmd.payload.get("task")
        if not task:
            raise ValueError("task is required for solve command")
        result = await context.runtime.solve_task(cmd.config_id, task,
                                                  set_log=context.ensure_runtime_logger_attached)
        response_payload = {"status": "ok", "data": result}
```
The current implementation uses a lengthy if-elif chain to handle different commands, which results in high cyclomatic complexity and violates the Open/Closed Principle. Adding new commands requires constant modification of this entry function, making it harder to maintain and test.
Consider refactoring this into a Dictionary-based Dispatcher or the Command Pattern. You could map command strings to specific handler functions. This would separate the routing logic from the business logic (validation & execution), significantly improving readability and extensibility.
For example,
```python
async def _handle_start_scheduler(cmd: CommandMessage, context: ServiceContext) -> Any:
    if not cmd.config_id:
        raise ValueError("config_id is required")
    return await context.runtime.start_scheduler(
        cmd.config_id, set_log=context.ensure_runtime_logger_attached
    )

async def _handle_stop_scheduler(cmd: CommandMessage, context: ServiceContext) -> Any:
    if not cmd.config_id:
        raise ValueError("config_id is required")
    return await context.runtime.stop_scheduler(cmd.config_id)

async def _handle_solve(cmd: CommandMessage, context: ServiceContext) -> Any:
    if not cmd.config_id:
        raise ValueError("config_id is required")
    task = cmd.payload.get("task")
    if not task:
        raise ValueError("task is required")
    return await context.runtime.solve_task(
        cmd.config_id, task, set_log=context.ensure_runtime_logger_attached
    )

# ...

COMMAND_HANDLERS = {
    "start_scheduler": _handle_start_scheduler,
    "stop_scheduler": _handle_stop_scheduler,
    "solve": _handle_solve,
}

# ...

@app.websocket("/ws/trigger")
async def websocket_trigger(websocket: WebSocket) -> None:
    # ...
    try:
        handler = COMMAND_HANDLERS.get(cmd.command)
        if handler:
            result = await handler(cmd, context)
            response_payload = {"status": "ok", "data": result}
        else:
            raise ValueError(f"Unsupported command '{cmd.command}'")
        # ... (error handling as before)
```
This fix is ok, I'll consider implementing this.
```python
except (AuthenticationError, HTTPException) as exc:
    await websocket.close(code=4401, reason=str(exc))
except WebSocketDisconnect:
    pass
except Exception as exc:
    import traceback
    traceback.print_exc()
    await websocket.close(code=1011, reason=str(exc))
```
Every WebSocket endpoint repeats the same handshake logic, exception handling (AuthenticationError, WebSocketDisconnect), and cleanup routines.
Introduce a websocket_scope async context manager. This manager should handle the handshake on entry and standardize error handling/cleanup on exit. This will reduce boilerplate significantly and ensure consistent behavior across all endpoints.
For example,
```python
import contextlib

@contextlib.asynccontextmanager
async def websocket_scope(websocket: WebSocket):
    """
    Handles the standard lifecycle of a BAAS WebSocket connection:
    1. Performs handshake on entry.
    2. Catches standard exceptions (Auth, Disconnect).
    3. Yields the cipher for encryption/decryption.
    """
    try:
        # standard handshake logic moved here
        if _SHARED_SECRET is None:
            raise RuntimeError("Shared secret not initialised")
        await websocket.accept()
        session = HandshakeSession(_SHARED_SECRET)
        challenge = session.issue_challenge()
        await websocket.send_json({"type": "handshake"})  # ... remaining payload elided ...
        # ... verify response ...
        await websocket.send_json({"type": "handshake_ok"})
        cipher = session.build_cipher()
        yield cipher  # Yield cipher to the inner block
    except (AuthenticationError, HTTPException) as exc:
        await websocket.close(code=4401, reason=str(exc))
    except WebSocketDisconnect:
        pass  # Normal closure
    except Exception as exc:
        import traceback
        traceback.print_exc()
        await websocket.close(code=1011, reason=str(exc))
```

Before,

```python
@app.websocket("/ws/trigger")
async def websocket_trigger(websocket: WebSocket):
    try:
        _, cipher = await _perform_handshake(websocket)
        # ... logic ...
    except Exception as e:
        # ... error handling ...
        pass
```

After,

```python
@app.websocket("/ws/trigger")
async def websocket_trigger(websocket: WebSocket):
    async with websocket_scope(websocket) as cipher:
        # ... logic using cipher ...
        # No need to worry about handshake or top-level try-except
        pass
```
This fix is ok, I'll consider implementing this.
Regarding installer.py: in this PR I want to resolve as many launcher errors as possible, including #454.
Current issues:
Regarding the second issue, did previous versions work correctly?
This is the first time I've encountered it, and no one has reported this problem before (it's also my first time running the launcher from a USB drive).
I've recently been trying to refactor the installer into a runtime library, designed in a state-machine-like style. I plan to adopt uv as the package manager and am investigating making uv portable; so far it looks very likely to be compatible with previously created virtual-env environments. A toy sketch of the state-machine idea follows.
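The states and transitions below are assumptions for illustration, not the actual refactor:

```python
# Toy sketch of a state-machine style installer; states and transitions are
# assumptions for illustration, not the actual refactor.
from enum import Enum, auto

class InstallState(Enum):
    CHECK_ENV = auto()
    FETCH_UV = auto()
    CREATE_VENV = auto()
    SYNC_DEPS = auto()
    DONE = auto()

def run_installer() -> InstallState:
    state = InstallState.CHECK_ENV
    transitions = {
        InstallState.CHECK_ENV: InstallState.FETCH_UV,
        InstallState.FETCH_UV: InstallState.CREATE_VENV,
        InstallState.CREATE_VENV: InstallState.SYNC_DEPS,
        InstallState.SYNC_DEPS: InstallState.DONE,
    }
    while state is not InstallState.DONE:
        # Each step would do real work (and may branch on errors) before advancing.
        state = transitions[state]
    return state
```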
I also agree with downloading from releases.
I'm currently planning to build the auto-battle UI in Vue (timeline file management, running, search). Running auto-battle involves interacting with the C++ side. If we want to display these interfaces inside BAAS as well, what should we pay attention to?
Actually, we could consider making it a bootstrap program and just distributing that. uv is written in Rust; using Rust directly would also solve the cross-platform problem along the way. A tui / entry / frontend + backend architecture would be simple and extensible enough.
See the BAAS-Tauri project for reference.
- Integration with baas-tauri and baas-webui: this mode does not affect the current UI, but it still needs testing, focusing on whether bugs appear on the Git side;
- installer.py was reworked, but it is unclear whether any functionality was lost; the old version is kept in _installer.py and needs careful review by @pur1fying;
- the dist frontend files: later updates to this part may be pushed as PRs via GitHub Actions, i.e. the folder will stay in sync with https://github.com/Kiramei/baas-webui/tree/gh-pages;
- server_installer adds support for the Git CLI: by default it automatically selects the system-level Git (no progress bar in this mode) and falls back to PyGit2 if it is absent, to accommodate Docker deployment;
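A rough sketch of the described Git CLI preference with PyGit2 fallback; clone_repo is an illustrative name, not the actual server_installer API:

```python
# Hedged sketch of the described behaviour: prefer the system git CLI and fall
# back to pygit2 when it is absent (e.g. in a Docker image). clone_repo is an
# illustrative name, not the actual server_installer API.
import shutil
import subprocess

def clone_repo(url: str, dest: str) -> None:
    if shutil.which("git"):
        # System git path: no progress bar in this mode, as noted above.
        subprocess.run(["git", "clone", "--depth", "1", url, dest], check=True)
    else:
        import pygit2
        pygit2.clone_repository(url, dest)
```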