Skip to content

Conversation

@Peterrpeterrr
Copy link

Change Summary

Related issue number

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Run pre-commit install and pre-commit run --all-files before git commit, and passed lint check.
  • Documentation reflects the changes where applicable

Y-C-Fan and others added 20 commits November 7, 2025 21:11
- Add comprehensive documentation for role C (Integration Engineer & Verification Loop)
- Create analysis, goals and task breakdown documents in skx-docs
- Integrate orchestrator components from chao/docs branch
- Update projects/code_scratch configurations
- Modify .gitignore to include skx-docs in version control
- Implement external integration modules for role C
- Implement CodeScratchCaller for blackbox calling of projects/code_scratch
- Implement DeepResearchCaller for blackbox calling of projects/deep_research
- Implement PromptInjector for meta-instruction construction
- Implement TestRunner for automated testing and error feedback
- Update CodeAdapter to use real implementations instead of mocks
- Create test file for external integration modules
- Update devcontainer.json to use Python 3.10 as default
- Sync devcontainer configuration with chao/docs branch
- Add user_operation_guide.md with complete workflow instructions
- Remove git_config_guide.md as it's no longer needed in docs
@gemini-code-assist
Copy link

Summary of Changes

Hello @Peterrpeterrr, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

本次拉取请求引入了一个功能强大且标准化的 DevContainer 开发环境,显著提升了项目的 AI Agent 编排能力。通过新增的 Orchestrator 框架,实现了对现有深度研究和代码生成模块的无缝集成与管理,构建了一个从研究到代码的自动化流水线,并提供了全面的文档支持,确保了开发流程的顺畅和可维护性。

Highlights

  • DevContainer 环境搭建: 新增了 Dockerfile、devcontainer.json、docker-compose.yml 以及一系列辅助脚本,用于构建一个基于 Ubuntu 22.04 的标准化开发容器环境,集成了 Python 3.10、Node.js 和各种开发工具。
  • DevContainer 文档: 提供了详细的 DevContainer 使用指南、代理配置说明和优化功能文档,包括 Claude Code 集成、增强的 VS Code 设置和生命周期优化。
  • Orchestrator 模块引入: 新增了一个名为 orchestrator 的核心模块,旨在通过适配器模式(Adapter Pattern)编排现有的 AI Agent 功能,实现 Research-to-Code 的自动化流水线。
  • 外部集成工具: 创建了 external_integration 模块,包含 CodeScratchCallerDeepResearchCallerPromptInjectorTestRunner,用于以非侵入方式调用现有 ms-agent 项目并实现代码生成与验证的闭环。
  • 项目与集成文档: 新增了大量文档,包括对 Code ScratchDeep ResearchDoc Research 模块的深度分析,以及 Orchestrator 的架构分析、集成计划、任务清单和用户操作指南。
  • LLM 模型配置更新: 更新了 projects/code_scratch 下的 YAML 配置文件,将特定模型名称替换为占位符 <OPENAI_MODEL_ID>,并修改 ms_agent/agent/agent.yaml 以从环境变量读取 MODELSCOPE_API_KEY
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/sync-upstream.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive development environment using DevContainers and a new orchestration layer for the agent. The setup is very thorough and includes detailed documentation, which is great.

My review focuses on improving the security, correctness, and maintainability of the new configuration and scripts. I've identified some critical security concerns in the Docker configuration, such as the use of seccomp:unconfined and host networking, which should be addressed. There are also a few inconsistencies and potential bugs in the setup scripts and configuration files. Additionally, I've provided suggestions for optimizing the Dockerfile and refactoring some Python code for better practices.

Overall, this is a significant contribution that will greatly improve the development workflow. Addressing the feedback will help ensure the environment is secure and robust.

Comment on lines +40 to +41
security_opt:
- seccomp:unconfined

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Disabling seccomp with seccomp:unconfined significantly reduces container security by removing the default syscall restrictions. This, combined with --cap-add=SYS_PTRACE from devcontainer.json, gives the container extensive privileges on the host. This should be avoided unless absolutely necessary for debugging. If required, please document the reason clearly. Consider using a custom, more restrictive seccomp profile if possible.

"--hostname",
"ms-agent-dev",
"--add-host=host.docker.internal:host-gateway",
"--network=host"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using --network=host removes network isolation between the container and the host, which can be a security risk. The container gets access to the host's entire network stack. If this is for convenience (e.g., accessing services on localhost), consider using host.docker.internal or explicitly publishing ports instead. If host networking is required, please document the reason.

environment:
- PYTHONPATH=/workspace
- PYTHONUNBUFFERED=1
- PYTHON_VERSION=3.11

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The PYTHON_VERSION environment variable is set to 3.11, but the Dockerfile installs python3.10. This inconsistency can lead to confusion and potential issues if scripts rely on this environment variable. Please make them consistent.

Comment on lines +51 to +56
git config --global credential.helper store

# 创建 GitHub 凭据文件
mkdir -p ~/.git-credentials
echo "https://oauth2:${GITHUB_TOKEN}@github.com" > ~/.git-credentials
chmod 600 ~/.git-credentials

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using credential.helper store saves the GitHub token in plaintext in ~/.git-credentials. This is a security risk, even inside a container. Consider using a more secure credential helper like cache to store the token in memory for a limited time. If you need to automate authentication, look into using the GitHub CLI (gh) for authentication, which is more secure.

echo "🚀 开始安装项目依赖..."

# 切换到workspace目录(项目代码在此)
cd /workspaces/seu-ms-agent

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The path /workspaces/seu-ms-agent is hardcoded. The docker-compose.yml mounts the project to /workspace. This hardcoded path is likely incorrect and will cause the script to fail. It's better to use a relative path or an environment variable. Given the context, it should probably be /workspace.

Suggested change
cd /workspaces/seu-ms-agent
cd /workspace

Comment on lines +19 to +60
RUN sed -i 's|http://archive.ubuntu.com/ubuntu/|https://mirrors.aliyun.com/ubuntu/|g' /etc/apt/sources.list \
&& sed -i 's|http://security.ubuntu.com/ubuntu/|https://mirrors.aliyun.com/ubuntu/|g' /etc/apt/sources.list \
&& printf 'Acquire::Retries "5";\nAcquire::http::Timeout "30";\n' > /etc/apt/apt.conf.d/80-retries

# 创建用户(避免使用root)
ARG USERNAME=vscode
ARG USER_UID=1000
ARG USER_GID=$USER_UID

# 更新系统并安装基础软件包
RUN apt-get update && apt-get install -y \
# 基础工具
sudo \
curl \
wget \
git \
vim \
nano \
unzip \
zip \
build-essential \
cmake \
pkg-config \
# Python 3.10相关
python3.10 \
python3-pip \
python3-venv \
python3-dev \
python3-distutils \
# 网络工具
openssh-client \
# 语言环境
locales \
# 其他有用工具
htop \
tree \
jq \
# 配置语言环境并清理缓存
&& locale-gen en_US.UTF-8 \
&& update-locale LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To optimize the Docker image size and build time, it's a good practice to chain RUN commands together. The instructions on lines 19-21 and 29-60 can be combined into a single RUN layer. This avoids running apt-get update and cleaning up apt caches multiple times.

Comment on lines +105 to +192
def _parse_pytest_output(self, stdout: str,
stderr: str) -> List[Dict[str, Any]]:
"""
解析pytest输出,提取关键错误信息

Args:
stdout: pytest的标准输出
stderr: pytest的错误输出

Returns:
解析后的错误信息列表
"""
errors = []

# 解析输出以提取错误信息
output = stdout + stderr

# 查找测试失败的相关信息
lines = output.split('\n')
current_error = None

for line in lines:
# 检查是否是错误行
if 'FAIL' in line and '::' in line:
# 这是一个失败的测试
parts = line.split()
for part in parts:
if '::' in part and 'FAIL' not in part:
test_name = part
errors.append({
'type': 'test_failure',
'test_name': test_name,
'message': line.strip()
})
break
elif 'ERROR' in line and '::' in line:
# 这是一个错误的测试
parts = line.split()
for part in parts:
if '::' in part and 'ERROR' not in part:
test_name = part
errors.append({
'type': 'test_error',
'test_name': test_name,
'message': line.strip()
})
break
elif line.strip().startswith('E '):
# 这是具体的错误详情
if errors:
errors[-1]['detailed_error'] = line.strip()[
4:] # 移除 'E ' 前缀

# 尝试解析Python异常
in_traceback = False
for i, line in enumerate(lines):
if 'Traceback' in line and '(most recent call last)' in line:
in_traceback = True
continue

if in_traceback:
if line.strip().startswith('File "'):
# 提取异常信息
error_info = {'type': 'exception', 'traceback': []}

# 收集整个traceback
j = i
while j < len(lines) and not (lines[j].strip() == ''
and j > i + 5):
if j >= len(lines):
break
error_info['traceback'].append(lines[j])
if j > i and not lines[j].strip().startswith(
' ') and not lines[j].strip().startswith(
'File "'):
# 可能是异常类型和消息
if ':' in lines[j]:
parts = lines[j].split(':', 1)
error_info['exception_type'] = parts[0].strip()
error_info['exception_message'] = parts[
1].strip()
break
j += 1

errors.append(error_info)
in_traceback = False

return errors

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Parsing pytest's stdout/stderr text is brittle and can easily break if pytest changes its output format. A more robust approach is to have pytest generate a machine-readable report, such as JUnit XML, using the --junit-xml=path/to/report.xml flag. You can then parse this XML file to reliably extract test results and failures.

Comment on lines +68 to +83
try:
loop = asyncio.get_event_loop()
except RuntimeError:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

# 配置参数 (可以后续移入 Config)
run_params = {
'user_prompt': query,
'breadth': 4, # 广度
'depth': 2, # 深度
'is_report': True,
'show_progress': True
}

loop.run_until_complete(workflow.run(**run_params))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic to get or create an asyncio event loop is a bit complex. Since Python 3.7, asyncio.run() is the recommended, simpler way to run a top-level async function from sync code, as it handles loop creation and teardown automatically. Consider refactoring this to use asyncio.run() for simplicity and better practice.

        # 5. 执行 Run (异步)
        # 使用 asyncio.run() 来简化异步代码的执行
        run_params = {
            'user_prompt': query,
            'breadth': 4,  # 广度
            'depth': 2,  # 深度
            'is_report': True,
            'show_progress': True
        }

        asyncio.run(workflow.run(**run_params))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants