39 changes: 39 additions & 0 deletions specs/206-markdown-file-creation-f7d8d3/feature.yaml
@@ -0,0 +1,39 @@
feature:
id: "206-markdown-file-creation-f7d8d3"
name: "markdown-file-creation-f7d8d3"
number: 206
branch: "feat/206-markdown-file-creation-f7d8d3"
lifecycle: "research"
createdAt: "2026-03-25T06:54:05Z"
status:
phase: "implementation-complete"
progress:
completed: 14
total: 14
percentage: 100
currentTask: null
lastUpdated: "2026-03-25T07:12:34.543Z"
lastUpdatedBy: "feature-agent:implement"
completedPhases:
- "analyze"
- "requirements"
- "research"
- "plan"
- "phase-1-setup"
- "phase-2-file-creation"
- "phase-5-orchestration"
validation:
lastRun: null
gatesPassed: []
autoFixesApplied: []
tasks:
current: null
blocked: []
failed: []
checkpoints:
- phase: "feature-created"
completedAt: "2026-03-25T06:54:05Z"
completedBy: "feature-agent"
errors:
current: null
history: []
230 changes: 230 additions & 0 deletions specs/206-markdown-file-creation-f7d8d3/plan.yaml
@@ -0,0 +1,230 @@
name: "markdown-file-creation-f7d8d3"
summary: >
Implement feature 206 to create a test markdown file (test-afcl8i.md) following the
established pattern from features 200-205. Architecture uses single-module design with
clear separation of concerns: hard-coded content → file creation → validation pipeline →
git operations. Implementation is straightforward: pathlib for cross-platform file I/O,
regex-based CommonMark validation, subprocess for git operations, fail-fast error handling
with specific error messages. No external dependencies beyond Python 3.11+ stdlib and git.

relatedFeatures:
- number: 205
name: "markdown-file-creation-870df7"
relationship: "Direct precedent with identical hard-coded content pattern"
- number: 204
name: "markdown-file-creation-946f62"
relationship: "Validation pattern reference (uses API-based content; feature 206 uses hard-coded)"
- number: 203
name: "markdown-file-creation-213afa"
relationship: "Architecture and fail-fast error handling pattern"

technologies:
- "Python 3.11+ (standard library: pathlib, subprocess, sys, re)"
- "pathlib.Path for cross-platform file I/O"
- "subprocess.run() for git command execution"
- "Regular expressions (re module) for CommonMark validation"
- "Git command-line interface"
- "Markdown (CommonMark specification)"
- "UTF-8 text encoding with Unix LF line endings"
- "sheep.observability.logging for structured logging"

relatedLinks:
- title: "Feature 205 Implementation (Direct Precedent)"
url: "https://github.com/jobnik/sheep/blob/main/src/sheep/features/feature_205_markdown_file_creation.py"
- title: "CommonMark Markdown Specification"
url: "https://spec.commonmark.org/"
- title: "Conventional Commits Specification"
url: "https://www.conventionalcommits.org/"
- title: "Python pathlib Documentation"
url: "https://docs.python.org/3.11/library/pathlib.html"
- title: "Python subprocess Documentation"
url: "https://docs.python.org/3.11/library/subprocess.html"

phases:
- id: "phase-1-setup"
name: "Module Setup & Constants"
description: "Define module constants (filename, feature number, branch name, commit message, title, prose content), set up imports and logging infrastructure. This phase establishes the foundation for all subsequent implementation."
parallel: false

- id: "phase-2-file-creation"
name: "Content & File Creation"
description: "Implement hard-coded content definition and file writing using pathlib.Path.write_text(). Ensure UTF-8 encoding without BOM and Unix LF line endings are produced by the pathlib call on Unix systems."
parallel: false

- id: "phase-3-validation"
name: "Validation Pipeline"
description: "Implement comprehensive validation functions using regex-based pattern matching and binary file inspection. Covers H1 heading format, blank line separator, sentence count, UTF-8 encoding, LF line endings, and file size. Each validator has clear, specific error messages enabling quick debugging."
parallel: false

- id: "phase-4-git-integration"
name: "Git Integration"
description: "Implement git operations using subprocess.run(): stage file with \"git add\", create commit with conventional message \"feat(206): ...\", and push to feature branch. Each operation uses fail-fast pattern with check=True to raise CalledProcessError on any failure."
parallel: false

- id: "phase-5-orchestration"
name: "Orchestration & End-to-End Testing"
description: "Wire together all components into a cohesive orchestration function that creates file, validates, performs git operations, and handles errors. Test end-to-end workflow and verify all success criteria are met."
parallel: false

filesToCreate:
- "src/sheep/features/feature_206_markdown_file_creation.py"

filesToModify: []

openQuestions: []

content: |
## Architecture Overview

Feature 206 follows the single-module architecture established in features 200-205, implementing
a focused, straightforward workflow for creating a test markdown file with hard-coded content.

**Module Design:**
Single module `src/sheep/features/feature_206_markdown_file_creation.py` contains:
- Module constants (FILENAME, FEATURE_NUMBER, BRANCH_NAME, COMMIT_MESSAGE, TITLE_TEXT, PROSE_CONTENT)
- File creation function using pathlib.Path.write_text()
- Validation helper functions (regex patterns, individual validators)
- Comprehensive validation pipeline orchestrator
- Git operation functions (add, commit, push) using subprocess.run()
- Main orchestration function that wires everything together
- Script entry point for direct execution

**Validation Pipeline:**
The feature implements a fail-fast validation pipeline that stops on first error with specific,
actionable error messages. Validation order: (1) file exists, (2) H1 heading format, (3) blank
separator line, (4) prose sentence count (2-3), (5) UTF-8 encoding without BOM, (6) Unix LF
line endings only, (7) file size within 100-600 bytes. Each validator uses simple string
operations or regex patterns, with no external dependencies.
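
A minimal sketch of the fail-fast ordering, assuming hypothetical names (`ValidationError`, `check_exists`, `check_size`, `run_pipeline`) that do not come from the plan. Only the first and last of the seven checks are shown; the others would be added to the same list in the stated order.

```python
from pathlib import Path
from typing import Callable


class ValidationError(Exception):
    """Raised by the first validator that fails (hypothetical name)."""


def check_exists(path: Path) -> None:
    if not path.is_file():
        raise ValidationError(f"File not found: {path}")


def check_size(path: Path, minimum: int = 100, maximum: int = 600) -> None:
    size = path.stat().st_size
    expected = f"(expected {minimum}-{maximum})"
    if size < minimum:
        raise ValidationError(
            f"File size {size} bytes is below minimum {minimum} bytes {expected}"
        )
    if size > maximum:
        raise ValidationError(
            f"File size {size} bytes exceeds maximum {maximum} bytes {expected}"
        )


def run_pipeline(path: Path, validators: list[Callable[[Path], None]]) -> None:
    # Validators run in the given order; the first exception aborts the rest.
    for validator in validators:
        validator(path)
```

Because each validator raises instead of returning a status, the orchestrator needs no error bookkeeping: the first failure propagates with its own message and later checks never run.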

**Git Integration:**
Git operations use subprocess.run() with check=True for fail-fast behavior. Three sequential
operations: (1) `git add test-afcl8i.md`, (2) `git commit -m "feat(206): ..."`,
(3) `git push -u origin feat/206-...`. Commands use list form (not shell strings) to prevent
injection vulnerabilities. Fail-fast approach: any git error raises CalledProcessError with
stderr context.
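
The three sequential operations could be sketched as follows. The constant values are taken from this spec; the function names `build_git_commands` and `run_commands` are assumptions for illustration, as is passing `cwd` explicitly.

```python
import subprocess
from pathlib import Path

FILENAME = "test-afcl8i.md"
BRANCH_NAME = "feat/206-markdown-file-creation-f7d8d3"
COMMIT_MESSAGE = "feat(206): Create markdown file test-afcl8i.md"


def build_git_commands(filename: str, message: str, branch: str) -> list[list[str]]:
    # List form: each argument is passed to git verbatim, with no shell parsing.
    return [
        ["git", "add", filename],
        ["git", "commit", "-m", message],
        ["git", "push", "-u", "origin", branch],
    ]


def run_commands(commands: list[list[str]], cwd: Path) -> None:
    # check=True raises CalledProcessError on the first non-zero exit code,
    # so later commands are never started after a failure.
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True, capture_output=True, text=True)
```

A caller would run `run_commands(build_git_commands(FILENAME, COMMIT_MESSAGE, BRANCH_NAME), repo_root)`, where `repo_root` is the working tree path.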

**Logging & Observability:**
Structured logging via sheep.observability.logging.get_logger() for consistent integration
with Langfuse observability system. Log levels: info() for major workflow steps (file created,
validation passed), debug() for detailed operation results, error() for failures with full context.

**Content Strategy:**
Hard-coded, deterministic content (not API-based) ensures reproducibility across runs and
simplifies error handling. Prose content is embedded as a module constant for full transparency.
This eliminates the external dependencies and API latency of the feature 204 approach.

## Key Design Decisions

### 1. Hard-Coded Deterministic Content

**Chosen:** Hard-coded H1 title and 2-3 sentences of prose content (embedded as constants).

**Why:** Specification explicitly requires hard-coded content (NFR-9) for reproducibility across
runs. Feature 205 demonstrates this pattern is simpler and more maintainable than API-based
approaches (feature 204). Hard-coded content: (1) eliminates external dependencies, (2) removes
API call latency (2-3 seconds), (3) simplifies error handling (no LLM-specific edge cases),
(4) ensures identical execution on repeated runs, (5) aligns with straightforward nature of task.

### 2. File I/O: Direct pathlib.Path

**Chosen:** Use `pathlib.Path.write_text(content, encoding="utf-8")` for file creation.

**Why:** Features 204-205 establish pathlib as the standard for this series. Benefits:
(1) cross-platform compatibility (pathlib handles path separators), (2) explicit UTF-8 encoding
parameter, (3) LF line endings on Unix systems (our deployment target), where text-mode newline
translation leaves LF unchanged, (4) modern Python best practice (in the standard library since 3.4).
Post-creation validation checks the actual bytes to verify encoding and line endings, which also
catches a run on a platform whose default translation would write CRLF.

### 3. Regex-Based Markdown Validation

**Chosen:** Use re module (Python stdlib) for regex-based validation of H1 heading and paragraph
structure.

**Why:** Specification explicitly recommends regex validation for this use case. It is sufficient for
the simple H1+prose format with deterministic content, avoids external dependencies (NFR-8), and keeps
the implementation maintainable. Regex patterns: (1) are deterministic and testable with known inputs,
(2) enable specific error messages ("Expected H1 starting with '# '"), (3) require only stdlib,
(4) proved effective in feature 205.
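
As an illustration, the heading and separator checks might look like the sketch below. The pattern and the function name `validate_heading_and_separator` are assumptions; the pattern is deliberately stricter than CommonMark, which also allows up to three leading spaces and optional closing `#` characters.

```python
import re

# Exactly one '#', one space, then at least one non-space character.
H1_PATTERN = re.compile(r"^# \S.*$")


def validate_heading_and_separator(text: str) -> None:
    lines = text.split("\n")
    if not H1_PATTERN.match(lines[0]):
        raise ValueError(f"Expected H1 starting with '# ', got {lines[0]!r}")
    if len(lines) < 2 or lines[1] != "":
        raise ValueError("Expected a blank line after the H1 heading")
```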

### 4. Auto-Conversion to UTF-8/LF with Validation

**Chosen:** Write with UTF-8/LF via pathlib.Path.write_text(), then validate actual bytes; fail with
a specific error if the encoding or line endings are incorrect.

**Why:** Specification FR-4 requires "validate and convert if necessary". pathlib.Path.write_text()
on Unix produces UTF-8 without BOM and LF line endings natively, so no conversion step is needed
there. Post-creation validation uses a binary read to check the actual bytes. If validation fails,
log a specific error and stop (fail-fast). This ensures that a file which does not meet the
requirements is never committed, regardless of system defaults.
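
The byte-level check could be sketched as below. The function name and error wording are assumptions; the approach (binary read, BOM test, strict decode, CR search) follows the description above.

```python
import codecs
from pathlib import Path


def validate_encoding_and_newlines(path: Path) -> None:
    raw = path.read_bytes()  # binary read: no decoding, no newline translation
    if raw.startswith(codecs.BOM_UTF8):
        raise ValueError("File starts with a UTF-8 BOM (expected UTF-8 without BOM)")
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(f"File is not valid UTF-8: {exc}") from exc
    if b"\r" in raw:
        raise ValueError("File contains CR bytes (expected Unix LF line endings only)")
```

Searching for any CR byte rejects both CRLF and bare CR line endings.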

### 5. Fail-Fast Error Handling

**Chosen:** Stop immediately on first validation failure with specific, actionable error message.

**Why:** Specification NFR-4 requires error messages to be specific and actionable (include context,
show expected vs. actual). Fail-fast approach: (1) stops execution at first error, (2) enables quick
diagnosis and resolution without collecting multiple errors, (3) matches proven pattern from
features 200-205, (4) keeps validation logic simple and deterministic. Example error message:
"File size 750 bytes exceeds maximum 600 bytes (expected 100-600)".

### 6. subprocess.run() for Git Operations

**Chosen:** Use `subprocess.run()` with check=True for git add, commit, and push operations.

**Why:** Features 204-205 establish subprocess as the pattern. Benefits: (1) git commands are
directly visible in code (transparency), (2) error handling via CalledProcessError with stderr,
(3) check=True ensures fail-fast on any git failure, (4) only stdlib required (no external git
wrapper dependencies), (5) matches established pattern exactly. Commands use list form
(not shell strings) to prevent injection vulnerabilities.
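
One detail worth illustrating: `CalledProcessError.stderr` is only populated when the output is captured. The sketch below assumes a hypothetical wrapper `run_checked` that turns a failure into a single message carrying the command, exit code, and stderr text.

```python
import subprocess


def run_checked(command: list[str]) -> str:
    try:
        # capture_output=True is what makes exc.stderr available below.
        result = subprocess.run(command, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(
            f"Command {exc.cmd!r} failed with exit code {exc.returncode}: "
            f"{(exc.stderr or '').strip()}"
        ) from exc
    return result.stdout
```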

### 7. Commit Message: Feature Number Scope

**Chosen:** Use feature number in commit scope: `feat(206): Create markdown file test-afcl8i.md`

**Why:** Specification explicitly recommends feature number for clear traceability. Feature number
provides: (1) direct link to feature tracking system, (2) enables automated tooling to correlate
commits to feature specs, (3) maintains consistency with features 200-205+ numbering scheme,
(4) provides clear context in git log.

## Implementation Strategy

**Phase Ordering Rationale:**

1. **Phase 1 (Setup)** establishes module foundation: constants, imports, logging infrastructure.
All subsequent phases depend on these definitions being in place.

2. **Phase 2 (File Creation)** comes before validation because we must create the file before
validating it. Logical workflow: create → validate → commit.

3. **Phase 3 (Validation)** comes before git operations because we validate completely before
staging/committing. Fail-fast validation prevents pushing invalid files to git.

4. **Phase 4 (Git Integration)** only runs after all validation passes. Ensures file is correct
before staging, committing, and pushing.

5. **Phase 5 (Orchestration)** wires everything together and performs end-to-end testing. This
phase requires all previous phases to be complete and working.

**Task Granularity:**
Each task is focused and narrow, implementing a single piece of functionality with a clear TDD
cycle (RED-GREEN-REFACTOR). This enables: (1) independent testing of each component,
(2) early error detection, (3) easier debugging and troubleshooting, (4) clearer git history
with logical commits.

**Testing Strategy:**
Each code task follows TDD. For file operations, tests use the actual file system but clean up
after themselves. For git operations, tests use subprocess mocking or temporary test repositories
to avoid side effects. Validation functions are tested with deterministic inputs and expected
outputs.
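
A sketch of the subprocess-mocking option for git tests, using `unittest.mock` from the standard library. `stage_file` is a hypothetical stand-in for the module's git-add helper; the real function name and signature may differ.

```python
import subprocess
from unittest import mock


def stage_file(filename: str) -> None:
    """Hypothetical stand-in for the module's git-add helper."""
    subprocess.run(["git", "add", filename], check=True)


def test_stage_file_invokes_git_add() -> None:
    # Patching subprocess.run means no real git process is started.
    with mock.patch("subprocess.run") as fake_run:
        stage_file("test-afcl8i.md")
    fake_run.assert_called_once_with(["git", "add", "test-afcl8i.md"], check=True)


def test_stage_file_propagates_git_failure() -> None:
    failure = subprocess.CalledProcessError(returncode=1, cmd=["git", "add"])
    with mock.patch("subprocess.run", side_effect=failure):
        try:
            stage_file("test-afcl8i.md")
        except subprocess.CalledProcessError as exc:
            assert exc.returncode == 1
        else:
            raise AssertionError("expected CalledProcessError")
```

Both tests run without a git installation or a repository, which keeps them free of side effects.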

## Risk Mitigation

| Risk | Mitigation |
| ---- | ---------- |
| File encoding issues across platforms | Write with explicit `encoding="utf-8"` via pathlib. Post-creation, validate actual bytes using binary read. Fail with specific error if BOM or wrong encoding detected. |
| Git operations fail (network, permissions, missing branch) | Use subprocess.run() with check=True. CalledProcessError provides stderr context. Feature branch is pre-created (feat/206-...). A configured git identity (user.name and user.email) is a prerequisite. |
| Validation regex patterns don't match edge cases | Specification defines success criteria precisely. Test patterns with both valid and invalid inputs. Hard-coded content is deterministic, so file structure is predictable and easy to test. |
| File size outside bounds (100-600 bytes) | Validate file size after creation using `pathlib.Path.stat().st_size`. Hard-coded content size is deterministic. Write prose carefully to ensure total file size (with H1 + blank line) falls within bounds. Test with actual content. |
| Sentence count validation fails | Count periods in prose carefully. Validation function counts '.' characters. Write exactly 2-3 sentences with terminal periods. Test validation with actual prose content before committing. |
| Module import issues or missing dependencies | Use only Python 3.11+ stdlib: pathlib, subprocess, sys, re. Import sheep.observability.logging from established codebase. Test imports at module startup. |
| Hard-coded prose content doesn't fit requirements | Write prose with exactly 2-3 complete sentences. Each sentence ends with period. Verify prose is 60-120 words (fits size bounds). Test sentence count with validation function. |
| Feature branch naming mismatch | Branch is pre-created: `feat/206-markdown-file-creation-f7d8d3`. Use exact name from spec. Test push operation to verify it reaches correct remote branch. |
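
The period-counting check from the sentence-count row could be sketched as follows. The function names and the error wording are assumptions.

```python
def count_sentences(prose: str) -> int:
    # Counts every '.' character, as the plan describes.
    return prose.count(".")


def validate_sentence_count(prose: str, minimum: int = 2, maximum: int = 3) -> None:
    count = count_sentences(prose)
    if not minimum <= count <= maximum:
        raise ValueError(f"Expected {minimum}-{maximum} sentences, found {count}")
```

Because every period counts, prose containing a version number such as "3.11" or an abbreviation is reported with more sentences than it has, so the hard-coded prose should avoid periods that do not end a sentence.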