"""
Sample data
===========
Seed documents for each role on first launch. Delete `data/` and
`chroma_db/` to re-seed.
"""
from __future__ import annotations

import logging

from vector_store import VectorDocumentStore

logger = logging.getLogger(__name__)

FINANCE_DOCS = [
    {
        "title": "Quarterly Financial Report Q3 2024",
        "content": """
# Q3 2024 Financial Report
## Executive Summary
Q3 2024 revenue reached $5.2M, up 12% QoQ. Net profit grew 24% to $1.4M.
## Key Financial Metrics
- Revenue: $5.2M (up 12% QoQ)
- Operating Expenses: $3.1M (up 5% QoQ)
- Net Profit: $1.4M (up 24% QoQ)
- Cash Reserves: $8.7M
## Department Breakdown
- Sales: $3.2M (up 15%)
- Services: $1.5M (up 8%)
- Licensing: $0.5M (up 5%)
## Projections for Q4
Q4 revenue is projected at $5.8M with continued growth in Sales.
""",
    },
    {
        "title": "2025 Budget Planning Guidelines",
        "content": """
# Budget Planning Guidelines for 2025
## Overview
Process and requirements for departmental budget submissions for FY 2025.
## Timeline
- Budget templates distributed: October 1
- Initial submissions due: October 31
- Review meetings: November 7-18
- Final approvals: December 15
## Budget Constraints
- Total budget increase capped at 8% over 2024
- New headcount limited to critical roles only
- Capital expenditures require ROI analysis for amounts over $25,000
## Required Documentation
1. Completed budget template
2. Headcount justification form (if applicable)
3. Capital expenditure requests with ROI analysis
4. Revenue projections (for revenue-generating departments)
## Approval Process
All budgets require approval from the department head, finance director, and appropriate VP.
""",
    },
]

ENGINEERING_DOCS = [
    {
        "title": "System Architecture Overview",
        "content": """
# System Architecture Overview
## Core Components
1. Frontend Layer
   - React.js SPA
   - Responsive design using Material-UI
   - Client-side state management with Redux
2. API Gateway
   - AWS API Gateway
   - Authentication and request routing
   - Rate limiting and caching
3. Microservices
   - User Service (Node.js)
   - Product Service (Python / Django)
   - Order Service (Java / Spring Boot)
   - Notification Service (Go)
4. Data Layer
   - Primary database: PostgreSQL
   - Caching: Redis
   - Search: Elasticsearch
   - Data warehouse: Snowflake
5. Infrastructure
   - AWS Cloud (primary)
   - Kubernetes for container orchestration
   - CI/CD through GitHub Actions
   - Terraform for infrastructure as code
## Communication Patterns
- Synchronous: REST APIs, gRPC
- Asynchronous: Kafka for event streaming
## Security Measures
- JWT-based authentication
- HTTPS everywhere
- WAF for API endpoints
- Regular security audits
""",
    },
    {
        "title": "Development Workflow Guidelines",
        "content": """
# Development Workflow Guidelines
## Git Workflow
Modified Git Flow with branches:
- main: production code
- develop: integration branch for features
- feature/*: new features
- bugfix/*: bug fixes
- release/*: release candidates
- hotfix/*: production fixes
## Pull Request Process
1. Create a feature/bugfix branch from develop
2. Implement your changes with appropriate tests
3. Open a PR to develop with a clear description, linked issues, passing tests, and coverage met
4. Obtain at least two code reviews
5. Address review comments
6. Merge when approved and CI passes
## Coding Standards
- JavaScript: Airbnb style guide
- Python: PEP 8
- Java: Google Java Style
- Document public APIs
- Write unit tests for all new code
- Maintain minimum 80% code coverage
## Deployment Process
1. Changes merged to develop are auto-deployed to staging
2. QA performs testing in staging
3. Release branch created for production deployment
4. Final testing on pre-production
5. Release manager approves and merges to main
6. CI/CD pipeline deploys to production
""",
    },
]

ADMIN_DOCS = [
    {
        "title": "Admin Onboarding Runbook",
        "content": """
# Admin Onboarding Runbook
## New Admin Checklist
1. Create account with role=admin via the Admin tab
2. Verify access to all role-scoped collections (finance, engineering, admin)
3. Rotate default passwords for seeded accounts
4. Review audit log every Monday
## User Management
- New employee: add_user(username, password, role)
- Role change: delete account and recreate (keeps audit history clean)
- Departure: delete account; sessions auto-expire within the configured window
## Document Management
- Use the Admin tab to upload new documents to the correct role collection
- Document title must be unique within a role; re-uploading replaces in place
- Keep documents under ~8000 characters for best retrieval results
## Incident Response
- Account lockout clears automatically after 10 minutes
- To unlock immediately: edit data/users.json, set failed_attempts=0
- Rate limit spikes: review data/audit.jsonl (if Challenge 2 is implemented)
""",
    },
]


def add_sample_documents(doc_store: VectorDocumentStore) -> None:
    """Seed each role's collection with its sample documents."""
    for doc in FINANCE_DOCS:
        doc_store.add_document("finance", doc["title"], doc["content"])
    for doc in ENGINEERING_DOCS:
        doc_store.add_document("engineering", doc["title"], doc["content"])
    for doc in ADMIN_DOCS:
        doc_store.add_document("admin", doc["title"], doc["content"])
    logger.info("Seeded sample documents for all three roles")
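

# A minimal sketch, not part of the original module. The Admin Onboarding
# Runbook above says a document title must be unique within a role and that
# documents should stay under ~8000 characters. The check below applies those
# two rules to one seed list before it is handed to the store.
# ASSUMPTION: `validate_seed_docs` and `max_chars` are hypothetical names
# introduced for illustration; the store itself may enforce these differently.
def validate_seed_docs(docs: list[dict[str, str]], max_chars: int = 8000) -> list[str]:
    """Return human-readable problems found in one role's seed list."""
    problems: list[str] = []
    seen: set[str] = set()
    for doc in docs:
        title = doc["title"]
        # Re-uploading a title replaces the earlier document in place,
        # so a duplicate here would silently drop one seed entry.
        if title in seen:
            problems.append(f"duplicate title: {title}")
        seen.add(title)
        # Long documents are flagged only; nothing is truncated.
        if len(doc["content"]) > max_chars:
            problems.append(f"too long ({len(doc['content'])} chars): {title}")
    return problems


# Possible usage (also an assumption): run it once per list before seeding,
# e.g. `validate_seed_docs(FINANCE_DOCS)`, and log or raise on any problems.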