Feature Proposal: FERPA-Compliant Content Handling for U.S. Educational Deployments #537

ashutoshrana · 2026-05-08T20:54:30Z

Background

OpenMAIC is actively used in U.S. educational institutions. Once a deployment has students as users, even if they only interact with the classroom chat, it can fall within the scope of the Family Educational Rights and Privacy Act (FERPA, 20 U.S.C. § 1232g). FERPA's implementing regulations (34 CFR Part 99) generally require written consent before personally identifiable information from education records is disclosed to third parties, which can include AI/LLM providers; § 99.31 lists the exceptions under which consent is not required. There is currently no FERPA handling in the generation or chat pipeline.

I'd like to propose adding a lightweight FERPA compliance layer and am happy to implement it if maintainers are interested.

The Problem in the Current Codebase

Two entry points in the current pipeline can inadvertently transmit FERPA-protected data to LLM providers:

1. app/api/generate-classroom/route.ts — pdfContent field

The generation API accepts a pdfContent field alongside the lesson requirement. An educator generating a lesson from an uploaded PDF may accidentally upload a document containing student records (grade reports, disability accommodation files, class rosters with grades). That raw content flows directly into runClassroomGenerationJob() and eventually into LLM prompts via the generation pipeline. There is no scan step.

2. lib/orchestration/summarizers/conversation-summary.ts / message-converter.ts

Student chat messages are accumulated into state.messages and passed to LLM providers via convertMessagesToOpenAI(). If a student types a message containing their own student ID, disability status, or financial aid situation (common in an advising-style AI classroom), that PII flows into LLM API calls. FERPA's limits on disclosing personally identifiable information from education records may apply here, so it is safer to send the provider only what the task requires.

Proposed Implementation

A single optional middleware module — lib/ferpa/ferpa-filter.ts — with two functions:

```typescript
// Scans document text for FERPA-protected field patterns before LLM ingestion.
// Returns: { clean: string; redactions: RedactionRecord[] }
export function redactFerpaContent(text: string, options?: FerpaFilterOptions): FerpaResult

// Scans a chat message for student PII before including in LLM context.
// Configurable: warn-only | redact | block modes.
export function scanChatMessageForFerpaPii(message: string): FerpaScanResult
```
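To make the first signature concrete, here is a minimal sketch of how redactFerpaContent() might work. The field layouts of `RedactionRecord`, `FerpaFilterOptions`, and `FerpaResult` are assumptions (the proposal only names them), the `[REDACTED:<field>]` placeholder format is hypothetical, and only two of the proposed detectors are included.

```typescript
// Hypothetical types: the names come from the proposal, the fields are assumptions.
export interface RedactionRecord {
  field: string; // which detector fired, e.g. "ssn"
  index: number; // offset of the match in the original text
  length: number; // length of the matched text
}

export interface FerpaFilterOptions {
  // Builds the replacement text for a match; defaults to "[REDACTED:<field>]".
  placeholder?: (field: string) => string;
}

export interface FerpaResult {
  clean: string;
  redactions: RedactionRecord[];
}

// Two illustrative detectors only. Each pattern needs the "g" flag for the exec loop.
const DETECTORS: { field: string; pattern: RegExp }[] = [
  { field: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { field: "accommodation", pattern: /\b(?:IEP|504 plan|accommodation letter)\b/gi },
];

export function redactFerpaContent(text: string, options?: FerpaFilterOptions): FerpaResult {
  const placeholder = options?.placeholder ?? ((field: string) => `[REDACTED:${field}]`);

  // Pass 1: collect every match against the original text.
  const found: RedactionRecord[] = [];
  for (const { field, pattern } of DETECTORS) {
    pattern.lastIndex = 0; // shared regex objects keep state between calls
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(text)) !== null) {
      found.push({ field, index: m.index, length: m[0].length });
    }
  }
  found.sort((a, b) => a.index - b.index);

  // Pass 2: rebuild the text, skipping matches that overlap an earlier redaction.
  const redactions: RedactionRecord[] = [];
  let clean = "";
  let cursor = 0;
  for (const r of found) {
    if (r.index < cursor) continue;
    clean += text.slice(cursor, r.index) + placeholder(r.field);
    cursor = r.index + r.length;
    redactions.push(r);
  }
  clean += text.slice(cursor);
  return { clean, redactions };
}
```

Collecting matches first and rebuilding the string afterwards keeps every recorded offset relative to the original document, which matters if the redaction log is later used to show an educator where the flagged content was.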

Insertion points (minimal, non-breaking):

- app/api/generate-classroom/route.ts: run redactFerpaContent(body.pdfContent) before passing the content to runClassroomGenerationJob(). If redactions occur, log them and (optionally) surface a warning to the client.
- lib/orchestration/summarizers/message-converter.ts: run scanChatMessageForFerpaPii() on user message content in convertMessagesToOpenAI(). Warn-only mode is the default so that existing behavior does not break.
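The two insertion points could be wired roughly as follows. This is a sketch under assumptions, not the actual OpenMAIC code: the scanner is passed in as a parameter so the example stands alone, the `ChatMessage`, `ScanResult`, and `FerpaMode` shapes are hypothetical, and treating "block" as "drop the message from the LLM context" is one possible reading of that mode.

```typescript
// All names below are illustrative assumptions.
type FerpaMode = "warn" | "redact" | "block";

interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

interface ScanResult {
  hasPii: boolean;
  redacted: string; // the input text with detected spans replaced
}

type Scanner = (text: string) => ScanResult;

// Insertion point 1: sanitize pdfContent before it reaches the generation job.
function sanitizePdfContent(
  pdfContent: string | undefined,
  scan: Scanner,
  warnings: string[],
): string | undefined {
  if (pdfContent === undefined) return undefined;
  const result = scan(pdfContent);
  if (result.hasPii) {
    warnings.push("pdfContent contained possible student record data and was redacted");
  }
  return result.redacted;
}

// Insertion point 2: apply the configured mode to user messages only.
function filterMessages(
  messages: ChatMessage[],
  scan: Scanner,
  mode: FerpaMode,
  warnings: string[],
): ChatMessage[] {
  const out: ChatMessage[] = [];
  for (const msg of messages) {
    if (msg.role !== "user") {
      out.push(msg);
      continue;
    }
    const result = scan(msg.content);
    if (!result.hasPii) {
      out.push(msg);
      continue;
    }
    warnings.push("possible student PII in user message");
    if (mode === "warn") {
      out.push(msg); // message passes through unchanged
    } else if (mode === "redact") {
      out.push({ ...msg, content: result.redacted });
    }
    // mode === "block": the message is left out of the LLM context
  }
  return out;
}
```

In warn mode the returned messages are identical to the input, so callers that ignore the `warnings` array see no behavioral change.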

Both are opt-in via an environment variable (ENABLE_FERPA_FILTER=true), so deployments that leave it unset, including non-U.S. ones, skip the scan entirely and pay no more than a flag check.
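A minimal sketch of the opt-in gate, assuming the flag is compared against the exact string "true". The helper names `isFerpaFilterEnabled` and `maybeFilter` are hypothetical, and the environment is passed in as a parameter so the example runs without a Node.js process object.

```typescript
// Hypothetical helpers; only the variable name ENABLE_FERPA_FILTER comes from the proposal.
type Env = Record<string, string | undefined>;

function isFerpaFilterEnabled(env: Env): boolean {
  // Anything other than the exact string "true" leaves the filter off.
  return env["ENABLE_FERPA_FILTER"] === "true";
}

function maybeFilter(text: string, env: Env, filter: (t: string) => string): string {
  // When the flag is unset, the filter function is never called.
  return isFerpaFilterEnabled(env) ? filter(text) : text;
}
```

In the real route handler the caller would presumably pass `process.env` as the `env` argument.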

Detection Patterns

FERPA-protected fields to detect (regex + heuristic):

| Field | Pattern |
| --- | --- |
| Student ID / Banner ID | 8–9 digit numeric sequences in academic context |
| SSN | `\d{3}-\d{2}-\d{4}` |
| Grades in student record context | Letter grades adjacent to names in tabular format |
| Disability/accommodation status | Keyword patterns ("IEP", "504 plan", "accommodation letter") |
| Financial aid data | EFC values, SAI amounts, aid package details |
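The "in academic context" qualifier on the student ID row is what separates a regex from a heuristic: a bare 8–9 digit number could just as easily be an order number. One way to sketch it is to accept a digit run only when a context phrase appears shortly before it. The context phrases, the 40-character window, and the `Finding` shape are all assumptions chosen for illustration; the grade and financial aid rows need more than this and are left out.

```typescript
// Illustrative only: context phrases and window size are assumptions.
interface Finding {
  field: string;
  match: string;
  index: number;
}

// No "g" flag here, so test() carries no state between calls.
const ID_CONTEXT = /\b(?:student id|banner id|student number)\b/i;

function findStudentIds(text: string, windowSize: number = 40): Finding[] {
  const findings: Finding[] = [];
  const digits = /\b\d{8,9}\b/g; // fresh regex per call, so lastIndex starts at 0
  let m: RegExpExecArray | null;
  while ((m = digits.exec(text)) !== null) {
    // Look only at the text immediately before the number.
    const before = text.slice(Math.max(0, m.index - windowSize), m.index);
    if (ID_CONTEXT.test(before)) {
      findings.push({ field: "student_id", match: m[0], index: m.index });
    }
  }
  return findings;
}
```

With this sketch, "Student ID: 900123456" is flagged while "Order 12345678 shipped" is not. A stricter variant could also look at the text after the number, or at column headers when the input is tabular.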

Prior Art / Reference Implementation

I've implemented this pattern as an open-source Python package (haystack-ferpa-filter, PyPI v0.1.0) for Haystack AI pipelines, and have deployed a production version of this architecture in a FERPA-compliant enterprise RAG system. The TypeScript implementation proposed here would be a clean-room adaptation of the same pattern for OpenMAIC's Node.js/Next.js stack — not a port of the Python code.

Scope / What I'm NOT Proposing

- No changes to data storage, databases, or user authentication
- No mandatory compliance certification; this is a best-effort filter, not a legal guarantee
- No behavioral changes for non-U.S. deployments (fully opt-in)
- This is a content scanning utility, not a full FERPA compliance system

Questions for Maintainers

1. Is FERPA compliance within scope for OpenMAIC, or is it intentionally left to deployment-level configuration?
2. If in scope, would you prefer this as a standalone lib/ferpa/ module (as proposed) or as a plugin/skill under the skills/ directory?
3. Are there existing plans for a privacy/compliance layer I should be aware of before writing code?

Happy to open a draft PR with the lib/ferpa/ferpa-filter.ts module and the two insertion points for review before doing anything more.

Replies: 0 comments
