1.1 Corpus taxonomy, filing, indexing

Every document in the project corpus has an unambiguous type (epic, investigation, feature spec, ADR, runbook, research doc, spike), is filed in a predictable location determined by that type, and is indexed for agent retrieval. An agent should be able to answer "what investigations exist for this subsystem?" or "what is the current feature spec for X?" without guessing at file locations or scanning raw directories.


Levels

Level 0

No taxonomy; document type cannot be determined from the filename alone; documents are scattered across ad-hoc locations; no index. An agent cannot reliably locate or classify documents.

Level 1

Informal conventions exist (naming patterns, rough folder structure) but are inconsistently applied; some document types are distinguishable, others are not; no index, so retrieval requires human navigation or broad search.

Level 2

An explicit type system is enforced: every document carries a type marker (frontmatter field, filename prefix, or designated folder per type); the filing structure is consistent and documented; the corpus is indexed and agent-queryable by type, project, and recency. Humans and agents can reliably find what they need.
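
A minimal sketch of what "agent-queryable by type, project, and recency" could look like, assuming a flat in-memory index. The `DocEntry` fields, the type slugs, the `query` helper, and the example paths are all hypothetical illustrations, not part of this criterion.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional

# Document types named in the criterion; the slug spellings are assumptions.
DOC_TYPES = {
    "epic", "investigation", "feature-spec", "adr",
    "runbook", "research", "spike",
}


@dataclass(frozen=True)
class DocEntry:
    """One row of a hypothetical corpus index."""
    path: str
    doc_type: str
    project: str
    updated: date


def query(
    index: Iterable[DocEntry],
    doc_type: Optional[str] = None,
    project: Optional[str] = None,
    since: Optional[date] = None,
) -> List[DocEntry]:
    """Return entries matching every given filter, most recently updated first."""
    if doc_type is not None and doc_type not in DOC_TYPES:
        raise ValueError(f"unknown document type: {doc_type}")
    hits = [
        e for e in index
        if (doc_type is None or e.doc_type == doc_type)
        and (project is None or e.project == project)
        and (since is None or e.updated >= since)
    ]
    return sorted(hits, key=lambda e: e.updated, reverse=True)


# Hypothetical index contents, for illustration only.
INDEX = [
    DocEntry("docs/investigation/auth-timeouts.md", "investigation", "auth", date(2024, 3, 2)),
    DocEntry("docs/investigation/auth-token-leak.md", "investigation", "auth", date(2024, 5, 11)),
    DocEntry("docs/feature-spec/billing-export.md", "feature-spec", "billing", date(2024, 4, 20)),
]
```

With this sketch, "what investigations exist for this subsystem?" becomes `query(INDEX, doc_type="investigation", project="auth")`, which returns the two auth investigations with the newer one first. A real index would more likely be generated from frontmatter or folder structure than written by hand.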

Level 3

Taxonomy evolves with use: new document types are extracted from observed patterns and formalised; stale or merged types are deprecated with migration; retrieval success rate is tracked; filing gaps (documents created outside the taxonomy) are auto-flagged and remediated.
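
The auto-flagging of filing gaps could be sketched as follows, assuming each type has one designated folder and each document's declared type is already known (for example, read from frontmatter). The `TYPE_FOLDERS` mapping, the function names, and the reason strings are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

# Hypothetical mapping from document type to its designated folder.
TYPE_FOLDERS = {
    "epic": "docs/epic",
    "investigation": "docs/investigation",
    "feature-spec": "docs/feature-spec",
    "adr": "docs/adr",
    "runbook": "docs/runbook",
    "research": "docs/research",
    "spike": "docs/spike",
}


def filing_gap(path: str, doc_type: Optional[str]) -> Optional[str]:
    """Return a reason if the document sits outside the taxonomy, else None."""
    if doc_type is None:
        return "missing type marker"
    folder = TYPE_FOLDERS.get(doc_type)
    if folder is None:
        return f"unknown type '{doc_type}'"
    # The document must live somewhere beneath its type's folder.
    if PurePosixPath(folder) not in PurePosixPath(path).parents:
        return f"expected under {folder}/"
    return None


def find_filing_gaps(docs: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Map each misfiled path to the reason it was flagged."""
    gaps = {}
    for path, doc_type in docs.items():
        reason = filing_gap(path, doc_type)
        if reason is not None:
            gaps[path] = reason
    return gaps
```

A check like this only covers the flagging half. Remediation (moving the file, adding the marker, or proposing a new type) would still need its own process, and an "unknown type" result may be a signal that the taxonomy itself needs a new entry.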


Recipes that advance this criterion