Model Vocabulary Registry

A workspace-level term registry that enforces naming consistency across the architecture. Similarity matching suggests canonical terms when naming graph nodes, edges, and properties, with confidence scoring and direct integration into the graph editor.

March 5, 2026

Naming inconsistency is a subtle but significant governance problem. When one team names an entity "CustomerOrder" and another names the same entity "ClientOrder" or just "Order", the result is confusion, duplicated entities, and broken traceability. The Model Vocabulary Registry provides a canonical term list for your workspace, with similarity matching that surfaces suggestions as you name entities, properties, and relationships in the graph editor.

How It Works

The vocabulary registry is stored as a JSON file (model-vocabulary.registry.json) in the workspace's governance directory. Each term is stored with multiple normalised forms to enable accurate matching regardless of casing convention.

Form	Example (for "SystemOwner")	Purpose
Original	SystemOwner	The canonical form as entered by the architect
Normalised	systemowner	Lowercase, trimmed, Unicode-normalised (NFC) for case-insensitive comparison
Compact	systemowner	Separators removed for matching across naming conventions
Tokens	["system", "owner"]	Split from camelCase, snake_case, kebab-case, etc. for partial matching
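
As an illustration of the four forms in the table, here is a minimal Python sketch of how they could be derived from a term. The function names (normalise, compact, tokenise, term_forms), the separator set, and the camelCase splitting rules are assumptions made for this example, not the registry's actual implementation.

```python
import re
import unicodedata

# Assumed separator set: whitespace, underscore, hyphen, dot.
SEPARATORS = r"[\s_\-.]+"


def normalise(term: str) -> str:
    """Lowercase, trim, and apply Unicode NFC normalisation."""
    return unicodedata.normalize("NFC", term).strip().lower()


def compact(term: str) -> str:
    """Normalised form with all separators removed."""
    return re.sub(SEPARATORS, "", normalise(term))


def tokenise(term: str) -> list:
    """Split camelCase, snake_case, kebab-case, etc. into lowercase tokens."""
    text = unicodedata.normalize("NFC", term).strip()
    # Break at lower-to-upper boundaries: "SystemOwner" -> "System Owner".
    text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    # Break between an acronym and the next word: "HTTPServer" -> "HTTP Server".
    text = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", text)
    return [part.lower() for part in re.split(SEPARATORS, text) if part]


def term_forms(term: str) -> dict:
    """Collect all four stored forms for one term."""
    return {
        "original": term,
        "normalised": normalise(term),
        "compact": compact(term),
        "tokens": tokenise(term),
    }


print(term_forms("SystemOwner"))
```

With this sketch, "SystemOwner", "system_owner", and "system-owner" all produce the same compact form and the same tokens, which is what allows matching across naming conventions.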

Similarity Matching

When you type a name in the graph editor (node label, edge label, or property name), the vocabulary suggestion overlay appears with matching terms. Each candidate term is ranked by a composite score, the weighted sum of the three components below.

Component	Weight	Method
Token Jaccard	50%	Jaccard similarity between the token sets of the input and the candidate term
Compact Substring	30%	Whether the compact form of the input is a substring of (or contains) the candidate's compact form
Plural/Singular Bonus	20%	Bonus score if the input differs from the candidate only by a plural/singular suffix
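
The weighted scoring in the table can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the substring and plural components are treated as all-or-nothing (1.0 or 0.0), plural detection only checks for a trailing "s" or "es", and an exact compact match is assumed to short-circuit to a score of 1.0. The real algorithm may differ on each of these points.

```python
import re

# Weights taken from the table above.
W_JACCARD, W_SUBSTRING, W_PLURAL = 0.5, 0.3, 0.2


def tokenise(name: str) -> list:
    """Split camelCase and separator-delimited names into lowercase tokens."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name.strip())
    return [t.lower() for t in re.split(r"[\s_\-.]+", spaced) if t]


def compact(name: str) -> str:
    """Lowercase form with separators removed."""
    return "".join(tokenise(name))


def jaccard(a: list, b: list) -> float:
    """Size of the token intersection divided by size of the token union."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def differs_by_plural(a: str, b: str) -> bool:
    """Assumption: plural means one form is the other plus 's' or 'es'."""
    shorter, longer = sorted((a, b), key=len)
    return longer in (shorter + "s", shorter + "es")


def score(input_name: str, candidate: str) -> float:
    c_in, c_cand = compact(input_name), compact(candidate)
    if not c_in or not c_cand:
        return 0.0
    if c_in == c_cand:
        return 1.0  # Assumption: identical compact forms count as exact.
    token_part = jaccard(tokenise(input_name), tokenise(candidate))
    substring_part = 1.0 if (c_in in c_cand or c_cand in c_in) else 0.0
    plural_part = 1.0 if differs_by_plural(c_in, c_cand) else 0.0
    return (W_JACCARD * token_part
            + W_SUBSTRING * substring_part
            + W_PLURAL * plural_part)


for name in ("system_owner", "SystemOwners", "Owner", "Invoice"):
    print(name, round(score(name, "SystemOwner"), 2))
```

Under these assumptions, "CustomerOrder" scored against "Order" gets half of the Jaccard weight (one shared token out of two) plus the full substring weight, for a total of about 0.55.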

Confidence Categories

Match results are categorised by confidence level, each displayed with a distinct visual badge in the suggestion overlay.

Category	Score	Meaning
Exact	1.0	The compact forms match exactly, so the input is the canonical term or differs from it only in casing or separators.
High	0.8 to below 1.0	Very likely the same concept with a minor naming variation
Possible	0.6 to below 0.8	May be related; worth reviewing to decide whether the name should use the canonical term
Low	Below 0.6	Weak match, probably a different concept
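
The thresholds in the table map onto a simple cascade of comparisons. The function name confidence_category below is hypothetical; only the boundary values come from the table.

```python
def confidence_category(score: float) -> str:
    """Map a composite score in [0, 1] to a confidence category."""
    if score >= 1.0:
        return "Exact"
    if score >= 0.8:
        return "High"
    if score >= 0.6:
        return "Possible"
    return "Low"


for value in (1.0, 0.85, 0.6, 0.55):
    print(value, confidence_category(value))
```

Checking the highest threshold first means each score lands in exactly one category, even though the table states the lower bounds as "or above".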

Graph Editor Integration

The vocabulary suggestion overlay integrates directly into the graph editor. When you type a node label, edge label, or property name, suggestions appear in a panel appended to the page body. The overlay opens only when you type, not when a cell receives focus, so arrow-key navigation in the editor grid keeps working as before.

You can accept a suggestion to use the canonical term, or ignore it if the name is intentionally different. You can also add new terms to the registry directly from the suggestion panel when you introduce a new concept.

Vocabulary Editor

The dedicated vocabulary editor (accessed via Governance > Model Vocabulary in the menu) provides a term list with a detail panel. You can add, edit, and remove terms, provide descriptions for each term, and optionally link a term to a content page that documents the concept in detail. Creation and modification dates are tracked for audit purposes.
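
To make the term metadata concrete, here is a hedged sketch of what a registry entry might look like when serialised to JSON. The field names (original, description, content_page, created_at, modified_at) and the top-level "terms" key are assumptions for illustration; the actual schema of model-vocabulary.registry.json is not specified here.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class VocabularyTerm:
    # All field names are hypothetical, chosen to mirror the editor's features.
    original: str
    description: str = ""
    content_page: Optional[str] = None  # optional link to a documentation page
    created_at: str = field(default_factory=_now)
    modified_at: str = field(default_factory=_now)

    def edit(self, description: Optional[str] = None,
             content_page: Optional[str] = None) -> None:
        """Apply an edit and refresh the modification date for auditing."""
        if description is not None:
            self.description = description
        if content_page is not None:
            self.content_page = content_page
        self.modified_at = _now()


def to_registry_json(terms: list) -> str:
    """Serialise terms into a JSON document with an assumed 'terms' key."""
    return json.dumps({"terms": [asdict(t) for t in terms]}, indent=2)


term = VocabularyTerm("SystemOwner", description="Person accountable for a system")
term.edit(content_page="concepts/system-owner")
print(to_registry_json([term]))
```

Keeping the creation date immutable and refreshing only the modification date on each edit is one straightforward way to support the audit trail described above.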