Lineage and Traceability

Field-level lineage connects every schema field across all API types (REST, GraphQL, gRPC, AsyncAPI, Webhooks, MCP) back to model entities and database columns. Governance blocks link to model entities via governs edges. Together, this creates end-to-end traceability from risk documentation through API contracts to data storage.

March 5, 2026

Traceability is the ability to follow a data element from a governance decision, through API contracts, down to its storage in a database column or model property. In most organisations, these connections exist only in people's heads. NeoArc makes them explicit and machine-readable through two complementary mechanisms: field-level lineage on schemas and governs edges on content blocks. Together, they create an unbroken chain from risk documentation to data storage, across every API type the system supports.

The Shared Schema System

All API types in NeoArc use the same underlying schema system (.cf.schema.json files). This is a deliberate architectural decision, and it matters for governance: a schema defined once can be referenced by a REST endpoint, a GraphQL operation, a gRPC method, an AsyncAPI message, a webhook event payload, and an MCP tool input, and every one of those references shares the same lineage back to the model.

API Type	How Schemas Are Used	Schema Reference Field
REST API	Request bodies, response bodies, error responses, path/query parameters	requestBody.schemaRef, responses[].schemaRef
GraphQL	Return types, argument types for queries/mutations/subscriptions	returnTypeRef, arguments[].typeRef
gRPC	Proto messages for request/response (reuses shared schema with isProtoMessage flag)	requestSchemaRef, responseSchemaRef
AsyncAPI (Event-Driven)	Message payloads and message headers across supported protocols	payloadSchemaRef, headersSchemaRef
Webhooks	Event payloads for HTTP callbacks	payloadSchemaRef
MCP (AI Tools)	Tool input schemas, tool output schemas	inputSchemaRef, outputSchemaRef

Field-Level Lineage

Each schema field can declare its lineage: where the data originates. Lineage entries specify a source type and the coordinates of that source, creating maps-to edges in the Intent Graph.

Source Type	Coordinates	Intent Graph Edge
ERD Column	Diagram ID, Table ID, Column ID	maps-to edge from schema-field node to diagram-shape node
Graph Property	Graph Diagram ID, Node ID, Property ID	maps-to edge from schema-field node to graph-node property node
Model Entity	Model Entity ID	maps-to edge from schema-field node to model-entity node

These maps-to edges form the traceability chain. Combined with depends-on edges (schema to API endpoint), references edges (pages to schemas), and governs edges (content blocks to model entities), the Intent Graph provides a complete dependency map from governance documentation through API contracts to storage.
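As a rough illustration of how lineage entries could become maps-to edges, the sketch below converts per-field lineage declarations into edge records. The entry keys (sourceType, diagramId, and so on), the node id formats, and the function names are hypothetical and chosen for readability; they are not NeoArc's actual file format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    kind: str    # edge type, e.g. "maps-to"
    source: str  # schema-field node id
    target: str  # node id of the lineage source


def target_id(entry: dict) -> str:
    """Build a target node id from a lineage entry's coordinates (assumed format)."""
    source_type = entry["sourceType"]
    if source_type == "erd-column":
        return f"erd:{entry['diagramId']}/{entry['tableId']}/{entry['columnId']}"
    if source_type == "graph-property":
        return f"graph:{entry['graphDiagramId']}/{entry['nodeId']}/{entry['propertyId']}"
    if source_type == "model-entity":
        return f"model-entity:{entry['modelEntityId']}"
    raise ValueError(f"unknown lineage source type: {source_type!r}")


def lineage_to_edges(schema_id: str, fields: dict) -> list:
    """Emit one maps-to edge per lineage entry on each field."""
    edges = []
    for field_name, entries in fields.items():
        source = f"schema-field:{schema_id}.{field_name}"
        for entry in entries:
            edges.append(Edge("maps-to", source, target_id(entry)))
    return edges


# Hypothetical schema: one field with two lineage entries, one with none.
customer_fields = {
    "email": [
        {"sourceType": "erd-column", "diagramId": "d1",
         "tableId": "customers", "columnId": "email"},
        {"sourceType": "model-entity", "modelEntityId": "Customer"},
    ],
    "displayName": [],  # no lineage declared yet
}
edges = lineage_to_edges("customer", customer_fields)
```

In this sketch a field with no lineage entries simply contributes no edges, which is what a coverage report would later count as unmapped.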

Governance Blocks Link to the Model

Lineage is not just about API-to-database mappings. Content blocks throughout the workspace can reference model entities via governs edges. This creates a second axis of traceability: from governance documentation to the data model.

Block Type	Edge Type	Target	Example
Risk	governs	Model Entity, REST API, REST Endpoint	A risk block documenting PII exposure, linked to the Customer and PaymentMethod entities
Security Control	governs	Model Entity, Schema, REST Endpoint, REST API	An encryption-at-rest control linked to entities storing sensitive data
Compliance Requirement	governs	Model Entity	A GDPR data subject access right requirement linked to the User entity
Data Lifecycle	governs	Model Entity	A retention policy linked to the AuditLog and TransactionHistory entities
Threat Model	governs	Model Entity	A SQL injection threat model linked to entities exposed through search endpoints
NFR	governs	Model Entity	A response time SLA linked to the Order and Inventory entities
Assumption	governs	Model Entity	An assumption about data volumes linked to the Event entity
Constraint	governs	Schema	A field naming convention constraint linked to the API response schemas
Incident Response Plan	governs	REST Endpoint	An incident response procedure linked to the payment processing endpoints

The Complete Traceability Chain

Combining field-level lineage with governance block references creates a traceability chain that spans the entire architecture.
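One way to picture the chain is as a walk over typed edges. The sketch below starts at a governance block, follows governs edges to model entities, finds the schema fields that map to those entities, and then collects the storage targets and endpoints attached to them. The edge tuples, node ids, the field_to_schema lookup, and the direction assumed for depends-on edges (endpoint depends on schema) are all illustrative assumptions, not the Intent Graph's real representation.

```python
def trace_from_block(block_id, edges, field_to_schema):
    """Follow governs, maps-to and depends-on edges outward from one block."""
    # Entities the block governs.
    governed = {t for kind, s, t in edges if kind == "governs" and s == block_id}
    # Schema fields whose lineage points at a governed entity.
    fields = {s for kind, s, t in edges if kind == "maps-to" and t in governed}
    # Other lineage targets of those fields, e.g. database columns.
    storage = {t for kind, s, t in edges
               if kind == "maps-to" and s in fields and t not in governed}
    # Schemas that own the fields, and endpoints that depend on those schemas.
    schemas = {field_to_schema[f] for f in fields}
    endpoints = {s for kind, s, t in edges if kind == "depends-on" and t in schemas}
    return {
        "entities": sorted(governed),
        "fields": sorted(fields),
        "storage": sorted(storage),
        "schemas": sorted(schemas),
        "endpoints": sorted(endpoints),
    }


# Hypothetical graph: (edge kind, source node, target node).
edges = [
    ("governs", "risk:pii-exposure", "entity:Customer"),
    ("maps-to", "field:customer.email", "entity:Customer"),
    ("maps-to", "field:customer.email", "column:customers.email"),
    ("maps-to", "field:order.total", "entity:Order"),
    ("depends-on", "endpoint:GET /customers/{id}", "schema:customer"),
    ("depends-on", "endpoint:GET /orders/{id}", "schema:order"),
]
field_to_schema = {
    "field:customer.email": "schema:customer",
    "field:order.total": "schema:order",
}
chain = trace_from_block("risk:pii-exposure", edges, field_to_schema)
```

Here the PII risk block reaches the customers.email column and the customer endpoint, while the order schema stays out of the result because nothing links it to a governed entity.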

API-to-Database Coverage

The API-to-Database Coverage report measures how many schema fields have lineage mappings to database columns or model properties. This is a direct measure of traceability completeness.

Coverage Percentage

Overall and per-schema coverage are shown as radial bars. For example, a schema with 10 fields, 7 of them mapped, has 70% coverage. Unmapped fields are listed so they are easy to identify.
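The arithmetic behind the percentage can be sketched as follows. The input shape (field name mapped to a list of lineage entries) and the function names are assumptions, and the overall figure here is weighted by field count, which may differ from how the report aggregates schemas.

```python
def schema_coverage(fields):
    """Return (percent of fields with lineage, sorted list of unmapped fields)."""
    total = len(fields)
    unmapped = sorted(name for name, entries in fields.items() if not entries)
    mapped = total - len(unmapped)
    percent = 100.0 * mapped / total if total else 0.0
    return percent, unmapped


def overall_coverage(schemas):
    """Field-weighted coverage across all schemas (an assumed aggregation)."""
    total = sum(len(fields) for fields in schemas.values())
    mapped = sum(1 for fields in schemas.values()
                 for entries in fields.values() if entries)
    return 100.0 * mapped / total if total else 0.0


# Hypothetical schema: 10 fields, the first 7 carry a lineage entry.
order_fields = {
    f"field{i}": (["some-lineage-entry"] if i < 7 else [])
    for i in range(10)
}
percent, unmapped = schema_coverage(order_fields)
overall = overall_coverage({"order": order_fields, "empty": {"x": []}})
```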

Heatmap Visualisation

A matrix of API names against database table names, where each cell shows the count of maps-to edges. It reveals at a glance which APIs depend on which data stores.
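A matrix like this could be derived by counting maps-to edges per (API, table) pair. The sketch below assumes each edge has already been resolved to an API name and a table name; that input shape, the example names, and the function name are hypothetical.

```python
from collections import Counter


def heatmap(mappings):
    """mappings: iterable of (api_name, table_name), one pair per maps-to edge."""
    counts = Counter(mappings)
    apis = sorted({api for api, _ in counts})
    tables = sorted({table for _, table in counts})
    # Rows are APIs, columns are tables; missing pairs count as 0.
    matrix = [[counts[(api, table)] for table in tables] for api in apis]
    return apis, tables, matrix


mappings = [
    ("Orders API", "orders"),
    ("Orders API", "orders"),
    ("Orders API", "customers"),
    ("Billing API", "customers"),
]
apis, tables, matrix = heatmap(mappings)
```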

Per-Endpoint Coverage

Horizontal bars show coverage per endpoint, computed from the schemas that endpoint depends on. Endpoints using schemas with low lineage coverage are flagged.

Broken Lineage Detection

Lineage entries can break when the target they point to is renamed, deleted, or restructured. The Broken Lineage report validates every lineage entry in every schema against the Intent Graph.

Severity	Condition	Example
Error	The base node (table, graph node, or model entity) does not exist in the Intent Graph	A lineage entry references an ERD table that was deleted
Warning	The base node exists but the specific property or column does not	A lineage entry references a column that was renamed

Issues are grouped by schema with direct file navigation. Because schemas are shared across API types, a single broken lineage entry in a schema affects every API that references it, whether REST, GraphQL, gRPC, AsyncAPI, webhook, or MCP.
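The two severities in the table above can be expressed as a simple two-step check: first the base node, then the specific property or column. In the sketch below, the graph is modelled as a dictionary from base node id to the set of property or column ids it contains; that structure, the entry keys, and the message wording are assumptions made for illustration.

```python
def check_lineage(entries, graph):
    """graph: base node id -> set of its property/column ids."""
    issues = []
    for entry in entries:
        base = entry["base"]
        prop = entry.get("property")  # None for entity-level lineage
        if base not in graph:
            # Error: the table, graph node, or model entity is gone.
            issues.append(("error", entry["field"], f"{base} does not exist"))
        elif prop is not None and prop not in graph[base]:
            # Warning: the base node exists but the column/property does not.
            issues.append(("warning", entry["field"], f"{base} has no {prop}"))
    return issues


graph = {
    "table:customers": {"id", "email_address"},
    "entity:Customer": set(),
}
entries = [
    {"field": "customer.id", "base": "table:customers", "property": "id"},
    {"field": "customer.email", "base": "table:customers", "property": "email"},  # column renamed
    {"field": "customer.tier", "base": "table:loyalty", "property": "tier"},      # table deleted
    {"field": "customer", "base": "entity:Customer"},
]
issues = check_lineage(entries, graph)
```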

Orphan Element Detection

The inverse of lineage coverage: the Orphan Elements report identifies database columns, graph node properties, and model entity properties with zero incoming maps-to edges. These are data elements that exist in the data layer but are not referenced by any schema, meaning they have no traceability to any API surface.
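Conceptually, orphan detection is a set difference: all known data elements minus the targets of maps-to edges. The following sketch assumes edges are (kind, source, target) tuples and that data elements are identified by plain strings; both are illustrative choices.

```python
def find_orphans(data_elements, edges):
    """Return data elements with zero incoming maps-to edges."""
    referenced = {target for kind, _source, target in edges if kind == "maps-to"}
    return sorted(set(data_elements) - referenced)


data_elements = [
    "column:customers.email",
    "column:customers.legacy_code",
    "entity:Customer.loyaltyTier",
]
edges = [
    ("maps-to", "field:customer.email", "column:customers.email"),
    ("governs", "risk:pii-exposure", "entity:Customer"),  # not a maps-to edge
]
orphans = find_orphans(data_elements, edges)
```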

Cross-API Traceability

The Cross-API Analysis report extends traceability across API boundaries. It detects schemas shared between multiple APIs, overlapping endpoints, and naming convention inconsistencies. A heatmap visualisation shows the matrix of APIs against shared schemas, revealing architectural coupling between services.

This is particularly relevant for governance because shared schemas create implicit dependencies between teams and across protocol boundaries. A schema shared between a REST endpoint and a Kafka event consumer means changes to that schema affect both synchronous and asynchronous interfaces. The cross-API report makes these hidden dependencies visible.
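Shared-schema detection can be sketched as grouping schema references by schema and keeping those used by more than one API. The API names and schema file names below are made up, and the (api, schema) pair format is an assumption about how references might be collected.

```python
from collections import defaultdict


def shared_schemas(references):
    """references: iterable of (api_name, schema_ref). Returns schemas used by 2+ APIs."""
    users = defaultdict(set)
    for api, schema in references:
        users[schema].add(api)
    return {schema: sorted(apis) for schema, apis in users.items() if len(apis) > 1}


references = [
    ("Orders REST API", "order.cf.schema.json"),
    ("Order Events (AsyncAPI)", "order.cf.schema.json"),
    ("Billing gRPC", "invoice.cf.schema.json"),
]
shared = shared_schemas(references)
```

In this example a change to order.cf.schema.json would touch both a synchronous and an asynchronous interface, which is the kind of coupling the report is meant to surface.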

Model Coverage

Beyond API-level lineage, the Model Coverage report analyses which model entities are referenced by any architectural artefact (schemas, endpoints, views, documentation, governance blocks). Entities with zero references are flagged as potentially stale. High-impact entities (referenced by many artefacts across multiple API types) are identified as requiring closer governance attention, since changes to them have the widest blast radius.
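The stale and high-impact classifications could be approximated as below. The thresholds (min_refs and min_api_types) are invented for this sketch, since the source does not state what counts as "many artefacts", and the reference tuples are a hypothetical input shape.

```python
def classify_entities(entities, references, min_refs=3, min_api_types=2):
    """references: list of (entity, artefact_id, api_type or None).

    Thresholds are illustrative assumptions, not documented values.
    """
    stale, high_impact = [], []
    for entity in entities:
        refs = [(artefact, api_type)
                for ent, artefact, api_type in references if ent == entity]
        api_types = {api_type for _, api_type in refs if api_type is not None}
        if not refs:
            stale.append(entity)  # zero references: potentially stale
        elif len(refs) >= min_refs and len(api_types) >= min_api_types:
            high_impact.append(entity)  # widely referenced across API types
    return stale, high_impact


entities = ["Customer", "Order", "LegacyCoupon"]
references = [
    ("Customer", "schema:customer", "REST"),
    ("Customer", "schema:customer-event", "AsyncAPI"),
    ("Customer", "risk:pii-exposure", None),  # governance block, no API type
    ("Order", "schema:order", "REST"),
]
stale, high_impact = classify_entities(entities, references)
```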