Lineage and Traceability
Field-level lineage connects every schema field across all API types (REST, GraphQL, gRPC, AsyncAPI, Webhooks, MCP) back to model entities and database columns. Governance blocks link to model entities via governs edges. Together, this creates end-to-end traceability from risk documentation through API contracts to data storage.
Traceability is the ability to follow a data element from a governance decision, through API contracts, down to its storage in a database column or model property. In most organisations, these connections exist only in people's heads. NeoArc makes them explicit and machine-readable through two complementary mechanisms: field-level lineage on schemas and governs edges on content blocks. Together, they create an unbroken chain from risk documentation to data storage, across every API type the system supports.
The Shared Schema System
All API types in NeoArc use the same underlying schema system (.cf.schema.json files). This is a deliberate architectural decision: because every protocol resolves to the same schema files, lineage is declared once per schema rather than once per API. A schema defined once can be referenced by a REST endpoint, a GraphQL operation, a gRPC method, an AsyncAPI message, a webhook event payload, and an MCP tool input, all sharing the same lineage back to the model.
| API Type | How Schemas Are Used | Schema Reference Field |
|---|---|---|
| REST API | Request bodies, response bodies, error responses, path/query parameters | requestBody.schemaRef, responses[].schemaRef |
| GraphQL | Return types, argument types for queries/mutations/subscriptions | returnTypeRef, arguments[].typeRef |
| gRPC | Proto messages for request/response (reuses shared schema with isProtoMessage flag) | requestSchemaRef, responseSchemaRef |
| AsyncAPI (Event-Driven) | Message payloads and message headers across supported protocols | payloadSchemaRef, headersSchemaRef |
| Webhooks | Event payloads for HTTP callbacks | payloadSchemaRef |
| MCP (AI Tools) | Tool input schemas, tool output schemas | inputSchemaRef, outputSchemaRef |
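As a rough illustration of what a shared schema system makes possible, the sketch below walks API definitions and collects every schema reference, so that two APIs of different types can be compared. The reference field names come from the table above; the shape of the surrounding definition documents (`rest_endpoint`, `mcp_tool`) and the helper `collect_schema_refs` are assumptions made for the example, not NeoArc's actual file format or API.

```python
# Reference field names are taken from the table above. The nested
# structure of the definitions below is a hypothetical stand-in.
SCHEMA_REF_KEYS = {
    "schemaRef", "returnTypeRef", "typeRef",
    "requestSchemaRef", "responseSchemaRef",
    "payloadSchemaRef", "headersSchemaRef",
    "inputSchemaRef", "outputSchemaRef",
}


def collect_schema_refs(node):
    """Recursively gather every schema reference in a nested definition."""
    refs = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in SCHEMA_REF_KEYS and isinstance(value, str):
                refs.add(value)
            else:
                refs |= collect_schema_refs(value)
    elif isinstance(node, list):
        for item in node:
            refs |= collect_schema_refs(item)
    return refs


# Hypothetical definitions for a REST endpoint and an MCP tool.
rest_endpoint = {
    "requestBody": {"schemaRef": "customer.cf.schema.json"},
    "responses": [
        {"status": 200, "schemaRef": "customer.cf.schema.json"},
        {"status": 400, "schemaRef": "error.cf.schema.json"},
    ],
}
mcp_tool = {
    "inputSchemaRef": "customer.cf.schema.json",
    "outputSchemaRef": "result.cf.schema.json",
}

# Schemas referenced by both APIs share a single set of lineage entries.
shared = collect_schema_refs(rest_endpoint) & collect_schema_refs(mcp_tool)
```

Here `shared` contains only `customer.cf.schema.json`, the one schema that both the REST endpoint and the MCP tool reference.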
Field-Level Lineage
Each schema field can declare its lineage: where the data originates. Lineage entries specify a source type and the coordinates of that source, creating maps-to edges in the Intent Graph.
| Source Type | Coordinates | Intent Graph Edge |
|---|---|---|
| ERD Column | Diagram ID, Table ID, Column ID | maps-to edge from schema-field node to diagram-shape node |
| Graph Property | Graph Diagram ID, Node ID, Property ID | maps-to edge from schema-field node to graph-node property node |
| Model Entity | Model Entity ID | maps-to edge from schema-field node to model-entity node |
These maps-to edges form the traceability chain. Combined with depends-on edges (schema to API endpoint), references edges (pages to schemas), and governs edges (content blocks to model entities), the Intent Graph provides a complete dependency map from governance documentation through API contracts to storage.
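The mapping from a lineage entry to a maps-to edge can be sketched as follows. This is a minimal illustration only: the `sourceType` values, the coordinate key names, and the node identifier format are all invented for the example, since the actual serialisation of lineage entries is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    kind: str
    source: str
    target: str


def lineage_to_edge(field_id, entry):
    """Turn one lineage entry into a maps-to edge (hypothetical key names)."""
    source_type = entry["sourceType"]
    if source_type == "erd-column":
        target = f"erd:{entry['diagramId']}/{entry['tableId']}/{entry['columnId']}"
    elif source_type == "graph-property":
        target = f"graph:{entry['graphDiagramId']}/{entry['nodeId']}/{entry['propertyId']}"
    elif source_type == "model-entity":
        target = f"model:{entry['modelEntityId']}"
    else:
        raise ValueError(f"unknown lineage source type: {source_type}")
    return Edge("maps-to", field_id, target)


edge = lineage_to_edge(
    "customer.email",
    {"sourceType": "erd-column", "diagramId": "d1",
     "tableId": "customers", "columnId": "email"},
)
```

Each of the three source types in the table yields the same edge kind; only the coordinates that identify the target differ.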
Governance Blocks Link to the Model
Lineage is not just about API-to-database mappings. Content blocks throughout the workspace can reference model entities via governs edges. This creates a second axis of traceability: from governance documentation to the data model.
| Block Type | Edge Type | Target | Example |
|---|---|---|---|
| Risk | governs | Model Entity, REST API, REST Endpoint | A risk block documenting PII exposure risk linked to the Customer and PaymentMethod entities |
| Security Control | governs | Model Entity, Schema, REST Endpoint, REST API | An encryption-at-rest control linked to entities storing sensitive data |
| Compliance Requirement | governs | Model Entity | A GDPR data subject access right requirement linked to the User entity |
| Data Lifecycle | governs | Model Entity | A retention policy linked to the AuditLog and TransactionHistory entities |
| Threat Model | governs | Model Entity | A SQL injection threat model linked to entities exposed through search endpoints |
| NFR | governs | Model Entity | A response time SLA linked to the Order and Inventory entities |
| Assumption | governs | Model Entity | An assumption about data volumes linked to the Event entity |
| Constraint | governs | Schema | A field naming convention constraint linked to the API response schemas |
| Incident Response Plan | governs | REST Endpoint | An incident response procedure linked to the payment processing endpoints |
The Complete Traceability Chain
Combining field-level lineage with governance block references creates a traceability chain that spans the entire architecture.
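One way to picture the chain is as a walk over edges, starting at a governance block and ending at storage and API endpoints. The sketch below assumes edges are stored as `(kind, source, target)` tuples, that a field identifier embeds its schema name before a `#`, and that depends-on edges run from schema to endpoint. All identifiers and these conventions are hypothetical.

```python
# Hypothetical edge list: (kind, source, target).
EDGES = [
    ("governs", "risk:pii-exposure", "entity:Customer"),
    ("maps-to", "field:customer.schema#email", "entity:Customer"),
    ("maps-to", "field:customer.schema#email", "column:customers.email"),
    ("maps-to", "field:order.schema#total", "entity:Order"),
    ("depends-on", "schema:customer.schema", "endpoint:GET /customers/{id}"),
]


def trace_from_block(block_id, edges):
    """Follow governs, maps-to and depends-on edges from a governance block."""
    # Entities the block governs.
    entities = {t for k, s, t in edges if k == "governs" and s == block_id}
    # Schema fields whose lineage points at those entities.
    fields = {s for k, s, t in edges if k == "maps-to" and t in entities}
    # Other lineage targets of the same fields, e.g. database columns.
    storage = {t for k, s, t in edges
               if k == "maps-to" and s in fields and t not in entities}
    # Assumed convention: "field:<schema>#<name>" belongs to "schema:<schema>".
    schemas = {"schema:" + f.split(":", 1)[1].split("#")[0] for f in fields}
    endpoints = {t for k, s, t in edges if k == "depends-on" and s in schemas}
    return {
        "entities": sorted(entities),
        "fields": sorted(fields),
        "storage": sorted(storage),
        "endpoints": sorted(endpoints),
    }


trace = trace_from_block("risk:pii-exposure", EDGES)
```

In this example the PII exposure risk reaches the `Customer` entity, the `email` field, the `customers.email` column, and the endpoint that uses the customer schema, while the unrelated `Order` lineage is left out.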
API-to-Database Coverage
The API-to-Database Coverage report measures how many schema fields have lineage mappings to database columns or model properties. This is a direct measure of traceability completeness.
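A coverage figure of this kind could be computed as the share of schema fields that carry at least one lineage entry. The sketch below assumes exactly that definition and a simple in-memory structure (`schema name -> field name -> list of lineage targets`); how the report itself weights or groups fields is not specified here.

```python
def lineage_coverage(schemas):
    """Fraction of schema fields with at least one lineage entry."""
    total = 0
    mapped = 0
    for fields in schemas.values():
        for lineage in fields.values():
            total += 1
            if lineage:
                mapped += 1
    # Returning 0.0 for an empty workspace is a choice made for this sketch.
    return mapped / total if total else 0.0


# Hypothetical schemas: field name -> lineage targets.
schemas = {
    "customer": {
        "id": ["erd:customers.id"],
        "email": ["erd:customers.email"],
        "displayName": [],  # computed field with no lineage
    },
    "order": {"total": ["model:Order"]},
}

coverage = lineage_coverage(schemas)  # 3 of 4 fields mapped
```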
Broken Lineage Detection
Lineage entries can break when the target they point to is renamed, deleted, or restructured. The Broken Lineage report validates every lineage entry in every schema against the Intent Graph.
| Severity | Condition | Example |
|---|---|---|
| Error | The base node (table, graph node, or model entity) does not exist in the Intent Graph | A lineage entry references an ERD table that was deleted |
| Warning | The base node exists but the specific property or column does not | A lineage entry references a column that was renamed |
Issues are grouped by schema with direct file navigation. Because schemas are shared across API types, a single broken lineage entry in a schema affects every API that references it, whether REST, GraphQL, gRPC, async, webhook, or MCP.
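The two severity levels in the table can be sketched as a simple check against a graph index. The index layout (`base node -> set of property or column names`) and the entry keys are assumptions for illustration.

```python
def check_lineage(entry, graph):
    """Return 'error', 'warning', or None for a single lineage entry."""
    base = entry["base"]
    prop = entry.get("property")
    if base not in graph:
        return "error"    # base node missing, e.g. a deleted table
    if prop is not None and prop not in graph[base]:
        return "warning"  # base exists, column or property does not
    return None


# Hypothetical index: base node -> known columns or properties.
graph = {
    "table:customers": {"id", "email_address"},
    "entity:Customer": set(),
}

entries = [
    {"field": "customer.email", "base": "table:customers", "property": "email"},
    {"field": "customer.tier", "base": "table:loyalty", "property": "tier"},
    {"field": "customer.id", "base": "table:customers", "property": "id"},
    {"field": "customer", "base": "entity:Customer"},
]

issues = {e["field"]: check_lineage(e, graph) for e in entries}
```

Here `customer.email` yields a warning because the column was renamed to `email_address`, and `customer.tier` yields an error because the `loyalty` table no longer exists.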
Orphan Element Detection
The Orphan Elements report is the inverse of lineage coverage: it identifies database columns, graph node properties, and model entity properties with zero incoming maps-to edges. These are data elements that exist in the data layer but are not referenced by any schema, meaning they have no traceability to any API surface.
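Conceptually, orphan detection is a set difference between all known data elements and the targets of maps-to edges. The following sketch assumes edges are `(kind, source, target)` tuples and uses invented identifiers.

```python
def find_orphans(data_elements, edges):
    """Data elements with no incoming maps-to edge."""
    mapped = {target for kind, _source, target in edges if kind == "maps-to"}
    return sorted(set(data_elements) - mapped)


# Hypothetical data-layer elements and Intent Graph edges.
data_elements = [
    "column:customers.id",
    "column:customers.email",
    "column:customers.legacy_flag",
]
edges = [
    ("maps-to", "field:customer#id", "column:customers.id"),
    ("maps-to", "field:customer#email", "column:customers.email"),
    ("governs", "risk:pii", "column:customers.legacy_flag"),  # not lineage
]

orphans = find_orphans(data_elements, edges)
```

Only `column:customers.legacy_flag` is reported: a governs edge pointing at it does not count, because orphan status depends on maps-to edges alone.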
Cross-API Traceability
The Cross-API Analysis report extends traceability across API boundaries. It detects schemas shared between multiple APIs, overlapping endpoints, and naming convention inconsistencies. A heatmap visualisation shows the matrix of APIs against shared schemas, revealing architectural coupling between services.
This is particularly relevant for governance because shared schemas create implicit dependencies between teams and across protocol boundaries. When a schema is shared between a REST endpoint and a Kafka event consumer, any change to that schema affects both the synchronous and the asynchronous interface. The cross-API report makes these hidden dependencies visible.
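The data behind an API-by-schema matrix can be approximated by inverting a map of APIs to the schemas they reference. This sketch covers only the shared-schema part of the analysis (not overlapping endpoints or naming checks), and the API and schema names are invented.

```python
from collections import defaultdict


def shared_schema_matrix(api_schemas):
    """Map each schema used by more than one API to the APIs that use it."""
    usage = defaultdict(set)
    for api, schemas in api_schemas.items():
        for schema in schemas:
            usage[schema].add(api)
    return {schema: sorted(apis) for schema, apis in usage.items() if len(apis) > 1}


# Hypothetical APIs and the schemas each one references.
api_schemas = {
    "orders-rest": {"order", "error"},
    "orders-events": {"order"},
    "billing-grpc": {"invoice"},
}

shared = shared_schema_matrix(api_schemas)
```

The `order` schema couples the REST API to the event-driven API, so a change to it has to be reviewed against both.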
Model Coverage
Beyond API-level lineage, the Model Coverage report analyses which model entities are referenced by any architectural artefact (schemas, endpoints, views, documentation, governance blocks). Entities with zero references are flagged as potentially stale. High-impact entities (referenced by many artefacts across multiple API types) are identified as requiring closer governance attention, since changes to them have the widest blast radius.
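A classification along these lines could look like the sketch below. The reference structure (`entity -> list of (artefact, API type or None)`) and the threshold of three references are assumptions chosen for the example; the actual criteria the report uses for "many artefacts" are not stated here.

```python
def classify_entities(entities, references, high_impact_threshold=3):
    """Label each entity as 'stale', 'high-impact', or 'ok'."""
    report = {}
    for entity in entities:
        refs = references.get(entity, [])
        # Governance blocks and pages carry no API type (None).
        api_types = {api_type for _artefact, api_type in refs if api_type}
        if not refs:
            report[entity] = "stale"
        elif len(refs) >= high_impact_threshold and len(api_types) > 1:
            report[entity] = "high-impact"
        else:
            report[entity] = "ok"
    return report


# Hypothetical references: entity -> (artefact id, API type or None).
references = {
    "Order": [
        ("schema:order", "rest"),
        ("schema:order-event", "asyncapi"),
        ("risk:order-sla", None),
    ],
    "Customer": [("schema:customer", "rest")],
}

report = classify_entities(["Order", "Customer", "LegacyFlag"], references)
```

`LegacyFlag` has no references and is flagged as potentially stale, while `Order` is referenced by three artefacts across two API types and is marked high-impact.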