NeoArc Studio

Data Engineering Tools

Document dbt models, Airflow DAGs, Terraform infrastructure, Spark jobs, and Kafka topics. Complete documentation for modern data engineering.

Modern data engineering relies on specialised tools for transformation, orchestration, infrastructure, processing, and event streaming. NeoArc's documentation tooling covers dbt, Airflow, Terraform, Spark, and Kafka.

dbt (data build tool)

Document dbt projects with NeoArc.

dbt Use Cases

Model Lineage
Document model lineage as a source-to-model-to-exposure flow
Column Descriptions
Describe each column with column-level documentation
Test Coverage
Map test coverage with test relationships
Transformation Logic
Document transformation logic with ADRs for complex decisions
Environment Promotion
Show environment promotion with deployment flow
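
The source-to-model-to-exposure flow mentioned above can be pictured as a small directed graph. The following is a minimal sketch in plain Python, not dbt's or NeoArc's API; the node names (`source.raw_orders`, `model.fct_orders`, and so on) and the `downstream` helper are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the nodes listed as its value.
LINEAGE = {
    "source.raw_orders": ["model.stg_orders"],
    "source.raw_customers": ["model.stg_customers"],
    "model.stg_orders": ["model.fct_orders"],
    "model.stg_customers": ["model.fct_orders"],
    "model.fct_orders": ["exposure.revenue_dashboard"],
}


def downstream(node, edges):
    """Return every node reachable from `node`, in breadth-first order."""
    seen = []
    queue = deque(edges.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(edges.get(current, []))
    return seen


if __name__ == "__main__":
    # Lists what is affected if the raw orders source changes.
    print(downstream("source.raw_orders", LINEAGE))
```

A lineage diagram answers the same question visually: which models and exposures are affected when a given source changes.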

Apache Airflow

Document Airflow orchestration.

Airflow Use Cases

DAG Dependencies
Document DAG dependencies with task relationships
Pipeline Flow
Show data pipeline flow with task sequence
Connection Inventory
Document connection inventory with external systems
SLA Requirements
Define SLA requirements with NFR blocks and timing targets
Failure Handling
Show failure handling with Failure Scenario blocks and retry logic
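
As a rough illustration of what task relationships, task sequence, and retry logic mean in practice, the sketch below uses only the Python standard library. It is not Airflow code: the task names, the `PIPELINE` mapping, and both helper functions are assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical task dependencies: each task maps to the tasks it waits on.
PIPELINE = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def task_sequence(dependencies):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_with_retries(task, retries):
    """Call `task`, allowing up to `retries` extra attempts after a failure."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted, surface the last error


if __name__ == "__main__":
    print(task_sequence(PIPELINE))
```

Documenting a DAG amounts to recording the same three things: which tasks depend on which, the order that results, and how many attempts a task gets before it is reported as failed.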

Terraform

Document infrastructure as code.

Terraform Use Cases

Infrastructure Topology
Document infrastructure topology with cloud icons
Module Dependencies
Show module dependencies with module relationships
State Management
Document state management with backend configuration
Variable Schemas
Define variable schemas with variable definitions
Deployment Workflow
Show deployment workflow with CI/CD integration

Apache Spark

Document Spark processing.

Apache Kafka

Document event streaming.

Kafka Use Cases

Cluster Architecture
Document cluster architecture with broker layout
Message Flow
Show message flow from producer to broker to consumer
Topic Schemas
Document topic schemas with Avro/JSON schemas
Consumer Groups
Map consumer groups with partition assignments
Delivery Guarantees
Document delivery guarantees, such as exactly-once semantics, with delivery guarantee decisions
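
To make the idea of mapping consumer groups to partition assignments concrete, here is a simplified round-robin sketch for a single topic. It is an assumption-laden illustration, not Kafka's client API: real Kafka assignors also handle multiple topics and rebalancing, and the consumer names and the `assign_round_robin` function are hypothetical.

```python
def assign_round_robin(partitions, consumers):
    """Spread partitions across consumers in turn, using sorted order for both."""
    members = sorted(consumers)
    if not members:
        raise ValueError("a consumer group needs at least one consumer")
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        # Partition i goes to consumer i modulo the group size.
        assignment[members[index % len(members)]].append(partition)
    return assignment


if __name__ == "__main__":
    # Six partitions shared by a two-member group.
    print(assign_round_robin(range(6), ["consumer-b", "consumer-a"]))
```

A documented partition assignment shows at a glance whether the group has more consumers than partitions, in which case some consumers sit idle.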

Data Governance Documentation

Document data governance across tools.

Concept | NeoArc Feature
Data catalogues | Graph Diagrams for catalogue structure
Data lineage | Graph Diagrams for source-to-target tracing
Classification | Graph Diagrams for sensitivity hierarchies
Policies | Constraint blocks for access rules
Data quality | NFR blocks for quality dimensions
Stewardship | Graph Diagrams for owner assignments

Next Steps

Databricks and Lakehouse
Data platform documentation
Cloud Platforms
AWS, Azure, and GCP documentation
Getting Started with Schemas
Introduction to authoring and organising schema definitions.