References: Semantic Layers for Data Lakes
- Semantic Layer - Wikipedia - Defines the semantic layer concept, its role in translating physical data into business terms, and its position between raw storage and end-user tools, directly matching this chapter's central topic.
- Data Governance - Wikipedia - Covers policies, standards, and stewardship practices that underpin the consistent metric definitions, naming standards, and business glossaries described throughout this chapter.
- Data Catalog - Wikipedia - Explains data catalog systems that enable table discovery, business glossary management, and source system mapping, all core semantic layer components covered in this chapter.
- Designing Data-Intensive Applications - Martin Kleppmann - O'Reilly Media - Chapters 2–3 cover data models and storage engines; Chapter 10 covers batch processing. Together they give foundational context for the data lake and lakehouse trade-offs discussed here.
- Fundamentals of Data Engineering - Joe Reis, Matt Housley - O'Reilly Media - Chapters 7–9 cover ingestion; queries, modeling, and transformation; and data serving, including semantic and metrics layers. They directly support this chapter's treatment of the data lake-to-semantic-layer stack.
- Extract, Transform, Load - Wikipedia - Describes ETL and ELT patterns that feed data lakes and lakehouses, providing background for this chapter's discussion of schema-on-write vs. schema-on-read architectures.
- Data Lineage - Wikipedia - Explains data lineage tracking that connects business metrics back to source columns, which is directly relevant to source system mapping and the semantic layer's role in grounding LLM context.
- OpenLineage Open Standard - OpenLineage Project - Defines the open standard for capturing and sharing data lineage metadata across pipelines, supporting this chapter's treatment of source system mappings and semantic consistency.
- DataHub Open Source Data Catalog - DataHub Project - Documents an open-source metadata platform for table discovery, business glossary management, and schema registry functions described in this chapter's discovery and naming sections.
- Master Data Management - Wikipedia - Covers MDM practices for maintaining authoritative entity definitions across systems, supporting this chapter's sections on vocabulary alignment and semantic consistency across source systems.