
Quiz: Data Quality and Data Management

Test your understanding of data quality dimensions, data governance roles, metadata, data lineage, master data management, and access control with these review questions.


1. Which definition most accurately captures "data quality" in the context of IT management graph data?

  A. Data that has been validated against a formal JSON Schema and contains no syntax errors
  B. Data that is stored in a normalized relational schema with referential integrity constraints enforced
  C. Data that is fit for the specific purpose for which it is being used, measured across multiple quality dimensions
  D. Data that has been reviewed and approved by an executive data owner within the past 90 days

The correct answer is C. Data quality is fundamentally about fitness for purpose: data that is good enough for one use case may be completely unfit for another. A hostname field that is populated for only 80% of servers may be sufficient for a basic asset inventory but useless for automated DNS configuration. The DMBOK framework treats quality not as a binary pass/fail but as a multi-dimensional assessment against the specific needs of the consuming process or decision.
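
The fitness-for-purpose idea can be sketched in a few lines. This is only an illustration: the server records, the use-case names, and the 0.75 and 1.0 thresholds are all hypothetical and do not come from DMBOK.

```python
# Hypothetical server records; None marks an unpopulated hostname.
servers = [
    {"id": "srv-1", "hostname": "web01"},
    {"id": "srv-2", "hostname": "web02"},
    {"id": "srv-3", "hostname": None},
    {"id": "srv-4", "hostname": "db01"},
    {"id": "srv-5", "hostname": "db02"},
]

# Illustrative thresholds: each consuming process sets its own bar.
REQUIRED_COMPLETENESS = {
    "asset_inventory": 0.75,
    "dns_automation": 1.0,
}


def completeness(records, field):
    """Fraction of records with a non-empty value for `field`."""
    if not records:
        return 0.0
    populated = sum(1 for record in records if record.get(field))
    return populated / len(records)


def fit_for(use_case, records, field):
    """True when completeness meets the threshold assumed for the use case."""
    return completeness(records, field) >= REQUIRED_COMPLETENESS[use_case]


hostname_score = completeness(servers, "hostname")  # 4 of 5 populated
inventory_ok = fit_for("asset_inventory", servers, "hostname")
dns_ok = fit_for("dns_automation", servers, "hostname")
```

With four of five hostnames populated, the same dataset passes the inventory threshold and fails the DNS automation threshold, which is the point of measuring quality against a specific use.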

Concept Tested: Data Quality


2. An IT management graph contains a server node whose last_discovered timestamp is 14 months old. Which data quality dimension is most directly violated?

  A. Accuracy: the stored values do not match the current real-world state of the server
  B. Completeness: the node is missing required properties that have not yet been populated
  C. Timeliness: the data is not current enough to reflect the present state of the managed environment
  D. Validity: the timestamp format does not conform to the ISO 8601 standard required by the schema

The correct answer is C. Timeliness measures whether data is current enough to be useful for its intended purpose. A server record that has not been refreshed in 14 months is stale—it may reflect a configuration that no longer exists, miss newly installed software, or reference a server that has been decommissioned. Stale discovery data is one of the most common and damaging quality issues in CMDBs, because it causes incident responders to act on outdated topology information.
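
A staleness check of this kind might look like the sketch below. The 30-day freshness window, the node contents, and the fixed "now" value are assumptions chosen for illustration; a real program would set the window per node type.

```python
from datetime import datetime, timedelta, timezone

# Assumed freshness window; tune per node type in practice.
MAX_AGE = timedelta(days=30)


def is_stale(last_discovered, now, max_age=MAX_AGE):
    """Return True when the last discovery is older than `max_age`."""
    return now - last_discovered > max_age


# Fixed reference time so the example is reproducible.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)

# Hypothetical server node last seen 14 months before `now`.
node = {
    "hostname": "app-server-07",
    "last_discovered": datetime(2024, 4, 1, tzinfo=timezone.utc),
}

stale = is_stale(node["last_discovered"], now)
```

Here `stale` is `True`, so the node would be flagged for rediscovery or review rather than trusted during incident response.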

Concept Tested: Timeliness


3. The DMBOK framework organizes data management into 11 knowledge areas. What is its primary contribution to IT management graph programs?

  A. It provides a set of SQL query templates for validating data quality in relational CMDBs
  B. It defines a comprehensive vocabulary and structured approach for all aspects of data management, enabling organizations to assess maturity and prioritize improvement
  C. It mandates specific data retention periods and encryption standards that graph databases must implement to achieve compliance
  D. It specifies the exact graph schema that IT management platforms must use to represent configuration items and their relationships

The correct answer is B. The Data Management Body of Knowledge (DMBOK) provides a framework—not a prescriptive technical specification—that helps organizations understand the full scope of data management challenges. By covering areas from data governance and quality to metadata management and data security, it gives IT management graph programs a structured language for assessing what they do well and where critical gaps exist. Without this framework, teams often address symptoms rather than root causes of data quality problems.

Concept Tested: DMBOK


4. A configuration item has all required fields populated and passes schema validation, but its owner field references a team that was dissolved two years ago. Which quality dimension failure does this best illustrate?

  A. Completeness: a required field should be empty rather than contain an invalid reference
  B. Timeliness: the record has not been updated since the team dissolution event occurred
  C. Consistency: the record conflicts with the organizational data that shows the team no longer exists
  D. Accuracy: the stored value does not reflect reality, since the team no longer exists as a valid owner

The correct answer is D. Accuracy measures whether data values correctly represent the real-world entity they describe. An owner field pointing to a dissolved team is factually wrong: the configuration item has no valid owner in the current organization. While the record passes completeness and schema validation checks, it fails the accuracy dimension. This example illustrates why schema validation alone cannot ensure data quality: some inaccuracies can only be detected by cross-referencing against external authoritative sources such as HR systems.
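
A cross-reference check against an authoritative source could be sketched as follows. The team names, the configuration item IDs, and the idea that the HR system exports a simple set of active teams are all hypothetical.

```python
# Hypothetical snapshot of active teams exported from an HR system.
active_teams = {"platform-eng", "payments", "sre"}

# Hypothetical configuration items; every owner field is populated.
configuration_items = [
    {"id": "ci-100", "owner": "payments"},
    {"id": "ci-101", "owner": "legacy-ops"},  # team dissolved two years ago
    {"id": "ci-102", "owner": "sre"},
]


def find_invalid_owners(items, valid_teams):
    """Return IDs of items whose owner is absent from the authoritative list."""
    return [ci["id"] for ci in items if ci["owner"] not in valid_teams]


invalid = find_invalid_owners(configuration_items, active_teams)
```

All three records would pass a required-field check, yet `invalid` contains `ci-101`, because only the comparison with the external list exposes the inaccurate owner.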

Concept Tested: Accuracy


5. What is the distinction between a Data Owner and a Data Steward in a data governance model?

  A. A Data Owner manages physical storage infrastructure, while a Data Steward manages logical data models and schemas
  B. A Data Owner holds strategic accountability for data as a business asset, while a Data Steward handles the day-to-day operational tasks of maintaining quality, resolving issues, and enforcing standards
  C. A Data Owner creates and deletes data records, while a Data Steward only reads data to generate quality reports
  D. A Data Owner is an IT role responsible for database administration, while a Data Steward is a business role responsible for defining data requirements

The correct answer is B. In data governance, ownership and stewardship operate at different levels. A Data Owner is a senior business stakeholder who is accountable for a data domain—they make strategic decisions about what data should exist, how it should be used, and who should have access. A Data Steward operates daily within those boundaries: profiling data, resolving quality issues, enforcing naming standards, and coordinating with Data Custodians who manage the technical infrastructure. Both roles are necessary for effective governance.

Concept Tested: Data Owner / Data Steward


6. Data lineage tracking in an IT management graph provides which capability that a static data dictionary cannot?

  A. Data lineage defines the business rules and validation constraints that each data field must satisfy
  B. Data lineage shows where data originated, how it has been transformed, and how it flows between systems, enabling impact analysis when source data changes
  C. Data lineage enforces access control policies by recording which users have read or modified each data record
  D. Data lineage generates executive dashboards that aggregate data quality scores across all systems in the digital estate

The correct answer is B. Data lineage answers the "where did this come from?" question by tracing the path of data from its origin through all transformations and systems it passes through. In IT management graphs, knowing that server configuration data flows from a discovery scanner through an ETL pipeline into the CMDB is critical when the scanner changes its output format—lineage reveals every downstream consumer that will be affected. A static data dictionary documents what fields mean but cannot capture this dynamic flow and transformation history.
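
The impact analysis described above amounts to a graph traversal over lineage edges. The sketch below assumes a hypothetical lineage map; the system names after the CMDB (`impact_dashboard`, `incident_tool`) are invented for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the systems listed as its value.
lineage = {
    "discovery_scanner": ["etl_pipeline"],
    "etl_pipeline": ["cmdb"],
    "cmdb": ["impact_dashboard", "incident_tool"],
    "impact_dashboard": [],
    "incident_tool": [],
}


def downstream_consumers(graph, source):
    """Breadth-first walk returning every system reachable from `source`."""
    seen = []
    queue = deque(graph.get(source, []))
    while queue:
        system = queue.popleft()
        if system in seen:
            continue  # already visited via another path
        seen.append(system)
        queue.extend(graph.get(system, []))
    return seen


affected = downstream_consumers(lineage, "discovery_scanner")
```

If the scanner changes its output format, `affected` lists the ETL pipeline, the CMDB, and both downstream tools, which a data dictionary alone would not reveal.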

Concept Tested: Data Lineage


7. A large enterprise maintains a single "golden record" for each server in its CMDB. This practice is an example of which data management discipline?

  A. Data Catalog management, which registers all authoritative data sources in a searchable inventory
  B. Schema Validation, which ensures all server records conform to a defined structure before ingestion
  C. Master Data Management, which creates and maintains a single authoritative version of key entities shared across multiple systems
  D. Reference Data management, which standardizes code values used consistently across enterprise systems

The correct answer is C. Master Data Management (MDM) addresses the challenge of maintaining a single "golden record" for core business entities—servers, applications, business services, and personnel—that are referenced across many systems. When multiple discovery tools each report slightly different attributes for the same server, MDM defines the processes and rules for reconciling those versions into one authoritative record. Without MDM, the same server may appear as multiple conflicting nodes in the management graph, corrupting blast radius calculations and impact analysis.
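
One simple reconciliation rule is source-priority survivorship: for each field, keep the value from the most trusted source that supplies one. The sketch below assumes that rule, plus hypothetical tool names, a shared serial number as the match key, and an arbitrary trust ordering; real MDM programs typically define richer matching and survivorship rules.

```python
# Hypothetical reports from two discovery tools for the same server.
reports = [
    {"source": "scanner_a", "serial": "SN-42", "hostname": "web01",
     "cpu_count": 8, "os": None},
    {"source": "agent_b", "serial": "SN-42", "hostname": "WEB01.corp",
     "cpu_count": None, "os": "Ubuntu 22.04"},
]

# Assumed survivorship rule: earlier in the list means more trusted.
SOURCE_PRIORITY = ["agent_b", "scanner_a"]


def build_golden_record(records, priority):
    """Merge records, taking each field from the most trusted source with a value."""
    ranked = sorted(records, key=lambda r: priority.index(r["source"]))
    golden = {}
    for record in ranked:
        for field, value in record.items():
            if field == "source":
                continue  # provenance is not part of the merged entity
            if value is not None and field not in golden:
                golden[field] = value
    return golden


golden = build_golden_record(reports, SOURCE_PRIORITY)
```

The merged record takes the hostname and operating system from `agent_b` and fills the missing CPU count from `scanner_a`, so the graph holds one node for the server instead of two conflicting ones.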

Concept Tested: Master Data Management


8. Role-Based Access Control (RBAC) is applied to an IT management graph. What is the primary security benefit over giving all users read access to all nodes?

  A. RBAC improves query performance by restricting the nodes each user can access, reducing the graph traversal scope
  B. RBAC ensures that sensitive data, such as security vulnerability details or financial impact properties, is accessible only to authorized roles, enforcing least-privilege principles
  C. RBAC automatically encrypts data in nodes that contain personally identifiable information before storing them in the graph database
  D. RBAC generates audit trails for every graph query, enabling compliance teams to verify that no unauthorized queries were executed

The correct answer is B. RBAC enforces the principle of least privilege by ensuring users can only access the data their role requires. In an IT management graph, a help-desk analyst may need to see application topology but not the security vulnerability scores that could reveal exploitable weaknesses. A financial analyst may need revenue impact properties on business service nodes but should not access detailed network configuration data. RBAC prevents over-exposure of sensitive graph data without restricting legitimate operational use.
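
Property-level filtering by role can be sketched as below. The role names, property names, and grant table are hypothetical, and a production graph database would normally enforce this in its own security layer rather than in application code.

```python
# Hypothetical role-to-property grants; names are illustrative only.
ROLE_PERMISSIONS = {
    "help_desk": {"name", "depends_on"},
    "security_analyst": {"name", "depends_on", "vulnerability_score"},
    "financial_analyst": {"name", "revenue_impact"},
}


def visible_properties(node, role):
    """Return only the properties granted to `role`; unknown roles see nothing."""
    allowed = ROLE_PERMISSIONS.get(role, set())
    return {key: value for key, value in node.items() if key in allowed}


# Hypothetical business service node with mixed-sensitivity properties.
service_node = {
    "name": "checkout-service",
    "depends_on": ["payments-db"],
    "vulnerability_score": 9.1,
    "revenue_impact": 250000,
}

help_desk_view = visible_properties(service_node, "help_desk")
finance_view = visible_properties(service_node, "financial_analyst")
```

The help-desk view keeps topology but omits the vulnerability score, and the finance view keeps revenue impact but omits dependencies. Defaulting unknown roles to an empty grant set follows the least-privilege principle.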

Concept Tested: RBAC / Security Model


9. A data catalog entry for an IT management graph node type includes field descriptions, acceptable value ranges, data type constraints, and example values. This entry is an example of which concept?

  A. Validation Rules: the catalog entry defines the constraints used to reject non-conforming records at ingestion time
  B. Reference Data: the catalog entry provides the code lists and enumerated values used in the graph's property fields
  C. Metadata: the catalog entry is data about data, describing the structure, meaning, and constraints of the node type
  D. Data Lineage: the catalog entry traces the origin and transformation history of the node type's properties

The correct answer is C. Metadata is data that describes other data. A data catalog entry that documents field names, data types, acceptable ranges, and descriptions is technical and business metadata about the node type—not the node data itself. This metadata enables data consumers to understand what they are working with, data stewards to enforce standards, and automated tools to validate incoming records. Without metadata, IT management graph data becomes an opaque collection of properties whose meaning and quality cannot be assessed.

Concept Tested: Metadata / Data Catalog


10. JSON Schema validation is applied to incoming configuration item records before they are loaded into the IT management graph. Which combination of quality dimensions does this validation most directly enforce?

  A. Timeliness and Accuracy: schema validation confirms records are current and match real-world state
  B. Completeness and Validity: schema validation confirms required fields are present and values conform to defined types and constraints
  C. Consistency and Fitness for Purpose: schema validation ensures records align with data from other systems and meet operational needs
  D. Data Lineage and Accuracy: schema validation traces data origins and confirms values are factually correct

The correct answer is B. JSON Schema validation enforces structural rules: required properties must be present (Completeness) and values must match their defined data types, formats, and allowed value sets (Validity). For example, a schema can require that hostname is a non-empty string, cpu_count is a positive integer, and environment is one of ["prod", "staging", "dev"]. What schema validation cannot enforce is whether the values are factually accurate, current, or consistent with data from other sources—those dimensions require additional validation logic beyond schema conformance.
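
The three example rules can be sketched in plain Python to show which checks map to which dimension. This is a hand-written stand-in using only the standard library, not a JSON Schema implementation; the function name and error messages are hypothetical.

```python
ALLOWED_ENVIRONMENTS = ("prod", "staging", "dev")
REQUIRED_FIELDS = ("hostname", "cpu_count", "environment")


def validate_ci(record):
    """Return a list of rule violations; an empty list means the record conforms."""
    errors = []

    # Completeness: required properties must be present.
    for field in REQUIRED_FIELDS:
        if field not in record:
            errors.append(f"missing required field: {field}")

    # Validity: present values must match their type and constraints.
    if "hostname" in record:
        hostname = record["hostname"]
        if not (isinstance(hostname, str) and hostname):
            errors.append("hostname must be a non-empty string")

    if "cpu_count" in record:
        cpu = record["cpu_count"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not (isinstance(cpu, int) and not isinstance(cpu, bool) and cpu > 0):
            errors.append("cpu_count must be a positive integer")

    if "environment" in record:
        if record["environment"] not in ALLOWED_ENVIRONMENTS:
            errors.append("environment must be one of prod, staging, dev")

    return errors


good = validate_ci({"hostname": "web01", "cpu_count": 4, "environment": "prod"})
bad = validate_ci({"hostname": "", "cpu_count": 0})
```

The first record yields no errors, while the second yields one completeness violation and two validity violations. Neither result says anything about whether `web01` really has four CPUs, which is why accuracy needs separate checks.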

Concept Tested: Schema Validation / JSON Schema