NoSQL ATAM Glossary of Terms
ACID Properties
A set of properties that guarantee database transactions are processed reliably through atomicity, consistency, isolation, and durability.
Example: PostgreSQL provides ACID compliance for all transactions.
Anti-Entropy
A distributed system process that ensures data consistency by periodically comparing and synchronizing data across replicas.
Example: Cassandra uses anti-entropy repair to maintain consistency across cluster nodes.
API Gateway
A service that acts as an entry point for client requests, providing routing, authentication, and request/response transformation.
Example: Kong API Gateway manages traffic between microservices and external clients.
Architectural Approaches
Methods or strategies used to achieve desired quality attributes in a software system's architecture.
Example: Load balancing is an architectural approach for improving system availability.
Architectural Decisions
Fundamental choices made about the structure, behavior, and interaction of system components.
Example: Choosing between monolithic and microservices architecture is an architectural decision.
ATAM (Architecture Tradeoff Analysis Method)
A systematic approach for evaluating software architectures by analyzing how architectural decisions support quality attributes.
Example: ATAM helps identify risks and tradeoffs in a proposed microservices architecture.
Atomicity
A transaction property ensuring that all operations within a transaction are completed successfully or none are applied.
Example: A bank transfer either completes both debit and credit operations or neither.
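Sketch: The bank-transfer example can be illustrated with Python's built-in sqlite3 module. This is a minimal sketch, not a production pattern; the `accounts` table, the `transfer` helper, and the account names are hypothetical and chosen only for illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                # Raising here undoes BOTH updates, not just the debit.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

# Hypothetical in-memory database with two accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

A transfer that would overdraw the source account returns False and leaves both balances exactly as they were.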
Authentication
The process of verifying the identity of a user or system component.
Example: Multi-factor authentication might require both a password and an SMS verification code.
Authorization
The process of determining what actions an authenticated user or system is permitted to perform.
Example: RBAC authorization grants different database access levels to users based on their roles.
Auto-Scaling
The automatic adjustment of computing resources based on current demand or predefined metrics.
Example: Kubernetes auto-scaling increases pod replicas when CPU usage exceeds 80%.
Auto-Sharding
The automatic partitioning of data across multiple database instances without manual intervention.
Example: MongoDB auto-sharding distributes collections across shards based on shard keys.
Availability
The degree to which a system is operational and accessible when required for use.
Example: 99.9% availability means the system is down for no more than 8.76 hours per year.
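Sketch: The downtime figure follows from simple arithmetic on an availability percentage. The helper name below is hypothetical, and the calculation assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

def max_downtime_hours(availability_percent):
    """Return the yearly downtime budget implied by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)
```

Under these assumptions, a 99.9% target allows roughly 8.76 hours of downtime per year, and a 99.99% target allows roughly 0.88 hours.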
BASE Properties
A consistency model emphasizing basically available, soft state, and eventual consistency over strict ACID properties.
Example: NoSQL databases often implement BASE properties for better scalability.
Batch Processing
A method of processing data in large volumes at scheduled intervals rather than in real-time.
Example: Daily ETL jobs process transaction data from the previous day.
Bloom Filters
A probabilistic data structure used to test whether an element is in a set, with possible false positives but no false negatives.
Example: Cassandra uses Bloom filters to reduce disk reads during key lookups.
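Sketch: A toy Bloom filter shows why false positives are possible but false negatives are not. This is an illustrative sketch only, not how Cassandra implements its filters; the class name, the bit-array size, and the use of SHA-256 to derive positions are assumptions made for the example.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))
```

Because `add` only ever sets bits, an item that was added always tests as possibly present; an item that was never added may still test as present if other items happened to set the same bits.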
BSON (Binary JSON)
A binary representation of JSON documents that supports additional data types and enables efficient storage and parsing.
Example: MongoDB stores documents in BSON format for better performance than plain JSON.
B-Trees
A self-balancing tree data structure that maintains sorted data and allows searches, insertions, and deletions in logarithmic time.
Example: Most relational databases use B-trees for indexing table data.
Bulk Synchronous Parallel (BSP)
A parallel computing model where computation proceeds in synchronized supersteps.
Example: Apache Giraph implements BSP for distributed graph processing.
Business Drivers
The key business needs, goals, and constraints that influence architectural decisions.
Example: Reducing operational costs is a business driver for cloud migration.
Caching
The temporary storage of frequently accessed data in a faster storage layer to improve response times.
Example: Redis caches database query results to reduce database load.
CAP Theorem
A principle stating that distributed systems can guarantee at most two of consistency, availability, and partition tolerance.
Example: During network partitions, systems must choose between consistency and availability.
Cassandra Query Language (CQL)
A query language for Apache Cassandra that provides a SQL-like interface for interacting with Cassandra databases.
Example: CQL SELECT statements retrieve data from Cassandra column families.
Cell Timestamps
Metadata associated with individual data cells that indicates when the data was written or last modified.
Example: Cassandra uses cell timestamps for conflict resolution in distributed writes.
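Sketch: Timestamp-based conflict resolution is often described as "last write wins." The function below is a simplified illustration under that assumption; the `merge_rows` name and the `(value, timestamp)` cell layout are hypothetical, and real systems also define tie-breaking rules that this sketch omits.

```python
def merge_rows(replica_a, replica_b):
    """Merge two replicas of a row; each cell is a (value, write_timestamp) pair.

    For every column, the cell with the newer timestamp is kept.
    On equal timestamps this sketch keeps replica_a's cell.
    """
    merged = dict(replica_a)
    for column, cell in replica_b.items():
        if column not in merged or cell[1] > merged[column][1]:
            merged[column] = cell
    return merged
```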
Cloud Computing
The delivery of computing services over the internet, including servers, storage, databases, and software.
Example: AWS provides cloud computing services through EC2 virtual machines.
Column Families
A data structure in column-family databases that groups related columns together.
Example: Cassandra organizes data into column families similar to tables in relational databases.
Column-Family Model
A data model that stores data in column families, where each row can have different columns.
Example: Apache HBase implements the column-family model for big data storage.
Column-Family Stores
Databases that organize data into column families, allowing flexible schemas and horizontal scaling.
Example: Cassandra and HBase are column-family stores designed for big data applications.
Columnar Compression
A data compression technique that compresses data column by column rather than row by row.
Example: Parquet files use columnar compression to reduce storage space for analytical queries.
Columnar Storage
A data storage method that stores data by columns rather than rows, optimizing for analytical queries.
Example: Amazon Redshift uses columnar storage for fast aggregation queries.
Community Detection
Graph algorithms that identify groups of nodes that are more densely connected to each other than to other nodes.
Example: Social networks use community detection to identify user groups with similar interests.
Compaction
A background process that merges and reorganizes data files to optimize storage and query performance.
Example: Cassandra compaction merges SSTables to eliminate deleted data and improve read performance.
Composite Keys
A primary key composed of multiple columns that together uniquely identify a row.
Example: A composite key might combine customer_id and order_date to uniquely identify orders.
Compound Indexes
Database indexes that include multiple columns to optimize queries with multiple search criteria.
Example: A compound index on (last_name, first_name) optimizes queries filtering by both fields.
Compression
The process of reducing data size through encoding techniques to save storage space and transfer time.
Example: LZ4 compression reduces the size of database backup files.
Conflict Resolution
Mechanisms for handling conflicting updates to the same data in distributed systems.
Example: Vector clocks detect concurrent updates in distributed databases so that the conflicting versions can be reconciled.
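Sketch: Comparing two vector clocks tells a system whether one update happened before another or whether the two are concurrent and need reconciliation. The `compare` function and the dictionary representation (node name to counter) are assumptions made for this illustration.

```python
def compare(a, b):
    """Compare vector clocks a and b (dicts mapping node name -> counter).

    Returns "before" if a happened before b, "after" if b happened before a,
    "equal" if they are identical, and "concurrent" if neither dominates.
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # a conflict that the application must resolve
```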
Consensus Algorithms
Distributed computing algorithms that enable nodes in a network to agree on a single value or decision.
Example: Raft consensus algorithm ensures consistency in distributed key-value stores.
Consistency
The property that all nodes in a distributed system see the same data at the same time.
Example: Strong consistency ensures that all reads return the most recent write.
Consistent Hashing
A distributed hashing scheme that minimizes data movement when nodes are added or removed.
Example: Cassandra uses consistent hashing to distribute data across cluster nodes.
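Sketch: A small hash ring illustrates why adding or removing a node moves only a fraction of the keys. This is a teaching sketch, not Cassandra's partitioner; the `HashRing` class, the MD5-based token function, and the default of 64 virtual nodes per server are assumptions.

```python
import bisect
import hashlib

def _token(key):
    # Map any string to a position on the ring.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several tokens so load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_token(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(t, n) for t, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First token clockwise from the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (_token(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

When a node is removed, only the keys it owned are reassigned; every other key keeps its owner.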
Container Orchestration
The automated deployment, management, and scaling of containerized applications.
Example: Kubernetes orchestrates Docker containers across multiple hosts.
Content Delivery Networks (CDN)
A distributed network of servers that deliver web content to users based on their geographic location.
Example: CloudFlare CDN caches static content closer to users for faster page loads.
Cost-Based Optimization
A query optimization technique that selects execution plans based on estimated resource costs.
Example: PostgreSQL's cost-based optimizer chooses between index scans and table scans.
CQRS (Command Query Responsibility Segregation)
An architectural pattern that separates read and write operations to optimize performance and scalability.
Example: E-commerce systems use CQRS to separate order processing from product catalog queries.
Cypher
A declarative query language for graph databases, particularly Neo4j.
Example: Cypher queries use pattern matching to find relationships between nodes.
Data Governance
The management of data availability, usability, integrity, and security across an organization.
Example: Data governance policies define who can access customer personal information.
Data Ingestion
The process of collecting and importing data from various sources into a storage system.
Example: Apache Kafka ingests streaming data from IoT devices into a data lake.
Data Lakes
Large-scale storage repositories that hold raw data in its native format until needed.
Example: AWS S3 serves as a data lake for storing unstructured log files and documents.
Data Lineage
The tracking of data flow from its origin through various transformations to its final destination.
Example: Data lineage tools show how customer data flows from CRM to analytics dashboards.
Data Locality
The principle of storing and processing data close to where it's needed to minimize transfer costs.
Example: Hadoop moves computation to data nodes rather than moving data to computation nodes.
Data Marts
Specialized subsets of data warehouses focused on specific business areas or departments.
Example: A sales data mart contains only sales-related data for the sales team's analytics.
Data Masking
The process of hiding sensitive data by replacing it with realistic but fictional data.
Example: Production databases use data masking to protect customer SSNs in test environments.
Data Modeling
The process of creating a conceptual representation of data structures and their relationships.
Example: Entity-relationship modeling defines how customer and order entities relate.
Data Pipelines
Automated workflows that move and transform data from source systems to destination systems.
Example: ETL pipelines extract data from databases, transform it, and load it into data warehouses.
Data Privacy
The protection of personal and sensitive information from unauthorized access or disclosure.
Example: GDPR compliance requires explicit consent for processing personal data.
Data Quality
The degree to which data is accurate, complete, consistent, and fit for its intended use.
Example: Data quality checks validate that email addresses follow proper format rules.
Data Warehousing
The process of collecting, storing, and managing data from multiple sources for business intelligence.
Example: Snowflake provides cloud data warehousing for analytical workloads.
Denormalization
The process of adding redundant data to improve query performance at the cost of storage space.
Example: Storing customer names in order tables eliminates joins but increases storage.
Dimension Tables
Tables in a star schema that contain descriptive attributes for analyzing fact data.
Example: A time dimension table contains date, month, quarter, and year attributes.
Distributed Systems
Computing systems where components are located on different networked computers that communicate through message passing.
Example: Google's search engine runs on a distributed system across multiple data centers.
Distributed Transactions
Transactions that span multiple databases or systems while maintaining ACID properties.
Example: Two-phase commit ensures atomicity across multiple database systems.
Document Collections
Groups of related documents stored together in document databases.
Example: MongoDB collections store user profiles as JSON documents.
Document Databases
NoSQL databases that store data as documents, typically in JSON or XML format.
Example: MongoDB stores product catalogs as flexible JSON documents.
Document Model
A data model that stores information as documents containing nested attributes and arrays.
Example: JSON documents can store entire customer profiles with embedded address information.
Document Type Definition (DTD)
A set of markup declarations that defines the structure and legal elements of XML documents.
Example: DTD validates that XML documents contain required elements in correct order.
Drill-Down
The process of navigating from summary data to more detailed data in analytical systems.
Example: Drilling down from quarterly sales to monthly sales to daily sales.
Durability
The guarantee that once a transaction is committed, it will survive system failures.
Example: Write-ahead logging ensures durability by recording changes before applying them.
Dynamic Columns
Columns that can be added to rows at runtime without modifying the database schema.
Example: Cassandra allows adding new columns to existing rows without schema changes.
Edge Computing
A distributed computing paradigm that brings computation and data storage closer to data sources.
Example: IoT sensors process data at the edge before sending summaries to the cloud.
ELT (Extract, Load, Transform)
A data integration process that loads raw data first, then transforms it within the destination system.
Example: Modern data lakes use ELT to load raw data and transform it for specific analyses.
Embedded Documents
Documents stored within other documents, creating nested data structures.
Example: Customer documents embed address documents rather than referencing separate address records.
Encryption at Rest
The protection of stored data through cryptographic algorithms.
Example: Database files are encrypted at rest using AES-256 encryption.
Encryption in Transit
The protection of data while it's being transmitted between systems.
Example: HTTPS encrypts data in transit between web browsers and servers.
Entity-Relationship Model
A data modeling technique that represents entities and their relationships in a database.
Example: ER diagrams show how customer entities relate to order entities.
ETL (Extract, Transform, Load)
A data integration process that extracts data from sources, transforms it, and loads it into destinations.
Example: Nightly ETL jobs extract sales data, calculate metrics, and load results into reporting tables.
Event Sourcing
An architectural pattern that stores all changes to application state as a sequence of events.
Example: Banking systems use event sourcing to maintain an audit trail of all account transactions.
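Sketch: In event sourcing, current state is derived by replaying the stored events in order. The event names (`deposited`, `withdrawn`) and the helper functions below are hypothetical and serve only to illustrate the idea.

```python
from functools import reduce

def apply_event(balance, event):
    """Apply one (kind, amount) event to a running balance."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event type: {kind}")

def current_balance(events):
    # State is never stored directly; it is recomputed from the event log.
    return reduce(apply_event, events, 0)

# The append-only log doubles as the audit trail.
events = [("deposited", 100), ("withdrawn", 30), ("deposited", 5)]
```

Replaying a prefix of the log, such as `events[:2]`, yields the balance as it stood at that earlier point.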
Eventual Consistency
A consistency model where the system will become consistent over time if no new updates are made.
Example: DNS updates eventually propagate to all servers worldwide.
Execution Plans
Detailed steps that database engines use to execute queries, including access methods and join strategies.
Example: Query execution plans show whether a query uses an index or falls back to a full table scan.
Fact Tables
Tables in a star schema that contain quantitative measures for business analysis.
Example: Sales fact tables contain revenue, quantity, and cost measures.
Failover
The automatic switching to a backup system when the primary system fails.
Example: Database failover redirects traffic to standby servers during primary server outages.
Findability
The ease with which information can be located and retrieved from a system.
Example: Search functionality and clear navigation improve content findability.
Fitness for Purpose
The degree to which a system meets its intended use and requirements.
Example: A real-time trading system's fitness for purpose depends on low-latency performance.
FLWOR Expressions
XQuery expressions that use For, Let, Where, Order by, and Return clauses to query XML data.
Example: FLWOR expressions extract and sort product information from XML catalogs.
Foreign Keys
Database constraints that link rows in one table to rows in another table.
Example: Order tables use customer_id foreign keys to reference customer tables.
Full-Text Search
The capability to search for text within documents or database fields.
Example: Elasticsearch provides full-text search across document collections.
Geo-Distribution
The deployment of system components across multiple geographic locations.
Example: Global CDNs use geo-distribution to serve content from nearby edge locations.
Geospatial Indexes
Specialized indexes for efficiently querying location-based data.
Example: MongoDB geospatial indexes enable finding nearby restaurants within a radius.
Gossip Protocol
A communication protocol where nodes periodically exchange state information with randomly selected peers.
Example: Cassandra uses gossip protocol for cluster membership and failure detection.
Graph Algorithms
Computational methods for analyzing graph structures and relationships.
Example: PageRank algorithm determines the importance of web pages based on link structure.
Graph Databases
NoSQL databases designed to store and query data with complex relationships.
Example: Neo4j stores social network connections as graph nodes and relationships.
Graph Embeddings
Vector representations of graph nodes that capture their structural properties.
Example: Node2Vec creates embeddings for recommendation systems based on user interaction graphs.
Graph Model
A data model that represents data as nodes connected by edges or relationships.
Example: Social networks use graph models to represent friend connections.
Graph Neural Networks (GNNs)
Neural network architectures designed to operate on graph-structured data.
Example: GNNs predict protein functions based on molecular interaction graphs.
Graph Partitioning
The division of large graphs into smaller subgraphs for distributed processing.
Example: Graph partitioning enables parallel processing of social network analysis.
Graph Traversal
The process of visiting nodes and edges in a graph following specific patterns.
Example: Breadth-first traversal finds the shortest path between two nodes in an unweighted graph.
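Sketch: Breadth-first traversal visits nodes in order of their distance from the start, so the first time it reaches the goal it has found a path with the fewest edges. The adjacency-list dictionary and the `shortest_path` name are assumptions for this illustration, which applies to unweighted graphs only.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Return a fewest-edges path from start to goal, or None if unreachable.

    `graph` maps each node to an iterable of its neighbors.
    """
    if start == goal:
        return [start]
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                if neighbor == goal:
                    # Walk parent links back to the start, then reverse.
                    path = [goal]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                queue.append(neighbor)
    return None
```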
GraphSAGE
A graph neural network framework for generating node embeddings through sampling and aggregation.
Example: GraphSAGE learns user preferences from social network interaction patterns.
Gremlin
A graph traversal language for querying graph databases.
Example: Gremlin queries find mutual friends between users in a social network.
GSQL (Graph SQL)
A query language for graph databases that extends SQL with graph-specific operations.
Example: GSQL queries analyze fraud patterns in financial transaction graphs.
Hadoop
An open-source framework for distributed storage and processing of big data.
Example: Hadoop MapReduce processes terabytes of log data across cluster nodes.
HBase
A distributed, column-family NoSQL database built on top of Hadoop.
Example: HBase stores time-series data for real-time analytics applications.
Health Checks
Monitoring mechanisms that verify system components are functioning correctly.
Example: Load balancers use health checks to route traffic only to healthy servers.
Hierarchical Data
Data organized in a tree-like structure with parent-child relationships.
Example: File systems organize files and directories in hierarchical structures.
High Availability
The characteristic of a system that remains operational with minimal downtime over extended periods, typically through redundancy and failover.
Example: Database clusters provide high availability through redundant nodes.
Hinted Handoff
A mechanism where nodes temporarily store data intended for unavailable nodes.
Example: Cassandra uses hinted handoff to ensure data delivery after node recovery.
Horizontal Scaling
The practice of adding more servers to handle increased load.
Example: Web applications achieve horizontal scaling by adding more web server instances.
HTML (HyperText Markup Language)
A markup language used to create web pages and web applications.
Example: HTML forms collect user input for database storage.
Hypercubes
Multidimensional data structures used in OLAP systems for fast analytical queries.
Example: Sales hypercubes allow analysis across product, time, and region dimensions.
IaaS (Infrastructure as a Service)
Cloud computing services that provide virtualized computing resources over the internet.
Example: AWS EC2 provides IaaS through virtual machine instances.
Indexing
The creation of data structures that improve query performance by providing fast access paths.
Example: B-tree indexes on customer_id columns speed up customer lookups.
Infrastructure as Code
The practice of managing infrastructure through code rather than manual configuration.
Example: Terraform scripts define cloud infrastructure as version-controlled code.
Interoperability
The ability of different systems to work together and exchange information.
Example: RESTful APIs enable interoperability between different software systems.
IoT (Internet of Things)
A network of physical devices embedded with sensors and software for data collection and exchange.
Example: Smart thermostats collect temperature data and adjust settings automatically.
ISO 11179 Term Definition
A term definition is considered to be consistent with ISO metadata registry guideline 11179 if it meets the following criteria:
- Precise
- Concise
- Distinct
- Non-circular
- Unencumbered with business rules
Example: We use ISO term definition guidelines for creating definitions for our glossary of terms.
Isolation
The guarantee that concurrent transactions do not interfere with each other.
Example: Database isolation prevents one user's changes from affecting another user's transaction.
JSON (JavaScript Object Notation)
A lightweight data interchange format that is easy for humans to read and write.
Example: REST APIs commonly use JSON for request and response payloads.
JSON Schema
A specification for validating the structure and content of JSON documents.
Example: JSON Schema validates that API requests contain required fields.
Kafka
A distributed streaming platform for building real-time data pipelines and applications.
Example: Kafka processes millions of events per second for real-time analytics.
Key-Value Model
A data model that stores data as key-value pairs with unique keys.
Example: Redis uses the key-value model for caching session data.
Key-Value Stores
NoSQL databases that store data as collections of key-value pairs.
Example: Amazon DynamoDB stores user preferences as key-value pairs.
Kubernetes
An open-source container orchestration platform for automating deployment and management.
Example: Kubernetes automatically scales application pods based on resource usage.
Labeled Property Graph (LPG)
A graph model where nodes and edges can have labels and properties.
Example: Neo4j uses LPG to model complex relationships with rich metadata.
Load Balancing
The distribution of incoming requests across multiple servers to optimize performance.
Example: Application load balancers distribute web traffic across multiple web servers.
Logging
The systematic recording of events and messages for monitoring and debugging.
Example: Application logs record user actions for security auditing.
LSM Trees (Log-Structured Merge Trees)
A data structure optimized for write-heavy workloads that buffers writes in memory and flushes them to disk as sorted, sequentially written files.
Example: Cassandra uses LSM trees for efficient handling of write-heavy workloads.
Machine Learning (ML)
The use of algorithms and statistical models to enable computers to improve performance through experience.
Example: ML algorithms analyze customer behavior to recommend products.
Maintainability
The ease with which a system can be modified to correct faults or improve performance.
Example: Modular code architecture improves maintainability by isolating changes.
MapReduce
A programming model for processing large datasets in parallel across distributed systems.
Example: MapReduce jobs analyze web server logs to identify popular pages.
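Sketch: The map, shuffle, and reduce phases can be imitated on a single machine to show the shape of the model. The log format (`<ip> <path>`) and all function names here are hypothetical; a real framework would run these phases in parallel across nodes.

```python
from collections import defaultdict

def map_phase(log_line):
    # Assumed log format: "<client-ip> <requested-path>"
    _, path = log_line.split()
    yield path, 1

def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    return key, sum(values)

def count_page_hits(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```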
Master-Master Replication
A replication setup where multiple nodes can accept writes and synchronize changes.
Example: MySQL master-master replication allows writes to multiple database servers.
Master-Slave Architecture
A distributed system design where one master node coordinates multiple slave nodes.
Example: Redis master-slave architecture replicates data from master to slave nodes.
Massively Parallel Processing (MPP)
A computing architecture that uses many processors to perform computations in parallel.
Example: Data warehouses use MPP to execute complex analytical queries quickly.
Materialized Views
Pre-computed query results stored as tables to improve query performance.
Example: Materialized views store aggregated sales data for faster reporting.
Memcached
A high-performance, distributed memory caching system.
Example: Memcached stores database query results in memory for faster access.
Microservices
An architectural pattern that structures applications as collections of loosely coupled services.
Example: E-commerce platforms use microservices for user management, inventory, and payments.
MongoDB
A document-oriented NoSQL database that stores data in flexible, JSON-like documents.
Example: MongoDB stores product catalogs with varying attributes per product.
Monitoring
The continuous observation of system performance and behavior.
Example: Application monitoring alerts developers when response times exceed thresholds.
Multi-Data Center Deployment
The distribution of system components across multiple data centers for redundancy.
Example: Global applications deploy across multiple data centers for disaster recovery.
Multi-Tenant Architecture
A software architecture where a single instance serves multiple tenants or customers.
Example: SaaS applications use multi-tenant architecture to serve multiple organizations.
Multi-Version Concurrency Control (MVCC)
A concurrency control method that maintains multiple versions of data to avoid locking.
Example: PostgreSQL uses MVCC to allow concurrent reads and writes without blocking.
MySQL
An open-source relational database management system.
Example: MySQL powers many web applications with ACID-compliant transactions.
Native Parallel Graph (NPG)
A graph processing model designed for native parallel execution.
Example: TigerGraph implements NPG for real-time graph analytics.
Neo4j
A graph database management system designed for handling connected data.
Example: Neo4j analyzes social network connections to detect fraud patterns.
NFRs (Non-Functional Requirements)
Requirements that specify criteria for system operation rather than specific behaviors.
Example: NFRs include performance targets like "response time under 100ms."
Node2Vec
A machine learning algorithm that generates vector representations of graph nodes.
Example: Node2Vec creates embeddings for social network users based on connection patterns.
Nodes (Vertices)
Individual entities in a graph database that can store properties and connect to other nodes.
Example: User nodes in a social network contain profile information and friend connections.
Non-Risks
Architectural elements that do not pose threats to system quality attributes.
Example: Well-established libraries with proven track records are typically non-risks.
Normalization
The process of organizing database tables to reduce redundancy and improve data integrity.
Example: Third normal form eliminates transitive dependencies between columns.
NoSQL
Database systems that provide flexible schemas and horizontal scaling beyond traditional SQL databases.
Example: Document databases like MongoDB store JSON documents without fixed schemas.
Object-Relational Impedance Mismatch
The conceptual difference between object-oriented programming and relational databases.
Example: Object inheritance doesn't map directly to relational table structures.
OLAP (Online Analytical Processing)
A category of database processing focused on complex analytical queries.
Example: OLAP cubes enable fast analysis of sales data across multiple dimensions.
OLAP Cubes
Multidimensional data structures that enable fast analytical queries.
Example: Sales cubes allow quick analysis by product, region, and time period.
OLTP (Online Transaction Processing)
A category of database processing focused on transaction-oriented applications.
Example: OLTP systems handle credit card transactions in real-time.
Ontology
A formal representation of knowledge that defines concepts and relationships within a domain.
Example: Medical ontologies define relationships between diseases, symptoms, and treatments.
Optimistic Concurrency Control
A concurrency control method that assumes conflicts are rare and checks for conflicts at commit time.
Example: Version numbers detect conflicts when multiple users edit the same document.
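Sketch: A version check at write time is one common way to implement optimistic concurrency control. The in-memory `DocumentStore` and the `VersionConflict` exception below are hypothetical and stand in for a real database.

```python
class VersionConflict(Exception):
    """Raised when a writer's expected version is stale."""

class DocumentStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def read(self, doc_id):
        # Version 0 means the document does not exist yet.
        return self._docs.get(doc_id, (0, None))

    def write(self, doc_id, body, expected_version):
        current_version, _ = self.read(doc_id)
        if current_version != expected_version:
            # Someone else committed first; the caller must re-read and retry.
            raise VersionConflict(
                f"expected v{expected_version}, found v{current_version}")
        self._docs[doc_id] = (current_version + 1, body)
        return current_version + 1
```

No lock is held between `read` and `write`; a conflict is detected only when the second writer tries to commit against a version that has since changed.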
Oracle Database
A commercial relational database management system.
Example: Oracle Database provides enterprise-grade features for mission-critical applications.
PageRank
A graph algorithm that measures the importance of nodes based on link structure.
Example: Google's PageRank algorithm ranks web pages based on incoming links.
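Sketch: PageRank can be approximated by repeatedly redistributing each node's score along its outgoing links. This is a simplified power-iteration sketch; the damping factor of 0.85, the fixed iteration count, and the `pagerank` function name are assumptions, and the handling of nodes without outgoing links is one of several possible conventions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a graph given as {node: [linked nodes]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share.
        new_rank = {node: (1 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank
```

In a graph where several pages link to one hub, the hub ends up with the highest score, and the scores sum to approximately 1.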
PaaS (Platform as a Service)
Cloud computing services that provide a platform for developing and running applications.
Example: Heroku provides PaaS for deploying web applications without managing servers.
Partitioning
The division of large databases into smaller, more manageable pieces.
Example: Date-based partitioning splits tables by time periods for better performance.
Partitioning Strategies
Methods for dividing data across multiple storage units or servers.
Example: Hash partitioning distributes data evenly across database shards.
Peer-to-Peer Architecture
A distributed system design where nodes act as both clients and servers.
Example: BitTorrent uses peer-to-peer architecture for file sharing.
Performance
The degree to which a system accomplishes its functions within time and resource constraints.
Example: Database performance measures include query response time and throughput.
Pessimistic Concurrency Control
A concurrency control method that prevents conflicts by locking resources before use.
Example: Database row locking prevents concurrent updates to the same record.
Pivot
The operation of rotating data from rows to columns for analysis.
Example: Pivot tables transform monthly sales data into columns for yearly comparison.
Point-in-Time Recovery
The ability to restore a database to a specific point in time.
Example: Database backups combined with archived transaction logs enable point-in-time recovery after data corruption.
Polyglot Persistence
The use of multiple database technologies within a single application.
Example: Applications might use Redis for caching and PostgreSQL for transactions.
PostgreSQL
An open-source relational database with advanced features and SQL compliance.
Example: PostgreSQL supports JSON data types alongside traditional relational features.
Primary Keys
Database constraints that uniquely identify rows in a table.
Example: Customer tables use customer_id as the primary key.
Quality Attribute Scenarios
Concrete examples that illustrate how quality attributes apply to specific system situations.
Example: "System responds to 1000 concurrent users within 2 seconds" describes a performance scenario.
Quality Attributes
System properties that describe how well a system performs its intended functions.
Example: Performance, security, and scalability are key quality attributes.
Quality Tree
A hierarchical structure that organizes quality attributes and their refinements.
Example: Performance quality trees might include response time and throughput branches.
Queryability
The ease with which data can be retrieved and analyzed through queries.
Example: SQL databases provide high queryability through standardized query language.
Query Optimization
The process of improving query performance through better execution strategies.
Example: Query optimizers choose index scans over table scans for selective queries.
Quorum
A minimum number of nodes that must agree before a distributed system operation can proceed.
Example: Cassandra QUORUM reads contact a majority of replicas, so they overlap with QUORUM writes and return the most recently acknowledged value.
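Sketch: the arithmetic behind quorums, assuming N replicas, W write acknowledgements, and R read responses. The function names are illustrative. A read is guaranteed to overlap the latest acknowledged write when R + W > N.

```python
def quorum_size(replication_factor: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replication_factor // 2 + 1

def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return r + w > n

# With 3 replicas, a quorum is 2; quorum reads plus quorum writes always overlap.
n = 3
q = quorum_size(n)
overlap = read_overlaps_write(n, w=q, r=q)
```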
RBAC (Role-Based Access Control)
A security model that assigns permissions to roles and grants users access through the roles they hold, rather than assigning permissions to each user directly.
Example: Database administrators have different access roles than application users.
RDF (Resource Description Framework)
A framework for representing information about resources on the web.
Example: RDF triples describe relationships between web resources in semantic format.
Read-After-Write Consistency
A consistency model that ensures reads following writes return the written value.
Example: Social media posts appear immediately to the author after publishing.
Read Repair
A mechanism that fixes inconsistencies by comparing data during read operations.
Example: Cassandra read repair synchronizes data when inconsistencies are detected.
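Sketch: a simplified read repair, assuming each replica is a dictionary mapping a key to a `(timestamp, value)` pair and that the newest timestamp wins. This is an illustration of the idea only, not Cassandra's implementation.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and overwrite stale replica copies."""
    # A replica that lacks the key is treated as holding the oldest version.
    versions = [replica.get(key, (-1, None)) for replica in replicas]
    newest = max(versions, key=lambda version: version[0])
    for replica, version in zip(replicas, versions):
        if version[0] < newest[0]:
            replica[key] = newest  # repair the stale or missing copy
    return newest[1]

# Hypothetical replica states: one current, one stale, one missing the key.
replicas = [{"k": (2, "new")}, {"k": (1, "old")}, {}]
value = read_with_repair(replicas, "k")
```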
Real-Time Processing
The processing of data as it arrives without significant delay.
Example: Stream processing engines handle real-time event data for immediate analysis.
Redis
An in-memory data structure store used as a database, cache, and message broker.
Example: Redis stores session data for fast web application access.
Referential Integrity
The guarantee that foreign key relationships remain valid across database operations.
Example: Referential integrity prevents deletion of customers who have existing orders.
Reliability
The probability that a system will perform its intended function without failure for a specified period under stated conditions.
Example: Database reliability measures include mean time between failures.
Replication
The process of copying data across multiple database servers for redundancy.
Example: Master-slave replication keeps backup servers synchronized with the primary.
Risks
Architectural decisions that may prevent a system from achieving its quality attributes.
Example: Single points of failure represent availability risks.
Risk Themes
Categories of related risks that affect system quality attributes.
Example: Security risk themes might include authentication and data protection risks.
Roll-Up
The aggregation of detailed data into summary data at higher levels.
Example: Rolling up daily sales data into monthly and quarterly summaries.
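Sketch: a roll-up from daily to monthly totals, assuming dates are ISO `YYYY-MM-DD` strings. The `roll_up` helper and the sample figures are hypothetical.

```python
from collections import defaultdict

def roll_up(daily_sales):
    """Sum (ISO date string, amount) pairs into per-month totals."""
    monthly = defaultdict(int)
    for day, amount in daily_sales:
        monthly[day[:7]] += amount  # "YYYY-MM-DD" -> "YYYY-MM"
    return dict(monthly)

daily = [("2024-01-05", 10), ("2024-01-20", 5), ("2024-02-01", 7)]
monthly_totals = roll_up(daily)
```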
Row-Oriented Storage
A data storage method that stores complete rows together sequentially.
Example: Traditional relational databases use row-oriented storage for transactional workloads.
SaaS (Software as a Service)
Cloud computing services that provide software applications over the internet.
Example: Salesforce provides CRM functionality as a SaaS application.
Scalability
The ability of a system to handle increased load by adding resources.
Example: Horizontal scalability adds more servers to handle growing user traffic.
Scenarios
Specific situations or use cases that demonstrate how quality attributes apply.
Example: Load testing scenarios simulate peak traffic conditions.
Schema Evolution
The process of modifying database schemas while maintaining backward compatibility.
Example: Adding new columns to existing tables without breaking existing applications.
Schema-on-Read
A data processing approach where schema is applied when data is read rather than written.
Example: Data lakes use schema-on-read to handle diverse data formats.
Schema-on-Write
A data processing approach where schema is enforced when data is written to storage.
Example: Relational databases use schema-on-write to ensure data consistency.
Schemaless
The ability to store data without predefined schema requirements.
Example: Document databases allow schemaless storage of JSON documents.
Schematron
A rule-based validation language for XML documents.
Example: Schematron validates business rules in XML documents beyond basic schema validation.
Secondary Indexes
Additional indexes created on columns other than the primary key.
Example: Email indexes on user tables enable fast lookups by email address.
Security
The protection of system resources and data from unauthorized access or modification.
Example: Database security includes authentication, authorization, and encryption.
Semantic Web
An extension of the web that enables machines to understand and process web content.
Example: RDF data enables semantic web applications to reason about relationships.
Semi-Structured Data
Data that has some organizational structure but doesn't fit rigid schemas.
Example: JSON documents represent semi-structured data with flexible schemas.
Sensitivity Points
Architectural elements where changes significantly affect quality attributes.
Example: Database connection pools are sensitivity points for performance.
Serverless Computing
A cloud computing model where the cloud provider provisions, scales, and manages server infrastructure on demand, and code runs in response to events.
Example: AWS Lambda runs code without provisioning or managing servers.
Service Level Agreement (SLA)
A contract that defines expected service performance levels.
Example: SLAs specify 99.9% uptime guarantees for cloud services.
Service Level Indicator (SLI)
A quantitative measure of service performance.
Example: Response time percentiles serve as SLIs for web services.
Service Level Objective (SLO)
A target value or range for service performance metrics.
Example: SLOs might target 95% of requests completing within 100ms.
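Sketch: how an SLI measurement can be checked against an SLO, using the 95% within 100 ms target from the example above. The function names and sample latencies are illustrative.

```python
def latency_sli(latencies_ms, threshold_ms=100):
    """SLI: fraction of requests that completed within threshold_ms."""
    if not latencies_ms:
        raise ValueError("no requests observed")
    within = sum(1 for latency in latencies_ms if latency <= threshold_ms)
    return within / len(latencies_ms)

def meets_slo(latencies_ms, target=0.95, threshold_ms=100):
    """SLO check: the measured SLI must reach the target fraction."""
    return latency_sli(latencies_ms, threshold_ms) >= target

# Hypothetical sample: 19 fast requests and 1 slow request.
observed = [50] * 19 + [250]
ok = meets_slo(observed)
```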
Service Mesh
An infrastructure layer that handles service-to-service communication in microservices.
Example: Istio service mesh provides load balancing and security between microservices.
SGML (Standard Generalized Markup Language)
A markup language standard for defining document structure.
Example: HTML and XML are both derived from SGML principles.
Sharding
The horizontal partitioning of data across multiple database instances.
Example: User data sharding distributes users across multiple database servers.
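Sketch: hash-based shard routing, one common way to assign users to shards. The `shard_for` function and the user IDs are hypothetical. Simple modulo routing remaps most keys when the shard count changes, which is the problem consistent hashing is designed to reduce.

```python
import hashlib

def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user ID to a shard index using a stable hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# The same user always routes to the same shard.
placements = {user: shard_for(user, 4) for user in ["alice", "bob", "carol"]}
```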
Shortest Path
Graph algorithms that find the minimum distance between two nodes.
Example: GPS navigation uses shortest path algorithms to find optimal routes.
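Sketch: Dijkstra's algorithm, one widely used shortest path algorithm for graphs with non-negative edge weights. The road graph below is made up for illustration.

```python
import heapq

def shortest_distance(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> {neighbor: weight}.

    Returns the minimum total weight, or None if target is unreachable.
    """
    best = {source: 0}
    frontier = [(0, source)]
    while frontier:
        dist, node = heapq.heappop(frontier)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(frontier, (candidate, neighbor))
    return None

# Hypothetical one-way road network with travel times as weights.
roads = {"A": {"B": 4, "C": 1}, "C": {"B": 2}, "B": {"D": 5}}
fastest = shortest_distance(roads, "A", "D")
```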
Slice and Dice
The operation of viewing data from different perspectives by selecting and filtering.
Example: Sales analysis tools slice and dice data by product, region, and time.
Slowly Changing Dimensions
Dimension table records whose attributes change infrequently over time and need a defined strategy for overwriting or preserving history.
Example: Customer addresses change slowly and require historical tracking.
Snowflake Schema
A dimensional modeling approach where dimension tables are normalized.
Example: Snowflake schemas reduce redundancy by normalizing dimension tables.
Sparse Columns
Columns that contain null values for most rows.
Example: Product tables use sparse columns for optional attributes.
Sparse Data
Data where most values are zero or null.
Example: User preference matrices are sparse since users rate few products.
Sparse Indexes
Indexes that contain entries only for records that have a value in the indexed field.
Example: MongoDB sparse indexes skip documents with missing fields.
Spark
A distributed computing framework for large-scale data processing.
Example: Apache Spark processes big data analytics jobs across cluster nodes.
SPARQL
A query language for retrieving and manipulating RDF data.
Example: SPARQL queries extract information from semantic web knowledge graphs.
SQL (Structured Query Language)
A standardized language for managing relational databases.
Example: SQL SELECT statements retrieve data from relational database tables.
Stakeholders
Individuals or groups who have an interest in or are affected by a system.
Example: System stakeholders include users, developers, and business owners.
Star Schema
A dimensional modeling approach with a central fact table surrounded by dimension tables.
Example: Sales star schemas have sales facts connected to customer, product, and time dimensions.
Storm
A distributed stream processing framework for real-time data analysis.
Example: Apache Storm processes streaming data from social media feeds.
Streaming Analytics
The real-time analysis of data as it flows through a system.
Example: Streaming analytics detect fraudulent transactions as they occur.
Strong Consistency
A consistency model where every read returns the most recently acknowledged write, regardless of which node serves the request.
Example: Traditional relational databases provide strong consistency for transactions.
Structured Data
Data that is organized in a predefined format with clear relationships.
Example: Relational database tables contain structured data with defined columns.
Super Columns
Columns that contain multiple sub-columns in column-family databases.
Example: Cassandra super columns group related data within a single column.
Surrogate Keys
Artificial keys created solely for uniquely identifying rows.
Example: Auto-incrementing ID columns serve as surrogate keys.
Time-Series Data
Data points collected over time intervals, typically for analysis of trends.
Example: IoT sensors generate time-series data for temperature monitoring.
Time-Series Databases
Specialized databases optimized for storing and querying time-series data.
Example: InfluxDB stores metrics data with timestamp-based indexing.
Time-To-Live (TTL)
A mechanism that automatically expires data after a specified time period.
Example: Cache entries use TTL to automatically remove stale data.
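Sketch: a minimal in-memory cache with TTL, assuming entries are expired lazily when they are read. The `TTLCache` class is hypothetical; the injectable `clock` parameter is an assumption that makes the behavior easy to test.

```python
import time

class TTLCache:
    """Cache whose entries stop being returned ttl_seconds after they are written."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expire lazily on access
            return default
        return value

# Usage with the default monotonic clock.
cache = TTLCache(ttl_seconds=30)
cache.put("session:42", {"user": "alice"})
```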
TigerGraph
A graph database designed for real-time analytics on large-scale graphs.
Example: TigerGraph analyzes fraud patterns in financial transaction networks.
Token Ranges
Ranges of hash values that determine data distribution in consistent hashing.
Example: Cassandra assigns token ranges to nodes for data distribution.
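Sketch: looking up the owner of a token on a hash ring, assuming each node owns the range that ends at its own token and the ring wraps around after the highest token. The `TokenRing` class and the small integer tokens are illustrative; real systems derive tokens from a hash of the partition key.

```python
import bisect

class TokenRing:
    """Maps a token to the node whose range contains it."""

    def __init__(self, node_tokens):
        # node_tokens maps node name -> the token that ends that node's range.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def owner(self, key_token: int) -> str:
        # A node owns (previous token, its own token]; past the last token, wrap.
        index = bisect.bisect_left(self._tokens, key_token) % len(self._ring)
        return self._ring[index][1]

ring = TokenRing({"n1": 100, "n2": 200, "n3": 300})
```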
Tombstones
Markers that indicate deleted data in distributed databases.
Example: Cassandra tombstones mark deleted records for eventual cleanup.
Tradeoffs
The architectural decisions that improve one quality attribute at the expense of another.
Example: Caching improves performance but may reduce consistency.
Transformability
The ease with which a system can be modified to incorporate new requirements.
Example: Plugin architectures provide high transformability for extending functionality.
Triple Store
A database designed for storing and querying RDF triples.
Example: Triple stores enable semantic web applications to query knowledge graphs.
Tunable Consistency
The ability to adjust consistency levels based on application requirements.
Example: Cassandra lets each read or write choose a consistency level, such as ONE, QUORUM, or ALL, trading latency and availability against consistency.
Two-Phase Commit (2PC)
A distributed transaction protocol that ensures atomicity across multiple databases.
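Example: A 2PC coordinator asks every participating database to prepare, then tells all of them to commit only if every participant votes yes; otherwise all roll back.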
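Sketch: the two phases in miniature. The `Participant` class and `two_phase_commit` function are hypothetical, and the sketch deliberately ignores timeouts, coordinator failure, and durable logging, which a real protocol must handle.

```python
class Participant:
    """A resource that can vote on and then apply a transaction outcome."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes; otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```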
Typeswitch
A conditional construct in XQuery that performs different actions based on data types.
Example: Typeswitch handles different XML element types in query processing.
Unstructured Data
Data that doesn't fit into traditional row-column database structures.
Example: Text documents, images, and videos are unstructured data.
Usability
The ease with which users can learn and use a system effectively.
Example: Intuitive user interfaces improve system usability.
User Stories
Short descriptions of system features from the user's perspective.
Example: "As a customer, I want to view my order history" is a user story.
Utility Tree
A hierarchical structure that organizes quality attributes by their importance to stakeholders.
Example: Utility trees prioritize performance over security for gaming applications.
Variability
The degree to which data values differ from each other.
Example: Customer age data shows low variability while income data shows high variability.
Vector Clocks
A logical clock mechanism that tracks the causal (partial) ordering of events in distributed systems.
Example: Vector clocks detect concurrent, conflicting updates to the same record so the database or application can reconcile them.
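Sketch: vector clocks represented as dictionaries from node name to counter. The helper names are illustrative. Two clocks are concurrent when neither is less than or equal to the other in every position, which signals a conflict that must be reconciled.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def merge(a, b):
    """Element-wise maximum, used after reconciling two versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"

# Two nodes update the same base version independently.
base = increment({}, "n1")
left = increment(base, "n1")
right = increment(base, "n2")
```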
Vectorized Processing
A processing technique that applies operations to multiple data elements simultaneously.
Example: Columnar databases use vectorized processing for efficient aggregations.
Velocity
The speed at which data is generated and processed.
Example: High-velocity data streams require real-time processing capabilities.
Veracity
The accuracy and trustworthiness of data.
Example: Data validation ensures veracity by checking for errors and inconsistencies.
Vertical Scaling
The practice of adding more power to existing servers to handle increased load.
Example: Upgrading database servers with more RAM provides vertical scaling.
Virtual Nodes (vnodes)
A technique that divides each physical node into multiple virtual nodes for better data distribution.
Example: Cassandra vnodes improve load balancing across cluster nodes.
Volume
The amount of data that needs to be stored and processed.
Example: Big data applications handle large volumes of data across distributed systems.
Write-Ahead Logging
A technique that logs changes before applying them to ensure durability.
Example: After a crash, a database replays its write-ahead log (WAL) to restore committed changes that had not yet reached the data files.
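Sketch: a toy key-value store that appends each change to a log file and syncs it to disk before updating in-memory state, then replays the log on startup. The `WalStore` class and the JSON-lines log format are assumptions for illustration; production systems also handle checkpoints, torn writes, and log truncation.

```python
import json
import os

class WalStore:
    """Key-value store that logs every change before applying it."""

    def __init__(self, log_path):
        self._log_path = log_path
        self.data = {}
        self._replay()

    def _replay(self):
        # Rebuild state by re-applying every logged change in order.
        if not os.path.exists(self._log_path):
            return
        with open(self._log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Append and sync the log record before touching in-memory state.
        with open(self._log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value
```

Constructing a second `WalStore` on the same log path simulates a restart: the new instance recovers every logged change.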
XML (Extensible Markup Language)
A markup language for encoding documents in human-readable format.
Example: XML documents store structured data with custom tags.
XML Schema (XSD)
A language for describing the structure and constraints of XML documents.
Example: XSD validates that XML documents conform to expected structures.
XPath
A query language for selecting nodes from XML documents.
Example: XPath expressions extract specific elements from XML documents.
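Sketch: selecting nodes with an XPath expression using Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. The catalog document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog document.
doc = ET.fromstring(
    "<catalog>"
    "<book lang='en'><title>NoSQL Basics</title></book>"
    "<book lang='de'><title>Graphen</title></book>"
    "</catalog>"
)

# Select the title of every book whose lang attribute is 'en'.
english_titles = [t.text for t in doc.findall("./book[@lang='en']/title")]
```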
XQuery
A query language for finding and extracting information from XML documents.
Example: XQuery transforms XML data into different formats for presentation.
XSLT (Extensible Stylesheet Language Transformations)
A language for transforming XML documents into other formats.
Example: XSLT transforms XML data into HTML for web presentation.