Quiz: Performance Metrics and Benchmarking

Test your understanding of graph database performance, index-free adjacency, benchmarking techniques, and optimization strategies.

1. What is hop count in graph traversal?

A. The number of servers in the cluster
B. The number of edges traversed in a path between two nodes, measuring distance in the graph
C. The number of properties on a node
D. The number of users connected to the database

Show Answer

The correct answer is B. Hop count is the number of edges traversed in a path between nodes, measuring distance in the graph. For example, "friends within 3 hops" means exploring friend → friend-of-friend → friend-of-friend-of-friend relationships. Hop count directly impacts query performance.

Concept Tested: Hop Count

See: Hop Count Section
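
The hop-count definition from question 1 can be illustrated with a short breadth-first search. This is a minimal sketch over a hypothetical in-memory friendship graph (the names and the `hop_count` helper are invented for illustration), and it reports the hop count of the shortest path, which is the usual meaning of distance between two nodes.

```python
from collections import deque

def hop_count(adjacency, start, goal):
    """Return the fewest edges between start and goal, or None if unreachable."""
    if start == goal:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == goal:
                return hops + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None

# Hypothetical friendship graph stored as adjacency lists.
friends = {
    "ann": ["bob"],
    "bob": ["ann", "cara"],
    "cara": ["bob", "dev"],
    "dev": ["cara"],
    "eve": [],
}

print(hop_count(friends, "ann", "dev"))  # 3: ann -> bob -> cara -> dev
```

Here "dev" is a friend-of-friend-of-friend of "ann", so a "friends within 3 hops" query starting at "ann" would include "dev" but not "eve".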

2. What is the indegree of a node?

A. The total number of edges connected to a node
B. The count of incoming edges to a node, measuring how many other nodes point to it
C. The number of properties on a node
D. The node's position in the graph

Show Answer

The correct answer is B. Indegree is the count of incoming edges to a node—how many other nodes point to it. For example, in a follower graph, a celebrity's indegree counts their followers. Outdegree counts outgoing edges (who they follow). Total degree is indegree + outdegree.

Concept Tested: Indegree

See: Node Degree Metrics
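
As a small illustration of the degree metrics in question 2, the sketch below counts indegree, outdegree, and total degree from a directed edge list. The follower data and the `total_degree` helper are hypothetical.

```python
from collections import Counter

# Hypothetical follower graph: (follower, followed) means follower -> followed.
follows = [
    ("ann", "star"),
    ("bob", "star"),
    ("cara", "star"),
    ("star", "ann"),
]

indegree = Counter(dst for _, dst in follows)   # incoming edges: followers
outdegree = Counter(src for src, _ in follows)  # outgoing edges: accounts followed

def total_degree(node):
    """Total degree is indegree plus outdegree."""
    return indegree[node] + outdegree[node]

print(indegree["star"], outdegree["star"], total_degree("star"))  # 3 1 4
```

A `Counter` returns 0 for nodes it has never seen, so nodes with no incoming or outgoing edges need no special handling.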

3. Why do graph indexes primarily serve as entry points rather than speeding up joins?

A. Indexes don't work in graph databases
B. Graph databases use index-free adjacency for traversal, so indexes mainly enable fast lookup of starting nodes
C. Joins are faster than indexes
D. Indexes slow down queries

Show Answer

The correct answer is B. Unlike relational databases where indexes speed up joins, graph databases use index-free adjacency for traversal after finding starting nodes. Indexes primarily serve as efficient entry points—for example, an index on Person.email enables rapid user lookup to begin traversal from that specific node.

Concept Tested: Graph Indexes

See: Index Strategy
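
The split between "index for the entry point" and "adjacency for the traversal" from question 3 can be sketched in plain Python. This is a toy in-memory model, not how any particular graph database stores data: the `Person` class, `email_index` map, and `friends_of_friends` function are all assumptions made for illustration.

```python
# Hypothetical in-memory model: each node holds direct references to its neighbors.
class Person:
    def __init__(self, email, name):
        self.email = email
        self.name = name
        self.friends = []  # direct references; following them needs no index

ann = Person("ann@example.com", "Ann")
bob = Person("bob@example.com", "Bob")
cara = Person("cara@example.com", "Cara")
ann.friends.append(bob)
bob.friends.append(cara)

# The "index": a map from a property value to a node, used once to find the start.
email_index = {p.email: p for p in (ann, bob, cara)}

def friends_of_friends(email):
    start = email_index[email]        # index lookup: entry point only
    result = []
    for friend in start.friends:      # traversal: follow references directly
        for fof in friend.friends:
            if fof is not start and fof.name not in result:
                result.append(fof.name)
    return result

print(friends_of_friends("ann@example.com"))  # ['Cara']
```

The index is consulted exactly once; every later step follows stored references, which is the idea behind index-free adjacency.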

4. What does the LDBC SNB benchmark measure?

A. File system performance
B. Graph database performance on social network workloads with queries like finding recent posts by friends
C. Network bandwidth
D. CPU clock speed

Show Answer

The correct answer is B. The Linked Data Benchmark Council's Social Network Benchmark (LDBC SNB) is a standard for evaluating graph database performance on realistic social network workloads, including queries like finding recent posts by friends, complex aggregations, and update operations. It enables fair comparison across different graph databases.

Concept Tested: LDBC SNB Benchmark

See: Benchmarking Section
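
To give a feel for the kind of query mentioned in question 4 (recent posts by friends), here is a toy in-memory version. It is only an illustration of the query shape: the data, field names, and `recent_posts_by_friends` function are hypothetical, and this is not the benchmark's actual query definition, schema, or driver.

```python
from datetime import date

# Hypothetical data: who is friends with whom, and who posted what.
friends = {"ann": {"bob", "cara"}}
posts = [
    {"author": "bob", "created": date(2024, 3, 1), "text": "hello"},
    {"author": "cara", "created": date(2024, 3, 5), "text": "graphs"},
    {"author": "dev", "created": date(2024, 3, 9), "text": "not a friend"},
    {"author": "bob", "created": date(2024, 2, 1), "text": "older"},
]

def recent_posts_by_friends(person, limit):
    """Return the newest `limit` post texts written by the person's friends."""
    circle = friends.get(person, set())
    candidates = [p for p in posts if p["author"] in circle]
    candidates.sort(key=lambda p: p["created"], reverse=True)
    return [p["text"] for p in candidates[:limit]]

print(recent_posts_by_friends("ann", 2))  # ['graphs', 'hello']
```

A benchmark run would execute many such queries with varying parameters against a much larger dataset and record how long they take.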

5. Why does query performance often degrade exponentially with hop count in dense graphs?

A. Because databases get tired
B. Each hop potentially explores many neighbors, causing exponential growth in nodes visited
C. Hop count has no effect on performance
D. Performance improves with more hops

Show Answer

The correct answer is B. In dense graphs, each hop potentially explores many neighbors: if each node connects to 100 others, a 3-hop query might touch up to 100 → 10,000 → 1,000,000 nodes. This exponential growth makes deep queries expensive. Index-free adjacency keeps the cost of each individual hop low, and query optimization can prune the search, but neither removes the growth in nodes visited, so hop count remains critical for performance.

Concept Tested: Hop Count, Traversal Cost

See: Performance Factors
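
The arithmetic behind question 5 can be checked directly. The sketch below computes a worst-case frontier size per hop, assuming every node has the same number of neighbors and no two paths lead to the same node; real graphs overlap, so actual counts are usually lower. The function name is hypothetical.

```python
def worst_case_frontier(branching_factor, hops):
    """Upper bound on nodes reached at each hop, ignoring overlap between paths."""
    return [branching_factor ** h for h in range(1, hops + 1)]

sizes = worst_case_frontier(100, 3)
print(sizes)       # [100, 10000, 1000000]
print(sum(sizes))  # 1010100 nodes touched in total across the three hops
```

Adding one more hop multiplies the last frontier by 100 again, which is why limiting traversal depth is often the first thing to try on a slow query.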

6. What is the edge-to-node ratio and why does it matter?

A. It doesn't matter for performance
B. The average number of edges per node, indicating connectivity density and impacting traversal performance
C. The ratio of deleted to active edges
D. The size of edges in bytes

Show Answer

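The correct answer is B. The edge-to-node ratio is the number of edges divided by the number of nodes in a graph. A directed social network with a ratio of 50 means users average 50 outgoing connections (in an undirected graph, where each edge touches two nodes, the average degree would be 100). Higher ratios indicate denser graphs with more connections to traverse, impacting query performance. This metric helps predict traversal costs and gives a baseline against which supernodes, nodes with far more edges than average, stand out.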

Concept Tested: Edge-to-Node Ratio

See: Graph Metrics
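
The edge-to-node ratio from question 6 is a one-line computation. The sketch below uses a tiny hypothetical undirected graph and also computes per-node degree; in an undirected graph each edge touches two nodes, so the average degree works out to twice the ratio.

```python
from collections import Counter

# Hypothetical undirected graph.
nodes = {"a", "b", "c", "d"}
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]

edge_to_node_ratio = len(edges) / len(nodes)

# Each undirected edge contributes one to the degree of both endpoints.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

average_degree = sum(degree.values()) / len(nodes)
busiest, busiest_degree = degree.most_common(1)[0]

print(edge_to_node_ratio)       # 1.0
print(average_degree)           # 2.0
print(busiest, busiest_degree)  # a 3
```

Comparing each node's degree with the average is one simple way to spot supernode candidates; the ratio alone is an average and says nothing about how unevenly edges are distributed.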

7. Given a slow query scanning millions of nodes, what optimization technique would most likely help?

A. Buy more RAM
B. Add an index on the filtered property to enable fast entry point lookup instead of full scan
C. Delete half the database
D. Restart the server

Show Answer

The correct answer is B. Adding an index on the filtered property (like Person.email or Product.SKU) enables the database to quickly find specific starting nodes instead of scanning millions. This transforms an O(n) full scan into an O(1) or O(log n) index lookup. While more RAM (A) can help with caching, it doesn't address the root cause of scanning.

Concept Tested: Query Optimization, Graph Indexes

See: Optimization Techniques
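
The difference between a full scan and an index lookup in question 7 can be sketched with plain Python containers. The SKU data, `scan_lookup` function, and `hash_index` dictionary are hypothetical stand-ins; a dictionary models a hash index with O(1) average lookup, while tree-based indexes would give O(log n) instead.

```python
# Hypothetical product nodes, identified by SKU.
skus = [f"SKU-{i:06d}" for i in range(100_000)]

def scan_lookup(target):
    """Full scan: compare nodes one by one. Returns (position, comparisons made)."""
    for position, sku in enumerate(skus):
        if sku == target:
            return position, position + 1
    return None, len(skus)

# Built once up front; afterwards each lookup is a single hash probe on average.
hash_index = {sku: position for position, sku in enumerate(skus)}

print(scan_lookup("SKU-099999"))  # (99999, 100000): every node was compared
print(hash_index["SKU-099999"])   # 99999, found without scanning
```

The index costs memory and must be maintained on writes, so it pays off for properties that queries filter on frequently.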

8. What distinguishes synthetic benchmarks from real-world workload benchmarks?

A. Synthetic benchmarks use artificially generated data and queries for controlled testing, while real-world benchmarks use actual production patterns
B. They are the same thing
C. Synthetic benchmarks are always more accurate
D. Real-world benchmarks don't exist

Show Answer

The correct answer is A. Synthetic benchmarks use artificially generated datasets and workloads (like Graph 500) for reproducible, controlled testing across different systems. Real-world benchmarks use actual production access patterns. Both are valuable—synthetic for comparability, real-world for relevance to specific use cases.

Concept Tested: Synthetic Benchmarks, Performance Benchmarking

See: Benchmarking Approaches
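
The reproducibility that question 8 attributes to synthetic benchmarks comes largely from seeded data generation. The sketch below is a simple uniform random edge generator, used here as a stand-in; it is not the generator any specific benchmark uses, and the `synthetic_graph` name and parameters are assumptions.

```python
import random

def synthetic_graph(num_nodes, num_edges, seed):
    """Generate a reproducible set of distinct directed edges without self-loops."""
    if num_edges > num_nodes * (num_nodes - 1):
        raise ValueError("more edges requested than the graph can hold")
    rng = random.Random(seed)  # private generator, so the seed fully fixes the output
    edges = set()
    while len(edges) < num_edges:
        u = rng.randrange(num_nodes)
        v = rng.randrange(num_nodes)
        if u != v:
            edges.add((u, v))
    return sorted(edges)

run_one = synthetic_graph(50, 200, seed=42)
run_two = synthetic_graph(50, 200, seed=42)
print(run_one == run_two)  # True: same seed, same dataset
```

Because the same seed always yields the same graph, two systems can be tested on identical data, which is what makes the comparison controlled.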

9. How does statistical query tuning improve performance?

A. By deleting statistics
B. By using statistical information about data distributions and node degrees to optimize query plans
C. By making queries slower
D. By converting all data to statistics

Show Answer

The correct answer is B. Statistical query tuning uses information about data distributions (node degree distributions, edge cardinalities, property value frequencies) to make smarter execution decisions. For example, knowing that 95% of users have degree < 100 but 5% have degree > 10,000 helps the optimizer estimate how many nodes a traversal step will expand and choose where to start, while property value frequencies help it decide between an index lookup and a full scan.

Concept Tested: Statistical Query Tuning

See: Query Tuning
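
A rough sketch of how collected statistics might feed plan decisions, as described in question 9. Both functions, the 10% threshold, and the sample statistics are hypothetical; real optimizers use considerably richer cost models.

```python
def choose_access_path(value_counts, total_nodes, value, threshold=0.1):
    """Pick an index for selective values and a scan for common ones.

    The threshold is an arbitrary illustrative cutoff, not a recommended setting.
    """
    selectivity = value_counts.get(value, 0) / total_nodes
    return "index" if selectivity <= threshold else "scan"

def cheaper_start(degree_stats, a, b):
    """Start the traversal from the endpoint expected to expand fewer neighbors."""
    return a if degree_stats[a] <= degree_stats[b] else b

# Hypothetical statistics gathered ahead of time.
country_counts = {"US": 700, "IS": 5, "DE": 295}
degree_stats = {"celebrity": 12_000, "regular_user": 80}

print(choose_access_path(country_counts, 1000, "IS"))          # index
print(choose_access_path(country_counts, 1000, "US"))          # scan
print(cheaper_start(degree_stats, "celebrity", "regular_user"))  # regular_user
```

Starting from the low-degree endpoint matters most when a pattern touches a supernode, since expanding from the supernode first can multiply the work by orders of magnitude.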

10. Why is measuring both query latency and throughput important?

A. They're redundant metrics
B. Latency measures user experience (response time), while throughput measures system capacity (queries/second) under load
C. Only throughput matters
D. Only latency matters

Show Answer

The correct answer is B. Latency measures individual query response time, which is critical for user experience ("Does my query feel fast?"). Throughput measures how many queries the system completes per second under load, which is critical for capacity planning ("How many users can we support?"). A system might have low latency but low throughput, or vice versa. Both metrics are essential for production readiness.

Concept Tested: Query Latency, Query Throughput, Performance Benchmarking

See: Performance Metrics
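
Both metrics from question 10 can be collected in one measurement loop. The sketch below is a minimal single-client harness using only the standard library; `run_benchmark` and `fake_query` are hypothetical names, and the placeholder workload stands in for a real database call.

```python
import statistics
import time

def run_benchmark(query, num_queries):
    """Run the query repeatedly; report per-query latency and overall throughput."""
    latencies = []
    started = time.perf_counter()
    for _ in range(num_queries):
        t0 = time.perf_counter()
        query()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    ordered = sorted(latencies)
    return {
        "count": len(latencies),
        "mean_latency_s": statistics.mean(latencies),
        # Approximate 95th percentile by position in the sorted samples.
        "p95_latency_s": ordered[int(0.95 * (len(ordered) - 1))],
        "throughput_qps": num_queries / elapsed,
    }

def fake_query():
    # Placeholder workload standing in for a real graph query.
    sum(range(1000))

result = run_benchmark(fake_query, 200)
print(result)
```

With a single sequential client, throughput is roughly the inverse of mean latency. The two metrics diverge once many clients issue queries concurrently, so capacity tests normally repeat the measurement at increasing levels of concurrency.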

Quiz Complete!

Questions: 10

Cognitive Levels: Remember (3), Understand (3), Apply (2), Analyze (2)

Concepts Covered: Hop Count, Indegree, Outdegree, Edge-to-Node Ratio, Graph Indexes, LDBC SNB, Graph Metrics, Performance Benchmarking, Synthetic Benchmarks, Query Optimization, Statistical Query Tuning, Query Latency, Query Throughput, Traversal Cost

Next Steps:

- Review Chapter Content for performance optimization strategies
- Practice benchmarking graph queries
- Continue to Chapter 6: Graph Algorithms