📊 p50, p90, p99 Metrics: Why Averages Lie & Percentiles Rule
π‘ "Our API latency is 200ms on average!" Sounds great, right? Until you realize 5% of users are waiting 5 seconds for a response. Averages lie. Percentiles tell the truth.
In this article, Obito & Rin break down:
✅ What p50, p90, p99 actually mean
✅ When to use each metric
✅ Instrumentation techniques to measure them
✅ Open-source tools for tracking percentiles
👩‍💻 Rin: "Obito, I keep hearing about p50, p90, p99 in performance monitoring. Are they just fancy ways of saying average response time?"
👨‍💻 Obito: "Oh no, Rin. Averages are liars. Percentiles tell you how bad things get for real users. Let me explain."
📊 What Do p50, p90, and p99 Actually Mean?
👨‍💻 Obito: "Percentiles show how long a request takes for a certain percentage of users."

| Metric | Meaning |
| --- | --- |
| p50 (Median) | 50% of users experience this latency or faster |
| p90 | 90% of users get a response in this time or faster |
| p99 | 99% of users get a response in this time or faster (worst-case outliers) |
👩‍💻 Rin: "So if p50 = 100ms, half of my users get a response in under 100ms?"
👨‍💻 Obito: "Exactly. But your p99 could still be 5 seconds, meaning 1% of users are suffering."
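The definitions above can be sketched in a few lines of Python. This is a minimal nearest-rank percentile calculation over made-up latency samples (the data and the `percentile` helper are illustrative, not production code):

```python
import math

def percentile(samples, p):
    """Return the p-th percentile using the nearest-rank method."""
    ranked = sorted(samples)
    k = math.ceil(p / 100 * len(ranked))  # 1-based rank of the p-th percentile
    return ranked[k - 1]

# Hypothetical response times (in milliseconds) for 10 requests
latencies_ms = [88, 92, 95, 100, 102, 110, 115, 120, 450, 2100]

p50 = percentile(latencies_ms, 50)  # 102 ms: half the requests were this fast or faster
p90 = percentile(latencies_ms, 90)  # 450 ms
p99 = percentile(latencies_ms, 99)  # 2100 ms: the worst-case outlier
```

With this sample, the mean is about 337 ms, well above the 102 ms median: a single 2.1 s outlier drags the average up, which is exactly why averages mislead.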
🔍 When Should You Care About p50, p90, or p99?
👩‍💻 Rin: "Okay, but when should I care about p90 or p99 instead of just looking at p50?"
👨‍💻 Obito: "It depends on what you're measuring!"
🖥️ Example 1: Web Page Load Time (p50 vs p90)
✅ p50 → How fast most users experience your page
✅ p90 → Detects UI lag for slower users (maybe mobile users on slow networks)
🚨 If p90 is way higher than p50, slow users are suffering.
📌 When to Use?
p50 → Optimizing for common user experience
p90 → Checking slow users on bad networks
🌐 Example 2: API Latency (p90 vs p99)
✅ p90 → Good for general API health
✅ p99 → Identifies outliers (slow DB queries, cold caches, network spikes)
🚨 If p99 is much higher than p90, you have tail latency issues.
📌 When to Use?
p90 → Monitoring API stability
p99 → Detecting edge cases & worst-case performance
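One way to put that p99-vs-p90 rule of thumb into practice is a batch check over recent samples. The sample data and the 3x threshold below are arbitrary illustrations; tune the threshold for your own service:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile over a batch of samples."""
    ranked = sorted(samples)
    return ranked[math.ceil(p / 100 * len(ranked)) - 1]

# Hypothetical batch: most requests take 100-150 ms, plus two 5 s outliers
latencies = [0.1] * 90 + [0.15] * 8 + [5.0] * 2

p90 = percentile(latencies, 90)   # 0.1 s
p99 = percentile(latencies, 99)   # 5.0 s
has_tail_problem = p99 > 3 * p90  # True: the tail needs investigating
```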
⚠️ Example 3: Load Testing (p99 vs p99.9)
✅ p99 → Checks worst-case latency
✅ p99.9 (p999) → Detects absolute worst outliers
🚨 If p99.9 spikes, it means 0.1% of users are having a horrible experience.
📌 When to Use?
p99 → Scaling servers before users complain
p99.9 → Checking worst-case performance under peak load
👩‍💻 Rin: "So p50 is fine for happy users, but p99 shows the pain points?"
👨‍💻 Obito: "Exactly! If you only look at averages, you'll never see what's breaking for your worst users."
🛠️ How Do You Measure These Metrics? (Instrumentation Techniques)
👩‍💻 Rin: "Okay, but how do I actually measure p99 latency in real-world systems?"
👨‍💻 Obito: "You need to instrument your code. Here's how:"
1️⃣ Application-Level Instrumentation
✅ Use logging & timers to capture response times.
✅ Store latency values and calculate percentiles in your own analytics.
📌 Example: Logging Latency in Python
```python
import time

start = time.perf_counter()   # monotonic clock: safer than time.time() for intervals
handle_request()              # your function/API call
latency = time.perf_counter() - start
log_latency(latency)          # store for percentile calculations
```
👩‍💻 Rin: "So I just log every response time and analyze them later?"
👨‍💻 Obito: "Yep! But for real-time tracking, you need open-source monitoring tools."
📈 Best Open-Source Tools for Percentile Tracking
👩‍💻 Rin: "What's the easiest way to track p50, p90, and p99?"
👨‍💻 Obito: "Let tools like Prometheus (metrics collection with histogram support) and Grafana (dashboards on top of it) do the heavy lifting!"
👩‍💻 Rin: "How does Prometheus track p99?"
👨‍💻 Obito: "You define a histogram metric and let Prometheus calculate percentiles."
📌 Example: Prometheus Histogram for API Latency
```text
api_request_duration_seconds_bucket{le="0.1"} 2567
api_request_duration_seconds_bucket{le="0.5"} 5890
api_request_duration_seconds_bucket{le="1"} 7845
api_request_duration_seconds_bucket{le="2"} 9402
api_request_duration_seconds_bucket{le="+Inf"} 10000
api_request_duration_seconds_count 10000
api_request_duration_seconds_sum 12543
```

Note that the buckets are cumulative: 7845 of the 10000 requests finished within 1 second, and the mandatory `+Inf` bucket always equals the total count.
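Prometheus then estimates percentiles at query time with its built-in `histogram_quantile` function, for example p99 over the last 5 minutes:

```
histogram_quantile(0.99, rate(api_request_duration_seconds_bucket[5m]))
```

This is an estimate, not an exact value: Prometheus interpolates within bucket boundaries, so accuracy depends on how the `le` buckets are chosen.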
🎯 Key Takeaways: When to Use Each Metric
✅ p50 (Median): When you want to see typical user experience.
✅ p90: When checking how slow users are affected.
✅ p99: When finding outliers & worst-case performance.
✅ p99.9: When every millisecond counts (finance, gaming, trading).
👩‍💻 Rin: "So if I only monitor p50, I might think my system is fine while users suffer?"
👨‍💻 Obito: "Exactly! p99 shows you where performance truly breaks down."
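The intro's scenario makes this concrete. Below, 95% of requests take 100 ms and 5% take 5 s (round numbers chosen for illustration): the average looks healthy while p99 screams.

```python
import math

# 95% of requests at 100 ms, 5% stuck at 5 s
latencies = [0.1] * 95 + [5.0] * 5

mean = sum(latencies) / len(latencies)           # ~0.345 s: looks "fine"
ranked = sorted(latencies)
p50 = ranked[math.ceil(0.50 * len(ranked)) - 1]  # 0.1 s
p99 = ranked[math.ceil(0.99 * len(ranked)) - 1]  # 5.0 s: the real pain
```

A dashboard showing only the mean (or only p50) would report sub-second latency here, while 1 in 20 users waits five full seconds.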
🚀 Final Thoughts & Next Steps
👩‍💻 Rin: "Okay, I get it now: percentiles tell the real story."
👨‍💻 Obito: "Yep! If you're serious about latency monitoring, never trust averages."
👉 Next Steps:
✅ Instrument your code to measure latency.
✅ Use Prometheus + Grafana for percentile tracking.
✅ Alert on p99 spikes to catch real performance issues.
🔥 What's Next on BinaryBanter?
🔜 Coming Soon: Latency Optimization - How to Reduce p99 Spikes
🌐 Follow BinaryBanter on Substack, Medium | 💻 Learn. Discuss. Banter.