Metrics & Definitions

Understanding the performance data is crucial for tuning the vision pipeline. This guide defines the key metrics tracked by the FirstBreath system.

⏱ Latency Metrics

Latency is the primary indicator of user experience. It measures the delay between an event (camera captures frame) and the outcome (database insert).

The Pipeline Breakdown

We measure latency at three critical stages to pinpoint bottlenecks:

Capture Latency: Time to fetch a frame from the RTSP stream.
- High? Check network bandwidth or camera stability.
Queue/Redis Wait: Time a frame sits in Redis waiting for a GPU worker.
- High? The GPU is overwhelmed (Inference is slower than Capture). Needs autoscaling or batching.
Inference Latency: Time the AI actually spends processing the batch.
- High? Model is too complex or batch size is too large.

Percentiles (P50 vs P95)

We avoid using "Averages" because they hide problems. Instead, we use percentiles:

P50 (Median): The typical experience. "Most frames take this long."
P95 (95th Percentile): The worst-case for normal operation. "5% of frames are slower than this."

Rule of Thumb: If P50 is low but P95 is high, you have "jitter" — occasional lag spikes usually caused by Garbage Collection, Network packets drops, or GPU context switching. I P50 and P95 are both high, the system is consistently overloaded.

📦 Batch Size Strategy

Batch Size is the number of images processed simultaneously by the GPU.

Dynamic Batching: Our system adjusts batch size on the fly based on incoming load.
Efficiency:
- 1 Frame = 15ms/frame (Fast response, low throughput).
- 8 Frames = 40ms total (5ms/frame! Huge throughput gain).

If your dashboard shows a consistently high batch size (e.g. 8), your system is maximizing throughput (good for cost), but individual frames wait longer (higher latency).

🎛 System throughput

FPS (Frames Per Second): The ultimate measure of capacity.
- Inlet FPS: How many images we receive.
- Outlet FPS: How many results we save.
- Ideally, Inlet should equal Outlet.
Dropped Frames: Frames discarded intentionally to prevent system crash when lag >> 5 seconds.

⏱ Latency Metrics​

The Pipeline Breakdown​

Percentiles (P50 vs P95)​

📦 Batch Size Strategy​

🎛 System throughput​

⏱ Latency Metrics

The Pipeline Breakdown

Percentiles (P50 vs P95)

📦 Batch Size Strategy

🎛 System throughput