Source from
https://github.com/giltene/wrk2
Wrk's model, which is similar to the model found in many current load generators, computes the latency for a given request as the time from the sending of the first byte of the request to the time the complete response was received.
While this model correctly measures the actual completion time of individual requests, it exhibits a strong Coordinated Omission effect, through which most of the high-latency artifacts exhibited by the measured server will be ignored. Since each connection will only begin to send a request after receiving a response, high-latency responses result in the load generator coordinating with the server to avoid measurement during high-latency periods.
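The effect can be seen in a small simulation. The following is a minimal sketch, not wrk code: the function name `closed_loop_latencies` and the service times are made up for illustration. It models a single connection that sends each request only after the previous response has arrived, against a hypothetical server that normally answers in 1 ms but stalls once for 1000 ms.

```python
def closed_loop_latencies(service_times_ms):
    """Simulate one connection that sends the next request only after the
    previous response arrives. Returns (send_time, latency) pairs in ms."""
    now = 0
    samples = []
    for service in service_times_ms:
        send = now                 # request goes out as soon as the connection is free
        now = send + service       # response arrives after the service time
        samples.append((send, now - send))
    return samples


# Hypothetical server behavior: 1 ms per request, with one 1000 ms stall.
service_times = [1] * 100 + [1000] + [1] * 100
samples = closed_loop_latencies(service_times)

# Only the request that triggered the stall records a high latency.
slow = [latency for _, latency in samples if latency > 1]

# No request at all is sent while the server is stalled (100 ms to 1100 ms).
sent_during_stall = [send for send, _ in samples if 100 < send < 1100]
```

In this run the stall covers roughly 1000 ms of a 1200 ms test, yet it contributes a single sample out of 201, because the load generator stopped sending for exactly as long as the server was slow.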
Some completely asynchronous load generators can avoid Coordinated Omission by sending requests without waiting for previous responses to arrive. When the application being measured may involve multiple serial request/response interactions within each connection, or a blocking protocol (as is the case with most TCP and HTTP workloads), this completely asynchronous behavior is usually not a viable option.
The model I chose to avoid Coordinated Omission in wrk2 combines the use of constant throughput load generation with latency measurement that takes the intended constant throughput into account. Rather than measure response latency from the time the actual transmission of a request occurred, wrk2 measures response latency from the time that transmission should have occurred according to the constant throughput configured for the run. When responses take longer than normal (arriving later than the next request should have been sent), the true latency of the subsequent requests will be appropriately reflected in the recorded latency stats.
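As an illustration of the difference between the two measurement points, here is a minimal sketch of the idea, not the actual wrk2 implementation. The function name `constant_rate_latencies`, the 10 ms interval, and the service times are assumptions chosen for the example. Each request has an intended send time fixed by the configured rate; if the connection is still busy at that time, the request goes out late, but its latency is still counted from the intended time.

```python
def constant_rate_latencies(service_times_ms, interval_ms):
    """Simulate one connection driven at a constant rate.

    Returns a list of (service_latency, corrected_latency) pairs in ms:
    service_latency is measured from the actual send time,
    corrected_latency from the intended send time.
    """
    free_at = 0  # time at which the connection can send again
    results = []
    for i, service in enumerate(service_times_ms):
        intended = i * interval_ms         # when the request should be sent
        actual = max(intended, free_at)    # when it can actually be sent
        done = actual + service
        results.append((done - actual, done - intended))
        free_at = done
    return results


# Hypothetical run: one request every 10 ms, with a single 100 ms response.
results = constant_rate_latencies([1, 1, 100, 1, 1, 1], interval_ms=10)
# results == [(1, 1), (1, 1), (100, 100), (1, 91), (1, 82), (1, 73)]
```

Measured from the actual send time, only one request looks slow. Measured from the intended send time, the three requests queued behind it also show the delay they experienced (91, 82, and 73 ms).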
Note: This technique can be applied to variable throughput loaders. It requires a “model” or “plan” that can provide the intended start time of each request. Constant throughput load generators make this trivial to model. More complicated schemes (such as varying throughput or stochastic arrival models) would likely require a detailed model and some memory to provide this information.
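One way to picture such a “plan” is as a function from request index to intended start time. The sketch below is hypothetical and not part of wrk2: `constant_plan`, `poisson_plan`, and `corrected_latency` are names invented here. The constant plan needs no state, while the stochastic plan has to remember the arrival times it generated so that the same intended time is available later, when the response completes.

```python
import random
from itertools import accumulate


def constant_plan(rate_per_sec):
    """Plan for a constant rate: request i should start at i / rate seconds."""
    return lambda i: i / rate_per_sec


def poisson_plan(rate_per_sec, count, seed=0):
    """Plan for Poisson arrivals: exponential gaps, stored in memory so the
    intended start of request i can be looked up when its response arrives."""
    rng = random.Random(seed)
    gaps = [rng.expovariate(rate_per_sec) for _ in range(count)]
    starts = list(accumulate(gaps))
    return lambda i: starts[i]


def corrected_latency(plan, i, completion_time):
    """Latency of request i measured from its intended start time."""
    return completion_time - plan(i)


# Usage: at 100 requests/sec, request 5 was due at 0.05 s; a response that
# completes at 0.25 s therefore has a corrected latency of about 0.2 s.
latency = corrected_latency(constant_plan(100), 5, 0.25)

stochastic = poisson_plan(rate_per_sec=100, count=1000)
intended_times = [stochastic(i) for i in range(1000)]
```

The recording side is the same for both plans; only the source of the intended start time changes.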