## Batch Size 1, Compile true

| Experiment | Warmup latency (s) | Average latency (s) | Throughput (samples/sec) | GPU utilization (%) |
| ---------- | ------------------ | ------------------- | ------------------------ | ------------------- |
| original | 13.828 +/- 0.535 | 0.297 +/- 0.034 | 205.657 +/- 14.429 | 15.630 +/- 1.601 |
| h2d_d2h_threads | 12.515 +/- 0.666 | 0.519 +/- 0.107 | 138.126 +/- 21.821 | 12.482 +/- 1.822 |
