Jiman Cha | Choong-hee Cho | 2026, 31(5) | pp.11~27
Large-scale services widely use distributed pipelines for real-time log processing. However, default buffering policies, although intended to protect system resources, can create bottlenecks that degrade real-time performance. This study constructed a large-scale distributed load environment and tracked end-to-end latency from log generation to final data warehouse loading at the millisecond level.
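As a minimal illustration of the measurement idea, the sketch below computes per-event end-to-end latency from two millisecond timestamps and summarizes the tail with a nearest-rank percentile. The function names, the sample timestamps, and the nearest-rank method are assumptions made for illustration; the study's actual instrumentation is not described here.

```python
import math

def end_to_end_latency_ms(generated_at_ms, loaded_at_ms):
    """Latency from log generation to warehouse load, in milliseconds."""
    return loaded_at_ms - generated_at_ms

def percentile_nearest_rank(values, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical (generated_at_ms, loaded_at_ms) pairs, not measured data.
events = [(1000, 1120), (1010, 1090), (1020, 1500), (1030, 1115)]
latencies = [end_to_end_latency_ms(g, l) for g, l in events]
p95 = percentile_nearest_rank(latencies, 95)
```

With so few samples the 95th percentile is simply the slowest event, which is why tail metrics are only meaningful over a large number of events.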
Experiments were conducted across a five-stage optimization scenario by adjusting buffer sizes, wait times, and scan intervals. The results show that a low-latency configuration can reduce explicit buffering delay, but may increase packet-level overhead and degrade throughput. In contrast, the proposed hybrid configuration does not aim for absolute optimality in a single metric; instead, it applies cross-layer tuning to mitigate ingestion-layer traffic variability while minimizing downstream transmission delays. Under the evaluated conditions, the hybrid configuration achieved the lowest P95 latency, prevented pipeline collapse in micro-batch environments, and sustained throughput above roughly 300 events per second.
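The trade-off between buffering delay and per-transmission overhead can be sketched with a toy size-or-timeout flush policy. Everything below is an assumption for illustration: the function names, the parameter values, and the evenly spaced arrivals are hypothetical and do not reproduce the paper's pipeline or its five configurations. The number of flushes stands in for transmission overhead, and the mean wait inside the buffer stands in for buffering delay.

```python
def simulate_flushes(arrival_times_ms, max_batch, max_wait_ms):
    """Return (flush_time_ms, batch) pairs for a size-or-timeout policy."""
    flushes, batch = [], []
    for t in arrival_times_ms:
        # Timeout: the oldest buffered event has waited max_wait_ms.
        if batch and t - batch[0] >= max_wait_ms:
            flushes.append((batch[0] + max_wait_ms, batch))
            batch = []
        batch.append(t)
        # Size limit: flush as soon as the batch is full.
        if len(batch) >= max_batch:
            flushes.append((t, batch))
            batch = []
    if batch:  # Remaining events leave when their timeout expires.
        flushes.append((batch[0] + max_wait_ms, batch))
    return flushes

def mean_buffer_delay_ms(flushes):
    """Average time events spent waiting in the buffer."""
    delays = [flush_t - t for flush_t, batch in flushes for t in batch]
    return sum(delays) / len(delays)

arrivals = list(range(0, 100, 10))  # one event every 10 ms (hypothetical)
low_latency = simulate_flushes(arrivals, max_batch=1, max_wait_ms=5)
large_batch = simulate_flushes(arrivals, max_batch=5, max_wait_ms=1000)
intermediate = simulate_flushes(arrivals, max_batch=100, max_wait_ms=25)
```

In this toy run the per-event setting removes buffering delay but issues one transmission per event, while larger batches cut the number of transmissions at the cost of waiting time. A setting between the two trades some of each, which is the kind of balance a tuned configuration has to strike.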