Latency Hiding based Warp Scheduling Policy for High Performance GPUs

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2019, 24(4), pp.1-9
  • DOI : 10.9708/jksci.2019.24.04.001
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : January 22, 2019
  • Accepted : April 11, 2019
  • Published : April 30, 2019

Gwang Bok Kim 1 Jong Myon Kim 2 Cheol Hong Kim 1

1 Chonnam National University
2 University of Ulsan

Accredited

ABSTRACT

The LRR (Loose Round Robin) warp scheduling policy for GPU architectures provides high warp-level parallelism and balanced load across multiple warps. However, the traditional LRR policy causes multiple warps to execute long-latency operations at the same time. When no more warps can be issued during these long-latency operations, GPU throughput may degrade significantly. In this paper, we propose a new warp scheduling policy that exploits latency hiding, leading to better utilization of memory resources in high-performance GPUs. The proposed warp scheduler prioritizes memory instructions based on the GTO (Greedy Then Oldest) policy in order to reduce memory stalls. When no warp can execute a memory instruction, the warp scheduler selects a warp for a computation instruction in a round-robin manner. Furthermore, the proposed technique achieves high performance by using additional information about recently committed warps. According to our experimental results, the proposed technique improves GPU performance by 12.7% and 5.6% on average over LRR and GTO, respectively.
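
The two-level selection described in the abstract (memory instructions first in GTO order, then computation instructions in round-robin order) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Warp` and `HybridScheduler` names, the `next_is_mem` and `ready` fields, and the use of the warp id as a proxy for warp age are all assumptions made for illustration. The abstract does not say how information about recently committed warps is used, so that part is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Warp:
    wid: int             # hypothetical warp id; a smaller id is assumed to mean an older warp
    next_is_mem: bool    # True if the warp's next instruction is a memory instruction
    ready: bool = True   # False while the warp is stalled and cannot issue


class HybridScheduler:
    """Illustrative sketch: memory instructions first (GTO), then compute (round robin)."""

    def __init__(self) -> None:
        self.last_mem: Optional[int] = None  # id of the warp that last issued a memory instruction
        self.rr_next = 0                     # list position where the round-robin scan starts

    def pick(self, warps: List[Warp]) -> Optional[Warp]:
        # Level 1: warps that are ready to issue a memory instruction.
        mem = [w for w in warps if w.ready and w.next_is_mem]
        if mem:
            # Greedy: keep issuing from the same warp while it still qualifies.
            for w in mem:
                if w.wid == self.last_mem:
                    return w
            # Then oldest: fall back to the oldest qualifying warp (assumed: smallest id).
            chosen = min(mem, key=lambda w: w.wid)
            self.last_mem = chosen.wid
            return chosen

        # Level 2: no memory instruction can issue, so scan compute warps round robin.
        n = len(warps)
        for offset in range(n):
            idx = (self.rr_next + offset) % n
            w = warps[idx]
            if w.ready and not w.next_is_mem:
                self.rr_next = (idx + 1) % n  # next scan starts after the chosen warp
                return w
        return None  # nothing can issue this cycle
```

A caller would invoke `pick` once per issue cycle and update each warp's `next_is_mem` and `ready` fields as instructions issue and complete. The sketch only models the selection order, not timing or memory behavior.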

This paper was written with support from the National Research Foundation of Korea.