Workload Characteristics-based L1 Data Cache Switching-off Mechanism for GPUs
[confproc] W. Jia / 2014 / MRPB: Memory Request Prioritization for Massively Parallel Processors / IEEE International Symposium on High Performance Computer Architecture (HPCA) : 272~283
[report] NVIDIA / 2009 / Whitepaper: NVIDIA's Next Generation CUDA Compute and Graphics Architecture / Fermi
[confproc] Y. Torres / 2011 / Understanding the Impact of CUDA Tuning Techniques for Fermi / High Performance Computing and Simulation (HPCS) : 631~639
[confproc] A. Jog / 2013 / OWL: Cooperative Thread Array Aware Scheduling Techniques for Improving GPGPU Performance / International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) : 395~406
[confproc] S. Lee / 2015 / CAWA: Coordinated Warp Scheduling and Cache Prioritization for Critical Warp Acceleration of GPGPU Workloads / the International Symposium on Computer Architecture (ISCA) : 515~527
[confproc] M. Lee / 2016 / iPAWS: Instruction-Issue Pattern-based Adaptive Warp Scheduling for GPGPUs / the IEEE International Symposium on High Performance Computer Architecture (HPCA) : 370~381
[confproc] V. Narasiman / 2011 / Improving GPU Performance via Large Warps and Two-Level Warp Scheduling / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 308~317
[confproc] M. Gebhart / 2011 / Energy-efficient Mechanisms for Managing Thread Context in Throughput Processors / the International Symposium on Computer Architecture (ISCA) : 235~246
[confproc] W. Fung / 2007 / Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 407~420
[confproc] M. Qureshi / 2007 / Adaptive Insertion Policies for High Performance Caching / the International Symposium on Computer Architecture (ISCA) : 381~391
[journal] Cong Thuan Do / 2015 / A new cache replacement algorithm for last-level caches by exploiting tag-distance correlation of cache lines / Microprocessors and Microsystems / Elsevier BV 39 (4-5) : 286~295 / 10.1016/j.micpro.2015.05.005
[confproc] A. S. Leon / 2006 / The UltraSPARC T1 Processor: CMT Reliability / Custom Integrated Circuits Conference : 555~562
[confproc] T. Rogers / 2012 / Cache-conscious Wavefront Scheduling / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 72~83
[book] NVIDIA / 2010 / NVIDIA Tegra Multiprocessor Architecture
[confproc] Y. Wu / 2002 / Compiler Managed Micro-cache Bypassing for High Performance EPIC Processors / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 134~145
[confproc] T. L. Johnson / 1997 / Run-time Adaptive Cache Hierarchy Management via Reference Analysis / the International Symposium on Computer Architecture (ISCA) : 315~326
[journal] M. Kharbutli / 2008 / Counter-Based Cache Replacement and Bypassing Algorithms / IEEE Transactions on Computers / Institute of Electrical and Electronics Engineers (IEEE) 57 (4) : 433~447 / 10.1109/TC.2007.70816
[confproc] A. Bakhoda / 2009 / Analyzing CUDA Workloads Using a Detailed GPU Simulator / the International Symposium on Analysis of Systems and Software (ISPASS) : 163~174
[confproc] H. Liu / 2008 / Cache Bursts: A New Approach for Eliminating Dead Blocks and Increasing Cache Efficiency / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 222~233
[book] D. Kirk / 2010 / Programming Massively Parallel Processors
[confproc] C. Wu / 2011 / SHiP: Signature-based Hit Predictor for High Performance Caching / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 430~441
[web] NVIDIA / CUDA SDK / http://developer.nvidia.com/gpu-computing-sdk
[confproc] X. Chen / 2014 / Adaptive Cache Management for Energy-Efficient GPU Computing / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 343~355
[confproc] N. Duong / 2012 / Improving Cache Management Policies Using Dynamic Reuse Distances / the IEEE/ACM International Symposium on Microarchitecture (MICRO) : 389~400
[confproc] X. Xie / 2015 / Coordinated Static and Dynamic Cache Bypassing for GPUs / the IEEE International Symposium on High Performance Computer Architecture (HPCA) : 76~88
[confproc] S. Che / 2009 / Rodinia: A Benchmark Suite for Heterogeneous Computing / the IEEE International Symposium on Workload Characterization (IISWC) : 44~54
[confproc] S. Hong / 2010 / An Integrated GPU Power and Performance Model / the International Symposium on Computer Architecture (ISCA) : 280~289
[confproc] J. Leng / 2013 / GPUWattch: Enabling Energy Optimizations in GPGPUs / the International Symposium on Computer Architecture (ISCA) : 487~498