


26th ICS 2012: Venice, Italy
- Utpal Banerjee, Kyle A. Gallivan, Gianfranco Bilardi, Manolis Katevenis:

International Conference on Supercomputing, ICS'12, Venice, Italy, June 25-29, 2012. ACM 2012, ISBN 978-1-4503-1316-2
Keynote address 1
- Yale N. Patt:

High performance supercomputers: should the individual processor be more than a brick? 1-2
Micro-architecture 1
- Mengjie Mao, Hong An, Bobin Deng, Tao Sun, Xuechao Wei, Wei Zhou, Wenting Han:

Distributed replay protocol for distributed uniprocessors. 3-14
GPUs, compilers
- Wenhao Jia, Kelly A. Shaw, Margaret Martonosi:

Characterizing and improving the use of demand-fetched caches in GPUs. 15-24
- Ziyu Guo, Bo Wu, Xipeng Shen:

One stone two birds: synchronization relaxation and redundancy removal in GPU-CPU translation. 25-36
- Hongtao Yu, Zhiyuan Li:

Fast loop-level data dependence profiling. 37-46
- Nishkam Ravi, Yi Yang, Tao Bao, Srimat T. Chakradhar:

Apricot: an optimizing compiler and productivity tool for x86-compatible many-core coprocessors. 47-58
Fault tolerance
- Somayeh Sardashti, David A. Wood:

UniFI: leveraging non-volatile memories for a unified fault tolerance and idle power management technique. 59-68
- Manu Shantharam, Sowmyalatha Srinivasmurthy, Padma Raghavan:

Fault tolerant preconditioned conjugate gradient for sparse linear system solution. 69-78
- Wenjing Ma, Sriram Krishnamoorthy:

Data-driven fault tolerance for work stealing computations. 79-90
- Marc Casas-Guix, Bronis R. de Supinski, Greg Bronevetsky, Martin Schulz:

Fault resilience of the algebraic multi-grid solver. 91-100
Micro-architecture 2, interconnection networks
- Janani Mukundan, Saugata Ghose, Robert Karmazin, Engin Ipek, José F. Martínez:

Overcoming single-thread performance hurdles in the core fusion reconfigurable multicore architecture. 101-110
- Mingxing Tan, Xianhua Liu, Tong Tong, Xu Cheng:

CVP: an energy-efficient indirect branch prediction with compiler-guided value pattern. 111-120
- Miao Luo, Dhabaleswar K. Panda, Khaled Z. Ibrahim, Costin Iancu:

Congestion avoidance on manycore high performance computing systems. 121-132
- Yi Xu, Jun Yang, Rami G. Melhem:

Channel borrowing: an energy-efficient nanophotonic crossbar architecture with light-weight arbitration. 133-142
Runtime, dependencies, load balancing
- Liang Han, Xiaowei Jiang, Wei Liu, Youfeng Wu, James Tuck:

HiRe: using hint & release to improve synchronization of speculative threads. 143-152
- Gokcen Kestor, Roberto Gioiosa, Osman S. Unsal, Adrián Cristal, Mateo Valero:

Enhancing the performance of assisted execution runtime systems through hardware/software techniques. 153-162
- Quan Chen, Minyi Guo, Zhiyi Huang:

CATS: cache aware task-stealing based on online profiling in multi-socket multi-core architectures. 163-172
- Tao Sun, Hong An, Tao Wang, Haibo Zhang, Xiufeng Sui:

CRQ-based fair scheduling on composable multicore architectures. 173-184
- Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy M. Amato:

Quantifying the effectiveness of load balance algorithms. 185-194
Communication, HPC applications
- John P. Stevenson, Amin Firoozshahian, Alex Solomatnikov, Mark Horowitz, David R. Cheriton:

Sparse matrix-vector multiply on the HICAMP architecture. 195-204
- Kenneth Czechowski, Casey Battaglino, Chris McClanahan, Kartik Iyer, P.-K. Yeung, Richard W. Vuduc:

On the communication complexity of 3D FFTs and its implications for Exascale. 205-214
- Gabriel Ilie Tanase, Gheorghe Almási, Hanhong Xue, Charles Archer:

Composable, non-blocking collective operations on power7 IH. 215-224
- Anshul Mittal, Nikhil Jain, Thomas George, Yogish Sabharwal, Sameer Kumar:

Collective algorithms for sub-communicators. 225-234
- Andrea Pietracaprina, Geppino Pucci, Matteo Riondato, Francesco Silvestri, Eli Upfal:

Space-round tradeoffs for MapReduce computations. 235-244
Keynote address 2
- Michael Gschwind:

Blue Gene/Q: design for sustained multi-petaflop computing. 245-246
Workloads
- Wayne Joubert, Shi-Quan Su:

An analysis of computational workloads for the ORNL Jaguar system. 247-256
Memory hierarchies & interconnects
- Nagendra Dwarakanath Gulur, R. Manikantan, Mahesh Mehendale, R. Govindarajan:

Multiple sub-row buffers in DRAM: unlocking performance and energy improvement opportunities. 257-266
- Yasuo Ishii, Mary Inaba, Kei Hiraki:

Unified memory optimizing architecture: memory subsystem control with a unified predictor. 267-278
- Dongyuan Zhan, Hong Jiang, Sharad C. Seth:

Locality & utility co-optimization for practical capacity management of shared last level caches. 279-290
- Keith D. Underwood, Eric Borch:

Exploiting communication and packaging locality for cost-effective large scale networks. 291-300
GPUs & parallel programming
- Paruj Ratanaworabhan, Martin Burtscher, Darko Kirovski, Benjamin G. Zorn:

Hardware support for enforcing isolation in lock-based parallel programs. 301-310
- Justin Holewinski, Louis-Noël Pouchet, P. Sadayappan:

High-performance code generation for stencil computations on GPU architectures. 311-320
- John W. Romein:

An efficient work-distribution strategy for gridding radio-telescope data on GPUs. 321-330
- Oded Green, Robert McColl, David A. Bader:

GPU merge path: a GPU merging algorithm. 331-340
GPUs, CPUs, & linear algebra
- Jungwon Kim, Sangmin Seo, Jun Lee, Jeongho Nah, Gangwon Jo, Jaejin Lee:

SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters. 341-352
- Bor-Yiing Su, Kurt Keutzer:

clSpMV: A Cross-Platform OpenCL SpMV Framework on GPUs. 353-364
- Fengguang Song, Stanimire Tomov, Jack J. Dongarra:

Enabling and scaling matrix computations on heterogeneous multi-core and multi-GPU systems. 365-376
- Jiajia Li, Xingjian Li, Guangming Tan, Mingyu Chen, Ninghui Sun:

An optimized large-scale hybrid DGEMM design for CPUs and ATI GPUs. 377-386















