


IPDPS 2014: Phoenix, AZ, USA
- 2014 IEEE 28th International Parallel and Distributed Processing Symposium, Phoenix, AZ, USA, May 19-23, 2014. IEEE Computer Society 2014, ISBN 978-1-4799-3799-8

Keynote Speaker 1
- Yutong Lu:
  Scalability-Centric HPC System Design. 3
Session 1: Algorithms for Resource Management and Awareness
- Henri Casanova, Lipyeow Lim, Yves Robert, Frédéric Vivien, Dounia Zaidouni:
  Cost-Optimal Execution of Boolean Query Trees with Shared Streams. 7-16
- Matthias Rost, Stefan Schmid, Anja Feldmann:
  It's About Time: On Optimal Virtual Network Embeddings under Temporal Flexibilities. 17-26
- Mehmet Deveci, Sivasankaran Rajamanickam, Vitus J. Leung, Kevin T. Pedretti, Stephen L. Olivier, David P. Bunde, Ümit V. Çatalyürek, Karen D. Devine:
  Exploiting Geometric Partitioning in Task Mapping for Parallel Computers. 27-36
- Moshe Gabel, Assaf Schuster, Daniel Keren:
  Communication-Efficient Distributed Variance Monitoring and Outlier Detection for Multivariate Time Series. 37-47
Session 2: Big Data Processing
- Huayong Wang, Li-Shiuan Peh:
  MobiStreams: A Reliable Distributed Stream Processing System for Mobile Devices. 51-60
- Devesh Tiwari, Yan Solihin:
  MapReuse: Reusing Computation in an In-Memory MapReduce System. 61-71
- Mücahid Kutlu, Gagan Agrawal:
  PAGE: A Framework for Easy PArallelization of GEnomic Applications. 72-81
- Marcelo Veiga Neves, César A. F. De Rose, Kostas Katrinis, Hubertus Franke:
  Pythia: Faster Big Data in Motion through Predictive Software-Defined Network Optimization at Runtime. 82-90
Session 3: GPU
- Yi Yang, Ping Xiang, Michael Mantor, Norman Rubin, Lisa R. Hsu, Qunfeng Dong, Huiyang Zhou:
  A Case for a Flexible Scalar Unit in SIMT Architecture. 93-102
- Ayse Yilmazer, Zhongliang Chen, David R. Kaeli:
  Scalar Waving: Improving the Efficiency of SIMD Execution on GPUs. 103-112
- Yuki Abe, Hiroshi Sasaki, Shinpei Kato, Koji Inoue, Masato Edahiro, Martin Peres:
  Power and Performance Characterization and Modeling of GPU-Accelerated Systems. 113-122
- Ivan Grasso, Petar Radojkovic, Nikola Rajovic, Isaac Gelado, Alex Ramírez:
  Energy Efficient HPC on Embedded SoCs: Optimization Techniques for Mali GPU. 123-132
Session 4: I/O, Storage, and Networking
- Bogdan Nicolae, Pierre Riteau, Kate Keahey:
  Bursting the Cloud Data Bubble: Towards Transparent Storage Elasticity in IaaS Clouds. 135-144
- Jian Huang, Xuechen Zhang, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Stéphane Ethier, Scott Klasky:
  Scibox: Online Sharing of Scientific Data via the Cloud. 145-154
- Matthieu Dorier, Gabriel Antoniu, Robert B. Ross, Dries Kimpe, Shadi Ibrahim:
  CALCioM: Mitigating I/O Interference in HPC Systems through Cross-Application Coordination. 155-164
- Marc Casas, Greg Bronevetsky:
  Active Measurement of the Impact of Network Switch Utilization on Application Performance. 165-174
Session 5: Multi-core Algorithms
- Bryan C. Ward, James H. Anderson:
  Multi-resource Real-Time Reader/Writer Locks for Multiprocessors. 177-186
- Ahmed Hassan, Roberto Palmieri, Binoy Ravindran:
  Remote Invalidation: Optimizing the Critical Path of Memory Transactions. 187-197
- Haim Avron, Alex Druinsky, Anshul Gupta:
  Revisiting Asynchronous Linear Solvers: Provable Convergence Rate through Randomization. 198-207
- Benjamin S. Parsons, Vijay S. Pai:
  Accelerating MPI Collective Communications through Hierarchical Algorithms Without Sacrificing Inter-Node Communication Flexibility. 208-218
Session 6: Computational Biology
- Boyu Zhang, Trilce Estrada, Pietro Cicotti, Michela Taufer:
  Enabling In-Situ Data Analysis for Large Protein-Folding Trajectory Datasets. 221-230
- Gregory M. Striemer, Harsha Krovi, Ali Akoglu, Benjamin Vincent, Ben Hopson, Jeffrey Frelinger, Adam Buntzman:
  Overcoming the Limitations Posed by TCR-beta Repertoire Modeling through a GPU-Based In-Silico DNA Recombination Algorithm. 231-240
- Sanchit Misra, Kiran Pamnany, Srinivas Aluru:
  Parallel Mutual Information Based Construction of Whole-Genome Networks on the Intel (R) Xeon Phi (TM) Coprocessor. 241-250
- Jing Zhang, Hao Wang, Heshan Lin, Wu-chun Feng:
  cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU. 251-260
Session 7: Interconnection Network
- Ikki Fujiwara, Michihiro Koibuchi, Hiroki Matsutani, Henri Casanova:
  Skywalk: A Topology for HPC Networks with Low-Delay Switches. 263-272
- Xin Yuan, Santosh Mahapatra, Michael Lang, Scott Pakin:
  LFTI: A New Performance Metric for Assessing Interconnect Designs for Extreme-Scale HPC Systems. 273-282
- Pavan Poluri, Ahmed Louri:
  An Improved Router Design for Reliable On-Chip Networks. 283-292
- Jieming Yin, Pingqiang Zhou, Sachin S. Sapatnekar, Antonia Zhai:
  Energy-Efficient Time-Division Multiplexed Hybrid-Switched NoC for Heterogeneous Multicore Systems. 293-303
Session 8: System-Level Resource Management
- Dazhao Cheng, Changjun Jiang, Xiaobo Zhou:
  Heterogeneity-Aware Workload Placement and Migration in Distributed Sustainable Datacenters. 307-316
- Zahra Abbasi, Madhurima Pore, Sandeep K. S. Gupta:
  Online Server and Workload Management for Joint Optimization of Electricity Cost and Carbon Footprint Across Data Centers. 317-326
- Hsuan-Yi Chu, Yogesh Simmhan:
  Cost-Efficient and Resilient Job Life-Cycle Management on Hybrid Clouds. 327-336
- Giuseppe Coviello, Srihari Cadambi, Srimat T. Chakradhar:
  A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters. 337-346
Session 9: GPU Algorithms
- Andrew A. Davidson, Sean Baxter, Michael Garland, John D. Owens:
  Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths. 349-359
- Hristo N. Djidjev, Sunil Thulasidasan, Guillaume Chapuis, Rumen Andonov, Dominique Lavenier:
  Efficient Multi-GPU Computation of All-Pairs Shortest Paths. 360-369
- Weifeng Liu, Brian Vinter:
  An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data. 370-381
- Ichitaro Yamazaki, Hartwig Anzt, Stanimire Tomov, Mark Hoemmen, Jack J. Dongarra:
  Improving the Performance of CA-GMRES on Multicores with Multiple GPUs. 382-391
Session 10: Graph and Network Processing
- Yong Guo, Marcin Biczak, Ana Lucia Varbanescu, Alexandru Iosup, Claudio Martella, Theodore L. Willke:
  How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis. 395-404
- George M. Slota, Kamesh Madduri:
  Complex Network Analysis Using Parallel Approximate Motif Counting. 405-414
- Indranil Roy, Srinivas Aluru:
  Finding Motifs in Biological Sequences Using the Micron Automata Processor. 415-424
- Fabio Checconi, Fabrizio Petrini:
  Traversing Trillions of Edges in Real Time: Graph Exploration on Large-Scale Parallel Machines. 425-434
Session 11: Modeling, Simulation, and Reliability
- Jen-Cheng Huang, Lifeng Nai, Hyesoon Kim, Hsien-Hsin S. Lee:
  TBPoint: Reducing Simulation Time for Large-Scale GPGPU Kernels. 437-446
- Jee W. Choi, Marat Dukhan, Xing Liu, Richard W. Vuduc:
  Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks. 447-457
- Keun Soo Yim:
  Characterization of Impact of Transient Faults and Detection of Data Corruption Errors in Large-Scale N-Body Programs Using Graphics Processing Units. 458-467
- Jichi Guo, Jiayuan Meng, Qing Yi, Vitali A. Morozov, Kalyan Kumaran:
  Analytically Modeling Application Execution for Software-Hardware Co-design. 468-477
Session 12: Accelerator Application Development and Optimization
- Seyong Lee, Dong Li, Jeffrey S. Vetter:
  Interactive Program Debugging and Optimization for Directive-Based, Efficient GPU Computing. 481-490
- Azzam Haidar, Chongxiao Cao, Asim YarKhan, Piotr Luszczek, Stanimire Tomov, Khairul Kabir, Jack J. Dongarra:
  Unified Development for Mixed Multi-GPU and Multi-coprocessor Environments Using a Lightweight Runtime Environment. 491-500
- Saurav Muralidharan, Manu Shantharam, Mary W. Hall, Michael Garland, Bryan Catanzaro:
  Nitro: A Framework for Adaptive Code Variant Tuning. 501-512
- Peter M. Kogge:
  Reading the Tea-Leaves: How Architecture Has Evolved at the High End. 515
Session 13: Combinatorial Algorithms
- Fredrik Manne, Mahantesh Halappanavar:
  New Effective Multithreaded Matching Algorithms. 519-528
- Daniël Maria Pelt, Rob H. Bisseling:
  A Medium-Grain Method for Fast 2D Bipartitioning of Sparse Matrices. 529-539
- Fanny Dufossé, Kamer Kaya, Bora Uçar:
  Bipartite Matching Heuristics with Quality Guarantees on Shared Memory Parallel Computers. 540-549
- George M. Slota, Sivasankaran Rajamanickam, Kamesh Madduri:
  BFS and Coloring-Based Parallel Algorithms for Strongly Connected Components and Related Problems. 550-559
Session 14: Large Scale Scientific Applications
- Xing Liu, Edmond Chow:
  Large-Scale Hydrodynamic Brownian Simulations on Multicore and Manycore Architectures. 563-572
- Adam Fidel, Sam Ade Jacobs, Shishir Sharma, Nancy M. Amato, Lawrence Rauchwerger:
  Using Load Balancing to Scalably Parallelize Sampling-Based Motion Planning Algorithms. 573-582
- James E. McClure, Hao Wang, Jan F. Prins, Cass T. Miller, Wu-chun Feng:
  Petascale Application of a Coupled CPU-GPU Algorithm for Simulation and Analysis of Multiphase Flow Solutions in Porous Medium Systems. 583-592
- Amanda Peters Randles, Efthimios Kaxiras:
  A Spatio-temporal Coupling Method to Reduce the Time-to-Solution of Cardiovascular Simulations. 593-602
Session 15: Multicore and Transactional Memory
- Lihang Zhao, Lizhong Chen, Jeffrey T. Draper:
  Mitigating the Mismatch between the Coherence Protocol and Conflict Detection in Hardware Transactional Memory. 605-614
- Bhavishya Goel, J. Rubén Titos Gil, Anurag Negi, Sally A. McKee, Per Stenström:
  Performance and Energy Analysis of the Restricted Transactional Memory Implementation on Haswell. 615-624
- Madhavan Manivannan, Per Stenström:
  Runtime-Guided Cache Coherence Optimizations in Multi-core Architectures. 625-636
- Akshay Venkatesh, Sreeram Potluri, Raghunath Rajachandrasekar, Miao Luo, Khaled Hamidouche, Dhabaleswar K. Panda:
  High Performance Alltoall and Allgather Designs for InfiniBand MIC Clusters. 637-646
Session 16: HPC Operating Systems and Runtime Systems
- Brian Kocoloski, John R. Lange:
  HPMMAP: Lightweight Memory Management for Commodity Operating Systems. 649-658
- Swann Perarnau, Mitsuhisa Sato:
  Victim Selection and Distributed Work Stealing Performance: A Case Study. 659-668
- Ramy Medhat, Borzoo Bonakdarpour, Sebastian Fischmeister:
  Power-Efficient Multiple Producer-Consumer. 669-678
- Young Wn Song, Yann-Hang Lee:
  Efficient Data Race Detection for C/C++ Programs Using Dynamic Granularity. 679-688
Session 17: Algorithms for Distributed Computing
- Jiaqi Wang, Edward Talmage, Hyunyoung Lee, Jennifer L. Welch:
  Improved Time Bounds for Linearizable Implementations of Abstract Data Types. 691-701
- Gopal Pandurangan, Peter Robinson, Amitabh Trehan:
  DEX: Self-Healing Expanders. 702-711
- Jeremy T. Fineman, Calvin C. Newport, Micah Sherr, Tonghe Wang:
  Fair Maximal Independent Sets. 712-721
Session 18: Milestones at the Petascale
- Chuanfu Xu, Lilun Zhang, Xiaogang Deng, Jianbin Fang, Guangxue Wang, Wei Cao, Yonggang Che, Yongxian Wang, Wei Liu:
  Balancing CPU-GPU Collaborative High-Order CFD Simulations on the Tianhe-1A Supercomputer. 725-734
- Valéry Weber, Costas Bekas, Teodoro Laino, Alessandro Curioni, Adam Bertsch, Scott Futral:
  Shedding Light on Lithium/Air Batteries Using Millions of Threads on the BG/Q Supercomputer. 735-744
- Wei Xue, Chao Yang, Haohuan Fu, Xinliang Wang, Yangtong Xu, Lin Gan, Yutong Lu, Xiaoqian Zhu:
  Enabling and Scaling a Global Shallow-Water Atmospheric Model on Tianhe-2. 745-754
- Jae-Seung Yeom, Abhinav Bhatele, Keith R. Bisset, Eric J. Bohm, Abhishek Gupta, Laxmikant V. Kalé, Madhav V. Marathe, Dimitrios S. Nikolopoulos, Martin Schulz, Lukasz Wesolowski:
  Overcoming the Scalability Challenges of Epidemic Simulations on Blue Waters. 755-764
Session 19: Storage and Reliability
- Bo Mao, Hong Jiang, Suzhen Wu, Lei Tian:
  POD: Performance Oriented I/O Deduplication for Primary Storage Systems in the Cloud. 767-776
- Zigang Zhang, Yinliang Yue, Bingsheng He, Jin Xiong, Mingyu Chen, Lixin Zhang, Ninghui Sun:
  Pipelined Compaction for the LSM-Tree. 777-786
- Jiaxin Ou, Jiwu Shu, Youyou Lu, Letian Yi, Wei Wang:
  EDM: An Endurance-Aware Data Migration Scheme for Load Balancing in SSD Storage Clusters. 787-796
Session 20: Map/Reduce and Big Data
- Yandong Wang, Robin Goldstone, Weikuan Yu, Teng Wang:
  Characterization and Optimization of Memory-Resident MapReduce on HPC Systems. 799-808
- Yang You, Shuaiwen Leon Song, Haohuan Fu, Andres Marquez, Maryam Mehri Dehnavi, Kevin J. Barker, Kirk W. Cameron, Amanda Peters Randles, Guangwen Yang:
  MIC-SVM: Designing a Highly Efficient Support Vector Machine for Advanced Modern Multi-core and Many-Core Architectures. 809-818
- Reza Mokhtari, Michael Stumm:
  BigKernel - High Performance CPU-GPU Communication Pipelining for Big Data-Style Applications. 819-828
- Xiaoyi Lu, Fan Liang, Bing Wang, Li Zha, Zhiwei Xu:
  DataMPI: Extending MPI to Hadoop-Like Big Data Computing. 829-838
Session 21: Network Algorithms
- Patrick MacArthur, Robert D. Russell:
  An Efficient Method for Stream Semantics over RDMA. 841-851
- Zhiyang Guo, Yuanyuan Yang:
  Collaborative Network Configuration in Hybrid Electrical/Optical Data Center Networks. 852-861
- Hadas Shachnai, Ariella Voloshin, Shmuel Zaks:
  Optimizing Bandwidth Allocation in Flex-Grid Optical Networks with Application to Scheduling. 862-871
- Di Zhu, Lizhong Chen, Siyu Yue, Timothy Mark Pinkston, Massoud Pedram:
  Balancing On-Chip Network Latency in Multi-application Mapping for Chip-Multiprocessors. 872-881
- Joshua S. Bloom:
  Astrophysical Applications of Machine Learning at Scale and under Duress. 885
Best Papers Session
- Venkatesan T. Chakaravarthy, Fabio Checconi, Fabrizio Petrini, Yogish Sabharwal:
  Scalable Single Source Shortest Path Algorithms for Massively Parallel Systems. 889-901
- Xing Liu, Aftab Patel, Edmond Chow:
  A New Scalable Parallel Algorithm for Fock Matrix Construction. 902-914
- Xun Li, Diana Franklin, Ricardo Bianchini, Frederic T. Chong:
  ReDHiP: Recalibrating Deep Hierarchy Prediction for Energy Efficiency. 915-926
- Kaushik Ravichandran, Santosh Pande:
  F2C2-STM: Flux-Based Feedback-Driven Concurrency Control for STMs. 927-938
Session 22: Performance Characterization and Optimization
- Harald Servat, Germán Llort, Juan Gonzalez, Judit Giménez, Jesús Labarta:
  Identifying Code Phases Using Piece-Wise Linear Regressions. 941-951
- Alessio Sclocco, Henri E. Bal, Jason W. T. Hessels, Joeri van Leeuwen, Rob van Nieuwpoort:
  Auto-Tuning Dedispersion for Many-Core Accelerators. 952-961
- Florin Dinu, T. S. Eugene Ng:
  RCMP: Enabling Efficient Recomputation Based Failure Resilience for Big Data Analytics. 962-971
- Tingxing Dong, Veselin Dobrev, Tzanio V. Kolev, Robert N. Rieben, Stanimire Tomov, Jack J. Dongarra:
  A Step towards Energy Efficient Computing: Redesigning a Hydrodynamic Application on CPU-GPU. 972-981
Session 23: Multithreading and Concurrency
- Zehra Sura, Kevin O'Brien, José R. Brunheroto:
  Using Multiple Threads to Accelerate Single Thread Performance. 985-994
- Marc Casas, Greg Bronevetsky:
  Active Measurement of Memory Resource Consumption. 995-1004
- Korbinian Molitorisz, Thomas Karcher, Alexander Biele, Walter F. Tichy:
  Locating Parallelization Potential in Object-Oriented Data Structures. 1005-1015
Session 24: Numerical Algorithms
- Sudip K. Seal:
  An Accelerated Recursive Doubling Algorithm for Block Tridiagonal Systems. 1019-1028
- Mathieu Faverge, Julien Herrmann, Julien Langou, Bradley R. Lowery, Yves Robert, Jack J. Dongarra:
  Designing LU-QR Hybrid Solvers for Performance and Stability. 1029-1038
- Md Rakib Hasan, R. Clint Whaley:
  Effectively Exploiting Parallel Scale for All Problem Sizes in LU Factorization. 1039-1048
- Tyler M. Smith, Robert A. van de Geijn, Mikhail Smelyanskiy, Jeff R. Hammond, Field G. Van Zee:
  Anatomy of High-Performance Many-Threaded Matrix Multiplication. 1049-1059
Session 25: Performance Impacts of Hardware Acceleration
- George Teodoro, Tahsin M. Kurç, Jun Kong, Lee Cooper, Joel H. Saltz:
  Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis. 1063-1072
- Frank Tobias Winter, Mike A. Clark, Robert G. Edwards, Bálint Joó:
  A Framework for Lattice QCD Calculations on GPUs. 1073-1082
- Karthikeyan Vaidyanathan, Kiran Pamnany, Dhiraj D. Kalamkar, Alexander Heinecke, Mikhail Smelyanskiy, Jongsoo Park, Daehyun Kim, Aniruddha G. Shet, Bharat Kaul, Bálint Joó, Pradeep Dubey:
  Improving Communication Performance and Scalability of Native Applications on Intel Xeon Phi Coprocessor Clusters. 1083-1092
- Joshua Payne, Dana A. Knoll, Allen McPherson, William T. Taitano, Luis Chacón, Guangye Chen, Scott Pakin:
  Computational Co-design of a Multiscale Plasma Application: A Process and Initial Results. 1093-1102
Session 26: Programming Models and Tools
- Yili Zheng, Amir Kamil, Michael B. Driscoll, Hongzhang Shan, Katherine A. Yelick:
  UPC++: A PGAS Extension for C++. 1105-1114
- Khaled Z. Ibrahim, Paul Hargrove, Costin Iancu, Katherine A. Yelick:
  An Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect. 1115-1125
- Alessandro Morari, Antonino Tumeo, Daniel G. Chavarría-Miranda, Oreste Villa, Mateo Valero:
  Scaling Irregular Applications through Data Aggregation and Software Multithreading. 1126-1135
- Michelle Mills Strout, Fabio Luporini, Christopher D. Krieger, Carlo Bertolli, Gheorghe-Teodor Bercea, Catherine Olschanowsky, J. Ramanujam, Paul H. J. Kelly:
  Generalizing Run-Time Tiling with the Loop Chain Abstraction. 1136-1145
Session 27: Algorithms for High Performance Computing
- Samuel Williams, Mike Lijewski, Ann S. Almgren, Brian van Straalen, Erin C. Carson, Nicholas Knight, James Demmel:
  s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid. 1149-1158
- Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Hong Diep Nguyen, Edgar Solomonik:
  Reconstructing Householder Vectors from Tall-Skinny QR. 1159-1170
- Katsuki Fujisawa, Toshio Endo, Yuichiro Yasui, Hitoshi Sato, Naoki Matsuzawa, Satoshi Matsuoka, Hayato Waki:
  Petascale General Solver for Semidefinite Programming Problems with Over Two Million Constraints. 1171-1180
- Sheng Di, Mohamed-Slim Bouguerra, Leonardo Arturo Bautista-Gomez, Franck Cappello:
  Optimization of Multi-level Checkpoint Model for Large Scale HPC Applications. 1181-1190
Session 28: Scalable Algorithms
- James Elliott, Mark Hoemmen, Frank Mueller:
  Evaluating the Impact of SDC on the GMRES Iterative Solver. 1193-1202
- Mohand Mezmaz, Rudi Leroy, Nouredine Melab, Daniel Tuyttens:
  A Multi-core Parallel Branch-and-Bound Algorithm Using Factorial Number System. 1203-1212
- Hasan Metin Aktulga, Aydin Buluç, Samuel Williams, Chao Yang:
  Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations. 1213-1222
Session 29: Resilience and Reliability
- Kento Sato, Adam Moody, Kathryn M. Mohror, Todd Gamblin, Bronis R. de Supinski, Naoya Maruyama, Satoshi Matsuoka:
  FMI: Fault Tolerant Messaging Interface for Fast and Transparent Recovery. 1225-1234
- Andrea Arteaga, Oliver Fuhrer, Torsten Hoefler:
  Designing Bit-Reproducible Portable High-Performance Applications. 1235-1244
- Qiang Guan, Nathan DeBardeleben, Sean Blanchard, Song Fu:
  F-SEFI: A Fine-Grained Soft Error Fault Injection Tool for Profiling Application Vulnerability. 1245-1254
