ACM/IEEE Intl. Conf. for High Perf. Computing, Networking, Storage and Analysis, SC 2016


Article Details
Title: Failure detection and propagation in HPC systems
Article URLs:
Alternative Article URLs:
Authors:
George Bosilca
  • University of Tennessee, ICL
Aurélien Bouteiller
  • University of Tennessee, ICL
Amina Guermouche
  • University of Tennessee, ICL
Thomas Hérault
  • University of Tennessee, ICL
Yves Robert
  • University of Tennessee, ICL
  • ENS Lyon, LIP
Pierre Sens
  • Sorbonne Univ.
  • UPMC
  • CNRS
  • Inria
  • LIP6
Jack J. Dongarra
  • University of Tennessee, ICL
  • Oak Ridge National Lab.
  • Manchester University
Sharing: Unknown
Verification: Authors have not verified information
Artifact Evaluation Badge: none
Artifact URLs:
Artifact Correspondence Email Addresses:
NSF Award Numbers:
DBLP Key: conf/sc/BosilcaBGHRSD16
Author Comments:
