ACM/IEEE Intl. Conf. for High Perf. Computing, Networking, Storage and Analysis, SC 2018


Article Details
Title: PRISM: predicting resilience of GPU applications using statistical methods
Article URLs:
Alternative Article URLs: No
Authors: Charu Kalra
  • Apple Computer
Fritz Previlon
  • ARM
Xiangyu Li
  • Northeastern University
Norman Rubin
  • Northeastern University
David R. Kaeli
  • Northeastern University
Sharing: Research produced artifacts
Verification: Authors have verified information
Artifact Evaluation Badge: awarded
Artifacts for some papers are reviewed by an artifact evaluation, reproducibility, or similarly named committee. This is one such paper that passed review.
Artifact URLs:
Artifact Correspondence Email Addresses:
NSF Award Numbers:
DBLP Key: conf/sc/KalraPLRK18
Author Comments: PRISM provides a systematic approach to predicting failures in GPU programs. PRISM extracts microarchitecture-agnostic features that characterize program resiliency; these features serve as effective predictors that drive our statistical model. PRISM can predict failures in applications without running exhaustive fault-injection campaigns, thereby reducing the effort of error estimation.
