ACM International Conference on Supercomputing, ICS, 2017

A research artifact is any by-product of a research project that is not directly included in the published research paper. In computer science research this is often source code and data sets, but it could also be media, documentation, inputs to proof assistants, shell scripts to run experiments, etc.

Title	Authors	Research Artifacts	Details

Hardware/software cooperative caching for hybrid DRAM/NVM memory architectures	Haikun Liu, Yujie Chen, Xiaofei Liao, Hai Jin, Bingsheng He, Long Zheng, Rentong Guo	https://github.com/CGCL-codes/HSCC	Discussion Comments: 0 Verification: Authors have not verified information
Carpool: a bufferless on-chip network supporting adaptive multicast and hotspot alleviation	Xi-Yue Xiang, Wentao Shi, Saugata Ghose, Lu Peng, Onur Mutlu, Nian-Feng Tzeng		Discussion Comments: 0 Verification: Authors have not verified information
Iteration-fusing conjugate gradient	Sicong Zhuang, Marc Casas		Discussion Comments: 0 Verification: Authors have not verified information
Frequent subtree mining on the automata processor: challenges and opportunities	Elaheh Sadredini, Reza Rahimi, Ke Wang, Kevin Skadron		Discussion Comments: 0 Verification: Authors have not verified information
Supporting automatic recovery in offloaded distributed programming models through MPI-3 techniques	Antonio J. Peña, Vicenç Beltran, Carsten Clauss, Thomas Moschny		Discussion Comments: 0 Verification: Authors have not verified information
SSDUP: a traffic-aware SSD burst buffer for HPC systems	Xuanhua Shi, Ming Li, Wei Liu, Hai Jin, Chen Yu, Yong Chen		Discussion Comments: 0 Verification: Authors have not verified information
Design and implementation of bandwidth-aware memory placement and migration policies for heterogeneous memory systems	Seongdae Yu, Seongbeom Park, Woongki Baek		Discussion Comments: 0 Verification: Authors have not verified information
HPAT: high performance analytics with scripting ease-of-use	Ehsan Totoni, Todd A. Anderson, Tatiana Shpeisman	https://github.com/IntelLabs/HPAT.jl	Discussion Comments: 0 Verification: Authors have not verified information
A performance analysis framework for exploiting GPU microarchitectural capability	Ke-ren Zhou, Guangming Tan, Xiuxia Zhang, Chaowei Wang, Ninghui Sun		Discussion Comments: 0 Verification: Authors have not verified information
GraphGrind: addressing load imbalance of graph partitioning	Jiawen Sun, Hans Vandierendonck, Dimitrios S. Nikolopoulos	https://github.com/hvdieren/graphgrind h.vandierendonck@qub.ac.uk hvandierendonck@acm.org	Author Comments: Discussion Comments: 0 Sharing: Research produced artifacts Verification: Authors have verified information
Efficient SIMD and MIMD parallelization of hash-based aggregation by conflict mitigation	Peng Jiang, Gagan Agrawal	https://github.com/jiangohiostate/ics2017_artifact	Author Comments: Discussion Comments: 0 Sharing: Research produced artifacts Verification: Authors have verified information
On improving performance of sparse matrix-matrix multiplication on GPUs	Rakshith Kunchum, Ankur Chaudhry, Aravind Sukumaran-Rajam, Qingpeng Niu, Israt Nisa, P. Sadayappan		Discussion Comments: 0 Verification: Authors have not verified information
Demystifying automata processing: GPUs, FPGAs or Micron's AP?	Marziyeh Nourian, Xiang Wang, Xiaodong Yu, Wu-chun Feng, Michela Becchi		Discussion Comments: 0 Verification: Authors have not verified information
Way-combining directory: an adaptive and scalable low-cost coherence directory	J. Rubén Titos Gil, Antonio Flores, Ricardo Fernández Pascual, Alberto Ros, Manuel E. Acacio		Discussion Comments: 0 Verification: Authors have not verified information
Globally homogeneous, locally adaptive sparse matrix-vector multiplication on the GPU	Markus Steinberger, Rhaleb Zayer, Hans-Peter Seidel	https://bitbucket.org/gpusmack/holaspmv	Author Comments: Discussion Comments: 0 Sharing: Research produced artifacts Verification: Authors have verified information
Novel HPC techniques to batch execution of many variable size BLAS computations on GPUs	Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack J. Dongarra		Discussion Comments: 0 Verification: Authors have not verified information
libPRISM: an intelligent adaptation of prefetch and SMT levels	Cristobal Ortega, Miquel Moretó, Marc Casas, Ramon Bertran, Alper Buyuktosunoglu, Alexandre E. Eichenberger, Pradip Bose	https://github.com/criort/libPRISM	Author Comments: Discussion Comments: 0 Sharing: Research produced artifacts Verification: Authors have verified information
Enabling scalability-sensitive speculative parallelization for FSM computations	Junqiao Qiu, Zhijia Zhao, Bo Wu, Abhinav Vishnu, Shuaiwen Leon Song		Discussion Comments: 0 Verification: Authors have not verified information
Packet coalescing exploiting data redundancy in GPGPU architectures	Kyung Hoon Kim, Rahul Boyapati, Jiayi Huang, Yuho Jin, Ki Hwan Yum, Eun Jung Kim		Discussion Comments: 0 Verification: Authors have not verified information
SPIRIT: a framework for creating distributed recursive tree applications	Nikhil Hegde, Jianqiao Liu, Milind Kulkarni		Author Comments: Discussion Comments: 0 Sharing: Research produced artifacts Verification: Authors have verified information
Dynamic scheduling for efficient hierarchical sparse matrix operations on the GPU	Andreas Derler, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger		Author Comments: Artifacts are not available yet, but might be available in the future. Please contact the authors directly for more information. Discussion Comments: 0 Sharing: Other Verification: Authors have verified information
Automatic topology mapping of diverse large-scale parallel applications	Juan J. Galvez, Nikhil Jain, Laxmikant V. Kalé	https://github.com/jovo/FastApproximateQAP	Discussion Comments: 0 Verification: Authors have not verified information
Optimizing recursive task parallel programs	Suyash Gupta, Rahul Shrivastava, V. Krishna Nandivada		Discussion Comments: 0 Verification: Authors have not verified information
Fast segmented sort on GPUs	Kaixi Hou, Weifeng Liu, Hao Wang, Wu-chun Feng		Discussion Comments: 0 Verification: Authors have not verified information
HiPA: history-based piecewise approximation for functions	Aurangzeb, Rudolf Eigenmann		Discussion Comments: 0 Verification: Authors have not verified information
Simplification and runtime resolution of data dependence constraints for loop transformations	Diogo Nunes Sampaio, Louis-Noël Pouchet, Fabrice Rastello	https://gitlab.inria.fr/nunessam/pghc	Author Comments: Discussion Comments: 0 Sharing: Research produced artifacts Verification: Authors have verified information
Revisiting phased transactional memory	Joao P. L. de Carvalho, Guido Araujo, Alexandro Baldassin		Discussion Comments: 0 Verification: Authors have not verified information
Compile-time optimized and statically scheduled N-D convnet primitives for multi-core and many-core (Xeon Phi) CPUs	Aleksandar Zlateski, H. Sebastian Seung		Discussion Comments: 0 Verification: Authors have not verified information
