IISWC 2022 Program

Sunday, November 6, 2022 - Tutorials
10am - 12pm CST	Tutorial 1: Performance Engineering in the Public Cloud, Padma Apparao, Intel Corporation. Abstract Slides
1pm - 3pm CST	Tutorial 2: Methods for characterizing workloads with Hardware accelerated memory page compression, Ravindran, Binuraj; Al-Fahad, Rakib; Liang, Dan; Chowdhury, Muktadir; Luo, Zhenlin, Intel Corporation. Abstract Slides
3:15pm - 5:15pm CST	Tutorial 3: Sparse Weight Compression and Decompression for Intel AMX/TMUL to Improve Deep Learning Performance, Shamima Najnin, Rajesh Poornachandran, Sreekanth V. Yalachigere, Mona Minakshi, Anik Khan, Ofir Zafrir, Nilesh Jain, Md Faijul Amin, Guy Boudoukh, Tatyana Primak, Pallavi G, Intel Corporation. Abstract Slides

Monday, November 7, 2022
8:00 - 8:30 am CST	Breakfast
8:30 - 8:45 am CST	Opening Remarks
8:45 - 9:30 am CST	Keynote 1: Virtual Reality: The current state of the art and the opportunities, Amit Puntambekar, Meta
9:30 - 10:45 am CST	Session 1: Microarchitecture/HW Performance Analysis
11:00 - 12:15 pm CST	Session 2: HPC
1:00 - 2:30 pm CST	Panel
2:45 - 4:25 pm CST	Session 3: AI Systems
5:00 - 8:00 pm CST	Social event: Austin River Cruise and Dinner

Tuesday, November 8, 2022
8:00 - 8:30 am CST	Breakfast
8:30 - 8:45 am CST	Opening Remarks
8:45 - 9:30 am CST	Keynote 2: Overcoming the challenges when viewing oneAPI as a performance workload, Paul Petersen, Intel
9:30 - 10:45 am CST	Session 4: Graph Neural Networks
11:00 - 12:15 pm CST	Session 5: Graph Analytics and GPUs
1:30 - 3:15 pm CST	Session 6: Mobile, Web, and Cloud
3:30 - 4:45 pm CST	Session 7: AI Benchmarks & Characterization
4:45 - 5:00 pm CST	Closing Remarks

Monday, November 7, 2022
8:45 - 9:30 am CST	Keynote 1: Virtual Reality: The current state of the art and the opportunities, Amit Puntambekar, Meta
Recent advances in computing technology, mobile computing, computer graphics, and a better understanding of human perceptual processes are making the experiences provided by virtual reality devices incredibly realistic. Impressive new technologies like hand tracking, voice recognition, face tracking, and mixed reality, driven by ever more sophisticated AI models, are being incorporated into completely standalone devices, making them significantly easier to use and accessible to all. These devices can transport the human mind into space, down a roller coaster, or to Alaska to watch the northern lights, while the body sits on a couch in the comfort of home. I think we are once again at an inflexion point in technology, one that will revolutionize the way humans communicate with each other using these new capabilities in VR devices, just as the invention of the telephone did over 120 years ago. In this talk I will explore some of the technologies underlying these devices and the opportunities they afford the research community to advance VR's state of the art.
Amit is a Director of Engineering at Meta and currently leads platform engineering efforts in VR, which include the VR Operating System, VR Foundation and VR Ecosystem engineering teams. Prior to VR, Amit spent 7+ years at Meta working on video at Facebook, leading efforts on video encoding, video platform, and machine learning for content understanding and recommendation systems. Prior to Meta, Amit co-founded a video processing company, which was acquired by Meta. He holds 10+ patents across Video, ML, Infra and Distributed Systems.
9:30 - 10:45 am CST	Session 1: Microarchitecture/HW Performance Analysis Carole-Jean Wu (Meta AI / Arizona State University)
	PInTE: Probabilistic Induction of Theft Evictions Slides Cesar A Gomes, Xuesi Chen, Mark Hempstead (Tufts University)
	GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation Ondřej Sýkora, Phitchaya Mangpo Phothilimthana (Google Research), Charith Mendis (UIUC), Amir Yazdanbakhsh (Google Research)
	UVM Discard: Eliminating Redundant Memory Transfers for Accelerators Weixi Zhu (Rice University), Guilherme Cox, Jan Vesely, Mark Hairgrove (NVIDIA), Alan L. Cox, Scott Rixner (Rice University)
11:00 - 12:15 pm CST	Session 2: HPC Mark Hempstead (Tufts University)
	FPChecker: Floating-Point Exception Detection Tool and Benchmark for Parallel and Distributed HPC Ignacio Laguna (Lawrence Livermore National Laboratory), Tanmay Tirpankar, Xinyi Li, Ganesh Gopalakrishnan (University of Utah)
	Splash-4: A Modern Benchmark Suite with Lock-Free Constructs Slides Eduardo José Gómez-Hernández, Juan M. Cebrian (University of Murcia), Stefanos Kaxiras (Uppsala University), Alberto Ros (University of Murcia)
	Characterizing Molecular Dynamics Simulation on Commodity Platforms Francesco Peverelli, Davide Conficconi (Politecnico di Milano, Italy), Davide B. Bartolini, Alberto Scolari (Huawei), Marco D. Santambrogio (Politecnico di Milano, Italy)
1:00 - 2:30 pm CST	Panel: 25 Years of IISWC: Looking Back and Forward Moderator: Lieven Eeckhout (Ghent University) Panelists: John Carter (IBM), Lizy K. John (University of Texas at Austin), David Kaeli (Northeastern University), Vijay Janapa Reddi (Harvard University), Carole-Jean Wu (Meta), Neeraja J. Yadwadkar (University of Texas at Austin)
2:45 - 4:25 pm CST	Session 3: AI Systems Ravi Iyer (Intel)
	An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks Kiran Seshadri, Berkin Akin (Google), James Laudon (Google Research), Ravi Narayanaswami (Google, Cruise), Amir Yazdanbakhsh (Google Research)
	Accelerating Transformer Networks through Recomposing Softmax Layers Slides Jaewan Choi, Hailong Li (Seoul National University), Byeongho Kim (Samsung Electronics), Seunghwan Hwang, Jung Ho Ahn (Seoul National University)
	A Slice and Dice Approach to Accelerate Compound Sparse Attention on GPU Slides Hailong Li, Jaewan Choi, Jung Ho Ahn (Seoul National University)
	FedGPO: Heterogeneity-Aware Global Parameter Optimization for Efficient Federated Learning Young Geun Kim (Korea University), Carole-Jean Wu (Meta AI / Arizona State University)

Tuesday, November 8, 2022
8:45 - 9:30 am CST	Keynote 2: Overcoming the challenges when viewing oneAPI as a performance workload, Paul Petersen, Intel (Slides)
The vision for oneAPI is to be an open, cross-architecture programming model that allows developers to use a single code base across multiple accelerator architectures. Delivering this vision requires creating open specifications, creating open-source projects that provide implementations, encouraging the emergence of an open community, and delivering an instance of this as a specific product that enables developers to fully utilize a hardware platform. In this context, I want to talk about a range of challenges, and some of the methods by which we solved them, as we looked at the combination of applications and the oneAPI runtime as a workload to optimize. Often the challenge was in how to observe and understand execution behaviors, to see whether a behavior was an expected workload characteristic or an example of overhead we could reduce.
Paul Petersen is a Fellow in Intel/SATG (Software and Advanced Technology Group), and oneAPI Architect. He received a Ph.D. in Computer Science from the University of Illinois in 1993. Starting at Kuck and Associates, Inc. (KAI), his responsibilities included enhancing the auto-parallelizing compiler (KAP) and the early definition and implementations of OpenMP. While at KAI, he developed the Assure line of parallelization/correctness products for Fortran, C++ and Java. In 2000, Intel Corporation acquired KAI, and he joined the software tools group creating the Thread Checker products, which evolved into the Inspector and Advisor components of the Intel® Parallel Studio. Inspector uses dynamic binary instrumentation to detect memory and concurrency bugs, and Advisor uses similar techniques along with performance measurement and modeling to assist developers in transforming existing serial applications to be ready for parallel execution. His passion for software architecture grew to cover the architecture of all of Parallel Studio XE and its components. After a few years leading software tools pathfinding, with a focus on defining next-generation features for parallel runtimes and software analysis tools, Paul returned to software architecture in his current role leading the oneAPI Tools Architecture team.
9:30 - 10:45 am CST	Session 4: Graph Neural Networks Reetu Das (University of Michigan)
	Bottleneck Analysis of Dynamic Graph Neural Network Inference on CPU and GPU Slides Hanqiu Chen, Yahya Alhinai, Yihan Jiang (GaTech), Eunjee Na (KAIST), Cong (Callie) Hao (GaTech)
	gSuite: A Flexible and Framework Independent Benchmark Suite for Graph Neural Network Inference on GPUs Slides Taha Tekdoğan, Serkan Göktaş, Ayse Yilmazer-Metin (Istanbul Technical University)
	Characterizing the Efficiency of Graph Neural Network Frameworks with a Magnifying Glass Xin Huang (Texas State University), Jongryool Kim (SK hynix America), Brad Rees (NVIDIA), Chul-Ho Lee (Texas State University)
11:00 - 12:15 pm CST	Session 5: Graph Analytics and GPUs Cristina Beldica (Intel)
	Performance Characterization of AutoNUMA Memory Tiering on Graph Analytics Slides Diego Moura (Federal University of Bahia), Daniel Mossé (University of Pittsburgh), Vinicius Petrucci (Micron)
	Understanding the Power of Evolutionary Computation for GPU Code Optimization Slides Jhe-Yu Liou (Arizona State University), Muaaz Awan, Steven Hofmeyr (Lawrence Berkeley National Laboratory), Carole-Jean Wu, Stephanie Forrest (Arizona State University)
	The Implications of Page Size Management on Graph Analytics Slides Aninda Manocha (Princeton University), Zi Yan (NVIDIA), Esin Tureci (Princeton University), Juan Luis Aragón (University of Murcia), David Nellans (NVIDIA), Margaret Martonosi (Princeton University)
1:30 - 3:15 pm CST	Session 6: Mobile, Web, and Cloud Chris Hughes (Intel)
	Revisiting Temporal Storage I/O Behaviors of Smartphone Applications: Analysis and Synthesis Slides Qiang Zou (Southwest University), Bo Mao (Xiamen University)
	How Far We’ve Come – A Characterization Study of Standalone WebAssembly Runtimes Slides Wenwen Wang (University of Georgia)
	SpotLake: Diverse Spot Instance Dataset Archive Service Slides Sungjae Lee, Jaeil Hwang, Kyungyong Lee (Kookmin University)
	Leaps and Bounds: Analyzing WebAssembly's Performance with a Focus on Bounds Checking Slides Raven Szewczyk, Kim Stonehouse, Antonio Barbalace (University of Edinburgh, United Kingdom), Tom Spink (University of St Andrews, United Kingdom)
3:30 - 4:45 pm CST	Session 7: AI Benchmarks & Characterization Chris Hughes (Intel)
	Demystifying Map Space Exploration for NPUs Sheng-Chun Kao (GaTech), Angshuman Parashar, Po-An Tsai (NVIDIA), Tushar Krishna (GaTech)
	LongTail-Bench: A Benchmark Suite for Domain-Specific Operators in Deep Learning Xiuhong Li (SenseTime Research & Shanghai AI Lab), Shengen Yan, Lijuan Jiang, Ping Xu (SenseTime Research), Jinming Ma (Shanghai AI Lab), Xingcheng Zhang (SenseTime Research & Shanghai AI Lab), Dahua Lin (The Chinese University of Hong Kong & Shanghai AI Lab)
	Demystifying BERT: System Design Implications Suchita Pati (University of Wisconsin-Madison), Shaizeen Aga, Nuwan Jayasena (AMD Research), Matthew D. Sinclair (University of Wisconsin-Madison and AMD Research)
4:45 - 5:00 pm CST	Closing Remarks
