How can AI workloads be engineered for optimal performance in modern HPC environments?
The rapid advancement of Artificial Intelligence (AI) and Machine Learning (ML) has positioned High-Performance Computing (HPC) systems as indispensable platforms for developing, training, and executing these workloads. However, the architectural complexity and batch-oriented design of traditional HPC systems pose unique challenges distinct from those encountered in resource-elastic environments such as clouds.
The parallelization characteristics, input/output requirements, and dynamic workflows of AI workloads demand innovative techniques for efficient utilization of HPC resources. Moreover, performance engineering of such workloads is crucial to achieving scalability, portability, and reproducibility across diverse system architectures.
This workshop aims to bring together researchers, practitioners, and system developers to discuss engineering challenges, performance optimization, and emerging opportunities at the intersection of AI and HPC. It invites, among other contributions, papers that present experimental results, architectural insights, performance studies, and best practices that advance the convergence of these domains.
We welcome submissions on topics including, but not limited to, the following:
Workload Characterization
Characterizing AI/ML workloads on HPC systems
Data preparation for AI/ML workloads on HPC
Hybrid workloads on HPC systems
Performance & Optimization
Parallelization strategies for AI/ML
Performance optimization of AI/ML frameworks on HPC
Efficient inference of LLMs on HPC
Cross-platform portability and reproducibility
Infrastructure & Systems
AI factories and end-to-end pipelines
Next-generation HPC systems for AI/ML
Best practices for integrating AI/ML into HPC
Specialized AI/ML frameworks for HPC
Resource Management
Resource allocation and scheduling for AI/ML workloads
Energy efficiency and power management
DevOps and MLOps for HPC-AI/ML
Applications
HPC-AI/ML convergence for scientific applications
AI-enhanced HPC simulations
Industrial AI/ML on HPC
Collaborative and interactive AI/ML on HPC
Evaluation & Benchmarking
HPC-AI/ML benchmarking and evaluation
Performance studies and best practices
