Key updates include the integration of SGLang for accelerated AI inference, an enhanced FlashAttention-2 for improved AI training and inference, multi-node Fast Fourier Transform (FFT) support, a new Fortran compiler, and upgraded computer vision libraries for AV1, JPEG decoding, and audio augmentation.
According to AMD's blog post, SGLang, a runtime supported in ROCm 6.3, is designed to optimize inference for large language models (LLMs) and vision-language models (VLMs) running on AMD Instinct GPUs. AMD claims the integration offers up to six times higher throughput and a more user-friendly experience with Python support and pre-configured ROCm Docker containers. ROCm 6.3 also comes with a new Fortran compiler that introduces direct GPU offloading and backward compatibility, along with integration with HIP kernels and ROCm libraries.
The update also brings transformer optimizations through FlashAttention-2, which improves both forward and backward pass performance over the previous FlashAttention-1. ROCm 6.3 further adds multi-node FFT support via rocFFT for better multi-node scaling, along with enhanced computer vision libraries offering AV1 codec support, GPU-accelerated JPEG decoding, and audio augmentation.
AMD emphasized its focus on the open-source community and on delivering "cutting-edge tools" that simplify development while boosting performance and scalability for AI and HPC applications.