AMD has introduced the open-source AI Tensor Engine for ROCm (AITER), a library that bundles commonly used AI operators optimized for ROCm, with a particular focus on PyTorch, the framework favored by the LLM community.
Operators commonly used in LLMs, such as Mixture of Experts (MoE), matrix multiplication, and Multi-Head Attention (MHA), can now run several times faster. Benchmark results for DeepSeek-V3/R1 on the MI300X chip show significant speedups compared with running without AITER.
LLM serving software such as vLLM and SGLang currently supports AITER, and AMD has confirmed that it plans to invest in accelerating AI workloads on AMD chips in the near future.
TLDR: AMD's AITER improves AI performance on ROCm by speeding up popular AI operators. It is already supported by LLM software such as vLLM and SGLang, and AMD plans to invest in further acceleration.