AI Inferencing: Recommendation System


1.7x Boost in AI Inferencing Performance with CXL

Memory Bottleneck for CPU-based Recommendation System
Without CXL

Challenge:

  • DLRM model sizes continue to grow beyond the capacity of a single system
  • Inefficient power utilization when sparse embedding data is sharded across distributed systems
  • Limited memory bandwidth for embedding table lookup operations
  • Lack of software support for memory tiering
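The bandwidth pressure described above comes from DLRM embedding lookups, which are random-access gathers over very large tables. A minimal sketch (table size, embedding dimension, and batch size here are hypothetical, chosen only to illustrate the access pattern):

```python
import numpy as np

# Assumed sizes for illustration: a DLRM embedding table is a large dense
# matrix; inference gathers a sparse batch of rows from it.
num_rows, dim = 1_000_000, 64
table = np.random.default_rng(0).standard_normal((num_rows, dim)).astype(np.float32)

# A batch of sparse categorical IDs (assumed batch size of 2048 lookups).
ids = np.random.default_rng(1).integers(0, num_rows, size=2048)

# Each lookup reads a different, essentially random row of the table, so
# throughput is bound by memory capacity and bandwidth, not by compute.
vectors = table[ids]
print(vectors.shape)
```

Because almost no arithmetic is done per byte fetched, adding memory bandwidth (rather than compute) is what speeds this stage up.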

Memory Expansion for CPU-based Recommendation System
With CXL

Solution:

  • Increases memory capacity by 133% and memory bandwidth by 66%
  • Boosts AI inferencing performance by 73%
  • Leo CXL Smart Memory Controllers provide additional memory bandwidth for each AI inferencing system
  • Lowers power consumption with consolidated server fleet
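The capacity and bandwidth uplifts above depend on the actual DIMM and CXL expander population. A back-of-the-envelope sketch of one hypothetical configuration that yields the quoted 133% and 66% figures (all device counts and per-link numbers here are assumptions, not a documented reference design):

```python
# Hypothetical native DDR baseline: 12 x 64 GB RDIMMs on 8 DDR5 channels.
ddr_capacity_gb = 768
ddr_bandwidth_gbs = 307

# Hypothetical CXL expansion: 4 Leo expanders x 256 GB each, with roughly
# 51 GB/s of usable bandwidth per CXL 2.0 x16 link.
cxl_capacity_gb = 4 * 256
cxl_bandwidth_gbs = 4 * 51

cap_gain = cxl_capacity_gb / ddr_capacity_gb * 100
bw_gain = cxl_bandwidth_gbs / ddr_bandwidth_gbs * 100
print(f"capacity +{cap_gain:.0f}%, bandwidth +{bw_gain:.0f}%")
```

With these assumed values the script prints `capacity +133%, bandwidth +66%`; other populations would scale the gains accordingly.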
