Cerebras Launches Llama 3.1 Cloud Service with Blistering Speeds Exceeding 1,800 Tokens per Second, with RAM Packed into the Chip
Cerebras, a leading AI chip maker, has launched Cerebras Inference, a service that runs the Llama 3.1 model at high speed. The Llama 3.1 70B model can achieve a remarkable...