ByteDance unveils UltraMem, cutting AI inference costs by up to 83%


Chinese tech heavyweight ByteDance announced on Thursday the launch of its new model architecture, UltraMem, which reduces the inference costs of artificial intelligence-powered models by up to 83 percent.
According to the company's Doubao LLM team, UltraMem increases inference speed by 2 to 6 times compared with traditional mixture-of-experts (MoE) architectures. The advance offers a new pathway for improving the inference efficiency and performance of large language models.
The move follows the surprise release of DeepSeek's high-performance and cost-efficient open-source AI model R1.
Separately, Baidu Inc said on Thursday that its AI chatbot, Ernie Bot, will be available for free starting April 1, with upgraded technology and reduced costs. The AI service will be accessible at no cost to all users on both desktop and mobile platforms, the company said.
Baidu also launched an advanced search function, which will be free of charge and available from April 1.
The new function features improved reasoning capability and tool integration to deliver expert-level responses, and is capable of handling multiple tasks and supporting multimodal inputs and outputs, Baidu said.