Multiverse Computing and Cerebrium Bring Compressed AI to the Cloud, Creating a Blueprint for Economically Sustainable AI at Scale
New partnership combines Multiverse’s quantum-inspired AI compression technology with Cerebrium’s serverless scaling infrastructure to deliver 12x faster inference and 90% smaller AI models, reducing infrastructure overhead
SAN SEBASTIÁN, Spain, Dec. 02, 2025 (GLOBE NEWSWIRE) -- As large language models continue to get larger and the scarcity of compute resources drives up development costs, AI is becoming increasingly cost-prohibitive for many enterprises. Today, Multiverse Computing, the leader in quantum-inspired AI model compression, and Cerebrium, an elastic, serverless AI infrastructure platform, announced a strategic partnership designed to alleviate this burden and create a new foundation for economically sustainable AI. Together, the technologies form a unified system that optimizes GPU utilization, minimizes latency, and lowers cost per inference across the full AI deployment lifecycle, from prototyping to production.
“Compute costs remain one of the biggest barriers to AI progress, setting artificial limits on its societal impact,” said Enrique Lizaso, cofounder and CEO of Multiverse Computing. “We’re fusing efficiency and scale so innovation can finally take off without running into a cost wall. Together, our partnership proves that performance and affordability can coexist, unlocking faster, more accessible, and more sustainable AI for everyone.”
At the heart of the partnership is a seamless pipeline between Multiverse’s CompactifAI model compression engine and Cerebrium’s dynamic container scaling, which can expand to thousands of GPUs near-instantly. The joint solution enables enterprises to deploy high-performance models that run up to 12x faster, consume up to 80% fewer compute resources, and scale globally in seconds, without sacrificing accuracy or availability.
“Organizations around the world want to take advantage of AI, but very few can actually afford to do it at scale,” said Michael Louis, founder and CEO of Cerebrium. “With Multiverse’s compression engine shrinking the computational footprint and our infrastructure expanding elastically to meet demand, we’re chipping away at the last technical and economic barriers between innovation and real-world deployment.”
The partnership also reflects a broader shift in how the AI industry defines progress, moving from sheer scale and parameter counts to a new standard that values efficiency, speed, and economic viability as measures of performance.
Customers can now leverage Cerebrium’s elastic orchestration engine and CompactifAI models through private deployments. To learn more, reach out to Multiverse Computing at sales@multiversecomputing.com.
About Multiverse Computing
Multiverse Computing is the leader in quantum-inspired AI model compression. The company’s deep expertise in quantum software and AI led to the development of CompactifAI, a revolutionary AI model compression engine. CompactifAI compresses LLMs by up to 95% with only 2-3% precision loss. CompactifAI models reduce computing requirements and unleash new use cases for AI across industries.
Multiverse Computing is headquartered in Donostia, Spain, with offices across Europe, the U.S., and Canada. With over 160 patents and 100 customers globally, including Iberdrola, Bosch, and the Bank of Canada, Multiverse Computing has raised c.$250M to date from investors including Bullhound Capital, HP Tech Ventures, SETT, Forgepoint Capital International, CDP Venture Capital, Toshiba, and Santander Climate VC. For more information, visit multiversecomputing.com.
About Cerebrium
Cerebrium is a serverless infrastructure platform that makes it easy for engineering teams to build and scale AI applications. The platform delivers low startup times, multi-region deployments for low latency and data residency, support for over a dozen GPU types, and can scale to thousands of containers in seconds. Backed by Gradient (Google’s AI fund) and Y Combinator, Cerebrium powers real-time AI workloads for leading teams such as Tavus, Deepgram, and Vapi.
Media Contact
LaunchSquad for Multiverse Computing
multiverse@launchsquad.com
