DeepSeek-V4, Chinese AI model adapted for Huawei chips
Chinese AI darling DeepSeek is back with a new open-weights large language model that promises performance to rival the best proprietary American LLMs. Perhaps more importantly, it claims to dramatically reduce inference costs and extends support for Huawei's Ascend family of AI accelerators.
Google LLC introduced two new custom silicon chips for artificial intelligence today at Google Cloud Next 2026, unveiling distinct Tensor Processing Unit architectures built for training and inference: the eighth-generation TPU 8t and TPU 8i.
Ahead of COMPUTEX 2026, Skymizer Taiwan Inc., a pioneer in AI inference solutions, today previewed a major advancement in on-premise AI deployment with its HTX301 inference chip, which integrates HyperThought™ — a software/hardware co-design platform first introduced at COMPUTEX 2025.
“QVAC SDK and Fabric give people and companies the ability to execute inference and fine-tune powerful models on their own terms, on their own hardware, with full control of their data,” said Paolo Ardoino.
WEST PALM BEACH, Fla.--(BUSINESS WIRE)--Vultr, the world’s largest privately-held cloud computing platform, today announced the launch of Vultr Cloud Inference. This new serverless platform revolutionizes AI scalability and reach by offering global AI ...
As demand for open-source AI infrastructure grows, Novita AI is establishing itself as the inference provider for developers and engineering teams that need fast and affordable inference for production AI.
AI reasoning does not necessarily require spending huge amounts on frontier models. Instead, smaller models can yield stronger performance on complex tasks while keeping per-query inference costs manageable.
I write about the economics of AI. When OpenAI’s ChatGPT first exploded onto the scene in late 2022, it sparked a global obsession with model training. Companies raced to build ever ...