
Startup Positron Takes On Nvidia with Efficient AI Chips

Positron, a private chip startup, just raised $51.6M to challenge Nvidia’s dominance in AI inference. Its Atlas chips promise 3.5× better performance per dollar and up to 66% lower power consumption, all without liquid cooling. Early deployments at Cloudflare and Parasail show drop-in compatibility and memory-optimized designs for modern transformer models. Next up is Titan, which targets multi-trillion-parameter workloads with air-cooled efficiency.

Published July 30, 2025 at 04:09 AM EDT in Artificial Intelligence (AI)

In a crowded AI hardware market, Positron steps forward with Atlas, an inference accelerator claiming 2× to 5× better performance per watt and per dollar than Nvidia GPUs. The company just closed a $51.6M Series A and is eyeing enterprise deployments across data centers worldwide.

Positron Challenges Nvidia

Founded just over a year ago, Positron is betting on inference-chip specialization to crack Nvidia’s near-monopoly. By focusing narrowly on memory-optimized AI serving, the company can deliver up to 5× better performance per watt and per dollar, co-founder Thomas Sohmers claims. Early backers Valor Equity, Atreides, and DFJ Growth backed its $51.6M Series A.

Atlas: An Inference-First AI Chip

Atlas is built for transformer models, delivering 3.5× better performance per dollar and up to 66% lower power consumption than Nvidia H100 GPUs. With 93% memory bandwidth utilization, it runs models up to 0.5 trillion parameters on standard 2 kW servers. Enterprises like Cloudflare and Parasail have deployed Atlas without rewriting code.
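To see why the bandwidth figure matters, here is a rough back-of-envelope sketch of memory-bound decode throughput. The 93% utilization number comes from Positron’s claim; the model size, quantization, and bandwidth values are illustrative assumptions, not Atlas specifications.

```python
# Back-of-envelope: batch-1 decode throughput when inference is memory-bound.
# Each generated token streams the full set of model weights from memory once,
# so throughput is roughly effective bandwidth divided by model size in bytes.
# The 93% utilization figure is Positron's claim; everything else is illustrative.

def decode_tokens_per_sec(params_billion: float,
                          bytes_per_param: float,
                          bandwidth_tb_s: float,
                          utilization: float) -> float:
    model_bytes = params_billion * 1e9 * bytes_per_param
    effective_bandwidth = bandwidth_tb_s * 1e12 * utilization
    return effective_bandwidth / model_bytes

# Hypothetical 70B-parameter model in FP8 on 2 TB/s of memory bandwidth.
print(f"{decode_tokens_per_sec(70, 1.0, 2.0, 0.93):.1f} tok/s at 93% utilization")
print(f"{decode_tokens_per_sec(70, 1.0, 2.0, 0.30):.1f} tok/s at 30% utilization")
```

Under these assumptions, moving from 30% to roughly 93% bandwidth utilization about triples per-chip throughput, which is the lever Positron says it is pulling.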

Next-Gen Titan Platform

Slated for 2026, Titan will leverage custom Asimov silicon to pack up to 2 TB of high-speed memory per accelerator and handle models of up to 16 trillion parameters. Crucially, Titan retains air-cooling compatibility, avoiding the liquid-cooled setups that next-gen GPUs demand and simplifying data center rollouts.
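For scale, a quick capacity check shows why multi-trillion-parameter models need this much memory. The 2 TB per accelerator figure is Positron’s Titan claim; the precisions and accelerator counts below are illustrative assumptions.

```python
# Rough capacity check: parameters that fit in a 2 TB memory pool at different
# precisions, ignoring KV-cache and activation overhead for simplicity.
# The 2 TB figure is Positron's Titan claim; the precisions are illustrative.

def params_that_fit(memory_tb: float, bytes_per_param: float) -> float:
    """Returns capacity in trillions of parameters."""
    return memory_tb * 1e12 / bytes_per_param / 1e12

for label, bytes_per_param in [("BF16", 2.0), ("FP8", 1.0), ("4-bit", 0.5)]:
    capacity = params_that_fit(2.0, bytes_per_param)
    print(f"{label}: ~{capacity:.0f}T params per accelerator; "
          f"a 16T-param model spans ~{16 / capacity:.0f} accelerators")
```

Under these assumptions, even aggressive quantization leaves a 16-trillion-parameter model spread across several Titan accelerators, which is why per-accelerator memory capacity, not just compute, is the headline number.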

Memory-First Efficiency

Modern transformer inference shifts the compute-to-memory-bandwidth ratio to near 1:1, making memory, not raw compute, the bottleneck. Positron’s memory-first architecture targets that shift, driving bandwidth utilization well above the 10–30% range typical of GPUs. The result is faster, more cost-effective model serving that cuts power bills and rack-space demands.
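A roofline-style sketch makes the memory-bound argument concrete. The hardware numbers below are illustrative assumptions, not Positron or Nvidia specifications.

```python
# Roofline-style sketch: batch-1 decode does roughly 2 FLOPs per parameter per
# token while reading each parameter once, so its arithmetic intensity is about
# 2 / bytes_per_param FLOPs per byte. Hardware numbers are illustrative only.

def decode_arithmetic_intensity(bytes_per_param: float) -> float:
    return 2.0 / bytes_per_param

# A hypothetical accelerator with 1000 TFLOPS of compute and 3 TB/s of bandwidth
# stays compute-bound only above ~333 FLOPs per byte.
balance_point = 1000e12 / 3e12                    # ~333 FLOPs/byte
intensity = decode_arithmetic_intensity(2.0)      # ~1 FLOP/byte at BF16

print(f"Compute/bandwidth balance point: ~{balance_point:.0f} FLOPs/byte")
print(f"Decode arithmetic intensity:     ~{intensity:.0f} FLOP/byte -> memory-bound")
```

At roughly 1 FLOP per byte, decode sits far below that balance point, so how fully a chip uses its memory bandwidth, rather than its peak TFLOPS, determines real inference throughput.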

Domestic Production & Supply Chain Resilience

Positron’s Atlas chips are fabricated in the U.S. at Intel facilities, with assembly also kept domestic. For 2026’s Titan, fabrication will move to TSMC, but integration remains stateside. This gives enterprises a geopolitically resilient option amid global supply chain tensions.

Infra Without Liquid Cooling

By designing for air-cooled deployments, Positron ensures drop-in compatibility with hundreds of existing data centers. Enterprises avoid costly retrofits and liquid-cooling infrastructure, letting them scale AI inference with minimal disruption.

Positron’s Atlas and upcoming Titan platform could reshape enterprise AI deployments by tackling cost, power, and complexity. Organizations ready to modernize their inference stack should evaluate memory-first, energy-efficient hardware now.

