Nvidia and Microsoft Boost AI Performance on RTX PCs with TensorRT and Windows ML
Nvidia and Microsoft have partnered to accelerate AI processing on Nvidia RTX-based PCs, introducing TensorRT for RTX and Windows ML for optimized AI inference. This collaboration simplifies AI deployment with just-in-time engine building, smaller libraries, and broad hardware support. Developers benefit from enhanced performance, streamlined workflows, and access to Nvidia’s SDKs and AI models, empowering innovative generative AI applications across Windows 11 devices.
Nvidia and Microsoft have announced a strategic collaboration to enhance the performance of artificial intelligence (AI) processing on Nvidia RTX-based AI PCs. This partnership focuses on leveraging generative AI to transform PC software experiences, including digital humans, writing assistants, intelligent agents, and creative tools, all powered by Nvidia RTX technology on Windows 11.
TensorRT for RTX AI PCs
At the core of this advancement is TensorRT for RTX AI PCs, a reimagined inference engine that combines Nvidia’s industry-leading TensorRT performance with just-in-time, on-device engine building. This approach shrinks the package to one-eighth of its former size and enables rapid AI deployment across more than 100 million RTX AI PCs. TensorRT for RTX is natively supported by Windows ML, a new inference stack that offers app developers broad hardware compatibility and state-of-the-art performance.
Gerardo Delgado, Nvidia’s director of product for AI PCs, explained that AI models are essentially sets of mathematical operations executed by the Tensor Cores on a GPU. TensorRT optimizes these models by quantizing them to reduce precision where accuracy allows and by preparing an execution plan with pre-selected kernels to maximize performance. Compared with the traditional way of running AI on Windows, this method achieves approximately 1.6 times better performance on average.
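To make the quantization step concrete, here is a minimal NumPy sketch of symmetric INT8 quantization, the general technique behind reducing precision where a model tolerates it. The function names and the round-trip check are illustrative only, not Nvidia’s implementation.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric INT8 quantization: map float weights into [-127, 127]."""
    scale = np.abs(weights).max() / 127.0  # one scale factor per tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights, e.g. to check accuracy loss."""
    return q.astype(np.float32) * scale

weights = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(weights)
error = np.abs(weights - dequantize_int8(q, scale)).max()
print(f"max round-trip error: {error:.6f}")  # small relative to weight range
```

INT8 weights occupy a quarter of the memory of FP32, which is one reason quantized engines run faster on Tensor Cores when the accuracy trade-off is acceptable.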
The new TensorRT for RTX streamlines developer workflows: a generic engine ships with the application installation, and the optimal engine for the user’s specific GPU is then built on-device in seconds. This innovation simplifies deployment, reduces library sizes, and enhances performance for applications such as video generation and livestreaming.
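For readers who want a sense of what engine building involves, the sketch below uses the standard TensorRT Python API to compile an ONNX model into a GPU-specific engine. TensorRT for RTX performs a comparable build just in time on the user’s machine; the model path and the FP16 flag here are assumptions for illustration, not the TensorRT for RTX API itself.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# Explicit-batch network definition, as required by the ONNX parser.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# "model.onnx" is a placeholder for the model an application ships with.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # allow reduced precision where safe

# Kernel selection and optimization happen here, tuned to the installed GPU.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```

The serialized engine is specific to the GPU it was built on, which is exactly why building it on-device, rather than shipping one engine per GPU model, keeps packages small.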
Windows ML and AI Inference Optimization
Windows ML, built on ONNX Runtime, addresses the challenge developers face in balancing broad hardware support with high performance. It automatically selects the appropriate hardware and downloads the necessary execution providers, removing packaging burdens. For GeForce RTX GPUs, Windows ML leverages TensorRT for RTX, delivering over 50% faster AI workload performance compared to DirectML.
This seamless integration ensures developers can deploy AI features efficiently while benefiting from the latest performance optimizations without repackaging their applications.
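As a rough analogue of what Windows ML automates, the ONNX Runtime sketch below requests execution providers in priority order and falls back down the list if one is unavailable. This is standard ONNX Runtime usage, not the Windows ML API itself; the model file and input shape are placeholders.

```python
import numpy as np
import onnxruntime as ort

# Preferred providers first; ONNX Runtime falls back down the list.
# Windows ML performs this selection (and provider download) automatically.
providers = [
    "TensorrtExecutionProvider",  # TensorRT-backed path on RTX GPUs
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

# "model.onnx" is a placeholder model with a single float image input.
session = ort.InferenceSession("model.onnx", providers=providers)
print("active providers:", session.get_providers())

input_name = session.get_inputs()[0].name
dummy = np.random.randn(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
print("output shape:", outputs[0].shape)
```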
Expanding the AI Ecosystem with Nvidia SDKs and NIM
Nvidia’s comprehensive SDK suite, including CUDA, TensorRT, DLSS, Optix, RTX Video, Maxine, Riva, Nemotron, and ACE, empowers developers to integrate AI features and accelerate applications on GeForce RTX GPUs. This month, leading software vendors, including Autodesk, Bilibili, Chaos, LM Studio, and Topaz, are releasing updates to unlock RTX AI features in their applications.
Nvidia NIM (Nvidia Inference Microservices) simplifies AI development by providing pre-packaged, optimized AI models that run efficiently on RTX GPUs. These containerized microservices enable seamless deployment across PC and cloud environments. Recent releases include the FLUX.1-schnell image generation model and updates to FLUX.1-dev, improving compatibility and performance across a broad range of RTX GPUs.
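For a sense of how NIM microservices are consumed, the sketch below calls a locally running NIM container through the OpenAI-compatible HTTP endpoint that NIM uses for language models. The port, model identifier, and prompt are assumptions for illustration; image-generation NIMs such as FLUX.1 expose their own endpoints.

```python
import requests

# A locally running NIM container typically serves an OpenAI-compatible API.
# The port and model id below are assumptions chosen for this example.
url = "http://localhost:8000/v1/chat/completions"
payload = {
    "model": "meta/llama-3.1-8b-instruct",  # example NIM model id
    "messages": [{"role": "user", "content": "Summarize TensorRT for RTX."}],
    "max_tokens": 128,
}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the same container runs on a PC or in the cloud, code written against the endpoint does not change when the deployment target does.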
Developers can also leverage Nvidia AI Blueprints—sample workflows and projects using NIM—to accelerate innovation in areas such as 3D guided generative AI, creative workflows, and productivity applications.
Project G-Assist and Community Plug-ins
Project G-Assist is Nvidia’s experimental AI assistant integrated into the Nvidia app, enabling users to control their RTX PC with voice and text commands. It offers a no-code/low-code Plug-in Builder powered by ChatGPT, allowing developers and enthusiasts to create custom assistant workflows easily.
New community plug-ins extend G-Assist’s capabilities with integrations for Google Gemini web search, Spotify, Twitch, IFTTT, SignalRGB, and Discord. These plug-ins enable automation, enhanced streaming, unified lighting control, and hands-free music or game highlight sharing, enriching the user experience on RTX AI PCs.
Enthusiasts and developers are encouraged to join Nvidia’s Developer Discord channel to collaborate, share creations, and receive support for building and publishing G-Assist plug-ins.
This collaboration between Nvidia and Microsoft marks a significant step in making high-performance AI more accessible and efficient on Windows PCs. By combining advanced hardware, optimized software stacks, and developer-friendly tools, the ecosystem is poised to accelerate innovation in generative AI applications across industries.