<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI Performance on AI VOID</title><link>https://ai-blog.noorshomelab.dev/tags/ai-performance/</link><description>Recent content in AI Performance on AI VOID</description><generator>Hugo</generator><language>en</language><lastBuildDate>Mon, 30 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://ai-blog.noorshomelab.dev/tags/ai-performance/index.xml" rel="self" type="application/rss+xml"/><item><title>TurboQuant vs. GGUF &amp; INT8/INT4 Quantization: Complete Comparison 2026</title><link>https://ai-blog.noorshomelab.dev/comparisons/turboquant-gguf-int8-int4-quantization-comparison-2026/</link><pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/comparisons/turboquant-gguf-int8-int4-quantization-comparison-2026/</guid><description>&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;The rapid growth of Large Language Models (LLMs) has brought new capabilities along with heavy computational demands, chiefly in memory footprint and inference speed. Quantization, which stores model weights (and sometimes activations) at lower numeric precision, has become a key technique for addressing both, letting LLMs run efficiently on a wider range of hardware, from data center GPUs to consumer-grade CPUs.&lt;/p&gt;
&lt;p&gt;This guide provides a side-by-side comparison of the latest advancements in LLM quantization as of March 30, 2026:&lt;/p&gt;</description></item></channel></rss>