<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Quantization-Aware Training on AI VOID</title><link>https://ai-blog.noorshomelab.dev/tags/quantization-aware-training/</link><description>Recent content in Quantization-Aware Training on AI VOID</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sun, 07 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://ai-blog.noorshomelab.dev/tags/quantization-aware-training/index.xml" rel="self" type="application/rss+xml"/><item><title>The Quest for Efficiency: Understanding Model Compression and Quantization</title><link>https://ai-blog.noorshomelab.dev/gemma-4-qat-guide-2026/model-compression-quantization-fundamentals/</link><pubDate>Sun, 07 Jun 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/gemma-4-qat-guide-2026/model-compression-quantization-fundamentals/</guid><description>&lt;h2 id="the-quest-for-efficiency-understanding-model-compression-and-quantization"&gt;The Quest for Efficiency: Understanding Model Compression and Quantization&lt;/h2&gt;
&lt;p&gt;Welcome to the exciting field of optimizing AI models for the real world! You&amp;rsquo;ve likely marvelled at the power of large language models (LLMs), but have you ever wondered how to make them run smoothly on everyday devices like your smartphone or laptop? That&amp;rsquo;s the challenge we&amp;rsquo;re tackling in this guide.&lt;/p&gt;
&lt;p&gt;In this first chapter, we&amp;rsquo;ll cover the foundational concepts behind making these powerful AI models nimble and efficient. We&amp;rsquo;ll explore why model size is a critical factor, dive into the techniques used to shrink models without losing their smarts, and focus specifically on Quantization-Aware Training (QAT), a training-time approach that helps models like Google&amp;rsquo;s Gemma 4 run well on constrained hardware. By the end of this chapter, you&amp;rsquo;ll have a solid grasp of the &amp;ldquo;why&amp;rdquo; and &amp;ldquo;what&amp;rdquo; behind model compression, setting the stage for practical implementation.&lt;/p&gt;</description></item></channel></rss>