<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Data Preprocessing on AI VOID</title><link>https://ai-blog.noorshomelab.dev/tags/data-preprocessing/</link><description>Recent content in Data Preprocessing on AI VOID</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 20 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://ai-blog.noorshomelab.dev/tags/data-preprocessing/index.xml" rel="self" type="application/rss+xml"/><item><title>Representing Reality: From Raw Data to Embeddings</title><link>https://ai-blog.noorshomelab.dev/multimodal-ai-guide-2026/representing-reality-raw-data-to-embeddings/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/multimodal-ai-guide-2026/representing-reality-raw-data-to-embeddings/</guid><description>&lt;p&gt;Welcome back, future multimodal AI maestros! In our previous chapter, we explored the exciting world of multimodal AI and its incredible potential. Now, it&amp;rsquo;s time to dive deeper and understand the fundamental step that makes all this magic possible: transforming the messy, diverse &amp;ldquo;real world&amp;rdquo; data into a language our AI models can understand.&lt;/p&gt;
&lt;p&gt;This chapter is all about &lt;strong&gt;representing reality&lt;/strong&gt;. We&amp;rsquo;ll learn how raw inputs like text, images, audio, and video, which seem so different to us, are converted into a common, numerical format called &lt;strong&gt;embeddings&lt;/strong&gt;. Think of it as teaching your AI system to &amp;ldquo;see,&amp;rdquo; &amp;ldquo;hear,&amp;rdquo; and &amp;ldquo;read&amp;rdquo; by giving it a universal dictionary of meaning. Mastering this concept is crucial, as it forms the bedrock for any multimodal system you&amp;rsquo;ll ever build.&lt;/p&gt;</description></item><item><title>TensorFlow Guide: Working with Data - `tf.data` API</title><link>https://ai-blog.noorshomelab.dev/tensorflow-guide/working-with-data-tf-data-api/</link><pubDate>Sun, 26 Oct 2025 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/tensorflow-guide/working-with-data-tf-data-api/</guid><description>&lt;h2 id="4-working-with-data-tfdata-api"&gt;4. Working with Data: &lt;code&gt;tf.data&lt;/code&gt; API&lt;/h2&gt;
&lt;p&gt;Loading, preprocessing, and feeding data to your models efficiently is crucial for training performance, especially with large datasets. TensorFlow&amp;rsquo;s &lt;code&gt;tf.data&lt;/code&gt; API is designed for building high-performance input pipelines that are robust, flexible, and scalable.&lt;/p&gt;
&lt;h3 id="41-why-tfdata"&gt;4.1 Why &lt;code&gt;tf.data&lt;/code&gt;?&lt;/h3&gt;
&lt;p&gt;Traditional data loading often involves reading all data into memory or iterating over files one by one. This can be slow and memory-intensive. The &lt;code&gt;tf.data&lt;/code&gt; API solves this by:&lt;/p&gt;</description></item><item><title>Chapter 6: Getting Data Ready: Basic Data Manipulation in Python</title><link>https://ai-blog.noorshomelab.dev/ai-ml-journey-2026/basic-data-manipulation-python/</link><pubDate>Sun, 18 Jan 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/ai-ml-journey-2026/basic-data-manipulation-python/</guid><description>&lt;h2 id="introduction-shaping-the-raw-material"&gt;Introduction: Shaping the Raw Material&lt;/h2&gt;
&lt;p&gt;Welcome back, future AI explorer! In our previous chapters, we&amp;rsquo;ve journeyed through the fascinating world of AI and Machine Learning, understanding the core concepts of how machines &amp;ldquo;learn&amp;rdquo; and why data is their lifeblood. We also took our first exciting steps into Python programming, learning about variables, data types, and basic operations. You&amp;rsquo;re doing great!&lt;/p&gt;
&lt;p&gt;Now, it&amp;rsquo;s time to get our hands a little dirty (in a good way!) with that precious data. Imagine you&amp;rsquo;re a chef, and you&amp;rsquo;ve just received a basket full of fresh ingredients. Before you can cook a delicious meal, you need to wash, peel, chop, and prepare everything, right? Data is no different. Raw data, straight from its source, is rarely in the perfect shape for a machine learning model. It might have missing pieces or incorrect values, or it might be organized in a way that&amp;rsquo;s hard for our algorithms to understand.&lt;/p&gt;</description></item></channel></rss>