<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Tokenization on AI VOID</title><link>https://ai-blog.noorshomelab.dev/tags/tokenization/</link><description>Recent content in Tokenization on AI VOID</description><generator>Hugo</generator><language>en</language><lastBuildDate>Tue, 17 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://ai-blog.noorshomelab.dev/tags/tokenization/index.xml" rel="self" type="application/rss+xml"/><item><title>Chapter 2: Designing the Lexer: Tokenization of Mermaid Syntax</title><link>https://ai-blog.noorshomelab.dev/mermaid-lint-guide/chapter-2-designing-the-lexer/</link><pubDate>Tue, 17 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/mermaid-lint-guide/chapter-2-designing-the-lexer/</guid><description>&lt;h2 id="chapter-2-designing-the-lexer-tokenization-of-mermaid-syntax"&gt;Chapter 2: Designing the Lexer: Tokenization of Mermaid Syntax&lt;/h2&gt;
&lt;p&gt;Welcome to Chapter 2 of our journey to build a robust Mermaid code analyzer and fixer in Rust! In the previous chapter, we laid the foundational project structure and set up our development environment. With the groundwork complete, we&amp;rsquo;re now ready to dive into the core components of our compiler-like tool. This chapter focuses on the very first stage of any compiler pipeline: the &lt;strong&gt;Lexer&lt;/strong&gt;.&lt;/p&gt;</description></item><item><title>Chapter 5: Data Preparation and Loading for Tunix</title><link>https://ai-blog.noorshomelab.dev/tunix-mastery-2026/05-data-preparation/</link><pubDate>Fri, 30 Jan 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/tunix-mastery-2026/05-data-preparation/</guid><description>&lt;h2 id="chapter-5-data-preparation-and-loading-for-tunix"&gt;Chapter 5: Data Preparation and Loading for Tunix&lt;/h2&gt;
&lt;p&gt;Welcome back, future LLM master! In the previous chapters, we laid the groundwork by understanding Tunix&amp;rsquo;s architecture and setting up our development environment. Now, it&amp;rsquo;s time to talk about the fuel that powers any Large Language Model: data!&lt;/p&gt;
&lt;p&gt;This chapter is all about getting your data ready for Tunix. We&amp;rsquo;ll walk through the crucial steps: preparing your text-based datasets, tokenizing them, and setting up efficient data-loading pipelines that play nicely with JAX and Tunix. Think of this as preparing a delicious meal – you need to carefully select, clean, and chop your ingredients before you can even think about cooking!&lt;/p&gt;</description></item></channel></rss>