<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Text Processing on AI VOID</title><link>https://ai-blog.noorshomelab.dev/categories/text-processing/</link><description>Recent content in Text Processing on AI VOID</description><generator>Hugo</generator><language>en</language><lastBuildDate>Tue, 30 Dec 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://ai-blog.noorshomelab.dev/categories/text-processing/index.xml" rel="self" type="application/rss+xml"/><item><title>Developing an LLM-Powered Content Summarizer (Hands-on Project)</title><link>https://ai-blog.noorshomelab.dev/any-llm-guide-2025/content-summarizer/</link><pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/any-llm-guide-2025/content-summarizer/</guid><description>&lt;h2 id="introduction-your-first-practical-llm-application"&gt;Introduction: Your First Practical LLM Application!&lt;/h2&gt;
&lt;p&gt;Welcome to an exciting chapter where we&amp;rsquo;ll put all your &lt;code&gt;any-llm&lt;/code&gt; knowledge into action! So far, we&amp;rsquo;ve explored the foundations of &lt;code&gt;any-llm&lt;/code&gt;, learned how to connect to various providers, handle different output types, and manage asynchronous operations. Now, it&amp;rsquo;s time to build something tangible and incredibly useful: an LLM-powered content summarizer.&lt;/p&gt;
&lt;p&gt;In this chapter, you&amp;rsquo;ll learn how to design, implement, and refine a Python application that can distill lengthy articles or documents into concise summaries using the &lt;code&gt;any-llm&lt;/code&gt; library. This project will solidify your understanding of prompt engineering, API interaction, error handling, and basic application structure. Get ready to transform raw text into digestible insights with the power of large language models!&lt;/p&gt;</description></item><item><title>AWK Demystified - Text Processing Essentials</title><link>https://ai-blog.noorshomelab.dev/cut-the-chase/awk-demystified/</link><pubDate>Mon, 29 Dec 2025 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/cut-the-chase/awk-demystified/</guid><description>&lt;h1 id="awk-demystified---text-processing-essentials"&gt;AWK Demystified - Text Processing Essentials&lt;/h1&gt;
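&lt;p&gt;The examples here target GNU Awk (gawk); the 5.3 series is the stable line as of late 2025. Other implementations such as mawk and BusyBox awk share the core syntax covered below.&lt;/p&gt;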
&lt;h2 id="core-syntax"&gt;Core Syntax&lt;/h2&gt;
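&lt;p&gt;AWK reads input one record at a time (by default, one line per record) and runs each &lt;code&gt;action&lt;/code&gt; block whose &lt;code&gt;pattern&lt;/code&gt; matches the current record. A rule with no pattern matches every record, and a rule with no action prints the record. &lt;code&gt;BEGIN&lt;/code&gt; blocks run once before any input is read, and &lt;code&gt;END&lt;/code&gt; blocks run once after all input has been processed.&lt;/p&gt;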
&lt;div class="highlight"&gt;
&lt;pre class="language-awk line-numbers" data-start="1" tabindex="0"&gt;&lt;code class="language-awk" data-lang="awk"&gt;# Basic structure: &amp;#39;pattern { action }&amp;#39;
# Prints every line (no pattern, so the action runs for every record)
awk &amp;#39;{ print }&amp;#39; data.txt
# Prints lines containing &amp;#34;error&amp;#34;
awk &amp;#39;/error/ { print }&amp;#39; log.txt
# BEGIN block: executed once before any input is read
# END block: executed once after all input is processed
# count &amp;#43; 0 forces a numeric result, so 0 is printed when no line matches
awk &amp;#39;BEGIN { print &amp;#34;--- Log Analysis Start ---&amp;#34; } /FAIL/ { count&amp;#43;&amp;#43; } END { print &amp;#34;Total failures:&amp;#34;, count &amp;#43; 0 }&amp;#39; system.log&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;&lt;h2 id="field-handling--built-in-variables"&gt;Field Handling &amp;amp; Built-in Variables&lt;/h2&gt;
&lt;p&gt;AWK automatically splits each input record into fields. &lt;code&gt;$0&lt;/code&gt; is the entire record, &lt;code&gt;$1&lt;/code&gt; is the first field, &lt;code&gt;$2&lt;/code&gt; the second, and so on. &lt;code&gt;NF&lt;/code&gt; is the number of fields in the current record, &lt;code&gt;NR&lt;/code&gt; is the number of records read so far (the current line number by default), &lt;code&gt;FS&lt;/code&gt; is the input field separator (default: a single space, which splits on runs of spaces and tabs and ignores leading and trailing whitespace), and &lt;code&gt;OFS&lt;/code&gt; is the output field separator (default: a single space).&lt;/p&gt;</description></item></channel></rss>