<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Speech Recognition on AI VOID</title><link>https://ai-blog.noorshomelab.dev/tags/speech-recognition/</link><description>Recent content in Speech Recognition on AI VOID</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 20 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://ai-blog.noorshomelab.dev/tags/speech-recognition/index.xml" rel="self" type="application/rss+xml"/><item><title>Understanding Multimodal AI Systems</title><link>https://ai-blog.noorshomelab.dev/multimodal-ai-guide-2026/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/multimodal-ai-guide-2026/</guid><description>&lt;p&gt;Welcome to this comprehensive guide on multimodal AI systems. Here, you will explore how these advanced systems integrate and process text, image, audio, and video inputs, covering their core architectures and data pipelines. Discover real-world applications, from intelligent voice assistants to sophisticated vision-based AI, and understand their practical impact.&lt;/p&gt;</description></item><item><title>Audio Processing: Speech Recognition and Generation</title><link>https://ai-blog.noorshomelab.dev/transformers-js-guide/audio-processing-speech-recognition-and-generation/</link><pubDate>Sun, 26 Oct 2025 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/transformers-js-guide/audio-processing-speech-recognition-and-generation/</guid><description>&lt;h1 id="5-audio-processing-speech-recognition-and-generation"&gt;5. Audio Processing: Speech Recognition and Generation&lt;/h1&gt;
&lt;p&gt;Transformers.js supports audio processing tasks in addition to text and vision. This chapter covers two fundamental audio tasks: Automatic Speech Recognition (ASR), which converts spoken words into text, and Text-to-Speech (TTS), which generates natural-sounding speech from text.&lt;/p&gt;
&lt;h2 id="51-automatic-speech-recognition-asr"&gt;5.1. Automatic Speech Recognition (ASR)&lt;/h2&gt;
&lt;p&gt;ASR allows applications to transcribe spoken language into written text. It is central to voice assistants, dictation tools, and the transcription of audio recordings.&lt;/p&gt;</description></item></channel></rss>