<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Tracing on AI VOID</title><link>https://ai-blog.noorshomelab.dev/tags/tracing/</link><description>Recent content in Tracing on AI VOID</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 15 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://ai-blog.noorshomelab.dev/tags/tracing/index.xml" rel="self" type="application/rss+xml"/><item><title>Building Your AI Observability Foundation with OpenTelemetry</title><link>https://ai-blog.noorshomelab.dev/ai-observability-guide/building-ai-observability-foundation-opentelemetry/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/ai-observability-guide/building-ai-observability-foundation-opentelemetry/</guid><description>&lt;h2 id="introduction-laying-the-observability-groundwork-with-opentelemetry"&gt;Introduction: Laying the Observability Groundwork with OpenTelemetry&lt;/h2&gt;
&lt;p&gt;Welcome back, future AI observability masters! In the previous chapter, we explored the &lt;em&gt;why&lt;/em&gt; of AI observability and its critical role in managing the unique complexities of AI systems in production. Now, it&amp;rsquo;s time to dive into the &lt;em&gt;how&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This chapter is all about building a solid foundation using &lt;strong&gt;OpenTelemetry (OTel)&lt;/strong&gt;, the open-source, vendor-neutral standard for collecting and managing telemetry data. Think of OpenTelemetry as your universal language for telling the story of your AI application&amp;rsquo;s performance, behavior, and health. Why is this so crucial for AI? Because AI systems often involve multiple components, non-deterministic outputs, and a constant need to understand prompt-to-response dynamics. Without a standardized way to collect and correlate data, debugging a misbehaving LLM or an underperforming recommendation engine can feel like searching for a needle in a haystack&amp;hellip; in the dark!&lt;/p&gt;</description></item><item><title>Tracing AI Workflows: From Prompt to Prediction</title><link>https://ai-blog.noorshomelab.dev/ai-observability-guide/tracing-ai-workflows-prompt-to-prediction/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/ai-observability-guide/tracing-ai-workflows-prompt-to-prediction/</guid><description>&lt;h2 id="tracing-ai-workflows-from-prompt-to-prediction"&gt;Tracing AI Workflows: From Prompt to Prediction&lt;/h2&gt;
&lt;p&gt;Welcome back, future MLOps heroes! In our previous chapter, we explored the fundamentals of logging for AI systems, setting the stage for gaining visibility into our applications. We learned how structured, contextual logs are invaluable for understanding &lt;em&gt;what happened&lt;/em&gt;. But what if you need to understand &lt;em&gt;how&lt;/em&gt; something happened, especially when your AI application interacts with multiple services, databases, and external APIs? How do you follow a single user request or an AI agent&amp;rsquo;s decision-making process across all these moving parts?&lt;/p&gt;</description></item><item><title>Observability: Logging, Metrics, and Distributed Tracing</title><link>https://ai-blog.noorshomelab.dev/systems-engineering-2026/observability-logging-metrics-tracing/</link><pubDate>Fri, 15 May 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/systems-engineering-2026/observability-logging-metrics-tracing/</guid><description>&lt;p&gt;Imagine your beautifully crafted distributed system running in production. It&amp;rsquo;s composed of many microservices, perhaps handling millions of requests per day, or coordinating a fleet of AI agents. Suddenly, a customer reports an error, or a critical business process slows to a crawl. How do you find out what&amp;rsquo;s going on? Where do you even begin looking?&lt;/p&gt;
&lt;p&gt;This is where &lt;strong&gt;observability&lt;/strong&gt; comes in. It&amp;rsquo;s the ability to infer the internal state of a system by examining its external outputs. In complex, distributed systems, you can&amp;rsquo;t just attach a debugger to a single process. You need to gather data from every corner of your architecture to piece together the full story. This chapter will equip you with the fundamental tools and mindset for achieving deep visibility into your systems: logging, metrics, and distributed tracing.&lt;/p&gt;</description></item><item><title>Observability for AI Systems: Monitoring, Logging &amp;amp; Tracing</title><link>https://ai-blog.noorshomelab.dev/ai-system-design-2026-guide/observability-ai-systems/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/ai-system-design-2026-guide/observability-ai-systems/</guid><description>&lt;h2 id="introduction-to-observability-for-ai-systems"&gt;Introduction to Observability for AI Systems&lt;/h2&gt;
&lt;p&gt;Welcome to Chapter 9! In our journey to design scalable AI-powered applications, we&amp;rsquo;ve explored modular microservices, efficient data pipelines, and intelligent orchestration. Now, it&amp;rsquo;s time to talk about what happens &lt;em&gt;after&lt;/em&gt; your brilliant AI system is deployed: how do you know it&amp;rsquo;s working as expected? How do you detect problems before they impact users? How do you understand &lt;em&gt;why&lt;/em&gt; something went wrong?&lt;/p&gt;
&lt;p&gt;This is where &lt;strong&gt;observability&lt;/strong&gt; comes into play. Observability isn&amp;rsquo;t just about knowing if your system is up or down; it&amp;rsquo;s about being able to infer the internal state of your system by examining the data it produces. For AI systems, this is even more critical, as model performance can degrade silently, data can drift, and complex interactions between agents can lead to unpredictable behavior.&lt;/p&gt;</description></item><item><title>Chapter 15: Robust Error Handling, Logging, and Debugging</title><link>https://ai-blog.noorshomelab.dev/stellar-gen-guide/chapter-15-error-handling/</link><pubDate>Mon, 02 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/stellar-gen-guide/chapter-15-error-handling/</guid><description>&lt;h2 id="chapter-15-robust-error-handling-logging-and-debugging"&gt;Chapter 15: Robust Error Handling, Logging, and Debugging&lt;/h2&gt;
&lt;p&gt;Welcome to Chapter 15 of our journey to build a production-grade Rust static site generator! Up until now, we&amp;rsquo;ve focused on building out core functionalities like content parsing, templating, and routing. While our SSG can generate sites, it&amp;rsquo;s not yet resilient to real-world issues like malformed content files, missing templates, or unexpected I/O errors. In a production environment, an application that crashes silently or provides cryptic error messages is a nightmare to maintain.&lt;/p&gt;</description></item><item><title>AI Observability: A Comprehensive Guide</title><link>https://ai-blog.noorshomelab.dev/ai-observability-guide/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/ai-observability-guide/</guid><description>&lt;p&gt;Welcome to this essential guide on AI Observability. Here, you will learn how to implement comprehensive monitoring for your AI systems, covering critical aspects like logging, tracing, metrics, and cost management. Discover best practices for tracking prompts, responses, latency, and overall performance to ensure your AI models operate reliably in production environments.&lt;/p&gt;</description></item><item><title>AI Observability: A Practical Guide to Monitoring AI Systems</title><link>https://ai-blog.noorshomelab.dev/guides/ai-observability-guide/</link><pubDate>Fri, 20 Mar 2026 00:00:00 +0000</pubDate><guid>https://ai-blog.noorshomelab.dev/guides/ai-observability-guide/</guid><description>&lt;p&gt;Welcome to this guide on AI Observability. If you&amp;rsquo;re working with AI models, especially in production, you know that getting them to work is one thing, but making sure they &lt;em&gt;keep&lt;/em&gt; working reliably, efficiently, and cost-effectively is a different challenge. That&amp;rsquo;s exactly what AI observability helps us achieve.&lt;/p&gt;
&lt;h3 id="what-is-ai-observability"&gt;What is AI Observability?&lt;/h3&gt;
&lt;p&gt;In plain language, AI observability is about understanding the internal state of your AI systems—like large language models (LLMs) or custom machine learning models—from their external outputs. It&amp;rsquo;s like giving your AI system a set of senses so you can see, hear, and feel what it&amp;rsquo;s doing, how it&amp;rsquo;s performing, and why it might be behaving in a certain way.&lt;/p&gt;</description></item></channel></rss>