<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet type="text/xsl" href="rss.xsl"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>vLLora - Debug your agents in realtime Blog</title>
        <link>https://vllora.dev/blog</link>
        <description>vLLora - Debug your agents in realtime Blog</description>
        <lastBuildDate>Tue, 20 Jan 2026 00:00:00 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <item>
            <title><![CDATA[Introducing Lucy: Trace-Native Debugging Inside vLLora]]></title>
            <link>https://vllora.dev/blog/introducing-lucy</link>
            <guid>https://vllora.dev/blog/introducing-lucy</guid>
            <pubDate>Tue, 20 Jan 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Lucy is a built-in assistant that reads your traces, diagnoses agent failures, and suggests concrete fixes in seconds.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>Your agent fails midway through a task. The trace is right there in vLLora, but it's 200 spans deep. You start scrolling, scanning for the red error or the suspicious tool call. Somewhere in those spans is the answer, but finding it takes longer than it should.</p>
<p>Today we're launching <strong>Lucy</strong>, an AI assistant built directly into vLLora that reads your traces and tells you what went wrong. You ask a question in plain English, Lucy inspects the trace, and you get a diagnosis with concrete next steps. Lucy is available now in beta.</p>
<video controls="" playsinline="" muted="" loop="" style="width:100%;border-radius:12px"><source src="/videos/vllora-lucy-whats-wrong-with-run.mp4" type="video/mp4"><p>Sorry, your browser doesn’t support embedded videos.</p></video>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="why-finding-out-what-went-wrong-is-hard">Why finding out what went wrong is hard<a href="https://vllora.dev/blog/introducing-lucy#why-finding-out-what-went-wrong-is-hard" class="hash-link" aria-label="Direct link to Why finding out what went wrong is hard" title="Direct link to Why finding out what went wrong is hard" translate="no">​</a></h2>
<p>Agent failures don’t look like traditional exceptions. A single bad response is usually the result of a chain of small choices spread across a long execution.</p>
<ul>
<li class=""><strong>Long traces:</strong> One thread can include hundreds of spans across model calls, tool calls, retries, and fallbacks.</li>
<li class=""><strong>Delayed symptoms:</strong> The root cause often happens early, but only becomes visible much later in the run.</li>
<li class=""><strong>Silent degradation:</strong> A thread can be marked "successful" while actually running with missing data, wrong assumptions, or a broken tool path.</li>
</ul>
<p>When debugging becomes "scroll until you get lucky," you miss important signals and burn time (and tokens) doing it.</p>
<p>Lucy is good at exactly this: reading the trace end-to-end, spotting failure patterns, and turning them into actionable fixes.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="what-can-you-ask-lucy-to-do">What can you ask Lucy to do<a href="https://vllora.dev/blog/introducing-lucy#what-can-you-ask-lucy-to-do" class="hash-link" aria-label="Direct link to What can you ask Lucy to do" title="Direct link to What can you ask Lucy to do" translate="no">​</a></h2>
<p>Lucy sits next to your traces and threads. Ask a plain-English question, and it will inspect the trace, flag failure points, and return a fix-oriented report: root cause, impact, and recommended next steps.</p>
<p>Ask Lucy questions like:</p>
<ul>
<li class="">Analyze this thread for issues</li>
<li class="">Check for errors in this thread</li>
<li class="">Show me the slowest operations</li>
<li class="">What's the total cost?</li>
</ul>
<p>Lucy can also help you spot patterns across multiple failing runs and suggest prompt rewrites to reduce ambiguity.</p>
<p><img decoding="async" loading="lazy" alt="Lucy Agent Debuging the errors in the thread" src="https://vllora.dev/assets/images/lucy-whats-wrong-with-my-thread-bf22a1a8030712b716da80750f620100.png" width="1916" height="960" class="img_uaae"></p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="whats-wrong-with-my-thread">"What’s wrong with my thread?"<a href="https://vllora.dev/blog/introducing-lucy#whats-wrong-with-my-thread" class="hash-link" aria-label="Direct link to &quot;What’s wrong with my thread?&quot;" title="Direct link to &quot;What’s wrong with my thread?&quot;" translate="no">​</a></h2>
<p>We had a Travel agent which was running for a long time, apparently stuck in a loop within the <code>BetweenHorizonalEnd</code> span. Instead of digging through the logs manually, we simply asked Lucy:</p>
<blockquote>
<p>What's wrong with my thread?</p>
</blockquote>
<p>Lucy inspected the thread's spans, identified a recurring failure pattern, and explained the root cause and impact, along with concrete next steps.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="what-lucy-found-schema-mismatches-and-contradictory-prompts">What Lucy found: Schema mismatches and contradictory prompts<a href="https://vllora.dev/blog/introducing-lucy#what-lucy-found-schema-mismatches-and-contradictory-prompts" class="hash-link" aria-label="Direct link to What Lucy found: Schema mismatches and contradictory prompts" title="Direct link to What Lucy found: Schema mismatches and contradictory prompts" translate="no">​</a></h3>
<p>In this trace, the agent was failing to complete a travel itinerary. Lucy didn't just flag the error; she identified a complex failure pattern involving both the <strong>code</strong> (schema) and the <strong>instructions</strong> (prompt).</p>
<p><strong>1. The "Hallucinated" Arguments</strong>
Lucy pinpointed exactly why the tools were failing. The model was trying to call <code>research_flights</code> with a <code>from_city</code> argument and <code>research_accommodations</code> with <code>check_in_date</code>.</p>
<ul>
<li class=""><strong>The Diagnosis:</strong> "Severe Tool Schema Mismatch."</li>
<li class=""><strong>The Reality:</strong> These arguments didn't exist in the registered tool definition, causing the model to hit a wall of <code>unexpected keyword</code> errors.</li>
</ul>
<p><strong>2. The Hidden Logic Trap</strong>
Critically, Lucy found a root cause that a human scanning logs would likely miss: <strong>Prompt Contradiction.</strong></p>
<ul>
<li class=""><strong>The Conflict:</strong> The system prompt instructed the agent to "prefer analysis only" while simultaneously telling it that it "MUST call tools."</li>
<li class=""><strong>The Result:</strong> The model was paralyzed between two opposing instructions, leading to the erratic tool behavior.</li>
</ul>
<p><strong>3. Silent Failures (Truncation)</strong>
Lucy also caught a silent degradation issue: <code>Severe Output Truncation</code>. The <code>Restaurant Extraction</code> step was hitting token limits and cutting off data mid-list (<code>output_tokens: 4000... truncated</code>). The run looked "successful" to the server, but the downstream user was getting incomplete data.</p>
<p>Lucy’s report turned a vague "it's not working" complaint into three distinct engineering tasks: fix the tool schema, clarify the system prompt, and increase the context window for extraction.</p>
<p><img decoding="async" loading="lazy" alt="Lucy interface showing detected issues list" src="https://vllora.dev/assets/images/lucy-output-ed2d1f35c0ac4a0b8e16bc1c933cebaa.png" width="1915" height="963" class="img_uaae">
<em>Caption: Lucy analyzes the trace and detects multiple issues simultaneously: invalid tool arguments (<code>from_city</code>), contradictory system prompt instructions, and token truncation in the output.</em></p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="why-this-matters-in-production">Why this matters in production<a href="https://vllora.dev/blog/introducing-lucy#why-this-matters-in-production" class="hash-link" aria-label="Direct link to Why this matters in production" title="Direct link to Why this matters in production" translate="no">​</a></h2>
<p>This is a common failure mode in tool-using agents: when the tool contract isn't perfectly aligned (schema, handler, prompt, examples), the model starts guessing.</p>
<p>The cost isn't limited to a single failed call:</p>
<ul>
<li class=""><strong>Latency increases</strong> as the agent retries and thrashes on deterministic validation failures</li>
<li class=""><strong>Cost increases</strong> as token usage accumulates across repeated attempts</li>
<li class=""><strong>Quality degrades</strong> when the agent gives up on structured tools and improvises without real data</li>
</ul>
<p>Even if your run "succeeds," you can still be paying for broken execution paths.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="how-lucy-works-with-vllora-tracing">How Lucy works with vLLora tracing<a href="https://vllora.dev/blog/introducing-lucy#how-lucy-works-with-vllora-tracing" class="hash-link" aria-label="Direct link to How Lucy works with vLLora tracing" title="Direct link to How Lucy works with vLLora tracing" translate="no">​</a></h2>
<p>Lucy's intelligence comes from vLLora's tracing infrastructure. vLLora captures everything your agent does:</p>
<ul>
<li class=""><strong>Spans:</strong> Individual operations like LLM calls, tool executions, and retrieval steps</li>
<li class=""><strong>Runs:</strong> A single execution of your agent, made up of a tree of spans</li>
<li class=""><strong>Threads:</strong> A full conversation, containing multiple runs over time</li>
</ul>
<p>When you ask Lucy a question, it pulls the relevant spans and runs, reconstructs the execution flow, and analyzes patterns across the data. This is context that would take a human hours to piece together manually.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="get-started">Get started<a href="https://vllora.dev/blog/introducing-lucy#get-started" class="hash-link" aria-label="Direct link to Get started" title="Direct link to Get started" translate="no">​</a></h2>
<p>Lucy is available now in beta for all vLLora users.</p>
<ul>
<li class=""><strong>Click the Lucy icon</strong> in the bottom right corner.</li>
<li class=""><strong>Ask a question</strong> like "What's wrong with my thread?"</li>
<li class=""><strong>Get an instant diagnosis</strong> without leaving your workflow.</li>
</ul>
<p>Lucy will inspect your active context and give you a clear diagnosis, so you can spend less time scrolling and more time shipping.</p>
<p>See the full Lucy documentation <a class="" href="https://vllora.dev/docs/lucy">here</a></p></div>]]></content:encoded>
            <category>Debugging</category>
            <category>Agents</category>
            <category>Tracing</category>
        </item>
        <item>
            <title><![CDATA[Silent Failures: Why a “Successful” LLM Workflow Can Cost 40% More]]></title>
            <link>https://vllora.dev/blog/debugging-silent-failures</link>
            <guid>https://vllora.dev/blog/debugging-silent-failures</guid>
            <pubDate>Wed, 31 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Your agent finishes without errors, but it might be burning budget on invisible retries. Learn how to uncover the silent failures inflating your LLM costs by 40% using vLLora MCP.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>Your agent returns the right answer. The status is <code>200 OK</code>, and the user walks away satisfied. On the surface, everything looks fine. But when you check the API bill, it doesn’t line up with how simple the task actually was.</p>
<p>LLMs are unusually resilient. When a tool call fails, they don’t stop execution. They try again with small variations. When a response looks off, they adjust and keep going. That behavior is often helpful, but it can also hide broken execution paths. The user sees a successful result, while your token usage quietly absorbs retries, fallbacks, and extra reasoning that never needed to happen.</p>
<p><img decoding="async" loading="lazy" alt="Silent failures" src="https://vllora.dev/assets/images/silent-failures-cover-7fdd5fc30c049b498d7a54083548fd26.png" width="2722" height="1361" class="img_uaae"></p>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-illusion-of-success">The Illusion of Success<a href="https://vllora.dev/blog/debugging-silent-failures#the-illusion-of-success" class="hash-link" aria-label="Direct link to The Illusion of Success" title="Direct link to The Illusion of Success" translate="no">​</a></h2>
<p>When an agent returns the correct output and the logs are clean, we assume the logic is sound. However, LLM resilience introduces a new debugging challenge.</p>
<ul>
<li class="">
<p><strong>Standard Software:</strong> Invalid parameters trigger immediate exceptions. You see the stack trace and fix the bug.</p>
</li>
<li class="">
<p><strong>LLMs:</strong> If a tool call fails, the workflow doesn't crash with an error. Most of the SDKs have a built-in retry mechanism that will retry with new arguments, switches strategies, or forces a solution.</p>
</li>
</ul>
<p>This resilience masks architectural issues. The agent produces the correct output while quietly absorbing retries and extra reasoning steps.</p>
<p>Standard observability tools catch crashes but often miss these silent performance leaks. A "successful" run looks identical to an optimized one on a dashboard, even if it performed three times the necessary work.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-suspect-a-slow-travel-agent">The Suspect: A Slow Travel Agent<a href="https://vllora.dev/blog/debugging-silent-failures#the-suspect-a-slow-travel-agent" class="hash-link" aria-label="Direct link to The Suspect: A Slow Travel Agent" title="Direct link to The Suspect: A Slow Travel Agent" translate="no">​</a></h2>
<p>To make this concrete, consider a simple travel planning agent. It takes a destination, travel dates, and a few preferences, then generates a five-day itinerary.</p>
<p>From a functional perspective, the agent behaves as expected. Each run produces a reasonable itinerary that matches the user’s request, and there are no visible errors or user complaints.</p>
<p>The problem shows up when you look at the metrics:</p>
<ul>
<li class=""><strong>Output:</strong> Correct itinerary</li>
<li class=""><strong>Status:</strong> <code>200 OK</code></li>
<li class=""><strong>Time Taken:</strong> 361 seconds (over 6 minutes)</li>
<li class=""><strong>Cost:</strong> $0.068 per run</li>
</ul>
<p>For a task of this scope, both the time taken and the cost were unusually high. A closer look raised an obvious question: why did generating a straightforward itinerary require <strong>49 separate LLM calls</strong>?</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-investigation-using-vllora-mcp">The Investigation (Using vLLora MCP)<a href="https://vllora.dev/blog/debugging-silent-failures#the-investigation-using-vllora-mcp" class="hash-link" aria-label="Direct link to The Investigation (Using vLLora MCP)" title="Direct link to The Investigation (Using vLLora MCP)" translate="no">​</a></h2>
<p>At this point, the problem wasn’t correctness — it was understanding <em>how</em> the agent arrived at the result. Manually tracing through dozens of JSON logs would have been slow and error-prone, especially given the number of model calls involved.</p>
<p>Instead, we used the <a class="" href="https://vllora.dev/docs/vllora-mcp-server"><strong>vLLora MCP server</strong></a> to inspect the most recent agent run. MCP exposes trace data as structured tools, which means a coding agent can reason about execution flow, tool calls, and model behavior directly — without parsing raw logs or switching to a separate dashboard.</p>
<p>We asked the coding agent:</p>
<blockquote>
<p>Use vLLora MCP to inspect the most recent agent run and explain why it produced this result.</p>
</blockquote>
<video controls="" playsinline="" muted="" loop="" style="width:100%;border-radius:12px"><source src="/videos/vllora-mcp-shorter.mp4" type="video/mp4"><p>Sorry, your browser doesn’t support embedded videos.</p></video>
<p>The agent inspected the latest traces and summarized what actually happened during the run. While the execution was marked successful, the trace revealed repeated failed attempts to call the same tool.</p>
<p>Specifically:</p>
<ul>
<li class="">The agent retried the same tool call multiple times with adjusted parameters</li>
<li class="">Each failure was handled internally without surfacing an error</li>
<li class="">A fallback path eventually produced the correct result</li>
<li class="">The extra retries directly inflated both latency and cost</li>
</ul>
<p>Because the run completed successfully, none of this appeared in error metrics. The inefficiency only becomes visible when you inspect the execution path itself rather than the final outcome.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-reveal-the-parameter-mismatch">The Reveal: The Parameter Mismatch<a href="https://vllora.dev/blog/debugging-silent-failures#the-reveal-the-parameter-mismatch" class="hash-link" aria-label="Direct link to The Reveal: The Parameter Mismatch" title="Direct link to The Reveal: The Parameter Mismatch" translate="no">​</a></h2>
<p>The MCP analysis pointed to a very specific failure pattern. This wasn’t a logic bug or a model hallucination. It was a syntax mismatch between what the model assumed and what the tool schema actually required.</p>
<p>The agent was effectively stuck in a <strong>validation loop</strong>.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="attempt-1">Attempt 1<a href="https://vllora.dev/blog/debugging-silent-failures#attempt-1" class="hash-link" aria-label="Direct link to Attempt 1" title="Direct link to Attempt 1" translate="no">​</a></h3>
<p>The model called <code>research_accommodations</code> using camelCase parameters such as <code>checkin_date</code>.</p>
<ul>
<li class=""><strong>Result:</strong> <code>ValidationError</code></li>
<li class=""><strong>Reason:</strong> The tool schema expected snake_case parameter names.</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="attempt-2">Attempt 2<a href="https://vllora.dev/blog/debugging-silent-failures#attempt-2" class="hash-link" aria-label="Direct link to Attempt 2" title="Direct link to Attempt 2" translate="no">​</a></h3>
<p>After observing the failure, the model retried with a lowercase variation: <code>checkindate</code>.</p>
<ul>
<li class=""><strong>Result:</strong> <code>ValidationError</code></li>
<li class=""><strong>Reason:</strong> The parameter name still did not match the schema.</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="attempt-3">Attempt 3<a href="https://vllora.dev/blog/debugging-silent-failures#attempt-3" class="hash-link" aria-label="Direct link to Attempt 3" title="Direct link to Attempt 3" translate="no">​</a></h3>
<p>The model simplified further, removing part of the name and trying <code>check_in</code>.</p>
<ul>
<li class=""><strong>Result:</strong> <code>ValidationError</code></li>
<li class=""><strong>Reason:</strong> Still not a valid parameter.</li>
</ul>
<p>After multiple failed attempts, the agent abandoned the structured tool entirely.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="fallback-path">Fallback path<a href="https://vllora.dev/blog/debugging-silent-failures#fallback-path" class="hash-link" aria-label="Direct link to Fallback path" title="Direct link to Fallback path" translate="no">​</a></h3>
<p>The model fell back to a generic search call:</p>
<div class="language-text codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-text codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">tavily_search("hotels in Tokyo")</span><br></span></code></pre></div></div>
<p>This fallback produced usable results, which is why the overall run completed successfully and returned a 200 OK. However, that success came at a cost. The trace showed 21 wasted tool calls and thousands of input tokens consumed by repeated retries, error messages, and recovery logic.</p>
<p>From the outside, the agent looked healthy. Under the hood, it was working much harder than it needed to.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-fix-delegating-to-the-agent">The Fix: Delegating to the Agent<a href="https://vllora.dev/blog/debugging-silent-failures#the-fix-delegating-to-the-agent" class="hash-link" aria-label="Direct link to The Fix: Delegating to the Agent" title="Direct link to The Fix: Delegating to the Agent" translate="no">​</a></h2>
<p>Once the MCP analysis identified the root cause, ambiguous docstrings, there was no need to manually search through the codebase or write the fix by hand. We delegated the change to the coding agent.</p>
<p>From Cursor, we asked:</p>
<blockquote>
<p>Update the <code>research_accommodations</code> tool definition.<br>
<!-- -->Make the <code>check_in_date</code> parameter explicitly require snake_case to prevent retry loops.</p>
</blockquote>
<p>The agent located the relevant Pydantic model and updated the docstrings to remove any ambiguity for the model.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="the-code-change">The Code Change<a href="https://vllora.dev/blog/debugging-silent-failures#the-code-change" class="hash-link" aria-label="Direct link to The Code Change" title="Direct link to The Code Change" translate="no">​</a></h3>
<h4 class="anchor anchorTargetStickyNavbar_sDfD" id="before-ambiguous">Before: Ambiguous<a href="https://vllora.dev/blog/debugging-silent-failures#before-ambiguous" class="hash-link" aria-label="Direct link to Before: Ambiguous" title="Direct link to Before: Ambiguous" translate="no">​</a></h4>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">class</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">AccommodationSearch</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">BaseModel</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:rgb(206, 145, 120)">"""Search for hotels and accommodations."""</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    check_in_date</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(86, 156, 214)">str</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> Field</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        description</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"Check-in date in YYYY-MM-DD format"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><br></span></code></pre></div></div>
<p>The description specified the value format, but left the parameter name open to interpretation.</p>
<h4 class="anchor anchorTargetStickyNavbar_sDfD" id="after-explicit">After: Explicit<a href="https://vllora.dev/blog/debugging-silent-failures#after-explicit" class="hash-link" aria-label="Direct link to After: Explicit" title="Direct link to After: Explicit" translate="no">​</a></h4>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">class</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">AccommodationSearch</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">BaseModel</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token triple-quoted-string string" style="color:rgb(206, 145, 120)">"""</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token triple-quoted-string string" style="color:rgb(206, 145, 120)">    Search for hotels.</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token triple-quoted-string string" style="color:rgb(206, 145, 120)">    IMPORTANT: All parameters must be in snake_case.</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token triple-quoted-string string" style="color:rgb(206, 145, 120)">    """</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    check_in_date</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token builtin" style="color:rgb(86, 156, 214)">str</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> Field</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line theme-code-block-highlighted-line" style="color:#9CDCFE"><span class="token plain">        description</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"Check-in date (YYYY-MM-DD). Strictly use parameter name: 'check_in_date'."</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><br></span></code></pre></div></div>
<p>By explicitly stating the required parameter name, the ambiguity that caused the retry loop was removed.</p>
<p>With the fix applied, we cleared the agent context and ran the exact same travel planning task again to verify the results.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="measuring-the-impact">Measuring the Impact<a href="https://vllora.dev/blog/debugging-silent-failures#measuring-the-impact" class="hash-link" aria-label="Direct link to Measuring the Impact" title="Direct link to Measuring the Impact" translate="no">​</a></h2>
<p>To compare the two runs, we asked the coding agent to analyze both traces side by side and summarize the differences.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="the-prompt">The Prompt<a href="https://vllora.dev/blog/debugging-silent-failures#the-prompt" class="hash-link" aria-label="Direct link to The Prompt" title="Direct link to The Prompt" translate="no">​</a></h3>
<blockquote>
<p>Compare the performance of the bad run 4ea18f79-4c4c-4d2c-b628-20d510af7181 against the fixed run a5cf084b-01b2-4288-acef-aa2bedc31426. Show me a table of Latency, Cost, and Token Usage differences.</p>
</blockquote>
<video controls="" playsinline="" muted="" loop="" style="width:100%;border-radius:12px"><source src="/videos/vllora-mcp-compare.mp4" type="video/mp4"><p>Sorry, your browser doesn’t support embedded videos.</p></video>
<p>The agent analyzed the telemetry from both traces and generated this comparison:</p>
<table><thead><tr><th>Metric</th><th>Bad Run (4ea18f79)</th><th>Fixed Run (a5cf084b)</th><th>Difference</th><th>Improvement</th></tr></thead><tbody><tr><td><strong>Latency</strong></td><td>361.21 seconds (6.02 min)</td><td>194.66 seconds (3.24 min)</td><td>-166.55 seconds</td><td><strong>46.1% faster</strong></td></tr><tr><td><strong>Total Cost</strong></td><td>$0.0683</td><td>$0.0430</td><td>-$0.0254</td><td><strong>37.1% cheaper</strong></td></tr><tr><td><strong>LLM Calls</strong></td><td>49 calls</td><td>28 calls</td><td>-21 calls</td><td><strong>42.9% fewer</strong></td></tr><tr><td><strong>Input Tokens</strong></td><td>114,162</td><td>64,608</td><td>-49,554</td><td><strong>43.4% reduction</strong></td></tr><tr><td><strong>Output Tokens</strong></td><td>14,916</td><td>10,691</td><td>-4,225</td><td><strong>28.3% reduction</strong></td></tr><tr><td><strong>Total Tokens</strong></td><td>129,078</td><td>75,299</td><td>-53,779</td><td><strong>41.7% reduction</strong></td></tr></tbody></table>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="impact-at-scale">Impact at Scale<a href="https://vllora.dev/blog/debugging-silent-failures#impact-at-scale" class="hash-link" aria-label="Direct link to Impact at Scale" title="Direct link to Impact at Scale" translate="no">​</a></h3>
<p>For a single run, saving 2 cents might seem negligible. But at production scale, "silent failures" are a massive budget leak.</p>
<p>Based on these numbers, an agent running 1,000 times a day would see:</p>
<ul>
<li class="">
<p><strong>Annual Savings:</strong> ~$9,271/year</p>
</li>
<li class="">
<p><strong>Processing Time Saved:</strong> ~46 hours per day</p>
</li>
<li class="">
<p><strong>Token Reduction:</strong> ~54 million tokens/day</p>
</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="where-did-the-waste-go">Where did the waste go?<a href="https://vllora.dev/blog/debugging-silent-failures#where-did-the-waste-go" class="hash-link" aria-label="Direct link to Where did the waste go?" title="Direct link to Where did the waste go?" translate="no">​</a></h3>
<p>The comparison highlights exactly where the inefficiency was hiding. By fixing the parameter names, we eliminated:</p>
<ul>
<li class="">
<p><strong>Multiple Retry Loops:</strong> The agent no longer wastes rounds guessing the correct parameter syntax.</p>
</li>
<li class="">
<p><strong>Context Pollution:</strong> We removed thousands of tokens of error messages and failed tool outputs from the context window.</p>
</li>
<li class="">
<p><strong>Inefficient Fallbacks:</strong> The agent uses the specialized <code>research_accommodations</code> tool immediately, rather than falling back to a more expensive generic search.</p>
</li>
</ul>
<p>The fix was a one-line documentation change. But we wouldn't have found it without seeing the actual execution pattern—the retry attempts that looked like normal agent behavior until we inspected the traces.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="why-this-matters">Why This Matters<a href="https://vllora.dev/blog/debugging-silent-failures#why-this-matters" class="hash-link" aria-label="Direct link to Why This Matters" title="Direct link to Why This Matters" translate="no">​</a></h2>
<p>Observability isn't just about catching errors; it's about catching inefficiencies. When agents "work" but cost too much, you need to see the execution flow, not just the final result.</p>
<p>Traditional debugging workflows require you to:</p>
<ol>
<li class="">Notice the performance issue</li>
<li class="">Switch to a tracing UI</li>
<li class="">Search for the relevant trace</li>
<li class="">Manually parse JSON logs</li>
<li class="">Connect the dots across multiple tool calls</li>
</ol>
<p>The MCP workflow lets your coding agent do steps 2-5. You stay in your editor. The agent understands the trace structure and can explain what's happening—not just what failed, but what's inefficient.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="connecting-the-mcp-server">Connecting the MCP Server<a href="https://vllora.dev/blog/debugging-silent-failures#connecting-the-mcp-server" class="hash-link" aria-label="Direct link to Connecting the MCP Server" title="Direct link to Connecting the MCP Server" translate="no">​</a></h2>
<p>vLLora's MCP server runs alongside your vLLora instance. Configure your MCP client to connect to it:</p>
<div class="language-json codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-json codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"mcpServers"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"vllora"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">      </span><span class="token property">"url"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"http://localhost:9090/mcp"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>or install the MCP server in your IDE:</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="quick-install">Quick Install<a href="https://vllora.dev/blog/debugging-silent-failures#quick-install" class="hash-link" aria-label="Direct link to Quick Install" title="Direct link to Quick Install" translate="no">​</a></h3>
<p><a href="https://insiders.vscode.dev/redirect?url=vscode:mcp/install?%7B%22type%22%3A%22http%22%2C%22name%22%3A%22vLLora%22%2C%22url%22%3A%22http%3A%2F%2Flocalhost%3A9090%2Fmcp%22%7D" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://img.shields.io/badge/VS_Code-Install_Server-0098FF?style=for-the-badge&amp;logo=visual-studio-code&amp;logoColor=white" alt="Install in VS Code" class="img_uaae"></a>
<a href="https://vs-open.link/mcp-install?%7B%22type%22%3A%22http%22%2C%22url%22%3A%22http%3A%2F%2Flocalhost%3A9090%2Fmcp%22%7D" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://img.shields.io/badge/Visual_Studio-Install_Server-C16FDE?style=for-the-badge&amp;logo=visualstudio&amp;logoColor=white" alt="Install in Visual Studio" class="img_uaae"></a>
<a href="cursor://anysphere.cursor-deeplink/mcp/install?name=vLLora&amp;config=eyJ1cmwiOiJodHRwOi8vbG9jYWxob3N0OjkwOTAvbWNwIn0=" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://cursor.com/deeplink/mcp-install-dark.svg" alt="Add to Cursor" class="img_uaae"></a></p>
<p>Once connected, your coding agent automatically discovers the trace inspection tools and can start using them immediately.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="closing-thoughts">Closing Thoughts<a href="https://vllora.dev/blog/debugging-silent-failures#closing-thoughts" class="hash-link" aria-label="Direct link to Closing Thoughts" title="Direct link to Closing Thoughts" translate="no">​</a></h2>
<p>Silent failures are expensive. They don't break your application, but they inflate your costs and slow down your users. The challenge is visibility: you need to see the execution flow, not just the final result.</p>
<p>vLLora's MCP Server brings trace inspection into your coding workflow, so you can debug inefficiencies the same way you debug errors: in your editor, with your tools. Don't just check if your agent works. Check <em>how</em> it works.</p>
<p>For setup details and advanced configuration, see the <a class="" href="https://vllora.dev/docs/vllora-mcp-server">vLLora MCP Server documentation</a>.</p></div>]]></content:encoded>
            <category>vLLora MCP</category>
            <category>Debugging</category>
            <category>Agents</category>
            <category>Tracing</category>
        </item>
        <item>
            <title><![CDATA[Introducing the vLLora MCP Server]]></title>
            <link>https://vllora.dev/blog/introducing-vllora-mcp-server</link>
            <guid>https://vllora.dev/blog/introducing-vllora-mcp-server</guid>
            <pubDate>Tue, 23 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Inspect traces and debug AI agents directly from your IDE using MCP tools.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>If you’re building agents with tools like Claude Code or Cursor, or you prefer working in the terminal, you’ve probably hit this friction already. Your agent runs, something breaks partway through, and now you have to context-switch to a web UI to understand what happened. You search for the right trace, click through LLM calls, and then try to carry that context back into your editor.</p>
<p>vLLora’s MCP Server removes that context switch. Your coding agent becomes the interface for inspecting traces, understanding failures, and debugging agent behavior — without leaving your editor or terminal.</p>
<p><img decoding="async" loading="lazy" alt="vLLora MCP Server" src="https://vllora.dev/assets/images/vllora-mcp-shorter-1585d23fa8d811872e2e31db0a3b39e4.gif" width="1280" height="720" class="img_uaae"></p>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="making-traces-programmatic">Making Traces Programmatic<a href="https://vllora.dev/blog/introducing-vllora-mcp-server#making-traces-programmatic" class="hash-link" aria-label="Direct link to Making Traces Programmatic" title="Direct link to Making Traces Programmatic" translate="no">​</a></h2>
<p>vLLora already captures detailed traces for every agent run — model calls, tool executions, and execution flow — and the web UI remains a powerful way to explore that data.</p>
<p>But not every debugging workflow fits a dashboard. If you’re working from the terminal, iterating inside an IDE, or using a coding agent to help debug another agent, you need trace data where that work happens. You need structured access that tools and agents can consume directly.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="built-for-coding-agents">Built for Coding Agents<a href="https://vllora.dev/blog/introducing-vllora-mcp-server#built-for-coding-agents" class="hash-link" aria-label="Direct link to Built for Coding Agents" title="Direct link to Built for Coding Agents" translate="no">​</a></h2>
<p>When you're building AI agents that need debugging, you shouldn't have to leave your coding environment to inspect traces.</p>
<p>Your coding agent already understands MCP. When you connect vLLora's MCP server, your agent immediately knows how to use the trace inspection tools. The JSON schemas are built into the protocol, so your agent understands what parameters each tool needs and what it returns.</p>
<p>For a complete list of available tools and prompts, see the <a class="" href="https://vllora.dev/docs/vllora-mcp-server#tools-available">MCP Server documentation</a>.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-something-just-failed-workflow">The "Something Just Failed" Workflow<a href="https://vllora.dev/blog/introducing-vllora-mcp-server#the-something-just-failed-workflow" class="hash-link" aria-label="Direct link to The &quot;Something Just Failed&quot; Workflow" title="Direct link to The &quot;Something Just Failed&quot; Workflow" translate="no">​</a></h2>
<p>You run your agent and it produces an unexpected result. You need to debug it.</p>
<p>Instead of opening a tracing UI, you ask your coding agent to help debug it. The agent can:</p>
<ul>
<li class="">locate recent failing runs</li>
<li class="">walk execution flow across spans</li>
<li class="">inspect the exact payload sent to the model</li>
</ul>
<p>The agent handles the underlying queries and returns the context you need — while you stay in your editor.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="debugging-in-practice">Debugging in Practice<a href="https://vllora.dev/blog/introducing-vllora-mcp-server#debugging-in-practice" class="hash-link" aria-label="Direct link to Debugging in Practice" title="Direct link to Debugging in Practice" translate="no">​</a></h2>
<p>Here’s what debugging looks like once the MCP server is connected.</p>
<video controls="" playsinline="" muted="" loop="" style="width:100%;border-radius:12px"><source src="/videos/vllora-mcp-shorter.mp4" type="video/mp4"><p>Sorry, your browser doesn’t support embedded videos.</p></video>
<p>An agent run completes, but keeps failing in the same way. The agent believes it’s fixing the issue by retrying with different parameter names, but the failures persist.</p>
<p>You ask your coding agent:</p>
<blockquote>
<p>Use vLLora MCP to inspect the most recent agent run and explain why it produced this result.</p>
</blockquote>
<p>The agent searches recent traces, follows the execution flow, and inspects the tool call spans. It finds repeated calls like:</p>
<div class="language-json codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-json codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"tool"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"research_flights"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"arguments"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"from_city"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"NYC"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"to_city"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"SFO"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"departure_date"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"2025-02-20"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>From the trace data, the agent sees that from_city is not a valid parameter in the registered tool schema. Because the argument names don’t match the schema exposed at runtime, the function never executes — every retry fails before the tool logic runs.</p>
<p>Instead of guessing, the agent explains the root cause directly from execution data: a mismatch between the agent’s assumed parameter names and the actual tool definition.</p>
<p><img decoding="async" loading="lazy" alt="Analyzing the trace data" src="https://vllora.dev/assets/images/vllora-mcp-result-c139f3998d571e55e9179bc05227b48d.png" width="864" height="869" class="img_uaae"></p>
<p>You get a clear explanation of why retries didn’t help and what needs to change, without leaving your editor or inspecting raw logs.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="connecting-the-mcp-server">Connecting the MCP Server<a href="https://vllora.dev/blog/introducing-vllora-mcp-server#connecting-the-mcp-server" class="hash-link" aria-label="Direct link to Connecting the MCP Server" title="Direct link to Connecting the MCP Server" translate="no">​</a></h2>
<p>vLLora's MCP server runs alongside your vLLora instance. Configure your MCP client to connect to it:</p>
<div class="language-json codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-json codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"mcpServers"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"vllora"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">      </span><span class="token property">"url"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"http://localhost:9090/mcp"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>or install the MCP server in your IDE:</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="quick-install">Quick Install<a href="https://vllora.dev/blog/introducing-vllora-mcp-server#quick-install" class="hash-link" aria-label="Direct link to Quick Install" title="Direct link to Quick Install" translate="no">​</a></h3>
<p><a href="https://insiders.vscode.dev/redirect?url=vscode:mcp/install?%7B%22type%22%3A%22http%22%2C%22name%22%3A%22vLLora%22%2C%22url%22%3A%22http%3A%2F%2Flocalhost%3A9090%2Fmcp%22%7D" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://img.shields.io/badge/VS_Code-Install_Server-0098FF?style=for-the-badge&amp;logo=visual-studio-code&amp;logoColor=white" alt="Install in VS Code" class="img_uaae"></a>
<a href="https://vs-open.link/mcp-install?%7B%22type%22%3A%22http%22%2C%22url%22%3A%22http%3A%2F%2Flocalhost%3A9090%2Fmcp%22%7D" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://img.shields.io/badge/Visual_Studio-Install_Server-C16FDE?style=for-the-badge&amp;logo=visualstudio&amp;logoColor=white" alt="Install in Visual Studio" class="img_uaae"></a>
<a href="cursor://anysphere.cursor-deeplink/mcp/install?name=vLLora&amp;config=eyJ1cmwiOiJodHRwOi8vbG9jYWxob3N0OjkwOTAvbWNwIn0=" target="_blank" rel="noopener noreferrer" class=""><img decoding="async" loading="lazy" src="https://cursor.com/deeplink/mcp-install-dark.svg" alt="Add to Cursor" class="img_uaae"></a></p>
<p>Once connected, your coding agent automatically discovers the trace inspection tools and can start using them.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="closing-thoughts">Closing Thoughts<a href="https://vllora.dev/blog/introducing-vllora-mcp-server#closing-thoughts" class="hash-link" aria-label="Direct link to Closing Thoughts" title="Direct link to Closing Thoughts" translate="no">​</a></h2>
<p>Debugging AI agents has been tedious—too much context switching, too little visibility into what's happening. vLLora's MCP Server brings trace inspection into your coding workflow, so you can debug agents the same way you debug code: in your editor, with your tools.</p>
<p>This brings observability closer to where agent reasoning happens.</p>
<p>For setup details and advanced configuration, see the <a class="" href="https://vllora.dev/docs/vllora-mcp-server">vLLora MCP Server documentation</a>.</p></div>]]></content:encoded>
            <category>vLLora MCP</category>
            <category>Debugging</category>
            <category>Agents</category>
            <category>Tracing</category>
        </item>
        <item>
            <title><![CDATA[Debugging Agents: Why Prompt Tweaks Can't Fix Stale State]]></title>
            <link>https://vllora.dev/blog/debugging-agents-stale-state</link>
            <guid>https://vllora.dev/blog/debugging-agents-stale-state</guid>
            <pubDate>Mon, 22 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[A post-mortem where an agent faithfully executed outdated map coordinates despite correct prompts. The fix required synchronizing frontend viewport state, not better prompt engineering.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>In the earlier deep-agent case study (<a class="" href="https://vllora.dev/blog/exploring-deep-agents">Browsr</a>), I focused on architecture. Here I'll stay grounded in one debugging failure I hit in a maps agent—a failure that looked like a prompt problem but wasn't. The agent behaved correctly in chat, the UI looked correct, and yet the results were consistently from the wrong area. I tried the usual prompt tweaks: stronger instructions, "be careful," "use the visible map," retries. None of it moved the needle.</p>
<p>Here's how map state flows through the agent loop and where it can drift:</p>
<p><img decoding="async" loading="lazy" alt="Maps agent architecture" src="https://vllora.dev/assets/images/maps-agent-architecture-d5dd8ef036a0e16e221cb569e498daef.png" width="855" height="819" class="img_uaae"></p>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-bug">The Bug<a href="https://vllora.dev/blog/debugging-agents-stale-state#the-bug" class="hash-link" aria-label="Direct link to The Bug" title="Direct link to The Bug" translate="no">​</a></h2>
<p>The user expectation was simple:</p>
<ul>
<li class="">Pan the map to the neighborhood you care about.</li>
<li class="">Ask for "Starbucks."</li>
<li class="">Get Starbucks locations <em>in what's visible on the map</em>.</li>
</ul>
<p>What I observed instead:</p>
<ul>
<li class="">The user panned and zoomed to San Francisco.</li>
<li class="">The agent responded confidently and took action.</li>
<li class="">The places returned were from Mumbai, not San Francisco.</li>
</ul>
<p>Nothing in the conversation transcript looked obviously wrong. The root cause: <strong>the agent wasn't getting the context of what the user was seeing on the map</strong>. The mismatch was between the visible map state on the user's screen and what the agent had access to when making tool calls.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-tool-payloads">The Tool Payloads<a href="https://vllora.dev/blog/debugging-agents-stale-state#the-tool-payloads" class="hash-link" aria-label="Direct link to The Tool Payloads" title="Direct link to The Tool Payloads" translate="no">​</a></h2>
<p>You don't need the implementation to see the bug. The tool call arguments are enough. Here's the <em>wrong</em> tool call:</p>
<div class="language-json codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-json codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"name"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"search_place_by_name"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"arguments"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"query"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"Starbucks"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// MISSING CONTEXT:</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// No "center_point" or "viewport_bbox" passed here.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// The backend silently defaulted to the session start location (Mumbai).</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>The tool was called without any location context because the agent didn't have access to what the user was seeing on the map. It defaulted to GPS/stale coordinates internally, so results came from the wrong area.</p>
<p>After I fixed the state being passed through, the tool call looked like this:</p>
<div class="language-json codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-json codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"name"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"search_places"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"arguments"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"latitude"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token number" style="color:rgb(181, 206, 168)">37.7476</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"longitude"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token number" style="color:rgb(181, 206, 168)">-122.4337</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"query"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"starbucks"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>Now the visible map coordinates are explicitly passed, and results align with what the user is looking at.</p>
<p>And importantly, retries didn't help. Without correcting the state, the agent would keep calling <code>search_place_by_name</code> without location, producing the same wrong results.</p>
<p>The agent wasn’t “bad at following instructions.” It was acting on the wrong state.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="why-prompt-tweaks-failed">Why Prompt Tweaks Failed<a href="https://vllora.dev/blog/debugging-agents-stale-state#why-prompt-tweaks-failed" class="hash-link" aria-label="Direct link to Why Prompt Tweaks Failed" title="Direct link to Why Prompt Tweaks Failed" translate="no">​</a></h3>
<p>I had already told the agent the right thing, in multiple forms:</p>
<ul>
<li class="">Search where the user is looking.</li>
<li class="">Prefer the visible map area over the user's location.</li>
<li class="">If the map moved, use the new location.</li>
</ul>
<p>But the agent <strong>couldn't see what the user was seeing on the map</strong>. The agent didn't have access to the current map view context—the center coordinates, zoom level, or visible bounds. When the user panned to San Francisco, that information stayed on the client side and never made it to the agent's context.</p>
<p>The agent can't follow an instruction that depends on information it doesn't have. When the tool schema expects explicit coordinates, and the agent's internal state still contains GPS/default coordinates (because it never received the updated map context), retries reproduce the same error:</p>
<ul>
<li class="">The prompt is correct.</li>
<li class="">The reasoning is coherent.</li>
<li class="">The tool arguments are wrong because the agent doesn't have the map context.</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="how-i-discovered-it">How I Discovered It<a href="https://vllora.dev/blog/debugging-agents-stale-state#how-i-discovered-it" class="hash-link" aria-label="Direct link to How I Discovered It" title="Direct link to How I Discovered It" translate="no">​</a></h3>
<p>I found this bug by inspecting the tool call arguments in the execution logs. The logs showed the <code>search_place_by_name</code> tool being called with just the query—no location context.</p>
<p><img decoding="async" loading="lazy" alt="Maps Tool call arguments" src="https://vllora.dev/assets/images/tool-call-wrong-8969189cb5ad408960724a4ae8e16c92.png" width="1539" height="902" class="img_uaae"></p>
<p>The results came back from the wrong area because the agent never received the context of what the user was seeing—it was using stale GPS coordinates internally instead of the visible map bounds the user was actually looking at. Once I saw this mismatch between what the user saw and what the agent knew, the rest of the debugging was straightforward. I used <a class="" href="https://vllora.dev/using-vllora">vLLora</a> to capture these traces, which made the missing location argument obvious immediately.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-fix">The Fix<a href="https://vllora.dev/blog/debugging-agents-stale-state#the-fix" class="hash-link" aria-label="Direct link to The Fix" title="Direct link to The Fix" translate="no">​</a></h2>
<p>The fix wasn't changing prompts or tool schemas. It was ensuring the map state flowed from the React frontend to the agent's execution context.</p>
<p>Here's the mechanism:</p>
<p><strong>Frontend: React state tracking.</strong> The Google Maps component (<code>GoogleMapsManager</code>) tracks the current map center and zoom in React state. When the user pans or zooms, <code>setCenter</code> and <code>setZoom</code> update this state. This state lives entirely on the client side—the backend agent never sees it unless we explicitly send it.</p>
<p><strong>State capture on message send.</strong> When the user types a message and submits it, we capture the current map state from React <em>before</em> sending the request to the agent backend. The Chat component reads <code>center</code> and <code>zoom</code> from the map component's state at that moment.</p>
<p><strong>Context injection into agent execution.</strong> We inject the captured map coordinates into the agent's execution context. In our setup, this happens through the task context object that gets passed with each agent invocation. The context includes:</p>
<div class="language-typescript codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-typescript codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  map_center</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"> latitude</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token number" style="color:rgb(181, 206, 168)">37.7749</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> longitude</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">-</span><span class="token number" style="color:rgb(181, 206, 168)">122.4194</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  map_zoom</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token number" style="color:rgb(181, 206, 168)">13</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>This context is available to the agent throughout its execution. The agent's system prompt can reference these values, or they can be injected directly into tool calls.</p>
<p><strong>Agent uses context in tool calls.</strong> The agent now has access to the visible map coordinates. When it needs to call <code>search_places</code>, it extracts the coordinates from the context and passes them explicitly:</p>
<div class="language-json codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-json codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"name"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"search_places"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token property">"arguments"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"latitude"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token number" style="color:rgb(181, 206, 168)">37.7749</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain">  </span><span class="token comment" style="color:rgb(106, 153, 85)">// from context.map_center.latitude</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"longitude"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token number" style="color:rgb(181, 206, 168)">-122.4194</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain">  </span><span class="token comment" style="color:rgb(106, 153, 85)">// from context.map_center.longitude</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token property">"query"</span><span class="token operator" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"starbucks"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">  </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>The tools themselves don't change—<code>search_places</code> still requires explicit <code>latitude</code> and <code>longitude</code> parameters. What changed is that the agent now receives the current visible map coordinates as context, so when the user pans to San Francisco and asks for "Starbucks," the agent uses the San Francisco coordinates instead of defaulting to GPS or stale coordinates.</p>
<p><strong>Alternative approaches we considered:</strong></p>
<ul>
<li class=""><strong>WebSocket sync:</strong> Continuously sync map state to the backend. Too much overhead for infrequent updates.</li>
<li class=""><strong>Specialized tool:</strong> Add a <code>get_current_map_state()</code> tool the agent could call. Adds latency and another step the agent might forget.</li>
<li class=""><strong>Augment system prompt:</strong> Inject coordinates directly into the system prompt string. Works, but harder to debug and less flexible than structured context.</li>
</ul>
<p>The context injection approach is clean: the state flows once per message, the agent has structured access to it, and we can inspect it in logs.</p>
<p><img decoding="async" loading="lazy" alt="Maps showing correct location" src="https://vllora.dev/assets/images/maps-correct-location-1935e58288e27d704729c860f503cc42.png" width="1602" height="961" class="img_uaae"></p>
<p>After the fix, the tool calls now include the visible map context, and the results appear exactly where the user is looking.</p>
<p><img decoding="async" loading="lazy" alt="Corrected tool call logs" src="https://vllora.dev/assets/images/tool-call-correct-e42a2c3699830ab425be55ab4c128658.png" width="1531" height="890" class="img_uaae"></p>
<p>The logs show the corrected behavior: location context is now properly passed through in the execution context, and the search results align with the visible map area.</p>
<p>In hindsight, it's obvious. Without inspecting the actual tool call arguments, it wasn't.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-lesson">The Lesson<a href="https://vllora.dev/blog/debugging-agents-stale-state#the-lesson" class="hash-link" aria-label="Direct link to The Lesson" title="Direct link to The Lesson" translate="no">​</a></h2>
<p>This maps bug is one instance of a broader class:</p>
<ul>
<li class="">The UI can be correct.</li>
<li class="">The agent's narration can be correct.</li>
<li class="">The tool call can still be wrong if state is stale, mis-scoped, or silently substituted.</li>
</ul>
<p>Prompt tweaks help when the agent is misunderstanding an instruction. They don't help when the agent is faithfully executing the wrong state. That's when you need to inspect what context the agent actually has—and log or trace what state flows through your tool calls.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="building-better-agents">Building Better Agents<a href="https://vllora.dev/blog/debugging-agents-stale-state#building-better-agents" class="hash-link" aria-label="Direct link to Building Better Agents" title="Direct link to Building Better Agents" translate="no">​</a></h2>
<p>This bug highlights a design pattern for building agents that need to stay in sync with dynamic UI state:</p>
<p><strong>Explicitly pass visible state into agent context.</strong> If your agent needs to act on what the user is seeing (map location, selected text, visible table rows, etc.), don't assume the agent knows. Make the connection explicit: when the user interacts with the UI, capture that state and inject it into the agent's context before each turn.</p>
<p><strong>Design your state flow.</strong> Map out what state your agent needs to make correct tool calls. Then trace where that state lives (UI component, backend, user session) and ensure it flows through to the agent at the right time. The maps agent needed map center/zoom—those live in React state and get passed through the task context.</p>
<p><strong>Inspect tool arguments, not just responses.</strong> The conversation looked fine because the agent's responses were coherent. The bug was in the tool call arguments. Make tool call inspection part of your debugging workflow—capture the actual arguments being sent, not just the tool responses or conversation transcript.</p>
<p><strong>Prefer explicit context over implicit defaults.</strong> The <code>search_place_by_name</code> tool defaulted to GPS coordinates when location wasn't provided. That default masked the real problem. Better to require explicit parameters or fail fast when context is missing.</p>
<p>The fix wasn't changing prompts or tool schemas—it was ensuring the agent receives the state it needs to make correct decisions. That's the difference between debugging symptoms and fixing architecture.</p></div>]]></content:encoded>
            <category>Deep Agents</category>
            <category>Postmortem</category>
            <category>Maps</category>
            <category>State Drift</category>
        </item>
        <item>
            <title><![CDATA[Building AI-Powered Image Generation with OpenAI-Compatible Responses API]]></title>
            <link>https://vllora.dev/blog/building-ai-powered-image-gen-responses-api</link>
            <guid>https://vllora.dev/blog/building-ai-powered-image-gen-responses-api</guid>
            <pubDate>Fri, 12 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Learn how to build an AI-powered application that combines web search and image generation using the Responses API with Vllora LLM client in Rust.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><h2 class="anchor anchorTargetStickyNavbar_sDfD" id="introduction">Introduction<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#introduction" class="hash-link" aria-label="Direct link to Introduction" title="Direct link to Introduction" translate="no">​</a></h2>
<p>The Responses API represents a powerful evolution in how we interact with large language models. Unlike traditional chat completion APIs that return simple text responses, the Responses API enables structured, multi-step workflows that can orchestrate multiple tools and produce rich, multi-modal outputs.</p>
<p>In this article, we'll explore how to build an AI-powered application that combines web search and image generation capabilities.</p>
<blockquote>
<p><strong>Source Code:</strong> The complete example is available on <a href="https://github.com/vllora/vllora/tree/main/llm/examples/responses_image_generation" target="_blank" rel="noopener noreferrer" class="">GitHub</a>.</p>
<p><strong>Documentation:</strong> For comprehensive Responses API documentation, see the <a class="" href="https://vllora.dev/docs/vllora-llm/responses-api">Responses API guide</a> and <a class="" href="https://vllora.dev/docs/vllora-llm/responses-api/image-generation">Image Generation guide</a>.</p>
</blockquote>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="understanding-the-responses-api">Understanding the Responses API<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#understanding-the-responses-api" class="hash-link" aria-label="Direct link to Understanding the Responses API" title="Direct link to Understanding the Responses API" translate="no">​</a></h2>
<p>The Responses API is a more powerful alternative to the traditional Completions API. It enables structured, multi-step workflows with support for multiple built-in tools like web search and image generation, producing rich, multi-modal outputs that can be easily processed programmatically.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="prerequisites-and-setup">Prerequisites and Setup<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#prerequisites-and-setup" class="hash-link" aria-label="Direct link to Prerequisites and Setup" title="Direct link to Prerequisites and Setup" translate="no">​</a></h2>
<p>Before we dive into the code, let's ensure we have everything we need.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="required-dependencies">Required Dependencies<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#required-dependencies" class="hash-link" aria-label="Direct link to Required Dependencies" title="Direct link to Required Dependencies" translate="no">​</a></h3>
<p>Our example requires the following Rust crates:</p>
<ul>
<li class=""><code>vllora_llm</code> - The Vllora LLM client library</li>
<li class=""><code>async-openai-compat</code> - OpenAI-compatible type definitions (version 0.30.1)</li>
<li class=""><code>base64</code> - For decoding base64-encoded images (version 0.22)</li>
<li class=""><code>tokio</code> - Async runtime (version 1.x with full features)</li>
<li class=""><code>serde_json</code> - JSON serialization support</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="cargotoml-configuration">Cargo.toml Configuration<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#cargotoml-configuration" class="hash-link" aria-label="Direct link to Cargo.toml Configuration" title="Direct link to Cargo.toml Configuration" translate="no">​</a></h3>
<p>Here's the complete <code>Cargo.toml</code> for our example:</p>
<div class="language-toml codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-toml codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">[package]</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">name = "responses_image_generation_example"</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">version = "0.1.0"</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">edition = "2021"</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">[workspace]</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">[dependencies]</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">vllora_llm = "0.1.17"</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">tokio = { version = "1", features = ["full"] }</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">serde_json = "1.0"</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">base64 = "0.22"</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="environment-setup">Environment Setup<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#environment-setup" class="hash-link" aria-label="Direct link to Environment Setup" title="Direct link to Environment Setup" translate="no">​</a></h3>
<p>You'll need to set your API key as an environment variable:</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token builtin class-name" style="color:rgb(78, 201, 176)">export</span><span class="token plain"> </span><span class="token assign-left variable" style="color:rgb(156, 220, 254)">VLLORA_OPENAI_API_KEY</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"your-api-key-here"</span><br></span></code></pre></div></div>
<blockquote>
<p><strong>Note:</strong> Make sure to keep your API key secure. Never commit it to version control or expose it in client-side code.</p>
</blockquote>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="building-the-request">Building the Request<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#building-the-request" class="hash-link" aria-label="Direct link to Building the Request" title="Direct link to Building the Request" translate="no">​</a></h2>
<p>Now let's construct our Responses API request. We'll create a request that uses both web search and image generation tools.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="creating-the-createresponse-structure">Creating the CreateResponse Structure<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#creating-the-createresponse-structure" class="hash-link" aria-label="Direct link to Creating the CreateResponse Structure" title="Direct link to Creating the CreateResponse Structure" translate="no">​</a></h3>
<div class="language-rust codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-rust codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">CreateResponse</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">InputParam</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Tool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">WebSearchTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> responses_req </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">CreateResponse</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    model</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Some</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"gpt-4.1"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">to_string</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    input</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">InputParam</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Text</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token string" style="color:rgb(206, 145, 120)">"Search for the latest news from today and generate an image about it"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">to_string</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    tools</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Some</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token macro property">vec!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">[</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token class-name" style="color:rgb(78, 201, 176)">Tool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">WebSearch</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">WebSearchTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token class-name" style="color:rgb(78, 201, 176)">Tool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGeneration</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">]</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">..</span><span class="token class-name" style="color:rgb(78, 201, 176)">Default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="understanding-the-components">Understanding the Components<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#understanding-the-components" class="hash-link" aria-label="Direct link to Understanding the Components" title="Direct link to Understanding the Components" translate="no">​</a></h3>
<p><strong>Model Selection</strong> - We're using <code>"gpt-4.1"</code>, which supports the Responses API and tool calling. Make sure to use a model that supports these features.</p>
<p><strong>Input Parameter</strong> - We use <code>InputParam::Text</code> to provide a simple text prompt. The model will:</p>
<ol>
<li class="">First use the web search tool to find current news</li>
<li class="">Then use the image generation tool to create an image related to that news</li>
</ol>
<p><strong>Tool Configuration</strong> - We specify two tools:</p>
<ul>
<li class=""><code>WebSearchTool::default()</code> - Uses default web search configuration</li>
<li class=""><code>ImageGenTool::default()</code> - Uses default image generation settings</li>
</ul>
<p>The <code>..Default::default()</code> ensures all other fields use their default values, which is a common Rust pattern for struct initialization.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="initializing-the-client">Initializing the Client<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#initializing-the-client" class="hash-link" aria-label="Direct link to Initializing the Client" title="Direct link to Initializing the Client" translate="no">​</a></h2>
<p>Next, we need to set up the Vllora LLM client with our credentials.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="client-configuration">Client Configuration<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#client-configuration" class="hash-link" aria-label="Direct link to Client Configuration" title="Direct link to Client Configuration" translate="no">​</a></h3>
<div class="language-rust codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-rust codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">client</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">VlloraLLMClient</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">credentials</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ApiKeyCredentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">credentials</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Credentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> client </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">VlloraLLMClient</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">with_credentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">Credentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ApiKey</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">ApiKeyCredentials</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        api_key</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">env</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">var</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"VLLORA_OPENAI_API_KEY"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">expect</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"VLLORA_OPENAI_API_KEY must be set"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="credential-management">Credential Management<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#credential-management" class="hash-link" aria-label="Direct link to Credential Management" title="Direct link to Credential Management" translate="no">​</a></h3>
<p>The client uses a builder pattern for configuration. Here we:</p>
<ol>
<li class="">Start with <code>VlloraLLMClient::default()</code> for default settings</li>
<li class="">Chain <code>.with_credentials()</code> to provide authentication</li>
<li class="">Use <code>Credentials::ApiKey()</code> with <code>ApiKeyCredentials</code> for API key authentication</li>
<li class="">Read the API key from the environment variable</li>
</ol>
<blockquote>
<p><strong>Tip:</strong> In production, consider using a more robust error handling approach instead of <code>.expect()</code>, such as returning a <code>Result</code> or using a configuration management library.</p>
</blockquote>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="sending-the-request-and-handling-responses">Sending the Request and Handling Responses<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#sending-the-request-and-handling-responses" class="hash-link" aria-label="Direct link to Sending the Request and Handling Responses" title="Direct link to Sending the Request and Handling Responses" translate="no">​</a></h2>
<p>Now let's send our request and see what we get back.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="making-the-api-call">Making the API Call<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#making-the-api-call" class="hash-link" aria-label="Direct link to Making the API Call" title="Direct link to Making the API Call" translate="no">​</a></h3>
<div class="language-rust codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-rust codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">error</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">LLMResult</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Sending request with tools: web_search_preview and image_generation"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> response </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> client</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">responses</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">create</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">responses_req</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token keyword" style="color:rgb(86, 156, 214)">await</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><br></span></code></pre></div></div>
<p>The <code>client.responses().create()</code> method:</p>
<ul>
<li class="">Returns a <code>Result&lt;Response, LLMError&gt;</code></li>
<li class="">Is async, so we use <code>.await</code></li>
<li class="">The <code>?</code> operator propagates errors up the call stack</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="understanding-the-response-structure">Understanding the Response Structure<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#understanding-the-response-structure" class="hash-link" aria-label="Direct link to Understanding the Response Structure" title="Direct link to Understanding the Response Structure" translate="no">​</a></h3>
<p>The <code>Response</code> struct contains an <code>output</code> field, which is a vector of <code>OutputItem</code> variants. Each item represents a different type of output from the API:</p>
<ul>
<li class="">Text messages from the model</li>
<li class="">Image generation results</li>
<li class="">Web search results</li>
<li class="">Other tool outputs</li>
</ul>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="processing-text-messages">Processing Text Messages<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#processing-text-messages" class="hash-link" aria-label="Direct link to Processing Text Messages" title="Direct link to Processing Text Messages" translate="no">​</a></h2>
<p>Let's see how to extract and display text content from the response.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="matching-message-outputs">Matching Message Outputs<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#matching-message-outputs" class="hash-link" aria-label="Direct link to Matching Message Outputs" title="Direct link to Matching Message Outputs" translate="no">​</a></h3>
<div class="language-rust codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-rust codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputItem</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputMessageContent</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">for</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token keyword" style="color:rgb(86, 156, 214)">in</span><span class="token plain"> response</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">iter</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">enumerate</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">match</span><span class="token plain"> output </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputItem</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Message</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">message</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n[Message {}]"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"{}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"-"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">repeat</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token number" style="color:rgb(181, 206, 168)">80</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token keyword" style="color:rgb(86, 156, 214)">for</span><span class="token plain"> content </span><span class="token keyword" style="color:rgb(86, 156, 214)">in</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">&amp;</span><span class="token plain">message</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">content </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token keyword" style="color:rgb(86, 156, 214)">match</span><span class="token plain"> content </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputMessageContent</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputText</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token comment" style="color:rgb(106, 153, 85)">// Print the text content</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n{}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">text</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token comment" style="color:rgb(106, 153, 85)">// Print sources/annotations if available</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token keyword" style="color:rgb(86, 156, 214)">if</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">!</span><span class="token plain">text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">annotations</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">is_empty</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                            </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Annotations: {:#?}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">annotations</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    _ </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Other content type: {:?}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> content</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n{}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"="</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">repeat</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token number" style="color:rgb(181, 206, 168)">80</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token comment" style="color:rgb(106, 153, 85)">// ... handle other output types</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="understanding-message-content">Understanding Message Content<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#understanding-message-content" class="hash-link" aria-label="Direct link to Understanding Message Content" title="Direct link to Understanding Message Content" translate="no">​</a></h3>
<p><strong>Message Structure</strong> - Each <code>Message</code> contains a <code>content</code> vector that can hold different content types:</p>
<ul>
<li class=""><code>OutputText</code> - The actual text response</li>
<li class="">Other content types for different media</li>
</ul>
<p><strong>Annotations</strong> - Text outputs can include <code>annotations</code> which provide:</p>
<ul>
<li class="">Citations and sources (especially useful with web search)</li>
<li class="">References to tool calls</li>
<li class="">Additional metadata</li>
</ul>
<p>These annotations are particularly valuable when using web search tools, as they show where the information came from.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="handling-image-generation-results">Handling Image Generation Results<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#handling-image-generation-results" class="hash-link" aria-label="Direct link to Handling Image Generation Results" title="Direct link to Handling Image Generation Results" translate="no">​</a></h2>
<p>This is the core focus of our example - extracting and saving generated images.</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="understanding-imagegentoolcall">Understanding ImageGenToolCall<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#understanding-imagegentoolcall" class="hash-link" aria-label="Direct link to Understanding ImageGenToolCall" title="Direct link to Understanding ImageGenToolCall" translate="no">​</a></h3>
<p>When the model uses the image generation tool, the response includes <code>OutputItem::ImageGenerationCall</code> variants. Each call contains:</p>
<ul>
<li class="">A <code>result</code> field with the base64-encoded image data</li>
<li class="">Metadata about the generation</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="decoding-and-saving-images">Decoding and Saving Images<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#decoding-and-saving-images" class="hash-link" aria-label="Direct link to Decoding and Saving Images" title="Direct link to Decoding and Saving Images" translate="no">​</a></h3>
<p>Here's our complete image handling function:</p>
<div class="language-rust codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-rust codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenToolCall</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">base64</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token namespace">engine</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">general_purpose</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token constant" style="color:rgb(100, 102, 149)">STANDARD</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Engine</span><span class="token plain"> </span><span class="token keyword" style="color:rgb(86, 156, 214)">as</span><span class="token plain"> _</span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token plain">fs</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">/// Decodes a base64-encoded image from an ImageGenerationCall and saves it to a file.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">///</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">/// # Arguments</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">/// * `image_generation_call` - The image generation call containing the base64-encoded image</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">/// * `index` - The index to use in the filename</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">///</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">/// # Returns</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">/// * `Ok(filename)` - The filename where the image was saved</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)">/// * `Err(e)` - An error if the call has no result, decoding fails, or file writing fails</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:rgb(220, 220, 170)">decode_and_save_image</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    image_generation_call</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">&amp;</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenToolCall</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token keyword" style="color:rgb(86, 156, 214)">usize</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">-&gt;</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Result</span><span class="token operator" style="color:rgb(212, 212, 212)">&lt;</span><span class="token class-name" style="color:rgb(78, 201, 176)">String</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Box</span><span class="token operator" style="color:rgb(212, 212, 212)">&lt;</span><span class="token keyword" style="color:rgb(86, 156, 214)">dyn</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">error</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Error</span><span class="token operator" style="color:rgb(212, 212, 212)">&gt;&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// Extract base64 image from the call</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> base64_image </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> image_generation_call</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">result</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">as_ref</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">ok_or</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Image generation call has no result"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// Decode base64 image</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> image_data </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token constant" style="color:rgb(100, 102, 149)">STANDARD</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">decode</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">base64_image</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// Save to file</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> filename </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token macro property">format!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"generated_image_{}.png"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token namespace">fs</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">write</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token operator" style="color:rgb(212, 212, 212)">&amp;</span><span class="token plain">filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> image_data</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token class-name" style="color:rgb(78, 201, 176)">Ok</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="step-by-step-breakdown">Step-by-Step Breakdown<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#step-by-step-breakdown" class="hash-link" aria-label="Direct link to Step-by-Step Breakdown" title="Direct link to Step-by-Step Breakdown" translate="no">​</a></h3>
<ol>
<li class="">
<p><strong>Extract Base64 Data</strong> - We access the <code>result</code> field, which is an <code>Option&lt;String&gt;</code>. We use <code>.ok_or()</code> to convert <code>None</code> into an error if the result is missing.</p>
</li>
<li class="">
<p><strong>Decode Base64</strong> - The <code>base64</code> crate's <code>STANDARD</code> engine decodes the base64 string into raw bytes. This can fail if the string is malformed, so we use <code>?</code> to propagate errors.</p>
</li>
<li class="">
<p><strong>Save to File</strong> - We use Rust's standard library <code>fs::write()</code> to save the decoded bytes to a file. We name it <code>generated_image_{index}.png</code> to avoid conflicts when multiple images are generated.</p>
</li>
<li class="">
<p><strong>Return Filename</strong> - We return the filename so the caller knows where the image was saved.</p>
</li>
</ol>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="using-the-function">Using the Function<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#using-the-function" class="hash-link" aria-label="Direct link to Using the Function" title="Direct link to Using the Function" translate="no">​</a></h3>
<p>Here's how we integrate this into our response processing:</p>
<div class="language-rust codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-rust codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token class-name" style="color:rgb(78, 201, 176)">OutputItem</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenerationCall</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">image_generation_call</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n[Image Generation Call {}]"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">match</span><span class="token plain"> </span><span class="token function" style="color:rgb(220, 220, 170)">decode_and_save_image</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">image_generation_call</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token class-name" style="color:rgb(78, 201, 176)">Ok</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"✓ Successfully saved image to: {}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token class-name" style="color:rgb(78, 201, 176)">Err</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">e</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token macro property">eprintln!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"✗ Failed to decode/save image: {}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> e</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<p>We match on <code>OutputItem::ImageGenerationCall</code>, extract the call, and pass it to our decoding function. We handle both success and error cases gracefully.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="complete-example-walkthrough">Complete Example Walkthrough<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#complete-example-walkthrough" class="hash-link" aria-label="Direct link to Complete Example Walkthrough" title="Direct link to Complete Example Walkthrough" translate="no">​</a></h2>
<p>Let's put it all together and see the complete flow:</p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="complete-source-code">Complete Source Code<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#complete-source-code" class="hash-link" aria-label="Direct link to Complete Source Code" title="Direct link to Complete Source Code" translate="no">​</a></h3>
<div class="language-rust codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-rust codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">CreateResponse</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenToolCall</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">InputParam</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputItem</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputMessageContent</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Tool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">async_openai</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">responses</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">WebSearchTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">base64</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token namespace">engine</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">general_purpose</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token constant" style="color:rgb(100, 102, 149)">STANDARD</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Engine</span><span class="token plain"> </span><span class="token keyword" style="color:rgb(86, 156, 214)">as</span><span class="token plain"> _</span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token plain">fs</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">client</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">VlloraLLMClient</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">error</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">LLMResult</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">credentials</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ApiKeyCredentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">use</span><span class="token plain"> </span><span class="token namespace">vllora_llm</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">types</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">credentials</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Credentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:rgb(220, 220, 170)">decode_and_save_image</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    image_generation_call</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">&amp;</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenToolCall</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token keyword" style="color:rgb(86, 156, 214)">usize</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">-&gt;</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Result</span><span class="token operator" style="color:rgb(212, 212, 212)">&lt;</span><span class="token class-name" style="color:rgb(78, 201, 176)">String</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Box</span><span class="token operator" style="color:rgb(212, 212, 212)">&lt;</span><span class="token keyword" style="color:rgb(86, 156, 214)">dyn</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">error</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Error</span><span class="token operator" style="color:rgb(212, 212, 212)">&gt;&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> base64_image </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> image_generation_call</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">result</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">as_ref</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">ok_or</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Image generation call has no result"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> image_data </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token constant" style="color:rgb(100, 102, 149)">STANDARD</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">decode</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">base64_image</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> filename </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token macro property">format!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"generated_image_{}.png"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token namespace">fs</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">write</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token operator" style="color:rgb(212, 212, 212)">&amp;</span><span class="token plain">filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> image_data</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token class-name" style="color:rgb(78, 201, 176)">Ok</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token attribute attr-name" style="color:rgb(156, 220, 254)">#[tokio::main]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">async</span><span class="token plain"> </span><span class="token keyword" style="color:rgb(86, 156, 214)">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:rgb(220, 220, 170)">main</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">-&gt;</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">LLMResult</span><span class="token operator" style="color:rgb(212, 212, 212)">&lt;</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token operator" style="color:rgb(212, 212, 212)">&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// 1) Build a Responses-style request using async-openai-compat types</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// with tools for web_search_preview and image_generation</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> responses_req </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">CreateResponse</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        model</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Some</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"gpt-4.1"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">to_string</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        input</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">InputParam</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Text</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token string" style="color:rgb(206, 145, 120)">"Search for the latest news from today and generate an image about it"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">to_string</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        tools</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token class-name" style="color:rgb(78, 201, 176)">Some</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token macro property">vec!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">[</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token class-name" style="color:rgb(78, 201, 176)">Tool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">WebSearch</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">WebSearchTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token class-name" style="color:rgb(78, 201, 176)">Tool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGeneration</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenTool</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">]</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">..</span><span class="token class-name" style="color:rgb(78, 201, 176)">Default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// 2) Construct a VlloraLLMClient</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> client </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token class-name" style="color:rgb(78, 201, 176)">VlloraLLMClient</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">default</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">with_credentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">Credentials</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ApiKey</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token class-name" style="color:rgb(78, 201, 176)">ApiKeyCredentials</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            api_key</span><span class="token punctuation" style="color:rgb(212, 212, 212)">:</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token namespace">env</span><span class="token namespace punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token function" style="color:rgb(220, 220, 170)">var</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"VLLORA_OPENAI_API_KEY"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">expect</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"VLLORA_OPENAI_API_KEY must be set"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)">// 3) Non-streaming: send the request and print the final reply</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Sending request with tools: web_search_preview and image_generation"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">let</span><span class="token plain"> response </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> client</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">responses</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">create</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">responses_req</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token keyword" style="color:rgb(86, 156, 214)">await</span><span class="token operator" style="color:rgb(212, 212, 212)">?</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\nNon-streaming reply:"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"{}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"="</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">repeat</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token number" style="color:rgb(181, 206, 168)">80</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token keyword" style="color:rgb(86, 156, 214)">for</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token keyword" style="color:rgb(86, 156, 214)">in</span><span class="token plain"> response</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">iter</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">enumerate</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token keyword" style="color:rgb(86, 156, 214)">match</span><span class="token plain"> output </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputItem</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">ImageGenerationCall</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">image_generation_call</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n[Image Generation Call {}]"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token keyword" style="color:rgb(86, 156, 214)">match</span><span class="token plain"> </span><span class="token function" style="color:rgb(220, 220, 170)">decode_and_save_image</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">image_generation_call</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token class-name" style="color:rgb(78, 201, 176)">Ok</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"✓ Successfully saved image to: {}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> filename</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token class-name" style="color:rgb(78, 201, 176)">Err</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">e</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token macro property">eprintln!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"✗ Failed to decode/save image: {}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> e</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputItem</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">Message</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">message</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n[Message {}]"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"{}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"-"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">repeat</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token number" style="color:rgb(181, 206, 168)">80</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token keyword" style="color:rgb(86, 156, 214)">for</span><span class="token plain"> content </span><span class="token keyword" style="color:rgb(86, 156, 214)">in</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">&amp;</span><span class="token plain">message</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">content </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token keyword" style="color:rgb(86, 156, 214)">match</span><span class="token plain"> content </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputMessageContent</span><span class="token punctuation" style="color:rgb(212, 212, 212)">::</span><span class="token class-name" style="color:rgb(78, 201, 176)">OutputText</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                            </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n{}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">text</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                            </span><span class="token keyword" style="color:rgb(86, 156, 214)">if</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">!</span><span class="token plain">text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">annotations</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">is_empty</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                                </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Annotations: {:#?}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> text_output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">annotations</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                            </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        _ </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                            </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"Other content type: {:?}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> content</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n{}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"="</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token function" style="color:rgb(220, 220, 170)">repeat</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token number" style="color:rgb(181, 206, 168)">80</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            _ </span><span class="token operator" style="color:rgb(212, 212, 212)">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:rgb(212, 212, 212)">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"\n[Other Output {}]"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> index</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">                </span><span class="token macro property">println!</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token string" style="color:rgb(206, 145, 120)">"{:?}"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"> output</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">            </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">        </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token class-name" style="color:rgb(78, 201, 176)">Ok</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">}</span><br></span></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="execution-flow">Execution Flow<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#execution-flow" class="hash-link" aria-label="Direct link to Execution Flow" title="Direct link to Execution Flow" translate="no">​</a></h3>
<ol>
<li class=""><strong>Request Construction</strong> - We build a <code>CreateResponse</code> with our prompt and tools</li>
<li class=""><strong>Client Initialization</strong> - We create and configure the Vllora LLM client</li>
<li class=""><strong>API Call</strong> - We send the request and await the response</li>
<li class=""><strong>Response Processing</strong> - We iterate through output items:<!-- -->
<ul>
<li class="">Handle image generation calls by decoding and saving</li>
<li class="">Display text messages with annotations</li>
<li class="">Handle any other output types</li>
</ul>
</li>
<li class=""><strong>File Output</strong> - Generated images are saved to disk as PNG files</li>
</ol>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="expected-output">Expected Output<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#expected-output" class="hash-link" aria-label="Direct link to Expected Output" title="Direct link to Expected Output" translate="no">​</a></h3>
<p>When you run this example, you'll see output like:</p>
<div class="language-text codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-text codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">Sending request with tools: web_search_preview and image_generation</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">Non-streaming reply:</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">================================================================================</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">[Message 0]</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">--------------------------------------------------------------------------------</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">Here's the latest news from today: [summary of current news]</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">Annotations: [citations and sources from web search]</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">================================================================================</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">[Image Generation Call 1]</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">✓ Successfully saved image to: generated_image_1.png</span><br></span></code></pre></div></div>
<p><img decoding="async" loading="lazy" alt="AI-Powered Image Generation with Responses API" src="https://vllora.dev/assets/images/image-gen-responses-api-2dfb71d83d9b005cbcd1a8e3291fac61.png" width="1536" height="1024" class="img_uaae"></p>
<p>The actual news content and image will vary based on what's happening when you run it!</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="summary">Summary<a href="https://vllora.dev/blog/building-ai-powered-image-gen-responses-api#summary" class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" translate="no">​</a></h2>
<p>This example demonstrates how to use the Responses API to create multi-tool workflows that combine web search and image generation. The key steps are:</p>
<ol>
<li class="">Build a <code>CreateResponse</code> request with the desired tools (<code>WebSearchTool</code> and <code>ImageGenTool</code>)</li>
<li class="">Initialize the <code>VlloraLLMClient</code> with your API credentials</li>
<li class="">Send the request and receive structured outputs</li>
<li class="">Process different output types: extract text from <code>OutputItem::Message</code> and decode base64 images from <code>OutputItem::ImageGenerationCall</code></li>
<li class="">Save decoded images to disk using standard Rust file I/O</li>
</ol>
<p>The Responses API enables powerful, structured workflows that go beyond simple text completions, making it ideal for building applications that need to orchestrate multiple AI capabilities.</p></div>]]></content:encoded>
            <category>Responses API</category>
            <category>Image Generation</category>
        </item>
        <item>
            <title><![CDATA[Pause, Inspect, Edit: Debug Mode for LLM Requests in vLLora]]></title>
            <link>https://vllora.dev/blog/debug-mode</link>
            <guid>https://vllora.dev/blog/debug-mode</guid>
            <pubDate>Thu, 11 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Debug LLM agents with vLLora’s Debug Mode. Pause requests, view full prompts and parameters, edit them, and resume long workflows without code changes.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>LLMs behave like black boxes. You send them a request, hope the prompt is right, hope your agent didn't mutate it, hope the framework packaged it correctly — and then hope the response makes sense.
In simple one-shot queries this usually works fine. But when you're building agents, tools, multi-step workflows, or RAG pipelines, it becomes very hard to see what the model is actually receiving. A single unexpected message, parameter, or system prompt change can shift the entire run.</p>
<p>Today we're introducing <strong>Debug Mode</strong> for LLM requests in vLLora that makes this visible — and editable.</p>
<p>Here’s what debugging looks like in practice:</p>
<p><img decoding="async" loading="lazy" alt="Debugging LLM Request using Debug Mode" src="https://vllora.dev/assets/images/debug-mode-46200412ac1723e1be064a63e6c7e51f.gif" width="1280" height="720" class="img_uaae"></p>
<!-- -->
<p>vLLora now supports <strong>Debug Mode</strong> for LLM requests. When Debug Mode is enabled, every request pauses <em>before</em> it reaches the model. Debug Mode works by inserting breakpoints on every outgoing LLM request, allowing you to inspect, edit, or continue execution.</p>
<p>You can:</p>
<ul>
<li class="">Inspect the exact request</li>
<li class="">Edit anything</li>
<li class="">Continue execution normally</li>
</ul>
<p>This brings a familiar software-engineering workflow ("pause -&gt; inspect -&gt; edit -&gt; continue") to LLM development.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="why-we-built-this">Why We Built This<a href="https://vllora.dev/blog/debug-mode#why-we-built-this" class="hash-link" aria-label="Direct link to Why We Built This" title="Direct link to Why We Built This" translate="no">​</a></h2>
<p>If you've built anything beyond a simple chat interface, you've likely hit one of these:</p>
<ul>
<li class="">Silent <strong>tool-call failures</strong> (wrong name / bad params / malformed JSON)</li>
<li class=""><strong>Overloaded</strong> or <strong>corrupted</strong> context / RAG input leading to hallucination or truncation</li>
<li class=""><strong>Error accumulation</strong> and state drift in long or multi-step workflows</li>
<li class=""><strong>Lack of visibility</strong>: standard logs rarely show the actual request sent to the model</li>
</ul>
<p>It is difficult to fix these issues without proper observability. Debug Mode changes that.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="what-happens-when-a-request-pauses">What Happens When a Request Pauses<a href="https://vllora.dev/blog/debug-mode#what-happens-when-a-request-pauses" class="hash-link" aria-label="Direct link to What Happens When a Request Pauses" title="Direct link to What Happens When a Request Pauses" translate="no">​</a></h2>
<p>Here's what it looks like when vLLora intercepts a request right before it's sent:</p>
<p><img decoding="async" loading="lazy" alt="Paused request example" src="https://vllora.dev/assets/images/debugging-paused-breakpoint-view-c06ef659f3582e0722cbbb4ad782e0ff.png" width="1587" height="955" class="img_uaae"></p>
<p>You get a real-time snapshot of:</p>
<ul>
<li class="">The selected model</li>
<li class="">Full message array (system, user, assistant)</li>
<li class="">Parameters like temperature or max tokens</li>
<li class="">Any tool definitions</li>
<li class="">Any extra fields and headers your framework injected</li>
</ul>
<p>This is the <strong>full request payload</strong> your application is about to send — not what you assume it's sending.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="edit-anything">Edit Anything<a href="https://vllora.dev/blog/debug-mode#edit-anything" class="hash-link" aria-label="Direct link to Edit Anything" title="Direct link to Edit Anything" translate="no">​</a></h2>
<p>Click <strong>Edit</strong> and the payload becomes modifiable:</p>
<p><img decoding="async" loading="lazy" alt="Edit Request modal with JSON editor" src="https://vllora.dev/assets/images/debugging-edit-request-b6e49be94207ea05a8eced9ec37364ce.png" width="674" height="569" class="img_uaae"></p>
<p>You can adjust:</p>
<ul>
<li class="">Message content</li>
<li class="">System prompts</li>
<li class="">Model name</li>
<li class="">Parameters</li>
<li class="">Tool definitions</li>
<li class="">Metadata</li>
</ul>
<div class="theme-admonition theme-admonition-info admonition_Sj3K alert alert--info"><div class="admonitionHeading_bd6j"><span class="admonitionIcon_DS9F"><svg viewBox="0 0 14 16"><path fill-rule="evenodd" d="M7 2.3c3.14 0 5.7 2.56 5.7 5.7s-2.56 5.7-5.7 5.7A5.71 5.71 0 0 1 1.3 8c0-3.14 2.56-5.7 5.7-5.7zM7 1C3.14 1 0 4.14 0 8s3.14 7 7 7 7-3.14 7-7-3.14-7-7-7zm1 3H6v5h2V4zm0 6H6v2h2v-2z"></path></svg></span>Temporary Changes</div><div class="admonitionContent_KvEJ"><p>This affects only the current request. Your application code stays untouched.</p></div></div>
<p>It's a fast way to validate fixes, test ideas, and confirm what the agent <em>should</em> have sent.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="continue-the-workflow">Continue the Workflow<a href="https://vllora.dev/blog/debug-mode#continue-the-workflow" class="hash-link" aria-label="Direct link to Continue the Workflow" title="Direct link to Continue the Workflow" translate="no">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Continue Workflow" src="https://vllora.dev/assets/images/debugging-continue-execution-3fe145aa49df8fda0a8c789e51b69b0c.png" width="520" height="185" class="img_uaae"></p>
<p>When you click <strong>Continue</strong>, vLLora:</p>
<ol>
<li class="">Sends your edited request to the model</li>
<li class="">Receives the real response</li>
<li class="">Passes it back to your application</li>
<li class="">Resumes the workflow as if nothing unusual happened</li>
</ol>
<p>After you click Continue, the workflow proceeds using the response from your edited request. The agent treats it the same way it would treat any normal response from the model.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="why-this-matters-for-agents">Why This Matters for Agents<a href="https://vllora.dev/blog/debug-mode#why-this-matters-for-agents" class="hash-link" aria-label="Direct link to Why This Matters for Agents" title="Direct link to Why This Matters for Agents" translate="no">​</a></h2>
<p>Agents are long-running chains of decisions. Each step can depend on the previous one, and each step can affect the next. Once you're 15 steps deep, you might not know whether:</p>
<ul>
<li class="">The prompt changed</li>
<li class="">A system message was overwritten</li>
<li class="">A parameter was set differently than expected</li>
<li class="">The context blew up</li>
<li class="">A tool schema got mutated</li>
</ul>
<p>With Debug Mode:</p>
<ul>
<li class="">You catch drift early</li>
<li class="">You see exactly what the model receives</li>
<li class="">You fix issues in seconds</li>
<li class="">You avoid rerunning long multi-step workflows</li>
<li class="">You test prompt or parameter changes instantly</li>
</ul>
<p>For deep agents, debugging becomes 10x easier.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="closing-thoughts">Closing Thoughts<a href="https://vllora.dev/blog/debug-mode#closing-thoughts" class="hash-link" aria-label="Direct link to Closing Thoughts" title="Direct link to Closing Thoughts" translate="no">​</a></h2>
<p>Debugging LLM systems has been mostly tedious. Debug Mode gives you a clear view into what’s happening and a way to correct issues as they occur.</p>
<p>If you need to understand or fix what an agent is sending, this is the most direct way to do it.</p>
<p>Read the docs: <a class="" href="https://vllora.dev/docs/debug-mode">Debug Mode</a></p>
<p>Try it locally: <a class="" href="https://vllora.dev/docs/quickstart">Quickstart</a></p></div>]]></content:encoded>
            <category>Tracing</category>
            <category>Agents</category>
            <category>Coding</category>
        </item>
        <item>
            <title><![CDATA[Exploring Deep Agent Architecture with vLLora: Case Study – Browsr]]></title>
            <link>https://vllora.dev/blog/exploring-deep-agents</link>
            <guid>https://vllora.dev/blog/exploring-deep-agents</guid>
            <pubDate>Mon, 08 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Deep dive into how Browsr’s deep-agent loop plans, tools, and self-evaluates, with vLLora traces for observability.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none">
<p>Over the last year, agents have grown from one-shot prompt wrappers into systems that can work a problem for minutes or hours—researching, trying ideas, fixing mistakes, and resuming where they left off. Tools like Claude Code, Deep Research, Manus AI, and LangChain’s deep-agents all use this pattern.</p>
<p><img decoding="async" loading="lazy" alt="Typical Deep Agent Architecture" src="https://vllora.dev/assets/images/deepagents-architecture-5b939b56434ee84a71e4ab6b754811ce.png" width="1390" height="1000" class="img_uaae"></p>
<!-- -->
<p>A typical deep-agent architecture:</p>
<ul>
<li class="">Keeps a <em>running plan / TODO list</em> of what still needs to be done.</li>
<li class="">Uses <em>tools</em> (like a browser, shell, APIs) to act in the world step by step.</li>
<li class="">Stores <em>persistent memory</em> (artifacts, notes, intermediate results) so it doesn’t forget earlier work.</li>
<li class="">Regularly <em>evaluates its own progress</em>, adjusts the plan, and retries when something fails.</li>
</ul>
<p>Because it can plan, remember, and correct itself, a deep agent can run for a long duration, tens or hundreds of steps without losing the thread of the task.</p>
<p>Let’s debug and observe <a href="https://browsr.dev/" target="_blank" rel="noopener noreferrer" class="">Browsr</a> using <a href="https://vllora.ai/" target="_blank" rel="noopener noreferrer" class="">vLLora</a>(a tool for agent observability) and see what happens under the hood.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="browsr">Browsr<a href="https://vllora.dev/blog/exploring-deep-agents#browsr" class="hash-link" aria-label="Direct link to Browsr" title="Direct link to Browsr" translate="no">​</a></h2>
<p><a href="https://browsr.dev/" target="_blank" rel="noopener noreferrer" class="">Browsr</a> is a headless browser agent that lets you create sequences using a deep agent pattern and then hands you the payloads to run over APIs at scale. It also exports website data as structured or LLM-friendly markdown.</p>
<p>You can explore the definition and related configurations in this <a href="https://github.com/browsr-dev/browsr" target="_blank" rel="noopener noreferrer" class="">repo</a>.</p>
<iframe src="https://www.youtube.com/embed/YY6L7x0DvpQ?si=6t1emacl9eLDdiAy" width="744" height="504" frameborder="0" title="Browsr Deep Agent Walkthrough"></iframe>
<blockquote>
<p>Note: Always respect the copyright rules and terms of the sites you scrape.</p>
</blockquote>
<p><img decoding="async" loading="lazy" alt="Browsr Agent Architecture" src="https://vllora.dev/assets/images/browsr-agent-architecture-904eddc078a0ecec98a77c3ef8d87c5f.png" width="1242" height="1000" class="img_uaae"></p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="debugging-with-vllora">Debugging with vLLora<a href="https://vllora.dev/blog/exploring-deep-agents#debugging-with-vllora" class="hash-link" aria-label="Direct link to Debugging with vLLora" title="Direct link to Debugging with vLLora" translate="no">​</a></h2>
<p>In this article, we use <a href="https://vllora.ai/" target="_blank" rel="noopener noreferrer" class="">vLLora</a> to illustrate how deep agents work. vLLora lets you debug and observe your agents locally. vLLora can help us to better understand our architecture; toolcalls and observe the full agent timeline. It also works with all popular models.</p>
<p>Browsr iterates in 1–3 command bursts as a single step, saving context to artifacts and completes the task with final tool.</p>
<ul>
<li class=""><em>Driver</em>: browser_step is the main executor; every turn runs 1–3 browser commands with explicit thinking, evaluation_previous_goal, memory, and next_goal.</li>
<li class=""><em>Context control</em>: Large tool outputs are written to disk so the model can drop token-heavy responses and reload them on demand.</li>
<li class=""><em>Stateful loop</em>: Up to eight iterations, each grounded in the latest observation block (DOM + screenshot) to avoid hallucinating.</li>
<li class=""><em>Strict tool contract</em>: Exactly one tool call per reply (no free text), keeping the agent deterministic and debuggable.</li>
</ul>
<p>Lets further examine tool definitions as stated below.</p>
<p><img decoding="async" loading="lazy" alt="Browsr Tool Definitions" src="https://vllora.dev/assets/images/browsr-tool-definitions-aa9cd079efa42f123196d536fb16119a.png" width="1002" height="1000" class="img_uaae"></p>
<p>browser_step is the driver between steps. The system prompt forces the model to read the latest DOM and screenshot, report the current state, and then decide what to do next. Each turn must include:</p>
<ul>
<li class="">thinking: Reasoning about the current state.</li>
<li class="">evaluation_previous_goal: Verdict on last step</li>
<li class="">next_goal: Next immediate goal in one sentence.</li>
<li class="">commands: Array of commands to be executed.</li>
</ul>
<p>You can checkout the <a href="https://raw.githubusercontent.com/browsr-dev/browsr/refs/heads/main/agents/browsr.md" target="_blank" rel="noopener noreferrer" class="">full agent defintion here</a>.</p>
<p>Example: In one representative run, Browsr used the available context to navigate in step one, click in step two, and then run a JS evaluation to return structured data from the page.</p>
<p><img decoding="async" loading="lazy" alt="Example invocation of Steps" src="https://vllora.dev/assets/images/browsr-example-invocation-2551fd695ee9a341a4a30d0e0d299b79.png" width="2136" height="958" class="img_uaae"></p>
<h3 class="anchor anchorTargetStickyNavbar_sDfD" id="sample-traces">Sample Traces<a href="https://vllora.dev/blog/exploring-deep-agents#sample-traces" class="hash-link" aria-label="Direct link to Sample Traces" title="Direct link to Sample Traces" translate="no">​</a></h3>
<p><img decoding="async" loading="lazy" alt="Sample Traces" src="https://vllora.dev/assets/images/browsr-sample-traces-0df19bc9f9c74722485f1d0dab0beded.png" width="1445" height="1000" class="img_uaae"></p>
<h4 class="anchor anchorTargetStickyNavbar_sDfD" id="average-cost-and-no-of-steps-using-gpt-41-mini">Average cost and no. of steps using gpt-4.1-mini<a href="https://vllora.dev/blog/exploring-deep-agents#average-cost-and-no-of-steps-using-gpt-41-mini" class="hash-link" aria-label="Direct link to Average cost and no. of steps using gpt-4.1-mini" title="Direct link to Average cost and no. of steps using gpt-4.1-mini" translate="no">​</a></h4>
<ul>
<li class="">Average cost per trace ≈ <em>$0.0303</em> per run</li>
<li class="">Average steps ≈ <em>10.5</em> steps per run</li>
</ul>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="why-observability-is-critical-for-deep-agents">Why Observability is Critical for Deep Agents<a href="https://vllora.dev/blog/exploring-deep-agents#why-observability-is-critical-for-deep-agents" class="hash-link" aria-label="Direct link to Why Observability is Critical for Deep Agents" title="Direct link to Why Observability is Critical for Deep Agents" translate="no">​</a></h2>
<p>AI engineers spend a lot of time trying to understand why their agents behave the way they do tweaking system prompts, stepping through tool calls, and guessing what went wrong somewhere in the middle of a long run.</p>
<p>As agents move from single-shot tasks to long-running, multi-step workflows, understanding their behavior becomes extremely harder. A "deep agent" might run for 50+ steps, making hundreds of decisions.</p>
<ul>
<li class=""><em>Drift over time</em>: An agent can start off doing exactly what you want, then slowly drift off-course because of noisy context, misinterpreted instructions, or a small misunderstanding early on that compounds over later steps.</li>
<li class=""><em>Expose cost and context</em>: Spot token spikes, context bloat, and expensive branches and compare between different models.</li>
<li class=""><em>Make decisions traceable</em>: Line up what the agent read, wrote, and decided so you can see cause and effect.</li>
<li class=""><em>No big-picture view of execution</em>: You rarely get a clear, end-to-end picture of where time and money are going: is it planning, tool execution, retries, or extraction?</li>
</ul>
<p>vLLora is built to make this debuggable. It lets you see what your deep agents are actually doing across long runs.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="next-steps">Next Steps<a href="https://vllora.dev/blog/exploring-deep-agents#next-steps" class="hash-link" aria-label="Direct link to Next Steps" title="Direct link to Next Steps" translate="no">​</a></h2>
<ul>
<li class="">Explore and compare using other models</li>
<li class="">Test the architecture with different LLMs to evaluate performance and cost-effectiveness</li>
<li class="">Test Computer use automation with custom fine tuned models</li>
<li class="">Extend the agent's capabilities beyond the browser to general computer use, leveraging fine-tuned models for specific tasks.</li>
<li class="">Simulate a complex scenario involving several steps to showcase real capability of deep agents.</li>
</ul>
<p>In the next article, we'll explore these extensions and how they change the agent behavior.</p></div>]]></content:encoded>
            <category>Deep Agents</category>
            <category>Browsr</category>
        </item>
        <item>
            <title><![CDATA[Debugging LiveKit Voice Agents with vLLora]]></title>
            <link>https://vllora.dev/blog/voice-agents</link>
            <guid>https://vllora.dev/blog/voice-agents</guid>
            <pubDate>Tue, 04 Nov 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Debug your voice agents built with LiveKit Agents using vLLora. See every model call, tool call, and response in real time.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>Voice agents built with <a href="https://docs.livekit.io/agents/" target="_blank" rel="noopener noreferrer" class="">LiveKit Agents</a> enable real-time, multimodal AI interactions that can handle voice, video, and text. These agents power everything from customer support bots to telehealth assistants, and debugging them requires visibility into the complex pipeline of speech-to-text, language model, and text-to-speech interactions.</p>
<p>In this video, we go over how you can debug voice agents built using <a href="https://docs.livekit.io/agents/" target="_blank" rel="noopener noreferrer" class="">LiveKit Agents</a> with vLLora. You'll see how to trace every model call, tool execution, and response as your agent processes real-time audio streams.</p>
<iframe src="https://www.veed.io/embed/72cafb44-9663-4036-b784-13e8d3168d79?watermark=1&amp;color=&amp;sharing=1&amp;title=1" width="744" height="504" frameborder="0" title="Vllora Demo"></iframe>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="setup">Setup<a href="https://vllora.dev/blog/voice-agents#setup" class="hash-link" aria-label="Direct link to Setup" title="Direct link to Setup" translate="no">​</a></h2>
<p>Run and configure vLLora locally. Follow the <a class="" href="https://vllora.dev/docs/quickstart">Quickstart</a> guide to get started.</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">brew tap vllora/vllora</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">brew </span><span class="token function" style="color:rgb(220, 220, 170)">install</span><span class="token plain"> vllora</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">vllora</span><br></span></code></pre></div></div>
<p>In your LiveKit Agent code, configure your LLM provider to use vLLora's endpoint:</p>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> livekit</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">plugins </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> openai</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> os</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">session </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> AgentSession</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">   llm</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain">openai</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">LLM</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">      model</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"model-name"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line theme-code-block-highlighted-line" style="color:#9CDCFE"><span class="token plain">      base_url</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"http://localhost:9090/v1"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain">  </span><span class="token comment" style="color:rgb(106, 153, 85)"># vLLora endpoint</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">      api_key</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"no_key"</span><span class="token plain">  </span><span class="token comment" style="color:rgb(106, 153, 85)"># vLLora doesn't validate API keys</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">   </span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)"># ... stt, tts, etc ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><br></span></code></pre></div></div>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="what-vllora-shows-you">What vLLora Shows You<a href="https://vllora.dev/blog/voice-agents#what-vllora-shows-you" class="hash-link" aria-label="Direct link to What vLLora Shows You" title="Direct link to What vLLora Shows You" translate="no">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Voice Agent Trace" src="https://vllora.dev/assets/images/thread-voice-agent-fe392326e1211d0414b75b33c1a068d4.png" width="1334" height="756" class="img_uaae"></p>
<p>With vLLora running, you can see:</p>
<ul>
<li class=""><strong>Model Calls</strong>: Every LLM model call with complete input/output, token usage, cost, and timing information</li>
<li class=""><strong>Tool Definitions</strong>: All tools available to your agent, including their schemas and descriptions</li>
<li class=""><strong>Tool Usage</strong>: Every tool call made by the agent, including parameters and responses</li>
</ul>
<p>By providing complete visibility into your voice agent's execution, vLLora makes it easier to build reliable, performant voice AI applications with LiveKit Agents.</p></div>]]></content:encoded>
            <category>Tracing</category>
            <category>Agents</category>
            <category>LiveKit</category>
        </item>
        <item>
            <title><![CDATA[Debugging Kilocode with vLLora]]></title>
            <link>https://vllora.dev/blog/debugging-kilocode</link>
            <guid>https://vllora.dev/blog/debugging-kilocode</guid>
            <pubDate>Mon, 03 Nov 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Debug your Kilocode coding agent with vLLora, see every model call, tool call, and response in real time.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>Developers building coding agents need visiblity into how context is flowing through the agent, how much context is used, what tools are being called. vLLora enables you to debug all of this in real time.</p>
<p><img decoding="async" loading="lazy" alt="Peeking Inside Your Coding Agent" src="https://vllora.dev/assets/images/kilocode-improved-8508e982eae89d0f671044bf1a5d0884.gif" width="1280" height="720" class="img_uaae"></p>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="setup">Setup<a href="https://vllora.dev/blog/debugging-kilocode#setup" class="hash-link" aria-label="Direct link to Setup" title="Direct link to Setup" translate="no">​</a></h2>
<p>Run and configure vLLora locally. Follow the <a class="" href="https://vllora.dev/docs/quickstart">Quickstart</a> guide to get started.</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">brew tap vllora/vllora</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">brew </span><span class="token function" style="color:rgb(220, 220, 170)">install</span><span class="token plain"> vllora</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">vllora</span><br></span></code></pre></div></div>
<p>In <a href="https://kilocode.ai/" target="_blank" rel="noopener noreferrer" class="">KiloCode</a>, during setup select OpenAI Compatible and set the base URL to vLLora's endpoint. For API key, use <code>no_key</code> as vLLora does not validate the API key, since you set API key in the vLLora UI.</p>
<p>Now open your code editor with <a href="https://kilocode.ai/" target="_blank" rel="noopener noreferrer" class="">KiloCode</a> and start prompting your agent.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="the-prompt">The Prompt<a href="https://vllora.dev/blog/debugging-kilocode#the-prompt" class="hash-link" aria-label="Direct link to The Prompt" title="Direct link to The Prompt" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-text codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">Add a customer leaderboard or loyalty points tracker component, </span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">and embed a mini gallery section for user engagement.</span><br></span></code></pre></div></div>
<p>When this prompt runs in KiloCode, the agent edits several files, creates new components, updates imports, and adjusts the layout to match the request.</p>
<p><img decoding="async" loading="lazy" alt="Kilocode Trace" src="https://vllora.dev/assets/images/kilocode-tools-4e5ec5eddc87ab6918a69cb80ca67138.png" width="1920" height="958" class="img_uaae"></p>
<p>With vLLora running, we could see run involved <strong>10 model calls</strong> and a sequence of tool executions including <code>read_file</code>, <code>write_to_file</code>, <code>execute_command</code>, <code>apply_diff</code>, and <code>update_todo_list</code>.</p>
<p>Across the session, we could see the context size steadily grow as it started with about <strong>9,000 input tokens</strong> and reached nearly <strong>90,000 tokens</strong> by the end as the agent read, wrote, and reloaded files.<br>
<!-- -->This illustrates how coding agents like KiloCode repeatedly expand their working context as the project state evolves.</p>
<p>Beyond the visible tools in this trace, the underlying agent also defines a larger toolset, such as:</p>
<ul>
<li class=""><code>new_task</code>, <code>list_code_definition_names</code>, and <code>search_files</code> for project understanding</li>
<li class=""><code>insert_content</code>, <code>search_and_replace</code>, and <code>apply_diff</code> for precise code edits</li>
<li class=""><code>browser_action</code> and <code>execute_command</code> for testing and validation</li>
<li class=""><code>update_todo_list</code> and <code>attempt_completion</code> for managing the reasoning cycle</li>
</ul>
<p>vLLora captures every call in sequence, showing which tools and how they were used, how much context each request consumed, and how the model responded. This experience makes debugging easier by exposing where the agent slows down, repeats steps, or mismanages context. It helps you identify issues faster, optimize performance, and build more reliable coding agents.</p></div>]]></content:encoded>
            <category>Tracing</category>
            <category>Agents</category>
            <category>Coding</category>
        </item>
        <item>
            <title><![CDATA[Using vLLora with OpenAI Agents SDK]]></title>
            <link>https://vllora.dev/blog/using-vllora-with-openai-agents</link>
            <guid>https://vllora.dev/blog/using-vllora-with-openai-agents</guid>
            <pubDate>Sun, 02 Nov 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Debug, trace, and analyze your OpenAI agents in real time with vLLora]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>The OpenAI Agents SDK makes it easy to build agents with handoffs, streaming, and function calling. The hard part? Seeing what's actually happening when things don't work as expected.</p>
<p><img decoding="async" loading="lazy" alt="OpenAI Agents Tracing" src="https://vllora.dev/assets/images/traces-openai-625954fe9d3884f94b963ab3081ed511.png" width="1363" height="870" class="img_uaae"></p>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="setup-vllora">Setup vLLora<a href="https://vllora.dev/blog/using-vllora-with-openai-agents#setup-vllora" class="hash-link" aria-label="Direct link to Setup vLLora" title="Direct link to Setup vLLora" translate="no">​</a></h2>
<p>First, install vLLora using Homebrew:</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">brew tap vllora/vllora</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">brew </span><span class="token function" style="color:rgb(220, 220, 170)">install</span><span class="token plain"> vllora</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">vllora</span><br></span></code></pre></div></div>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="quick-setup">Quick Setup<a href="https://vllora.dev/blog/using-vllora-with-openai-agents#quick-setup" class="hash-link" aria-label="Direct link to Quick Setup" title="Direct link to Quick Setup" translate="no">​</a></h2>
<p>Route your OpenAI requests through vLLora by changing the base URL:</p>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> openai </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> OpenAI</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">client </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> OpenAI</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    api_key</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"no_key"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line theme-code-block-highlighted-line" style="color:#9CDCFE"><span class="token plain">    base_url</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"http://localhost:9090/v1"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><br></span></code></pre></div></div>
<p>This gives you basic traces showing model calls, latencies, token usage, and function executions. You'll see what's being sent and received, but you're missing agent-specific context like handoffs, state transitions, and streaming details.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="full-agent-visibility">Full Agent Visibility<a href="https://vllora.dev/blog/using-vllora-with-openai-agents#full-agent-visibility" class="hash-link" aria-label="Direct link to Full Agent Visibility" title="Direct link to Full Agent Visibility" translate="no">​</a></h2>
<p>For complete tracing with agent state, handoffs, and streaming context, use the vLLora Python library:</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">pip </span><span class="token function" style="color:rgb(220, 220, 170)">install</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">'vllora[openai]'</span><br></span></code></pre></div></div>
<p>Set your vLLora endpoint:</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token builtin class-name" style="color:rgb(78, 201, 176)">export</span><span class="token plain"> </span><span class="token assign-left variable" style="color:rgb(156, 220, 254)">VLLORA_API_BASE_URL</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain">http://localhost:9090</span><br></span></code></pre></div></div>
<p>Initialize vLLora before creating agents:</p>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> vllora</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">openai </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> init</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">init</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)"># Now define your agents</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> openai </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> OpenAI</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)"># ...</span><br></span></code></pre></div></div>
<p>vLLora automatically captures agent interactions, handoffs, function calls, and streaming responses. No client configuration needed—just initialize once and all your agent workflows are traced end-to-end.</p>
<p><img decoding="async" loading="lazy" alt="Traces of OpenAI Agents on vLLora" src="https://vllora.dev/assets/images/traces-openai-int-8abd731739a3180d3e758019d17bad63.png" width="1920" height="958" class="img_uaae"></p>
<p>You'll see agent state transitions, handoff triggers, function inputs and outputs, and streaming chunks bundled into unified traces. Each trace shows the complete execution path with timing information, so you can spot bottlenecks and debug multi-agent workflows. When an agent hands off to another, when a function executes, or when streaming starts and stops—it's all visible in one place.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="next-steps">Next Steps<a href="https://vllora.dev/blog/using-vllora-with-openai-agents#next-steps" class="hash-link" aria-label="Direct link to Next Steps" title="Direct link to Next Steps" translate="no">​</a></h2>
<ul>
<li class="">Get started with vLLora: <a class="" href="https://vllora.dev/docs/quickstart">Quickstart Guide</a></li>
<li class="">Learn about deeper integrations: <a class="" href="https://vllora.dev/docs/working-with-agents">Working with Agent Frameworks</a></li>
<li class="">Explore the full documentation: <a class="" href="https://vllora.dev/docs/">Introduction</a></li>
</ul></div>]]></content:encoded>
            <category>OpenAI</category>
        </item>
        <item>
            <title><![CDATA[Using vLLora with Google ADK]]></title>
            <link>https://vllora.dev/blog/using-vllora-with-google-adk</link>
            <guid>https://vllora.dev/blog/using-vllora-with-google-adk</guid>
            <pubDate>Sat, 01 Nov 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Debug, trace, and analyze your Google ADK agents in real time with vLLora]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>Google ADK (Agent Development Kit) lets you build multi-agent systems across different LLM providers—Gemini, OpenAI, Anthropic, and more. But when your planner agent produces a <code>FunctionCall</code> for an <code>AgentTool</code> that doesn't run correctly, or a nested sub-agent fails silently, debugging what happened across agents and sessions becomes nearly impossible.</p>
<p><img decoding="async" loading="lazy" alt="Traces of Google ADK on vLLora" src="https://vllora.dev/assets/images/traces-adk-bc78faa8fb7a33f5de4ab99f009f707e.png" width="1355" height="848" class="img_uaae"></p>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="debugging-with-vllora">Debugging with vLLora<a href="https://vllora.dev/blog/using-vllora-with-google-adk#debugging-with-vllora" class="hash-link" aria-label="Direct link to Debugging with vLLora" title="Direct link to Debugging with vLLora" translate="no">​</a></h2>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> litellm</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> os</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)"># Configure LiteLLM to route through vLLora</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">os</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">environ</span><span class="token punctuation" style="color:rgb(212, 212, 212)">[</span><span class="token string" style="color:rgb(206, 145, 120)">"OPENAI_API_KEY"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">]</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"no_key"</span><span class="token plain"></span><br></span><span class="token-line theme-code-block-highlighted-line" style="color:#9CDCFE"><span class="token plain">os</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">environ</span><span class="token punctuation" style="color:rgb(212, 212, 212)">[</span><span class="token string" style="color:rgb(206, 145, 120)">"OPENAI_API_BASE"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">]</span><span class="token plain"> </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">"http://localhost:9090/v1"</span><br></span></code></pre></div></div>
<p>Then use LiteLLM models in your agents as usual:</p>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> google</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">adk</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">agents </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> Agent</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> google</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">adk</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">models</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">lite_llm </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> LiteLlm</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">weather_agent </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> Agent</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    name</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"weather_agent"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    model</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain">LiteLlm</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain">model</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"openai/gpt-4o"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    tools</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token punctuation" style="color:rgb(212, 212, 212)">[</span><span class="token plain">get_weather</span><span class="token punctuation" style="color:rgb(212, 212, 212)">]</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    </span><span class="token comment" style="color:rgb(106, 153, 85)"># ...</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><br></span></code></pre></div></div>
<p>All requests from this agent will flow through vLLora, giving you traces of model calls.</p>
<p><img decoding="async" loading="lazy" alt="Traces of Google ADK on vLLora" src="https://vllora.dev/assets/images/traces-adk-direct-35d6e1b0fe3c09a2334429592f7c77fd.png" width="1918" height="961" class="img_uaae"></p>
<p>Here you can see the traces of the model calls as well as the the <code>get_weather</code> tool call. But Google ADK has addtional metadata which is missing in the traces.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="advanced-tracing">Advanced Tracing<a href="https://vllora.dev/blog/using-vllora-with-google-adk#advanced-tracing" class="hash-link" aria-label="Direct link to Advanced Tracing" title="Direct link to Advanced Tracing" translate="no">​</a></h2>
<p>To get the complete observability including agent boundaries, tool calls, and nested workflows, use the vLLora Python library:</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">pip </span><span class="token function" style="color:rgb(220, 220, 170)">install</span><span class="token plain"> </span><span class="token string" style="color:rgb(206, 145, 120)">'vllora[adk]'</span><br></span></code></pre></div></div>
<p>Set your vLLora endpoint:</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token builtin class-name" style="color:rgb(78, 201, 176)">export</span><span class="token plain"> </span><span class="token assign-left variable" style="color:rgb(156, 220, 254)">VLLORA_API_BASE_URL</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain">http://localhost:9090</span><br></span></code></pre></div></div>
<p>Initialize vLLora before creating agents:</p>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> vllora</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">adk </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> init</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">init</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)"># Now define your agents</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> google</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">adk</span><span class="token punctuation" style="color:rgb(212, 212, 212)">.</span><span class="token plain">agents </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> Agent</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token comment" style="color:rgb(106, 153, 85)"># ...</span><br></span></code></pre></div></div>
<p>That's it. vLLora automatically discovers all agents, wraps their methods, and links sessions across your entire workflow. You don't need to configure LiteLLM separately; the initialization handles everything.</p>
<p><img decoding="async" loading="lazy" alt="Traces of Google ADK on vLLora" src="https://vllora.dev/assets/images/traces-adk-int-2a2d1bc28c128bce144d26617fcfea3d.png" width="1918" height="492" class="img_uaae"></p>
<p>Now you can see the full ADK workflow with extra metadata about the agent and tools, all bundled together as a single run. Agent transitions, tool executions, and model calls are captured in one unified trace.</p>
<p>With the library integration, you get complete visibility into agent boundaries, seeing exactly when control passes between agents. Every tool call is tracked with its inputs and outputs, sessions are linked across multiple agents and sub-agents, and complex nested workflows become visualizable. Whether you're debugging a single agent or orchestrating dozens, vLLora shows you exactly what's happening at every step.</p>
<p>For more details on integrating vLLora with Google ADK and other agent frameworks, check out our <a class="" href="https://vllora.dev/docs/working-with-agents">agent framework documentation</a>.</p></div>]]></content:encoded>
            <category>Google ADK</category>
        </item>
        <item>
            <title><![CDATA[Using vLLora to debug Agents]]></title>
            <link>https://vllora.dev/blog/debug-ai-agents</link>
            <guid>https://vllora.dev/blog/debug-ai-agents</guid>
            <pubDate>Thu, 30 Oct 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Debug, trace, and analyze your AI agents in real time with vLLora — an OpenAI-compatible local debugging tool for LangChain, ADK, and other frameworks.]]></description>
            <content:encoded><![CDATA[<div class="prose prose-slate dark:prose-invert max-w-none"><p>Building AI agents is hard. Debugging them locally across multiple SDKs, tools, and providers feels like flying blind. Logs give you partial visibility. You need to see every call, latency, cost, and output in context without rewriting code.</p>
<p><img decoding="async" loading="lazy" alt="Debugging demo" src="https://vllora.dev/assets/images/traces-6455a15d8a7b8a2f79dd84bf3f2556c6.gif" width="1280" height="643" class="img_uaae"></p>
<!-- -->
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="why-debugging-agents-is-hard">Why debugging agents is hard<a href="https://vllora.dev/blog/debug-ai-agents#why-debugging-agents-is-hard" class="hash-link" aria-label="Direct link to Why debugging agents is hard" title="Direct link to Why debugging agents is hard" translate="no">​</a></h2>
<p>When you debug locally, requests disappear into SDKs. You piece together prints, partial logs, and guesswork. When something breaks or slows down, pinpointing the step, model, or tool is hard. Cost tracking is manual at best.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="meet-vllora">Meet vLLora<a href="https://vllora.dev/blog/debug-ai-agents#meet-vllora" class="hash-link" aria-label="Direct link to Meet vLLora" title="Direct link to Meet vLLora" translate="no">​</a></h2>
<p>vLLora is a local debugging tool with a UI that intercepts LLM requests. It implements the OpenAI API, so your existing clients and frameworks work unchanged. Set <code>base_url</code> to <code>http://localhost:9090/v1</code> and run your code as-is. vLLora forwards requests to your chosen provider using your keys, preserves streaming and tool/function calls, and records a trace for each step.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="get-started-in-under-a-minute">Get started in under a minute<a href="https://vllora.dev/blog/debug-ai-agents#get-started-in-under-a-minute" class="hash-link" aria-label="Direct link to Get started in under a minute" title="Direct link to Get started in under a minute" translate="no">​</a></h2>
<p>Install vLLora, point your SDK to it, and keep your existing code:</p>
<div class="language-bash codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-bash codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token plain">brew tap vllora/vllora</span><br></span><span class="token-line theme-code-block-highlighted-line" style="color:#9CDCFE"><span class="token plain">brew </span><span class="token function" style="color:rgb(220, 220, 170)">install</span><span class="token plain"> vllora</span><br></span><span class="token-line theme-code-block-highlighted-line" style="color:#9CDCFE"><span class="token plain">vllora</span><br></span></code></pre></div></div>
<p>Change your base URL and you're done:</p>
<div class="language-python codeBlockContainer_CFf0 theme-code-block" style="--prism-color:#9CDCFE;--prism-background-color:#1E1E1E"><div class="codeBlockTitle_Uuc9">LangChain (Python)</div><div class="codeBlockContent_NkbH"><pre tabindex="0" class="prism-code language-python codeBlock_AIpX thin-scrollbar" style="color:#9CDCFE;background-color:#1E1E1E"><code class="codeBlockLines_pvbz"><span class="token-line" style="color:#9CDCFE"><span class="token keyword" style="color:rgb(86, 156, 214)">from</span><span class="token plain"> langchain_openai </span><span class="token keyword" style="color:rgb(86, 156, 214)">import</span><span class="token plain"> ChatOpenAI</span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">llm </span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token plain"> ChatOpenAI</span><span class="token punctuation" style="color:rgb(212, 212, 212)">(</span><span class="token plain"></span><br></span><span class="token-line theme-code-block-highlighted-line" style="color:#9CDCFE"><span class="token plain">    base_url</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"http://localhost:9090/v1"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain">    model</span><span class="token operator" style="color:rgb(212, 212, 212)">=</span><span class="token string" style="color:rgb(206, 145, 120)">"openai/gpt-4o-mini"</span><span class="token punctuation" style="color:rgb(212, 212, 212)">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#9CDCFE"><span class="token plain"></span><span class="token punctuation" style="color:rgb(212, 212, 212)">)</span><br></span></code></pre></div></div>
<p>Every request now flows through vLLora. Open <code>http://localhost:9091</code> to see the traces streaming in real time. For detailed setup instructions across different frameworks, see <a class="" href="https://vllora.dev/using-vllora">Using vLLora</a>.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="observe-your-agent-in-real-time">Observe your agent in real time<a href="https://vllora.dev/blog/debug-ai-agents#observe-your-agent-in-real-time" class="hash-link" aria-label="Direct link to Observe your agent in real time" title="Direct link to Observe your agent in real time" translate="no">​</a></h2>
<p>Open the UI while you build. Each request shows inputs, outputs, timing, and cost. No custom logging code needed.</p>
<p><img decoding="async" loading="lazy" alt="Trace view" src="https://vllora.dev/assets/images/traces-openai-625954fe9d3884f94b963ab3081ed511.png" width="1363" height="870" class="img_uaae"></p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="how-debugging-works">How debugging works<a href="https://vllora.dev/blog/debug-ai-agents#how-debugging-works" class="hash-link" aria-label="Direct link to How debugging works" title="Direct link to How debugging works" translate="no">​</a></h2>
<p><img decoding="async" loading="lazy" alt="Grouped by time bucket" src="https://vllora.dev/assets/images/grouped-bucket-36792c10ebe8e51c4139aafe05cdea20.png" width="1733" height="929" class="img_uaae"></p>
<p>vLLora sits between your SDK and the provider, capturing every request your framework makes and streaming traces to the UI in real time. You get full visibility into inputs, outputs, latency, and cost per model call. Requests are grouped by run or time bucket so you can see how your agent behaves step by step, replay turns, inspect streaming output, or compare model responses across different calls.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="compatibility-and-models">Compatibility and models<a href="https://vllora.dev/blog/debug-ai-agents#compatibility-and-models" class="hash-link" aria-label="Direct link to Compatibility and models" title="Direct link to Compatibility and models" translate="no">​</a></h2>
<p>vLLora works out of the box with OpenAI-compatible clients and major agent frameworks (LangChain, Google ADK, OpenAI Agents). Keep your code, just change the base URL. Use your own provider keys and switch between 300+ models to compare quality, performance, and cost.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="when-to-use-vllora">When to use vLLora<a href="https://vllora.dev/blog/debug-ai-agents#when-to-use-vllora" class="hash-link" aria-label="Direct link to When to use vLLora" title="Direct link to When to use vLLora" translate="no">​</a></h2>
<p>vLLora shines when you're building agents that use multiple tools and models, measuring latency and cost per step matters, or you're switching between providers like OpenAI, Anthropic, and Gemini and need consistent logs across all of them. If you're debugging chain-of-thought issues or tracking down missing tool calls, vLLora gives you a single pane of glass to see everything that's happening.</p>
<h2 class="anchor anchorTargetStickyNavbar_sDfD" id="next-steps">Next steps<a href="https://vllora.dev/blog/debug-ai-agents#next-steps" class="hash-link" aria-label="Direct link to Next steps" title="Direct link to Next steps" translate="no">​</a></h2>
<p>Ready to dive deeper? Check out the <a class="" href="https://vllora.dev/docs/quickstart">Quickstart</a> for installation details and sending your first trace, or explore <a class="" href="https://vllora.dev/docs/working-with-agents">Working with Agent Frameworks</a> for deeper integration with frameworks like OpenAI Agents SDK and Google ADK. For a complete overview of the product and setup details, see the <a class="" href="https://vllora.dev/docs/">Introduction</a>.</p>
<p>You now have x-ray vision for your agents. Build, trace, and optimize faster, all without touching your code.</p></div>]]></content:encoded>
            <category>Langchain</category>
            <category>Google ADK</category>
        </item>
    </channel>
</rss>