Technical overview

How ListingLens works

A real explanation of the agent architecture, API choices, and engineering decisions — written for the technical founder who wants to know whether the product is actually well-built, not just well-marketed.

Architecture

Four agents, one synthesis layer

ListingLens doesn't run a single model against an Amazon URL. It runs four independent AI agents in parallel, each with a different data source and evaluation rubric, then synthesises their outputs into a unified report.

POST /api/analyse  { asin, url }
  → Upstash Redis (rate limiting + caching)
  → SerpAPI product (title, images, bullets)
  → Agent fan-out (Promise.allSettled)
      · Visual auditor: Claude Sonnet, vision + rubric
      · Review intelligence: Firecrawl, NLP + gap mapping
      · AI search visibility: Claude + Gemini, multi-model query
      · Category benchmarker: SerpAPI + Claude, competitor diff
  → SSE stream (agent events → client)
  → Synthesis layer (Claude aggregator)
  → Structured report (JSON + Upstash KV)

The fan-out runs via Promise.allSettled — all four agents start simultaneously, each streams its progress events independently, and a failed agent doesn't block the others. The synthesis layer only runs once all four have either completed or failed.
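The non-blocking behaviour can be seen in a minimal sketch (the agent functions here are stand-ins, not the production implementations):

```typescript
// Minimal sketch of the fan-out: one agent fails, the others still complete.
type AgentResult = { id: string; score: number };

async function runFanOut(
  agents: Array<() => Promise<AgentResult>>,
): Promise<PromiseSettledResult<AgentResult>[]> {
  // All agents start immediately; a rejection never cancels its siblings.
  return Promise.allSettled(agents.map((fn) => fn()));
}

const results = await runFanOut([
  async () => ({ id: "visual", score: 58 }),
  async () => { throw new Error("scrape blocked"); }, // simulated failure
  async () => ({ id: "aiSearch", score: 71 }),
]);

// Synthesis only runs once every agent has settled (fulfilled or rejected).
const completed = results.filter((r) => r.status === "fulfilled").length;
```

Because `allSettled` never short-circuits, the synthesis layer always receives a result slot per agent, fulfilled or not.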

Agents in depth

What each agent actually does

01

Visual auditor

Claude Sonnet 4.6 · vision · structured output

Each image is base64-encoded and sent to Claude Sonnet with a custom rubric prompt: mobile readability (375px simulation), focal-point strength, text density, hierarchy clarity, and background competition. The model returns a structured JSON score for each image with per-failure explanations.
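A sketch of the per-image request, assuming the Anthropic Messages API shape for base64 image blocks; the model id, rubric wording, and helper name are illustrative, and the production prompt is much longer:

```typescript
// Illustrative sketch of the vision request built for each listing image.
const RUBRIC_PROMPT =
  "Score this listing image 0-100 on: mobile readability (375px simulation), " +
  "focal-point strength, text density, hierarchy clarity, background competition. " +
  'Return JSON: {"scores": {...}, "failures": [{"axis": "...", "reason": "..."}]}';

function buildVisionRequest(base64Image: string, mediaType: string) {
  return {
    model: "claude-sonnet-4-5", // assumed model id
    max_tokens: 1024,
    messages: [
      {
        role: "user" as const,
        content: [
          {
            type: "image" as const,
            source: {
              type: "base64" as const,
              media_type: mediaType,
              data: base64Image,
            },
          },
          { type: "text" as const, text: RUBRIC_PROMPT },
        ],
      },
    ],
  };
}
```

Placing the image block before the rubric text follows the API's recommended ordering for image-grounded questions.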

Claude Sonnet is chosen over GPT-4V because Sonnet is significantly more precise when scoring against a numerical rubric. GPT-4V tends to over-justify; Sonnet gives cleaner scores when the prompt is tightly constrained.

02

Review intelligence

Firecrawl · Claude Sonnet · NLP

Reviews are scraped via Firecrawl with an extraction schema (paginated, up to 50 reviews). Claude runs extraction in a single batch: identify purchase triggers, extract recurring complaints, and surface keywords that appear in reviews but not in listing images.

This gap list feeds directly into the brief generator — the most actionable output of the tool.
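The gap-mapping step can be sketched as a simplified stand-in for the model-driven extraction, using plain token counting instead of NLP (function and threshold are illustrative):

```typescript
// Sketch: surface words that recur across reviews but never appear in the
// listing copy. The real agent does this with Claude, not token counting.
function findGapKeywords(
  reviews: string[],
  listingText: string,
  minMentions = 2,
): string[] {
  const listing = new Set(listingText.toLowerCase().split(/\W+/));
  const counts = new Map<string, number>();
  for (const review of reviews) {
    // Dedupe within a review so one rant doesn't inflate the count.
    for (const word of new Set(review.toLowerCase().split(/\W+/))) {
      if (word.length > 3 && !listing.has(word)) {
        counts.set(word, (counts.get(word) ?? 0) + 1);
      }
    }
  }
  return [...counts].filter(([, n]) => n >= minMentions).map(([w]) => w);
}
```

A keyword like "dishwasher" mentioned in several reviews but absent from the listing text would surface here and flow into the brief generator.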

03

AI search visibility

Claude API · Gemini API · 6 queries

This is the most novel agent. It constructs 6 high-intent shopper queries from the product category and sends each to both Claude and Gemini: "Recommend a product for this need." The agent checks whether the product ASIN, brand name, or primary keyword appears.

Both models are queried because AI shopping behaviour differs between them — a product can be visible on Claude and invisible on Gemini, or vice versa.
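The matching step against each model's answer can be sketched as follows (query construction and the API calls themselves are omitted; names are illustrative):

```typescript
// Sketch: given one model's answer to a shopper query, check which of the
// product's identifiers appear. Run once per query, per model.
interface VisibilityHit {
  asin: boolean;
  brand: boolean;
  keyword: boolean;
}

function checkVisibility(
  answer: string,
  product: { asin: string; brand: string; keyword: string },
): VisibilityHit {
  const text = answer.toLowerCase();
  return {
    asin: text.includes(product.asin.toLowerCase()),
    brand: text.includes(product.brand.toLowerCase()),
    keyword: text.includes(product.keyword.toLowerCase()),
  };
}
```

Aggregating these hits per model is what lets the report say a product is visible on Claude but invisible on Gemini.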

04

Category benchmarker

SerpAPI Shopping · Claude Vision · diff scoring

SerpAPI returns the top 5 organic competitors for the listing's primary keyword. Each competitor's hero image is fetched and proxied. Claude Vision then runs a pairwise comparison against your hero image on 4 axes: text clarity, visual hierarchy, trust signals, and mobile-first composition.

The output is a gap score (0–100) per competitor, plus a narrative callout on the single highest-opportunity visual gap. The benchmarker is started first as it's the slowest agent.
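One way to derive that 0–100 gap score, assuming Claude returns a 0–100 score per axis for each image (the aggregation formula below is an illustrative sketch, not the production scoring):

```typescript
// Sketch: per-competitor gap score from the four pairwise axes.
const AXES = [
  "textClarity",
  "visualHierarchy",
  "trustSignals",
  "mobileComposition",
] as const;
type AxisScores = Record<(typeof AXES)[number], number>;

function gapScore(yours: AxisScores, competitor: AxisScores): number {
  // Only count axes where the competitor leads; a strength of yours
  // shouldn't cancel out a weakness elsewhere.
  const lead = AXES.reduce(
    (sum, axis) => sum + Math.max(0, competitor[axis] - yours[axis]),
    0,
  );
  return Math.min(100, Math.round(lead / AXES.length));
}
```

The axis with the largest lead is a natural candidate for the highest-opportunity callout.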

Transport layer

Why SSE, not WebSockets

Decision: Server-Sent Events over WebSockets. This is deliberate, not a shortcut.

The analysis pipeline is entirely server-to-client: the client submits a URL, then receives a stream of events. There is no bidirectional communication requirement after submission. SSE is strictly correct here — it's unidirectional by design, rides HTTP/2 for free multiplexing, and doesn't require a WebSocket handshake upgrade.

On Vercel Edge Functions specifically, WebSocket connections require a persistent connection that conflicts with the serverless execution model. SSE works natively with ReadableStream and flush control.
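The plumbing is small: SSE frames are plain text written to a `ReadableStream`. A sketch of that shape (handler and helper names are illustrative), with event names matching the stream timeline shown in this section:

```typescript
// Sketch: format an SSE frame and wrap the agent run in a streaming Response.
function formatSSE(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

function sseResponse(
  run: (emit: (event: string, data: unknown) => void) => Promise<void>,
): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      const emit = (event: string, data: unknown) =>
        controller.enqueue(encoder.encode(formatSSE(event, data)));
      await run(emit); // agents call emit() as they progress
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```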

t=0ms      event: agent_start         data: {"id":"visual","title":"Visual auditor"}
t=120ms    event: agent_line          data: {"id":"visual","line":"Loading listing images…"}
t=340ms    event: agent_line          data: {"id":"reviews","line":"Reading customer reviews…"}
t=4200ms   event: agent_complete      data: {"id":"visual","score":58,"summary":"7 images scored."}
t=9800ms   event: synthesis_complete  data: {"reportId":"abc123","score":63,"grade":"C+"}
Resilience

How agent failure is handled gracefully

Each agent runs inside a withTimeout(agentFn, 15_000) wrapper. If an agent exceeds 15 seconds or throws, its promise rejects, and the fan-out substitutes a degraded result object rather than propagating the exception to the client.
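A minimal version of that wrapper (the production one presumably also tags errors with the agent id):

```typescript
// Sketch: race the agent against a timer. If the timer wins, the promise
// rejects, which Promise.allSettled records as a failed slot.
function withTimeout<T>(agentFn: () => Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`agent timed out after ${ms}ms`)),
      ms,
    );
    agentFn().then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

Clearing the timer on settle matters on serverless runtimes, where a stray pending timer can keep the invocation alive.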

// In the fan-out:
const results = await Promise.allSettled([
  withTimeout(visualAgent, 15_000),
  withTimeout(reviewAgent, 15_000),
  withTimeout(aiSearchAgent, 15_000),
  withTimeout(benchmarkAgent, 15_000),
]);

// Failed agents get a degraded placeholder:
const safe = results.map((r, i) =>
  r.status === 'fulfilled'
    ? r.value
    : { ...AGENT_DEFAULTS[i], failed: true, score: null }
);
Degraded mode: if one agent fails, the report is generated from the remaining three. The failed agent's section shows "Analysis unavailable" rather than a score, and the overall score excludes it from the weighted average.
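Excluding a failed agent from the weighted average can be sketched like this (the weights here are illustrative, not the production values):

```typescript
// Sketch: overall score over only the agents that produced a score,
// renormalising the remaining weights so they still sum to 1.
interface AgentScore {
  id: string;
  score: number | null; // null = failed / degraded
  weight: number;
}

function overallScore(agents: AgentScore[]): number | null {
  const live = agents.filter((a) => a.score !== null);
  if (live.length === 0) return null; // all agents failed: no score at all
  const totalWeight = live.reduce((sum, a) => sum + a.weight, 0);
  const weighted = live.reduce(
    (sum, a) => sum + (a.score as number) * a.weight,
    0,
  );
  return Math.round(weighted / totalWeight);
}
```

Renormalising (rather than treating a failure as zero) keeps one flaky scrape from tanking an otherwise healthy listing's grade.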
Decisions

Key architectural choices

· Transport: chose SSE over WebSockets. Unidirectional flow; works on Vercel Edge without persistent connections.
· Visual scoring: chose Claude Sonnet over GPT-4V. Cleaner structured output; less over-justification on rubric-constrained scoring.
· Persistence: chose Upstash KV over Postgres. Serverless-native; no connection pooling overhead; free tier covers MVP volume.
· Agent fan-out: chose Promise.allSettled over Promise.all. Sequential execution is 4× slower; Promise.all fails fast on any error.
· Image proxy: chose Edge Function over direct CDN URLs. Amazon CDN URLs require browser cookies; proxying server-side bypasses CORS.
· Score card export: chose html-to-image over @vercel/og (Satori). Client-side canvas rendering handles complex CSS seamlessly and avoids Satori's strict layout limitations.