Technical overview

How ListingLens works

A real explanation of the agent architecture, API choices, and engineering decisions — written for the technical founder who wants to know whether the product is actually well-built, not just well-marketed.

Architecture

Four agents, one synthesis layer

ListingLens doesn't run a single model against an Amazon URL. It runs four independent AI agents in parallel, each with a different data source and evaluation rubric, then synthesises their outputs into a unified report.

POST /api/analyse  { asin, url }
  → Upstash Redis (rate limiting + caching)
  → SerpAPI product (title, images, bullets)
  → Agent fan-out (Promise.allSettled)
      · Visual auditor: Claude Sonnet, vision + rubric
      · Review intelligence: Firecrawl, NLP + gap mapping
      · AI search visibility: Claude + Gemini, multi-model query
      · Category benchmarker: SerpAPI + Claude, competitor diff
  → SSE stream (agent events → client)
  → Synthesis layer (Claude aggregator)
  → Structured report (JSON + Upstash KV)

The fan-out runs via Promise.allSettled — all four agents start simultaneously, each streams its progress events independently, and a failed agent doesn't block the others. The synthesis layer only runs once all four have either completed or failed.
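The non-blocking behaviour can be seen in a minimal sketch (the agent functions here are stand-ins, not the production implementations):

```typescript
// Minimal sketch of the fan-out: one agent fails, the others still complete.
type AgentResult = { id: string; score: number };

async function runFanOut(
  agents: Array<() => Promise<AgentResult>>,
): Promise<PromiseSettledResult<AgentResult>[]> {
  // All agents start immediately; a rejection never cancels its siblings.
  return Promise.allSettled(agents.map((fn) => fn()));
}

const results = await runFanOut([
  async () => ({ id: "visual", score: 58 }),
  async () => { throw new Error("scrape blocked"); }, // simulated failure
  async () => ({ id: "aiSearch", score: 71 }),
]);

// Synthesis only runs once every agent has settled (fulfilled or rejected).
const completed = results.filter((r) => r.status === "fulfilled").length;
```

Because `allSettled` never short-circuits, the synthesis layer always receives a result slot per agent, fulfilled or not.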

Agents in depth

What each agent actually does

01

Visual auditor

Claude Sonnet 4.6 · vision · structured output

Each image is base64-encoded and sent to Claude Sonnet with a custom rubric prompt: mobile readability (375px simulation), focal-point strength, text density, hierarchy clarity, and background competition. The model returns a structured JSON score for each image with per-failure explanations.
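A sketch of the per-image request, assuming the Anthropic Messages API shape for base64 image blocks; the model id, rubric wording, and helper name are illustrative, and the production prompt is much longer:

```typescript
// Illustrative sketch of the vision request built for each listing image.
const RUBRIC_PROMPT =
  "Score this listing image 0-100 on: mobile readability (375px simulation), " +
  "focal-point strength, text density, hierarchy clarity, background competition. " +
  'Return JSON: {"scores": {...}, "failures": [{"axis": "...", "reason": "..."}]}';

function buildVisionRequest(base64Image: string, mediaType: string) {
  return {
    model: "claude-sonnet-4-5", // assumed model id
    max_tokens: 1024,
    messages: [
      {
        role: "user" as const,
        content: [
          {
            type: "image" as const,
            source: {
              type: "base64" as const,
              media_type: mediaType,
              data: base64Image,
            },
          },
          { type: "text" as const, text: RUBRIC_PROMPT },
        ],
      },
    ],
  };
}
```

Placing the image block before the rubric text follows the API's recommended ordering for image-grounded questions.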

Claude Sonnet is chosen over GPT-4V because Sonnet is significantly more precise when scoring against a numerical rubric. GPT-4V tends to over-justify; Sonnet gives cleaner scores when the prompt is tightly constrained.

02

Review intelligence

Firecrawl · Claude Sonnet · NLP

Reviews are scraped via Firecrawl with an extraction schema (paginated, up to 50 reviews). Claude runs extraction in a single batch: identify purchase triggers, extract recurring complaints, and surface keywords that appear in reviews but not in listing images.

This gap list feeds directly into the brief generator — the most actionable output of the tool.
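The gap-mapping step can be sketched as a simplified stand-in for the model-driven extraction, using plain token counting instead of NLP (function and threshold are illustrative):

```typescript
// Sketch: surface words that recur across reviews but never appear in the
// listing copy. The real agent does this with Claude, not token counting.
function findGapKeywords(
  reviews: string[],
  listingText: string,
  minMentions = 2,
): string[] {
  const listing = new Set(listingText.toLowerCase().split(/\W+/));
  const counts = new Map<string, number>();
  for (const review of reviews) {
    // Dedupe within a review so one rant doesn't inflate the count.
    for (const word of new Set(review.toLowerCase().split(/\W+/))) {
      if (word.length > 3 && !listing.has(word)) {
        counts.set(word, (counts.get(word) ?? 0) + 1);
      }
    }
  }
  return [...counts].filter(([, n]) => n >= minMentions).map(([w]) => w);
}
```

A keyword like "dishwasher" mentioned in several reviews but absent from the listing text would surface here and flow into the brief generator.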

03

AI search visibility

Claude API · Gemini API · 6 queries

This is the most novel agent. It constructs 6 high-intent shopper queries from the product category and sends each to both Claude and Gemini: "Recommend a product for this need." The agent checks whether the product ASIN, brand name, or primary keyword appears.

Both models are queried because AI shopping behaviour differs between them — a product can be visible on Claude and invisible on Gemini, or vice versa.
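The matching step against each model's answer can be sketched as follows (query construction and the API calls themselves are omitted; names are illustrative):

```typescript
// Sketch: given one model's answer to a shopper query, check which of the
// product's identifiers appear. Run once per query, per model.
interface VisibilityHit {
  asin: boolean;
  brand: boolean;
  keyword: boolean;
}

function checkVisibility(
  answer: string,
  product: { asin: string; brand: string; keyword: string },
): VisibilityHit {
  const text = answer.toLowerCase();
  return {
    asin: text.includes(product.asin.toLowerCase()),
    brand: text.includes(product.brand.toLowerCase()),
    keyword: text.includes(product.keyword.toLowerCase()),
  };
}
```

Aggregating these hits per model is what lets the report say a product is visible on Claude but invisible on Gemini.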

04

Category benchmarker

SerpAPI Shopping · Claude Vision · diff scoring

SerpAPI returns the top 5 organic competitors for the listing's primary keyword. Each competitor's hero image is fetched and proxied. Claude Vision then runs a pairwise comparison against your hero image on 4 axes: text clarity, visual hierarchy, trust signals, and mobile-first composition.

The output is a gap score (0–100) per competitor, plus a narrative callout on the single highest-opportunity visual gap. The benchmarker is started first as it's the slowest agent.
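One way to derive that 0–100 gap score, assuming Claude returns a 0–100 score per axis for each image (the aggregation formula below is an illustrative sketch, not the production scoring):

```typescript
// Sketch: per-competitor gap score from the four pairwise axes.
const AXES = [
  "textClarity",
  "visualHierarchy",
  "trustSignals",
  "mobileComposition",
] as const;
type AxisScores = Record<(typeof AXES)[number], number>;

function gapScore(yours: AxisScores, competitor: AxisScores): number {
  // Only count axes where the competitor leads; a strength of yours
  // shouldn't cancel out a weakness elsewhere.
  const lead = AXES.reduce(
    (sum, axis) => sum + Math.max(0, competitor[axis] - yours[axis]),
    0,
  );
  return Math.min(100, Math.round(lead / AXES.length));
}
```

The axis with the largest lead is a natural candidate for the highest-opportunity callout.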

Transport layer

Why SSE, not WebSockets

Decision: Server-Sent Events over WebSockets. This is deliberate, not a shortcut.

The analysis pipeline is entirely server-to-client: the client submits a URL, then receives a stream of events. There is no bidirectional communication requirement after submission. SSE is strictly correct here — it's unidirectional by design, rides HTTP/2 for free multiplexing, and doesn't require a WebSocket handshake upgrade.

On Vercel Edge Functions specifically, WebSocket connections require a persistent connection that conflicts with the serverless execution model. SSE works natively with ReadableStream and flush control.
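The plumbing is small: SSE frames are plain text written to a `ReadableStream`. A sketch of that shape (handler and helper names are illustrative), with event names matching the stream timeline shown in this section:

```typescript
// Sketch: format an SSE frame and wrap the agent run in a streaming Response.
function formatSSE(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

function sseResponse(
  run: (emit: (event: string, data: unknown) => void) => Promise<void>,
): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      const emit = (event: string, data: unknown) =>
        controller.enqueue(encoder.encode(formatSSE(event, data)));
      await run(emit); // agents call emit() as they progress
      controller.close();
    },
  });
  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```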

t=0ms      event: agent_start         data: {"id":"visual","title":"Visual auditor"}
t=120ms    event: agent_line          data: {"id":"visual","line":"Loading listing images…"}
t=340ms    event: agent_line          data: {"id":"reviews","line":"Reading customer reviews…"}
t=4200ms   event: agent_complete      data: {"id":"visual","score":58,"summary":"7 images scored."}
t=9800ms   event: synthesis_complete  data: {"reportId":"abc123","score":63,"grade":"C+"}
Resilience

How agent failure is handled gracefully

Each agent runs inside a withTimeout(agentFn, 15_000) wrapper. If an agent exceeds 15 seconds or throws, its promise rejects, and the fan-out substitutes a degraded result object rather than propagating the exception to the client.
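A minimal version of that wrapper (the production one presumably also tags errors with the agent id):

```typescript
// Sketch: race the agent against a timer. If the timer wins, the promise
// rejects, which Promise.allSettled records as a failed slot.
function withTimeout<T>(agentFn: () => Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`agent timed out after ${ms}ms`)),
      ms,
    );
    agentFn().then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

Clearing the timer on settle matters on serverless runtimes, where a stray pending timer can keep the invocation alive.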

// In the fan-out:
const results = await Promise.allSettled([
  withTimeout(visualAgent, 15_000),
  withTimeout(reviewAgent, 15_000),
  withTimeout(aiSearchAgent, 15_000),
  withTimeout(benchmarkAgent, 15_000),
]);

// Failed agents get a degraded placeholder:
const safe = results.map((r, i) =>
  r.status === 'fulfilled'
    ? r.value
    : { ...AGENT_DEFAULTS[i], failed: true, score: null }
);
Degraded mode: if one agent fails, the report is generated from the remaining three. The failed agent's section shows "Analysis unavailable" rather than a score, and the overall score excludes it from the weighted average.
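Excluding a failed agent from the weighted average can be sketched like this (the weights here are illustrative, not the production values):

```typescript
// Sketch: overall score over only the agents that produced a score,
// renormalising the remaining weights so they still sum to 1.
interface AgentScore {
  id: string;
  score: number | null; // null = failed / degraded
  weight: number;
}

function overallScore(agents: AgentScore[]): number | null {
  const live = agents.filter((a) => a.score !== null);
  if (live.length === 0) return null; // all agents failed: no score at all
  const totalWeight = live.reduce((sum, a) => sum + a.weight, 0);
  const weighted = live.reduce(
    (sum, a) => sum + (a.score as number) * a.weight,
    0,
  );
  return Math.round(weighted / totalWeight);
}
```

Renormalising (rather than treating a failure as zero) keeps one flaky scrape from tanking an otherwise healthy listing's grade.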
Decisions

Key architectural choices

· Transport: chose SSE over WebSockets. Unidirectional flow; works on Vercel Edge without persistent connections.
· Visual scoring: chose Claude Sonnet over GPT-4V. Cleaner structured output; less over-justification on rubric-constrained scoring.
· Persistence: chose Upstash KV over Postgres. Serverless-native; no connection pooling overhead; free tier covers MVP volume.
· Agent fan-out: chose Promise.allSettled over Promise.all. Sequential execution is 4× slower; Promise.all fails fast on any error.
· Image proxy: chose Edge Function over direct CDN URLs. Amazon CDN URLs require browser cookies; proxying server-side bypasses CORS.
· Score card export: chose html-to-image over @vercel/og (Satori). Client-side canvas rendering handles complex CSS seamlessly and avoids Satori's strict layout limitations.