The Stack Has a Gap
Three things became true in commerce over the last eighteen months, and they became true close enough together that you could miss the shape of what's happening if you weren't looking for it.
The first is that AI-driven discovery stopped being a curiosity. ChatGPT now reports around 900 million weekly active users. AI Overviews appear on roughly 18% of Google searches. Perplexity is doing close to a billion queries a quarter. A meaningful share of that traffic is people researching things to buy: comparing jackets, looking up coffee tables, asking which running shoe suits a flat foot. The search bar is no longer the only front door, and for a growing slice of shoppers it isn't even the main one.
The second is that the transaction layer is being rewired to match. OpenAI and Stripe shipped the Agentic Commerce Protocol in late 2025, and ChatGPT's Instant Checkout is now live with Etsy and rolling out across more than a million Shopify merchants, including Glossier, SKIMS, Vuori, and Spanx. Google and Shopify are pushing a parallel standard, the Universal Commerce Protocol. PayPal is bringing tens of millions of small businesses onto ACP this year. The plumbing that lets an agent close a purchase on a shopper's behalf is no longer theoretical. It's a spec with version numbers.
The third is quieter, but it's the one that actually changes the experience. Image generation crossed a threshold somewhere in the last year where a model can render a specific product on a specific person's body, with fabric drape, lighting, and proportion at a quality a brand will actually put in front of a customer. Google's try-on, powered by a custom fashion model, now works from a single selfie. Doji is generating photorealistic outfit try-ons from six photos. eMarketer found that try-on imagery in Google Search gets 60% more high-quality views than standard product listings. This is no longer the AR era of awkward overlays and 3D approximations. The output is photographic.
Discovery moved. Checkout is moving. The layer in between, the moment a shopper decides "yes, that's for me," has barely been touched.
That's the gap.
Why the middle has always been broken
Every product detail page you've ever seen ships with two primitives: Add to Cart and some flavor of Shop the Look. Neither has a visualization counterpart. There is, on most of the internet, no fast way to see the thing on yourself before you commit to it.
The numbers tell you what that costs. The National Retail Federation put US ecommerce returns at roughly 20% of orders in 2024, around $360 billion in goods coming back. Apparel runs much higher, somewhere between 25% and 40% depending on category and season, against 8 to 10% for in-store purchases. Roughly half of online shoppers admit to bracketing: buying multiple items with the explicit intent to return what doesn't look good. None of these numbers are new. They've been roughly this bad for years.
The industry tried to fix this once, in the AR era. Overlays, 3D models, virtual fitting rooms. The technology was rough, the experience didn't feel like the brand, and adoption stalled. A lot of brand teams learned to say no to the entire category, and that institutional memory is still operating. So the technology has leapt forward in the last eighteen months, but trust hasn't caught up. The category needs to be reintroduced more than it needs to be iterated on.
The three-layer stack
It helps to think about the commerce experience as three layers stacked on top of each other.
- Discovery is where shoppers find products. AI has already moved here, both in the obvious sense (ChatGPT, Perplexity, AI Overviews) and in the less obvious one (the Shopping Graph, agent-readable feeds, structured catalogs as the new SEO).
- Checkout is where transactions close. ACP and UCP are the protocols being standardized to let agents complete purchases on a shopper's behalf, with the merchant remaining merchant-of-record. The race here is mostly about who sets the spec.
- Visualization is where shoppers decide. Right now it lives, when it lives at all, as a feature bolted onto a PDP. It doesn't show up in CX chat. It doesn't show up when a stylist is helping on the sales floor. And it doesn't show up where a growing share of discovery is happening, inside the AI assistant the shopper is already talking to.
If discovery is moving and checkout is moving, the middle layer being static isn't a steady state. It's a queue.
What changes when fidelity is good enough
The reason this moment is different from the AR era isn't really better technology in the abstract. It's that the output is now good enough to be trusted. When a shopper uploads a photo and sees themselves wearing the thing, with fabric falling correctly, lighting consistent, the product sitting on their actual body in their actual proportions, the gap between "I think this might work" and "I want this" closes inside one image.
That's not a gimmick. It's a different conversion mechanism, and it's measurable. The eMarketer data is one signal: try-on imagery getting more engagement than static listings. So is the behavior pattern. Shoppers don't try on once. They iterate. They compare two jackets, swap a top, see the same dress in a different color. The feature ends up working as a comparison tool, which is a different and more durable thing than a one-shot novelty.
There's also a secondary effect on returns. A lot of what gets sent back isn't a product problem. It's a decision the shopper made without enough information. They saw the product on a model, on a hanger, on a flat lay, and had to extrapolate from there to themselves. Sometimes the extrapolation works. Often it doesn't, and the box goes back.
Visualization closes that gap. When the shopper can see the product on their own body before they commit, the decision they're making is the one they'll actually live with when the package arrives. There's no extrapolation step to get wrong. That's why 85% of apparel brands Coresight surveyed either had virtual try-on in production or were planning it. The intent has been there for years. The output quality is what was holding it back.
The agent distribution moment
Here's what makes the timing more urgent than the fidelity unlock alone.
Agentic commerce protocols (ACP and UCP) are being designed to let AI assistants do the whole thing: find the product, confirm details, close the purchase. That's the thesis the specs are built around. But there's a step in the middle the specs don't address. Between the assistant surfacing a product and the shopper saying yes, the shopper still has to decide. And the way humans decide whether they want something is by picturing it. An agent that can find a jacket and complete a transaction but can't help the shopper see the jacket on themselves is solving the two easy ends of the problem and leaving the hard part in the middle.
That middle step is going to get filled. The interesting question is by what. Surfaces the brand doesn't own (chat assistants, recommendation feeds, third-party storefronts) are where a growing share of decisions get made, and those surfaces need something to call when a shopper asks to see the product on themselves. Whoever ends up being the thing they call is in a different position than whoever ends up being one of many widgets bolted onto a PDP.
The fitting room stops being a place and starts being something more abstract. Less a feature on a website, more a capability that gets reached for from wherever the shopper happens to be when they're deciding.
Measurement is the unlock
There's one more reason the AR era stalled, and it's worth saying out loud because the fix isn't just better pixels.
Visualization was never a reportable channel. AR vendors couldn't close the loop between try-on and revenue, which meant brand teams couldn't answer the question their CFO was always going to ask: what did this drive? Try-on stayed in the "nice to have" bucket, and "nice to have" is the first line item cut when budgets tighten.
The shift now isn't just rendering quality. It's attribution. Try-on volume, checkout clicks from a try-on session, shared looks, orders tied back to specific renders, all of it can flow as first-party data into the analytics stack a brand already runs. When procurement can see a measured revenue line, the conversation moves from marketing experiment to infrastructure with a budget.
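To make "orders tied back to specific renders" concrete, here is a minimal sketch of one possible attribution rule. Everything in it is an assumption made for illustration: the TryOnEvent and Order shapes, the field names, and the last-render-wins rule are hypothetical and not taken from any vendor's API or from the ACP or UCP specs. A real pipeline would also need time windows and identity resolution across devices.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TryOnEvent:
    # Hypothetical first-party event emitted by a try-on surface.
    session_id: str
    render_id: str
    product_id: str
    kind: str  # "render", "checkout_click", or "share"


@dataclass(frozen=True)
class Order:
    # Hypothetical order record from the brand's existing commerce stack.
    order_id: str
    session_id: str
    product_id: str
    revenue: float


def attribute_orders(events, orders):
    """Credit each order to the latest render of the same product in the same session.

    Assumes `events` is in chronological order. Returns a dict of
    render_id -> attributed revenue, plus the revenue with no matching render.
    """
    last_render = {}
    for event in events:
        if event.kind == "render":
            # Later renders overwrite earlier ones: last render wins.
            last_render[(event.session_id, event.product_id)] = event.render_id

    revenue_by_render = defaultdict(float)
    unattributed = 0.0
    for order in orders:
        render_id = last_render.get((order.session_id, order.product_id))
        if render_id is None:
            unattributed += order.revenue
        else:
            revenue_by_render[render_id] += order.revenue
    return dict(revenue_by_render), unattributed
```

Last-render-wins is only one choice. Crediting the first render, or splitting credit across every render in the session, would be equally defensible, and the right rule depends on how the brand already attributes other channels.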
What comes next
The brands that move on this early get something the late movers won't, and it isn't just first-mover positioning. It's data. Every try-on session is a signal: what's being paired with what, what's being abandoned at the render step, what's converting on the first image versus the third. That data compounds, and it's the kind of signal that's hard to backfill once shoppers have settled into a competitor's surface.
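As a rough illustration of those session signals, the sketch below summarizes a batch of try-on sessions. The session layout (a list of rendered product IDs plus an optional purchased product) is a hypothetical shape chosen for the example, and "paired" is approximated here as two products rendered in the same session, which is an assumption rather than a definition from any real analytics product.

```python
from collections import Counter
from itertools import combinations


def summarize_sessions(sessions):
    """Summarize try-on sessions.

    `sessions` maps session_id -> {"renders": [product_id, ...],
                                   "purchased": product_id or None}.
    Returns (pair_counts, converting_render_position, no_rendered_purchase):
      - pair_counts: how often two products were rendered in the same session
      - converting_render_position: 1-based position of the first render of
        the purchased product (1 = converted on the first image)
      - no_rendered_purchase: sessions with renders but no purchase of a
        rendered product
    """
    pair_counts = Counter()
    converting_render_position = Counter()
    no_rendered_purchase = 0

    for session in sessions.values():
        renders = session["renders"]
        # Count each unordered product pair once per session.
        for a, b in combinations(sorted(set(renders)), 2):
            pair_counts[(a, b)] += 1

        purchased = session.get("purchased")
        if purchased in renders:
            converting_render_position[renders.index(purchased) + 1] += 1
        elif renders:
            no_rendered_purchase += 1

    return pair_counts, converting_render_position, no_rendered_purchase
```

Even a summary this simple only becomes useful with volume, which is the compounding point: the counts stabilize only after many sessions have flowed through the same surface.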
The AR-era vendors are being displaced. The question isn't whether visualization becomes a standard layer in the stack. Discovery and checkout being agentic on either side of it makes that close to inevitable. The question is who builds the layer that everything else integrates with.
Discovery moved. Checkout is moving. Visualization is the unclaimed middle. The window is open right now. Windows like this one don't usually stay open very long.
