How to improve the ROI of your AI stack with real customer context

    You’ve got every AI tool on the market, but are they delivering the value that justifies their price?

    Organizations are investing in coding assistants, AI design tools, enterprise chatbots, agentic workflows, internal copilots, and model infrastructure at the same time. Tools like Claude, ChatGPT, Cursor, Codex, and Figma Make are helping teams move faster, generate more, and automate work that used to take hours or days.

    But faster output does not automatically translate into better business performance.

    Enterprise AI usage is rising quickly, and spend is rising with it. Deloitte notes that token unit prices are falling while overall enterprise AI spend continues to increase as usage, workload complexity, and model intensity grow. McKinsey’s 2025 global survey found that 88% of organizations report regular AI use in at least one function, but only 39% report EBIT impact at the enterprise level.

    And in UserTesting’s Defensible Design in the Age of AI report, nearly two-thirds (64.7%) of designers can’t confidently say their work outcomes are better with AI.

    [Figure: Disconnect between AI adoption and outcomes]

    In short, everyone’s adopted AI, but not everyone can say it’s delivering a return on investment.

    Inside many organizations, the pattern is already visible. Teams are producing more code, more designs, more content, and more ideas than ever before. But a significant portion of that output never makes it through to impact. It gets rewritten, reworked, or quietly abandoned after review.

    The fastest way to improve AI ROI is to reduce how much generated work gets thrown away.

    Most AI systems are powerful pattern engines. They can produce work that looks complete and credible on the first pass. But without the right inputs, they optimize for what is typical, not what is true for your customers.

    As the Defensible Design report found, the number one challenge for designers using AI was that it “sounds right, but is hard to verify.”

    Customer context is the missing layer that turns AI output that merely looks correct into work teams can defend and use to prove impact.

    The impact of customer context on AI workflows

    AI tools can generate code, summarize documents, create design variations, write copy, propose product flows, and produce research plans. But unless they have access to the right context, they are often optimizing from general patterns rather than specific customer understanding.

    That is a problem for any team using AI to create customer-facing work.

    A model may know what a typical checkout flow looks like. It may not know why your customers abandon yours.

    It may know how to generate a landing page. It may not know which customer objections your sales, product, and research teams have already uncovered.

    This is where customer understanding becomes AI infrastructure.

    Customer insight gives AI systems better input:

    • real customer needs
    • recurring friction points
    • verbatims
    • behavioral patterns
    • usability findings
    • audience-specific objections
    • evidence from prior studies

    With this information, teams can identify which outcomes need to improve and how to improve them.

    The new dimensions for evaluating AI outcomes

    AI ROI is often discussed in terms of cost and productivity. Those factors matter, but they are incomplete on their own.

    A more useful way to evaluate AI is to consider five dimensions together:

    • the cost of generating work
    • the speed at which it is produced
    • the quality of the output
    • the impact on customers
    • the amount of rework required along the way

    Most organizations have visibility into the first two. The latter three determine whether AI is creating meaningful value.

    Customer context influences all of them. It improves the quality of initial output, reduces unnecessary iteration, and increases the likelihood that what is produced will perform.

    We created the following 5-step operating model to consistently evaluate these dimensions, giving us a clear and actionable method for proving the impact of the AI tools we use.

    A practical operating model for improving AI ROI

    Improving AI ROI requires changing how work flows through the AI tools you’re currently using.

    The organizations seeing the most value from AI are not simply generating more. They are increasing the percentage of AI-generated work that is usable, validated, and aligned with real customer needs. That shift can be operationalized.

    What follows is a simple model for doing that consistently.
     

    [Figure: 5-step model for operationalizing AI value]

    1. Identify your highest-cost AI workflows

    Start by creating a simple inventory of where AI is actually used.

    Pull in data from engineering, product, design, marketing, and support, and list:

    • the workflow (e.g., “generate landing pages,” “prototype new flows,” “write backend code”)
    • the tools used (ChatGPT, Claude, Cursor, etc.)
    • approximate usage (high / medium / low)
    • estimated cost (seats, tokens, or time spent reviewing output)

    Then ask:

    • Which workflows consume the most AI spend?
    • Which ones generate the most output?
    • Which ones lead directly to product or customer-facing decisions?

    Put this into a simple table. You are not trying to be perfect—you are trying to identify the 3–5 workflows that matter most.

    Decision:
     Prioritize a small set of high-cost, high-impact workflows. Ignore the rest for now.
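
    The inventory described in this step can be kept as a small table in code. The sketch below is one possible shape, not a prescribed method: the `Workflow` fields, the usage weights, the cost figures, and the rule of ranking customer-facing workflows first are all illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical weights for the high / medium / low usage estimate.
USAGE_WEIGHT = {"high": 3, "medium": 2, "low": 1}


@dataclass
class Workflow:
    name: str
    tools: list[str]
    usage: str              # "high", "medium", or "low"
    monthly_cost: float     # rough estimate: seats + tokens + review time
    customer_facing: bool   # leads directly to product or customer-facing decisions


def prioritize(workflows, top_n=5):
    """Rank customer-facing workflows first, then by cost weighted by usage."""
    def rank(w):
        return (w.customer_facing, w.monthly_cost * USAGE_WEIGHT[w.usage])
    return sorted(workflows, key=rank, reverse=True)[:top_n]


# Illustrative numbers only; replace with your own estimates.
inventory = [
    Workflow("generate landing pages", ["ChatGPT"], "high", 4000.0, True),
    Workflow("prototype new flows", ["Claude"], "medium", 2500.0, True),
    Workflow("write backend code", ["Cursor"], "high", 9000.0, False),
    Workflow("summarize meeting notes", ["ChatGPT"], "low", 300.0, False),
]

for w in prioritize(inventory, top_n=3):
    print(f"{w.name:<26} {w.usage:<7} {w.monthly_cost:>8.0f}")
```

    A spreadsheet works just as well. The point is to make the ranking rule explicit so the team can agree on which 3–5 workflows to examine.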

    2. Measure output quality, not just activity

    For each prioritized workflow, audit what happens after AI generates something.

    Take a recent sample (last 1–2 weeks) and review outputs with the team that uses them.

    Track:

    • Was the output used as-is, lightly edited, or heavily rewritten?
    • How many revision cycles did it go through?
    • How much time was spent reviewing or fixing it?
    • Did it move forward or get abandoned?

    You can capture this in a simple scoring system:

    • 3 = used as-is
    • 2 = minor edits
    • 1 = major rewrite
    • 0 = discarded

    Then ask:

    • What % of outputs are actually usable?
    • Where are we spending the most time fixing AI output?
    • Which workflows consistently produce weak first drafts?

    Decision:
     Identify the workflows with high rework and low acceptance. These are your biggest ROI opportunities.
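
    The 0–3 scoring system above can be turned into per-workflow rates with a few lines of code. This is a minimal sketch: treating a score of 2 or 3 as "usable" is an assumption, and the sample scores are invented for illustration.

```python
from collections import Counter

SCORE_LABELS = {3: "used as-is", 2: "minor edits", 1: "major rewrite", 0: "discarded"}


def summarize(scores):
    """Summarize a sample of 0-3 scores for one workflow."""
    if not scores:
        raise ValueError("need at least one scored output")
    if any(s not in SCORE_LABELS for s in scores):
        raise ValueError("scores must be 0, 1, 2, or 3")
    counts = Counter(scores)
    n = len(scores)
    return {
        "n": n,
        # Assumption: "usable" means used as-is or with minor edits.
        "usable_rate": (counts[3] + counts[2]) / n,
        "discard_rate": counts[0] / n,
        "mean_score": sum(scores) / n,
    }


# Invented sample: one score per reviewed output from the last 1-2 weeks.
sample = {
    "generate landing pages": [3, 2, 1, 1, 0, 2, 1, 0],
    "write backend code": [3, 3, 2, 2, 1, 2, 3, 2],
}

for name, scores in sample.items():
    s = summarize(scores)
    print(f"{name:<24} usable={s['usable_rate']:.0%} discarded={s['discard_rate']:.0%}")
```

    The output of this step is the baseline that Step 5 compares against, so it is worth storing alongside the date and sample size.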

    3. Diagnose the context gap

    Now look at the workflows with the highest rework and ask a different question:

    What did the AI know when it generated this?

    For each workflow, review:

    • the prompts or inputs used
    • the brief or instructions given
    • what customer information (if any) was included

    Ask:

    • Did this include real customer evidence (verbatims, research, usability findings)?
    • Or was it based on general instructions and internal assumptions?
    • Are teams relying on “best practices” instead of known customer behavior?
    • When does customer feedback enter the process—before or after generation?

    Document this simply:

    • “Context used: none / limited / strong”
    • “Customer evidence included: yes / no”

    Patterns will emerge quickly.

    Decision:
     Flag workflows where output quality is low AND customer context is weak or missing. These are context gaps you can fix.
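
    The flagging rule in this step (low quality AND weak or missing context) can be written down explicitly. The sketch below assumes a usable rate has already been measured for each workflow; the 0.5 threshold, the field names, and the audit rows are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContextAudit:
    workflow: str
    usable_rate: float        # share of outputs used as-is or lightly edited
    context_used: str         # "none", "limited", or "strong"
    customer_evidence: bool   # were verbatims, research, or findings included?


def is_context_gap(audit, usable_threshold=0.5):
    """Flag a workflow only when quality is low AND customer context is weak."""
    low_quality = audit.usable_rate < usable_threshold
    weak_context = audit.context_used in ("none", "limited") or not audit.customer_evidence
    return low_quality and weak_context


# Invented audit rows for illustration.
audits = [
    ContextAudit("generate landing pages", 0.375, "none", False),
    ContextAudit("write backend code", 0.875, "limited", False),
    ContextAudit("prototype new flows", 0.40, "strong", True),
]

flagged = [a.workflow for a in audits if is_context_gap(a)]
print(flagged)
```

    A workflow with low quality but strong context (the third row) is deliberately not flagged: its problem probably lies elsewhere, such as the tool or the brief itself.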

     

    4. Inject customer context upstream

    For the workflows you flagged, change how inputs are created before AI is used.

    Start by gathering existing customer evidence:

    • recent user interviews
    • usability findings
    • support tickets
    • common objections from sales
    • behavioral patterns or drop-off points

    Then standardize how that context is used.

    For example:

    • Add a “customer context” section to product briefs
    • Require prompts to include 3–5 real customer insights
    • Create reusable context blocks (e.g., “Top onboarding friction points”)

    Instead of:

    “Generate a signup flow”

    Teams should be working from:

    “Generate a signup flow using these known user frustrations and behaviors”

    You do not need new research to start—use what you already have.

    Decision:
     Update workflows so that AI is always prompted with customer evidence, not just instructions.
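
    One way to enforce the "3–5 real customer insights" rule is a small helper that refuses to build a prompt without them. This is a sketch under stated assumptions: `build_prompt` is a hypothetical helper, and the insights shown are placeholders, not real findings.

```python
def build_prompt(task, insights, min_insights=3, max_insights=5):
    """Combine a task with a reusable customer-context block."""
    cleaned = [i.strip() for i in insights if i.strip()]
    if len(cleaned) < min_insights:
        raise ValueError(
            f"need at least {min_insights} customer insights, got {len(cleaned)}"
        )
    lines = [task.strip(), "", "Customer context (from existing research):"]
    lines += [f"- {insight}" for insight in cleaned[:max_insights]]
    lines += ["", "Base the output on the customer context above."]
    return "\n".join(lines)


# Placeholder context block; replace with findings from your own
# interviews, usability studies, support tickets, or sales notes.
TOP_ONBOARDING_FRICTION = [
    "[placeholder] friction point observed in recent user interviews",
    "[placeholder] drop-off point seen in behavioral data",
    "[placeholder] common objection raised in sales calls",
]

prompt = build_prompt("Generate a signup flow", TOP_ONBOARDING_FRICTION)
print(prompt)
```

    Keeping context blocks as named, reusable values means they can be reviewed and updated in one place when new research comes in.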

     

    5. Validate early and reduce rework

    Once output improves, introduce validation before further investment.

    For each workflow:

    • Select a small set of AI-generated outputs (designs, flows, content)
    • Test them with real or representative users
    • Compare variations where possible

    Ask:

    • Do users understand this?
    • Where do they struggle?
    • Which version performs better, and why?

    Track:

    • number of revision cycles before approval
    • time to final decision
    • major issues found before vs after launch

    Then compare against your baseline from Step 2.

    Decision:

    • Stop investing in outputs that fail validation early
    • Scale the workflows where validated outputs move forward with fewer revisions

    Over time, you should see:

    • fewer discarded outputs
    • faster approvals
    • less downstream rework
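
    Comparing against the Step 2 baseline can be as simple as a relative change per metric. The sketch below uses hypothetical field names and invented numbers. Negative values mean the metric went down, which is the desired direction for all three.

```python
from dataclasses import dataclass

METRICS = ("revision_cycles", "days_to_decision", "discard_rate")


@dataclass
class WorkflowMetrics:
    revision_cycles: float    # average cycles before approval
    days_to_decision: float   # average time to final decision
    discard_rate: float       # share of outputs discarded


def relative_change(before, after):
    """Return (after - before) / before; negative means a reduction."""
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return (after - before) / before


def compare(baseline, current):
    return {
        m: relative_change(getattr(baseline, m), getattr(current, m))
        for m in METRICS
    }


# Invented numbers for illustration.
baseline = WorkflowMetrics(revision_cycles=4.0, days_to_decision=10.0, discard_rate=0.25)
current = WorkflowMetrics(revision_cycles=3.0, days_to_decision=8.0, discard_rate=0.125)

for metric, change in compare(baseline, current).items():
    print(f"{metric:<18} {change:+.0%}")
```

    Small samples will be noisy, so it helps to record how many outputs each figure is based on before drawing conclusions.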

    How MCP enables this workflow

    Model Context Protocol, or MCP, is an open standard that allows AI tools to connect to external services and data sources.

    For UserTesting, MCP creates a way to bring real customer understanding closer to the AI tools teams already use — including tools such as Claude, ChatGPT, Figma Make, Cursor, and other MCP clients.

    That changes the AI workflow in three important ways.

    1. AI can work from customer evidence, not just general patterns

    Instead of prompting an AI tool from memory or assumptions, teams can bring in real UserTesting findings, customer themes, verbatims, and behavioral patterns.

    That means the first draft, design direction, prototype, or product brief can start from a stronger understanding of what customers actually think, feel, and do.

    2. Teams can move from question to test faster

    If a team does not have enough evidence yet, MCP-supported workflows can help move from a question to study creation faster. Teams can define what they need to learn, generate or refine a test plan, and connect to recruiting through User Interviews.

    That matters because AI-generated work needs fast validation. Otherwise, teams can create faster than they can learn.

    3. Insights can show up where decisions happen

    Insights create more value when they reach teams at the moment of decision. MCP helps bring UserTesting closer to the workflows where product, engineering, design, and marketing teams are already generating ideas, reviewing data, and making tradeoffs.

    This is the shift from UserTesting as a research destination to UserTesting as a customer-context layer inside AI-assisted work.

    Request MCP early access

    Use UserTesting’s MCP server to recruit participants, create studies, and launch tests from Claude, ChatGPT, Figma Make, and other AI clients.

    [Figure: AI ROI by role]

    Make customer understanding part of your AI infrastructure

    AI tools are becoming embedded in how organizations build, design, decide, and communicate. The return from those tools depends on the quality of the inputs they receive and the feedback loops that shape their output.

    Improving AI ROI does not start with buying more tools or limiting usage. It starts with increasing the percentage of work that is useful, validated, and aligned with customer needs.

    UserTesting’s MCP functionality helps bring real customer understanding into AI-assisted workflows so teams can generate stronger first drafts, test ideas faster, recruit the right participants, and make decisions grounded in what customers actually think, feel, and do.

    That is how organizations move from AI output to AI impact.