
Research as Organizational Intelligence: A playbook for research leaders in the AI era

This UserTesting playbook was written in collaboration with Aarron Walter and Eli Woolery, hosts of the Design Better podcast and Substack, based on interviews conducted with multiple research leaders from some of the world's most innovative companies.
Executive Summary
This playbook is for heads of research and the design leaders who partner with them. If you lead a user research function today, you are almost certainly feeling three pressures at once: Product teams move faster than your studies can keep up with. AI tools put research-adjacent capabilities in the hands of people who have never run a study. A seat at the strategy table is harder to claim than it used to be.
That combination has quietly broken the service model most research teams still operate inside. When a researcher waits to be asked, designs a careful study, and delivers a polished report, they often arrive after the decision has already been made.
The service model is no longer fit for purpose, and AI is both an accelerant of its decline and an opportunity to get out of it. The teams that are succeeding are evolving from a function that answers questions to one that shapes the questions the organization asks.
This is a playbook based on learnings from companies that have successfully made the shift from service delivery to strategic intelligence. It is organized around three capabilities every modern research function needs: processes that let research operate at decision speed without losing judgment; tools that keep insight findable and useful long after a study ends; and people practices that turn research into an organizational habit. A closing chapter operationalizes everything into a ninety-day plan.
A NOTE ON SOURCING
This report is based on in-depth conversations with four research leaders actively shaping practice in the AI era:
Michael Margolis (formerly Google Ventures, author of Learn More Faster), Danika Patrick (Carta), Jess Holbrook (Microsoft AI), and Prakriti Parijat (Instacart).
Where a practice or play is attributed to one of them, that attribution appears in the text. Direct quotes are attributed inline. Case studies are drawn from the interviewees’ first-person accounts of work at the companies named. The markers of excellence, plays, pitfalls, and ninety-day plan are our synthesis of what emerged across the interviews, shaped by our own observations of research leaders we have worked with over the past two years. Any errors of interpretation are ours.
Chapter 1: The Shift
From service delivery to strategic intelligence
The new operating model for research, and why most teams are still running the old one.
Why the old service model is breaking down
In the service model, research operates as a vendor to the rest of the organization. A product manager has a question, commissions a study, and receives findings on a timeline that runs parallel to but separate from the decision the findings were meant to inform. The researcher is skilled, the study is rigorous, and the deliverable is thorough. The product manager decides whether to use it.
This model produces excellent work. However, it also produces three predictable challenges.
- The team has already moved on by the time the research arrives.
- Findings are delivered in “user language” in rooms that speak business language.
- The insights generated by each study disappear once the project ends, leaving the next team to relearn what the last one already knew.
None of these challenges is new. What has changed is how much they now cost. AI has compressed product cycles, democratized prototyping, and raised the number of decisions a team can make in a quarter. In that environment, a research function that waits to be asked loses influence faster than it can build it.
Consider the most common experience in a service-model team. A planning meeting happens in Q3. The product roadmap for the next year is being shaped. The research leader is not in the room. A few weeks later, the research team is asked to validate a direction that is already mostly decided. They run a careful study, surface a real concern, and deliver the findings. The concern is acknowledged and filed. The roadmap ships on schedule because the momentum is already too strong to redirect.
The cost is not the individual study. It is the compounding loss of influence that happens when research is consistently too late to change the decision. Over time, the team that could have prevented a bad bet becomes the team that documents why it failed.
What strategic intelligence looks like instead
A research function operating as strategic intelligence does three things the service model does not.
It shapes questions before decisions are locked. When a planning conversation starts, the research leader is either in the room or has already made sure the people in the room have what they need. The research that matters most is not the study that validates a direction; it is the synthesis that informs which directions are worth pursuing in the first place.
It translates findings into business consequences. A user insight is the beginning of the work, not the end of it. The researcher who can say “this friction is costing us conversion in the segment that matters most” operates in a different category than the researcher who stops at “users find this confusing.”
It makes knowledge durable and accessible. A body of research compounds in value only if it is easily accessed. Teams that invest in making past studies findable and reusable can answer new questions in minutes that would otherwise take weeks to commission. That speed is itself a strategic capability.
“The moment research becomes strategic is usually one specific conversation with one specific person who suddenly gets it. You are not trying to convert the whole organization. You are trying to find that person and give them something they can use.”
— Danika Patrick, Design Research and Strategy Lead, Carta
How the rest of this playbook is organized
The next three chapters cover the capabilities that make strategic intelligence possible: processes, tools, and people. Each chapter follows the same structure: what the capability is, why it matters now, what excellence looks like in practice, the plays that get you there, and the pitfalls to avoid. The final chapter operationalizes everything into a ninety-day plan.
Chapter 2: Processes
What modern research workflows look like
Research at decision speed, without losing the judgment that makes findings trustworthy.
For most of the field's history, the research process ran in a predictable straight line. A question was formed, a study was designed, participants were recruited, sessions were run, data was analyzed, findings were written up, and a report was delivered. Then the next question began the cycle again.
AI has not eliminated any of those steps, but it has compressed every one of them. What used to be a six-week sequence can now be a one-week iterative loop, with several rounds of learning happening inside a single product sprint. The research function that takes advantage of that compression can move at the same speed as design and engineering. The one that does not falls further behind with each cycle.
Speed creates new research risks
Fast tools make it easy to confuse motion with progress. A team can now run a study in a day, iterate on a prototype between sessions, and produce a synthesis overnight. That velocity is a real asset. It is also a new source of risk, because the judgment once built into a slower process now has to be applied deliberately at speed.
Markers of excellence in modern research workflows
The processes described in this chapter are the ones we saw consistently at teams that are running at AI-era velocity without sacrificing the quality of the decisions their research informs.
Research happens before decisions are locked. Planning conversations include a researcher who can bring existing knowledge to the discussion without commissioning a new study. Research is an input to how priorities are set, not an audit of priorities that are already set.
Research and design iterate together. A concept tested on Monday is modified on Tuesday and retested by Wednesday. The prototype is a conversation tool for a research session, not a finished artifact requiring a separate design cycle before it can be tested. Design and research are working on the same question at the same time.
Fidelity matches the question. Low fidelity prototypes are used for discovery, where the goal is a reaction to the concept. Medium or high fidelity is used for refinement, where participants already have a mental model of the product and the question is about improvement. Teams do not default to high fidelity just because AI makes it easy.
Recruitment is precise. The team can describe the target participant specifically enough to recruit against that description unambiguously. For example, not “enterprise software users,” but “mid-market ops managers at companies with 200 to 500 employees who currently use Excel for workflow management.” A well-recruited sample of five produces more actionable findings than a loosely recruited sample of fifteen.
AI scales execution only after human calibration. On a study running thirty or more participants, the researcher runs the first several sessions themselves before handing off to AI-assisted moderation. Those researcher-led sessions build the reference point needed to recognize when AI output is trustworthy and when it is not.
Research is embedded in the documents where decisions get made. Findings are referenced inside product change requests, strategic planning memos, and roadmap documents. Research is not a parallel deliverable to be consulted; it is an input to the document where the decision is written.
What excellence looks like in practice
Splunk - Danika Patrick (currently Design Research and Strategy Lead at Carta)
When Splunk acquired a network performance monitoring company, the working assumption was that enterprise customers wanted richer diagnostic data about their networks. Danika Patrick’s team talked to users before the roadmap was locked.
What they found reframed the problem. Even when the tool correctly identified that the network was broken, customers could not act on that information. What they needed was not more diagnostic detail, but a clear signal that told them where to look next. The product shifted from displaying data to surfacing a clear actionable signal.
This is a marker of excellent process: research was present at the moment the acquisition strategy was being shaped, with a question that had not yet been locked. The intervention changed what got built, not just how well it performed after launch.
Plays
The Bullseye Customer Sprint
Before any recruiting begins, define the target customer with enough precision to recruit against the description unambiguously. Use screener tools to apply inclusion and exclusion criteria with rigor. Five well-recruited sessions, run over a single day with the team watching live, will produce more actionable findings than a larger loosely recruited study.
Margolis, Google Ventures; developed further in Learn More Faster.
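One way to make the target definition unambiguous is to write the screening rules down as explicit checks before recruiting starts. The following is a minimal sketch, not a feature of any particular screener tool. It assumes a hypothetical `Candidate` record and reuses the example profile from the markers of excellence above (ops managers at companies with 200 to 500 employees who use Excel for workflow management). The field names and the exclusion rule are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical screener fields; adapt to whatever your screener collects.
    role: str
    company_size: int
    workflow_tool: str
    works_in_research_or_design: bool = False  # assumed exclusion criterion


def is_bullseye(c: Candidate) -> bool:
    """Return True only if every inclusion rule passes and no exclusion rule fires."""
    included = (
        c.role.lower() == "ops manager"
        and 200 <= c.company_size <= 500
        and c.workflow_tool.lower() == "excel"
    )
    excluded = c.works_in_research_or_design
    return included and not excluded


candidates = [
    Candidate("Ops Manager", 350, "Excel"),
    Candidate("Ops Manager", 5000, "Excel"),       # company too large
    Candidate("Ops Manager", 250, "Excel", True),  # removed by the exclusion rule
]
recruits = [c for c in candidates if is_bullseye(c)]
```

If two people on the team would write different rules for the same target, the description is not yet precise enough to recruit against.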
Match Fidelity to Purpose
For discovery, use low- or medium-fidelity prototypes. The job is to provoke a reaction to the concept, not evaluate the execution. For refinement of an existing product, higher fidelity helps participants react to the specific change being tested. Connect your AI prototyping tools to your design system via MCP (Model Context Protocol) so refinement prototypes look like the real product.
The In-Session Iteration
Build a rapid prototype before a moderated session. If a concept is not landing during the session, modify the prototype based on what the participant says, and retest the revised version before they leave. The research loop compresses from weeks to hours, and the prototype earns its keep as a conversation tool.
The Hybrid Study
On any study where the target sample size is thirty or more, the researcher runs the first five sessions themselves to calibrate on the protocol, develop intuition for the data, and surface any problems in the discussion guide. The remaining sessions scale using AI-assisted moderation tools. The first five sessions are not a warm-up; they are the quality infrastructure for everything that follows.
Holbrook, Microsoft.
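The hand-off rule in this play can be expressed as a simple session plan. This is a sketch under stated assumptions: the function name and labels are hypothetical, the thirty-participant threshold and five calibration sessions come from the play as described, and treating smaller studies as fully researcher-led is our assumption rather than something the play specifies.

```python
from typing import List


def plan_sessions(sample_size: int,
                  calibration_sessions: int = 5,
                  scale_threshold: int = 30) -> List[str]:
    """Label each session with who moderates it.

    Assumption: studies below the threshold stay fully researcher-led.
    """
    if sample_size < scale_threshold:
        return ["researcher"] * sample_size
    led = min(calibration_sessions, sample_size)
    # Researcher-led sessions always come first so calibration precedes scale.
    return ["researcher"] * led + ["ai_assisted"] * (sample_size - led)


plan = plan_sessions(30)
```

The ordering is the point: no AI-assisted session is scheduled until the researcher-led sessions have been run.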
The Two-User Gut Check
When a team cannot wait for a full study, run two sessions with well-targeted participants and frame the output explicitly as directional input rather than validated evidence. Two participants will surface the most obvious problems with a concept. The alternative on the table is typically no sessions at all. This play exists to keep research present in conversations that would otherwise happen without any user signal.
The Self-Serve Research Layer
Document the team’s standards for common research activities — discussion guide creation, concept test design, screener writing — in a format that designers and product managers can follow when running their own fast tests. The research team sets the quality bar and reviews outputs; the infrastructure scales execution to teams researchers would not otherwise reach.
Parijat, Instacart.
Pitfalls
Speed Without Judgment
The rapid test-and-iterate loop is productive when it produces learning and unproductive when it produces iteration for its own sake. Every round of testing should end with a researcher asking whether the concept is worth iterating on at all. Teams that only know how to run the loop are not equipped to know when to break it.
Concreteness Bias From Polished Prototypes
When a prototype looks finished, participants evaluate it as a finished product. They anchor on the interface rather than the idea underneath it. Use low fidelity when the question is about the concept, regardless of how easy high fidelity has become to produce.
The Incrementality Trap
Fast iteration naturally biases toward small improvements to the current concept. A research function that only runs the loop tends to refine in place, even when the right answer is to step back and question whether the underlying approach is sound.
Confusing Directional Input With Validated Evidence
Two-user gut checks and in-session iterations are enormously useful when framed correctly and dangerous when they are not. A pattern observed with two participants is a hypothesis. A pattern observed with thirty well-recruited participants is something closer to a finding. Teams that blur this distinction lose the credibility that makes either useful.
Skipping Calibration Before Scale
AI-assisted moderation and synthesis can produce output that looks thorough and is wrong in ways a casual reader will not detect. The only reliable protection is a researcher who has done enough of the work themselves to recognize what good output looks like. Handing the full study off to automation without first running sessions yourself is the fastest path to plausible-sounding findings that mislead the team.
Chapter 3: Tools
What excellent insight infrastructure looks like
Research infrastructure that is alive, not archival.
By tools we mean the infrastructure that makes research useful after the study ends: the repository where findings live, the tagging and search systems that make them retrievable, the connections between qualitative insight and quantitative signal, the channels through which findings reach the people who can act on them, and the AI capabilities that make all of the above work at scale.
Most research teams have a knowledge problem they misdiagnose as a volume problem. They assume that running more studies will give the organization more insight. In practice, the bottleneck is almost never the amount of research. It is the accessibility of the research already on hand.
Why it matters now
AI has changed what is possible here in a way that is hard to overstate. A team with five years of interview transcripts and a hundred studies on file can now answer questions across the full archive in minutes. Patterns that would have taken a team a week of manual review to find can be surfaced on demand. The ceiling on what a research team can offer the rest of the organization has risen sharply.
That capability matters because the alternative is still, at most companies, what Jess Holbrook describes as institutional amnesia: research that costs real time and money to generate, gets used once, and then effectively disappears. The same users are recruited six months later to answer a question the earlier study had already answered, and nobody knows the first study exists.
“The most valuable thing I can do in the next hour is not run another study. It is to make sure the study we ran six months ago is answering the question being asked in the product planning meeting right now.”
— Jess Holbrook, Partner Head of UX Research, Microsoft AI
Markers of excellence in insight infrastructure
Insights are searchable and reusable. A product manager preparing for a strategy session can answer the question “what do we already know about this segment?” in minutes, not days. Past research is a first stop, not a last resort.
Findings are stored as reusable knowledge. The repository holds insights as discrete entities separate from the reports they came from, along with the study materials, raw video clips, and quotes that let a future reader understand how the insight was generated and replicate the method.
Research is accessible in the flow of work. People who need a research finding can get it in the tools they already use — Slack, the product change request template, the planning doc — without learning a new system or filing a request with the research team.
Qualitative and quantitative signals are connected. An NPS dip triggers a qualitative investigation rather than standing alone as a conclusion. Researchers can pull behavioral data directly, without routing every question through a data science partner, and combine it with qualitative findings in a single synthesis.
Knowledge is maintained, not just archived. Someone is responsible for the repository. Outdated findings are flagged or retired. Contradictions between studies get surfaced and reconciled. Research knowledge that is not actively maintained depreciates, and untended archives become unreliable.
What excellence looks like in practice
Instacart - Research on Demand via Slack · Prakriti Parijat (Head of Research & Insights)
After Instacart’s research team downsized from forty researchers to seventeen, Prakriti Parijat’s team built a research agent running on the company’s internal AI platform with a Slack integration. Anyone in the company — in marketing, policy, operations, or sales — can ask a question in a dedicated channel and receive an answer drawn from the full research archive, with the source researcher tagged for follow-up.
The marker of excellence is not the tool. It is that existing knowledge becomes accessible in the flow of work, without requiring a researcher to answer every question manually. Before the agent, researchers summarized past findings by hand every time a new stakeholder asked. That work has been eliminated for first-pass queries, and research moved from a support ticket to a piece of infrastructure.
Plays
The Knowledge Repository
Build a repository where past research is searchable, tagged, and actually used. Store raw video clips and verbatim quotes alongside reports. Keep the tagging taxonomy small at the start — five tags, not fifty — and optimize for retrieval speed over comprehensive capture. If your team has a Research Operations function, they own the maintenance. If you do not, assign the responsibility explicitly to one researcher with dedicated time; the repository will not maintain itself.
What Tools Should We Be Building?
Schedule a recurring conversation with the team about what is missing from the workflow — not what vendors offer, but what the team actually needs. A synthesis prompt library, a pattern-detection script, a stakeholder-specific report template. Researchers who can identify the gap and prototype the solution occupy a new kind of organizational power.
Holbrook, Microsoft.
The Connected Data Layer
Integrate live quantitative sources — NPS, product analytics, support ticket themes — into the research workflow. The goal is for quantitative signals to trigger qualitative investigation. When the NPS dip hits, the team should already have a working hypothesis from existing qualitative context and know whether to commission a new study or pull from the archive.
The Research Query Channel
Stand up a Slack channel where anyone in the company can query the research archive through an AI assistant. Pre-index the archive, tag the source researcher in every response so retrieval starts a conversation, and let the channel grow by word of mouth.
Parijat, Instacart.
The Analysis Skills Library
Build a shared library of structured AI prompts for synthesis and pattern recognition, encoding the team’s standards into the base prompts. Require every researcher to fork and personalize their copy. Review the library quarterly. The goal is AI output that sounds like the individual researcher wrote it, not a template.
Holbrook, Microsoft.
The Research Podcast
Use an AI audio tool to generate a conversational summary of a recent research sprint — ten to fifteen minutes, distributed via Slack. The format reaches people who will never read a deck and creates ambient awareness of what research is learning.
Patrick, Carta.
AI-Assisted Synthesis, Human Judgment
The workflow that holds up under scrutiny: researchers conduct the sessions and identify the themes, AI produces a first-draft synthesis based on those themes, and the researcher reviews critically against the source material before anything is shared. The test is whether every claim in the AI output can be defended by pointing to a specific participant who said it.
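The traceability test can be partly mechanized as a first pass before the researcher's critical review. This is a minimal sketch with hypothetical names and data. It assumes the AI draft has been broken into claims, each listing the participant IDs it cites. It only checks that a citation exists and points at someone in the study. Confirming that the participant actually said it remains the researcher's job.

```python
from typing import Dict, List, Set


def unsupported_claims(claims: Dict[str, List[str]],
                       participants: Set[str]) -> List[str]:
    """Return claims that cite nobody, or cite someone outside the study."""
    flagged = []
    for claim, cited in claims.items():
        if not cited or any(p not in participants for p in cited):
            flagged.append(claim)
    return flagged


# Illustrative data only; not drawn from any real study.
study_participants = {"P1", "P2", "P3"}
draft_claims = {
    "Users abandon setup at the permissions step": ["P1", "P3"],
    "Users want a mobile app": [],    # no source cited
    "Pricing is confusing": ["P7"],   # P7 was not in this study
}
flagged = unsupported_claims(draft_claims, study_participants)
```

Anything in `flagged` goes back to the source material before the synthesis is shared.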
Pitfalls
The Repository Nobody Uses
A research repository fails not because the taxonomy is wrong, but because contributing feels like more work than it returns. With AI-powered search, you don't need elaborate tagging — you need researchers to actually put things in. Optimize ruthlessly for the act of deposit: raw notes, a brief summary, a link to the source. That's enough.
Overcomplicated Taxonomies
Related to the repository problem, but worth naming separately. Complex tagging structures make sense on a whiteboard and fail in practice. Modern AI retrieval works well against lightly tagged material. Invest less in the ontology and more in the quality of what is stored.
Aggregation Without Curation
Connecting NPS scores, support tickets, and product analytics to the research repository gives you a bigger pile of data, not a better understanding of users. Someone still has to decide what the quantitative signal means in the context of what the team knows about the user. AI can assist, but cannot replace the human judgment that makes the connection.
Stale Knowledge Treated as Current Truth
A research finding from two years ago may or may not still be accurate. Without maintenance, it gets cited as current, and decisions get made on evidence that has quietly expired. Someone needs to own the question of what is still true.
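A repository owner could surface candidates for re-validation with a simple age check. The sketch below is illustrative only: the `Finding` record, the `last_reviewed` field, and the one-year review window are assumptions, and the right window will vary by how fast the product and market move.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Finding:
    summary: str
    last_reviewed: date  # hypothetical field: when someone last confirmed it


def needs_review(findings: List[Finding], today: date,
                 max_age_days: int = 365) -> List[Finding]:
    """Return findings whose last review is older than the allowed window."""
    cutoff = today - timedelta(days=max_age_days)
    return [f for f in findings if f.last_reviewed < cutoff]


# Illustrative data only.
archive = [
    Finding("Admins distrust automated alerts", date(2023, 5, 1)),
    Finding("Setup stalls at the permissions step", date(2025, 1, 15)),
]
stale = needs_review(archive, today=date(2025, 6, 1))
```

A flagged finding is not necessarily wrong. The flag only tells the owner which findings to re-check before they are cited as current.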
Relying on Decks Instead of Durable Formats
The meeting ends and the deck gets filed. Three weeks later nobody can remember whether the key finding was on slide 14 or slide 22. Invest in formats that outlive the meeting: short video clips, a Slack canvas that becomes the standing reference, an audio summary that reaches people on their commute.
Chapter 4: People
What excellent research leadership looks like
Leadership practices that turn a research function into an organizational habit.
By people we mean the leadership practices and individual capabilities that make a research function influential over time. This includes what the head of research does to build the function’s standing in the organization, and what individual researchers do to translate findings into decisions and maintain quality judgment in AI-assisted workflows.
These practices are the most transferable part of the playbook. Tools change. Processes evolve. The underlying work of building relationships, speaking business language, and protecting the quality of research judgment is durable, and it is where the most impactful changes tend to sit.
Why it matters now
As AI takes over more of the execution, the relative value of the work it cannot do rises. Running a study is increasingly commoditized. Knowing which study to run, which findings will change the decision, and how to make those findings impossible to ignore is not. Leadership capability becomes a larger share of what research contributes, not a smaller one.
The research functions we saw thriving were led by people who treated influence as a craft. They were deliberate about where they spent their time, careful about how they framed findings for different audiences, and clear-eyed about the difference between producing good research and being present when it mattered.
Markers of excellence for research leaders
Leaders understand how decisions actually get made. They know the real flow of authority in the organization, not just the org chart. They know when options close and who shapes them before they do. They attend planning meetings, ask sponsors what keeps them up at night, and time their work to land when decisions are still open.
Leaders build allies and visible wins. They spend early-tenure energy on the teams and individuals who already believe in research, deliver for them memorably, and document the business impact in language that travels. The PM who championed a research project tells the story of how it changed the roadmap. That story reaches rooms the research leader will never enter.
Leaders set quality standards as AI absorbs more execution. They decide which tasks humans continue to do. They sample AI output, require verification on high-stakes findings, and keep the team doing enough hands-on work that the calibration needed to review AI synthesis does not erode. This is quality guardianship at the level of the function, not the individual study.
Leaders build organizational habits, not just project outputs. They invest in recurring practices — a standing Watch Party, a regular insight digest, a query channel — that compound in influence over time. A one-off study is an event. A habit is an infrastructure.
Markers of excellence for individual researchers
Researchers translate findings into business implications. For every user insight, they write the corresponding business sentence. “Users struggle to find X” becomes “this friction is likely contributing to the drop-off we see at this step.” Findings that connect to revenue, retention, or risk get acted on. Findings that stop at user pain do not.
Senior researchers maintain quality judgment in AI-assisted workflows. They continue to conduct sessions, tag their own data, and write their own synthesis often enough to stay calibrated. They do not approve AI outputs they have not stress-tested against source material. Quality guardianship is a perishable skill, and the practice required to maintain it is part of the job, not a distraction from it.
Researchers synthesize across multiple signal types. They bring behavioral data, qualitative findings, market research, and experiment history into a single point of view. The synthesis that used to require a week of cross-team coordination now takes a morning when the researcher has the fluency to direct it. That integrated perspective is what makes a researcher the most valuable person in a strategic planning conversation.
“Researchers can now tap into behavioral data, market research, what competitors are doing, even what experiments we’ve run inside the company and what we learned. It gives them a multitude of signals to come with a very sharp point of view. The researchers who can do that are the ones who will succeed in the next two years.”
— Prakriti Parijat, Head of Research, Instacart
Researchers offer the reframe. The most irreplaceable thing a senior researcher does is help someone see a problem differently: “here is the bigger picture you are not seeing.” This is not pattern-matching across transcripts. It requires understanding the organizational context, the unstated assumptions, and the history of how the problem was framed in the first place.
“The most important thing — the best advice a senior researcher gives — is the reframe. It is the ‘here is the bigger picture you are not seeing.’ I do not get that from AI yet.”
— Jess Holbrook, Partner Head of UX Research, Microsoft AI
“Before you build anything, you really need to go observe someone — an allergy parent shopping for a gluten-free product in the store. I don’t think there is a way to bypass that quickly. You have to go, spend the time, get that nuanced insight, come back, and then build something.”
— Prakriti Parijat, Head of Research, Instacart
What excellence looks like in practice
Google Ventures - The Watch Party as a standing practice
At Google Ventures, Michael Margolis runs the Watch Party as a recurring organizational practice, not a one-off event. Founders, product leads, and engineers observe user interviews live, with structured note-taking roles and a debrief after each session. Team members form their own interpretations, surface disagreements in the debrief, and leave with a shared vocabulary that changes how they make decisions for weeks afterward.
The marker of excellence is that research became an organizational habit rather than a deliverable. The researcher’s role shifted from author of a report to curator of a shared experience. That kind of standing compounds over time in a way that no single study can.
What excellence looks like in practice
Instacart - Research as an internal AI consultant
When Instacart’s research team began integrating AI into its workflow, Prakriti Parijat’s team documented every step in granular detail: how to write a synthesis prompt that does not flatten nuance, how to structure a discussion guide for AI-assisted sessions, how to calibrate AI output against human-reviewed samples.
When product and design teams started asking how to integrate AI into their own work — a wave that arrived roughly six months later — research was ready. The team ran internal workshops using their own workflows as the curriculum, and research became the organizational guide to AI-assisted practice for functions it had never previously touched.
“We played on our superpower of understanding workflows — breaking a process into the steps an agent can actually follow — and showed product and design teams where AI could really help them scale. Eventually every team was coming to us, so we had to package it as a playbook they could run themselves.”
— Prakriti Parijat, Head of Research, Instacart
Plays
Find Allies & Build Credibility Before You Need It
Start early tenure with the teams already open to research: deliver for them visibly, document the impact in language that travels, and use those wins to expand scope. The PM who became a champion is your best advocate with the next skeptical VP. Through all of it, be honest about uncertainty: “I don’t know yet” is a trust-building act, not a weakness. Follow up explicitly when a research prediction proves right or wrong. Over time, a visible track record creates the organizational precondition for research to be taken seriously before the decision is already made.
Patrick, Carta.
Know the Decision-Making Landscape
Treat the informal organization as something to be mapped. Learn who approves decisions and who shapes them. Learn the meeting cadence that sets priorities. Time your work to land when the decision is still open.
The Executive Alignment Tour
Establish a standing check-in, at least monthly, with the most strategically positioned executive you can reach. Frame it as a research intake conversation. Bring a brief summary of what research has been learning and a list of open questions about the company’s highest-priority bets. The goal is to be in the room where priorities form, not informed about them afterward.
Additionally, twice a year, hold individual conversations with senior leaders — CPO, CMO, CEO — about where the company is headed and what is keeping them up at night. Bring a brief summary of what research has been learning. This is a listening session, not a reporting session. The output is better prioritization and a relationship that earns research a seat at the planning table.
Patrick, Carta.
The Watch Party
Run it as a standing practice, not a one-time event. A cross-functional team observes sessions live, takes structured notes, and debriefs immediately. The shared experience produces a shared vocabulary. The researcher becomes the curator of that experience, which is a more durable form of influence than delivering a report.
Margolis, Google Ventures.
The Insight Digest
A short, regular summary of recent findings, calibrated to the audience. For skeptics, include methodology notes and confidence levels. For champions, give the headline and a recommendation. For executives, frame in terms of revenue, retention, efficiency, or risk with a single clear ask. Regularity matters more than length.
The Business Translation
For every research presentation, write the business sentence that corresponds to each user insight before the meeting starts. This is not spin; it is translation. The executive who thinks in P&L terms can act on a finding framed in those terms. Research leaders who do this consistently find that findings that were previously ignored now get acted on.
Pitfalls
Confusing Quality of Work With Influence
A carefully designed study that nobody remembers is not a strategic contribution, no matter how well it was run. Influence is a separate skill that has to be practiced on purpose. Research leaders who assume the work speaks for itself tend to be surprised by how often it does not.
Staying in Methodology Language Instead of Business Language
Researchers often sound like scientists — careful, qualified, neutral. That register is useful for describing what was found; it is a poor register for getting action. Translating findings into the language of the room is not a compromise to rigor; it is what allows the rigorous finding to drive a decision.
Over-Delegating Judgment to AI
As AI handles more execution, the risk is not that quality declines all at once — it is that the team's ability to recognize quality erodes quietly. A researcher who has stopped doing hands-on synthesis loses the reference point needed to review AI synthesis. The tell is when a team accepts an AI-generated themes summary without anyone being able to say, from memory, what a participant actually said. Maintain the practice that maintains the judgment.
Chapter 5: YOUR FIRST 90 DAYS
A plan to strengthen strategic research capability
A practical plan for research and design leaders ready to move from service provider to strategic intelligence. This chapter does not introduce new concepts. It operationalizes the playbook.
How to use this plan
The plan is organized by the same three capabilities as the rest of the playbook: processes, tools, and people. Each section lists the first moves we would recommend a research leader make. A few caveats are worth naming upfront.
The pace is aggressive by design. Most teams will not complete every action in ninety days, and that is fine. The point is to pick a starting direction in all three capabilities, rather than sequencing them and losing the early momentum.
Teams should adjust the order based on which capability is weakest in their current environment.
First moves in processes
- Run a Two-User Gut Check in the first thirty days. Pick a live product question, recruit two participants against a precise profile, and frame the findings explicitly as directional input. The goal is to make research visible in a conversation that would otherwise happen without any user signal.
- Run a Bullseye Customer Sprint in the first sixty days. Define the target participant in precise detail. Recruit five matched participants. Test two or three concepts in a single day with the team watching. Debrief together and document what you learned in your repository.
- Run a Hybrid Study by day sixty. Pick a topic that warrants a larger sample. Run the first five sessions yourself to calibrate on the protocol. Scale to thirty or more participants using AI-assisted moderation. Review the scaled sessions against the reference point built in the first five.
- Match fidelity to purpose from day one. On every prototype-based study you scope in the first ninety days, write down explicitly whether the question is discovery or refinement, and choose fidelity accordingly.
First moves in tools
- Audit your research archive in the first ten days. How much of it is findable? How much is current? What has been used versus filed and forgotten? This diagnostic is the starting point for everything else in the tools capability.
- Stand up the Knowledge Repository by day forty. Five tags, not fifty. Store video clips and quotes alongside reports. The test is whether someone who was not in the room can answer a reasonable question about users using only what is in the system.
- Build the Analysis Skills Library by day sixty. Create the first ten structured AI prompts for your team. Encode your standards for pattern recognition, insight framing, and recommendation structure. Require every researcher to fork and personalize their copy.
- Connect one quantitative source by day sixty. NPS, analytics, or support tickets. The goal is a workflow where a quantitative signal triggers qualitative investigation. Start with one source you can actually integrate, not a comprehensive data layer.
- Launch the Research Query Channel by day eighty. Pre-index the archive, connect it to an AI assistant in Slack, tag source researchers in every response. Announce it first to the allies you built early in the quarter. Let it grow by word of mouth.
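The "quantitative signal triggers qualitative investigation" workflow in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a prescribed implementation: the `check_signal` function, the `FollowUp` record, the five-point threshold, and the sample NPS readings are all hypothetical and would need to be replaced with whatever source and tolerance fit your environment.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass
class FollowUp:
    """Hypothetical request for a qualitative follow-up study."""
    metric: str
    drop: float
    note: str


def check_signal(metric: str, history: list[float], latest: float,
                 threshold: float = 5.0) -> FollowUp | None:
    """Return a FollowUp when the latest reading falls more than
    `threshold` points below the average of the prior readings."""
    if not history:
        return None  # no baseline yet, so nothing to compare against
    baseline = mean(history)
    drop = baseline - latest
    if drop > threshold:
        return FollowUp(
            metric=metric,
            drop=round(drop, 1),
            note=f"{metric} fell {drop:.1f} points below baseline; "
                 "consider a Two-User Gut Check",
        )
    return None


# Illustrative readings only (assumed values, not real data).
weekly_nps = [42.0, 44.0, 43.0, 41.0]
flag = check_signal("NPS", weekly_nps, latest=34.0)
if flag is not None:
    print(flag.note)
```

The design choice worth noting is the narrow scope: one metric, one threshold, one follow-up action. That mirrors the advice to start with a single source you can actually integrate rather than a comprehensive data layer.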
First moves in people
- Map the decision-making terrain in the first five days. Who makes decisions, when they close, who shapes them before they do. This is observational work: attend planning meetings, ask sponsors what is keeping them up at night, listen more than you contribute.
- Identify three to five allies and book time with them in the first two weeks. Teams and individuals already open to research. Deliver something useful before the month ends, even if it is a Two-User Gut Check on their most pressing question.
- Schedule a standing check-in with the Head of Product by day forty-five. Monthly at minimum. Bring a summary of what research has been learning and a list of open questions about the company’s highest-priority bets. Frame it as research intake, not a status update.
- Run your first Watch Party by day seventy. Invite your new allies to observe a live research session together, with structured note-taking roles and a debrief. The goal is a shared experience, not a shared deliverable.
- Start the Insight Digest by day seventy-five. A brief, regular summary of findings calibrated to audience. Push it to one cross-functional partner first. Regularity matters more than length.
- Schedule the first C-Suite Listening Tour conversation by day ninety. One senior leader. Listening, not reporting. The output is a better sense of where the company is headed and a relationship that earns research a seat at the planning table.
How to measure early progress
At day ninety, look back at four numbers.
- How many product decisions this quarter referenced a specific research finding? This is the clearest signal of whether research is being used as an input to decisions, or only as a debrief.
- How quickly can the team answer a new question using existing knowledge, versus commissioning a new study? A team that can answer in minutes what used to take weeks has moved from service delivery to intelligence.
- How many teams outside of product and design are actively using research? Marketing, policy, operations, sales. Research reaches them when it is accessible in the tools they already use.
- What was the cost of an insight this quarter, compared to last? Not a comprehensive metric, but a directional one. A healthy insight system drives the cost per insight down over time.
None of these numbers are easy to track in the first quarter. Asking the questions, and building the systems that make them answerable, is itself a signal of a research function operating strategically.
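As a rough illustration of how the four numbers might be tallied, here is a minimal sketch. The `QuarterLog` record, its field names, and every figure in it are assumptions made for the example; the playbook does not prescribe a formula, and cost per insight in particular is treated here as simple spend divided by insights delivered.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import median


@dataclass
class QuarterLog:
    """Hypothetical tally a research lead might keep for one quarter."""
    decisions_total: int
    decisions_citing_research: int
    answer_hours: list[float]        # time to answer each new question
    teams_using_research: set[str]
    research_spend: float
    insights_delivered: int

    def decision_share(self) -> float:
        """Fraction of product decisions that cited a research finding."""
        if self.decisions_total == 0:
            return 0.0
        return self.decisions_citing_research / self.decisions_total

    def median_answer_hours(self) -> float | None:
        """Typical time to answer a question from existing knowledge."""
        return median(self.answer_hours) if self.answer_hours else None

    def cost_per_insight(self) -> float | None:
        """Directional cost figure: spend divided by insights delivered."""
        if self.insights_delivered == 0:
            return None
        return self.research_spend / self.insights_delivered


# Made-up figures for two consecutive quarters.
last_q = QuarterLog(20, 3, [72.0, 48.0, 120.0],
                    {"product", "design"}, 60000.0, 12)
this_q = QuarterLog(24, 6, [2.0, 0.5, 24.0],
                    {"product", "design", "marketing"}, 60000.0, 20)

# Teams beyond the usual product and design partners.
outside_teams = this_q.teams_using_research - {"product", "design"}

print(f"Decisions citing research: {this_q.decision_share():.0%}")
print(f"Median time to answer: {this_q.median_answer_hours()} hours")
print(f"Teams outside product/design: {len(outside_teams)}")
print(f"Cost per insight: {last_q.cost_per_insight():.0f} -> "
      f"{this_q.cost_per_insight():.0f}")
```

Even a tally this crude makes the quarter-over-quarter comparison explicit, which is the point of treating these as directional rather than comprehensive metrics.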
All the plays at a glance
Use this as a working checklist. Every play in the book, organized by the capability it supports.
Processes (Chapter 2)
- The Bullseye Customer Sprint
- Match Fidelity to Purpose (Discovery vs. Refinement)
- The In-Session Iteration
- The Hybrid Study
- The Two-User Gut Check
- The Self-Serve Research Layer
Tools (Chapter 3)
- The Knowledge Repository
- The Connected Data Layer
- The Research Query Channel
- The Analysis Skills Library
- The Research Podcast
- AI-Assisted Synthesis, Human Judgment
- User-Driven AI Evals
People (Chapter 4)
- Find Allies First
- Know the Decision-Making Landscape
- The Head of Product Alignment
- The Watch Party
- The Insight Digest
- The Business Translation
- What Tools Should We Be Building?
- The Long Game
- The C-Suite Listening Tour
A closing note
The research teams that navigate this moment well will not be the ones that hired the most researchers or bought the best tools. They will be the ones that made human insight useful faster than the organization around them could stop listening. That is a craft, not a science. It is built one conversation, one study, one converted skeptic at a time.
About Design Better
Design Better is a podcast and Substack by Aarron Walter and Eli Woolery that explores the intersection of design, technology, and the creative process through in-depth conversations with inspiring guests across the creative fields. Whether you are design curious or a seasoned design pro, Design Better is guaranteed to inspire and inform. The show has been recognized by the Webbys, W3 Awards, Vanity Fair, and Architectural Digest.

Aarron Walter
Aarron Walter is a design and technology leader. He started the user experience design practice at Mailchimp and helped grow the company from a few thousand customers to tens of millions. At InVision, he studied design teams at some of the most admired tech companies to identify the traits that influence success. He later joined former CDC Director Dr. Tom Frieden at Resolve to Save Lives, applying design and technology to emergency public health response for the Africa CDC and the WHO. Aarron’s design guidance has helped the White House, the US Department of State, and dozens of major corporations, startups, and venture capital firms. He is the author of several books, including the second edition of Designing for Emotion.

Eli Woolery
Eli Woolery trained in the Product Design program at Stanford University, where he now teaches as a lecturer. He brings a rare breadth of background: photography, filmmaking, product design, and industrial design. He is the former Director of Design Education at InVision, and founded Out of the Deep Blue, a design consultancy whose clients have included Genentech and Kaiser Permanente. A lifelong devotee of the ocean, Eli loves to surf, dive, and kayak.
Subscribe to Design Better at designbetterpodcast.com.