Research Repository Design treats the research repository as an information architecture problem first, not a storage problem. The taxonomy, the findability path, and the retrieval cost decide whether the repository compounds organisational knowledge over time or accumulates noise that nobody can navigate. A repository of 500 artifacts in which nobody can find anything is worse than 50 artifacts in a shared spreadsheet, because the 500-artifact version simulates institutional memory without delivering it.
Rosenfeld, Morville, and Arango (2015) in Information Architecture: For the Web and Beyond established the structural framing: IA is the practice of organising information so people can find it. The framework transplants directly to research repositories. The same questions that IA asks of a website (what are the users trying to accomplish, what mental models do they bring, what taxonomy supports retrieval at speed) apply to the repository. The repository is a website for past research.
Pavliscak (2015) in Data-Informed Product Design contributed the data-quality framing. Pavliscak's central observation is that quantitative findings only inform decisions when they pair with qualitative context, and the pairing requires a repository structure that links the two at retrieval time, not just at storage time.
The principle: Design the IA before storing the first artifact. Curate the taxonomy weekly. Measure findability, not artifact count.
Research repositories as a defined practice emerged from two converging traditions: information architecture (the IA tradition that Rosenfeld and Morville established) and research operations (the ResearchOps community that codified the operational practice in the late 2010s).
Rosenfeld, Morville, and Arango (2015) in the fourth edition of Information Architecture established the canonical IA framework. Their central insight is that information architecture is the design of taxonomies, navigation, and findability paths so users can accomplish their goals. The framework applies directly to research repositories: the repository is a website whose users are PMs, designers, and engineers; their goal is to find evidence relevant to a current decision; the IA decides whether they can.
Pavliscak (2015) in Data-Informed Product Design extended the framing to repositories that hold both quantitative and qualitative artifacts. Her observation is that data without context (the quant side without the qual side) is brittle: stakeholders interpret numbers in ways the original researcher would not endorse. The repository must structurally link quant findings to their qualitative context, which requires schema discipline at storage time.
The ResearchOps Community has codified the operational pattern across 16,000+ practitioners. Their published frameworks identify the four functions of a research repository: storage (where artifacts live), discovery (how users find them), synthesis (how artifacts combine into insights), and governance (how the repository stays clean over time). All four functions must be designed; missing any one degrades the repository.
User Interviews' State of User Research Report (2025) surveyed 485 researchers globally. The report found that teams running well-maintained research repositories report substantially higher rates of insight reuse than teams running ad-hoc storage. Teams that invest in repository IA report retrieval times measured in minutes; teams that treat the repository as a shared drive report retrieval times measured in hours, or simply answer "I gave up."
The combined finding across these sources is consistent: repositories that work were designed as information architectures from day one. Repositories that fail were built as storage buckets that accumulated structure later, after the cost of retrofitting taxonomy became prohibitive.
For Research Leads: The repository is the artifact that protects your team's impact across years. A well-designed repo compounds value as artifacts accumulate; a poorly designed one accumulates noise that buries the signal. The IA decision you make in month one shapes the next decade of organisational research memory.
For UX Researchers: A findable repo is what makes your work compound. A study that gets cited five times across the next two years has compounding impact; a study that gets archived and never found again has the impact of a single decision and nothing more.
For Product Managers: Your ability to ask "what do we already know" depends on the repository's contents being findable. A repo that requires 30 minutes of digging to find prior research on a topic produces the same outcome as no repo at all: you commission a new study.
For Designers Running Their Own Studies: A well-designed repo lets you search before you study. Searching the repository before a study avoids repeating work and surfaces context that sharpens your study scope.
Research repository design scales from a 5-person team using Notion or Airtable to a 50-person research organisation using Dovetail or a custom internal tool. The IA principles stay the same; the tooling scales.
Pick the organising axis before storing the first artifact. Most repositories work best with one primary axis (product area, study type, or decision link) and 2-3 secondary axes. The primary axis is what the homepage organises around; secondary axes are filters. Picking the axis upfront prevents the slow drift into category sprawl.
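The split between one primary axis and a few secondary filters can be sketched in a few lines. This is an illustration only: the field names (product_area, study_type), the helper names, and the sample entries are hypothetical, and any real tool (Notion, Airtable, Dovetail) would express the same idea through its own views and filters.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping

# An entry is modelled as a flat mapping of field name to value (assumption).
Entry = Mapping[str, str]


def group_by_axis(entries: Iterable[Entry], axis: str) -> Dict[str, List[Entry]]:
    """Group entries by the primary axis; this is what a homepage would list."""
    groups: Dict[str, List[Entry]] = defaultdict(list)
    for entry in entries:
        groups[entry[axis]].append(entry)
    return dict(groups)


def apply_filters(entries: Iterable[Entry], **filters: str) -> List[Entry]:
    """Keep only entries matching every secondary-axis filter."""
    return [
        entry
        for entry in entries
        if all(entry.get(field) == value for field, value in filters.items())
    ]


# Hypothetical sample data, for illustration only.
entries = [
    {"title": "Checkout drop-off", "product_area": "checkout", "study_type": "interview"},
    {"title": "Pricing page test", "product_area": "pricing", "study_type": "survey"},
    {"title": "Cart usability", "product_area": "checkout", "study_type": "usability"},
]

homepage = group_by_axis(entries, "product_area")
checkout_interviews = apply_filters(homepage["checkout"], study_type="interview")
```

The primary axis is fixed in the call to group_by_axis, while secondary axes stay open-ended keyword filters, which mirrors the advice to commit to one organising axis and treat the rest as filters.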
Design the schema before structuring the data. Each repository entry needs a fixed schema: required fields (study source, date, participant role, primary tag, decision link), optional fields (confidence level, follow-up status), and free-text fields (raw observation, verbatim quote). A schema designed upfront is the structural commitment that prevents free-text entries from accumulating without metadata.
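A minimal sketch of such a schema, assuming entries are modelled as Python dataclasses. The class name, the field types, and the validation rule are assumptions made for illustration; only the field list itself comes from the text above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RepositoryEntry:
    # Required fields: no defaults, so an entry cannot be built without them.
    study_source: str
    study_date: date
    participant_role: str
    primary_tag: str
    decision_link: str
    # Optional fields: may be left unset.
    confidence_level: Optional[str] = None
    follow_up_status: Optional[str] = None
    # Free-text fields: unstructured content attached to the metadata.
    raw_observation: str = ""
    verbatim_quote: str = ""

    def __post_init__(self) -> None:
        # Reject required text fields that are present but blank.
        for name in ("study_source", "participant_role", "primary_tag", "decision_link"):
            if not getattr(self, name).strip():
                raise ValueError(f"required field is empty: {name}")
```

Because the required fields have no defaults, a free-text observation cannot enter the repository without its metadata, which is the structural commitment the schema is meant to enforce.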
Treat the homepage as the most important UI. The repository homepage is where most users land. It should answer "what is in the repo" in 30 seconds and "how do I find what I need" in another 30. A homepage that just shows the most recent entries fails both tests; a homepage organised by the primary IA axis with clear secondary entry points succeeds.
Curate weekly, not quarterly. A 30-minute weekly curation slot keeps the repository clean. Curation tasks: tag normalisation, duplicate identification, decision-link verification, archival of stale entries. Weekly curation prevents the slow build-up of noise that a quarterly pass cannot clear.
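Part of the weekly slot can be scripted. The sketch below covers only tag normalisation and the detection of near-duplicate tags; the canonical form chosen here (lower case, hyphen-separated) and the function names are assumptions, not a convention the sources prescribe.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def normalise_tag(tag: str) -> str:
    """Map a tag to an assumed canonical form: lower case, hyphen-separated."""
    hyphenated = re.sub(r"[\s_]+", "-", tag.strip().lower())
    return re.sub(r"-{2,}", "-", hyphenated)


def find_tag_variants(tags: Iterable[str]) -> Dict[str, List[str]]:
    """Return canonical tags that appear under more than one spelling."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for tag in tags:
        groups[normalise_tag(tag)].add(tag)
    return {
        canonical: sorted(variants)
        for canonical, variants in groups.items()
        if len(variants) > 1
    }


# Hypothetical tags, for illustration only.
variants = find_tag_variants(
    ["Checkout Abandonment", "checkout_abandonment", "pricing"]
)
```

A report like this gives the curator a short list of tags to merge by hand; decision-link verification and archival still need human judgement.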
Measure findability, not artifact count. The number of artifacts is a vanity metric. The right metric is retrieval cost: how long it takes a user to find a relevant artifact starting from a search query, and how often the search succeeds at all. A repository with 200 artifacts and 90% retrieval success is more valuable than a repository with 2,000 artifacts and 30% retrieval success.
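One way to make the metric concrete is to log search attempts and summarise them. The sketch below assumes each attempt records the query and the seconds until a relevant artifact was found, with None meaning the user gave up; the record shape and function names are hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional, Sequence


@dataclass(frozen=True)
class RetrievalAttempt:
    query: str
    # Seconds until a relevant artifact was found; None means the user gave up.
    seconds_to_find: Optional[float]


def retrieval_success_rate(attempts: Sequence[RetrievalAttempt]) -> float:
    """Fraction of attempts that ended with a relevant artifact found."""
    if not attempts:
        raise ValueError("no attempts recorded")
    found = sum(1 for a in attempts if a.seconds_to_find is not None)
    return found / len(attempts)


def median_retrieval_seconds(attempts: Sequence[RetrievalAttempt]) -> Optional[float]:
    """Median time to find, over successful attempts only."""
    times = [a.seconds_to_find for a in attempts if a.seconds_to_find is not None]
    return median(times) if times else None


# Hypothetical log, for illustration only.
log = [
    RetrievalAttempt("checkout abandonment", 60.0),
    RetrievalAttempt("power users", None),
    RetrievalAttempt("pricing research", 120.0),
    RetrievalAttempt("onboarding", 30.0),
]
rate = retrieval_success_rate(log)       # 0.75
typical = median_retrieval_seconds(log)  # 60.0
```

Reporting the two numbers together matters: a fast median over a low success rate only describes the searches that happened to work.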
Document the navigation paths. A short guide that shows new users how to find common things (e.g. "research on checkout abandonment", "studies on power users", "past pricing research") is more valuable than the most sophisticated search tool. Navigation guides are cheap to write and substantially reduce onboarding cost.