How we used T-shirt sizing to scope a complex multi-service Justice platform

Caity Kelly
Aug 18, 2025

How do you scope and estimate 170+ features across seven work packages, involving multiple delivery teams, disciplines, and stakeholder groups? In our work with HMCTS on the Housing Disputes programme (more commonly known as Renters Reform), we ran an extensive T-shirt sizing exercise that became a shared planning language across service, product, architecture, design, and technical delivery. Here’s how we did it, and what we learned.

What is the ‘Renters Reform’ Programme?

His Majesty’s Courts and Tribunals Service (HMCTS) is responsible for administering the criminal, civil and family courts and tribunals in England and Wales. It plays a central role in delivering justice in a fair, efficient, and modern way.

One of HMCTS’s flagship programmes is the Housing Disputes Policy Infrastructure Project (HDPIP) - more commonly known as Renters Reform. This major policy and digital transformation initiative aims to streamline how real property disputes (such as those between landlords and tenants) are resolved in England and Wales, and to align that process with legislation currently progressing through Parliament.

The service will support users through a fully digital process (replacing one that is currently largely paper-based), from initial case creation and submission through to tribunal decision and enforcement, with an emphasis on fairness, transparency, and accessibility for all users - citizens and court staff alike.

The aim is to launch the service within 18 months of Royal Assent (the point at which a Bill that has been passed by Parliament formally becomes law), aligning with the new legal framework for property disputes once enacted.

Background

Our team at Solirius Reply, alongside a blended team of other suppliers, is helping HMCTS deliver three digital interfaces which will exist as one joined-up service on GOV.UK:

Citizen Interface (C-UI): For renters, landlords, managing agents, and other citizens to initiate or respond to disputes.
Expert Interface (Ex-UI): For court staff, tribunal clerks, and judges to manage and resolve cases.
Enforcement Interface: For bailiffs and enforcement agents to view and act on court-ordered outcomes.

Each interface supports different roles, access levels, workflows, and data integrations. Together, they form a single digital ecosystem designed to support our service vision of a fairer, faster, and more accessible housing disputes resolution system.

Given the cross-cutting nature of many features (those affecting more than one interface), as well as the involvement of multiple delivery partners (technical, design, architecture, and policy), we needed a collaborative and time-efficient way to estimate effort across the board.

With such breadth and complexity, it became clear that we needed a commonly used Agile estimation technique - T-shirt sizing - to scope the effort involved in delivering the interfaces, both to plan effectively and to set delivery expectations.

What we did

As the service, product, architecture, and design teams analysed the ‘as-is’ system, they identified seven major work packages (WPs) comprising over 170 features that would form the foundation of an MVP release and beyond. These included everything from case intake, document submission, and identity verification to evidence review, tribunal hearing preparation, enforcement workflows and much more.

Each work package varied in complexity, technical dependencies, and level of design maturity.

With multiple suppliers and HMCTS teams involved (technical, UCD, architecture, policy, service, product), we needed a shared and efficient way to:

Understand the relative complexity of each feature
Identify dependencies
Support delivery planning and MVP scoping

The T-shirt sizing exercise proved to be an excellent tool for developing a shared language and understanding of the scope of work across teams.

Tools and methods

Why T-shirt sizing?

T-shirt sizing is an Agile estimation method that uses intuitive size categories (e.g., Small, Medium, Large) instead of fixed hours or story points. We used the following standard:

Size	Meaning	Sprint equivalence
S (Small)	Simple/straightforward feature	Half a sprint (up to 5 days)
M (Medium)	Self-contained but non-trivial	1 sprint (10 days)
L (Large)	Moderately complex, possibly cross-functional	1–2 sprints
XL (Extra Large)	High-complexity or unknowns	2–3 sprints
XXL (Extra Extra Large)	Too big; must be broken down	3+ sprints

By standardising these definitions across teams, we could have more productive conversations - even when working remotely or across disciplines.
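To show how the table above turns sizes into a rough effort range, here is a minimal Python sketch. It is illustrative only, not the spreadsheet we used: the names `SPRINT_RANGE` and `summarise` and the example feature list are hypothetical. It also assumes that Small counts as a flat half sprint and that XXL features are excluded from totals, since the table gives them no upper bound.

```python
# Sprint ranges (low, high) per size, taken from the table above.
# Assumption: S is treated as a flat 0.5 sprints ("up to 5 days").
# XXL is "3+ sprints" with no upper bound, so it is stored as None
# and counted separately as "needs breaking down" rather than summed.
SPRINT_RANGE = {
    "S": (0.5, 0.5),
    "M": (1.0, 1.0),
    "L": (1.0, 2.0),
    "XL": (2.0, 3.0),
    "XXL": None,
}


def summarise(sizes):
    """Return (low, high, needs_breakdown) for a list of size labels.

    low and high are totals in sprint equivalents; needs_breakdown is the
    number of XXL features left out of those totals.
    """
    low = high = 0.0
    needs_breakdown = 0
    for size in sizes:
        bounds = SPRINT_RANGE[size]
        if bounds is None:
            needs_breakdown += 1
            continue
        low += bounds[0]
        high += bounds[1]
    return low, high, needs_breakdown


# Hypothetical work package, not real programme data.
example_sizes = ["S", "S", "M", "L", "XL", "XXL"]
low, high, needs_breakdown = summarise(example_sizes)
print(f"{low}-{high} sprints, {needs_breakdown} feature(s) to break down")
```

Reporting a range rather than a single number keeps the output in line with the spirit of the exercise: the sizes express relative effort, not a commitment.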

We also chose T-shirt sizing because it helps us avoid falling foul of the law of diminishing returns. It might seem logical that the more time we spend estimating, the more accurate we’ll be, but this isn’t the case. In practice, spending too long on estimation often yields only marginal gains in accuracy, and can become a wasteful exercise. By keeping things light and collaborative, we were able to agree quickly on relative effort, without overthinking it.

The goal wasn’t to get the “perfect” number, but to reach a shared, good-enough understanding that the team could act on. Estimating as a group also helped reduce the illusion of certainty and reminded us that estimates are just estimates and are likely to change when more information becomes available.

Figure: The law of diminishing returns (accuracy plotted against effort; the curve peaks at medium effort).

Running the workshops

Due to the scale of the service, we ran a series of online workshops via Microsoft Teams, each focused on a single work package. Our approach:

Timeboxing - each workshop was time-boxed to 45 minutes
Discussion time - each feature discussion was capped at 10 minutes, to keep us moving
Facilitation - a Solirius Delivery Manager facilitated each session, gathering input from product and service teams, architecture, user centred design, HMCTS subject matter experts (SMEs), and our technical leads
Use of Figma - this was used to gain shared understanding of the features, reviewing draft service and screen designs live on the calls
Use of Excel on SharePoint - this was used to log the estimated T-shirt sizes and record blockers and dependencies (the spreadsheet was later transformed into our draft delivery Gantt chart)
Use of Jira - this was used to connect estimates to our epics in the technical delivery backlog

Wherever conversations ran long or diverged, we captured a note on the T-shirt sizing spreadsheet and moved on - maintaining pace without sacrificing key stakeholder input.

What we learnt

Cross-discipline collaboration

One of the most valuable aspects of this process was the diversity of perspectives in the room and the cross-discipline dialogue it enabled. Attendees included:

HMCTS product owners clarifying feature intent and priority
Technical architects identifying systems integration and security considerations
Service and content designers flagging potential usability or accessibility implications
Delivery leads from multiple partner teams managing dependencies and timelines
HMCTS business analysts, subject matter experts, and policy advisors
Solirius developers, testers and architects

This cross-functional collaboration helped surface hidden complexity. For example:

Technical features that seemed simple from the outside because they could reuse existing code from other GOV.UK services (e.g. email notifications) ended up being sized as “Large” once we accounted for the significant new content, design, and accessibility work, as well as multiple delivery channels (SMS, email, in-app), templates, and translation needs.
Integration with legacy systems or GOV.UK services added layers of complexity.
Features affecting two or more interfaces required discussion around ownership and sequencing.
Some “XXL” features highlighted areas where requirements were still too vague - prompting further discovery before delivery could be planned.

Our results

This exercise delivered far more than just a set of estimates. It helped anchor planning, align stakeholders, and build a shared understanding of complexity across the Renters Reform programme.

✅ Shared understanding:

Everyone left the workshops with a clearer view of what needed to be built, why, and how much effort it might take. This helped reduce assumptions and build a shared understanding across disciplines.

✅ Planning confidence:

We could group features by estimated effort, identify critical paths, and flag areas requiring further discovery. This gave us greater confidence in how to sequence the work, and resulted in:

173 features sized across 7 work packages
Early identification of blockers, dependencies, and discovery gaps
Shared delivery language adopted across technical, UCD, and policy teams

✅ Clearer MVP definition:

“XL” and “XXL” features often sparked discussion about scope - helping us prioritise must-haves vs. nice-to-haves, and making it easier to define a realistic MVP. This built alignment across teams on MVP scope, and supported realistic roadmap conversations with programme leadership.

✅ Delivery velocity baseline:

By estimating in sprint equivalents, we could model different delivery scenarios, e.g. what two technical pods could deliver in six sprints versus what four could deliver. This made forecasting better informed, let delivery teams map estimates against real capacity, and helped shape sequencing for the Citizen, Expert, and Enforcement interfaces.
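The kind of scenario comparison described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the model we used: the function names and the backlog figures are hypothetical, and it assumes effort divides evenly across pods, ignoring dependencies, ramp-up time, and coordination overhead.

```python
import math


def capacity(pods, sprints):
    """Pod-sprints available: each pod delivers one sprint of effort per sprint."""
    return pods * sprints


def sprints_needed(total_effort, pods):
    """Calendar sprints needed if effort spreads evenly across pods (an assumption)."""
    return math.ceil(total_effort / pods)


# Hypothetical backlog estimate in sprint equivalents (low, high),
# not real programme data.
backlog_low, backlog_high = 18.0, 27.0

for pods in (2, 4):
    available = capacity(pods, sprints=6)
    print(
        f"{pods} pods: {available} pod-sprints in 6 sprints; "
        f"covers low estimate: {available >= backlog_low}; "
        f"covers high estimate: {available >= backlog_high}; "
        f"sprints needed for high estimate: {sprints_needed(backlog_high, pods)}"
    )
```

With these made-up figures, two pods cover neither end of the range in six sprints, while four pods cover the low estimate but not the high one. In practice, adding pods rarely scales delivery linearly, so a comparison like this is a starting point for a forecasting conversation, not a forecast.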

✅ Improved stakeholder engagement:

The structured, time-boxed format made it easier for stakeholders to participate meaningfully without being overwhelmed. It also ensured we covered all 173 features efficiently.

We captured decisions and rationale, alongside red flags raised by stakeholders for transparency
The estimation model is now being reused across the programme as a best-practice approach
Outputs now serve as a baseline for backlog refinement and prioritisation

✅ Shared ownership of estimates:

One of the most valuable outcomes was the sense of shared ownership that emerged through the process. Rather than handing down estimates from a single team or role, we sized features together as a cross-functional group. Everyone’s perspective, from delivery and design to architecture and product, was heard and considered. This helped build trust and alignment, and ensured that the estimates reflected the collective understanding of the team.

Because the group had worked through the complexity together, the final estimates weren’t just accepted, they were owned by the entire team. That ownership has helped sustain momentum and buy-in during ongoing planning and prioritisation.

Navigating team challenges

Of course, there were challenges:

Time constraints meant we couldn’t deep-dive into every feature. We mitigated this by capturing flags in the sizing document for follow-up.
Feature overlap across interfaces occasionally led to confusion; we tackled this by defining which team “owned” the feature for estimation purposes.
Remote working can limit engagement, but the use of collaboration tools, strong facilitation, and clear pre-reads helped keep everyone aligned and involved.

Outcomes and next steps

As a result of the exercise, we now have:

A fully sized feature set across all seven work packages
Clearer MVP priorities and sequencing
Early identification of technical risks and design gaps
A foundation for delivery planning across interfaces and suppliers

This work has already influenced sprint planning, architecture decisions, and roadmap alignment.

Reflections

In multi-interface services, early alignment of scope and expectations is crucial. Our experience showed that T-shirt sizing, a lightweight yet effective estimation method, can achieve this without heavy documentation or lengthy planning.

This approach helped us understand scope, uncover hidden complexities, and plan collaboratively. By spending just enough effort to reach a shared, useful estimate, we avoided over-analysis and the diminishing returns that come with it. Ultimately, it fostered trust, clarity, and shared ownership across teams, which are vital for successful delivery.

Our key factors for success included:

A clear process
Shared definitions
Strong facilitation
Shared ownership
Crucially, collaboration across roles

While seemingly simple, T-shirt sizing, when executed correctly, cultivates a deeper shared understanding - an essential element for programmes of this scale.

About the author

Caity Kelly is a Senior Delivery Consultant at Solirius Reply, currently supporting HMCTS on the Renters Reform programme. She works at the intersection of agile delivery, digital transformation, and service design in Government.

If you have any questions about our Delivery services or you want to find out more about other services we provide at Solirius Reply, please get in touch.
