AIAA 3rd Jam – Sizing

Or how we use AI to check maturity of estimations

I’m going to be excommunicated from Agility for this.

But, in this article, I’m going to propose a way to run a maturity assessment of your relative estimations, one that can significantly ease collaboration between cells/squads/pods/teams in an Enterprise environment.

How am I going to do it?

It relies a lot on groundwork and consensus.

What do I mean? Let me explain.

The Assumptions

We all know how to do an estimation via Planning Poker in Scrum, right? I’m going to assume that yes, we do. Also, I’m going to assume that you’re following a system based on the Scrum Bible at scrum.org, in an Enterprise setting. You have cells doing work in User Stories, Features and Epics, in two-week Sprints.

You’re also using User Story Points to measure them, in the usual modified Fibonacci: 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100.

Does this make sense? If not, tweak it to your preferred values.

The Consensus

We need to set a consensus. It’s going to be arbitrary, but it will help have a frame of reference.

Our consensus is twofold. First, all Sprints are two weeks. Second, we’ll assume that 8 User Story Points is the maximum a single person can produce in a Sprint. With that frame, 8 USPs means two weeks; 5 means a week and some days; 3 is roughly one week; 2 is more than a couple of days but less than a week; and 1 is a couple of days at most.
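As a minimal sketch, the consensus can be written down as a simple lookup. Remember these durations are our arbitrary agreement, not a standard:

```python
# Consensus: 8 USP = one full two-week Sprint for one person.
# The duration labels are deliberately coarse; they are this
# article's arbitrary consensus, not an industry standard.
USP_TO_ROUGH_DURATION = {
    1: "a couple of days at most",
    2: "more than a couple of days, less than a week",
    3: "roughly one week",
    5: "a week and some days",
    8: "the full two-week Sprint",
}

def rough_duration(usp: int) -> str:
    """Map a User Story Point value to its rough duration."""
    return USP_TO_ROUGH_DURATION.get(
        usp, "too big for one person in one Sprint"
    )
```

Anything above 8 falls outside what one person can do in a Sprint, which is exactly the signal we want.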

Yes, these are rough estimates; that’s the idea. At estimation time we cannot be so precise that we zero in on an exact number of hours. When will we know how much the User Story actually cost? Only after we finish it.

As a bonus, we can come back later to check on the actual cost of the User Story.

So, if we follow the usual Scrum distribution, we can think of a User Story as something that will (ideally) be taken care of in one Sprint. Therefore, for a 5-person squad/cell, the maximum ideal size of a User Story should be 40 USP (8 USP per person, times five).

If we use up to three Sprints for a release, then Features will be in the 40–120 USP range (one to three Sprints of 40). And Epics, being at most two Features worked within a quarter, will probably be between 80 and 240 USPs.

(of course, adjust to size and capacity)

Where does this leave us? We can use it to check that User Stories, Features, and Epics are estimated, sized, and split correctly.
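Before handing this to the AI, it helps to see the check itself is trivial. A minimal sketch using the thresholds above (a 5-person team, 8 USP per person per Sprint; the limits are this article’s consensus, nothing more):

```python
# Sizing limits derived from the consensus: 8 USP/person/Sprint, team of 5.
TEAM_SIZE = 5
USP_PER_PERSON_PER_SPRINT = 8
STORY_MAX = TEAM_SIZE * USP_PER_PERSON_PER_SPRINT          # 40
FEATURE_RANGE = (STORY_MAX, STORY_MAX * 3)                 # 40..120
EPIC_RANGE = (FEATURE_RANGE[0] * 2, FEATURE_RANGE[1] * 2)  # 80..240

def check_sizing(item_type: str, usp: int) -> str:
    """Return 'ok' or a reason why the item looks mis-sized."""
    if item_type == "story":
        return "ok" if usp <= STORY_MAX else f"story over {STORY_MAX} USP: split it"
    if item_type == "feature":
        lo, hi = FEATURE_RANGE
    elif item_type == "epic":
        lo, hi = EPIC_RANGE
    else:
        return f"unknown item type: {item_type}"
    if usp < lo:
        return f"{item_type} under {lo} USP: maybe it is really a smaller item"
    if usp > hi:
        return f"{item_type} over {hi} USP: split it"
    return "ok"
```

The point of the AI is not this arithmetic, of course; it’s reading the backlog and applying the rule at scale. But the rule itself is this simple.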

The Prompt-craft

This is going to take some work. I’ve created a list of Epics, Features and User Stories.

You can download it here.

Now, let’s see if we can get the AI to do the work for us. But this is going to take some explaining.

I’m going to use ChatGPT (on version 4o with Canvas) to get it to understand what we want.

I am an Agile Coach trying to understand if the Epics, Features and User Stories at my work are correctly sized. For that, I have reached this consensus in which a person can only produce 8 User Story Points per Sprint, and a user Story can only take one Sprint. Therefore, the maximum a User Story can be sized is 40 User Story Points for a Team of 5. I also have reached a consensus that a Feature should be produced between one Sprint and three Sprints, therefore, between 40 User Story Points and 120 User Story Points for a team of 5. And an Epic is at most Two Features, that are produced each Quarter, so between 80 and 240 User Story Points for a team of Five. I have the following list of Epics, Features and User Stories. Can you analyze it and let me know which Epics, Features and User Stories are mis-sized? Take into account the team size relative to the sizing. Write an output explaining those who are not acceptable.

Whew! That was a mouthful, wasn’t it? But we can see that we’re using the role-playing we introduced last time, and now we’re introducing another concept: assumptions.

What does this produce?

Not only a list of the specific issues and recommendations…

…but a list of general recommendations.

This produces a workable, actionable list of our next focus.

If we run this, we can have some idea of the maturity of our Backlog in terms of sizing.

Hope this is useful and let me know how it works!

#aiaa #ai #agile #backlog #automation


Gen AI and Agile: a concrete proposal – AIAA Jam 1

Or how GenAI will impact the real world

GenAI, right?

At this point, you’re probably tired of hearing about it. Yes, it’s going to change the world. But you’d rather have more clarity on how, right?

So, I’m going to do the opposite of what most consultants do. Instead of starting high, on strategy and vision, I’m going to start very small and concrete, and try to explain how I think GenAI can positively impact your day-to-day.

The obligatory disclaimer: this is basically me future-telling. So take this with a grain of salt.

Shall we drill down?

Let’s think about how GenAI can impact Agile.

But we won’t stop there. I’ll try to apply GenAI to a concrete problem and provide a concrete solution.

To do that, I’m going to do some fortune-telling. Break out the tarot cards and crystal balls!

My future-assumption

My general view is that Agile is going to change in just a couple of years, going from Agile 2.0 (Scaled Agile) to Agile 3.0, which I call AIAA: AI Augmented Agility.

I think that most teams will quickly migrate to Low-code/No-code (LcNc) team members, with much of the development process, be it tech or business, relying on AI tools. Promptcrafting will become the main skill needed, and you will have small, more mobile teams focusing on Jobs-to-be-Done rather than traditional Scrum roles. Special, dedicated teams of engineers will continue to exist for super-specialized situations, but most work will be done by changeable, mobile teams.

Ok, Fede, you might say. That sounds well and good, but it’s very abstract. Can you give me an example?

I’m going to focus on one problem that today exists, and try to improve it with GenAI.

The Problem

In most Enterprise Agility setups (think SAFe, LeSS, etc.), one of the major concerns is dependencies. Since Scrum allows a flexible approach to roadmapping, how does a change in which User Story Team A tackles impact Team B? And does Team C suffer any impact?

This, in any major enterprise, quickly becomes a pain, especially once chapters, tribes and other major groupings start to pile up. Suddenly you have 37 teams, with criss-crossing interdependencies that make following the changes a chore. Yes, PI Plannings or Quarterly Plannings can help mitigate the impact, but these are very high-cost, once-a-quarter solutions. Once they are finished, you either rely on people communicating every backlog change and other people understanding the impact, or you run the risk of impacting the other teams.

Often, this leads to meeting-itis and people investing a lot of hours in trying to understand different backlogs. How can we mitigate this with GenAI?

The Solution

Some general guidelines: I’m going to be as tool-agnostic as possible. You can do this with Jira, Rally, ServiceNow or any backlog tool, and with most GenAI offerings. For this one I’m going to use plain good-old Excel and Claude, but feel free to swap them.

I’ve created an interlocking backlog, which you can see here:

User Stories with Dependencies.pdf

There are 8 User Stories (based on a Banking implementation), spread across 3 teams, that have dependencies. I’ve also created the same document without the dependencies, to mimic a Backlog that hasn’t been analyzed, which you can access here:

User Stories.pdf

What would be our objective? To get a first-pass analysis of dependencies that we, as members of Team A, can use to understand whether Teams B or C have a dependency with us or vice versa, and, if we see any change in the stories that impacts either us or them, call a meeting. That way, we could quickly mitigate the risk of a bottleneck without waiting for a Quarterly Planning meeting or involving dozens of people.

So, let’s go to Claude’s page (https://claude.ai/) and upload the file (User Stories.pdf) with the following prompt:

I have an Excel of User Stories for Banking. I want you to read the User Story Number (column A), User Story (column B) and Acceptance criteria (column C) and then tell me which User Story has dependencies with each other, keeping in mind that stories might have more than a single dependency

As you can see, the AI quickly reads and identifies several dependencies. But this can also be difficult to follow up.

So, let’s refine our promptcrafting a little. What should we focus on?

Using this information, which stories could be a high-impact bottleneck?

Ok, using this prompt, the AI can rank the importance of the User Stories based on dependency and priority (a ranking I agree with).
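That ranking can also be sanity-checked with plain code once the dependencies have been extracted. A minimal sketch, assuming a hypothetical dependency map (the story IDs and links below are made up for illustration, not taken from the real backlog):

```python
from collections import Counter

# Hypothetical map: story -> stories it depends on.
# In practice, this is what the AI extracted from the backlog.
dependencies = {
    "US-2": ["US-1"],
    "US-3": ["US-1", "US-2"],
    "US-5": ["US-1"],
    "US-7": ["US-3"],
}

# A story is a bottleneck candidate when many other stories depend on it.
dependents = Counter(dep for deps in dependencies.values() for dep in deps)
ranked = dependents.most_common()
print(ranked)  # US-1 blocks the most stories in this made-up map
```

Counting dependents is a crude proxy (it ignores priority), but it gives a quick cross-check on the AI’s ranking.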

So we have a list of stories to monitor, but this is not really a comfortable way to do so. Can we, perhaps, find another way?

create a dashboard to follow up on these stories

Of course, this is a static dashboard. I wonder if we can implement it via Javascript into a dynamic page?

implement this dashboard in javascript

And there you have it. A dynamic Dashboard that takes the input from a spreadsheet and makes following it a trivial matter.

Bear in mind that I’m doing this in a bare-bones way. Any kind of tool beyond Excel (like Jira, or ServiceNow) will also include the ability to export the contents, but more importantly, the ability to connect via an API. This means that you can have a bot in any kind of embedded AI (like Microsoft’s Copilot) constantly trawling the different backlogs and alerting the teams about unmet dependencies.

Is this perfect? No, it will probably require some fine-tuning, and you should always double-check the results (you can do so with my file User Stories with Dependencies.pdf to see which stories I thought had dependencies with each other).

Will this save you time and effort? Absolutely.

How much? Well, let’s do something. Let’s map the total hours (i.e., hours summed across participants) of all the Scrum of Scrums or Sync meetings that your RTEs, ATFs, etc. attend. Let’s tally them up.

How much do you think that they will save if, instead of going through their backlog, they just meet to check the points that they already have? Shall we say, a conservative 60% savings of time?

Now, let’s go back to our organization of 37 teams. Counting team representatives plus RTEs and Portfolio Managers as roughly 2.5 people per team, 2 hours of sync per Sprint means about 185 total hours (2.5 people × 37 teams = 92.5 people, times 2 hours = 185 hours). A 60% saving means about 111 hours per Sprint. With two-week Sprints, that’s roughly 222 hours per month: more than a full-time equivalent at a high salary.
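For transparency, here is the same back-of-the-envelope math as a tiny script (every number is an assumption from this article, not measured data):

```python
# All inputs are this article's assumptions, not measured data.
teams = 37
people_per_team_in_sync = 2.5   # team rep plus a share of RTEs / Portfolio Managers
sync_hours_per_sprint = 2
savings_rate = 0.60             # the conservative 60% estimate

total_sync_hours = teams * people_per_team_in_sync * sync_hours_per_sprint
saved_per_sprint = total_sync_hours * savings_rate
saved_per_month = saved_per_sprint * 2  # two-week sprints, ~2 per month

print(f"{total_sync_hours:.0f}h of sync, {saved_per_sprint:.0f}h saved/sprint, "
      f"{saved_per_month:.0f}h saved/month")
```

Swap in your own team count, sync cadence and savings rate to get a figure for your organization.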

And we’re being conservative here. Once this is fine-tuned, the cost of missing a particular dependency is probably less than the cost of running the meeting.

Does this make sense? Of course, this is just a very brief way in which you can implement AIAA, but I’ll be exploring how to Augment Agility in further articles.

Let me know what you think!

#agile #ai #genai #safe #scrum #less