15 May 2016

Our team doesn't use Scrum

We're a small team of experienced engineers developing an infrastructure system for an external customer. We deliberately rejected Scrum as our project management framework and haven't regretted it. Why?
Infrastructure projects don't bring direct value (unless infrastructure is the primary business). The value takes the form of cost savings and comes later, when the new facility gets substantial use. Because the payoff is delayed, stakeholders usually don't want to allocate much time or manpower, so the timeline is under pressure. On the other hand, infrastructure is foundational, so sacrificing quality is not an option. So beyond limiting the initial feature set to an essential minimum, the project should be run in a way that
  • is lean (has as low overhead as possible);
  • lets us meet deadlines safely;
  • does not hurt product quality or incur unmanageable technical debt.
The Scrum framework mandates fixed-length (1-4 weeks) iterations (Sprints), each with its own distinct goal. Technically, it means a fixed, time-boxed backlog of "User Stories" for an iteration, and the goal is to complete that backlog within the iteration. The first issue is that the complexity of Stories doesn't care about the iteration length (or capacity, i.e. the team's ability to do a certain amount of work within an iteration). E.g. we may have an iteration capacity of 10 "units" (story points, man-days or whatever the team uses for estimation) and a few Stories of 4 units each. If we pick 2, we plan to waste 2 units in the iteration. If we pick 3, we plan to miss the iteration goal. This is alignment overhead.
For an infrastructure project, Stories take a lot of labour; the average time to complete one is comparable to the iteration length. Given that the alignment overhead is about <time_to_complete_a_Story>/2 * team_size, it's prohibitively high.
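The arithmetic of the 10-unit example can be sketched in a few lines of Python. This is only an illustration: the helper name `plan_sprint` and the greedy "take Stories in backlog order while they fit" rule are my assumptions, not part of Scrum itself.

```python
def plan_sprint(capacity, story_sizes):
    """Pick Stories in backlog order while they still fit into the Sprint.

    Returns the picked Story sizes and the capacity left unused.
    (Hypothetical helper, used only to illustrate alignment overhead.)
    """
    picked, used = [], 0
    for size in story_sizes:
        if used + size <= capacity:
            picked.append(size)
            used += size
    return picked, capacity - used


# Capacity of 10 units, three Stories of 4 units each.
picked, idle = plan_sprint(10, [4, 4, 4])
print(picked, idle)          # [4, 4] 2 -> two units are planned to be wasted

# Taking the third Story anyway overcommits the Sprint by 2 units.
print(sum([4, 4, 4]) - 10)   # 2
```

With Stories of 1 unit each the leftover would be zero, which is why the overhead grows with Story size.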
Estimation of development tasks isn't exact. Even for a long-running project and an experienced team, it comes with uncertainty. I.e. if a task is estimated at 5 days, there is a 50% chance that it will take longer than 5 days (or will be expedited, trading quality for time), and in 50% of cases it will be completed earlier. In the first case the team fails the Sprint. To avoid that, it may add a safety margin to its estimates that won't be used in most cases. In the latter case (early completion) we gain some time, but there are no tasks planned for it. This blog post discusses the issue in more detail. This is estimation uncertainty overhead.
We know from statistics that the relative error of a sum of independent estimates is O(1/sqrt(N_of_tasks)). It goes down with smaller tasks, but for an infrastructure project tasks are long, so the error is high, leading to a higher safety margin. Roughly, we have to waste <estimation_error> + <margin> per team member.
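The O(1/sqrt(N_of_tasks)) behaviour is easy to check with a small simulation. This is a sketch under loud assumptions: task errors are independent and Gaussian (so an unlucky draw may even be negative), every task has the same 50% relative error, and the numbers (20 days of work, 20,000 trials) are picked arbitrarily for illustration.

```python
import random
import statistics


def simulated_relative_error(n_tasks, total_days=20.0, task_rel_sigma=0.5,
                             trials=20_000, seed=1):
    """Relative spread of the total duration when the same amount of work
    is split into n_tasks equal, independently estimated tasks."""
    rng = random.Random(seed)
    mean_task = total_days / n_tasks
    totals = []
    for _ in range(trials):
        # Actual duration of each task = its estimate plus Gaussian noise.
        total = sum(rng.gauss(mean_task, task_rel_sigma * mean_task)
                    for _ in range(n_tasks))
        totals.append(total)
    return statistics.pstdev(totals) / total_days


for n in (1, 4, 16):
    print(n, round(simulated_relative_error(n), 3))
```

Under these assumptions the printed error should roughly halve each time the number of tasks is quadrupled: one big task keeps the full 50% error, while sixteen small ones bring it down to about an eighth.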
A standard Sprint has a planning session at the beginning and a demo and retrospective at the end. Practically, it means 2 days are lost for productive work. This is planning overhead.
Each Sprint is to deliver customer-visible value. Practically, it means Sprints are packed with User Stories, and bug fixes come second unless the bugs are really breaking. As shown before, there is a high chance the team has to expedite tasks by cutting corners here and there. Together these trends sink both the visible product quality and its long-term sustainability (which is especially important for infrastructure). Because each Sprint is like the others, nothing stops this degradation. Some teams introduce recurring "stabilisation sprints" to address it. But in our case the customer has to invest long before seeing a return, so it's difficult to convince them that we need extra time to fix our own kludges. This is bit rot.
OK, an alternative?
One that preserves Scrum's undoubted advantages of
  • keeping the development aligned with the goal;
  • observing and controlling progress to match the project timeline;
  • sustained delivery of user-visible value?
There is no silver bullet. For our case (which is pretty common these days) we use a set of techniques. We call it "Scrumban".
To eliminate operational overhead, we adopted a practice that focuses primarily on it: Kanban. It gets rid of the alignment and estimation uncertainty costs. It eases the "deadline-driven development" syndrome. With an explicit limit on work in progress, it supports sustained value delivery and keeps the team focused. As a side effect it lowers the planning burden, which is important because the human brain is quite bad at that activity. Having Kanban boards in most project tracking tools is very helpful; color paper stickers on a whiteboard aren't that fun in reality.
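The work-in-progress limit is the one mechanical rule here, so a toy model may help. This is a minimal sketch, assuming a three-column board; the class `KanbanBoard` and its methods are hypothetical and not the API of any tracking tool.

```python
class KanbanBoard:
    """Toy three-column board with a limit on the "in progress" column."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit
        self.todo = []
        self.in_progress = []
        self.done = []

    def add(self, task):
        self.todo.append(task)

    def pull(self):
        """Start the next task if the WIP limit allows it.

        Returns the started task, or None when the limit is reached
        or there is nothing left to start.
        """
        if len(self.in_progress) >= self.wip_limit or not self.todo:
            return None
        task = self.todo.pop(0)
        self.in_progress.append(task)
        return task

    def finish(self, task):
        self.in_progress.remove(task)
        self.done.append(task)


board = KanbanBoard(wip_limit=2)
for name in ("A", "B", "C"):
    board.add(name)

print(board.pull())   # A
print(board.pull())   # B
print(board.pull())   # None: the limit is reached, finish something first
board.finish("A")
print(board.pull())   # C
```

There is no iteration boundary anywhere in this model: a new task starts whenever a slot frees up, which is where the alignment cost disappears.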
But Kanban doesn't help keep the development on the right track. Here daily standups and weekly reviews help. The daily standup runs as we're used to from Scrum: it starts the day, it's short, it answers 3 questions, and it's the main control knob to keep things rolling in the right direction on a daily basis. The weekly review resembles the Scrum Retrospective and Planning at once, but it's much lighter. At this event the team
  • selects tasks to do next;
  • agrees on an implementation plan;
  • refines estimates and adjusts scope.
Plus there is a retrospective review, but with no red tape at all. Only things that affect the team's progress and are controlled by the team are to be touched. All that together safely fits into one hour, the last working hour on Friday. That covers the organizational overhead for a week.
Also, Kanban does little to track the project timeline. Here we use a burndown chart. For the chart to work we need a time span (the whole project, from 1 to 4 months) and a backlog of User Stories. The Stories should be estimated in any units that reflect their complexity. We adopted the traditional agile method of estimating by comparison with similar tasks.
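The numbers behind a burndown chart are simple enough to sketch. The function names, the linear "ideal" line and the average-velocity projection below are my illustrative assumptions, and the sample figures are made up.

```python
def burndown(total_points, total_days, completed_per_day):
    """Return (ideal, actual) remaining-work series, starting at day 0.

    ideal  - straight line from total_points down to 0 over total_days
    actual - remaining work after each recorded day
    """
    days = range(len(completed_per_day) + 1)
    ideal = [total_points - total_points * d / total_days for d in days]
    actual = [total_points]
    for done in completed_per_day:
        actual.append(actual[-1] - done)
    return ideal, actual


def projected_days(total_points, completed_per_day):
    """Project the total duration from the average velocity so far."""
    done = sum(completed_per_day)
    if done == 0:
        return None  # no progress yet, nothing to extrapolate from
    velocity = done / len(completed_per_day)
    return total_points / velocity


# A 100-unit backlog planned for 50 working days, 5 days recorded so far.
ideal, actual = burndown(100, 50, [0, 3, 2, 0, 5])
print(ideal[-1], actual[-1])                    # 90.0 90 -> exactly on track
print(projected_days(100, [0, 3, 2, 0, 5]))     # 50.0
```

When `actual` stays above `ideal` for a while, that is the signal to adjust scope or estimates at the weekly review.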
Then the team needs to keep the customer happy and itself in shape. The Sprint Demo does that in Scrum. We use control points, "milestones", set 4-6 weeks apart. Each milestone means a consistent set of features is shown to the customer. But milestones don't assume that nothing is in progress at that time, so a milestone doesn't incur the alignment cost that a Sprint does.
Finally, planning. For customer-requested development we have to deal with fixed-price projects. That means fixed time. So we have to estimate upfront anyway and agree on length and scope with the customer. Practically, we get a set of estimated User Stories for a big "iteration": the project. To keep it manageable, we should keep these projects between 1 and 3 months (unless a project consists of many similar things). As shown earlier, the relative estimation error for a longer iteration is lower, until it starts to rise when our planning shifts from extrapolation to pure fantasy.
However, these techniques don't magically boost the team's efficiency. Kanban by itself is not easy. It may be seen as an iterative process where an iteration is one day long. It requires a much higher level of self-discipline from everybody on the team. Overall, the workflow is much less regulated. But that means it's denser. And there are no more hours of sitting in a meeting room tossing planning poker cards or watching Jira boards on a wall screen.
I have applied this approach for the last 5 years, refining it over time. It works and is effective, provided certain prerequisites are met:
  • The team is small, seasoned and co-located.
  • Everybody practices solid communication discipline: keeping teammates informed without disrupting them.
  • The work is expressed in User Stories in their classical sense, not in a "do that, that and that" style.
  • Long-term development is formed from compact, fixed-scope, limited-length (1-3 months) projects.