What works and what doesn't work in software development? For the first 10 years or so of my career, we followed a strict waterfall development model. Then in 2008 we started switching to an agile development model. In 2011 we added DevOps principles and practices. Our product team has also become increasingly global. This blog is about our successes and failures, and what we've learned along the way.

The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions. The comments may not represent my opinions, and certainly do not represent IBM in any way.

Monday, March 22, 2010

Is Shift-Left Agile? And Death By Build

Sometimes people confuse "shift-left" and "agile" software development practices. They both involve testing earlier, and talking to customers earlier, but that's pretty much all they have in common.

"Shift-left" practices are ones that help you find and fix problems earlier. You can think of the software development life cycle as a time line, with gathering requirements on the left, then design, code, test, and field maintenance to the right. The earlier (and farther to the left) you find a problem, the less it will cost you to fix it. It takes extra time and effort to find problems earlier, but it's worth it in the long run. For example, if you create JUnit test cases for most of your new code, you will invest more time up front, but you'll almost certainly find some bugs even before the code is handed off to the testers.

Showing your code-in-progress to customers on a regular basis is also a "shift-left" practice, because you may find out, while there is still time to change your design, that you are not delivering what the customers want.

"Shift-left" practices can help make your project more agile, because if you catch bugs earlier, you don't need to have such a long test cycle at the end of a release. If you automate your tests and run the automated tests with every build, you can also find newly-introduced bugs in old code more easily, and cut down on the amount of regression testing and bug-fixing you have to do later.

But you can slow down a project with "shift-left" processes too. I know of a project whose build process takes 30+ hours. I'm assuming it takes so long because they're trying to do too much automated code analysis and automated unit testing as part of the build process. (My product is quite large, requiring several CDs, and it only takes about an hour to build everything.) Then, after the 30-hour build, they start their BVT (build verification test), which also takes many hours. The goal of testing during the build process is to "shift left": to catch bugs that escaped unit testing before they get to the test team. But because the build process is so slow and painful, they only run it once a week! So if a code change misses a build, it's delayed by an entire week. And if someone's new code breaks the build (heaven forbid), the entire project is delayed by a week.

So what did they do to avoid accidentally delaying the project by a week? They implemented a heavy process of code reviews to make sure that no one integrates code that will break the build. Code reviews and inspections are another "shift-left" tool. So now you have to make your code changes by Wednesday. On Wednesday you have to find a team leader to review your code changes and approve them so they're ready for the Thursday build. Then the build and BVT processes run on Thursday, Friday, and Saturday, and the code is ready to be tested on Monday.

This is the opposite of an agile process. Yet, strangely enough, this team claims that they are following agile software development processes, specifically Scrum. I'm sure the people who designed their development process had the best intentions. However, one of the more important tenets of agile software development is that you have to get your code into the hands of testers, and customers, as soon as possible. When it can easily take a week or more just to get new code to the test team, you're losing many of the benefits of agile development processes.

Don't get me wrong, shift-left and agile processes can work together quite nicely. But they can also work against each other if you're not careful!

Does anyone else have any shift-left horror stories to share?

Wednesday, March 17, 2010

Open Source Software

Thanks to Husain for this topic suggestion...
"What Software Development Methodology can best fit in delivering an open Source Software??? After doing a little research I found no better than the Agile method!!"

Open Source (and Community Source) software is special, in that much of it is developed in tiny pieces by people working independently. It does lend itself to agile development, in that the requirements are not all collected in advance. Open source development cannot be done in a waterfall fashion, where all of the design for a release is done at one time, then all of the development, then all of the testing.

The one open source project I worked on used a project backlog, where people would pick up the next high-priority work item (story) that interested them. Someone would write a bit of code, test it, document it if needed, and submit it. Then the changes were approved and integrated by the project owner, and released to the public within a few weeks. So, adding a new feature/story was like a mini-sprint. However, the project did not use time-boxed iterations with fixed start and end dates. Each person was on their own schedule. It was very agile.

I think this highlights the fact that Agile software development methodologies are not one-size-fits-all. What works for a typical open source project would not work for my current project.

Our product consists of several components that depend upon each other. To implement my new feature, I may have to ask someone from another component team to write some new code for me. And someone else may use the new code I'm writing in their component. Because we are adding major new features across the board, we usually need to install the same version of every component, or they will not work together. We also need to plan some features a few months in advance: component A will implement the first piece, then component B will use that new feature and implement the second piece, and then component C will use features from A and B to implement the third piece. Our customers expect A, B, and C to all work together, so we need some time to stabilize the code and test all of the pieces together. So we have a couple of months at the end of a release where no one is allowed to make any major changes, and only bug fixes go in. We also use this time to test the code on additional platforms, and in different languages. This is less agile, but it's not realistic to expect us to ship a complex product like this without some dedicated time to stabilize everything and clean things up.
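The A, then B, then C sequencing above is a dependency ordering. As a rough illustration only (this is not a tool we use), Python's standard graphlib module can derive the implementation order from a dependency map. The map below is a hypothetical stand-in for the three components in the example.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each component lists the components it builds on.
depends_on = {
    "A": set(),        # A implements the first piece
    "B": {"A"},        # B uses the new feature from A
    "C": {"A", "B"},   # C uses features from both A and B
}

# static_order() yields every component after all of its dependencies.
implementation_order = list(TopologicalSorter(depends_on).static_order())
print(implementation_order)
```

With real components the map would be much larger, and a cycle in it (which graphlib reports as an error) would be a sign that two teams are each waiting on the other.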

I have some stories about a team I know of that calls themselves agile, but really isn't. It's pretty silly what they do, really. More on this later...

Monday, March 8, 2010

When is a story "done done"?

When exactly is a story "done done", ready to be checked off the product backlog?

On the one hand, saying that a story is "done" too soon will leave your project with hidden debt. If you're still writing code for a story after you've said that it's done, then it's not done! And if you're still writing code in the next sprint for something that was theoretically done in the previous sprint, then a few bad things are happening:
  • Your velocity for the previous sprint will look too high.
  • Your velocity for the current sprint will look too low.
  • You may or may not be "in sync" with the test team and product management on what's really done and what still needs to be tested.
  • You lose the benefits of time-boxed iterations.

But what about things like Bidi enablement, visual design clean-up, or extensive logging? I would argue that much of this code hygiene work can and should be done toward the end of the release. Get the new function and risky changes out there so they have time to mature. Then save a sprint or so at the end for clean-up work.

I believe it's reasonable to create a story like "Add Bidi support to the following areas: ...". It's testable, and it's new function. Plus, you get yourself into a Bidi-enablement mode, and you can make a single pass through the code making the same changes everywhere. This is good because it decreases the amount of context-switching your brain has to do.

On the other hand, I believe that logging/tracing, JUnit testing, and factoring out text for translation should be done before a story is "done". Logging and tracing make it easier to debug the code from the beginning, so you don't waste time finding where the bugs are. JUnit testing finds bugs early, so you can fix them more quickly and easily. And factoring out messages is easier to do while you're in the code; you'll inevitably miss some translatable text if you try to do it all later.

Performance testing is a tricky one. You can't leave it until the end, because you may find that you need to make significant changes to your algorithm to improve performance, and then everything will need to be re-tested. Worse, if you run out of time for performance testing altogether, you could ship a product with new function that takes too long to run. If a new feature warrants performance testing, I would recommend doing the performance testing one sprint after the new code goes in.

In my current project, we say a story is "done done" when all of the function test scenarios have been executed, and it has no severity 1 (the function doesn't work, and there's no work-around), severity 2 (the function has bugs that you can work around), or must-fix (as determined by the testers) defects that are still not closed. It may have severity 3 (small problem) or 4 (annoyance) defects. It also has to be demoed at the end of the sprint.
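Those criteria are concrete enough to write down as a check. The following is only a sketch in Python. The Defect and Story classes and their field names are invented for illustration and do not come from any real tracking tool.

```python
from dataclasses import dataclass, field


@dataclass
class Defect:
    severity: int           # 1 (no work-around) through 4 (annoyance)
    must_fix: bool = False  # flagged as must-fix by the testers
    closed: bool = False


@dataclass
class Story:
    scenarios_total: int     # planned function test scenarios
    scenarios_executed: int  # scenarios actually run
    demoed: bool             # shown at the end-of-sprint demo
    defects: list = field(default_factory=list)


def is_done_done(story: Story) -> bool:
    """Apply the 'done done' criteria described above."""
    all_tested = story.scenarios_executed >= story.scenarios_total
    # Open severity 1, severity 2, or must-fix defects block the story.
    blocking = [d for d in story.defects
                if not d.closed and (d.severity <= 2 or d.must_fix)]
    return all_tested and story.demoed and not blocking


story = Story(scenarios_total=10, scenarios_executed=10, demoed=True,
              defects=[Defect(severity=3), Defect(severity=1, closed=True)])
print(is_done_done(story))
```

An open severity 3 or 4 defect does not block the story unless the testers have flagged it as must-fix.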

I know someone who works on a project where there is a long list of criteria to be met before a story is "done done done" as they call it. In addition to code hygiene, it also has to go through system test, translation, accessibility testing, and so on. As a result, no story is marked as "done" until the product is about to go out the door. This makes their story points meaningless. It's overkill!

On the other hand, it's not appropriate to say that a story is "done" just because you've completed unit testing on it. If the testing is not completed by the last day of the sprint, the story needs to be moved to the next sprint as debt. I've seen people try to fudge their way out of this... "well, this story only has one open defect, so we should get credit for it". We need to be at peace with sprint-to-sprint debt. The good news is that the stories that are almost done should be closed quickly at the beginning of the next sprint, so your velocity for the next sprint will be higher. Over time your story point velocity will average out to an accurate number.
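Here is a small worked example of how the carry-over averages out, sketched in Python. The story sizes are made up: an 8-point story misses the end of sprint 1, moves to sprint 2 as debt, and is credited there.

```python
# Hypothetical story sizes, keyed by the sprint in which each story was actually closed.
closed_points = {
    1: [5, 5, 5],      # the 8-point story was not finished, so it earns nothing here
    2: [8, 5, 5, 3],   # the carried-over 8-point story closes early in sprint 2
    3: [8, 5, 5],
}

# Velocity per sprint is the sum of the points closed in that sprint.
velocity = {sprint: sum(points) for sprint, points in closed_points.items()}
average_velocity = sum(velocity.values()) / len(velocity)

print(velocity)
print(average_velocity)
```

Sprint 1 looks low (15) and sprint 2 looks high (21), but the average over the three sprints is 18, which is the number you actually want for planning.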

I'd love to hear what some other teams are using as their "done done" criteria.

Wednesday, March 3, 2010

Using Story Points for Sprint Planning?

Our team is still trying to get the hang of using story points. Until very recently, we weren't using them at all. I think this is for a few reasons:
  • Story points are relative to each other, not to the calendar. This makes them more abstract than person-days, so they're a little hard to wrap your head around at first.
  • We weren't in the habit of playing planning poker to put points on our stories. On the rare occasion we did put points onto stories, they were really just bastardized person-days (1 story point = 1 person-day).
  • Because we didn't have story points on things we had already completed, we couldn't use previously completed stories as a reference to put points onto new stories.
  • Because we didn't have any historical data on how many story points we closed in previous sprints, we had no idea what our velocity was, so we couldn't use story points or velocities to help plan the next release.
  • Story points aren't detailed enough for sprint planning. If you know that you have two weeks to code, and 3 people writing the code, how many story points can you commit to in the sprint? It doesn't have any meaning.

It's a vicious cycle: story points aren't useful if you haven't used them before... so people don't try to use them... so you don't gather the historical data to make them useful...

The problem is that we were spending too much time doing sizing estimates. As soon as you try to size something in terms of person-weeks, that implies that you have a pretty good idea what the low-level tasks are, and how long it will take to complete them. Honestly, it wouldn't be unusual for us to invest a person-week into sizing a story, when you consider how many people were pulled into each sizing effort for a few hours apiece. So we were told that we had to start using story points, and stop spending so much time on the sizing estimates. In planning poker, you rarely spend more than 2-3 minutes sizing a story.

People scoffed at the idea that you could plan a sprint without getting down to the task level and estimating how long it would take to implement each story, though.

Then someone found this article:

Now, I think we will be using story points for release planning (meaning, planning far into the future), and task hours/person-days for planning each individual sprint.

This means that we'll have to take a leap of faith. We have to start assigning points to stories before we start working on them. Once we've done that for a couple of sprints, we'll have some idea what our velocity is. Then we can start to use that information for planning at the release level. We'll see how it goes.
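Once a few sprints of history exist, the release-level arithmetic is simple. This is a minimal sketch in Python with invented numbers. The function name sprints_needed is hypothetical, and using a plain average of past velocities is an assumption, not a rule.

```python
import math


def sprints_needed(backlog_points: int, past_velocities: list) -> int:
    """Estimate how many sprints the remaining backlog will take."""
    # Average velocity over the sprints measured so far.
    velocity = sum(past_velocities) / len(past_velocities)
    # Round up: a partly used sprint is still a sprint on the calendar.
    return math.ceil(backlog_points / velocity)


# Hypothetical example: 120 points left, three sprints of history.
print(sprints_needed(120, [15, 21, 18]))
```

With an average velocity of 18 points per sprint, 120 points of backlog works out to 7 sprints. The estimate is only as good as the history behind it, which is why the first couple of sprints are a leap of faith.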

So, what does your team do with story points and sprint planning? Is it working?

What is AgileFall?

AgileFall is a tongue-in-cheek term for a software development model where you are trying to be agile, but you keep falling into waterfall development habits. For example, a team practicing AgileFall may say:
  • We have sprints, but they are four or six weeks long.
  • We try to do most of the development at the beginning of the sprint and most of the testing at the end of the sprint.
  • We have a product backlog with priorities, but we start working on a release with a long list of features already committed to the business.
  • We know our product release dates several months in advance. There's a long list of target dates that must be met before the release date.
  • We test new features as they are developed in each sprint, but we also have a lonnnng system / globalization / translation / integration test cycle at the end of the release.
  • We sometimes spend too much time up front designing the software before we even start prototyping it.
  • We do our initial sizings in person-weeks, because we're not comfortable with story points and velocities yet.
  • When management asks us for a rough sizing, we might spend days working on that sizing effort, because we feel like we need to really understand the tasks involved before we "commit" to a sizing.
  • People get upset (or defensive) when stories are not completed in the sprint they were planned for.
  • We demo our software in development to customers, but only after we're happy with it. By the time we get around to having a demo, it might be too late to change much.

Is your team practicing AgileFall? In what ways?


Welcome to my AgileFall blog! I've been creating software since 1995. I have a Bachelor of Science in Computer Science from Duke, and a Master of Science in Computer Science from UNC, where I focused on Distributed Computing and Software Engineering. I studied software engineering theory, including the waterfall and agile models, in school. For the first 10 years or so of my career at IBM, we followed a strict waterfall development model. Then about 2 years ago we started switching to an agile development model. Our development and support team has also become increasingly global. This blog is about our successes and failures.