Tuesday, November 24, 2009

The Square Root Poster

For our final reflection poster (pdf) the team decided to try something a little...different.



It's a supergraphic:
Supergraphics are interpreted by viewers on their own terms, allowing an audience to absorb the information at their own rate. Sure, you may wish to call attention to certain details (that's why you're in front of them), but letting the audience come to their own conclusions can generate fruitful discussion during or following your talk.

The best example of a supergraphic is the map of Napoleon's 1812 march to Moscow by Charles Joseph Minard. It's great for a number of reasons: it's high resolution, multivariate, indicates causality between variables, and uses great graphic design. With this map as our guide, and advice from Tufte gleaned during one of his excellent seminars (students are only $200!), the Square Root team attempted to create our own, as shown above.

Reading the Poster

The X-axis shows time, spread logarithmically to reflect the relative effort spent in each semester.

The Y-axis shows team morale. Morale is really trust, as measured by our regular team dysfunction surveys. These surveys were taken from Patrick Lencioni's The Five Dysfunctions of a Team, in which trust forms the basis of a well-functioning team.

The thickness of the line shows team maturity. Maturity was measured through our team reflection process, in which we asked three simple questions: "Is the process documented?" "Is the team following the documented process?" "Is the process working for the team?" Quantitative answers to these questions gave us a notional idea of maturity, and qualitative responses helped us improve as a team.
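
For illustration only, here is a minimal Python sketch of how answers to those three questions could be folded into a notional per-process maturity score. The process names and answers below are hypothetical, not our actual survey data.

    # A notional maturity score: the fraction of "yes" answers to the three
    # reflection questions. Process names and answers below are made up.
    QUESTIONS = ("documented", "followed", "working")

    def maturity(answers):
        """answers maps each question to True/False; returns a score in [0, 1]."""
        return sum(bool(answers[q]) for q in QUESTIONS) / len(QUESTIONS)

    reflections = {
        "planning": {"documented": True, "followed": True,  "working": False},
        "tracking": {"documented": True, "followed": False, "working": False},
    }

    for process, answers in reflections.items():
        print(f"{process:10s} maturity: {maturity(answers):.2f}")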

The branching tributaries, or excursions, leaving the main trunk of the graph show processes that the team was able to execute independently of other processes. This is another way of thinking about maturity. For example, by the end of the summer the team had matured enough that we could tweak individual processes without affecting other processes.

The annotations on the graph show what the team decided were the most critical moments of our project. Critical moments are events which had a significant and immediate impact on the team in some way. You can read about many of the stories behind the critical moments on this blog.
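
If you're curious how those encodings might be reproduced programmatically, here is a minimal Python/matplotlib sketch, with entirely made-up numbers, that draws a single trunk whose height is morale and whose thickness is maturity. It illustrates the idea only; it is not the tool we used to build the poster.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.collections import LineCollection

    # Hypothetical data: time on the x-axis, trust/morale on the y-axis,
    # and a maturity score that drives the line thickness.
    t        = np.array([0, 1, 2, 3, 4, 5, 6], dtype=float)
    morale   = np.array([3.0, 2.4, 2.1, 3.2, 3.8, 4.2, 4.5])
    maturity = np.array([1, 1, 2, 3, 4, 5, 6], dtype=float)

    # Build one segment per pair of adjacent points so each segment
    # can have its own line width.
    points = np.column_stack([t, morale]).reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)

    fig, ax = plt.subplots()
    ax.add_collection(LineCollection(segments, linewidths=maturity[:-1]))
    ax.set_xlim(t.min(), t.max())
    ax.set_ylim(0, 5)
    ax.set_xlabel("time (one unit per reflection point)")
    ax.set_ylabel("team trust / morale")
    ax.annotate("planning breakdown", xy=(2, 2.1))  # a critical-moment label
    plt.show()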

Analyzing our data as a supergraphic allowed the team to see things that we would not have seen otherwise, to think about and reflect on the project in a way that no one else has thought about it. Some interesting things that can be seen in the graphic:
  • The forming, storming, norming, and performing team stages are clearly visible
  • The effects of better visibility on morale (we were blissfully ignorant in the fall)
  • Even negative things can be a positive, as was the case in our planning breakdown
  • Commitment to process can lead to big pay-offs
  • Small changes can have a huge impact on a team, so those changes should be made

In addition, it just plain looks awesome.

Our Message

There were two big messages that we wanted people who read our poster to take away.

First, there is no single right answer except "it depends." We designed our poster so you can take away messages that are meaningful to you. As you can see on this blog, every member of the team has taken away different ideas from the studio experience. The poster was meant to reflect this by making it easier to share advice on a wide range of topics, all of which will be interesting to someone but not all to the same person. Tufte puts it best: create an image which allows readers to explore the data using their own cognitive style.

Second, since there is no single right answer and no best way of doing things, experimentation is the key to success. The studio environment is an ideal time for experimentation. Success or failure is not nearly as important as understanding why you succeeded or failed.

Enjoy exploring the data. If you want more context, please feel free to read through our blog or any of the other data in our project archive. If you have questions, don't hesitate to get in touch.

[Edit: We've got video of the presentation too!]

Thursday, November 19, 2009

AUP Chosen as Guiding Process




One of the best decisions made during our Studio program was choosing AUP as our overall process guidance. This was a critical decision because early in the semester we were wandering, unsure of which activities to focus on.

The AUP's phase-based approach mapped really well to the MSE program's semester structure. At the high level it helped the team plan better: it provided the go/no-go criteria for moving from one phase to another. These exit criteria became milestones in our plan, from which all the project tasks were derived. By building our plan around these milestones we streamlined the team's effort in a single direction, and following the plan meant following the AUP's activities.
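
To make the idea concrete, here is a minimal Python sketch of phases with exit criteria driving a go/no-go check. The criteria listed are hypothetical examples, not our actual milestones.

    # Hypothetical exit criteria per AUP phase; leaving a phase requires all of them.
    AUP_PHASES = {
        "Inception":    ["vision agreed with client", "candidate use cases listed"],
        "Elaboration":  ["architecture baselined", "high-risk requirements resolved"],
        "Construction": ["planned features implemented", "regression tests passing"],
        "Transition":   ["acceptance test passed", "user documentation delivered"],
    }

    def go_no_go(phase, completed):
        """Return True (go) only if every exit criterion for the phase is met."""
        missing = [c for c in AUP_PHASES[phase] if c not in completed]
        if missing:
            print(f"NO GO out of {phase}; still missing: {missing}")
            return False
        print(f"GO: {phase} exit criteria satisfied")
        return True

    go_no_go("Elaboration", {"architecture baselined"})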

AUP also allowed us to embed iterations within phases, which let us use a Scrum-like iterative process for our planning and Extreme Programming during our Construction phase. In other words, we could follow the AUP over the long term of the project while using other processes within each phase.

The Planning Race

Step 1. Figure out which features to implement.

Step 2. Specify the tasks that are required to complete the desired features.

Step 3. Peer review the specified tasks.

Step 4. Calculate team members' velocities based on the previous two iterations of accomplished work.

Step 5. Start the planning race.

The planning race is where team members attempt to fill their task queues (the upper limit of which is determined by the velocity) as quickly as possible. The faster you grab tasks, the more likely you'll get to do the things you want to do. The race should take place in a large conference room where loud talking is allowed. Bargaining, bartering, calling dibs, and talking smack are all highly encouraged. If you're doing it right it should almost sound like the commodities market where teammates are buying and selling tasks at fixed hourly rates. As punishment for taking more tasks than allowed by your velocity, other team members are allowed to cherry pick tasks from your queue.
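
Here is a minimal Python sketch of the bookkeeping behind the race: velocity comes from the last two iterations, and a member's queue can't exceed that velocity. The tasks, hours, and member histories are invented for illustration, and the real "race" is people shouting across a conference room, not a function call.

    import statistics

    # Hypothetical backlog of well-specified tasks with estimated hours.
    backlog = {"spec review": 4, "login page": 8, "db schema": 6,
               "unit tests": 5, "docs": 3}

    # Step 4: velocity = average hours actually completed over the last two iterations.
    history  = {"alice": [22, 26], "bob": [18, 20]}
    velocity = {name: statistics.mean(done) for name, done in history.items()}

    queues = {name: [] for name in velocity}

    def grab(member, task):
        """Claim a task if it still fits under the member's velocity cap."""
        queued = sum(backlog[t] for t in queues[member])
        if queued + backlog[task] > velocity[member]:
            print(f"{member} is over velocity; teammates may cherry-pick their queue")
            return False
        queues[member].append(task)
        return True

    # Step 5: the race itself (simulated here as two polite, silent grabs).
    grab("alice", "login page")
    grab("bob", "db schema")
    print(queues)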

Advantages: Dialing the planning knob to 11 means less time spent planning. I know how much you love meetings, but less time spent planning means more time spent getting things done.

Disadvantage: The Planning Race requires a healthy backlog of tasks to pull off. There have to be at least enough tasks for everyone to fill their queue, and ideally a few more left on the backlog. Tasks also have to be well specified, meaning everyone understands and agrees on what needs to be completed.

The moment of Zen for the Square Root team was when we incorporated Step 3. Peer reviewing new tasks in the backlog streamlined our entire planning process and allowed us to plan faster and better than we had ever planned before. The result: not only were we spending less time planning but the quality of our plan increased dramatically. Some of this may be due to increased maturity and practice, but I stand by the Planning Race. It's super fun.

Before tasking peer reviews:

After tasking peer reviews:

Sunday, November 15, 2009

Wicked Requirements

Rittel and Webber coined the term wicked problems (pdf) to describe a certain class of problems that are particularly tricky to deal with. Such wicked problems exhibit most of the ten specific characteristics they describe. I propose that planning software within a fixed budget is a wicked problem.

At the start of our studio project we knew we had to gather some kind of requirements. Requirements are the first step in nearly every software lifecycle model, but few processes spend any effort describing how those requirements are supposed to be gathered. We were basically left to our own devices when building a process for requirements elicitation. Sure, we had techniques we could apply, but those didn't help us address the wickedness of planning the project.

There is no definitive formulation of a wicked problem. Depending on the methods we chose to elicit requirements, the solutions would change. Different techniques might prompt different reactions from the client and in turn give us a different view of the solution space. This was certainly true, as the information we got from ARM, use cases, prototypes, and problem frames was always different.

Wicked problems have no stopping rule. When do you have enough requirements? Agilists subscribe to the pay-as-you-go model to deal with this issue. Knowing that we wanted to spend more time on design and didn't have a firm grasp of the problem domain, we felt we needed more information. Any requirements engineering process we built would need to provide guidance for stopping.

Solutions to wicked problems are not true-or-false, but better or worse. Any plan we create for the project based on our requirements will never be The Plan. Our requirements can only ever provide more or less information which allows us to make better or worse plans. Of course if the requirements are incorrect...

There is no immediate and no ultimate test of a solution to a wicked problem. How do you know you gathered the right requirements? The funny thing about requirements is that, though they may be testable, unambiguous, and precise, that doesn't mean they are right. The best part is that even if the client and the team think a requirement is right, once it's implemented (the eventual, partial test) everything changes. "That's great but can it do this other thing instead?"

Every solution to a wicked problem is a "one-shot operation"; because there is no opportunity to learn by trial-and-error, every attempt counts significantly. Normally this wouldn't be too big of a problem, but given that our project is so short-lived, we really only get one shot at gathering our requirements or we'll be so far behind that we might be unable to catch up. Like it or not, our product ships in August.

Wicked problems do not have an enumerable (or an exhaustively describable) set of potential solutions, nor is there a well-described set of permissible operations that may be incorporated into the plan. What do requirements look like? We chose to record our requirements as use cases. Eventually we needed to add more information in the form of paper prototypes, work flow diagrams, and additional team communication.

Every wicked problem is essentially unique. Sure, folks have built web applications before, but no one has ever built a SQUARE Tool like what we've built.

Every wicked problem can be considered to be a symptom of another problem. It was not uncommon for our requirements elicitation discussions to uncover other ideas about the SQUARE Process or other ways the tool should operate (other than the specific feature we were currently discussing).

The existence of a discrepancy representing a wicked problem can be explained in numerous ways. The choice of explanation determines the nature of the problem's resolution. Our solution took the form of use cases. This worked well in some ways but was awkward in others. In particular, it was difficult to articulate, plan, and implement system-wide features such as quality attributes. We knew this going in and tried to compensate but our ideas didn't always work out the way we thought they would.

The planner has no right to be wrong (planners are liable for the consequences of the actions they generate). Ultimate responsibility for each requirement lay with the team. If we were unable to gather the right requirements it would impact our grades and our relationship with our client.


10 for 10. Requirements engineering and software project planning is absolutely wicked.

Requirements Engineering
We chose to focus effort in the fall on gaining domain knowledge and gathering requirements. Whether this was the right thing to do or not I leave to another discussion on another day. Instead I'm going to discuss what we did and how it worked out.

At first we started with a depth-first approach to collecting our requirements. I've spoken on my personal reflection blog about the importance of having proper affordances in a software process, and this is a case where the affordances were all wrong. Our original requirements process required that the team completely record requirements for the first three steps in the SQUARE process before moving on to the other steps.

Given that requirements engineering is a wicked problem, this process was doomed to failure. There were two main issues with this process. First, the team had to wait for answers to questions from the client thus blocking the completion of other use cases. Second, not enough was known about some use cases to continue pursuing them until further research or experimentation could be completed. According to the process we would need to conduct those experiments before moving on to other requirements. This is obviously not satisfactory.

Almost a year ago to the day (November 14), as the team lead at the time, I called an end to the original requirements engineering process and suggested a breadth-first approach in which a known set of use cases would be specified to an agreed-upon level of quality (defined by a checklist).

The new breadth-first approach worked as planned. As a team we were able to gather information on a variety of use cases simultaneously and achieve a minimum level of understanding about the system.
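
A minimal Python sketch of the breadth-first idea: every use case has to clear the checklist before any single one gets deeper attention. The checklist items and use case names here are hypothetical, not our actual artifacts.

    # Hypothetical quality checklist that every use case must satisfy first.
    CHECKLIST = {"actors identified", "main flow written", "open questions logged"}

    # Hypothetical use cases mapped to the checklist items completed so far.
    use_cases = {
        "UC-01 categorize requirements": {"actors identified", "main flow written"},
        "UC-02 elicit security goals":   set(CHECKLIST),
    }

    def breadth_first_gaps(use_cases):
        """Report which use cases still fall short of the agreed level of quality."""
        return {name: sorted(CHECKLIST - done)
                for name, done in use_cases.items() if CHECKLIST - done}

    print(breadth_first_gaps(use_cases))
    # {'UC-01 categorize requirements': ['open questions logged']}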

Having the guts to make the change as a team allowed us to avert near disaster and let us have a complete (if not finished) set of requirements that were good enough for starting design work, prototyping, and experimentation. We nearly failed because we tried to solve a wicked problem with a tame process.

SharePoint Installation

This blog post is probably a figment of the writer's imagination. Read at your own risk.

Background

In the fall semester, the Square Root team's tracking was inadequate. The team was tracking a bunch of stuff but was not getting value from it. Tracking was done in Excel sheets on the shared folder and combined using scripts. When a team member wanted to record time on the timecard, s/he had to sign into the VPN, remote desktop into the server, and then put in the time. This was tedious and boring, and the team was not accustomed to tracking. As a result, aggregating the data at the end of iterations was difficult, and the team could not draw any conclusions from it.
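
We never published those scripts, but for the curious, here is a minimal Python sketch of the kind of aggregation they did, assuming each member exported a CSV with member, task, and hours columns (the folder layout and column names are hypothetical).

    import csv
    import glob
    from collections import defaultdict

    # Hypothetical layout: one CSV per team member in a shared "timecards" folder,
    # each row containing member, date, task, and hours columns.
    totals = defaultdict(float)
    for path in glob.glob("timecards/*.csv"):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                totals[(row["member"], row["task"])] += float(row["hours"])

    for (member, task), hours in sorted(totals.items()):
        print(f"{member:10s} {task:30s} {hours:5.1f} h")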


A New Hope

At the beginning of spring, I installed SharePoint to tackle the tracking problem. SharePoint provides a central repository for tracking data. SharePoint's user interface was similar to Microsoft Excel, so the team did not have to learn anything new.

There was still some skepticism regarding SharePoint's abilities, probably because it was not clear at first what would be documented in the wiki vs. what would be in SharePoint. However, we resolved these questions pretty quickly.

We were able to trace from the milestones set at planning meetings to the individual tasks completed by the team, and therefore we were also able to see the team's progress on a week-to-week basis.


The Empire Strikes Back

However, towards mid-spring, the team realized that most of the milestones were not being completed, and at each status meeting we had action items to clean up tasks on SharePoint. One particular week, it appeared that only 20% of the architecture refinement milestone was done when in reality it was 80% done. This was an issue given that the team's planning, forward progress, and overall morale depended on the tracking data from SharePoint.

Return of the Jedi

At that time, we changed our planning process to a Scrum-like process. Team members took tasks from the backlog in a meeting, and so buy-in for those tasks increased. Since team members each took only 48 hours' worth of work (the number of hours available in an iteration), they also felt more responsible for finishing and tracking those tasks. This gradually improved our tracking process, and by the end of spring we could rely on our tracking data.

This helped us in the summer, when we used the tracking data to measure team velocity and planned iterations based on that velocity. With the building blocks of tracking fixed in the spring, we were able to make enhancements such as using earned value and using trends from previous iterations' burn-downs.
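
As a rough illustration (with made-up numbers, not our real tracking data), here is a Python sketch of the two figures we leaned on: velocity from completed hours and a simple percent-complete style earned value.

    # Hypothetical iteration records pulled from the tracking tool.
    iterations = [
        {"planned_hours": 96, "completed_hours": 80},
        {"planned_hours": 96, "completed_hours": 88},
    ]

    # Velocity: average hours of planned work actually completed per iteration.
    velocity = sum(it["completed_hours"] for it in iterations) / len(iterations)

    # A simple earned-value style figure: completed work as a share of the
    # total hours budgeted for the phase (budget is hypothetical).
    budget_at_completion = 400
    earned = sum(it["completed_hours"] for it in iterations) / budget_at_completion

    print(f"velocity: {velocity:.0f} h/iteration, earned value: {earned:.0%} of budget")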

The key takeaways from this were:
  1. A good collaboration tool is essential in making tracking data available for analysis and decision making.

  2. No matter how good the tracking tool is, the team has to buy into it for it to be useful to the team.

Friday, November 13, 2009

Planning Process Communication

Background

During the first half of the spring semester, our team was following a continuation of our fall planning process. Milestones were assigned to milestone owners at the planning meeting, along with budgeted time, and then it was the milestone owners' responsibility to make sure that the milestone was completed by the end of the iteration.

There were several issues with this process:


  1. The milestone owners were supposed to give out the tasks for their milestones, but they felt responsible for the tasks, and so owners tried to finish up milestones by themselves.
  2. Since there was no team review of tasks, the milestone owners often did not specify tasks in adequate detail. So even when other team members wanted to help out, they did not have enough detail about the tasks.
  3. Lack of detail contributed to bad estimates on the milestones, and so each iteration most of the milestones would not get done. As a result, the team was getting demoralized (finishing makes people happy).
At that time, right before spring break, Michael suggested that we should try something Scrum-like.


Creating the process


With Michael's suggestion and a Scrum book from Marco, I wrote up our Spring planning and tracking process that galvanized the team, organized the team's work, and brought happiness to the world. No, actually, this process raised another issue, but I'll talk about that in the next section.
As you can see from this process, its key points were these:
  1. Milestones still had owners, but the owner's responsibility was to adequately specify tasks before the planning meeting, not afterwards. These tasks formed the product backlog.
  2. The team sat down together, decided on the milestone and task priorities, and then took tasks from the backlog according to those priorities.
  3. The team lead's job was to make sure overhead tasks were also included in the individual backlogs. At the end of the planning meeting, no team member was supposed to have more than 12 hours of tasks (a rough sketch of this allocation rule follows the list).
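
Here is the promised sketch of the allocation rule, in Python. In reality people chose their own tasks; the greedy "least-loaded member" assignment below is just a stand-in, and the tasks, hours, and names are hypothetical.

    MAX_HOURS = 12  # cap per team member at the end of the planning meeting

    # Hypothetical prioritized backlog (already ordered by milestone/task priority).
    backlog = [("fix login bug", 3), ("specify UC-07", 5), ("status report", 2),
               ("prototype workflow UI", 6), ("update wiki", 2)]
    members = ["alice", "bob", "carol"]
    queues = {m: [] for m in members}

    def load(member):
        return sum(h for _, h in queues[member])

    for task, hours in backlog:
        member = min(members, key=load)        # stand-in for "someone volunteers"
        if load(member) + hours <= MAX_HOURS:  # enforce the 12-hour cap
            queues[member].append((task, hours))

    print(queues)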

After writing down the process, I emailed it out to the team, and then reviewed it at our weekly status meeting. The team seemed to generally like it.

Implementation

However, after spring break, when we came back to implement the process, we found there were significant gaps in the team's understanding of it. It was clear that we all had different interpretations of the process, even though it was written down and crystal clear (to me at least).

That was when we had to forget our previous reviews of the process and just work it out face to face with the team. We tried out each stage of the process and adjusted it according to our needs. We prioritized milestones, then went off to detail tasks, and then came back to take those tasks according to milestone priorities.

At the end of that 5-hour-long exercise, we had our first product backlog and individual tasks that we were responsible for. Since everyone had tasks that they had voluntarily taken, the milestone owners were not burdened with distributing tasks, and everyone knew the milestone priorities. Therefore, the most important milestones were guaranteed to finish.

I learned a big lesson from that meeting. It does not matter how eloquently you write something down. It does not matter how emphatically your team nods their heads when you ask if they understood it. Only when the team can work through a process together and feel confident about it can you say that the team understands the process.

Conclusion

The result of all this effort was a happy ending after all. In the summer, the team added some additional checkpoints to make sure tasks were being specified correctly and completely, and added a client-prioritization step. However, the underlying process stayed the same.
The key takeaways from this were:

  1. Critical processes such as planning and tracking need to be communicated. And sometimes you have to sit with your team in a 3-hour meeting for that communication to happen.

  2. Prioritizing milestones together with the team really helps to get team buy-in on the importance of those milestones.
  3. Since the team members were taking tasks themselves, they felt more responsible for them than before.


Tuesday, November 10, 2009

Using Use Case Points

Early on we decided to record our functional requirements as use cases. Estimating how much effort was required to complete the project based on the collected use cases turned out to be a much more challenging problem. In creating our estimates I turned to use case points estimation (pdf), a technique described by Mike Cohn of Mountain Goat Software.

The basic premise of use case points is that by counting the number of transactions in a use case and then applying some general rules of thumb to that number (possibly taking into account adjustment factors based on quality attributes and environmental factors), you can get a rough order-of-magnitude sense of how big a use case is. This is the same basic premise used for all estimation (count, optionally adjust, project based on data) that McConnell talks about in his excellent estimation handbook, Software Estimation: Demystifying the Black Art.
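
For a flavor of the counting step, here is a minimal Python sketch using the transaction-count weights commonly cited for use case points (simple = 5, average = 10, complex = 15). The use case names, transaction counts, and adjustment factor are hypothetical, and the full method also weighs actors and derives its technical and environmental factors from rating tables.

    def weight(transactions):
        """Commonly cited use case weights by transaction count."""
        if transactions <= 3:
            return 5    # simple
        if transactions <= 7:
            return 10   # average
        return 15       # complex

    # Hypothetical use cases and their transaction counts.
    use_cases = {"UC-01 categorize requirements": 4,
                 "UC-02 prioritize requirements": 9,
                 "UC-03 generate report": 3}

    unadjusted = sum(weight(t) for t in use_cases.values())
    adjustment = 0.96             # hypothetical technical/environmental factor
    print(unadjusted, round(unadjusted * adjustment))   # e.g. 30 -> 29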

Counting transactions is easy and straightforward. For the SQUARE Tool, we calculated approximately 255 unadjusted use case points and 244 adjusted use case points. [A complete breakdown of use cases and their corresponding point estimates is available on our archived project website.] The use case point estimates gave us a rough idea of how big and complex each use case would be compared to the others. The tricky part for us was projecting effort and schedule from the use case point number. Being a new team, we didn't have historical data. To further complicate matters, we were conducting this estimate in parallel with our architectural design work, much later in the life of the project than Cohn's paper implies the technique should be used.

Not having a means of projection, I turned to some interesting sources. Keep in mind that the total effort allocated to the studio project is only about 5,300 person hours (over a 16-month period) and time must be split among all stakeholders, including faculty (e.g., end-of-semester presentations). At the time these estimates were created, about 4,000 person hours of work remained in the project.

Assuming 20 - 28 hours per point means we would need between 4,800 and 6,800 person hours of effort to complete the project.
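
The arithmetic behind that projection is just the adjusted point count times the assumed hours-per-point range, as this small Python snippet shows:

    adjusted_points = 244
    low, high = adjusted_points * 20, adjusted_points * 28
    print(f"{low} - {high} person hours")   # 4880 - 6832, roughly 4,800 - 6,800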

Converting use case points to function points to Java lines of code (result is approximately 8,000 LOC) and then running these numbers through COCOMO II (default settings) gives an estimate of 3,700 - 5,900 person hours of effort to complete the project.

Surely MSE students work faster than the default COCOMO II assumptions. Given that MSE teams typically produce code at a rate of 4.48 LOC/hour, the Square Root team would need only 1,819 person hours to complete the project.

According to these estimates, none of which corroborate one another, the project will take between 2,000 and 7,000 person hours of effort! So we'll either finish well under time or blow the whole project. Not very useful.

To overcome the variation in our estimates and hopefully come up with something a little more useful, we conducted a Wide-band Delphi estimation session, sampling a set of use cases to determine an approximate value for a use case point. Following the session, we determined that a use case point was worth between 8 and 12 hours for our team. This gives us an estimated range of 1,800 to 2,300 person hours of effort, a much more manageable range and certainly (hopefully) more realistic.

We used the average use case point value of 10 hours for the purposes of planning. Tracking the team's progress over time using earned value, it became clear that we should have chosen the lower, 8 hour point value.

Conclusions

Use case point estimation worked out OK for the team. Realistically, any reasonable proxy would have done. We wasted a lot of time trying to find reliable industry sources for point values when the most accurate estimation of all was a simple Wide-band Delphi estimate done by the team.

The most important thing about the estimates was that, for the first time, they gave us the ability to see beyond the next iteration and allowed us to project through to the end of the summer. That we were able to produce these estimates at all, in my mind, marked the end of the "Period of Uncertainty." From that day forward we had a plan, knew how long it would take, and could definitively determine where within that plan we currently were and whether we were where we needed to be to successfully complete the project.

Use case points were unsatisfying because use cases were generally unsatisfying as a means of recording requirements. While the nature of use case points would have allowed us to create an estimate earlier, the Wide-band Delphi session was successful only because we had enough detail to have meaningful conversations about what was in each use case. Had we attempted this too early, the estimate would naturally have been less accurate (though perhaps still useful if you could figure out a way to track architectural progress within the created estimates).