
FrequentBuilds

From AgilePractices


I've been a fan of nightly builds for as long as I can remember. When I was a developer and it was cheap enough, I built whenever I had "enough" changes to warrant a build: 2, 3, 4, 5, however many times per day was necessary. If the builds took too long, nightly builds were the answer.

I'm working with a client who does biweekly builds, and each build takes at least a week to settle down. I'm recommending that they change their product and build architecture to support nightly builds; I believe they can gain significant time in their projects by doing so.

Do you have a preference for build frequency? When is a build too long? What wisdom do you have around builds, nightly, hourly, or otherwise?

JohannaRothman 2003.06.29


I prefer nightly builds at a minimum, where the build includes running regression tests. The project I'm working on now is still small enough that I can compile the pieces and run all of the non-database unit tests in less than two minutes. That's short enough that there's no excuse for checking in code that breaks a build.

As product build time increases, there's an almost step-function-like dynamic. First, it becomes infeasible for developers to build the entire product before checking in code. This usually leads to increased incidents of build-breakage. At that point, when breaks do happen, it can take most of the morning to get things cleaned up to the point of having a clean build. At some point, people start grousing that they're wasting lots of time waiting for a clean build, so builds are reduced to 2 or 3 times weekly, or, eventually, to weekly or less.

One of the ways out of this is to have good unit tests. The chance of breakage in the full system is reduced when the parts are well-tested. Unfortunately, it's often difficult to "stop progress" to beef up unit tests once the nightly build situation has gotten bad. (Either there's no political will, or people are too busy thrashing.) Having lived through this a few times, I now strive for "test driven" development, where unit tests are written first. Having a concrete client for an API helps avoid flights of fancy when coming up with APIs, which leads to better APIs, which leads to fewer problems when other code uses the APIs, which leads to fewer broken builds and unit test failures.
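
To make "a concrete client for an API" less abstract, here's a minimal sketch in C++ using plain asserts rather than any particular test framework. The names (Order, parseOrder) are invented for illustration, not taken from any real project; the point is that the test function is written as the API's first caller, and the body underneath it is only enough to make the test pass.

  • File order_test.cpp
    /* Hypothetical test-first sketch: the test below was written as the
       first client of parseOrder, which shaped the signature before any
       implementation existed. */
    #include <cassert>
    #include <cstdlib>
    #include <string>

    struct Order {
        std::string sku;
        int quantity;
    };

    /* Written after the test, and only enough to make it pass. */
    Order parseOrder(const std::string& line) {
        std::string::size_type comma = line.find(',');
        Order o;
        o.sku = line.substr(0, comma);
        o.quantity = std::atoi(line.c_str() + comma + 1);
        return o;
    }

    void testParseOrder() {
        Order o = parseOrder("WIDGET-7,3");
        assert(o.sku == "WIDGET-7");
        assert(o.quantity == 3);
    }

    int main() {
        testParseOrder();
        return 0;   /* fast enough to run on every build */
    }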

--DaveSmith 2003.06.29


Please explain the term "build". In a biweekly build, what is it exactly that happens biweekly? -- LaurentBossavit 2003.06.30


I've seen several variations on this depending upon whether (as Laurent asked) build merely means "my local source tree" or the central shared copy.

Depending on whether builds are done via a central server on what's (already) checked-in, or on a private clone of the source files, there is different latency in picking up others' changes.

Most PC development projects I've been on develop and build on private clones of the sources, then check in after local verification. I've seen a variety of latency problems where developers continued to build with sources retrieved up to a month ago - so others' changes this iteration weren't reflected until check-in.

Most mainframe projects I've been on use the central shared source & build scheme. You usually used library concatenation to see your changed sources in place of the checked-in baseline sources. This worked nicely, with very little latency for changes external to your own.

Of the version control tools I've used, I liked ClearCase the best because you had a "virtual" file system concatenation, so you got the benefit of current sources everywhere you didn't have an altered source of your own. I could also assess "who" or "who else" had checked-out versions that I might bump into.

I prefer to see full builds (at least in languages that use #include or import technology) rather than trust incremental builds. (I've been burned too many times by include-processing differences.) I'm usually willing to spend a lot of effort getting the team established with full build tooling that's fast enough to support multiple builds per day, whether they check-in changes hourly, daily, or weekly. I like to make it easy to refresh the "all other" sources, tests, resources to echo the central build before check-in.

BobLee 2003.06.30


Laurent, when I use "build," I mean create an executable that contains all the changes to date. Biweekly means that once every two weeks, all the code is gathered, compiled, an executable is created, and the smoke tests are run against the executable. Does that help?

JohannaRothman 2003.06.30


I think there are two rhythms that matter, at least. One is to check current / recent local changes against the last known good baseline of "everything else." The other is to integrate changes from multiple sources.

  • Especially with systems problems, you want the "results" of a change available before you forget the problem and solution you were working with.
  • With a bunch of people changing "stuff" you want to push the stuff together before it gets too much mutually out of whack.

Most of the time, I've seen those amount to no less than nightly, and no less than weekly, respectively. I'm not even all that worried about an intermediate build that fails before stuff is "done." So, the half a change I got in there blew up X, Y, and Z. We expected that. The useful information is when that change also kills P, Q, and R.

Sometimes, getting a build in a day, or even a week, is its own problem. -- JimBullock 2003.06.30


If you're on a C++ project that's suffering from long builds, John Lakos (author of Large-Scale C++ Software Design) has a great reputation for helping teams get their build times down. Other than C++ projects, I haven't seen serious build issues in the past several years, but maybe I've just been lucky.

DaveSmith 2003.06.30


I haven't seen build time issues based on hardware limits in quite a while. I have seen test-bed pushes and similar take a long time, still. Things like pushing out databases, and test sets, and registering services or interfaces. I have seen "agile" or "agile wanna-be" projects stall when making and maintaining a suitable test bed for all those automated unit and acceptance tests turned out to be hard and time-consuming. JUnit can run Java unit tests. Well and fine. What about data to go against? In a known state? Recoverable, covering the edge cases?

I have also seen the agile testing approaches stall when the failures were more system-level than code-level. If a failure leaves the test system in a bollixed-up state - crashes the messaging middleware hard, for example - that takes the test suite down, but it may also require manual intervention to get the environment back up.

I have seen within the last couple years several examples of "big ball of code" problems, where a large bunch of developers was working with a big, intermingled pile of code. Managing source changes that stepped on each other became a real burden - daily half-day check-in meetings involving dozens of people. The code was way too arbitrarily coupled. Since "everybody knew" what affected what, none of that was written down. And as it grew, with a bunch of very smart people living with the code pretty much 24 x 7, eventually the folklore needed to make changes exceeded the recollection and recounting rate available even for people who drink too much coffee and talk really fast.

There's something missing from the "frequent build" idea. It's necessary but not sufficient, I think. -- JimBullock, 2003.06.30


See PervasiveTesting for some ideas relative to Jim's points.

--BobLee 2003.07.01



As many of you know, I rewrote the Mozilla.org tool "tinderbox", which is used to monitor the state of the development environment. In particular this tool generates web pages which tell if the code compiles and if all the tests pass. I have written about this elsewhere, and if you are interested, just search for tinderbox on this wiki.

As for Jim's comments about compilation times: I often see spaghetti C++ code where each module includes the world. This code takes forever to compile. Thus it is my belief that part of the compilation process should be a "test of the architecture"; that is, separate modules should be compiled separately (with no other code checked out) to ensure that they really do not rely on any information found elsewhere in the code. There are lots of useful tests which can be done at compile time to ensure that the code is "good" for some useful but often ignored criterion.
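
One cheap way to approximate that test (a sketch, with made-up file names): for each module, keep a tiny translation unit that includes nothing but that module's header and gets compiled on its own as part of the build. If it fails to compile, the header is leaning on declarations it doesn't pull in itself.

  • File check_foo_header.cpp
    /* Architecture check (hypothetical): this file deliberately includes
       only foo.h. Compiling it by itself verifies that foo.h is
       self-sufficient and doesn't depend on headers that other code
       happens to include first. One of these per module header can run
       as part of the nightly build. */
    #include "foo.h"

    /* A dummy definition so the translation unit isn't empty. */
    int check_foo_header = 0;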

I used to work at a company that generates complex proprietary object models; their model compiler generates C++. The generated code then must be compiled with their own code (what a mess! the Sun C++ compiler cannot get the dependencies right on this). So it takes them all night to recompile their product!

Yes, John Lakos' book is very interesting, and he is an interesting guy. I had a few interviews with him this fall. There are some ideas that are easy to implement but will really help speed compilation times (put guards around the include statements themselves, as well as around the included code).
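
A sketch of that trick as I understand it (file names invented): the same guard symbol appears both inside the header and around each #include of it, so when the header has already been seen the preprocessor can skip it without even opening the file.

  • File bar.h
    #ifndef INCLUDED_BAR_H
    #define INCLUDED_BAR_H
    /* ... declarations exported by bar ... */
    int functionExportedByBar();
    #endif
  • File client.c
    /* Redundant (external) guard around the include itself: if bar.h has
       already been processed, the compiler never has to locate and reopen
       it, which adds up when there are thousands of includes. */
    #ifndef INCLUDED_BAR_H
    #include "bar.h"
    #endif
    int functionThatUsesBar() { return functionExportedByBar(); }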

There's something missing from the "frequent build" idea. It's necessary but not sufficient, I think. -- JimBullock, 2003.06.30

The biggest problem that I have seen with "frequent builds" is the pernicious notion that the goal is building the code. To me the goal is to get lots of feedback quickly. This should include getting the QA group involved in some "smoke tests". That is, the QA group has a set of tests which are small enough to run every day but interesting enough that they should not be automated. At Netscape/Mozilla the day's development cannot begin until QA has approved the current code as being OK. So to me the notion of "daily builds" is really about frequent integration tests, not about being sure that we can compile our code today.

Ken


''[Build means...] create an executable that contains all the changes to date [...] all the code is gathered, compiled, an executable is created, and the smoke tests are run against the executable. Does that help? -- jr''

Johanna - this does help. My spidey-sense is still tingling though (the one that says "keep asking stupid questions").

There seem to be three components to a "build", then:

  • gather all code changes to date;
  • perform all the transformation steps which result in an executable;
  • run tests against resulting executable.

What are the ambiguities in this description? I felt like doing a MaryHadaLittleLamb exercise, stressing one word at a time (I'm leaving out the dictionary definitions):

  • ''gather'' all code changes to date;
    • gather to a specific place (such as the repository or a central server), or will any machine do?
    • gather changes as they are, or also reconcile changes?
  • gather ''all'' code changes to date;
    • if someone's changes are left out, then what we have isn't a build?
    • is that all the changes to the whole system, or are there valid subsets?
  • gather all ''code'' changes to date;
    • does this exclude documentation, test data, etc.?
    • what about stuff that isn't in files - environment variables, installed software, etc.?
  • gather all code ''changes'' to date;
    • just substitutions, or also additions, deletions, and things moved around? (that's what Bob was asking about, IIRC)
    • do we want to distinguish individual changes, or sum them over time and just gather "file deltas"?
  • gather all code changes ''to date'';
    • can there be "hypothetical" changes, changes made to date that are not part of the build? (I'm thinking of branches...)
    • (I'm coming up short on "to date" ambiguities, which doesn't mean there aren't any more)

Similarly there are ambiguities in the "transformation" and "testing" parts, which might be worth exploring.

I'm wondering if there is perhaps a simpler, less "technical" statement of what is to be achieved by frequent builds. Something like:

  • When a system is large and complex, it takes a long time to assess its quality as a whole.
  • When a system is large and complex, that can also be an issue in assessing the quality of individual parts.
  • When it takes a long time to assess a system's quality, quality issues will accumulate invisibly during the delays.
  • When it is hard to assess the quality of individual parts, quality issues will accumulate at the part level, in between whole-system assessments.
  • When a system's quality decays, it tends to grow large and complex by accretion of patches.
  • Therefore, increasing delays between whole-system assessments should be a warning signal to reduce size and/or complexity (e.g. by breaking down the system into parts).

LaurentBossavit 2003.07.01

Laurent, that's a fine principle. My concern is that having reduced the issue to a principle, it becomes a platitude. The devil is in the details. How do you do that? For any project? Are the XP methods sufficient in all cases? Which practice is "break it down into parts?" BTW, once you break down the system into parts, how do the parts get recombined, when, and by whom? How do we test that? Which one of the 12 practices is that? -- JimBullock, 2003.07.01 (Practices - useful but not sufficient for everything.)

Not a principle, but a hypothesized dynamic and leverage point. Spaghetti leads to longer lead times to a stable build, which leads to more spaghetti. Is the time between "builds" getting longer and longer? Check for spaghetti. Cut it up when you find it.

"Break down into parts" is design improvement. It seems to be language- and environment-dependent. There are plenty of good tips for C++, I hear, for instance in the Lakos. All the stuff about dependency levelization, plus what Ken summarizes above. Java is easier, though not by a lot. Code generation produces spaghetti fast. Also run-time reflection. Also premature metadata. Also irresponsible use of a RDBMS. Paying attention to "shearing layers" - which parts change at what rate - helps reduce spaghetti and make smart "parts" decisions.

"Recombining parts" into a whole system is the integration process at a higher level, where you treat "component changes" as "source changes" - I would expect you also have changes to some glue code that binds the components together.

LaurentBossavit 2003.07.02 (plus edits to my above to make clearer the intended format of a "dynamic", as opposed to a "principle")


Thus it is my belief that part of the compilation process should be a "test of the architecture" . . .

At one former J - O - B, we developed technology to route any kind of logged error into the defect database, assigned to specific modules and to whoever touched them last. We did this initially to capture certain production run-time problems and route them to the people who could fix them.

That wasn't the point, however. With this tech in place, we had the opportunity to route any kind of problem automation could detect - compiler or linker hiccups, any kind of static or dynamic analysis we wanted to perform. Essentially, we expanded the use of "defects" to account for any kind of defect, not just run-time misbehaviors you could observe through some kind of stimulus / response.

In my experience, collective ownership of code and mutual accountability for adhering to good practices doesn't always scale to several hundred people, multiple locations and a lot of pressure to ship features. So we set up to push some controls into the tool stack.

The biggest problem that I have seen with "frequent builds" is the pernicious notion that the goal is building the code.

Yes. And "build" has a floating, infinitely declining definition once "frequency" becomes the most important thing. It's that pesky 0th law again. -- JimBullock, 2003.07.01


Frequency isn't the goal; the goal is the capability to support traveling light. If builds are only integrated every N days/weeks, there's a lot more delta to examine for problem resolution while testing makes no progress.

System integration errors with a medium-size or larger team have a lot of show-stopper effect. Facing the integration daily means that the set of changes is smaller and fresher in people's minds. It's not the only way to work, but it is one of the simplest. Don't let integration debt accumulate.

--BobLee 2003.07.01


Joel Spolsky has a nice article on the benefits of daily builds (http://www.joelonsoftware.com/articles/fog0000000023.html). In the article, "build" is well defined:

"A daily build is an automatic, daily, complete build of the entire source tree.

Automatic - because you set up the code to be compiled at a fixed time every day, using cron jobs (on UNIX) or the scheduler service (on Windows).

Daily - or even more often. It's tempting to do continuous builds, but you probably can't, because of source control issues which I'll talk about in a minute.

Complete - chances are, your code has multiple versions. Multiple language versions, multiple operating systems, or a high-end/low-end version. The daily build needs to build all of them. And it needs to build every file from scratch, not relying on the compiler's possibly imperfect incremental rebuild capabilities."

The article goes on to enumerate the benefits and mechanics of making this happen.

-RonPihlgren 2003.07.01


Spolsky's spec for what counts as a build hits a sweet spot in the production of build feedback for some systems. I claim that the sweet spot is different for other systems - either more discipline and coverage, or less, depending.

I think as the systems get more complex and more critical, or even simply larger, the standard for what success looks like in a "daily build" has to get higher. More can go wrong. More subtle things can go wrong. The "daily build" has to extend beyond "the entire source tree" eventually, for example. And "success" has to extend beyond "compiles clean" to "compiles and unit tests clean" and eventually to "compiles, unit tests, acceptance tests, and a bunch of other stuff clean."

The trend I have seen, however, is for the standard of build success to actually decrease as accountability and the commitment to quality disperse through a large development organization, even if they're all working on the same thing. Doing lots of builds doesn't add feedback or control the volume of changes to address if the information the build generates is legislated out of existence.

I think frequent and complete builds are important. I happen to think that "every day" and "the whole source tree" is sometimes overkill, and way, way inadequate at other times.

-- JimBullock 2003.07.02 (Is it "agile" if it's the one true way?)


Laurent, gee, I wasn't ambiguous in my head :-) I believe I can resolve the ambiguities:
  • Gather all the code that was checked in by the developers, ready to be gathered. (If a developer isn't ready for that day's code to be included in the build, don't.)
  • Compile all the checked-in code. Use the checked-in compilers. Use the checked-in documentation.
  • Create the executable (or several executables)
  • Run the smoke tests

I'm sorry if I appear to be repeating myself. I can't see the ambiguities in this. Oh, maybe I have an unstated assumption: We only create the product from the configuration management system (CMS). If it's not checked in, whatever "it" is, it's not part of the releasable product. The CMS may also have compilers or other tools, so we know what tools we used when, but if the code/doc/whatever isn't part of the CMS, it's not part of the product.

My client has a big problem: they didn't refactor (or redesign for that matter) when the code base was small, and the builds started taking a long time. Now that the builds take over 24 hours (just the compile-and-create-executable), and the smoke tests take days to run, it's a daunting process to break that apart. Anyone have experience with the breaking apart? -- JohannaRothman 2003.07.03


JR: the builds take over 24 hours

I've dealt with similar performance degradation building with the CVS tools. Our system was retrieving from PVCS "just in time" for each dependency in the source tree. We found it much easier to erase the whole source hierarchy and retrieve everything throughout the tree, then let MAKE drive the builds. We didn't retain compiled object outputs - they got erased and rebuilt clean, or retrieved from CVS if they were 3rd party libraries. By segregating the "retrieve from CVS" step from dependency analysis, we saved about 3 hours of dependency checking.

The other trick for C++ (at least) is include guards, so that #include files already processed don't have to be searched for, retrieved, and skipped again -- this is the main trick in John Lakos' book mentioned above. If the compilers you are using support it, "#pragma once" within the include files satisfies this without clutter in the headers that include them.

Examine the contention and power of the build machine's setup. Building from unshared private disk files is faster than LAN space, especially high-contention LAN space. If possible, solve it with faster iron.

Good luck!

BobLee 2003.07.04 (ssssssss.........BOOM!)


Anyone have experience with the breaking apart?

Why yes I have. In the end it's like any other timeline reduction - a combination of making less work, and getting smarter about the work it does. Tactically, start by profiling the build runs - both the timeline and resource usage - ideally on a per-process basis. This isn't for exotica like kernel tuning, at least not right off.

The data comes in handy if you don't have someone like Bob around who knows where the particular tool (compiler, etc.) tends to waste time. From the profiling data, you can go: "Gee that step takes a long time, and gosh the disk / cpu / network is busy. Wonder what that's about." Then if you're lucky you get: "Oh, it walks the whole dependency tree every time it figures out the "next" file to get. That's silly. Let's fix that." Usually there's a mystery factoid about how it works which you can find buried in a manual somewhere once you know to look.

For many fixes, you'll end up having to do some surgery on the source - change include directives, move calls around. It's mostly mechanical stuff to let the compiler / linker / whatever do less work. It's useful to have a pile of space and someone like a Perl jockey around for this, to cobble together one-time tools that do a static manipulation of the code base, or the test suites, or whatever. Sort of the same idea as PrettyPrint, but you're moving stuff around to make the processing tools work better.

I also start with a clean repository and an inventory. The job is done when you've moved "everything" into the new repository, touching it up along the way. If they don't have an inventory of "everything" (and if it's this bad, they might not), start with the build control files - link lines, include files, "make" files or whatever. Actually, start by hand-building new ones of these, a piece at a time, until everything that is supposed to work, works. I've seen up to 20% or so source left-overs after an exercise like this - stuff just lying around for no reason anyone can tell. If it isn't referenced by anything, leave it out.

-- JimBullock, 2003.07.03 (How many times have I done this?)


I haven't read the Lakos, but I know some of the ideas that are in it, by hearsay as it were, and my recollection is that it has a lot more to offer than just "include guards in header files".

I have some experience with how bad things can get, from three or four largish C/C++ projects over the past ten-plus years. I have a little experience of helping things get better, and more theoretical ideas of what should be effective than I've had occasion to put in practice - so far.

In C/C++ the typical problem is that programmers are tempted by the apparent time savings of the "include the world" pattern. If you spot the following pattern in a C/C++ project, you're in "include the world" trouble:

  • File foo.c
    /* Include the headers we need */
    #include "foo.h"
    int functionThatNeedsBar() {...}
    int functionThatNeedsBaz() {...}
    int functionThatNeedsQuux() {...}
    int functionExportedByFoo() {...}
  • File foo.h
    #pragma once
    #include "bar.h"
    #include "baz.h"
    #include "quux.h"
    int functionExportedByFoo();

The problem is that every client of foo unwittingly becomes a client of bar, baz and quux, and thus dependent on them. Also, the next time you're writing code that relies on bar, baz and quux, you will remember that you have a convenient way to include all of them and not have to deal with compile errors: just include foo.h.

"Include the world" initially has a negligible effect on build time; but because of the temptation to include headers which you know already include what you want, over longer periods the pattern becomes locked in.

Eventually, it results in the situation where every source file depends on every header file. It's not just a matter of incremental compiles no longer being possible because as soon as you touch one .h you have to do a full build; it's also that each "compilation unit", i.e. each C/C++ file preprocessed with all #include directives resolved, has a size that is roughly a third to half of the project total LOC. That's a lot of work for the compiler, even if you've diligently put include guards everywhere.

The above source/header pair should be refactored to the following:

  • File foo.c
    /* Include the headers we need */
    #include "foo.h"
    #include "bar.h"
    #include "baz.h"
    #include "quux.h"
    int functionThatNeedsBar() {...}
    int functionThatNeedsBaz() {...}
    int functionThatNeedsQuux() {...}
    int functionExportedByFoo() {...}
  • File foo.h
    #pragma once
    int functionExportedByFoo();

Now clients who depend on "foo" don't also depend on something else. It's their choice whether they also want to depend on bar, baz or quux.

This looks like an easy transformation and something you could deal with using automated tools, as Jim suggests above. Not always, in my experience.

One problem I've seen is that once "include the world" has become locked in, you have "include chains" that are usually more than one #include deep - often much more than one. So when you break the chain at one point, you're breaking all the top-level code that relies on stuff several levels deep; it can take literally days to go through a C/C++ compiler's typically verbose output and fix it all.

In C++, the problem is often compounded by circularities in dependency chains, I suppose because object-oriented code is more prone to circular dependencies than procedural code. Also, C++ class definitions live in .h files; the typical pattern is to have one .cpp and one .h file for each class; and when one class inherits from another, it has to include the parent class' header. This reinforces the "include the world" pattern.
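
A quick illustration of why inheritance drags the include along (class names invented for the example): a forward declaration is enough to hold a pointer to a class, but not to derive from it, because the compiler needs the complete base class layout.

  • File widget.h
    #pragma once
    class Widget {
    public:
        void draw();
    };
  • File fancy_widget.h
    #pragma once
    #include "widget.h"        /* required: a base class must be a complete type */
    class FancyWidget : public Widget {
    public:
        void drawBorder();
    };
  • File widget_owner.h
    #pragma once
    class Widget;               /* a forward declaration is enough here... */
    class WidgetOwner {
    public:
        void adopt(Widget* w);
    private:
        Widget* current;        /* ...because we only hold a pointer */
    };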

So in my experience and conceited opinion, it's a matter of design as much as a matter of mechanical transformation and header hygiene. When inheritance isn't involved, there are tricks to "levelize" dependencies, which the Lakos describes, I'm told, in loving detail. Things such as forward declarations, the "pImpl" idiom, etc. You can also get a lot of mileage from preferring composition to inheritance whenever possible, from introducing "pure virtuals" (the C++ equivalent of Java interfaces), etc.
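
For instance, here is a rough sketch of the "pImpl" idiom with invented class names (my reading of the trick, not a quote from the Lakos): the public header exposes a single opaque pointer, so the heavyweight includes move into the .cpp file and clients of Engine no longer recompile when the implementation's dependencies change.

  • File engine.h
    #pragma once
    class EngineImpl;          /* opaque: defined only in engine.cpp */
    class Engine {
    public:
        Engine();
        ~Engine();
        int rpm() const;
    private:
        EngineImpl* pimpl;     /* "pImpl": all data members live behind this */
    };
  • File engine.cpp
    #include "engine.h"
    /* The heavyweight includes that used to sit in the header go here,
       where only this one translation unit pays for them. */
    class EngineImpl {
    public:
        int currentRpm;
    };
    Engine::Engine() : pimpl(new EngineImpl()) { pimpl->currentRpm = 0; }
    Engine::~Engine() { delete pimpl; }   /* a real version would also handle copying */
    int Engine::rpm() const { return pimpl->currentRpm; }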

Another key piece of advice follows from what Jim suggests: identify a small subset of the whole source tree that can be compiled in isolation, and call it a "library". Then make sure that client code which depends on this library only needs to see one smallish .h file, with implementation details hidden from view. Put the library into a repository of its own.

I suspect, though, that you need a fairly clean design to start with for this to be possible at a reasonable cost. In all the "big ball of mud" C++ projects I've seen, it would have been very hard to isolate even a small "library" in this way. Very painful, like extracting a tooth. Very difficult to convince team members or project stakeholders that it is worth investing time in untangling some of the spaghetti. ("No matter how it looks at first, it's always a people problem.")

For me, the Big Question when this kind of thing comes up in product companies is, at what point do the scales tip in favor of scrapping the whole thing and starting from scratch, vs. restructuring the old ball of mud.

LaurentBossavit 2003.07.04


Laurent's right. I've seen one effort to resolve a big ball of mud estimated at 3 years. That project had problems, like no intermediate deliverables. So doing anything like a velocity check was initially impossible.

Another trick here is to embed some of the new policies in the tool chain. That's one way to encourage some of the cultural changes necessary. And since all code that is used is also maintained, there's another leverage point in the maintenance. No, we're not requiring everybody to immediately fix everything they've ever touched. But we can maybe require that anything you modify conforms to some standards. So while you're in there doing the cool, new function, clean it up.

I'm a little chary of the "design" point of view for this kind of exercise. Laurent is right that it's more than a mechanical code transformation. The problem I have seen is small cliques of would-be uber-geeks who go off into theory space. There they are, all trying to be little Lakos-es. "What would be goodness in resolving these kinds of dependencies?" Once you say "design" I've seen the exercise become:

  • A philosophical society
  • A clique who "knows best" who must make any / all changes themselves, because they're the only ones who "get it."
  • A focus on the thinking part, not the doing part.
  • Distraction into "redesign" of the functionality of key chunks.
  • More distraction into new development of great, cool services that obviously are needed - extensions to what's done in the code already.

There's a difference between doing development and managing development. If I'm doing it, maybe my job is to get it perfect. If I'm managing it, my job is to help other people get it perfect enough. Code cleanup like we're talking about works better when it's approached as an asset management task, vs. showing off one's OO virtuosity. You do need at least one person around with deep knowledge of the language / compiler / linker and so on, or at least the willingness and ability to learn.

As for "chuck it and start over", that's always an option. If you have a candidate customer around, that becomes possible. Some systems are depended upon by other systems and people, who don't know they use the thing. In that case, you run the risk of breaking something important without knowing you're doing, so.

-- JimBullock, 2003.06.10


Updated: Friday, July 4, 2003