I broke the iPhone CI build and I’m so proud!


I patched the RunIPhoneUnitTest.sh file in the Google Toolbox for Mac project. Then I noticed someone else fixed the same problem 84 minutes before I did. If you use the iPhone unit test support from Google and you’re looking to crash your Xcode (or, like me, Hudson) build, download my revised edition. Don’t go for the earlier submission, even though it’s actually a better patch. Use mine because it’s more John Blaze than the other one and it only touches one file instead of the three or four thousand that the other patch addresses. Use mine, then deposit $2,000 to my account, because without my patch you can’t break your build. And if you can’t break your CI server [you do use CI on all of your projects, right?], it’s costing your company roughly $549,5823 in productivity every time they have to research one of those “It works on my machine” bugs.

I’ll just patiently wait here for my PayPal balance to increase…
…I’m still waiting. You did deposit the full $3,500, didn’t you? …You hit submit, didn’t you?

…waiting
…waiting
…waiting…

Look, if you’re gonna play games then I’ll just haff to get to the point! The problem comes from the unconditional “exit 0” at the bottom of this file: no matter how many tests fail, the script reports success and the build stays green. I consulted a guru and my supervisor for the proper bash assistance because it’s been like 400 years since I’ve immersed myself in the shell. Collaboratively we came up with a complex solution involving pipes, redirects, awk, and one of those digital converter boxes you’ll need in February. Because the converter box felt like overkill and because I wanted to pretend I knew awk, we took out the converter and left in the awk stuff. See below for a replacement RunIPhoneUnitTest.sh that crashes the build. (Unit tests should always break the iPhone build, right?)

#!/bin/sh
#  RunIPhoneUnitTest.sh
#  Copyright 2008 Google Inc.
#  
#  Licensed under the Apache License, Version 2.0 (the "License"); you may not
#  use this file except in compliance with the License.  You may obtain a copy
#  of the License at
# 
#  http://www.apache.org/licenses/LICENSE-2.0
# 
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
#  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the
#  License for the specific language governing permissions and limitations under
#  the License.
#
#  Runs all unittests through the iPhone simulator

export DYLD_ROOT_PATH="$SDKROOT"
export DYLD_FRAMEWORK_PATH="$CONFIGURATION_BUILD_DIR"
export IPHONE_SIMULATOR_ROOT="$SDKROOT"
export CFFIXED_USER_HOME="$USER_LIBRARY_DIR/Application Support/iPhone Simulator/User"

echo "Props"
echo "export DYLD_ROOT_PATH=$DYLD_ROOT_PATH"
echo "export DYLD_FRAMEWORK_PATH=$DYLD_FRAMEWORK_PATH"
echo "export IPHONE_SIMULATOR_ROOT=$IPHONE_SIMULATOR_ROOT"
echo "export CFFIXED_USER_HOME=$CFFIXED_USER_HOME"

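echo "Starting the build"
# Run the tests, echoing the output to the console and keeping a copy to parse.
"$TARGET_BUILD_DIR/$EXECUTABLE_PATH" -RegisterForSystemEvents 2>&1 | tee .theresults
# The fifth field of the last output line is taken to be the failure count.
PASSFAIL=`tail -n 1 .theresults | awk '{if($5 != 0) print "fail"; else print "pass"}'`
rm .theresults
echo "Build passed? $PASSFAIL"
if [ "$PASSFAIL" = pass ]; then
	echo "Exit w/ success"
	exit 0
else
	echo "Exit w/ failure"
	# Exit statuses must fall in 0-255; "exit -1" is not portable under /bin/sh.
	exit 1
fi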
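
To see what the awk step is doing, here is a minimal sketch of the pass/fail decision pulled out into a function. The name classify_summary is made up for illustration, and the sample lines assume the test runner ends its output with a summary of the form "Executed N tests, with M failures ...", so that the fifth field is the failure count. Unlike the one-liner in the script, this version also treats a short or non-numeric last line (say, from a crash) as a failure, which is my own choice and not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper: print "pass" or "fail" for one summary line.
# Assumes the fifth whitespace-separated field is the failure count.
classify_summary() {
    printf '%s\n' "$1" | awk '
        # No usable failure count (short or non-numeric line): call it a failure.
        NF < 5 || $5 !~ /^[0-9]+$/ { print "fail"; next }
        # Any non-zero failure count fails the build.
        $5 != 0                    { print "fail"; next }
                                   { print "pass" }'
}

# Sample lines (assumed format, not captured from a real run).
classify_summary "Executed 5 tests, with 0 failures (0 unexpected) in 0.004 (0.004) seconds"
classify_summary "Executed 5 tests, with 2 failures (0 unexpected) in 0.004 (0.004) seconds"
classify_summary "Segmentation fault"
```

The three calls print pass, fail, and fail in that order. In the real script the string would come from the last line of the captured test output rather than a literal.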

9 thoughts on “I broke the iPhone CI build and I’m so proud!”

  1. Hey Cliff, incidentally:

    > it’s costing your company roughly $549,5823 in productivity every time they have to research one of those, “It works on my machine” bugs.

    I’m on a team where having a CI server isn’t making a difference(!) Our project purely depends on the files in source control, no external dependencies at all. That and everyone is very disciplined about running their test suites before a commit means that the CI server isn’t doing much for us. I’d never have guessed!

  2. Hey Merylyn,

    Incidentally you put your comma after the wrong significant figure. The cost is $5,495,823 not $549,5823! Regarding your CI server, I find it hard to believe that it does absolutely nothing for you. What tool are you using to build? Ant? Maven? Is Ant checked into VCS? If so then you don’t depend on merely source control. If not then there’s a potential for “it works on my machine!” Is there at least one dev who is crafty enough to use the IDE’s built-in build tools or some build plugin? People often forget that the build system itself is a dependency of the project. Unless you lock everyone down to the exact same version of the IDE or force everyone to run builds using a build system in VCS or on a net share, you run the same risk, regardless of whether or not people run unit tests prior to check-in. The equation is:

    Source + Dependency Binaries + Build Binaries = Desired Output

    If any addend on the left side is altered or different in any way then the right-hand side will differ. That’s where a CI server packs its punch, ensuring that the behavior observed prior to commit wasn’t skewed by the dev environment. Then there’s the other half:

    Desired Output + Runtime = Desired Behavior

    The same principle applies. Unfortunately a CI server can’t help you here. That’s pretty much what QA is for. That, coupled with varying runtimes across dev platforms, exposes the “it works on my machine” bug early in the dev cycle. Adding small, frequent VCS commits with continual dev head syncs propagates the changes across the dev environment, exposing any fragility to varying platforms early in the dev process. Indeed, you need not only a CI server but a solid CI process to fight the “on my machine” bug.

  3. Hey Cliff,

    No, you’re right. We’re assuming a specific OS, a minimum JDK, and a Gant installation. But those things don’t vary often.

    We run the same build target from the command-line before we commit as the build server does. Our dependencies are all managed by our build script.

    Our biggest exposure is that we assume that every file we integrated against will be committed to version control. That hasn’t been a problem so far, though.

    That said, I have been on other projects where a CI server made a huge difference. But in my case that always said something about: our build, our non-VCS dependencies, or our discipline as developers.

  4. My point is this. Dev A is looking at Dev Zone after checking the morning mail influx. “Oooh! Cool new image processing gizmo for Eclipse!” he blares out excitedly. The live site is copy/pasted into the update thing-a-mah-jig and all of its dependencies are pulled down. Something in the dependency chain ends up in the build-test-compile chain and starts to skew his unit test results. Dev A was really struggling with some other tough issue on the project, but that was yesterday. The surge of caffeine from his morning cup of joe along with the excitement of the new thingy he installed has convinced him that he can conquer the world! He later attacks the tough issue with newfound confidence without validating his earlier red bar. Instead he makes a source change he dreamed up over the weekend, and because his build environment has changed he gets a green bar. Now unless he immediately commits the change he’s going to think he cured Parkinson’s. My experience shows me that 95% of devs hate to commit often and would rather wait until almost the end of the sprint to commit and sync up. Let’s say he’s one of that kind. Now you have 7-8 other changes hiding the critical change he thought fixed his problem. They all go in one uber change list with the svn log: “Added new widget progress cancel button”. True TDD with a CI server should catch the error, but let’s just say Dev C disables the test that exposes the bug because of some unrelated issue. The bug is now embedded with no history of when it was introduced. The team goes all the way to release when catastrophe hits. Nobody can get the app to run correctly, and Mr. “Golly Gee Whiz Image Plugin” has long since disabled/removed the plugin and is no longer able to get the desired behavior. Somebody makes a source change that masks the issue long enough to go into production, and when it pops up again everybody is puzzled because the original context of the bug is lost.

    When I speak of build environment I refer to all the tools engaged when you produce an executable. If you invoke the process from Eclipse but Sam down the hall prefers the cmd line you have a LOT of variables. I amplify the problem because I personally believe it’s rare that you can get a team of more than two devs to agree on not only an IDE but an OS, a runtime and a container like JBoss/Tomcat. Somebody on your team has something slightly different and something as seemingly negligible as a point release on Tomcat can cause a world of difference in behavior. I’ve seen it happen within weeks when one guy had Tomcat 5.0.x while the deployment server was 5.2 and another dev accidentally decided to try Mustang in Eclipse. Unless you run ALL of your dev builds/unit testing from the cmd line and pull the binaries from SVN or a net share you’re way more vulnerable than you think. These cmd line builds need to be done before every commit and commits always need to be done within seconds of a green bar.

  5. Hey Cliff,

    What you’re describing sounds terrible! And I think you’d be exercising your CI server safety net on a regular basis in that case.

    It doesn’t have to be so bad though; We can work on one thing at a time, integrate once a day, and all integrate using the same build tool regardless of how we prefer to run our tests during the day.

  6. What I’m describing is the harsh reality that most people overlook. And you absolutely should exercise your CI server safety net after each commit. That’s the “C” in CI. The same “build tool” does not guarantee the same build results. I can use Ant, e.g., against the same source you used Ant against and come up with different results. Let’s take it a step further. I can use Ant 1.7 against the same source that you used Ant 1.7 against and come up with different results. The devil is in the antlib. Unless we both work exclusively on the same project, there’s a strong possibility that I have a need for, oh let’s say, ant-img-gen.jar. (BTW, I just made that up.) The img-gen complains that it needs a different version of clogger (commons logger), so I fix that complaint but muddy my environment in the process. I don’t know that because I can still build our project correctly… and the unit tests all green bar.

    Let’s bring my FAVORITE example into the mix, Eclipse! You have no idea how many times I’ve dug somebody out of the Eclipse Ant build ditch! Two devs are both using Eclipse but they invoke Ant from within Eclipse differently. One double-clicks the target in the Ant build window while the other, well, let’s just say they aren’t both using the same IDE settings. “Run As…” then “I wanna fork my Ant b/c it keeps crashing Eclipse” is too common. Now one guy is using the VM Eclipse runs in while the other uses another VM located by Eclipse [during install] on his hard drive. I’m not going to talk about how some people/companies try to use JVM extensions… just having a different VM is bad enough.

    Bottom line, you really need to be aware of all the variables and dependencies going into each build and you should ONLY promote and officially QA from a build server to guarantee reproducible results. This is because the build chain is one big fat project dependency.

    Which libraries were in effect when your buggy build was created? Was there anything extra/missing from Antlib? Did somebody sneak an extension into the VM? Does the VM have the right version of Xerces?

  7. You’re right. There can be tons of variables. Some of those you can solve technically and some of them need to be solved by agreeing on procedure. The less time everyone has to solve the technical problems and the less will everyone has to solve the process problems, the more you’re going to need a CI server*.

    * if we’re talking specifically about the “it works for me” scenario – there are other benefits (autogenerated reports, build time history, etc)

  8. Oh yeah, I’m getting to love Hudson. It’s hands down easier than CC. But more important, I love the process of CI. It took me this long to become a convert but now I see the instant benefit. It’s sort of a unit test for the environment.
