
Doug Rathbone is a software architect working in Ad land. He is passionate about software design and automation, and regularly contributes to a number of industry sites on these topics.

CI – Third Party Tools Live in Your Source Control

04.05.2012

I have recently had a couple of interesting discussions with different people, on Twitter and in “the real world”, about the use of third party build dependencies such as unit testing frameworks, database versioning tools and other command line executables in your build. The topic of these discussions has been where these dependencies should be located: inside your project, or installed on your build server.

Wait… What are you talking about again?

When I refer to “third party build dependencies” I am talking about tools that you use as part of your build. If you have unit test projects inside your solution, you want to run those unit tests using a tool such as NUnit or MSTest. If you have auditing/validation tools, such as the web.config validator WCSA that Troy has posted about, you want to run these as part of your build too.

The question is where should these tools reside?

Should you install NUnit version x.xx.xx.x on your build server and call it using a static file path such as C:\program files\xxx\xxx.exe?

Or should you instead place the specific version of nUnit you’ve been using inside your solution and call it using your build’s output path?
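As a rough sketch of the difference (the paths, version numbers and target names below are hypothetical, and your own build script will differ), the two approaches might look like this in an MSBuild target:

    <!-- Option 1: call a test runner installed machine-wide on the build server (hypothetical install path) -->
    <Target Name="UnitTestInstalled">
      <Exec Command="&quot;C:\Program Files\NUnit 2.5.10\bin\nunit-console.exe&quot; MyApp.Tests\bin\Release\MyApp.Tests.dll" />
    </Target>

    <!-- Option 2: call the test runner checked in next to the solution (repo-relative path) -->
    <Target Name="UnitTestFromRepo">
      <Exec Command="&quot;$(MSBuildThisFileDirectory)tools\nunit\nunit-console.exe&quot; MyApp.Tests\bin\Release\MyApp.Tests.dll" />
    </Target>

The first couples every project on that server to whatever happens to be installed at that path; the second uses whatever runner version is sitting in the repository’s tools folder.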

Getting the correct answer to these questions for your project is what this post is about.

Opinion is, as Opinion does

This topic, like a number of other eXtreme development practice discussions, is based mostly on opinion as to what the best approach is – and it is further complicated by the fact that each project has different needs, and with those needs come different tooling requirements.

My opinion on this subject is that you should always place your build dependencies inside your project when possible. Below I'll explain why.

I’ll caveat how I come to the conclusions I do by adding that I work in a digital agency type environment, and we create a lot of different projects, not just one. This can sometimes lead me to different conclusions than if I were working on a single internal project for my entire career at one company. I’ll leave your decision up to you.

It’s about not getting stuck in tooling dependency-version hell

When you install a tool on your build server, you are often locking your project to the version installed on that server forever. This is a tight coupling that will always need to be kept in line. Quite often with a third party tool there are constant changes being made to add functionality, make it easier to use, and so on.

This means that if you write code that works against version 2.2, then when version 2.3 comes out with breaking changes you have to make a decision. Do you upgrade all your projects running on your build server to work with version 2.3, or do you simply never upgrade? If you don’t upgrade, when you start a new project do you still develop against this older version of tooling because it is the version installed on your build server?

By adding your tools to your project’s source control repository you overcome this, as you can have different projects running different versions of your tool of choice. A legacy project can be happily using version 2.2, while a newer project can be running version 2.3 – this decouples your build server from a specific version of your tooling, making your whole CI process more flexible.
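As a purely illustrative layout (folder and project names here are made up), each repository simply carries the tool version its build expects:

    LegacyProject\
        src\ ...
        tools\sometool-2.2\sometool.exe
    NewProject\
        src\ ...
        tools\sometool-2.3\sometool.exe

Upgrading a project to 2.3 is then just a commit in that repository, and the legacy project’s builds never notice.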

I recently heard Martin Fowler, one of the thought leaders on continuous integration, summarise this very well:

“All a project needs to build and run should be available and versioned in its source control repository.

A new developer on a project should be able to download everything needed for a project in one command, and run it with another. This includes third party tools.”
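In concrete terms, a new developer’s first session on a project should need nothing more than something like the following (the repository URL and build script name are hypothetical):

    git clone https://example.com/acme/widgets.git
    cd widgets
    build.cmd

Everything the last command needs (test runners, validators, build scripts) came down with the first.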


Martin took this even further by saying that in some scenarios where you are using open source IDEs and editors, these too should be kept in a project’s source control to allow a new developer to start work on a project more readily. As I work in the Microsoft .NET development space and use Visual Studio as my IDE, this is not easily possible, and even if it were you probably wouldn’t want to – ask anyone who runs Visual Studio to add it to your source control and they will give you a strange look.

I’m not so interested in this all-out approach, as I am only concentrating on the continuous integration/continuous delivery/build server side of this statement.

It’s about not being locked down to a single build server

When you have to run your build dependency of choice from an “installed” instance on your build server, you are making a number of assumptions.

  • You have admin access to your build server, and can make upgrades to your software when required without having to hassle your sysadmin.
  • If your build server dies, or you want to migrate it to a new server (maybe you want to move your build server offsite to a service such as Amazon EC2), you’ll have to set up your tool again and configure it exactly the way you had it before (in the first scenario, I hope you backed up those configuration files).
  • If you want to scale your build onto multiple servers, you’ll have to do the same for each of them – which means that if you are running a number of build servers, you have to make sure all of their configurations stay identical.

When you first set up your continuous integration configuration (choose which build server software you’re going to run, set it up, get all excited by the magic you’ve stumbled upon), you often don’t think you’ll ever need a second build server. You often don’t think you might want to run tasks that take a long time, like web performance tests or automated user testing with Selenium – tasks that are usually run in parallel to your other build or deployment tasks.

By placing your tooling inside your source control, you are not limited by how many build servers you run or whether they are internal or external. And when you need to scale out your build configuration, or rebuild one of your build servers, you don’t need to configure a thing – just point your build server at your source control, and the rest will run without any setup.


It’s about being able to debug the build locally on any developer’s machine

Taking the above one step further, build configurations shouldn’t be run just on build servers. It should be just as easy for any developer to download the project and run all the build tasks locally on their development machine. They should be able to use the same version of your unit test runner that your build server is using. The same database versioning tool that your build server is using. The same widget packager that your build server is using – and the only way to guarantee this is if the tooling and its associated configuration are stored centrally in your source control.

Anyone who’s worked with build automation and continuous integration will tell you that you need to be able to debug your build locally to avoid any bang-head-on-wall debugging sessions.

How can you troubleshoot a broken build process if you can’t repeatedly run it locally in exactly the same configuration as on your build server?

By placing all of your tooling inside your source control this becomes easy – if you are having an issue with a build configuration, download the project and try and run it locally.
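Because the tooling and its configuration travel with the checkout, reproducing a build server failure is usually as simple as running the same target locally, for example (the project file name is hypothetical, and the target is the one from the earlier sketch):

    msbuild Build.proj /t:UnitTestFromRepo /v:detailed

If it fails locally in the same way, you can debug it at your desk; if it only fails on the server, you know the difference is in the environment rather than the tooling.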

The “Licensing cost” argument, or “You’re using the wrong tool”

When I state the above, a number of people will come back and mention that their tool of choice is quite expensive, and purchasing a license for each project is simply not possible. I believe this argument, in most cases, to be quite flawed.

When setting up a project, continuous integration should be a part of whatever initial project architecture and setup you go with. As I said when discussing a recent continuous integration event I attended, continuous integration is the gift that keeps on giving to your daily development processes – so much so that it should be a requirement from the beginning of your project. This means you weigh your choice of project tooling heavily towards its ability to be encapsulated inside your project – when looking for tooling, this should be a requirement.

If you are using a tool that needs you to purchase a new license for each project instance in use, try and find a new tool.

If you are using a tool that needs you to purchase a new license for each build server instance in use, try and find a new tool.

If you are using a tool that can’t be run without being “installed” along with a million registry keys, instead of from a simple directory of files, try and find a new tool.

If you are using a tool that can’t take configuration options using command line arguments or configuration files for flexible configuration, try and find a new tool.
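In other words, favour tools that can be driven entirely by command line arguments or a checked-in configuration file. A minimal sketch of what that looks like from a build script (the tool name and its switches are entirely made up):

    <Target Name="ValidateWebConfig">
      <!-- Hypothetical repo-local tool: everything it needs is passed in explicitly -->
      <Exec Command="&quot;$(MSBuildThisFileDirectory)tools\configcheck\configcheck.exe&quot; --config src\Web.config --rules build\validation-rules.xml" />
    </Target>

No installer, no registry keys, no per-machine settings: delete the checkout and nothing is left behind.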

Third party tool developers are usually not evil monolithic software houses. They are building software for you and me to use, based on our needs. If the developer of your tool of choice doesn’t allow for some of the above, let them know that you wish their tool had this functionality (hello, Redgate???). If they don’t agree with you, your course should be clear: find another product. It’s that simple.

In most instances, the other product is open source and FREE, making your decision even easier.

The only time this is not possible is when a tool you use in your build is one of a kind and there is no alternative. Sometimes this means you haven’t looked hard enough; other times it means you’ll have to compromise. But if this is the case, still let the developer of your tool know – you may find that the next version or licensing agreement of your tool of choice has some new functionality or easier licensing that’s right up your alley!

With a grain of salt

Above I make some pretty definite calls regarding my view on the subject at hand – you might not agree with all of them.

I’d like to think that in every case where you don’t agree with me, it is because you have a good reason not to, based on the current project you are working on, and not on your overall view of “best practices”. Every project is different, and the decision to use each tool we do is based on weighing up the pay-off the tool delivers to our project.

You may use a tool that is so uber-awesome that using an alternative is simply not worth your time.

You may only ever need one build server – maybe you are the only developer on the project and this build server is your local machine, so you don’t care about having to buy a license for every server your tool gets used on.

At the end of the day, how you come to your decisions is none of my business. But if you decide to use tooling that cannot be kept in your source control, I would hope you do so knowing what limitations this places on your project – and hopefully the benefits of your decision outweigh the possible negatives I’ve mentioned in this post.

Summary

So basically, what I am trying to say here is: always place your build dependencies inside your solution, unless licensing or some other immovable obstacle makes that impossible. Your choice of tooling should be strongly guided towards tools that can be run from your source control repository. If your tool of choice doesn’t support being used this way, you can usually find an alternative, so don’t get locked down – put your build dependencies in your source control.

Published at DZone with permission of Douglas Rathbone, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Tech Fun replied on Thu, 2012/04/05 - 8:08am

The bottom line is, if I have a fresh environment and do a checkout, can I rebuild the system successfully (including compile/test/package)? If so, then it's OK; otherwise I would consider it a setup problem. This obviously requires all dependencies to be pulled from source control.

Mark Unknown replied on Thu, 2012/04/05 - 9:00am

" This obviously requires all dependcies pulled from source control." No it does not. 

Mark Unknown replied on Thu, 2012/04/05 - 9:15am

This post makes me think you are using the wrong tool for the job. I have not used NuGet or NPanday so I cannot say if this is a build tool issue or a .NET ecosystem issue (I bet the former, but probably some of the latter).

As for putting all third party tools, etc. and IDEs in source control – if Martin said that, then he was a little wonky that day. There are much better ways to solve this. Only in the .NET world are projects dependent on versions of the IDE. A "project" should have nothing to do with an IDE setup. All the dependencies should be defined in the project. They do not actually need to exist there, though.

Having things in different places does not mean you cannot do it "in one command".  

"If you are using a tool that doesn’t allow being run without being “installed” along with a million registry keys instead of a directory of files, try and find a new tool."

So I guess you will be getting rid of VS.NET? ;)  

Drew Sudell replied on Thu, 2012/04/05 - 1:19pm

While I agree that a build should be reproducible, that you need to subject all the build requirements to configuration management, and that ease of setting up new build servers and developer environments is a worthy goal, I have to object to putting the build prerequisites into source control for two broad reasons.

 First, SCM isn't the only form of repository, nor is it the best to use for non-source artifacts.

The most obvious prerequisite is third party libraries, followed closely by 3rd party tools. The thing here is that there is no inherent relationship between the version from the SCM system's perspective and the actual version of the library / tool. If you put them in generic locations like lib/foo.lib or bin/bar.exe you have no idea what is in those files. Moreover, when developer X copies version N of foo.lib over and checks it in with a commit comment of "moved to version M of foo.lib" where N != M, you've got a useless mess. By contrast, if you version them in some non-SCM way such as bin/mytool/1.2.3/bar.exe or lib/foo-1.2.3.lib, then you still need to manage the version dependencies in your build (which you should do anyway) and you've reduced the SCM system to a really awkward WORM drive.

Better is to treat such things as the versionable, explicit entities they are and to store them in an append-only, version-addressable repository. That can be as simple as a (mostly) read-only shared drive with tools and libraries installed in some "reasonable" versioned path (eg /tools/sometool/ver1.3.6/...) or as complex as running your own Ivy/Maven repository server in, say, a Java shop where the build explicitly pulls version N of artifact A. But using the SCM system to do that adds zero value and causes needless duplication and the consequent management overhead. It's also giving you a false sense of having solved the problem.

Second, the set of things that could possibly go into the SCM system is but a subset of the build pre-requisites that need to be managed.

Sure, for many languages / runtimes, it's most of the prerequisites. But consider system libraries, OS header files, etc. There you really need OS distros and all patches on physical media, preferably in duplicate, preferably with one copy off site. And you need a machine (physical, virtual, whatever) that they can be installed on. And you need to ensure that you have available machines to run that. If your software is relatively system independent and you use commodity hardware, that's probably easy. If you're supporting some old PDP-8 for some customer who'd rather pay than upgrade, you'd better have a room full of spare parts. And it's not just obsolete platforms that raise this issue. There are cases where the build tool chain depends on the version of, say, the processor. (Yes, HP-UX C compiler, I'm talking about you.)

You're completely correct in recognizing that the build must be easily replicable for ever and ever (or at least until you're more willing to refund a support contract than address an issue). But just dropping everything into your SCM system misses much of the problem in many cases, and in all cases creates an apparent reduction of the problem to its most simplistic subset.

The main value of an SCM system is not that you can recreate specific versions of your source. There are a lot of easier ways to solve that. Use an append-only filesystem. No, the main value is its ability to say useful things about the nature of the changes. That is to say, do diffs, change logs, and such. That simply does not apply to 3rd party artifacts, especially binary ones.

Bruno Barin replied on Fri, 2012/04/06 - 4:19am

Source control, as the name suggests, is to control the version of your source code, not the version of your binary dependencies. That's why Maven and its repository were designed, or am I missing something?

Lajos Papp replied on Fri, 2012/04/06 - 4:51am

Hi Douglas,

Mostly I agree with you, but what do you think about the most common tool, the build tool itself?

  • would you recommend putting even Ant/Maven in your repo?
  • in the case of Maven: would you create a project-specific Maven repo?
  • how about the version control tools? would you put svn/git tools in the repo?

Mark Unknown replied on Fri, 2012/04/06 - 8:49am

I just looked at the original post – it was written a year ago. I know NuGet has matured greatly. Hopefully the blogger has too.
