
sdk.org.nz

I’m moving my blog over to sdk.org.nz, where I’ll be joining fellow posters Justin, John and Eben in the “Software Developers (K)ollaborative”. Just need to work on the theme so my past posts less resemble pants.

It’s not done yet

Every software developer needs to experience joining projects that are at different stages of their development lifecycle. At the start, faced with an empty repository where you get to make important decisions every day, write a lot of code and functionality, and understand how everything fits together. In the middle, worrying about all those little details, dealing with testers raising defects, running into gotchas that obliterate fundamental assumptions, taking an application into production. And at the ‘end’ when a project is already running in live, fixing bugs and adding new functionality to a codebase you don’t really understand, where writing one line of code that doesn’t cause breakage is an achievement.

At my current job I’ve been fortunate enough to work on three projects covering each of these cases and I’ve learnt a few things along the way.

The first is that writing lots of code and making decisions is fun! But when you only work on that part it’s easy to look down on those who work on the project later. Ha! I wrote 90% of that in 4 weeks, now you’re spending months just adding trivial enhancements around the edges without really understanding what’s going on.

What you miss though, if you never have to take applications into live and beyond, is that there are a whole lot of things you could do that would make other people’s lives a whole lot easier, if only you knew about them. You’ve got to see the issues that operations have to deal with in order to get the “yeah, there’s no way you could know that” empathy.

Logging

When you’re doing development, you don’t need to have very good logging. There’s a problem? Well, I’ll just read the code or step through a debugger. Once the application goes to testing it gets harder – um, I can’t reproduce that, can you give me more information? Production is worse still, with weird intermittent issues that operations need to detect and deal with even before talking to developers.

The first rule of logging is that when the application is running normally, the logs should be “clean”. Right from the beginning of development, we should keep an eye on what’s being logged. Do we get errors or warnings under normal operation? Fix them. You don’t want to be in a situation where you’re looking at a log full of errors and trying to figure out which ones are “normal” and which indicate a problem.

What do the logs contain when run at the level they’re going to be at in production? We’re not running at DEBUG level – is there still enough information to see what’s going on? Are there lots of messages about trivial things and very little about other important parts of the system? Are there multiple lines of code that log the exact same message for different reasons?

Logging is one of those things that you need to care about right from the start. If you don’t keep an eye on it you’re never going to get it right. Functional (end-to-end) tests are useful here – check the output when they’re run.
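
One low-tech way to keep yourself honest here, sketched below, is to register a log handler during a functional test run and fail the test if anything at WARNING level or above was emitted during a normal scenario. This assumes the application logs through java.util.logging; the class and method names are made up for illustration.

import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

// Collects WARNING-and-above records emitted while a functional test runs, so the
// test can fail if a "normal" scenario dirties the logs.
public class CleanLogAssertion extends Handler {

    private final List<LogRecord> problems = new ArrayList<LogRecord>();

    @Override
    public void publish(LogRecord record) {
        if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
            problems.add(record);
        }
    }

    @Override public void flush() {}
    @Override public void close() {}

    public void assertCleanLogs() {
        if (!problems.isEmpty()) {
            throw new AssertionError(problems.size()
                    + " warnings/errors were logged during a normal scenario, e.g. "
                    + problems.get(0).getMessage());
        }
    }

    // Usage in a functional test:
    //   CleanLogAssertion logs = new CleanLogAssertion();
    //   java.util.logging.Logger.getLogger("").addHandler(logs);
    //   runHappyPathScenario();   // whatever the test normally does
    //   logs.assertCleanLogs();
}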

Environment issues

I get really sick of dealing with P1 “defects” due to incorrectly set up test environments. The application fails with an error? What error’s that then. “table FOO doesn’t exist”. Or the reference data isn’t set up correctly – “just don’t do that” and everything will be ok.

What I do these days is keep a list of environment issues in a table on the wiki. Every time someone screws up an environment because they forgot to apply a database patch, I add a row to the table with a query and expected output that you can use to verify that the error hasn’t occurred again. You can bet that if a mistake is made setting up one environment, it’s pretty likely to happen again in others.

For reference data the application really needs to apply validation rules to the configuration at load time. It’s not good enough to say that “we don’t support configuring those things together” if the application could just as easily detect the situation and give an error.

One of my favourite head-smacking issues with testing is when multiple deployments of the application end up pointing at the same database. This is especially fun when they’re different versions, and you spend hours trying to figure out how on earth this behaviour is happening in version 2.5.2 when we fixed this back in 2.5.1. This happens during development as well when you “temporarily” tell a colleague to use your development database. One way to avoid this problem is to have the application post regular updates to a ‘heartbeat’ table including a unique runtime identifier (e.g. random number generated at startup), last update timestamp and version information. Then you can run a query to find those identifiers that have updated in the last 5 minutes and see who’s really touching those tables.
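
Here’s a minimal sketch of that heartbeat idea. The table and column names (APP_HEARTBEAT, INSTANCE_ID and so on) are invented for the example; the real thing just needs an upsert keyed on the per-runtime identifier, run from a scheduled task.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;
import javax.sql.DataSource;

// Hypothetical heartbeat writer: each running instance upserts a row keyed by a
// random identifier generated at startup, along with its version. A query for rows
// updated in the last few minutes then shows who is really touching the schema.
public class HeartbeatWriter {

    private final DataSource dataSource;
    private final String instanceId = UUID.randomUUID().toString(); // unique per runtime
    private final String version;

    public HeartbeatWriter(DataSource dataSource, String version) {
        this.dataSource = dataSource;
        this.version = version;
    }

    // Call this from a scheduled task, e.g. once a minute.
    public void beat() throws SQLException {
        Connection connection = dataSource.getConnection();
        try {
            PreparedStatement update = connection.prepareStatement(
                    "UPDATE APP_HEARTBEAT SET LAST_UPDATE = CURRENT_TIMESTAMP, VERSION = ? WHERE INSTANCE_ID = ?");
            update.setString(1, version);
            update.setString(2, instanceId);
            if (update.executeUpdate() == 0) {
                // First beat for this runtime: insert the row instead.
                PreparedStatement insert = connection.prepareStatement(
                        "INSERT INTO APP_HEARTBEAT (INSTANCE_ID, VERSION, LAST_UPDATE) VALUES (?, ?, CURRENT_TIMESTAMP)");
                insert.setString(1, instanceId);
                insert.setString(2, version);
                insert.executeUpdate();
            }
        } finally {
            connection.close();
        }
    }
}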

Requirements changes

Requirements churn is a good thing. As we find out more about an application and how it’s used we will naturally want to change earlier decisions. The problem is that it becomes difficult for anyone to know how the system is supposed to behave – or actually behaves. The requirements have changed, but did we ever actually have a chance to implement those changes? How did we “interpret” those parts where there was hand-waving? Where did we intentionally deviate without telling anyone? ;-) What was that conversation where I spoke to a business analyst and they “clarified” some details of the requirements but never wrote it down?

On one of my projects back in NZ I was a bit laissez-faire about coming up with system behaviour that “made sense” without telling anyone. It wasn’t until we hit testing, and I saw that testers had no better source of information than the original business requirements, that I realised that although the software functioned well, there were a whole bunch of people who needed to understand how the application worked without reading the code.

What I do these days is to write up wiki pages whenever I’m implementing a major piece of functionality, describing precisely the behaviour I’ve actually implemented. If a requirement was ambiguous and I had a conversation with an analyst, I write down the conclusion in my document. Don’t leave it up to them to do it.

The great thing about this approach is that it gives you a safe way forward when presented with imperfect requirements. You can go ahead and get most of the work done and when it gets to testing and they interpret the requirements in a different way, go and mark it as “Functions as Specified” with a link to your document. I did that twice today and it makes life just that bit more pleasant. Quite often when there are multiple ways of designing some functionality it doesn’t matter all that much which one you pick, so long as people can easily find out which one that was. Even if it later turns out you mis-interpreted the requirements, presented with a clear description of the actual system behaviour the business may decide that the implementation is good enough, and that the requirements should be retrospectively changed to match. Won’t Fix!

Bugs in live

If you hit a bug outside of production, there’s not too much of a problem. We’ll just fix the code, re-run the test, sorted. When you hit a bug in live though, you have to recover from it.

Bugger.

This is the hard part of writing “business applications”: making them reliable. Most of the time we just don’t bother, and then there’s a major panic when anything goes wrong. You’ll have requirements that cover ‘negative flows’ by saying “create an alert”. Great, so the live system has just logged an alert saying we’re fucked. Time to go on holiday. Otherwise you’re sitting in a room full of worried managers who want you to find a workaround, and fast.

The best approach I’ve found for creating reliable applications is to expect them to fail. Expect that there will be bugs. Expect that hardware will fail. Expect that exceptions will occur. Expect that we’re going to have to recover.

How do we do that?

There’s no such thing as a database table that just contains internal application state that no-one other than the developers needs to understand. In some cases operations are going to have to hack this data – the structure and contents need to be well documented and understood. Functional tests are useful here; although some people claim that these tests should only assert against external interfaces and not against the database, I argue that the database is a semi-public interface. It’s not as formal and fixed as a web service, but developers, testers and operations all have a need to know how the database changes under processing.

Have hooks for inserting workarounds into live without deploying a new release (aka legitimate backdoors). For example, if the system applies validation rules to input messages, have some support for conditionally turning some of the rules off. One application I worked on was composed of dozens of little callable services with XML interfaces. The web interface included a developer-only page that had a textfield (service name) and textarea (xml) allowing you to send arbitrary internal messages to the live system. Dangerous but handy. Make sure you try your call in a test environment first! And don’t forget to secure it.

Expect that processing is going to fail, manual workarounds be applied, and we’ll then want the system to continue. The project with the callable services was at a Telco and integrated with billing systems and other monstrosities. Naturally some customers would have accounts that were … unusually set up, and would contradict any assumptions we could possibly make. To deal with this I turned the processing flow into a state diagram and then made it so that if an error occurred in any state the system would set a “manual” flag on that customer’s record, generate an alert message, allow for any manual fixups and have a “reprocess” button to try again.
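
Below is a rough sketch of that shape. None of these types come from the project in question; the point is just that every step runs inside a catch-everything block that parks the record, raises an alert, and leaves a “reprocess” path that re-enters the same loop from the current state.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative types only, not the original project's classes.
public class CustomerProcessor {

    public interface Step {
        String name();
        // Each step does its work and must advance the customer state before returning.
        void execute(Customer customer) throws Exception;
    }

    public static class Customer {
        private final String id;
        private String state = "NEW";
        private boolean manual = false;

        public Customer(String id) { this.id = id; }
        public String getId() { return id; }
        public String getState() { return state; }
        public void setState(String state) { this.state = state; }
        public void flagForManualIntervention() { manual = true; }
        public void clearManualFlag() { manual = false; }
        public boolean needsManualIntervention() { return manual; }
    }

    private final Map<String, Step> stepsByState;
    private final List<String> alerts = new ArrayList<String>();

    public CustomerProcessor(Map<String, Step> stepsByState) {
        this.stepsByState = stepsByState;
    }

    // Called for normal processing, and again when someone presses "reprocess"
    // after fixing the data by hand and clearing the manual flag.
    public void process(Customer customer) {
        while (!customer.needsManualIntervention() && stepsByState.containsKey(customer.getState())) {
            Step step = stepsByState.get(customer.getState());
            try {
                step.execute(customer);
            } catch (Exception e) {
                // Any failure at all: park the record and tell a human.
                customer.flagForManualIntervention();
                alerts.add("Step " + step.name() + " failed for customer " + customer.getId()
                        + " in state " + customer.getState() + ": " + e);
            }
        }
    }

    public List<String> getAlerts() { return alerts; }
}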

Taking that approach introduces some design constraints into the system that you simply can’t add as an afterthought. It’s easy and convenient to say, well at this processing stage we really need this piece of information, so instead of working it out again we’ll just persist it to the database at an earlier stage. That’s all well and good until the earlier stage didn’t actually go through automated processing – someone hacked it together by hand and then continued.

Whenever anyone pontificates about “loosely-coupled components” that’s what I think of. If Component A doesn’t actually run, will Component B still work, or does it depend on some of the internal behaviour of A?

Don’t try too hard to enumerate all the different things that can go wrong. Just assume that any component can fall over half way through in some unexpected way, possibly leaving data in a corrupted state. If you can handle that you can handle anything. Data corruption is an interesting one, actually. Try to segment data so that if, say, the data related to one customer is corrupted, processing can still continue for other customers. This is similar to the sharding that’s done for performance reasons.

Functional Tests

Ok, so I’ve talked about functional tests in a previous post. That’s because they’re great. When a problem occurs in live and you can in ten minutes write a test to replicate the problem and experiment with workarounds – I’ve done that many times. When a new developer joins the project and can actually understand and discover functionality and avoid breaking half the application – yup.

The rest

There are plenty of other issues to consider. How do we deal with upgrades – can we migrate old data? So you’re changing the database schema – are you aware that people have written systems that perform queries directly against your database? They’re going to break and you don’t even know it.

Reporting is a great one. It seems that “reports” are always something that gets implemented at the last minute and is a nightmare because, surprise surprise, they want some information that, sorry, we just aren’t representing in our domain model. It gets even worse when you have performance challenges and you’d really like it if the application had persisted certain intermediate statistics as it processed, rather than us having to work everything out afterwards. Actually, I could do a whole ‘nother post on performance, and on system integrity checking (reconciliation), deployment (clustering) …

I’d be interested in hearing any ideas or war stories other people have about dealing with live systems. Leave them in the comments!

Components vs Services

(This post is a bit unpolished, being a slightly edited copy of something I wrote on the train to work last Wednesday. No matter. I declare a Blogothon Principle – it is better to publish a post that isn’t ready than to miss out a week due to noble intentions …)

There’s a receipt on the table in my lounge. On the front: 1 Original Bianco Salmon pizza from Papa Johns West End Lane. Expected on 30/03/2009, at 21:05:31. On the back are some scribbles I made at 5am the next day. “Django little rest apps”, “Ruby _why mount little app”, “OMG it’s J2EE”, “seam -> EJB 3.1”, “coarsed grained well-defined interface”, “reusable component”, “can move to diff. servers, cluster etc.”, “EJB vs SCA”, “runtime vs design time”.

It all started a couple of weeks ago, when an innocent perusal of programming.reddit.com led me to an article claiming that the Python web framework Django is composed of a number of components that communicate over HTTP (REST-style), which makes it easy to swap different implementations of components and plugins in and out. Knowing that Jason is both a Python and REST fan, I mentioned this to him. Although it didn’t match his recollection of Django, we both agreed the idea sounded good.

Tick tock. Coarse-grained components, communicating via a network protocol, supporting local and remote interfaces so you can transparently distribute them on different servers. Smells like E J and B.

That can’t be right though. I’m not alone in remembering the pain of EJBs in J2EE – having to pay upfront for benefits that are never realised. All that stateless session bean local remote interface container managed naming management nonsense. So why does this approach sound good now? Have I become that which I hate? Some hand-waving architect making developers jump through hoops to solve a problem they don’t actually have?

(Aside – EJB 3.1 doesn’t look so bad. A couple of annotations here and there is all you need, so you can now use them for even fine-grained interfaces.)

Here’s what I think though. Implementing “loosely-coupled” components that communicate over a well-defined interface has costs and benefits. Whether it’s worthwhile is not so much about complexity as control.

If two components are internal to my team’s application, and we’re three developers, don’t make me jump through hoops in order to get them to communicate. If I want to change the interface of one of them, well, all the code’s sitting in my IDE in my project that I’m going to release as a single package, so I’m just going to change all the callers while I’m there.

In the Django case, the idea is that people who are using Django – not those who are developing it – will choose different components and write their own extensions. They have less control over the whole Django implementation, so separating components out in this way brings a benefit that exceeds the cost of writing and maintaining the interfaces. You have to care much more about API design, because you’re going to have callers that you don’t know about, and will need to maintain compatibility and care about versioning.

Without plagiarising too much from a sales document I wrote last month, this reminds me of Conway’s Law, the principle that the design of a software system will naturally mirror the communication structures of the teams implementing it. If you have three teams then you’re going to have three major components, with well-defined interface boundaries between them. We should embrace this observation and either design our interfaces around our teams, or organise our teams around the interfaces that we wish our software to have. If there are particular components you think should be kept independent, assign them to different developers.

But speaking of “you’re going to have callers that you don’t know about”, that can be a bit annoying. I spent a year at a Telco working on a (successful) project, and while I was there started to ponder some architectural questions. What would happen if you really said “let’s have lots of reusable interfaces”, and made these “runtime” things rather than “design time” ones, i.e. other projects call my web service rather than integrating my code into their project?

Suppose you’re a new contractor that arrives at the Telco on a Monday on a new project.

Imagine that you go to one web page that’s an entry point for all the web services in the organisation (on the development servers at least!). So you can choose a category like “Billing”, click a couple of links and get to a “query account” service. The page shows some XML on the left that you can edit; submitting it actually calls the web service and shows the result on the right. What’s more, we keep logging information about who is actually calling each service, including some simple performance stats. We can see that in the last week, project Thundercat has called this service 17,000 times, each call taking an average of 1.2 seconds (with std deviation blah). Maybe I should talk to the Thundercat developers if I need help.
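
Collecting those per-service statistics doesn’t need anything fancy. Assuming the services are exposed over HTTP, a servlet filter along the lines of the sketch below would do; all the class names here are invented.

import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;

// Hypothetical filter that records, per service path, how many calls were made and
// how long they took, so the catalogue page can display real usage numbers.
public class ServiceStatsFilter implements Filter {

    private static class Stats {
        final AtomicLong calls = new AtomicLong();
        final AtomicLong totalMillis = new AtomicLong();
    }

    private final ConcurrentHashMap<String, Stats> statsByService = new ConcurrentHashMap<String, Stats>();

    public void init(FilterConfig config) {}

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        String service = ((HttpServletRequest) request).getRequestURI();
        long start = System.currentTimeMillis();
        try {
            chain.doFilter(request, response);
        } finally {
            Stats stats = statsByService.get(service);
            if (stats == null) {
                Stats fresh = new Stats();
                Stats existing = statsByService.putIfAbsent(service, fresh);
                stats = (existing != null) ? existing : fresh;
            }
            stats.calls.incrementAndGet();
            stats.totalMillis.addAndGet(System.currentTimeMillis() - start);
        }
    }

    public void destroy() {}

    // Read side for the catalogue page.
    public long callsFor(String service) {
        Stats stats = statsByService.get(service);
        return stats == null ? 0 : stats.calls.get();
    }

    public long averageMillisFor(String service) {
        Stats stats = statsByService.get(service);
        long calls = (stats == null) ? 0 : stats.calls.get();
        return calls == 0 ? 0 : stats.totalMillis.get() / calls;
    }
}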

Later on you’ve added a new service. (It must be easy for people to do this without central permission! Give up control – it’s command-and-control versus encouraging developers to see value.) Six months later – alas! – one of the assumptions it made is no longer true. Customers can now have multiple accounts. Whoops. Any application calling the service needs fixing. How do we find out which apps these are? Why, we go and read the accurate documentation published for each application, haha. Even if there is documentation I wouldn’t trust it to be up-to-date. Go to the web page, click the service, and find out which applications have actually used it in the last month. That’s the spirit.

There’s the whole question around reliable web services – aka what happens when you deploy a new version of the service and someone tries to make a call. It’s unfair for the client to have to retry. I had an idea that all callers actually connect to a proxy, which is always atomically pointing at either the Blue or the Purple servers. When deployment has finished, we point all new calls at Purple. Any web service calls that haven’t finished yet will keep running on Blue (assume that these will not take too long – a couple of minutes at most). New calls will go to the new version on Purple. When all calls on Blue have finished, this server can be shut down and is ready for the next upgrade. Next time we do the same but in reverse – switch over to Blue. Obviously you can use more than two servers, round-robin style.
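
The core of that switch is tiny. The sketch below (all names invented) keeps the active backend behind an AtomicReference and counts in-flight calls so the old side can be drained before it’s shut down.

import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// New calls always go to whichever backend the AtomicReference currently points at;
// calls already in flight on the old backend simply finish there, and it can be shut
// down once its in-flight count drops to zero.
public class BlueGreenRouter {

    public static class Backend {
        private final String name;
        private final AtomicInteger inFlight = new AtomicInteger();

        public Backend(String name) { this.name = name; }

        public int inFlightCalls() { return inFlight.get(); }

        public String call(String request) {
            inFlight.incrementAndGet();
            try {
                return name + " handled: " + request; // stand-in for the real service call
            } finally {
                inFlight.decrementAndGet();
            }
        }
    }

    private final AtomicReference<Backend> active;

    public BlueGreenRouter(Backend initial) {
        active = new AtomicReference<Backend>(initial);
    }

    public String call(String request) {
        // Read the pointer once so a whole call runs against a single backend.
        return active.get().call(request);
    }

    // After deploying the new version, atomically point all new calls at it.
    public Backend switchTo(Backend next) {
        return active.getAndSet(next);
    }
}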

(Is there widely used software doing this already? What about using something that actually supports reliable messaging, e.g. JMS? Maybe AMQP is a good option these days. Speaking of AMQP, I’m happy that Red Hat owns JBoss; I’ve been a big fan of Red Hat for ages. Despite being dominant they’ve always been good about releasing everything as Free Software. Contrast with e.g. SuSE, which kept “value added” things proprietary. It’s always nice when Red Hat buys a company – all their proprietary stuff gets opened up. Anyway, I’m a bit annoyed about the whole issue where Red Hat in a press release tried to claim that they created AMQP. That’s rude and unnecessary.)

Part of the whole architectural approach I’m thinking of also, naturally, involves social issues around developers. It’s tempting to say “we need to do this and that because in the long term … therefore we must force these lazy developers to follow all this process”. Developers on individual projects don’t always see the long-term cost of foo. The problem is that at a high level you have the opposite blind spot – you’re seeing the long term but not taking into account the “short-term” (aka always paid) cost of slowing developers down in red tape like this. So my idea is to do things that individual developers on a project can see as adding value, and so encourage them to become part of the “community”. This is why the whole web page with statistics on services used must be available to all developers, without needing permission. It is not just an internal monitoring tool for project leads and architects to monitor people with. If you try to force people to do things, they won’t understand the “why” and will do it poorly anyway.

The title of this post is “Components vs Services”, but I haven’t actually addressed that issue. Perhaps some other time.

YO DAWG I HERD YOU LIKE XSL

I spent most of this week writing XSL templates for transforming XML messages into other, similar, XML messages. Ah, bless.

Thanks to the practice of companies (including us!) making minor changes to standard XSDs, several of the input/output format combinations are identical apart from namespaces and a couple of tag names. Copy-pasting a template six times seemed like a bad idea – it contains some non-obvious bits, and the formats are bound to change since the requirements aren’t yet nailed down.

The first thing I did was to split the templates apart and use <xsl:import href="/xsl/lib/foo.xsl"/> to pull them together again. This didn’t work the first time because the XSLT processor looked for foo.xsl on the filesystem rather than in the Java classpath. Solving this was easy enough – I wrote a javax.xml.transform.URIResolver like the one below to look up any URIs that don’t have a “scheme://” prefix from the classpath instead. This allows people to write “file:///xsl/lib/foo.xsl” if they really do want to reference the filesystem.

public Source resolve(String href, String base) throws TransformerException {
    URI uri = URI.create(href);
    if (uri.getScheme() == null) {
        InputStream is = getClass().getResourceAsStream(href);
        if (is == null) {
            return null;
        }
        return new StreamSource(is);
    } else {
        return parentResolver.resolve(href, base);
    }
}

Ok, so that worked, but I had an uncomfortable feeling that the InputStream would never get closed. It makes sense that it’d be the caller’s responsibility to close it; however, there’s no Source#close() method. The javax.xml.transform javadocs were no help, and a web search turned up plenty of people doing the same thing I was, but without any discussion of whether it is safe. Stepping through a debugger told me that at least the XSLT processor implementation we’re using (Saxon) does close the InputStream, and in any case Class#getResourceAsStream is returning a FileInputStream, which has a documented finalize method.

It can be a bit dangerous drawing conclusions about libraries on the basis of runtime debugging information rather than documented behaviour and interfaces – what happens in the future when I’m no longer on the project and someone switches to a newer version of Saxon or to a different XSLT processor? A few weeks ago I wrote a post denying the benefits of unit testing (but promoting automated end-to-end functional tests) for projects like mine; however, this case – verifying a technical question – is a perfect candidate for a unit test. In fact I would have been better off creating a unit test in the first place instead of using the debugger.

public void testCallerClosesInputStream() throws Exception {
    // Setup mocks
    IMocksControl mockControl = EasyMock.createControl();

    final boolean[] closed = { false };
    InputStream in = new ByteArrayInputStream(buildXSL("").getBytes()) {
        @Override
        public void close() {
            assertFalse("Already closed", closed[0]);
            closed[0] = true;
        }
    };

    URIResolver uriResolver = mockControl.createMock(URIResolver.class);
    expect(uriResolver.resolve("fake-href.xsl", "")).andReturn(new StreamSource(in));

    mockControl.replay();

    // Execute
    String xsl = buildXSL("<import href=\"fake-href.xsl\"/>");
    String xml = buildXML("<root/>");

    transformerFactory.setURIResolver(uriResolver);

    Templates template = transformerFactory.newTemplates(stringToSource(xsl));
    Transformer transformer = template.newTransformer();
    transformer.transform(stringToSource(xml), new StreamResult(new StringWriter())); 

    // Verify
    mockControl.verify();
    assertTrue("InputStream#close method should have been called", closed[0]);
}

// buildXML and buildXSL are simple helpers that wrap the standard "<?xml ..." and
// "<stylesheet ..." declarations around their arguments.
//
// stringToSource turns a String into a Source via a StreamSource and StringReader.
// new StreamSource(String) is a trap for the uninitiated; the argument is taken as
// a system id rather than the contents.

Why didn't I use EasyMock for the InputStream? First, java.io.InputStream is an abstract class rather than an interface. That's not a problem; all I need to do is use org.easymock.classextension.EasyMock. Initially I did this, and expect()'ed a single read() method that returned -1. The XSLT processor complained though because an empty document is not a valid XSL. I could have used andStubAnswer instead of andReturn in order to delegate read calls to a ByteArrayInputStream but having to do this for both read() and read(byte[], int, int) methods makes this pretty wordy. Instead I simply create an anonymous inner class that subclasses ByteArrayInputStream.

I should also note that despite my description of this as a perfect example of where a unit test is more suitable than an integration test, what you can't see is that this test lives in the project's engine-integration module rather than in our transformation library, and there's a set-up call to get the Spring bean named "xsltTransformerFactory". I've done this because I want to ensure I'm testing the exact same TransformerFactory implementation that the application is using, rather than just whatever implementation the transformation library's tests happen to use. Or perhaps I just really like integration tests.

Right, so all that helps a bit. Instead of one big template for which I need to vary namespaces I’ve now got three smaller templates for which I need to vary the namespaces. What to do. I was already dealing with one case around namespaces – many of the tags in the input were copied exactly to the output (including their sub-elements), but with a different namespace.

<xsl:apply-templates mode="copy-dest-ns" select="SomeElement">
    <xsl:with-param name="copy.dest.ns.namespace" select="$whatever.namespace" />
</xsl:apply-templates>

<xsl:apply-templates mode="copy-dest-ns" select="AnotherElement">
    <xsl:with-param name="copy.dest.ns.namespace" select="$whatever.namespace" />
</xsl:apply-templates>

copy-dest-ns is a template I based off the identity transformation. Writing out the parameter all the time makes the template hard to read, so I use the XSL “tunnel parameter” feature to allow me to simply write:

<xsl:apply-templates mode="copy-dest-ns" select="SomeElement" />
<xsl:apply-templates mode="copy-dest-ns" select="AnotherElement" />

Tunnel parameters are like dynamic scoping in programming languages. The parameter is passed in a higher-up template and is “magically” available to templates called further down.

So I’ve got things to the point where I’ve effectively got the following two XSLs that I want to turn into one:

<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="urn:some:namespace1" xpath-default-namespace="urn:some:namespace2">

    ... output and parameter declarations ...

    <xsl:template match="Foo">
        <xsl:apply-templates mode="copy-dest-ns" select="SomeElement" />
        <xsl:apply-templates mode="copy-dest-ns" select="AnotherElement" />
        <More>
            <xsl:value-of select="Moo" />
        </More>
        <SomeStatus>GOOD</SomeStatus>
        ... lots more elements ...
    </xsl:template>
</xsl:stylesheet>

and

<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="urn:some:namespace3" xpath-default-namespace="urn:some:namespace4">
    ... rest of the contents are the same ...
</xsl:stylesheet>

What I’d really like to write is xmlns=”$some.variable”. That’s not going to fly though – XSLs are XML documents and xmlns is part of XML, whereas the dollar-for-variable syntax is part of XSL. It’s at the wrong level of abstraction for me to use an XSL variable, and XML doesn’t have any parameterisation concept. Maybe I can set some namespace substitutions on the XSLT processor? Nope, nothing in there. The XSLT 2.0 specification, while quite readable, didn’t have any clues that I could spot (although <xsl:namespace-alias> was tempting).

It’s about this time that I start thinking that, well, I’ve got two XSLs that are almost the same. XSL documents are XML documents. XSLs are good for transforming between XML documents. Hmmm. Those I’ve worked with in NZ may not be surprised to hear that I started chuckling at this point about the “evil” and the “muhahahahwahah”. Actually I was trying really hard to avoid doing this. Having an XSL that transforms another XSL is cute and is a correct way of solving this problem robustly but I actually do have sympathy for the person who’s going to have to come along and maintain this stuff. Jason, the Python fan who I sat next to for six months would probably enjoy hearing that over the last year and a half of working in this environment I have come to appreciate explicit-over-implicit and the avoidance of magic. “Discoverability” is a word I like to use.

What else can we do? How about instead of writing the output elements “literally”, we use <xsl:element>, the very point of which is to support dynamic element outputs.

    ...
        <xsl:element name="More" namespace="{$some.variable}">
            <xsl:value-of select="Moo" />
        </xsl:element>
        <xsl:element name="SomeStatus" namespace="{$some.variable}">GOOD</xsl:element>
    ...

That’s not too bad, and it’s fair to say this is the “right” way of solving this problem in XSL. I started converting the template to use this style but it ended up hard to read and awfully verbose. Not very nice at all. What else can we do? I had a chat to Jason, who’d had the same kind of issue on the project he’s working on. He ended up making it so his XSLs didn’t care about namespaces at all, by stripping them out before passing them to the XSL, and adding back a fixed header on output. Unfortunately my output documents contain elements from a mixture of namespaces, so I really need my templates to be namespace aware.

I went back and forth and decided that finally, yes, I would write an XSL that transforms an XSL. Fine. Sigh. I’d have to make something that also somehow transforms any XSLs that are included via <xsl:include> or <xsl:import>. No problem. Recall the URIResolver I wrote previously to load documents from the classpath instead of the filesystem. I can also write a TransformingURIResolver that first delegates to the parent ClasspathURIResolver to get the StreamSource then applies a transformation and returns the result. I implemented this, no problem, and thanks to the fact that we are “compiling” the templates with TransformerFactory#newTemplates, this transformation stage happens entirely at startup. By the time we get around to busily transforming lots of large XML files we’ve got the resultant XSLs all ready in memory in an efficient form.
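
For the curious, a TransformingURIResolver could look something like the sketch below. It isn’t the project’s code, just the shape of the idea: delegate to the classpath resolver, run the meta-template over whatever comes back, and hand the result to the processor.

import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;

// Fetch the imported XSL via the classpath resolver from earlier in the post, then
// run the namespace-rewriting meta-template over it before returning it. Only used
// at startup, when the main templates are compiled with TransformerFactory#newTemplates.
public class TransformingURIResolver implements URIResolver {

    private final URIResolver classpathResolver;   // the resolver shown earlier
    private final Templates metaTemplate;          // the compiled XSL-to-XSL transformation

    public TransformingURIResolver(URIResolver classpathResolver, Templates metaTemplate) {
        this.classpathResolver = classpathResolver;
        this.metaTemplate = metaTemplate;
    }

    public Source resolve(String href, String base) throws TransformerException {
        Source original = classpathResolver.resolve(href, base);
        if (original == null) {
            return null; // let the processor fall back to its default resolution
        }
        // Transform the imported stylesheet into a DOM and return that as the Source.
        DOMResult transformed = new DOMResult();
        metaTemplate.newTransformer().transform(original, transformed);
        return new DOMSource(transformed.getNode(), href);
    }
}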

The next step was to write the XSL to XSL XSL. I had fun naming the file, and especially the XSL inputs to the template which I called “xxx001-to-yyyyyy-xxx002-header-meta-template-input.xsl” (xxx and yyy added to anonymise things a bit). Deliciously wicked.

So how do we change the value of “xmlns”? I wrote something like the following:

    <xsl:template match="xsl:stylesheet">
        ...
    </xsl:template>

    <xsl:template match="@xmlns">
        <xsl:attribute> ... </xsl:attribute>
    </xsl:template>

It didn’t work. A quick test showed that if I renamed “xmlns” to “xmlns2” then everything worked fine. Aha! A special-case bug in the XSLT processor. Oh really? Nope. The problem is that “xmlns” and “xmlns:foo” etc. aren’t actually attributes, although they look like them. They’re namespaces. “xmlns” is an XML thing not an XSL thing. It helps if you imagine there’s special syntax for namespaces instead of having just a naming convention to differentiate them from attributes.

<?xml version="1.0"?>
<xsl:stylesheet [xmlns="..."] [xmlns:foo="..."] xpath-default-namespace="...">
    ...
</xsl:stylesheet>

From this we can see that we could match “@xpath-default-namespace” but not “@xmlns”. How do we define namespaces on an element? <xsl:element> has a namespace attribute so that’ll do the trick. We can also use <xsl:namespace> to define “xmlns:foo” and so on.

    <xsl:template match="xsl:stylesheet">
        <xsl:element name="xsl:stylesheet" namespace="{$some.variable}">
            <xsl:apply-templates mode="copy-attributes" select="@*" />
        </xsl:element>
    </xsl:template>

    <xsl:template mode="copy-attributes" match="@*">
        <xsl:copy />
    </xsl:template>

Having finally decided to bite the bullet and solve this problem as an XSL, despite my misgivings about “magic”, by the time I’d got the template to a state where it was more or less complete I ended up thinking that it was so ugly and hard to follow that I wasn’t really willing to live with it. XSL documents are quite readable when your templates resemble the structure of your output document, but they get quite messy at other times.

Damnit. So I went back and did what I should have done in the first place – working out exactly what transformations I’d need between the different templates. Do I need to rename any elements, or is it just the namespace declarations on the root stylesheet element? With that information in hand it was clear that the XSL solution, while robust, was way more powerful than I needed. I started to ponder that a simple solution – straightforward text replacement – might actually be the way to go.

The problem with this is that it is (again) working at the wrong abstraction level. XML is a text format, yes, but it’s a structured text format. If I replace all occurrences of “urn:foo:bar” with “urn:baz:quux”, then what happens if the text just happens to appear somewhere else in the document, outside the namespace declaration I’m intending to match? The namespace names are pretty distinctive, so this isn’t too likely, although $copy.dest.ns.namespace parameter values are good candidates for accidental replacement.

Programming taste is often a matter of balancing good and evil. The evil of treating XML as unstructured text, the good of a simple understandable solution. The evil of a complicated XSL-transforming XSL, the good of a robust, theoretically safe solution. I decided that the former was the lesser evil, but added comments over the XSLs noting that text replacements are applied on this file, so beware. Finally, instead of having the template be correct for one format, and transformed for the others, I decided to make it really explicit, and used xmlns=”TEXT_REPLACEMENT_FOO1_NAMESPACE” instead of xmlns=”urn:foo:bar”. That makes it really obvious to anyone coming along that any modifications to this template will impact multiple formats. (And gives a string they can search for in the source tree to figure out what’s going on.)

For implementing text replacement, I’d like to use a regular expression with “a|b|c|d” matching all the keys I’m going to replace, and use the regexp API that lets you iterate over match information in order to find exactly which key you’ve matched. Unfortunately Java’s regexp API lacks a Pattern.escape method (though Pattern.quote, added in Java 5, covers this if you’re on a new enough JDK). Very annoying. I wrote my own simplistic implementation instead that doesn’t bother to ensure that different keys are replaced “simultaneously” and doesn’t care about performance. Good enough for this situation.
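
For the record, the “simultaneous” alternation approach could look something like the sketch below, using Pattern.quote to escape the keys. It isn’t the code from the project, just an illustration of the technique.

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Replace all keys in one pass, so the result of one replacement can never be
// matched again by another key.
public class SimultaneousReplacer {

    public static String replaceAll(String input, Map<String, String> replacements) {
        StringBuilder alternation = new StringBuilder();
        for (String key : replacements.keySet()) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(key)); // escape regex metacharacters in the key
        }
        Matcher matcher = Pattern.compile(alternation.toString()).matcher(input);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            // matcher.group() is exactly one of the keys, so look up its replacement
            matcher.appendReplacement(result,
                    Matcher.quoteReplacement(replacements.get(matcher.group())));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}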

Sorted. Cleaned it up, wrote tests, committed. If I was doing this a lot I’d probably write a robust namespace changing transformer, using a non-XSL mechanism like the DOM API. I’ve spent enough time on this already, so I’m ready to move on.

Update 2009-03-29 13:13 – After writing this post, I realised that it actually wouldn’t be hard to write a generic namespace-transforming XSL, where one of the input documents is an XSL and the other is of the form

<namespaceMappings>
    <mapping>
        <old>urn:foo</old>
        <new>urn:bar</new>
    </mapping>
    <mapping>
        <old>urn:foo2</old>
        <new>urn:bar2</new>
    </mapping>
</namespaceMappings>

One of those cases where writing a generic but limited solution (transforming namespaces only) is simpler than writing a meta-template for a specific case that tries to know too much about the input document. I’ll stick with the text replacement solution for now though.

Running late…

I’m supposed to write another post by Sunday, but I’m going to Dublin tomorrow night and not getting back until Monday morning. But I will write a post! It’ll just have to be on Monday night (what a day that is going to be).

Random snippets

It’s Sunday afternoon, I’m finally having that espresso I’ve been too tired to make all afternoon, so this is going to be a shorter post of some simple little tidbits (Timbits, for any Canadians out there).

Unchecking Exceptions

I don’t want to get into an argument about checked vs unchecked exceptions, but from time to time I’ve written the following to “convert” any checked exceptions to unchecked:

try {
  someMethod(); // throws some checked exceptions
} catch (RuntimeException e) {
  throw e;
} catch (Exception e) {
  throw new RuntimeException(e);
}

Occasionally people come up and “simplify” this to:

try {
  someMethod(); // throws some checked exceptions
} catch (Exception e) {
  throw new RuntimeException(e);
}

Can you please stop doing that – the code fragments are not equivalent. The former avoids wrapping exceptions that are already unchecked. Spread the word, thanks.

EasyMock error

EasyMock is a useful unit-testing utility for Java that lets you create mock implementations of interfaces and assert that the code under test makes certain method calls.

People, including myself, occasionally run into the error message
“missing behaviour definition for the preceeding method call” and it’s not always obvious why, so here’s an answer for the search engines.

It probably means you wrote something like this: “expect(foo.something()).andReturn(someMethodThatItselfCreatesAMockObject());”.
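
What’s going on is that the argument to andReturn() is evaluated first, and if that call creates another mock and records expectations on it, EasyMock sees a new mock call while the previous one still has no behaviour defined. The usual fix is to build the return value before recording the expectation – roughly like this, with Foo and Bar being made-up types:

Bar cannedResult = EasyMock.createMock(Bar.class); // build the return value up front
// (record any expectations on cannedResult here, before touching foo again)

expect(foo.something()).andReturn(cannedResult);   // nothing mock-related happens in between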

Another reason Hibernate annoys me

If you write this code but you’ve forgotten to include “<class>Thing</class>” in META-INF/persistence.xml, what will happen?

List foo = session.createCriteria(Thing.class).list();

You’ll get an error right? Wrong – Hibernate will return an empty list.
HHH-1665 includes a patch, created almost two years ago, to correct this, but it’s been marked as Minor, Awaiting Test Case, and hasn’t been applied.

Sure, I could do the work myself and fix this, but I run into these kinds of issues every time I use it. Any suggestions for alternate JPA implementations?

Name for a book

Someone should write a book called “Boxes & Lines, a pragmatic guide to documenting software architectures”.

Hack

When talking to non-developers about a kludge that you want to apply, call it a “tactical solution”.

Semantic identifiers

When unique identifiers escape out of the database realm (e.g. into XML messages), it’s nice to tag them so they’re more than a number. For example “PROJ_INPUT_1234” instead of just “1234”. This makes it easy to spot bugs where you’re accidentally using the wrong ids, and also lets you encode information in the id so you can perform some processing without having to look up a database record (which can be an expensive operation if you’ve got a million identifiers coming in at once).
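
A trivial sketch of the encode/decode pair (the prefix here is made up):

// Tag ids as they leave the database: "PROJ_INPUT_1234" instead of "1234".
public final class SemanticId {

    private static final String PREFIX = "PROJ_INPUT_";

    private SemanticId() {}

    public static String encode(long databaseId) {
        return PREFIX + databaseId;
    }

    public static long decode(String semanticId) {
        if (!semanticId.startsWith(PREFIX)) {
            // Catches the "accidentally using the wrong kind of id" bugs early.
            throw new IllegalArgumentException("Not a PROJ_INPUT id: " + semanticId);
        }
        return Long.parseLong(semanticId.substring(PREFIX.length()));
    }
}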

A question

To finish, a question for your ponderance. What if computers were infinitely fast? No matter how slow your program, it would either run forever or terminate immediately.

What would programming languages look like? What would be the interesting problems in computer science?

Re: What My Cat Brought In Last Night

According to the blogothon rules I wrote a couple of weeks ago, I should have commented on Justin’s post What My Cat Brought In Last Night over the weekend. Lazy me. (And I probably have Justin to thank for the fact that a well-known Rubyist originally from South Korea commented on mine. Hi Francis, I wrote the Ruby Weekly News for a couple of years in 2005-2007 before disappearing off into a different geographical and programming hemisphere.)

I found Justin’s article useful – it started out like it was going to be a bitch about how the software we write in ye “business” world is mostly boring and trivial and woe is us, but it was actually the opposite, telling the complainers they should either become knowledgeable in the business domain they’re working in or piss off and stop taking the money.

I’ve been guilty at times of being one of the complainers, but I do agree with Justin. Some of the most effective and enjoyable times I’ve had as a programmer have been when I’ve developed intuition about business needs and been able to brainstorm ideas with analysts.

It sounds obvious, but one of the best ways of understanding how to “add business value” is to actually meet customers and listen to what they focus on. Especially in the UK, working at big companies, it’s easy as a developer to only know the customer through requirements documentation written by an analyst who got an email from a salesperson who thought they heard a customer say something.

Around six months ago I received some enlightenment after an extended weekend tenting at an Irish music festival, when I turned up to work on a Wednesday morning to find that I was flying to Brussels the next day for pre-sales meetings. I’d just spent a year working on an application that provides a service to banks, but hadn’t ever actually spoken directly to anyone from a bank. I wore the same tie I’m wearing for today’s ironic casual Friday (a social activity that it seems only I celebrate – my life’s sad enough that I find amusement in answering “because it’s casual Friday” when asked why I’m wearing a tie when I usually don’t. I probably wouldn’t do it if we actually had casual Friday; maybe I’m subconsciously protesting its absence!)

Within an hour of the meeting starting I found myself realising that all kinds of things I’d glossed over were actually really important to our customers. They had a completely different perspective to me – I was focused on our systems talking to banks’ systems and had almost forgotten that in the end our application was (indirectly) providing a service for banks’ customers.

Oh, so that’s why we care about how many days it’s going to take a downstream system to process our message – there’s a human being out there who’s trying to buy a motorcycle for travelling around India and they swear they’ve sent the money already if you just check your records sir.

See Justin, you’re not just talking to yourself!

Why unit testing is a waste of time

In the last few years of writing “enterprise” software in Java I’ve created and maintained thousands of unit tests, often working on projects that require 80% or better code coverage. Lots of little automated tests, each mocking out dependencies in order to isolate just one unit of code, ensuring minimal defects and allowing us to confidently refactor code without breaking everything. Ha! I’ve concluded that we’d be better off deleting 90% of these tests and saving a lot of time and money in the process.

The root of the problem is that most “units” of code in business software projects are trivial. We aren’t implementing tricky algorithms, or libraries that are going to be used in unexpected ways by third parties. If you take most Java methods and mock out the services or components they call then you’ll find you aren’t testing very much at all. In fact the unit test will look very similar to the code it is testing. It will have the same mistakes and need parallel changes whenever the method is modified. How often do you find a bug thanks to a unit test? How often do you spend half the day fixing up test failures from perfectly innocent code changes?

The complexity that does arise in this kind of software is all down to interactions between components, messy and changing business requirements and how the whole blob integrates at runtime.

Should we do away with automated testing altogether then? No. In fact I have found “functional” end-to-end automated tests to be incredibly useful. By functional tests, I mean those that test software in more-or-less the same way as a human tester does, covering wide swathes of the application without checking the result of every possible variation.

The benefits of functional tests are outlined below, and after discussing the one big downside I admit where unit tests can be appropriate.

Benefits of automated functional tests

Smoke testing

If we deploy this release of the application, is the application going to release a deployment of toxic fumes and metal scraps?

I’ve worked on a project where, with the “final” release looming, we were creating a release for manual testing at the end of every day. And the next morning, testers would try to run a few simple scenarios and the application would fall over at the first hurdle. Incorrect database patches, configuration errors, NullPointerExceptions in critical paths and the like.

Later on I developed a set of a dozen or so functional tests, covering just a few basic scenarios. Nevertheless, this was sufficient to ensure that manual testers were able to find useful bugs in the application rather than having to sit around for days twiddling their thumbs because nothing works.

New developers

It’s tough being a new developer on a project. Faced with a mountain of code I haven’t seen before, performing some process I don’t really understand – wait, what does this application do again? – the first thing I like to do is run the damned thing and get a few ideas on what it actually does. To play tester for a while. Unit tests aren’t very useful here. We want a high-level overview of the application rather than worrying about how some tiny part of it might be implemented.

So how do you get the application up and running in your development environment? If you’re lucky, there’ll be some up-to-date instructions for getting it to kind of start up. Then you’ll get one of the other developers to show you how to run a few things. “Um, let’s do it on my machine,” they’ll say. See, you need to hack up the reference data in the database like this, take this sample XML file I’ve got sitting around and modify this, this and this field, stick it in this table with these dozen values, and then … it goes on. Good luck ever replicating that yourself.

If however, you have some automated functional tests, you can say “run this test that covers the simplest scenario with everything going right”. They can then explore the database and see what kinds of information has been written and even create breakpoints in a debugger to step through the application and get an idea of how everything sits together.

This is massively handy, not just for the new developer but for everyone around them. Instead of having to spend weeks or months babysitting each new developer you can just point them at a set of tests and tell them to play around for a while.

What’s more, a “new developer” is not just one who is new to the entire project. With any project of a reasonable size, you will naturally find that each developer has parts of the application that they don’t see, touch or understand. If you don’t want to maintain the same part of the application for all time then it’s in your best interests to make sure you can hand it over to another person with the least impact possible. Having functional tests is a great way of doing this.

The benefits described here give us a few clues to how our functional tests should be implemented.

First, they need to set up their own reference data so that a new developer can sit down and run them straight away without hacking their database. In fact, the first thing any test should do is clear out all the reference data and set up just the absolute minimal data required to run through the test successfully. This helps people understand the impact of setting different reference data values (this is often woefully under-documented!) and, by having the data be independent, we avoid the problem where adding a new test requires jumping through hoops to avoid breaking every other test in the system. (Aside – I have a convenient method for setting up reference data in code that I may share at some future stage.)

Second, if people are going to be running through the test in a debugger and stopping at breakpoints to explore the runtime state of the application, then we can’t get away with putting sleep(fixedTime) calls throughout the tests. You can’t say “the application should have finished processing by now” if someone has actually paused it for half an hour. Tests that sleep are slow and unreliable in any case.
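
What works instead is polling for a condition with a generous timeout, along the lines of the sketch below (the helper and its names are invented, not project code):

// Keep checking a condition until it holds or a generous timeout passes, rather
// than asserting after a fixed sleep.
public final class WaitFor {

    public interface Condition {
        boolean isSatisfied() throws Exception;
    }

    private WaitFor() {}

    public static void waitFor(Condition condition, long timeoutMillis) throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.isSatisfied()) {
                return;
            }
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError("Condition not satisfied within " + timeoutMillis + "ms");
            }
            Thread.sleep(200); // short poll interval, not a "should be done by now" guess
        }
    }
}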

Experimentation and Precise Requirements

If a client asks you “how does the application behave in this weird scenario,” how long does it take you to give an answer? The requirements documents probably don’t have that level of detail, and in any case the question is how the application really behaves.

If you have functional tests already, then it’s probably straightforward to modify one that checks a related scenario. After you’ve written it, you can read over it again to check that you’re actually testing what you intend to. Checking it manually would be error-prone and involve a lot of set up time.

Odd special cases

I’ve had cases where we decided that one particular client would be able to submit input data that was incorrect in a couple of specific ways, and we’d accept it anyway. The last one was one of those good times where something that was causing major headaches at a high level in the business turned out to be an issue that I, as an individual developer, could make go away with some minor code changes. I think sometimes we take for granted how fortunate we are as developers to be able to make such a direct impact on problems instead of being wholly dependent on others.

Right, so I’ve put this worthwhile hack in. What’s to stop another developer coming along and changing the code again? There’s no obvious reason why it behaves in this way – it’s just down to history. I can put comments in the code, get requirements to update their documents, but no-one reads documents anyway ;-) What’s more, even though I only had to make some minor changes, that was down to the coincidental way other parts of the code happen to be implemented at the moment.

What I did was to write a functional test that takes the exact sample file from the client and runs through the applicable scenario, checking that it is accepted. (I also wrote another test to check that the same file is rejected if it’s from any other client.) If a developer ever makes a change that breaks the test, the first thing they’ll do is open it up and find a big comment at the top that I wrote explaining all the history around why this is necessary and what it does.

This makes the requirement discoverable when it matters. Bonus points!

Fixing a bug, really

A tester raises a defect. “When I do this, the application gives an error.” You check the code and find, oops! We’re doing something stupid. A quick fix, commit, mark the defect as fixed. Weeks later they re-test it, and re-open the defect because the same series of steps still fails in the same place. It’s a different error message though! As a developer you can try to say that the original defect was “fixed”, it’s just that the application is now getting further along and hitting a different issue. No-one cares though.

If you write a functional test to verify the scenario now works (even if this is just an ad-hoc test that you don’t commit) you’ll ensure that the tester will agree with you that it’s been fixed.

Downsides of automated functional tests

There’s really only one – they are SLOW. Whereas you can easily run a hundred unit tests in a second if you’re using a lot of mock objects, a single functional test may take 2 minutes to complete. This is a big problem. Whenever I’ve been talking about writing an ad-hoc test, and it being beneficial even if you don’t commit it, I’m alluding to the issue that it’s practical to write a functional test for every defect raised, but not so practical to run them all every time someone commits, or even once a day, as a way to guard against regressions.

Because I think that functional tests have enormous benefits, I also think it’s worth investing time getting them to run as fast as you can. This may include running tests in parallel on a cluster of servers (hey, seems I’ve found a way lots of companies could utilise “cloud computing”). It also tends to be better to add more assertions to existing tests than to construct an entirely new independent test. You can make tests that are “data-driven”, so that a single test may have a table of a dozen different options that it loops through, without having to start from the beginning each time.
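
As an illustration of the data-driven style – the scenario helpers here are placeholders rather than real project code:

// One functional test loops over a table of cases instead of paying the
// end-to-end start-up cost once per case.
public void testRejectionReasons() throws Exception {
    String[][] cases = {
            // input file,           expected rejection reason
            { "missing-account.xml", "ACCOUNT_NOT_FOUND" },
            { "negative-amount.xml", "INVALID_AMOUNT" },
            { "duplicate-ref.xml",   "DUPLICATE_REFERENCE" },
    };

    startApplicationOnce();                 // expensive setup happens a single time

    for (String[] testCase : cases) {
        submitInputFile(testCase[0]);
        assertRejectedWithReason(testCase[0], testCase[1]); // name the case in the failure message
    }
}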

The slowness means that you’ll only be able to have a small number of functional tests – under a hundred – and this means that there’ll be a lot of scenarios (particularly negative scenarios) that you’re still not testing.

On the other hand, an automated functional test still runs faster than a human tester. It’s not unusual to have teams of testers that take a week to run through a hundred scenarios. With every release they have to go through this entire process again. What’s the cost of this, in both money and time? More than the cost of a set of servers?

(The other option, a key feature of Rails and becoming popular in other recent web frameworks as well, is to support “integration tests” that work at a lower level than functional tests, but still well above unit tests. Such a test would, for example, request a URL and assert that the response is to be forwarded to a different URL, without actually going through a web server or web browser. The exact level at which you test will depend on the type of application, and how easy it is to simulate a user.)

There’s an interesting post on the RunCodeRun Blog called It’s Okay to Break the Build. It argues that it’s unreasonable for developers to have to run an entire set of slow tests before every commit. I agree, and have found that simply running a couple of “relevant” functional tests before committing ensures that it is unlikely that you will break any other tests. Breakages can be quickly reverted after the test failure shows up on the main build server. (Perhaps there are some DVCS workflows that would be useful here?)

Continuous Deployment at IMVU: Doing the impossible fifty times a day is another fascinating article. It describes a project that has 4.4 hours of automated tests, including an hour of tests that involve running Internet Explorer instances and simulating a user clicking and typing. By running them across dozens of servers, they can run all the tests in 9 minutes. What’s more, they have so much faith in these tests that their code is automatically deployed to live when the tests pass. This happens 50 times a day on average. Brilliant!

Concluding by backtracking

Although I’ve dismissed the benefit of having lots of unit tests in the types of projects I’ve recently been working on, what if your environment or project isn’t the same as mine?

Consider the strength of interfaces and the separation of developers in your project.

Are you writing a library or framework that people may use in ways you don’t expect? Do you have a lot of “generic” code that has strong interface boundaries? What is the impact if a bug is found? If all your development is within one team, as soon as you discover a bug in one section of the code it’s easy enough to write a functional test and fix it. That doesn’t hold, though, if the developers who discover the bug and need the fix have little or no hope of writing that test themselves.

In some cases it is useful to test code in isolation and mock out its dependencies – because you actually don’t know what these will be at runtime. In a lot of “business” code though, while people may claim some code is generic it isn’t really. Or if it is, it’s a case of premature indirection (not abstraction!) being the root of all evil.

Another dimension to consider is the rate of change of requirements documents.

If requirements are fuzzy and constantly changing, then what’s the point of testing all sorts of weird edge cases? The desired behaviour won’t be well-defined in any case! With changing requirements, functional tests are very useful to keep track of how the application is supposed to (and does) behave at the current point in time.

I would go so far, in certain circumstances, as to say that good testers should be able to write automated functional tests themselves, based on higher-level requirements, using a domain-specific language (DSL) written by the developers. This provides a precise language with which business analysts, testers and developers can communicate. Imagine receiving a bug report with a test case that you can immediately reproduce instead of having to guess at! With enough skill, a large set of functional tests would act in concert with higher-level requirements documents to produce a system that has well-documented behaviour. Unfortunately the “certain circumstances” I’m talking about is “Tim’s dream world”. In the real world, you need to have at least the first big set of tests written by a decent developer who has good taste.

To conclude, delete most of your unit tests and replace them with just 10 functional tests for a start. Forget about code coverage metrics, and let me know how it goes!