Components vs Services

(This post is a bit unpolished, being a slightly edited copy of something I wrote on the train to work last Wednesday. No matter. I declare a Blogothon Principle – it is better to publish a post that isn’t ready than to miss out a week due to noble intentions …)

There’s a receipt on the table in my lounge. On the front: 1 Original Bianco Salmon pizza from Papa Johns West End Lane. Expected on 30/03/2009, at 21:05:31. On the back are some scribbles I made at 5am the next day. “Django little rest apps”, “Ruby _why mount little app”, “OMG it’s J2EE”, “seam -> EJB 3.1”, “coarsed grained well-defined interface”, “reusable component”, “can move to diff. servers, cluster etc.”, “EJB vs SCA”, “runtime vs design time”.

It all started a couple of weeks ago, when an innocent perusal of programming.reddit.com led me to an article claiming that the Python web framework Django is composed of a number of components that communicate over HTTP (REST-style), which makes it easy to swap different implementations of components and plugins in and out. Knowing that Jason is both a Python and REST fan, I mentioned this to him. Although it didn’t match his recollection of Django, we both agreed the idea sounded good.

Tick tock. Coarse-grained components, communicating via a network protocol, supporting local and remote interfaces so you can transparently distribute them on different servers. Smells like E J and B.

That can’t be right though. I’m not alone in remembering the pain of EJBs in J2EE – having to pay upfront for benefits that are never realised. All that stateless session bean local remote interface container managed naming management nonsense. So why does this approach sound good now? Have I become that which I hate? Some hand-waving architect making developers jump through hoops to solve a problem they don’t actually have?

(Aside – EJB 3.1 doesn’t look so bad. A couple of annotations here and there is all you need, so you can now use them for even fine-grained interfaces.)

Here’s what I think though. Implementing “loosely-coupled” components that communicate over a well-defined interface has costs and benefits. Whether it’s worthwhile is not so much about complexity as control.

If two components are internal to my team’s application, and we’re three developers, don’t make me jump through hoops in order to get them to communicate. If I want to change the interface of one of them, well, all the code’s sitting in my IDE in my project that I’m going to release as a single package, so I’m just going to change all the callers while I’m there.

In the Django case, the idea is that people who are using Django – not those who are developing it – will choose different components and write their own extensions. They have less control over the whole Django implementation, so separating components out in this way now has a benefit that exceeds the cost of writing and maintaining the implementation. You have to care much more about API design, because you’re going to have callers that you don’t know about, and you will need to maintain compatibility and care about versioning.

Without plagiarising too much from a sales document I wrote last month, this reminds me of Conway’s Law, the principle that the design of a software system will naturally mirror the communication structures of the teams implementing it. If you have three teams then you’re going to have three major components, with well-defined interface boundaries between them. We should embrace this observation and either design our interfaces around our teams, or organise our teams around the interfaces that we wish our software to have. If there are particular components you think should be kept independent, assign them to different developers.

But speaking of “you’re going to have callers that you don’t know about”, that can be a bit annoying. I spent a year at a Telco working on a (successful) project, and while I was there started to ponder some architectural questions. What would happen if you really said “let’s have lots of reusable interfaces”? And let’s have these be “runtime” things rather than “design time”, i.e. other projects will call my web service rather than integrating my code into their project.

Suppose you’re a new contractor that arrives at the Telco on a Monday on a new project.

Imagine that you go to one web page that’s an entry point for all the web services in the organisation (on the development servers at least!). You can choose a category like “Billing”, click a couple of links and get to a “query account” service. The page shows some XML on the left that you can edit; clicking it actually calls the web service and shows the result on the right. What’s more, we keep logging information about who is actually calling each service, including some simple performance stats. We can see that in the last week, project Thundercat has called this service 17,000 times, each call taking an average of 1.2 seconds (with std deviation blah). Maybe I should talk to the Thundercat developers if I need help.
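To make the logging idea a little more concrete, here is a minimal sketch of the bookkeeping behind such a page. Everything in it is an assumption for illustration: the `ServiceCallLog` class, its method names and the service name `billing/query-account` are invented, and a real version would persist the records somewhere rather than hold them in memory.

```python
from collections import defaultdict
from statistics import mean, pstdev


class ServiceCallLog:
    """Hypothetical in-memory record of who called which service, and for how long."""

    def __init__(self):
        # (service, caller) -> list of call durations in seconds
        self._durations = defaultdict(list)

    def record(self, service, caller, seconds):
        """Store the duration of one call made by `caller` to `service`."""
        self._durations[(service, caller)].append(seconds)

    def stats_for(self, service):
        """Return {caller: (call_count, mean_seconds, std_dev_seconds)} for one service."""
        result = {}
        for (svc, caller), durations in self._durations.items():
            if svc == service:
                result[caller] = (len(durations), mean(durations), pstdev(durations))
        return result


log = ServiceCallLog()
for seconds in (1.0, 1.2, 1.4):
    log.record("billing/query-account", "Thundercat", seconds)

# One row per calling project: number of calls, mean and standard deviation.
for caller, (count, avg, dev) in log.stats_for("billing/query-account").items():
    print(f"{caller}: {count} calls, mean {avg:.2f}s, std dev {dev:.2f}s")
```

The point of keeping the record keyed by caller is that the page can answer “who should I talk to?” as well as “how slow is it?”.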

Later on you’ve added a new service. (Must be easy for people to do this without central permission! Give up control. Whole command and control vs encouraging developers to see value.) Six months later – alas! – one of the assumptions it made is no longer true. Customers can now have multiple accounts. Whoops. Any application calling the service needs fixing. How do we find out which apps these are? Why, we go read the accurate documentation published for each application, haha. Even if there is documentation I wouldn’t trust it to be up-to-date. Instead, go to the web page, click the service, and find out which applications have actually used it in the last month. That’s the spirit.
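The “who has actually used it in the last month” question is a simple filter over the call log. A rough sketch, assuming the log is just a list of `(service, caller, timestamp)` tuples; the function name `recent_callers`, the tuple layout and the application name `Legacy-CRM` are all made up for the example.

```python
from datetime import datetime, timedelta


def recent_callers(call_log, service, now, window=timedelta(days=30)):
    """Return the sorted names of applications that called `service` within `window` of `now`."""
    cutoff = now - window
    return sorted({
        caller
        for (svc, caller, called_at) in call_log
        if svc == service and called_at >= cutoff
    })


now = datetime(2009, 4, 1)
call_log = [
    ("billing/query-account", "Thundercat", datetime(2009, 3, 30)),
    ("billing/query-account", "Legacy-CRM", datetime(2008, 11, 2)),  # too old to count
    ("billing/close-account", "Thundercat", datetime(2009, 3, 29)),  # different service
]

print(recent_callers(call_log, "billing/query-account", now))  # ['Thundercat']
```

Because the answer comes from observed calls rather than from documentation, it stays accurate without anyone having to maintain it.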

There’s the whole question around reliable web services – aka what happens when you deploy a new version of the service and someone tries to make a call. Unfair for the client to have to retry. Had an idea that all callers actually connect to a proxy, which is always atomically pointing at either the Blue or the Purple servers. When deployment has finished, we point all new calls at Purple. Any web service calls that haven’t finished yet will keep running on Blue (assume that these will not take too long – couple of minutes at most). When all calls on Blue have finished, this server can be shut down and is ready for the next upgrade. Next time we do the same but in reverse – switch over to Blue. Obviously can use more than 2 servers, round-robin.
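Here is a toy, in-process sketch of that Blue/Purple idea, just to pin down the moving parts. It is not a real network proxy: the `BluePurpleProxy` class and its methods are hypothetical, and each “server” is stood in for by a plain Python callable. The two things it tries to show are that choosing the pool and counting the call happen in one locked step, and that a pool is only safe to shut down once its in-flight count reaches zero.

```python
import threading


class BluePurpleProxy:
    """Hypothetical router: new calls go to the active pool, old calls drain on the other."""

    def __init__(self, handlers, active="blue"):
        self._handlers = handlers                    # pool name -> callable(request)
        self._active = active
        self._in_flight = {name: 0 for name in handlers}
        self._lock = threading.Lock()

    def call(self, request):
        # Choose the pool and count the call under one lock, so a switch
        # cannot slip in between the two steps.
        with self._lock:
            pool = self._active
            self._in_flight[pool] += 1
        try:
            return self._handlers[pool](request)
        finally:
            with self._lock:
                self._in_flight[pool] -= 1

    def switch_to(self, pool):
        """Point all new calls at `pool`; calls already running are left alone."""
        with self._lock:
            if pool not in self._handlers:
                raise KeyError(pool)
            self._active = pool

    def is_drained(self, pool):
        """True when no calls are still running on `pool`, so it can be shut down."""
        with self._lock:
            return self._in_flight[pool] == 0


proxy = BluePurpleProxy({
    "blue": lambda request: "v1:" + request,      # old version
    "purple": lambda request: "v2:" + request,    # newly deployed version
})
first = proxy.call("query-account")
proxy.switch_to("purple")
second = proxy.call("query-account")
print(first, second, proxy.is_drained("blue"))
```

Going beyond two pools would mostly mean passing more entries in `handlers` and rotating through them on each deployment.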

(Is there widely used software doing this already? What about using something that actually supports reliable messaging, e.g. JMS? Maybe AMQP is a good option these days. Speaking of AMQP, happy that Red Hat owns JBoss, been a big fan of RH for ages. Despite being dominant they’ve always been good about releasing everything as Free Software. Contrast with e.g. SuSE, which kept its “value added” things proprietary. It’s always nice when Red Hat buys a company – all their proprietary stuff gets opened up. Anyway, bit annoyed about the whole issue where Red Hat in a press release are trying to claim that they created AMQP. That’s rude and unnecessary.)

Part of the whole architectural approach I’m thinking of also, naturally, involves social issues around developers. It’s tempting to say “we need to do this and that because in the long term … therefore we must force these lazy developers to follow all this process”. Developers on individual projects don’t always see the long term cost of foo. Problem is, you have the opposite problem – at a high level you’re seeing the long term but you’re not taking into account the “short-term” (aka always pay) cost of slowing developers down with red tape like this. So my idea is to do things that individual developers on a project can see as adding value, and so encourage them to become part of the “community”. This is why the whole web page with statistics on services used must be available to all developers, without getting permission. It is not just an internal monitoring tool for project leads and architects to monitor people with. If you try to force people to do things, they won’t understand the “why” and will do it poorly anyway.

The title of this post is “Components vs Services”, but I haven’t actually addressed that issue. Perhaps some other time.

One Comment

timsuth

Posted April 6, 2009 at 9:54 pm

Something else I’ve been thinking about. It’s easier to write code than read code. When you start a new project it’s easier (and better?) to start out with a simplified model, and to evolve it into a more complex system later. This makes it difficult to re-use code that’s already at the more complex level.

Instead of encouraging re-use, should we be writing software so it’s easy to re-write parts of it? So that when we’re designing independent components we’re really thinking of how to make it easier for someone to rip out and replace part of the system without them having to understand everything that’s going on.
