
Signals, boundaries, and change: how to evolve an information system, and how not to evolve it

[Image: primitive cell development]

For most organizations there tend to be three main reasons to implement an information system:

  1. to do things the organization couldn’t do before
  2. to improve things the organization already does (e.g. to make them more efficient/cheaper/better quality/faster/more reliable/etc)
  3. to meet essential demands (e.g. legislation, keeping existing apps working, etc)

There are other reasons (political, aesthetic, reputational, moral, corruption/bribery/kickbacks, familiarity, etc) but I reckon those are the main ones that matter. They are all very good reasons.

Costs and debts

With each IT solution there will always be costs, both initial and ongoing. Because we are talking about technology, and all technologies evolve to greater complexity over time, the ongoing costs will inevitably escalate. It’s not optional. This is what is commonly described as ‘technical debt’, but that is a horrible misnomer. It is not a debt, but the price we pay for the solutions we need. If we don't pay it, our IT systems decay and die, starved of their connections with the evolving business and global systems around them. It's no more of a debt than the need to eat or receive medical care is a debt for living.

Thinking locally, not globally

When money needs to be saved in an organization, senior executives tend to look at the inevitably burgeoning cost of IT and see it as ripe for pruning. IT managers thus tend to be placed under extreme pressure to ‘save’ costs. They might often be relieved about that, because they are almost certainly struggling to maintain the customized apps already, unless they have carefully planned for those increased costs over years (few do). Sensibly (from their own local perspective, given what they have been charged with doing), they therefore tend to strip out customizations, then shift to baseline applications and/or cloud-based services that offer financial savings or, at least, predictable costs, giving the illusion of control. Often, they wind up firing, repurposing, or not renewing the contracts of development staff, support staff, and others with deep knowledge of the old tools and systems. This keeps the budget in check, so they achieve the goals set for them.

Unfortunately, assuming that the organization continues to need to do what it has been doing up to that point, the unavoidable consequence is that things that computers used to do are now done by people in the workforce instead. When made to perform hard mechanical tasks that computers can and should do, people are invariably far more fallible, slow, inconsistent, and inefficient. Far more. They tend to be reluctant, too. To make things worse, these mundane repetitive tasks take time, and crowd out other, more important things that people need to do, such as the things they were hired for. People tend to get tired, angry, and frustrated when made to do mechanical things over which they have little agency, which reduces productivity much further than simply the time lost in doing them. To make matters even worse, there is inevitably going to be a significant learning curve, during which staff try to figure out how to do the work of machines. This tends to lead to inflated training budgets (usually involving training sessions that, as decades of research show, are rarely very effective and have to be repeated), time spent reading documentation, and more time taken out of the working day. Creativity, ingenuity, innovation, problem-solving, and interaction with others all suffer. The organization as a whole consequently winds up losing many times more (usually by orders of magnitude) than it saved on IT costs, though the IT budget now looks healthy again, so the exercise is often deemed a success. This is like taking the wheels off a car and then proudly pointing to the resulting savings in fuel. Unfortunately, such general malaises seldom appear in budget reports, and are rarely accounted for at all, because they get lost in the work that everyone is doing. Often, the only visible signs are that the organization just gets slower, less efficient, less creative, more prone to mistakes, and less happy. Things start to break, people start to leave, sick days multiply. The reputation of the organization begins to suffer.
 
This is usually the point at which more radical, large-scale changes to the organization are proposed, again usually driven by senior management who (unless they listen very carefully to what the workforce is telling them) may well attribute the problems they are seeing to the wrong causes, such as external competition. A common approach to the problem is to impose more austerity, thus delivering the killing blow to an already demoralized workforce. That’s an almost guaranteed disaster. Another common way to tackle it is to take greater risks, made all the riskier by having just converted creative, problem-solving, inquisitive workers into cogs in the machine, in the hope of opening up new sources of revenue or pursuing different goals. When done under pressure, that seldom ends well, though at least it has some chance of success, unlike austerity. This vicious cycle is hard to escape from. I don't know of any really effective way to deal with it once it has happened.

Thinking in systems

The way to avoid it in the first place is not to kill off and directly replace custom IT solutions with baseline alternatives. There are very good reasons for almost all of those customizations, and those reasons have almost certainly not gone away: none of those I mentioned at the start of the post suddenly ceases to apply. It is therefore positively stupid to simply remove them without an extremely deep, multifaceted analysis of how they are used and who uses them, and even then it should be done with enormous conservatism and care. However, you probably still want to get rid of them eventually anyway because, as well as being an ever-increasing cost, they have probably become increasingly out of line with how the organization and the world around it are evolving. Unless there has been a steady increase in investment in new IT staff (too rare), so much time is probably now spent keeping old systems going that there is no time to work on improvements or new initiatives. Unless more money can be put into maintaining them (a hard sell, though important to try), the trick is not to slash and burn, and definitely not to replace old customized apps with something different and less well tailored, but to gently evolve towards whatever long-term solution seems sensible, using techniques such as those I describe below. This has a significant cost, too, but it is not usually as high, and it can be spread over a much longer period.
 

For example...


If you wish to move away from reliance on a heavily customized learning management system towards a more flexible and adaptive learning ecosystem made of more manageable pieces, the trick is, first of all, to build connectors into and out of your old system (if they do not already exist) so as to expose as many discrete services as possible, and then to make use of plugin hooks (or similar) to seamlessly replace existing functions with new ones. The same may well need to be done with the new system, if it does not already work that way. This is the most expensive part, because it normally demands development time, and what is developed will have to be maintained, but it is worth it. What you are doing, at an abstract level, is creating boundaries around parts that can be treated as distinct (functions, components, objects, services, etc) and making sure that the signals that pass between them can be understood in the same way by the subsystems on either side of the boundary.
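To make that a little more concrete, here is a minimal TypeScript sketch of what such a boundary and hook might look like. The service name, endpoints, and field names are all hypothetical; the point is only that callers depend on the contract, not on either implementation, so the old system's function can be swapped for a new one behind the same signal.

```typescript
// A hypothetical service boundary: callers only ever see this contract.
interface ClassListService {
  getClassList(courseId: string): Promise<{ studentId: string; name: string }[]>;
}

// Adapter wrapping the old, customized LMS (placeholder endpoint and fields).
class LegacyLmsClassListService implements ClassListService {
  async getClassList(courseId: string) {
    const res = await fetch(`https://legacy-lms.example.edu/api/classlist/${courseId}`);
    const rows: { id: string; fullname: string }[] = await res.json();
    // Translate the old system's signal into the shared contract.
    return rows.map((r) => ({ studentId: r.id, name: r.fullname }));
  }
}

// Adapter for the new system, exposing exactly the same signal.
class NewEcosystemClassListService implements ClassListService {
  async getClassList(courseId: string) {
    const res = await fetch(`https://new-system.example.edu/v1/courses/${courseId}/members`);
    const members: { studentId: string; name: string }[] = await res.json();
    return members;
  }
}

// A simple "plugin hook": a registry that decides which implementation is live.
const registry = new Map<string, ClassListService>();
registry.set("classList", new LegacyLmsClassListService());

// Replacing the implementation later is one line; nothing else changes:
// registry.set("classList", new NewEcosystemClassListService());

export function classListService(): ClassListService {
  return registry.get("classList")!;
}
```

Nothing about this is specific to any particular LMS: the same pattern applies to whichever functions you choose to expose, and the registry is just one of many possible ways to wire it up.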

Open industry standards (APIs, protocols, etc) are almost essential here, because the apps on both sides of the boundary need to speak the same language. Proprietary APIs are risky: you do not want to start doing this and then have a vendor decide to change its API or its terms and conditions. It’s particularly dangerous to do this with proprietary cloud-based services, where you don’t have any control whatsoever over APIs or backends, and where sudden changes (sometimes without even a notification that they are happening) are commonplace. It's fine to use containers or virtual machines in the cloud - they can be replaced with alternatives if things go wrong, and can be treated much like applications hosted locally - and it's fine to use services with very well defined boundaries, with standards-based APIs to channel the signals. It is also fine to build your own, as long as you control both sides of the boundary, though maintenance costs will tend to be higher. It is not fine to use whole proprietary applications or services in the cloud, because you cannot simply replace them with alternatives, and changes are not under your control. Ideally, both old and new systems should be open source, so that you are not bound to one provider, you can make any changes you need (if necessary), and you can rely on having ongoing access to older versions if things change too fast.
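One lightweight way to keep control of a boundary you own, sketched below, is to write the signal down as an explicit, versioned contract and check it every time it crosses the boundary, rather than trusting whatever a vendor's SDK happens to return this month. The payload shape and version label here are purely illustrative, not a real standard.

```typescript
// The signal that crosses the boundary, written down explicitly and versioned.
// (The shape and the version label are illustrative assumptions, not a real standard.)
interface EnrolmentSignalV1 {
  version: "enrolment.v1";
  courseId: string;
  studentId: string;
  enrolledAt: string; // ISO 8601 timestamp
}

// A type guard that enforces the contract at the boundary.
function isEnrolmentSignalV1(value: unknown): value is EnrolmentSignalV1 {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.version === "enrolment.v1" &&
    typeof v.courseId === "string" &&
    typeof v.studentId === "string" &&
    typeof v.enrolledAt === "string" &&
    !Number.isNaN(Date.parse(v.enrolledAt))
  );
}

// Whatever sits on the far side of the boundary (old LMS, new service, a
// replacement vendor) is only trusted to the extent that its signals pass.
export function acceptEnrolment(raw: unknown): EnrolmentSignalV1 {
  if (!isEnrolmentSignalV1(raw)) {
    throw new Error("Signal rejected: does not match enrolment.v1 contract");
  }
  return raw;
}
```

If the provider on the far side changes, or is replaced entirely, the breakage shows up at the boundary rather than deep inside whatever depends on the signal.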
 
Having done this, you have two main ways to evolve, which you can choose according to your needs:

  1.  to gradually phase in the new tools you want and phase out the old ones you don’t want in the old system until, like the ship of Theseus, you have replaced the entire thing. This lets you retain your customizations and existing investments (especially in knowledge of those systems) for the longest time, because you can replace the parts that do not rely on them before tackling those that do. Meanwhile, those same fresh tools can start to make their appearance in whatever other new systems you are trying to build, and you can make a graceful, planned transition as and when you are ready. This is particularly useful if there is a great deal of content and learning already embedded in the system, which is invariably the case with LMSs. It means people can mostly continue to work the way they’ve always worked, while slowly learning about and transitioning to a new way of working.
  2.  to make use of some services provided by the old system to power the new one. For instance, if you have a well-established means of generating class lists or collecting assessment data that involves a lot of custom code, you can offer that as a service from the old tool to your new tool, rather than reimplementing it afresh straight away or requiring users to manually replace the custom functions with fallible human work. Eventually, once the time is right to move and you can afford it, you can then simply replace it with a different service, with virtually no disruption to anyone. This is better when you want a clean break, and is especially useful when the new system does things that the original could not do, though it still normally allows simultaneous operation for a while if needed, as well as the option to fall back to the old system in the event of a disaster. A rough sketch of the kind of bridging layer both options imply follows this list.
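Here is that sketch, in TypeScript. The feature names and backends are made up for illustration; the point is that a thin routing layer decides, feature by feature, whether the old or the new system answers, so individual pieces can be migrated one at a time (option 1) or the old system can keep serving selected functions to the new one (option 2), and either decision can be reversed.

```typescript
// A feature-by-feature routing table: flip an entry to migrate one piece at a time.
// (Feature names and backend hosts are hypothetical.)
type Backend = "legacy" | "new";

const featureRouting: Record<string, Backend> = {
  discussionForum: "new",  // already migrated
  gradebook: "legacy",     // still served by the old, customized code
  classLists: "legacy",    // exposed as a service from the old system (option 2)
};

async function callLegacy(feature: string, path: string): Promise<unknown> {
  const res = await fetch(`https://legacy-lms.example.edu/${feature}/${path}`);
  return res.json();
}

async function callNew(feature: string, path: string): Promise<unknown> {
  const res = await fetch(`https://new-system.example.edu/${feature}/${path}`);
  return res.json();
}

// The rest of the organization only ever calls this; where the answer comes
// from is a routing decision that can change (or be reversed) at any time.
export async function callFeature(feature: string, path: string): Promise<unknown> {
  const backend = featureRouting[feature] ?? "legacy";
  return backend === "new" ? callNew(feature, path) : callLegacy(feature, path);
}
```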

There are other, hybrid alternatives, such as setting up a further system to link the two, so that they do not interact directly but via a common intermediary. In the case of an LMS migration, this might be a learning record store (LRS) or student record system, for instance. The general principle, though, is to keep part or all of the old system running simultaneously for however long it is needed, parcelling out its tools and services, while slowly transitioning to the new. Of course, this does imply extra cost in the short term, because you now have to manage at least two systems instead of one. However, by phasing it this way you greatly reduce risk, spread costs over a timeframe that you control, and allow for changes in direction (including reversal) along the way, which is always useful. The huge costs you save are those that are hidden from conventional accounting - the time, motivation, and morale of the workforce that uses the system. As a useful bonus, this service-oriented approach to building your systems also allows you to insert other new tools and implement other new ideas with a greatly diminished level of risk, with fewer recurring costs, and without the one-time investment of having to deal with your whole monolithic codebase and data. This is great if you want to experiment with innovations at scale. Once you have properly modularized your system, you can grow it and change it by a process of assembly. It often allows you to offer more control to end users, too: for instance, in our LMS example you might allow individuals to choose between different approaches to a discussion forum or content presentation, or to insert a research-based component without so many of the risks (security, performance, reliability, etc) normally associated with implementing less well-managed code.
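As a rough sketch of what routing signals through such an intermediary might look like, here is an xAPI-style statement posted to a learning record store, so that both the old LMS and the new tools report activity to the LRS rather than to each other. The endpoint, credentials, and identifiers are placeholders, and any real deployment would follow its own LRS's documentation rather than this sketch.

```typescript
// Both the old LMS and the new tools report activity to a shared intermediary
// (here, a learning record store) instead of talking to each other directly.
// Endpoint, credentials, and identifiers below are placeholders.
const LRS_ENDPOINT = "https://lrs.example.edu/xapi";
const LRS_AUTH = "Basic " + Buffer.from("username:password").toString("base64");

interface XapiStatement {
  actor: { objectType: "Agent"; mbox: string };
  verb: { id: string; display: Record<string, string> };
  object: { objectType: "Activity"; id: string };
}

export async function recordCompletion(learnerEmail: string, activityUrl: string): Promise<void> {
  const statement: XapiStatement = {
    actor: { objectType: "Agent", mbox: `mailto:${learnerEmail}` },
    verb: {
      id: "http://adlnet.gov/expapi/verbs/completed",
      display: { "en-US": "completed" },
    },
    object: { objectType: "Activity", id: activityUrl },
  };

  const res = await fetch(`${LRS_ENDPOINT}/statements`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Experience-API-Version": "1.0.3",
      Authorization: LRS_AUTH,
    },
    body: JSON.stringify(statement),
  });
  if (!res.ok) throw new Error(`LRS rejected statement: ${res.status}`);
}
```

Because both generations of tooling speak to the intermediary in the same way, either can be retired or replaced without the other noticing.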

Signals and boundaries

In essence, this is all about signals and boundaries. The idea is to identify and, where they don't exist, create boundaries between distinct parts of systems, then to focus all your management efforts on the signals that pass across them. As long as the signals remain the same on both sides, what lies on either side of the boundaries can be isolated and replaced when needed. This happens to be the way that natural systems mainly evolve too, from organisms to ecosystems. It has served pretty well for a good billion years or so.