try, catch and finally moving on from Robotlegs

[Image: Robotlegs original logo]

This week I officially excused myself from the Robotlegs core team.

The ‘why?’ is really pretty exciting: I’m writing a new book, and it’s in an area that I have been pushing my publisher to let me write on for a few years, the working title being: Head First Intelligent Applications.

We actually signed off on the book a long time ago, but things were moving so rapidly in the web-apps domain that we’ve spent a while in gestation. What was originally targeting Python is now targeting node.js.

Intelligent Applications is an umbrella term for AI and Machine Learning. Data and patterns and algorithms… I am a maths-geek to the core, so this is a dream project for me.

After this book, it’s my partner’s turn to do some writing, and mine to do the housework. And after that, I plan to get involved in teaching the next generation of programmers – so I don’t anticipate returning to Robotlegs in any significant role. I want to make that clear because it would be bad for the project if you all kept my seat warm for me – the joy of open source is that it really doesn’t matter if one individual moves on.

But stepping aside from Robotlegs is not purely a not-enough-time decision.

Without jeopardy, only half your brain is online

One of the things that makes Head First books remarkably effective (and incredibly difficult to write) is that the producers/editors understand that our brain allocates resources on a requirements basis. You have to want to solve a problem in order to have the best chance of understanding the solution. If the motivation to solve the problem – or, better, the obsession – isn’t there, much of your brain will continue working on whatever does matter to you at that moment.

Although I will continue to use Robotlegs in my long-term project, the success or failure of that project doesn’t rely on me coming up with awesome solutions or great code for Robotlegs 2. On the other hand, I know my editor will accept nothing but excellence for my book, and that readers will be sensitive to weaknesses in my JavaScript. The relative impact on my ‘survival’ of the two problem domains couldn’t be more different.

All of the useful work I did in creating utilities and add-ons and patches for Robotlegs 1 was born out of necessity. I needed those things (modularity, the relaxedEventMap, and various command maps) to solve problems that would let me complete the feature that would let me send the invoice that would let me pay the rent.

Robotlegs 2 features will be rent-critical for many users, and as such they need to be developed by folk with that same level of jeopardy attached to their success or failure.

Let me tell you a secret: none of us feel that we know enough

Put aside any concerns you may have about your prior experience, and silence the voice in your head that says “You don’t know nearly enough to contribute to Robotlegs.” If necessary, tell it that Stray says that, looking back, she didn’t know nearly enough to contribute to the project, but she still made a difference.

We’ve deliberately separated the concerns of ‘what?’ and ‘how?’ in building Robotlegs 2. API and behaviour come first – what should this feature do, and how should it be used? Performance and optimisation come later, and will be handled by a different set of people, because that work does require factual knowledge of specific quirks of AS3.

So – get involved in the features that matter to you. The features you need to get your work done, your invoices out and your rent or mortgage paid. Take part in discussions, and then read the tests, and then write some tests (tests that confirm and clarify existing behaviour and corner cases are very useful), and then write or improve some code. I remember how huge that felt to me a couple of years ago, but you just need to get started. Trust me.
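If you want to see the shape such a contribution can take, here’s a rough sketch of a behaviour-confirming test, FlexUnit 4 style. TinyMapper and everything about it is invented for illustration – it is not the Robotlegs mapping API:

```actionscript
package
{
    import org.flexunit.asserts.assertFalse;

    // A hedged sketch of behaviour-pinning tests. TinyMapper (below)
    // is invented for illustration - not the Robotlegs mapping API.
    public class TinyMapperBehaviourTest
    {
        [Test]
        public function unmap_removes_an_existing_mapping():void
        {
            var mapper:TinyMapper = new TinyMapper();
            mapper.map("userLoggedIn");
            mapper.unmap("userLoggedIn");
            assertFalse(mapper.hasMapping("userLoggedIn"));
        }

        // Corner-case tests like this one turn implicit behaviour
        // into explicit, agreed behaviour.
        [Test(expects="ArgumentError")]
        public function mapping_the_same_type_twice_throws():void
        {
            var mapper:TinyMapper = new TinyMapper();
            mapper.map("userLoggedIn");
            mapper.map("userLoggedIn");
        }
    }
}

import flash.utils.Dictionary;

// File-private helper: a deliberately tiny mapper to test against.
class TinyMapper
{
    private const _mappings:Dictionary = new Dictionary();

    public function map(eventType:String):void
    {
        if (_mappings[eventType])
            throw new ArgumentError("Duplicate mapping: " + eventType);
        _mappings[eventType] = true;
    }

    public function unmap(eventType:String):void
    {
        delete _mappings[eventType];
    }

    public function hasMapping(eventType:String):Boolean
    {
        return _mappings[eventType] == true;
    }
}
```

Tests like these are small, self-contained and genuinely useful – which is exactly why they make a great first contribution.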

Start here, because this one matters to everybody: Errors

In the coming days, Shaun’s going to open up a discussion about errors in Robotlegs 2.

Errors are such an important aspect of any application, but particularly a framework, that we completely failed to discuss them at any point up until now. Yeah, really.

Not that we didn’t implement errors, but we somehow missed the fact that there is more than one fundamental approach to errors. Each of us was busily implementing errors in our particular areas of responsibility, without any awareness that we had no explicit, agreed policy that would make that error behaviour consistent.

This is a wonderful example of how we carry hidden assumptions. I worked with an anthropologist who liked to call some of these assumptions ‘felt truths’. They’re more than just ideas, because we tend to be aware of ideas as being alternative to other ideas. Felt Truths are assumptions about “how stuff is” where we don’t just fail to see alternatives, our perspective feels so true that we have a visceral ‘I’m under attack’ response when it is challenged.

Unpicking my own Error-strategy Felt Truth

My personal perspective is that Errors should, as far as is technically possible, be one-to-one with serious problems. That is: if an error is thrown, ignoring it should (almost) never be an option, because ignoring it should always represent unacceptable levels of risk for which the system was not designed. In technical terms, I favour a pure state-based feedback system, in contrast to an operator-behaviour feedback system.
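Here’s a rough sketch of the difference – all the names are invented for illustration, this isn’t Robotlegs 2 code:

```actionscript
package
{
    import flash.utils.Dictionary;

    // Invented for illustration - not actual Robotlegs 2 code.
    public class MapperSketch
    {
        private const _mappings:Dictionary = new Dictionary();

        // State-based feedback: no error, because the desired end state
        // ("not mapped") already holds. The system is in no state of
        // unacceptable risk, so nothing is thrown.
        public function unmapStateBased(eventType:String):void
        {
            delete _mappings[eventType];
        }

        // Operator-behaviour feedback: the same call throws, because
        // asking to unmap something that was never mapped suggests the
        // caller's mental model of the system is wrong - a likely bug,
        // even though the resulting state is perfectly safe.
        public function unmapOperatorBased(eventType:String):void
        {
            if (_mappings[eventType] == undefined)
                throw new ArgumentError("No mapping exists for " + eventType);
            delete _mappings[eventType];
        }
    }
}
```

Both are coherent; they simply answer different questions – ‘is the system in a dangerous state?’ versus ‘did the operator do something unexpected?’.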

When I first started to articulate this last week (in response to some discussion about whether duplicating-mappings and unmapping-things-that-aren’t-currently-mapped should throw errors) I was surprised by how much I “knew” about my preferred error strategy.

The one thing I didn’t know was that it was just an option. I wasn’t holding in mind that there were alternative, equally coherent and sane error strategies that would be defined quite differently. Within a few exchanges back and forth on the relevant issue, I had clocked that this was the source of the confusion we were experiencing. It was a real “how the hell did I miss that?” moment. In particular, I couldn’t find a reason why, as a programmer, I would be so embedded in one assumption that I couldn’t see that it was an assumption at all.

I should stress that while the confusion over this (y’all will have noticed that Shaun and I can have some pretty robust discussions) wasn’t the trigger for my leaving, the experience was: the thought of reworking the current implementation to a new error strategy made my heart sink, not soar, because what I want is to be playing with my Hill Climbing Optimisations… in fact I want that so bad that if you go back far enough into one early branch of RL2 you’ll find that I actually used them to solve a problem that Shaun rightly pointed out had a trivial solution.

But I digress… back to the errors.

Where does my felt truth about error strategies come from?

Eventually I tracked the source back nearly 20 years, to the courses I took while studying for a degree in Manufacturing Engineering, and to my first couple of jobs in that field before I switched to working in TV.

One of my favourite courses was called something like “Human-machine interaction”, and also covered Failure Modes and Effects Analysis. Each week we considered a case study of an accident or near-miss, including all the famous ones – Chernobyl, Challenger, Three Mile Island. A recurring theme was that bad stuff often occurred in systems where warnings and readouts to pilots and controllers were inconsistent or hard to understand.

A warning light labelled ‘emergency cooling system’ has come on. Does that mean that I need to activate the system, or that the system has been activated?
I press a button, and a green LED on that button is lit. Does the LED mean that the button has been pressed, or that the action activated by the button has been successfully carried out?
Three Mile Island Accident:

You can read a summary of the incident here, but the key aspect we studied in my class was this one, relating to a pressure relief valve that opened correctly but then, due to a mechanical problem, failed to close once the pressure had been relieved:

“Despite the valve being stuck open, a light on the control panel indicated that the valve was closed. In fact the light did not indicate the position of the valve, only the status of the solenoid, thus giving false evidence of a closed valve. As a result the operators did not correctly diagnose the problem for several hours.”

Catastrophes are rarely the sole fault of the human or the machine; frequently the fault lies in the design of the relationship between them.

A challenge facing engineers in many, many areas, including AS3 framework design, is that the smarter the system becomes the more likely it is that the operator will become unable to take charge in the event of a system failure. Yet it is inevitable that systems will be put into failing states.

Polite errors?

There is a concept, within the design of warning/error systems, of polite versus aggressive warning strategies. In a polite system, the operator is presented with only one problem at a time. In an aggressive system, all the warnings or errors are displayed simultaneously.

There are case studies supporting each of these approaches. In the case of Three Mile Island, operators also became overwhelmed by hundreds of simultaneous warning lights. In several air accidents, the aircraft monitoring system has warned of a seemingly trivial problem while the auto-pilot is correcting for a much more serious situation. When the auto-correction runs out of rope it can be too late to recover the situation.

If you are going to adopt a ‘polite’ error system then you need to be damn sure that you’re prioritising errors correctly, and that having a single error fed to you is not going to impair diagnosis of the problem when compared with seeing a more rounded picture.
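In code terms, the same split shows up in whether a routine stops at the first problem or gathers everything before reporting. A hedged sketch, with invented names:

```actionscript
package
{
    // Invented for illustration. Each check is a function that returns
    // a problem description, or null if the check passes.
    public class ValidatorSketch
    {
        // Polite: surface exactly one problem at a time. Only safe if
        // the checks are ordered so the most serious one runs first.
        public function validatePolite(checks:Array):void
        {
            for each (var check:Function in checks)
            {
                var problem:String = check();
                if (problem != null)
                    throw new Error(problem);
            }
        }

        // Aggressive: run every check and report all failures at once -
        // a rounded picture, at the risk of overwhelming the operator.
        public function validateAggressive(checks:Array):void
        {
            var problems:Array = [];
            for each (var c:Function in checks)
            {
                var p:String = c();
                if (p != null)
                    problems.push(p);
            }
            if (problems.length > 0)
                throw new Error(problems.length + " problems found:\n"
                    + problems.join("\n"));
        }
    }
}
```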

There’s a great paper on the challenges of automation and feedback here.

Dude, it’s just an AS3 framework!

At the October 2012 try{harder}, as well as our pre-prepared sessions, we each gave a brief lightning talk, with just one hour to prepare from the moment we were told our topic.

We grouped these mini-talks by theme, and one of the themes was Building software that actually matters. Neil Manuell, Angela Relle and David Arno each spoke about their experiences of working on safety-critical projects where failures could cost lives.

I have experience of working on electronic communication systems for the military and for a major airport (hardware), but it was still a really valuable reminder of the level of responsibility on the shoulders of some AS3 developers.

Of course we don’t need to apply the same error policies to Robotlegs 2 as are needed by those projects, but we do need to have a very clear and consistent approach so that it’s possible for projects of all kinds to run on Robotlegs 2 and deal appropriately with errors and warning events from the system.
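One possible shape for that – and I stress this is just an invented illustration, not a proposal that’s on the table – is to route every problem the framework detects through a single seam, and let each project decide severity once:

```actionscript
package
{
    // Invented for illustration - not an actual Robotlegs 2 interface.
    // If every framework error and warning flows through one seam like
    // this, a safety-critical project can plug in a policy that throws
    // on everything, while a prototype logs warnings and carries on -
    // and the framework's own behaviour stays consistent either way.
    public interface IErrorPolicy
    {
        function report(message:String, fatal:Boolean):void;
    }
}
```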

I say “we”, but of course really I mean “you”. For me, it’s time to go back to being a user of whatever wonderful features emerge from the efforts of what is an exceptionally talented, generous and engaged community.

Actually, I just spoke to someone else this morning about some social anthropology ideas (theirs) that may eventually gestate into a book featuring the Robotlegs community as a case study… but that’s just a teeny tiny seed for now…

So: go forth and think about errors. Your framework needs you.

 

About the Author

I'm an ActionScript programmer living and working in a tiny village in the Yorkshire Dales, UK. I used to be a TV reporter, but my inner (and often outer) geek won. I also write stuff. Most recently Head First 2D Geometry.
