The Art of Making Multithreading Issues Worse


I recently spent days looking for a bug – a thread safety bug – that I should have found in minutes. The reason it took so long is that someone had found it before me and attempted to fix it, twice, but each time had in fact made matters worse.

What made matters even worse than that is a phenomenon that will be all too familiar to you if you handle multithreaded code a lot – the customer could make it happen every time but on the bench, back at the office with all the debug tools? Practically impossible to reproduce.

The code started off like this. I’ve extracted the essence of the problem here; this is not the real code. In reality the mistake was the same, but heavily obfuscated right from the start.

class AlarmManager
{
    private Dictionary<Guid, Alarm> alarms = new Dictionary<Guid, Alarm>();

    public Guid StartAlarm()
    {
        Alarm newAlarm = new Alarm();
        newAlarm.AlarmId = Guid.NewGuid();
        alarms.Add(newAlarm.AlarmId, newAlarm);
        newAlarm.AlarmWorkflow = WorkflowManager.CreateAlarmWorkflow(newAlarm.AlarmId);
        return newAlarm.AlarmId;
    }

    public void DeleteAlarm(Guid alarmId)
    {
        Alarm toRemove;
        if (alarms.TryGetValue(alarmId, out toRemove))
        {
            toRemove.AlarmWorkflow.StopAndRemove();
            alarms.Remove(alarmId);
        }
    }
}

Two schoolboy errors in there. Someone spotted the first: dictionaries are not thread safe, so they added some locking…

class AlarmManager
{
    private Dictionary<Guid, Alarm> alarms = new Dictionary<Guid, Alarm>();

    public Guid StartAlarm()
    {
        Alarm newAlarm = new Alarm();
        newAlarm.AlarmId = Guid.NewGuid();
        lock (alarms)
        {
            alarms.Add(newAlarm.AlarmId, newAlarm);
        }
        newAlarm.AlarmWorkflow = WorkflowManager.CreateAlarmWorkflow(newAlarm.AlarmId);
        return newAlarm.AlarmId;
    }

    public void DeleteAlarm(Guid alarmId)
    {
        Alarm toRemove;
        lock (alarms)
        {
            if (alarms.TryGetValue(alarmId, out toRemove))
            {
                toRemove.AlarmWorkflow.StopAndRemove();
                alarms.Remove(alarmId);
            }
        }
    }
}

Fixed? No. This is where we hit the real problems. Someone noticed that the line toRemove.AlarmWorkflow.StopAndRemove(); was throwing null reference exceptions, so rather than investigate how toRemove.AlarmWorkflow came to be null they simply put in a null check…

public void DeleteAlarm(Guid alarmId)
{
    Alarm toRemove;
    lock (alarms)
    {
        if (alarms.TryGetValue(alarmId, out toRemove))
        {
            if (toRemove.AlarmWorkflow != null)
                toRemove.AlarmWorkflow.StopAndRemove();
            alarms.Remove(alarmId);
        }
    }
}

The reason it was occasionally null is in StartAlarm. That’s where the bug is – the new alarm is added to the dictionary and the lock released before it’s finished initialising. So if it’s deleted by another thread immediately after it’s started, the threads can interleave in a way where the alarm is removed from the dictionary with the workflow being null, then the workflow is assigned and started. As the workflow is managed by the WorkflowManager, there’s still a reference to it, hence it continues to run.
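To make that interleaving concrete, here is a minimal sketch that forces it to happen every time, without needing a second thread. It is not the real code: the Workflow and WorkflowManager internals, the RacyAlarmManager name, the Count property and the InTheGap hook are all invented for illustration. The hook simply stands in for another thread calling DeleteAlarm at the worst possible moment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins: the post does not show the real workflow types.
class Workflow
{
    public bool Running = true;
    public void StopAndRemove() { Running = false; }
}

static class WorkflowManager
{
    // The manager keeps its own reference to every workflow it creates.
    public static readonly List<Workflow> Workflows = new List<Workflow>();

    public static Workflow CreateAlarmWorkflow(Guid alarmId)
    {
        Workflow workflow = new Workflow();
        Workflows.Add(workflow);
        return workflow;
    }
}

class Alarm
{
    public Guid AlarmId;
    public Workflow AlarmWorkflow;
}

class RacyAlarmManager
{
    private readonly Dictionary<Guid, Alarm> alarms = new Dictionary<Guid, Alarm>();

    // Hypothetical hook: runs in the unprotected gap to simulate another thread.
    public Action<Guid> InTheGap;

    public int Count { get { lock (alarms) { return alarms.Count; } } }

    public Guid StartAlarm()
    {
        Alarm newAlarm = new Alarm();
        newAlarm.AlarmId = Guid.NewGuid();
        lock (alarms)
        {
            alarms.Add(newAlarm.AlarmId, newAlarm);
        }
        // The alarm is now visible to other threads, but AlarmWorkflow is still null.
        if (InTheGap != null) InTheGap(newAlarm.AlarmId);
        newAlarm.AlarmWorkflow = WorkflowManager.CreateAlarmWorkflow(newAlarm.AlarmId);
        return newAlarm.AlarmId;
    }

    public void DeleteAlarm(Guid alarmId)
    {
        Alarm toRemove;
        lock (alarms)
        {
            if (alarms.TryGetValue(alarmId, out toRemove))
            {
                // The null check hides the exception but not the bug.
                if (toRemove.AlarmWorkflow != null)
                    toRemove.AlarmWorkflow.StopAndRemove();
                alarms.Remove(alarmId);
            }
        }
    }
}

static class RaceDemo
{
    // True when the alarm is gone from the dictionary but its workflow is still running.
    public static bool LeavesOrphanedWorkflow()
    {
        WorkflowManager.Workflows.Clear();
        RacyAlarmManager manager = new RacyAlarmManager();
        // Simulate the other thread: delete the alarm before its workflow exists.
        manager.InTheGap = id => manager.DeleteAlarm(id);
        manager.StartAlarm();

        int running = 0;
        foreach (Workflow workflow in WorkflowManager.Workflows)
        {
            if (workflow.Running) running++;
        }
        return manager.Count == 0 && running == 1;
    }
}
```

With the delete forced into the gap, the dictionary ends up empty while the workflow carries on running, which is exactly the state the customer kept hitting.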

Now the issue got compounded further, because someone spotted that exceptions were still being thrown during the delete, from the StopAndRemove method. This is where my simplification falls down a bit, because the real reasons for the exception are somewhat complex, involving events and another access to the alarms dictionary. Suffice to say, however, that this was not the way to solve the problem…

public void DeleteAlarm(Guid alarmId)
{
    Alarm toRemove;
    lock (alarms)
    {
        if (alarms.TryGetValue(alarmId, out toRemove))
        {
            try
            {
                if (toRemove.AlarmWorkflow != null)
                    toRemove.AlarmWorkflow.StopAndRemove();
            }
            catch 
            {
            }
            alarms.Remove(alarmId);
        }
    }
}

These two attempted fixes are part of a mindset of patching it up rather than fixing the root cause. In certain extreme circumstances patching it up may be acceptable. I’ve had to do it, but it must be highlighted that this is what has been done and that it may actually be masking the root cause or indeed causing further knock-on issues.
I find the #warning preprocessor directive useful in such circumstances.
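As a rough sketch of what I mean, here is how a stop-gap might be flagged. The AlarmPatch class and TryStop method are made up for this example; the only real feature on show is the #warning directive, which makes the compiler emit a warning with your message on every build.

```csharp
using System;

static class AlarmPatch
{
    // Hypothetical helper: runs a stop action and reports whether it succeeded.
    public static bool TryStop(Action stopAndRemove)
    {
#warning TEMPORARY PATCH: exceptions from the stop action are swallowed and the root cause is not yet found.
        try
        {
            stopAndRemove();
            return true;
        }
        catch (Exception)
        {
            // Deliberately swallowed as a stop-gap. The #warning above keeps
            // this visible in the build output until somebody fixes it properly.
            return false;
        }
    }
}
```

Bear in mind that if the project is set to treat warnings as errors, the directive will fail the build rather than nag, which may or may not be what you want.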

The solution is trivial…

class AlarmManager
{
    private Dictionary<Guid, Alarm> alarms = new Dictionary<Guid, Alarm>();

    public Guid StartAlarm()
    {
        Alarm newAlarm = new Alarm();
        newAlarm.AlarmId = Guid.NewGuid();
        lock (alarms)
        {
            newAlarm.AlarmWorkflow = WorkflowManager.CreateAlarmWorkflow(newAlarm.AlarmId);
            alarms.Add(newAlarm.AlarmId, newAlarm);
        }
        return newAlarm.AlarmId;
    }

    public void DeleteAlarm(Guid alarmId)
    {
        Alarm toRemove;
        lock (alarms)
        {
            if (alarms.TryGetValue(alarmId, out toRemove))
            {
                alarms.Remove(alarmId);
                toRemove.AlarmWorkflow.StopAndRemove();
            }
        }
    }
}

When Customers Bite

Good software can flunk if it’s badly demonstrated. This is common sense, but something I learnt only recently was the importance of data in the user’s perception of what good software is.
When I did the sales demonstrations I’d used demonstration data. It worked, the buyers placed the order and I was then asked if I wouldn’t mind demonstrating the system to the end users.
Part of the bid involved importing the customer data and I was about half way done so I thought it would be cute to show the end users their own data.

In retrospect that was a big mistake.

When you replace a core system, something that the users spend most of their time with, they’re naturally going to be a little cautious. I was very aware of this so I attempted to mitigate it by saying that there may be a few little holes as the data import wasn’t finished. Sure enough one of them spotted a data error which I attempted to brush aside, but they wouldn’t let it go and it was followed by an avalanche of questions – unsurprisingly the users knew where the complex data was – the bits that I hadn’t yet imported. End user confidence utterly nosedived and optimistic caution turned to out-of-hand rejection. It was a very difficult session, especially when you have people saying directly to you – and I quote, “Your system is shit. We can’t use this shit.”

The fact that I had demonstrated everything that they needed to do, had demonstrated that it was easier and clearer with our system than the one they had, and that we gave them a heap of extra functionality, counted for nothing because the data behind it was poor.

About the same time a failing government project to replace the many disparate control room systems in the Fire Service with one package was being lambasted in the press. The entire history of the venture pretty much reads like a manual on how to screw up a project – critical mistakes were being made left, right and centre. All these reasons are complex and somewhat intangible. What the press had picked up on was that a supposedly national system only “knew about” one small area of the country – Wakefield I think. That’s tangible, that was something people could understand. That’s what people were laughing about and of course it was a problem, but it had precious little to do with the failure of the project.

So it was a hard lesson for me to learn: people just don’t seem able, or perhaps willing, to separate the concepts of data and functionality when it comes to software. It took months of hard work after that bad demonstration to get the users on board.
It is very important to get them on board early in a project and keep them informed. You want them to feel like they’re involved – part of the process – then acceptance of the system is much easier. Until you’re confident that their data is 100% though, use demo data.

What C#’s For is For…


Ever wondered why C# uses such a silly format for its for loop?

I mean why not do it like Pascal?

 
    var loopVar: integer;
    for loopVar := 0 to 9 do
    begin
        writeln('something erudite');
    end;

Whereas C, C#, and C++ use constructs like…

    int i;
    for(i = 0; i < 10; i++)
    {
        Console.WriteLine("something erudite");
    }

The reason is not always taught in C / C# / C++ 101 and it’s really quite useful. C’s for loop is split into 3 sections.

  1. Initialisation. Here you set up the start conditions for your loop.
  2. Comparison. The loop will continue to iterate whilst the condition here evaluates to true.
  3. Loop action. An action that is performed every time the loop iterates.
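One way to picture the three sections is to write the same loop out longhand as a while loop. This is only an illustration; the class and method names are made up, and both methods simply collect the values they visit so you can compare them.

```csharp
using System;
using System.Collections.Generic;

static class ForLoopAnatomy
{
    public static List<int> WithFor()
    {
        List<int> seen = new List<int>();
        for (int i = 0; i < 3; i++)
        {
            seen.Add(i);
        }
        return seen;
    }

    public static List<int> WithWhile()
    {
        List<int> seen = new List<int>();
        int i = 0;          // 1. Initialisation: runs once, before anything else.
        while (i < 3)       // 2. Comparison: checked before every pass.
        {
            seen.Add(i);
            i++;            // 3. Loop action: runs after the body on every pass.
        }
        return seen;
    }
}
```

The two are not quite interchangeable. A continue inside the for loop still runs the loop action before the next comparison, whereas in the while version it would skip the i++ and spin forever.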

You do not have to initialise an ordinal type, you do not have to use a numeric comparison, you do not have to increment or decrement anything. Today I needed to walk a chain of inner exceptions, so I used a for loop, roughly (but not exactly) like this.

catch(Exception ex)
{
    for (Exception subject = ex; null != subject; subject = subject.InnerException)
    {
        if (subject is ArgumentNullException)
        {
            // do stuff
        }
    }
}

This is just a simple example. There are all sorts of places where the C style for construct comes in useful – a friendly word of advice though, don’t go nuts. If the operations that you need to perform in order to run the loop don’t neatly fit into the for loop construct, then you should probably do it a different way, else your code is likely to become difficult to read and buggy.
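For the curious, here are two small sketches of the same idea that you can compile as they stand. The class and method names are invented for the example. The first wraps up the inner exception walk so it can be reused; the second walks a LinkedList node by node with no counter anywhere.

```csharp
using System;
using System.Collections.Generic;

static class ForLoopSketches
{
    // Walks the InnerException chain and reports whether any link is of type T.
    public static bool ChainContains<T>(Exception ex) where T : Exception
    {
        for (Exception subject = ex; subject != null; subject = subject.InnerException)
        {
            if (subject is T)
            {
                return true;
            }
        }
        return false;
    }

    // Adds up a linked list by following Next until it runs out of nodes.
    public static int Sum(LinkedList<int> list)
    {
        int total = 0;
        for (LinkedListNode<int> node = list.First; node != null; node = node.Next)
        {
            total += node.Value;
        }
        return total;
    }
}
```

In both cases the initialisation, comparison and loop action each fit on one short line, which is a reasonable test of whether a for loop is the right tool.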

The Power of the Random Other Coder


If you’re a junior engineer and you have a problem you can easily ask someone more senior. But what if you’re the most senior person in the room?

Well if the problem isn’t one of not having the technical expertise but just not being able to find what’s wrong then the answer couldn’t be more simple – ask anyone.

I have a reputation for being able to find problems with other people’s code but the truth is that 90% of the time they actually find the errors themselves. The simple process of having to explain what they were trying to achieve and how they were trying to achieve it often reveals problems with the logic or problems with the code.

In fact it can be better to ask people who, although capable of understanding what you’re doing, are from a different technical area or who are considerably less experienced – the key is that you have to explain what you’re doing to them and the more explaining you have to do, the better.

At my previous company I found that the technical director was a particularly good person to ask. He had been a programmer but wasn’t really used to the way we worked in the modern production environment. He provided a very intelligent but completely different slant on things. OK, so sometimes it was frustrating when he disagreed with a perfectly good way that I was doing something, but more often he spotted a fair few little mistakes, and that was very helpful.

Tom Recommends


Occasionally, in all the stuff I use, I find something that I really like. So why not tell people about it?

Polycell Crack Free Ceilings is great. It’s thick, gloopy paint that hides a multitude of sins and not just on ceilings. It’s expensive, but combined with a little lightweight filler the time it saves is well worth it. As an aside though, this doesn’t mean that I recommend all the Polycell miracle solutions, some of which I’ve found not to work for me.

Dawn Simulator Alarm Clocks – any of them really – I happen to have the Lumie Bodyclock Starter 30. I’m not a SAD sufferer, but nonetheless I find that not being suddenly jolted awake makes me far more alert and puts me in a far better mood in the morning.

Ultrafire WF-606a – a pocket size LED torch, runs on 2 AA size batteries. It’s very bright and very rugged. A thoroughly sound piece of kit, far better in my opinion than the equivalent MagLite and much cheaper, too.

A Goal is Something You Need to have a Passion About


I’m really bad with long term goals. I know that they’re good to have and that they give one a sense of purpose, but I’m poor at creating them and awful at sticking to them.

There are two traps that I tend to fall into.

  1. Moving the deadline backwards. This is all too easily done – if you want to achieve something “in 2 years” it’s all too tempting to retain the idea of “2 years” and forget about the start date and, more importantly, the end date.
    If you’re going to achieve any sort of timed goal then you must set and stick to a deadline.
  2. Failing to plan. Having an idea of something that you want to achieve that is a long way off is a noble thing, but you must work out how you’re going to achieve it. If you don’t then you tend to drift haphazardly doing little bits here and there that are of no real consequence.

Then there is the goal itself. Unless I can see a major benefit for me then I find it hard to get motivated about something for which I have no passion.  If I set a goal then unless it’s something that I really believe in or something that is going to make a profound difference to my life I’m going to have motivational problems.

Despite the above I have a pretty good track record of achieving goals, and how I do it is really quite simple.

Step 1 is to sort the wheat from the chaff. What goals are actually important to you? A common mistake is to set goals of things you think you should achieve rather than things that you actually want to achieve. The difference can be subtle: career advice might tell you that you should be looking for a promotion in 5 years but if you like your job, if you’re earning enough to fulfil your future plans, then your heart isn’t going to be in it. If you have a passion for horse-riding and you really, really want your own horse but you can’t afford it then you have a good driver to get a promotion, but the goal here is not the promotion but the horse. The promotion (or change of job) is just the means.

So now you know what you actually want to achieve, don’t be tempted to put a number of years next to it. What that does is dull the passion – it allows you to think that you’re working towards your goal when you’re not.

Instead of setting time limits start planning. What do you need to do? What do you need to do it? Break down large tasks into their component parts, things that have a definite start and a definite end so that you can tick them off.

Consider risks and alternatives. If you’re waiting for something to happen and it doesn’t, or if something goes wrong, what are you going to do? What are the alternatives? How do you mitigate the risks?

Lastly assemble the tasks into a time line – how long will it take? What slippage are you prepared to put up with? This is often a great moment, when you look down and it dawns on you that the goal is not only achievable, but achievable much, much faster than you’d anticipated.

The Best CVs Are Reverse-Engineered


A CV has to be carefully designed, but if it’s not designed for the correct purpose it’s no good. This is the problem with a lot of CVs – clearly a lot of care and attention has gone into them but they miss the point.

There are hundreds, probably thousands of articles out there on how to format a CV well so there’s no point in me going down that road. I’m going to approach it differently: I believe that in order to write a good curriculum vitae (or resume), you need to understand the process that it’s going to go through. Then it all becomes much, much easier.

The first thing we need to get clear is that unless you’re applying for a very niche market, your prospective employer is going to be heavily oversubscribed with applicants, many of whom have no prospect of getting the job. The net result is that it’s a waste of time for the employer to pass all the CVs to your prospective boss for consideration – they could easily outnumber the positions available by 100 to 1 and 75% or more could be rubbish. The first “paper sort” is usually done by someone from Human Resources who may have no knowledge of the job in question, merely the job description.

The first task of your CV is to get past the HR person. Thankfully this is relatively simple, but many people do it badly or even fail here when they could actually do the job.

You must make sure that you cover the basic requirements for the job and that this jumps out from the page. Don’t assume that anyone knows that experience of one technology / methodology / system includes or implies experience of another. I know someone who failed to get an interview because she listed a plethora of technical skills but failed to mention the phrase “computer literate”, which was essential for the role.

Assume the HR person has no technical knowledge and make sure the keywords on your CV match the keywords on the job specification as closely as you can justify. Doing this with qualifications, technical skills and experience is relatively easy. Some, such as “initiative”, have to be worked in more carefully. Try to get them in though, because if it’s on the job specification the HR person will be looking for it.

It’s really, really important that you get this right. Having a CV full of really great technical stuff is going to do you no good if you can’t get it past the HR people.

The CV Process - Stay Out of The Bin!

Your CV then goes to your prospective boss, or at least someone in the department. This department is recruiting so the chances are that they’re busy, meaning that time is of the essence.  HR departments tend to err on the side of the false positive so the first sort is usually to get rid of people who could not in fact do the job. This is done quickly, so the clearer your CV the better. Don’t go over-the-top trying to list every piece of technology and every skill you can lay some kind of claim to in an attempt to impress your new boss. If anything that will make you look worse. Concentrate on providing evidence for what you’ve asserted. For instance, if you claim in the highlights that you’ve got 3 years of C# experience but you don’t list any previous positions involving C# development, it’s not going to look good.

Now you’re on the short-list. The person calling the shots is now probably looking at the set of CVs and thinking about either pulling out a few of the best or eliminating some of the worst. Fortunately for the good CV writer it’s not a level playing field. If your CV is easy to read your propensity to stay out of the bin will be greater. If you have a wordy CV that’s hard to read it may end up in the bin on those grounds alone. Similarly mistakes or inconsistencies on your CV can land you in the bin. Technical positions require the ability to communicate clearly and effectively and to show good attention to detail. If your CV doesn’t demonstrate these traits then it’s a good excuse for it to be thrown in the bin.

When I explain this to people I get a few slightly surprised reactions at this point – do employers really throw CVs in the bin simply because they’re too long or they don’t like the style? It can happen. If you’re comparing two CVs that seem to show individuals of similar suitability for the job but you can only interview one, you have to make a decision somehow. The quality of the CV can reveal a lot about the personality and attitude of the person.

Another piece of very useful inside information is that the actual goal of the recruitment is not necessarily to employ the person that matches the job description best. It’s to employ someone who’s going to be good at the job and is going to work well with the existing people in the team. Your CV should demonstrate some personality; there is room for creativity. Don’t go mad with crazy fonts, that will be counter-productive, but try to introduce a hint of your personality into it.

Something that I think very little attention is paid to and that can land your CV in the bin is the inclusion of material that reflects negatively on the applicant.  Think about what the company is asking for and if there’s anything on your CV that contradicts that then remove it or if you can’t, play it down. It’s worth mentioning that opinions are almost always a bad idea. Their negative points exceed the positive. If you’re going to link a blog, profile or web site then think carefully about its contents and how it reflects on you. If you express a lot of left wing sentiment on your blog for instance, your prospective new employer may not appreciate this.

Finally, although many apparently disagree, I like to see a “Hobbies and Interests” section. It should be very brief, though, perhaps as terse as just a list of 5 words. If nothing else, it’s useful to provide the employer with an ice-breaker in the interview.

If you take nothing else away from this article, remember these two things:

  • You must demonstrate that you meet the job specification
  • Your CV must be clear and easily read by a non-technical HR person and a hassled potential new boss

Happy hunting!

British F1 Grand Prix at Silverstone: Village / Arena View


Silverstone’s new section is fantastic and the view from what’s now called Village grandstand (was called Arena) is excellent.

We dropped in there for one of the GP2 races and since it was new I thought a panorama shot might be useful for anyone considering it. Unfortunately I didn’t quite line the photos up so it doesn’t stitch together properly. So there are two, the left half and the right half.

In the left half you get a bonus in that you can see the cars go through Maggotts / Becketts / Chapel and onto the Hangar Straight as well as Village / The Loop / Aintree.

Left Hand Side of the View from Village
The Right Hand Side of the view from Village B

There’s a really good overtaking opportunity if you can get a good run round the outside at Village because you’ll have the inside at The Loop. This is a very bad line for the corner and you’ll run wide on the exit, but it’s seriously difficult for your opponent to duck inside – we saw a couple of people making this stick in the GP2. A lot of people were running too wide out of Village, too, meaning more excitement at The Loop.

A word to the wise, though. The grandstand is quite exposed to wind, so make sure that you have some form of wind-proof clothing option with you because even when it was 25C in the sun it was cold up there.

Taoism and Trees


We change. We move on. It’s part of being human.
There’s a lot of change going on with me at the moment and it’s at times like this when a little wisdom can make a big difference.

One of the most effective pieces of advice I’ve ever been given was from a Kung-Fu instructor. She asked me which tree was stronger, the willow or the oak.
The oak is a potent symbol of power, it stands strong against the wind, assenting only to the gentlest of sways whereas the willow flops around all over the place in the merest breeze. The oak will still stand strong in a storm while the willow is battered into the ground.
When the wind gets too strong though, the oak will snap. When the wind subsides the oak will lie broken on the ground and the willow will return to its original form, swaying gently in the breeze.

The skill is knowing when to be the oak and when the forces against you are all too strong. I naturally tend toward the oakish, so I have to keep asking myself if I’m trying to be too strong, if perhaps I should stop pushing and just weather the storm. An ability to recognise when something is beyond my control, to accept it rather than exhaust myself fighting against it (and probably lose anyway) has saved my bacon many, many times.

This goes hand in hand with another important skill – the ability to seek advantage even in adverse circumstances. When something bad happens it’s all too easy to concentrate on the bad, on what will be lost. A little bit of objective thinking often reveals that whilst some doors are closing others are opening. Sometimes an apparently bad change, on proper analysis, works out to be positive overall.

Lastly, and leading directly from the above, it’s easier to influence something that you’re on board with. If you diametrically oppose something you are likely to find that you become marginalised and are ignored. If you align yourself with it but suggest changes, you are more likely to be listened to.

Take these three together and you can remove a great deal of hassle from your life.

SSD Drives are a Total No-Brainer


When it comes to hardware, technical staff can badger a business senseless. Every member of technical staff claims that they could do their job so much better if they just had this upgrade or that gizmo. Without spending hours reading all the latest hardware blogs, determining what would actually be a useful investment in their productivity is next to impossible.

SSD drives are a no-brainer though. The biggest bottleneck in PCs today is the hard disk: clunky, mechanical things that lose an awful lot of time whilst the heads are whizzing back and forth across the platters.
At the time of writing a 64GB SSD Drive is about £100. I bought one more out of curiosity than anything else and slung it in my ancient 3.0GHz P4.

This video shows it loading Windows XP, logging in, then starting Word and Chrome at the same time.
I then type some rubbish in Word, navigate to Facebook and shut the PC down. The difference an SSD Drive makes is mind-bending. I’m not saying that you should replace existing hard disks with SSD drives, just slap one in with the operating system and apps on and use the old (likely much larger capacity) drive for data. That’s good business sense.