Sunday, May 15, 2011

Why manual linked lists are evil

Recently I've been reminded of a pattern in code I particularly dislike. Manual traversal of linked lists. I couldn't quite put my finger on why I have an aversion to this practice, but I'm gonna try to pinpoint right here in this post, and solve this moral dilemma once and for all.

The pattern looks something like this (here i'm not talking about design patterns, just a pattern of usage).

//declaration code
struct SPointsList
{
    SPoint m_CurPoint;
    SPointsList* m_Next;
}


void foo(SPoint& pt);


//... 
//client code
SPointsList points; 


//.. some annoying initialization code goes here, creating a non-empty valid list
SPointsList* pt = points.m_CurPoint;
while (pt!=NULL)
{
    foo(pt->m_CurPoint);
    pt = pt->m_Next;
}

So what don't you like about this code, Mr. Refactoraholic? It's efficient, it's compact, it's not using templates, it's doing just what it's supposed to do and no more. Doesn't that fit into your favorite K.I.S.S. principle? Aren't you just being a tight-ass abstractionist who is just waiting for an excuse to use some templates and iterators that you read about in your favorite design pattern book?


Well, I'm glad you asked. Here's some of the things I don't like about this code:

  • It reinvents the wheel. Implementing a linked list from scratch is an exercise that I've learned how to do in high school, and yet in practice try to avoid as much as possible for reasons below
  • It does pointer arithmetics, which is an overused feature of C / C++, most other languages have wisely chosen to get away from. 
  • It is prone to off-by-one errors, since you have to be careful not to hit uninitialized memory or to get yourself into an infinite loop. 
  • Client code also has to be careful to construct a valid data structure (the last element needs to point to a NULL, so that elsewhere in client code we could check for it). I didn't include an example of this, simply to keep my blood pressure low and my stomach from getting too upset. 
  • All in all, the main negative thing about using manual linked lists, is that this approach exposes implementation details to the client, and client code everywhere has to adjust to that. If the representation changes, client code has to be re-written, and that's a real waste, as this code could have been written in a more flexible way from the start. 
One way of writing this code would be:

typedef std::list < SPoint > SPointsList

//client code:

SPointsList points; 

//some initialization code goes here, using list::insert, or list::push_back 

SPointsList::iterator itr = points.begin();
for (;itr!=points.end(); ++itr) {
    foo(*itr);
}


We can argue about how pretty this code is or that it uses more characters than the original pointer version. But in the end, this approach hides all the nasty details and pointer arithmetics, and changing the representation does not affect the client code at all.

There's also a more neutral approach that avoids using STLs, where we basically only implement the functions that are needed, but still use the iterator-based approach to hide the details of the pointer math. Here's a very sloppy version of that:

struct SPointsList
{
    SPoint m_CurPoint;
    SPointsList* m_Next;
    void AddNewPoint(const SPoint&, SPointsList* pos);
    void RemovePoint(SPointsList* pos);
    SPointsList* GetNext() {return m_Next;} 
    bool IsLast(SPointsList* pt) {return pt==NULL;}
}

Perhaps only implementing the functions you need can reduce code bloat and allow you to special-case optimize access to your data structure, but you have to do more work to reinvent the wheel, and error-proof your internal pointer math for adding / removing elements. 


[It seems like blogger is not the most code-snippet friendly of environments, so I'm gonna look into switching to wordpress or something else]

3 comments:

  1. > But in the end, this approach hides all the nasty details and pointer arithmetics, and changing the representation does not affect the client code at all.

    it also doubles the number of new/delete calls. that says it all.

    ReplyDelete
  2. > so I'm gonna look into switching to wordpress or something else

    it's not quite true :)
    look here -- it's blogger: http://geekhut.blogspot.com/2008/04/extjs-drop-in-solution-for-automatic.html

    (one of the older posts from my blog)

    ReplyDelete
  3. > it also doubles the number of new/delete calls. that says it all.

    It doubles the number of new / delete calls on what client code? In the code I had in mind, for example, the main bulk of the client code traverses the list forward.

    Or are you referring to std::list being a doubly linked list?

    ReplyDelete