Why .NET LinkedList does not support Concat and Split operations?
Concat O(1) or O(n) ?
The .NET LinkedList
is a circular doubly linked list, where each node holds a reference to its previous and next nodes. The last node’s next is the head node and the head node’s previous is the last one.
Linked lists are attractive because of O(1) insertion and removal operations. Instead of shifting elements in array, you just chain nodes in appropriate order and that’s it.
Following this logic, concatenation of two lists should be similar O(1) operation. You just bind the end of the first list with the head of the second and let both the ends of the resulting list point to each other.
Nevertheless, if you look at .NET LinkedList implementation, you will find out that the only way to join two linked lists is to add elements of the second list one by one to the first one. This is O(n) operation.
Implementing Concat
So if you look at the implementation, you will find out that the LinkedListNode
has one more field except prev
and next
. This is a reference to the owner list. So with this implementation, even if you implement concat in O(1) as described above, you will still need at least reparent all nodes which is again O(n) operation.
Can we avoid referencing list in every node? Yes we can. We can rewrite Previous
and Next
properties like this.
Before
public LinkedListNode Next
{
get
{
if ((this.next != null) && (this.next != this.list.head))
{
return this.next;
}
return null;
}
}
After
public SimpleLinkedListNode GetNext(SimpleLinkedList list)
{
if ((this.Next != null) && (this.Next != list.First))
{
return this.Next;
}
return null;
}
After removing parent checking exception handling code from LinkeList
implementation, you can implement concatenation like this:
public void Concat(SimpleLinkedList secondList)
{
if (secondList.m_Head == null)
{
return;
}
if (this.m_Head == null)
{
this.m_Head = secondList.m_Head;
}
else
{
var seamLeft = this.Last;
var seamRight = secondList.m_Head;
var newEnd = secondList.Last;
seamLeft.Next = seamRight;
seamRight.Prev = seamLeft;
newEnd.Next = this.m_Head;
this.m_Head.Prev = newEnd;
}
}
You are even able to maintain Count
property correctly by adding counts of both lists.
this.m_Count += secondList.m_Count;
Drawbacks
The major drawback of this solution is that the second list will be left in an inconsistent state after this operation. Its last.prev
does not point to its head anymore. Another point is that any active operation on one of the lists will modify both, thus they have shared nodes. These side effects violate fundamental principles of OO design. So I can imagine that Microsoft won’t compromise at that point and decided not to implement concatenation.
What about Split?
Implementing Split is very similar, with the difference that you are not able to maintain Count
property in O(1) time. You are splitting at certain node without knowing its index inside the list. So if you want to have a Split
operation, you should compromise Count
property.
Concat Extension Method
There is a Concat
extension method on IEnumarable
interface. You might think it’s as stupid as Count
extension method on IEnumarable
and spends O(n) on bringing all together.
In fact, it executes in O(1) returning a new enumerator which enumerates first list and jumps to the second when the first list reaches its end.
This might help if you are not going to continue working with concatenation result like with a LinkedList
. Another point is that several subsequent concatenation operations cause nested enumerators. So you get a tree of enumerators over enumerators, etc.
My Implementation
The actual goal of my implementation of double linked list supporting concat and split was to try out and demonstrate test driven refactoring approach.
The first implementation was just derived from common .NET LinkedList
.
public class SimpleLinkedList : LinkedList
{
public SimpleLinkedList()
{
}
public SimpleLinkedList(IEnumerable enumerable)
: base(enumerable)
{
}
internal SimpleLinkedList Split(LinkedListNode splitNode)
{
throw new NotImplementedException();
}
public LinkedListNode Find(T search, IEqualityComparer comparer)
{
return Find(search);
}
public void FindLast(T search, IEqualityComparer comparer)
{
return Find(search);
}
}
The second step was creating appropriate test set ensuring common LinkedList
behaviour.
Example
[TestCase(new string[] { }, "A", new[] { "A" })]
[TestCase(new string[] { }, null, new[] { (string)null })]
[TestCase(new[] { "A" }, "B", new[] { "A", "B" })]
[TestCase(new[] { "A", "B", "C" }, "D", new[] { "A", "B", "C", "D" })]
public void AddLast_one_element_occururs_at_the_end_of_the_list
(string[] original, string element, string[] expected )
{
var target = new SimpleLinkedList(original);
target.AddLast(element);
CollectionAssert.AreEquivalent(expected, target);
}
After I had a reasonable set, I started implementing functionality successively.
Here is the result – SimpleLinkedList.
Advantages
- Less memory consumption, thus every node
SimpleLinkedListNode
has three pointers (prev
,next
,value
) instead of four (prev
,next
,list
,value
) unlike original .NET implementation. - Supports
Concat
andSplit
operations in O(1) - Supports
IEnumarable Reverse()
enumerator in O(1) – by the way, I do not see any reason why it’s not provided natively on the .NETLinkedList
. Appropriate extension method requires O(n).
Disadvantages
- Does not support
Count
. Concat
operation leaves second list in an inconsistent state.Split
operation leaves original list in an inconsistent state.- You are able to share nodes between lists.
Other
- I have chosen an alternative strategy for implementing enumeration and find operations, rather than more verbose and purely readable original implementation. I hope the negative performance impact remains insignificant.
So be careful using this list and use it only if you are aware of the consequences.
