Tuesday, June 14, 2011

What does this code print and why?

            // The HashSet is state shared by every enumeration of the query.
            HashSet<int> set = new HashSet<int>();
            int[] data = new int[] { 1, 2, 1, 2 };
            var unique = from i in data
                         where set.Add(i)   // Add returns false for values already in the set
                         select i;
            // Compiles to: var unique = Enumerable.Where(data, (i) => set.Add(i));
            foreach (var i in unique)
            {
                Console.WriteLine("First: {0}", i);
            }

            foreach (var i in unique)
            {
                Console.WriteLine("Second: {0}", i);
            }

The output is:

First: 1
First: 2

Why does the second loop print nothing? A LINQ query does not cache its results; it is evaluated lazily and recomputed from scratch on every enumeration. Because the where clause depends on state (the HashSet decides which entries are part of the output), the second enumeration yields an empty sequence: HashSet.Add returns false for every value that is already in the set, and after the first loop that is all of them.
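To make the re-evaluation visible, here is a minimal sketch that counts how often the predicate runs. The class name DeferredExecutionDemo, the Run helper and the calls counter are made up for this illustration; only HashSet<int>.Add and Enumerable.Where come from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredExecutionDemo
{
    // Hypothetical helper: enumerates the stateful query twice and reports
    // how many items each pass produced and how often the predicate ran.
    public static (int first, int second, int calls) Run()
    {
        var set = new HashSet<int>();
        int[] data = { 1, 2, 1, 2 };
        int calls = 0;

        // Nothing is evaluated here; the lambda only runs during enumeration.
        IEnumerable<int> unique = data.Where(i => { calls++; return set.Add(i); });

        int first = 0;
        foreach (var i in unique) { first++; }   // fills the set with 1 and 2

        int second = 0;
        foreach (var i in unique) { second++; }  // the set already holds every value

        return (first, second, calls);
    }

    static void Main()
    {
        var (first, second, calls) = Run();
        Console.WriteLine("First pass:  {0} items", first);
        Console.WriteLine("Second pass: {0} items", second);
        Console.WriteLine("Predicate calls: {0}", calls);
    }
}
```

The predicate runs once per element per enumeration, so two passes over four elements mean eight calls, even though the second pass yields nothing.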

The solution is quite simple: use the Distinct extension method, or cache the results by calling ToList() or ToArray() on the LINQ query.
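Both options could look roughly like the following sketch. The class and method names (RepeatableUniqueDemo, WithDistinct, WithToList) are invented for this example; Distinct, Where and ToList are the standard System.Linq extension methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RepeatableUniqueDemo
{
    // Option 1: Distinct does its own bookkeeping per enumeration,
    // so the query can safely be enumerated more than once.
    public static IEnumerable<int> WithDistinct(int[] data)
    {
        return data.Distinct();
    }

    // Option 2: keep the stateful predicate, but run the query exactly once
    // and store the result in a list.
    public static List<int> WithToList(int[] data)
    {
        var set = new HashSet<int>();
        return data.Where(i => set.Add(i)).ToList();
    }

    static void Main()
    {
        int[] data = { 1, 2, 1, 2 };
        var distinct = WithDistinct(data);
        var cached = WithToList(data);

        // Both sequences give the same answer on every pass.
        for (int pass = 1; pass <= 2; pass++)
        {
            Console.WriteLine("Pass {0}, Distinct: {1}", pass, string.Join(", ", distinct));
            Console.WriteLine("Pass {0}, ToList:   {1}", pass, string.Join(", ", cached));
        }
    }
}
```

Distinct is probably the better fit when the source may change between enumerations, because it is recomputed each time; ToList takes a snapshot that no longer follows later changes to data.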

Lesson learned: never forget to think about state in Where clauses!
