Thursday, March 27, 2008 12:19 AM
I have recently been reading "Agile Principles, Patterns, and Practices in C#" by Robert C Martin and Micah Martin. So far, it's been an enjoyable read. Early on the authors explain the concept of refactoring by showing how they would go about refactoring the "Sieve of Eratosthenes":http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes. The algorithm is a way to generate a list of prime numbers. The final refactored solution is full of loops and breaks, which is fine, but like most things -- it looked like a job for LINQ! I decided to repeat the exercise using as much LINQ as I could reasonably stuff into algorithm. Here is my green pass (of a red-green-refactor cycle).
18
19 public static IList<int> GenerateListOfPrimes(int maxNumber)
20 {
21 IList<int> result = new List<int>();
22
23 if (maxNumber > 1)
24 {
25 var candidateNumbers =
26 Enumerable.Range(2, maxNumber - 1);
27
28 double stopNumber = System.Math.Sqrt(maxNumber);
29
30 int nextPrime = 2;
31
32 while (nextPrime <= stopNumber)
33 {
34 candidateNumbers =
35 (from n in candidateNumbers
36 where
37 n <= nextPrime ||
38 n % nextPrime != 0
39 select n).ToList();
40
41 nextPrime =
42 (from p in candidateNumbers
43 where p > nextPrime
44 select p).First();
45
46 }
47
48 result = candidateNumbers.ToList();
49
50 }
51
52 return result;
53 }
54
Well, it uses LINQ, but it ain't lovely. I ended up refactoring the code in much the same way that the authors did. Primarily by breaking up the long method (Extract Method). The final pass of the method ended up looking like this:
31
32 public int[] GeneratePrimes(int maxNumber)
33 {
34 int[] result = new int[] { };
35 if (maxNumber > 1)
36 {
37 InitializeRangeOfNumbers(maxNumber);
38 InitializeLastNumberToCheck(maxNumber);
39 int prime = 2;
40 while (RemainingNumbersContainFactorsOf(prime))
41 {
42 RemoveFactorsOf(prime);
43 prime = NextPrime(prime);
44 }
45
46 result = _remainingNumbers.ToArray();
47
48 }
49
50 return result;
51
52 }
53
The implementation of the methods look like this:
54
55 private void InitializeLastNumberToCheck(int maxNumber)
56 {
57 _lastNumberToCheck = Math.Sqrt(maxNumber);
58 }
59
60 private void InitializeRangeOfNumbers(int maxNumber)
61 {
62 _remainingNumbers = Enumerable.Range(2, maxNumber - 1);
63 }
64
65 private int NextPrime(int currentPrime)
66 {
67 return
68 (from p in _remainingNumbers
69 where p > currentPrime
70 select p).First();
71 }
72
73 private bool RemainingNumbersContainFactorsOf(int prime)
74 {
75 return prime <= _lastNumberToCheck;
76 }
77
78 private void RemoveFactorsOf(int prime)
79 {
80 _remainingNumbers =
81 (from n in _remainingNumbers
82 where
83 n <= prime ||
84 n % prime != 0
85 select n).ToList();
86 }
87
You might be interested to know that the LINQ implementation is (ever so) slightly faster than the Martin version. Both versions can compute the first 100,000 primes in less than a second. I haven't dug into the reason why. For fun, I decided to implement an algorithm that would fall into the LINQ way a little easier. I came up with this:
27
28 public int[] GeneratePrimes(int maxNumber)
29 {
30 int [] result = new int [] { };
31 if(maxNumber > 1)
32 {
33 var primeNumbers =
34 from n in Enumerable.Range(2, maxNumber - 1)
35 where n.IsPrime()
36 select n;
37
38 result = primeNumbers.ToArray();
39
40 }
41 return result;
42 }
43
The @IsPrime@ method is the trick.
48
49 static class IntExtensions
50 {
51 public static bool IsPrime(this int p)
52 {
53 bool result = false;
54 if (p > 1)
55 {
56 IEnumerable<int> testNumbers =
57 from n in Enumerable.Range(2, (int)Math.Sqrt(p))
58 where
59 p != n &&
60 p % n == 0
61 select n;
62 result = testNumbers.Count() == 0;
63 }
64 return result;
65 }
66 }
67
This algorithm might look a little nicer (from a LINQ point of view), but it's tragically slow. Essentially, the code performs some heavy looping for each number in the range being tested. I could probably juice up the prime test a little bit by digging into some of the more advanced algorithms, but for now it was a fun exercise.
Download the Source