The importance of a good algorithm
I'm studying for the computer science Ph.D. qualifying exam and so I've started going back through my algorithms book (Intro to Algorithms by Cormen, Leiserson, Rivest, and Stein). The first chapter was, of course, about motivating the study of algorithms. One exercise that made an impression on me was the one that had you generate a table giving the largest problem you could solve in different amounts of time given different asymptotically complex algorithms.
Since I take every opportunity to make progress learning Haskell, I coded it up:
What the table shows is the largest value of n you could process given an algorithm of certain complexity and a certain amount of time:
| f(n) | 1 sec. | 1 min. | 1 hour | 1 day | 1 month | 1 year | 1 century |
|---|---|---|---|---|---|---|---|
| lg(n) | 2.7e43 | ∞* | ∞* | ∞* | ∞* | ∞* | ∞* |
| sqrt(n) | 10000 | 3.6e8 | 1.3e11 | 7.5e13 | 6.7e16 | 9.7e18 | 9.7e22 |
| n | 100 | 6000 | 3.6e5 | 8.6e6 | 2.6e8 | 3.1e9 | 3.1e11 |
| n lg(n) | 29 | 884 | 34458 | 6.5e5 | 1.6e7 | 1.6e8 | 1.3e10 |
| n^2 | 10 | 77 | 600 | 2939 | 16099 | 55770 | 5.6e5 |
| n^3 | 4 | 17 | 68 | 194 | 597 | 1357 | 6203 |
| 2^n | 6 | 12 | 18 | 23 | 27 | 31 | 38 |
| n! | 4 | 7 | 8 | 10 | 11 | 12 | 14 |
* The values here aren't really infinity but they are over 300 digits!
So the lesson here? Having an algorithm with a good asymptotic complexity makes a huge difference in the amount of data it is feasible to process. Just look at the difference between linear complexity (n) and logarithmic complexity (lg n): 41 orders of magnitude!
Profiling Darwin, Functionality added to Haskell GD bindings
So I've been working on the Darwin hobby project again recently and decided to find out where the program is spending all of it's time. It turns out that there are some really nice profiling tools in Haskell (GHC to be specific). Armed with that, I found out rather quickly that the bottleneck was the getPixel function I added to the GD bindings.
My first thought was that it would be nice to grab all of the pixels at once instead of repeated (slow) calls to getPixel. So I read up on Haskell FFI and the GD documentation and churned out the following:
This code returns a nice Haskell array of arrays with the color information. This improved the speed from 1 minute per 10 iterations to 1 second per 10 iterations -- about a 60x improvement!
However, this only works with true color images at this point; no indexed palette support since that information gets stored elsewhere in the GD image struct. All in all, it was a very pleasant and rewarding experience into Haskell. Next on the list is (Erlang-style maybe?) parallelization.
BTW, the GD binding code is under the dependencies folder here.
Pseudo Genetic Programming in Haskell
