jasondew’s repository of blog posts

The importance of a good algorithm

I'm studying for the computer science Ph.D. qualifying exam and so I've started going back through my algorithms book (Intro to Algorithms by Cormen, Leiserson, Rivest, and Stein).  The first chapter was, of course, about motivating the study of algorithms.  One exercise that made an impression on me was the one that had you generate a table giving the largest problem you could solve in different amounts of time given different asymptotically complex algorithms.

Since I take every opportunity to make progress learning Haskell, I coded it up:

What the table shows is the largest value of n you could process given an algorithm of certain complexity and a certain amount of time:

f(n) 1 sec. 1 min. 1 hour 1 day 1 month 1 year 1 century
lg(n) 2.7e43 ∞* ∞* ∞* ∞* ∞* ∞*
sqrt(n) 10000 3.6e8 1.3e11 7.5e13 6.7e16 9.7e18 9.7e22
n 100 6000 3.6e5 8.6e6 2.6e8 3.1e9 3.1e11
n lg(n) 29 884 34458 6.5e5 1.6e7 1.6e8 1.3e10
n^2 10 77 600 2939 16099 55770 5.6e5
n^3 4 17 68 194 597 1357 6203
2^n 6 12 18 23 27 31 38
n! 4 7 8 10 11 12 14

* The values here aren't really infinity but they are over 300 digits!

So the lesson here?  Having an algorithm with a good asymptotic complexity makes a huge difference in the amount of data it is feasible to process.  Just look at the difference between linear complexity (n) and logarithmic complexity (lg n): 41 orders of magnitude!

Filed under: haskell

Profiling Darwin, Functionality added to Haskell GD bindings

So I've been working on the Darwin hobby project again recently and decided to find out where the program is spending all of it's time. It turns out that there are some really nice profiling tools in Haskell (GHC to be specific). Armed with that, I found out rather quickly that the bottleneck was the getPixel function I added to the GD bindings.

My first thought was that it would be nice to grab all of the pixels at once instead of repeated (slow) calls to getPixel. So I read up on Haskell FFI and the GD documentation and churned out the following:

This code returns a nice Haskell array of arrays with the color information. This improved the speed from 1 minute per 10 iterations to 1 second per 10 iterations -- about a 60x improvement!

However, this only works with true color images at this point; no indexed palette support since that information gets stored elsewhere in the GD image struct. All in all, it was a very pleasant and rewarding experience into Haskell. Next on the list is (Erlang-style maybe?) parallelization.

BTW, the GD binding code is under the dependencies folder here.

Filed under: haskell

Pseudo Genetic Programming in Haskell

Had some fun this weekend writing Haskell in response to this blog post. Code is on GitHub. It has some performance issues and its really my first real program in Haskell, so its a little rough around the edges, I'm sure. I think I'll rewrite it in Erlang to see if I can't speed it up a good bit by parallelizing the fitness function and increase the generation pool.

Filed under: haskell
12
To Posterous, Love Metalab