jasondew's accumulated writings http://jasondew.com
Most recent posts at jasondew's accumulated writings, posterous.com

Mon, 07 May 2012 06:16:22 -0700 Turing Machines explained from the ground up http://jasondew.com/turing-machines-explained-from-the-ground-up

I gave a talk at ConvergeSE 2012 on this topic so I thought I'd write it up as a blog post as well. That said, let's jump right in.

Turing

So the obvious thing to start with is Alan Turing, the man for whom Turing machines are named. Turing was a British mathematician sometimes called the "father of Computer Science." I call him a mathematician because he did computer science before there was a discipline so named. He was influential in several fields, including AI and cryptanalysis. Specifically, he worked at Bletchley Park during World War II on breaking communications encrypted by the German Enigma machine. His first major achievement was a paper written in 1936, before he had obtained his Ph.D., proving that the Entscheidungsproblem had no solution.

The Entscheidungsproblem was proposed by a very famous mathematician named David Hilbert in 1928. Simply put, the question is whether or not an "algorithm" could be devised to determine if a statement in first-order logic is universally valid. To answer it, Turing devised a theoretical machine and used it to show that no such algorithm exists. The machine became known as a Turing machine.

Background

In order to understand this machine, we need to start with some terminology. Most of these terms have roots in language, so they'll seem familiar. First, we have an alphabet. This is simply a set of symbols. For example, we have the set of lowercase Roman letters, denoted: {"a", "b", "c", ..., "z"}. Another example, which we'll use later, is the binary alphabet or {"0", "1"}. We can represent any number with just these two symbols.

The natural thing to do with a set is to combine the elements into a sequence of symbols. This construction is called a string. Some examples from the binary alphabet are "0", "0101001", and the empty string. So, to reiterate, zero or more symbols from an alphabet gets you a string.

Finally, a formal language is a set of strings "over" an alphabet. For example, we could define a formal language of metasyntactic variables: {"foo", "bar", "baz", "quux"}. The alphabet here is taken to be the lowercase Roman letters, as before. Another example is the two-digit binary numbers: {"00", "01", "10", "11"}. Notice that both of these sets are finite but this doesn't have to be the case. Consider the alphabet {"a", "b"} and the formal language {"b", "ab", "aab", "aaab", ...} which is the set of zero or more "a"s followed by a single "b". You could write this more compactly as a*b but more on that later.

Finite Automata

Now for the fun stuff: Deterministic Finite Automata, also known as DFAs or just finite state machines. These simple "machines" are defined by an alphabet, a set of states, and a transition function. The alphabet defines the valid symbols that the machine can take as input. One of the states is defined as the start state and one or more are defined as accepting states. The start state is pretty obvious: it's just the initial state of the machine. The accepting states determine whether or not the machine "accepts" the input it was given. The most interesting part, though, is the transition function. It takes a symbol from the input stream and the current state the machine is in, and returns the new state we will transition to. Here's a graphical representation of a DFA that accepts binary strings that represent a multiple of 3:

[Figure: DFA that accepts binary multiples of 3]

The notation requires some explanation. The circled values represent states, with the start state denoted by an arrow pointing to it and the accepting states denoted with double circles. The arrows leaving and entering the states are the transitions. Taken all together, they define the transition function for this DFA. The arrow from state S_0 to S_1 with the label "1" means that if we are in state S_0 and we read a "1" on the input stream then we should transition to state S_1. Furthermore, just from the diagram we can infer that the alphabet here is {"0", "1"}. This is because a DFA must have exactly one leaving arrow for each symbol on each state. This is the deterministic property.

It will really help cement the concept if you run through a few examples. Consider the input "00". We start at state S_0 and see a "0" so we stay in the same state. When we see the second "0", we again transition to state S_0. Since we're out of input at this point, we consider whether or not we're in an accepting state. It turns out we are, which means that the machine has accepted the input. In this case, the machine is saying that "00" (0 in decimal) is a multiple of 3. Since 0 * 3 = 0, we can see this is true. Now consider the input "1". In this case, we end up in state S_1, which is not an accepting state. Therefore, the machine rejects that input. This makes sense because there is no integer x such that x * 3 = 1.
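If you'd rather let a computer do the walking, here is a minimal Ruby sketch of this DFA. The transition table is my reading of the standard multiples-of-3 machine (each state tracks the value read so far modulo 3), and the names DFA_TRANSITIONS, DFA_START, DFA_ACCEPTING, and dfa_accepts? are made up for illustration.

```ruby
# DFA_TRANSITIONS[state][symbol] gives the next state.
# Assumption: state S_k means "the bits read so far equal k modulo 3".
DFA_TRANSITIONS = {
  "S0" => { "0" => "S0", "1" => "S1" },
  "S1" => { "0" => "S2", "1" => "S0" },
  "S2" => { "0" => "S1", "1" => "S2" }
}.freeze
DFA_START = "S0"
DFA_ACCEPTING = ["S0"].freeze

# Feed the input through the transition function one symbol at a time,
# then check whether the state we ended in is an accepting state.
def dfa_accepts?(input)
  final_state = input.each_char.reduce(DFA_START) do |state, symbol|
    DFA_TRANSITIONS.fetch(state).fetch(symbol)
  end
  DFA_ACCEPTING.include?(final_state)
end

puts dfa_accepts?("00") # accepted, as in the walkthrough above
puts dfa_accepts?("1")  # rejected
```

Notice that the machine never stores the number itself, only which of the three states it is in.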

Now, consider what would happen if we relax the deterministic constraint. This would mean that the transition function can now return zero or more states. In other words, we can have no transition out of a state at all (a sink), a single transition (as before), or multiple transitions. In the final case, we're effectively allowing the machine to split itself into however many transitions there are. This effectively gives us a tree of automata. At first glance, this looks like it should give us quite a bit more expressive power.

These machines are called nondeterministic finite automata or NFAs. Let's look at an example:

[Figure: simple NFA example]

This machine is using the same alphabet as before and has only two states, p and q. Here, the starting state is p and q is the only accepting state. Notice that when in state p and seeing a "1", we simultaneously stay in the p state and also move to the q state. You can think of this as multiple universes or cloning the machine so that we keep track of all possible paths. It turns out that this machine accepts any binary string that ends with a "1".
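The "multiple universes" idea can be sketched in Ruby by tracking the set of states the machine could be in at once. The transitions for p come from the description above; I'm assuming q has no outgoing transitions, and the names NFA_TRANSITIONS and nfa_accepts? are hypothetical.

```ruby
require "set"

# NFA_TRANSITIONS[[state, symbol]] lists every state we may move to.
# Assumption: q has no outgoing arrows, so those entries are simply missing.
NFA_TRANSITIONS = {
  ["p", "0"] => ["p"],
  ["p", "1"] => ["p", "q"]
}.freeze

# Keep every "universe" alive at once by carrying a set of current states.
def nfa_accepts?(input)
  current = Set["p"]
  input.each_char do |symbol|
    current = current.flat_map { |state| NFA_TRANSITIONS.fetch([state, symbol], []) }.to_set
  end
  # Accept if at least one universe ended in the accepting state q.
  current.include?("q")
end

puts nfa_accepts?("0101") # accepted: ends with a "1"
puts nfa_accepts?("10")   # rejected
```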

Surprisingly, NFAs and DFAs are equivalent in expressive power. That is, any NFA can be converted into a DFA that accepts exactly the same set of strings. So even though we allow non-determinism, we can still convert it into an equivalent, but generally larger, DFA.
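One common way to do the conversion is the subset construction, where each DFA state stands for a set of NFA states. Here's a rough Ruby sketch applied to the two-state NFA above; the names SUBSET_NFA and subset_construction are hypothetical, and I'm again assuming q has no outgoing transitions.

```ruby
# The NFA from the example: [state, symbol] => list of possible next states.
SUBSET_NFA = {
  ["p", "0"] => ["p"],
  ["p", "1"] => ["p", "q"]
}.freeze

# Build a DFA transition table whose states are sorted arrays of NFA states.
# Only subsets reachable from the start state are generated.
def subset_construction(nfa, alphabet, start_state)
  table = {}
  pending = [[start_state]]
  until pending.empty?
    subset = pending.shift
    next if table.key?(subset)

    table[subset] = {}
    alphabet.each do |symbol|
      targets = subset.flat_map { |state| nfa.fetch([state, symbol], []) }
      table[subset][symbol] = targets.uniq.sort
    end
    pending.concat(table[subset].values)
  end
  table
end

dfa = subset_construction(SUBSET_NFA, ["0", "1"], "p")
dfa.each { |subset, moves| puts "#{subset.inspect} => #{moves.inspect}" }
```

A DFA state built this way is accepting when its subset contains an accepting NFA state (here, q). With n NFA states there can be up to 2^n subsets, which is where the "generally larger" comes from.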

Regular languages

The set of strings that a finite automaton accepts is called its language. A regular language, in turn, is any language that can be recognized by a finite automaton, either deterministic or not. Even more interesting is that a language is regular if and only if some regular expression describes it. What this means is that regular expressions and DFAs have the same capability in describing languages. So we can convert from a DFA into a regular expression and vice versa.
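To make the connection concrete, here are the two languages from earlier written as Ruby regular expressions. This is just an illustration: the constant names are made up, and Ruby's Regexp engine supports more than strictly regular features, so I've stuck to plain concatenation, alternation, and star.

```ruby
# Binary strings that end with a "1" (the language of the NFA example).
ENDS_WITH_ONE = /\A[01]*1\z/

# Zero or more "a"s followed by a single "b", written a*b earlier.
AS_THEN_B = /\Aa*b\z/

puts ENDS_WITH_ONE.match?("0101") # matches
puts ENDS_WITH_ONE.match?("10")   # does not match
puts AS_THEN_B.match?("aaab")     # matches
puts AS_THEN_B.match?("ba")       # does not match
```

The \A and \z anchors force the whole string to match, which mirrors an automaton having to consume all of its input before deciding.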

Turing Machines

Finally we're ready to describe Turing machines. They are just a small step up in complexity from the finite automata we just looked at. We now have a name for the stream of input: the tape. A Turing machine can read from and write to the tape, as well as control its movement. So now we need two alphabets: the input alphabet and the tape alphabet, which holds every symbol that can appear on the tape. We still have a set of states, except that now we will have one starting state, one accepting state, and one rejecting state. There will still be a transition function, except that now it takes a state and the symbol at the current position on the tape and returns a new state, possibly a symbol to write, and a direction to move (either left or right). Notice that we don't have to write a symbol, but we do have to move.

Let's look at an example Turing machine. We'll name it M for machine. It's going to have an input alphabet of {"0"} and a tape alphabet of {"0", "x", "_"}, where "_" is the blank symbol. The machine will accept strings of "0"s whose length is a power of 2. That is, strings whose length can be expressed as 2^x for some non-negative integer x. For example, "0", "00", and "0000" are the smallest strings that should be accepted because they are of length 1 (2^0), 2 (2^1), and 4 (2^2), respectively. To be clear, the machine would reject strings like "000" or "00000".

[Figure: state diagram for M]

This description should look familiar. The labels on the transition arrows have gotten a little more interesting. They are in the form "symbol -> [symbol,] direction" where the first symbol defines when this transition is applicable, the second symbol is optional and defines the symbol to write to the tape, and the direction is either L or R for moving left or right on the tape.

So if we imagine the tape with a "." at the current position, the transition of states for the input "00" goes something like this:

[Figure: sequence of states and tape contents for the input "00"]

You should try "running" the machine on other inputs, making a table similar to the one above. Basically, the procedure is to mark off every other 0 on each pass, which halves the number of 0s remaining. If a pass finds an odd number of 0s (other than a single 0), then we know we should reject. If we get down to a single 0, we accept the string.

Once you get comfortable with what this machine is doing, you'll notice that each state has a specific purpose. For example, the state q1 marks the first 0 with an _ so that the machine knows when it's back at the beginning of the string. States q2, q3, and q4 are doing the bulk of the work, marking through the "0"s with "x"s and moving along the tape. State q5 is a reset procedure, moving us back to the beginning of the input.
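If you want to check your hand-made tables, here's a small Ruby simulator for M. The transition table below is my reconstruction based on the power-of-2 example in Sipser's textbook (listed in the references), on the assumption that the diagram above follows it; the names TM_BLANK, TM_RULES, and tm_accepts? are made up for this sketch.

```ruby
TM_BLANK = "_"

# TM_RULES[[state, symbol read]] => [next state, symbol to write, direction].
# Assumption: reconstructed from Sipser's power-of-2 example; where the
# diagram writes nothing, the rule just writes back the symbol it read.
TM_RULES = {
  ["q1", "0"]      => ["q2", TM_BLANK, :R],
  ["q1", "x"]      => ["reject", "x", :R],
  ["q1", TM_BLANK] => ["reject", TM_BLANK, :R],
  ["q2", "x"]      => ["q2", "x", :R],
  ["q2", "0"]      => ["q3", "x", :R],
  ["q2", TM_BLANK] => ["accept", TM_BLANK, :R],
  ["q3", "x"]      => ["q3", "x", :R],
  ["q3", "0"]      => ["q4", "0", :R],
  ["q3", TM_BLANK] => ["q5", TM_BLANK, :L],
  ["q4", "x"]      => ["q4", "x", :R],
  ["q4", "0"]      => ["q3", "x", :R],
  ["q4", TM_BLANK] => ["reject", TM_BLANK, :R],
  ["q5", "0"]      => ["q5", "0", :L],
  ["q5", "x"]      => ["q5", "x", :L],
  ["q5", TM_BLANK] => ["q2", TM_BLANK, :R]
}.freeze

# Run M on the input until it reaches the accept or reject state.
def tm_accepts?(input)
  tape = input.chars
  state = "q1"
  head = 0
  until %w[accept reject].include?(state)
    symbol = tape[head] || TM_BLANK      # cells past the input are blank
    state, tape[head], move = TM_RULES.fetch([state, symbol])
    head += (move == :R ? 1 : -1)
    head = 0 if head.negative?           # the tape has a left end
  end
  state == "accept"
end

puts tm_accepts?("0000") # accepted: length 4 is a power of 2
puts tm_accepts?("000")  # rejected
```

Adding a puts of state, head, and tape.join inside the loop prints the same kind of table as the one above.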

Conclusion

So why do we care about Turing machines? First of all, they define what an algorithm is, in more concrete terms. This is related to the Church-Turing thesis and what it means to be "computable." It also turns out that they are equivalent in power to any other reasonable computational model. This is fairly surprising considering how relatively simple they are. They represent the essence of what it is to be a "computer."


References

Deterministic finite automaton. (2012, March 11). Retrieved from http://en.wikipedia.org/wiki/Deterministic_finite_automaton
 
Nondeterministic finite automaton. (2012, April 20). Retrieved from http://en.wikipedia.org/wiki/Nondeterministic_finite_automaton

Petzold, C. (2008). The annotated Turing. Indianapolis: Wiley Publishing, Inc.

Sipser, M. (2006). Introduction to the theory of computation. (2nd ed.). Boston: Thomson Course Technology.

Turing machine. (2012, April 17). Retrieved from http://en.wikipedia.org/wiki/Turing_machine

http://files.posterous.com/user_profile_pics/1988051/jasondew.jpg http://posterous.com/users/5fdzKmwyYb0R Jason Dew Jason Jason Dew
Mon, 23 Jan 2012 15:20:00 -0800 Clathrus ruber http://jasondew.com/clathrus-ruber

While walking in the woods this afternoon, I came across this red, tubular fungus growing on some decaying wood.  Upon further investigation, I found that inside was a nice green mucus with lots of flies inside.  Then, the smell hit me.  It was definitely the most foul-smelling thing I've ever encountered growing in the woods.  Interestingly enough, according to Wikipedia, it's actually an invasive species brought over from Europe.  There's always something cool in the woods.

References:

http://en.wikipedia.org/wiki/Clathrus_ruber

Thu, 15 Jul 2010 12:21:35 -0700 Useless (but nifty) Ruby code http://jasondew.com/useless-but-nifty-ruby-code

Here's a nifty little piece of useless Ruby code:

and the result is:

require "quine"
this.that.and.something_else

Mon, 12 Jul 2010 14:57:00 -0700 coded_options: A new Ruby gem for coded fields http://jasondew.com/codedoptions-a-new-ruby-gem-for-coded-fields

I recently started a couple of new projects (Rails 3 + mongoid) and I've noticed a pattern in the way I handle coded fields.  My key example is something like the following: you have a field, say status, that can have several values, say active, closed, and invalid.  Obviously you could store those as strings in the database or you can code them, say 0, 1, and 2.  Normal database practice is to code them as integers to save space, but a far more important concern is that clients change their minds about what you call things.  So it's much easier to just change the string values in some code than it is to go changing every string in the database.

Anyway, the usage is something like this (yanked directly from the README):

Here line 4 (the coded_options call) basically gets mapped into lines 6 through 14.  Nothing spectacular but it's really cleaned up my code quite a bit so maybe it will be useful to some other folks.  The code is up on GitHub (http://github.com/jasondew/coded_options) and the gem is on Gemcutter (http://rubygems.org/gems/coded_options).
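For anyone who hasn't run into the pattern, here is the underlying idea in plain Ruby. To be clear, this is not the gem's API, just a hand-written sketch of a coded field; the Ticket class, STATUSES constant, and status_id attribute are hypothetical names.

```ruby
# A hand-rolled coded field: the database stores the integer code and the
# label lives in code, so renaming a status never touches stored data.
class Ticket
  # Integer code stored in the database => label shown to users.
  STATUSES = { 0 => "active", 1 => "closed", 2 => "invalid" }.freeze

  attr_accessor :status_id

  def initialize(status_id)
    @status_id = status_id
  end

  # Translate the stored code back into its current label.
  def status
    STATUSES[status_id]
  end
end

puts Ticket.new(1).status
```

If the client decides "closed" should read "resolved", only the hash changes.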

Fri, 09 Jul 2010 18:15:27 -0700 The importance of a good algorithm http://jasondew.com/the-importance-of-a-good-algorithm

I'm studying for the computer science Ph.D. qualifying exam and so I've started going back through my algorithms book (Intro to Algorithms by Cormen, Leiserson, Rivest, and Stein).  The first chapter was, of course, about motivating the study of algorithms.  One exercise that made an impression on me was the one that had you generate a table giving the largest problem you could solve in different amounts of time given algorithms of different asymptotic complexity.

Since I take every opportunity to make progress learning Haskell, I coded it up:

What the table shows is the largest value of n you could process given an algorithm of certain complexity and a certain amount of time:

f(n)      1 sec.   1 min.   1 hour   1 day    1 month  1 year   1 century
lg(n)     2.7e43   ∞*       ∞*       ∞*       ∞*       ∞*       ∞*
sqrt(n)   10000    3.6e8    1.3e11   7.5e13   6.7e16   9.7e18   9.7e22
n         100      6000     3.6e5    8.6e6    2.6e8    3.1e9    3.1e11
n lg(n)   29       884      34458    6.5e5    1.6e7    1.6e8    1.3e10
n^2       10       77       600      2939     16099    55770    5.6e5
n^3       4        17       68       194      597      1357     6203
2^n       6        12       18       23       27       31       38
n!        4        7        8        10       11       12       14

* The values here aren't really infinity but they are over 300 digits!

So the lesson here?  Having an algorithm with a good asymptotic complexity makes a huge difference in the amount of data it is feasible to process.  Just look at the difference between linear complexity (n) and logarithmic complexity (lg n) in the one-second column: 41 orders of magnitude!
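The original was in Haskell, but the idea behind each table entry can be sketched in Ruby: find the largest n whose cost still fits in the step budget. The budget of 100 steps per second is an assumption I'm inferring from the "n" row (100 at 1 second), the "n lg(n)" row appears to use the natural log, and the names STEPS_PER_SECOND and largest_n are made up.

```ruby
# Assumption: 100 steps per second, inferred from the table's "n" row.
STEPS_PER_SECOND = 100

# Largest n with cost(n) <= budget, found by doubling and then binary search.
# Assumes cost is non-decreasing in n.
def largest_n(cost, seconds)
  budget = STEPS_PER_SECOND * seconds
  low = 0
  high = 1
  high *= 2 while cost.call(high) <= budget # grow until high is too expensive
  while high - low > 1
    mid = (low + high) / 2
    if cost.call(mid) <= budget
      low = mid
    else
      high = mid
    end
  end
  low
end

COSTS = {
  "n"       => ->(n) { n },
  "n lg(n)" => ->(n) { n * Math.log(n) },
  "n^2"     => ->(n) { n**2 },
  "n^3"     => ->(n) { n**3 },
  "2^n"     => ->(n) { 2**n },
  "n!"      => ->(n) { (1..n).reduce(1, :*) }
}.freeze

COSTS.each { |name, cost| puts "#{name}: #{largest_n(cost, 1)}" }
```

I left lg(n) out of the sketch because its answers are far too large to find by evaluating the cost function at candidate values of n.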

Thu, 08 Oct 2009 12:00:00 -0700 Named instances for ActiveRecord http://jasondew.com/named-instances-for-activerecord

For a project that I'm working on at my day job, we have a governmental client for which we are building a pretty large and complicated online/offline Ruby on Rails application. As part of this app there are tons of data-specific rules. For example, if a client with HIV is being assessed then certain fields may have to be displayed/hidden and there are rules that get applied differently. So let's say you have the following setup:

http://gist.github.com/205241

Now, somewhere in your code you want to be able to take a specific action only if that client has a particular diagnosis. Without named_instances you might do something like:

http://gist.github.com/205250

With named_instances you can write the following faster and more concise code:

http://gist.github.com/205253

We've been using this functionality for about 6 months now and it's been great. The gem is out on GemCutter (which rocks) and the repo is at GitHub. Hopefully it will be useful in your projects. Comments, criticisms, and patches welcome.

Thu, 17 Sep 2009 12:00:00 -0700 Weirdest test failure ever... http://jasondew.com/weirdest-test-failure-ever

So based on some customer feedback on a project I'm currently working on, I added a validation requiring that one of the dates be in the past. The validation is pretty straightforward:

http://gist.github.com/188474

After re-running the test suite though, I got a couple of failures. After digging into the code, it turns out that the following code snippet evaluates to true!

http://gist.github.com/188475

Of course this makes no sense. What's more is that I tried the code again while writing this blog post and now it's false. So I did some digging into the Ruby internals and it turns out that Time#< is implemented in C in the following method:

http://gist.github.com/189385

The best I can come up with is that there's a problem with the GetTimeval method since it's just a macro that pulls out some data from a time struct -- but the Date class is implemented in pure Ruby. Has anyone come across this, or can anyone explain it better?

Fri, 13 Feb 2009 12:00:00 -0800 Profiling Darwin, Functionality added to Haskell GD bindings http://jasondew.com/profiling-darwin-functionality-added-to-haske

So I've been working on the Darwin hobby project again recently and decided to find out where the program is spending all of its time. It turns out that there are some really nice profiling tools in Haskell (GHC to be specific). Armed with that, I found out rather quickly that the bottleneck was the getPixel function I added to the GD bindings.

http://gist.github.com/470216

My first thought was that it would be nice to grab all of the pixels at once instead of repeated (slow) calls to getPixel. So I read up on Haskell FFI and the GD documentation and churned out the following:

This code returns a nice Haskell array of arrays with the color information. This improved the speed from 1 minute per 10 iterations to 1 second per 10 iterations -- about a 60x improvement!

However, this only works with true color images at this point; no indexed palette support since that information gets stored elsewhere in the GD image struct. All in all, it was a very pleasant and rewarding experience into Haskell. Next on the list is (Erlang-style maybe?) parallelization.

BTW, the GD binding code is under the dependencies folder here.

Sun, 01 Feb 2009 12:00:00 -0800 Pseudo Genetic Programming in Haskell http://jasondew.com/pseudo-genetic-programming-in-haskell

[Image: Monalisa]
Had some fun this weekend writing Haskell in response to this blog post. Code is on GitHub. It has some performance issues and it's really my first real program in Haskell, so it's a little rough around the edges, I'm sure. I think I'll rewrite it in Erlang to see if I can't speed it up a good bit by parallelizing the fitness function and increasing the generation pool.

Fri, 08 Aug 2008 12:00:00 -0700 August AITP Presentation http://jasondew.com/august-aitp-presentation

Here are the slides from the talk I gave on Wednesday. Enjoy!

Thu, 31 Jul 2008 12:00:00 -0700 POSScon Talk http://jasondew.com/posscon-talk

Awesome conference. Met lots of great people and the roundtable and ensuing conversation at the end were the best part. We need more of these kinds of conferences. Here are the slides from my talk:

I'm looking forward to next year!

Fri, 11 Jul 2008 12:00:00 -0700 POSScon: Palmetto Open Source Software Conference http://jasondew.com/posscon-palmetto-open-source-software-confere-0

[Image: Posscon]
I'll be speaking at the new South Carolina OSS conference in Columbia, SC on July 30th -- more details at their website. I plan to talk about the open-source technologies that we use at the SC Budget and Control Board -- mostly Ruby on Rails and MySQL. I'm pretty excited about the conference as it's the first of its kind here in Columbia. Should be worth coming to check it out -- oh, and did I mention it's free to attend? See you guys there!

Tue, 20 May 2008 12:00:00 -0700 ActiveRecord Bug Squashed http://jasondew.com/activerecord-bug-squashed

[Image: Rails]
So we decided to use the new dirty record feature of Edge Rails to record the history of records in our new app at work. Turns out that nullable integer fields are always dirty if their current value is NULL. So, I submitted a patch and I'm honored to report that it was accepted and committed into Rails. It's nice to be able to give back to the community.

Sat, 15 Mar 2008 12:00:00 -0700 learnSTAT is now open source http://jasondew.com/learnstat-is-now-open-source

I've been teaching a Statistics course at USC for a few years now and so, being the geek that I am, I decided a couple of semesters ago to write some course management software in Rails. I've worked on it on and off since then and I would consider it to be in a semi-usable state at this point. I've used it in my last two semesters without major problems.

The features at this point are

  • course announcements
  • course documents
  • ability to assign multiple choice quizzes
  • quiz statistics, including per question
  • ability to add exam grades

The source is available at http://github.com/jasondew/learnstat. Please send any bug reports or feature requests to jason.dew at gmail.

Mon, 11 Feb 2008 12:00:00 -0800 Calculating IRR http://jasondew.com/calculating-irr

So I decided to give the latest Ruby Quiz a shot. I created an Algebra module to deal with finding the root of a function -- using Newton's method.

and here is the more specific code to calculate the IRR:
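As a minimal sketch of the approach (not the original listings), here is Newton's method with a numerically estimated derivative, applied to the net present value of a series of cash flows. The names newton_root, npv, and irr, the default guess of 0.1, and the tolerances are all assumptions made for illustration.

```ruby
# Find a root of f near guess using Newton's method. The derivative is
# estimated with a central difference, so f only needs to be callable.
def newton_root(f, guess, tolerance: 1e-9, max_iterations: 100, h: 1e-6)
  x = guess.to_f
  max_iterations.times do
    fx = f.call(x)
    return x if fx.abs < tolerance

    slope = (f.call(x + h) - f.call(x - h)) / (2 * h)
    raise "zero derivative at #{x}" if slope.zero?

    x -= fx / slope
  end
  raise "did not converge after #{max_iterations} iterations"
end

# Net present value: discount the cash flow in period t by (1 + rate)**t.
def npv(rate, cash_flows)
  cash_flows.each_with_index.map { |flow, period| flow / (1.0 + rate)**period }.sum
end

# The IRR is the rate at which the net present value is zero.
def irr(cash_flows, guess: 0.1)
  newton_root(->(rate) { npv(rate, cash_flows) }, guess)
end

puts irr([-1000, 500, 500, 500])
```

Newton's method can fail to converge from a poor starting guess, so a production version would want a fallback such as bisection.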

Wed, 07 Nov 2007 12:00:00 -0800 My new favorite website http://jasondew.com/my-new-favorite-website-10

[Image: screen capture]

I know it's geeky, but I love it.  It's great practice -- http://refactormycode.com/.  The general idea is: people post code that they think could be written better and then other people refactor it and get rated on it.  How cool!

Sun, 04 Nov 2007 11:00:00 -0800 Cowboys and Farmers http://jasondew.com/cowboys-and-farmers

[Image: Cowboy]
I can't take credit for this idea and I can't remember the blog post where I read it... but the idea goes something like this: most development groups have cowboys and farmers.

Cowboys live on the bleeding edge of technology and, therefore, tend to bleed at times (normally in the form of overtime). Of course, with risk comes reward. In software development this is increased productivity, more robust products, and programmer happiness.

Farmers, on the other hand, represent stability. They are willing to use the same tools, year after year, and normally produce steady results. They are the risk averse -- willing to do twice the amount of work with a tool that is comfortable rather than try a tool that is more specialized and/or capable.

Obviously, we need some sort of a balance between the cowboys and the farmers. Too much of either type is a recipe for destruction. However, I'm certainly a cowboy. I love learning new tools, especially when they get the job done better than the old tool.

Fri, 02 Nov 2007 12:00:00 -0700 Common Beginnings http://jasondew.com/common-beginnings

[Image: Yukihiro Matsumoto]
It's funny what a hobby can turn into. Listening to Matz at RubyConf 07 made me reminisce about how I got started programming. He was asked "do you consider yourself to be a scientist or an artist?" to which Matz responded: "a hobbyist." Ruby was just a hobby to him, something he found to be fun and fulfilling. It's kind of the same way I made it into full-time web development. It's what I did on the side because I enjoyed it. Now I feel privileged that I have a job where I can do what I love most of the time. It's nice to find commonalities with people that you respect.
