A self-taught master’s in CS — April 3, 2010

A self-taught master’s in CS

Perhaps this is a far-fetched idea, but let’s toss it out anyway: it occurs to me that I probably don’t have time in my schedule to get a master’s degree in computer science, but I’ve wanted for a long time to get one. I don’t have time because, as it is, I routinely work at my job until 8 or 9 at night. As I get more proficient at what I do, I expect I’ll work less, but at least for a while there’s just no way that I could do a job *and* get a master’s *and* be a reasonably good boyfriend *and* take regular trips up to New Hampshire to spend time with my girlfriend’s kids. Oh, *and* sleep.

So. That having been determined, I should spend the next few months/years building up master’s-level proficiency on my own. I should read books, watch videos, and write a lot of code on my own. A good master’s program will build a lot of theory as well, which means a lot of math. Historically, I’ve not been very good at learning math at an abstract level — but if I can code it up (e.g., writing a crypto algorithm), I can probably internalize it quickly.

Can anyone recommend a curriculum for my self-taught computer-science master’s degree? Recommend books? Recommend the sort of programming projects that one would encounter in a good master’s program?

Automatic memoization: cleverness to solve the wrong problem — January 23, 2010

Automatic memoization: cleverness to solve the wrong problem

This is the first time in my career that I’ve used JavaScript extensively, so I’m trying to learn some best practices. A few people have recommended Crockford’s [book: JavaScript: The Good Parts], so I picked it up. While skimming through it looking for something else, I ran into his description (on page 44) of using memoization (essentially, caching) to speed up the computation of certain functions.

The example people always use here — and Crockford is no exception — is the Fibonacci sequence. The shortest way to implement it is recursively. As Crockford points out, the recursive solution doesn’t scale: the number of recursive calls necessary to compute the nth Fibonacci number is proportional to the nth power of the golden ratio. (From what I can see, the constant of proportionality converges very rapidly to about 1.11803. That number must have some special name.) I’ve coded both versions up in Python; the recursive version keeps track of how many function calls it had to make.

So then Crockford’s solution, and the solution written in lots of places ([book: Higher-Order Perl], for instance) is to use memoization: cache the results of fib(n) for smaller n, then use the cached results rather than computing those values anew when you need them later on.

This isn’t really a solution, though, as my friend Seth pointed out to me some months ago. It’s certainly clever, and languages like Perl or JavaScript make it very easy. In Perl, you can automatically memoize any of your functions with the Memoize module: just do

use Memoize;
memoize(‘fib’);

and voilà: all of your calls to `fib()` will be memoized from then on. Pretty cool, really. Manipulating the symbol table is kind of neat.

But this has only temporarily disguised the problem. Let’s look at the time and space requirements for all three of our `fib(n)` implementations:

__Recursive, non-memoized__: exponentially many function calls, hence exponentially many stack frames, hence exponential memory usage. Linear running time.

__Recursive, memoized__: linear memory usage (one cache entry for every [math: i], for [math: i] less than or equal to [math: n]). Linear running time.

__Iterative, non-memoized__: constant memory usage (must keep the [math: n]th, [math: (n-1)]st, and [math: (n-2)]th values of the sequence in memory at all times, but that’s it), linear running time.

By using some language cleverness, you’ve made the problem harder than you need to: memoization increases your memory usage from constant to linear, and does nothing to your running time. By thinking harder about the problem, you can improve both performance aspects and not need any clever language business.

Seth has told me for quite a long time that recursion and higher-order programming (closures and so forth) are interesting but make debugging a lot harder. His contention would probably be that you can often replace a clever recursive solution with a non-clever, easier-to-debug, better-performing one.

That said, at least some higher-order-programming tools make my life easier. In Python, I love list comprehensions:

print [x**2 for x in xrange(1,11) if (x**2 % 2 == 0)]

or the older-school but perhaps more familiar `map()` and `filter()`:

print filter(lambda x : x % 2 == 0, map(lambda x : x**2, xrange(1,11)))

(because I favor readability over concision, in practice I would expand this into several lines: a `map()` line, and a `filter()` line, and so forth)

As I’ve mentioned, I lament the absence of first-class functions in Java. Then again, I’ve seen enough people shoot themselves (and me) in the foot with unreadable Perl or Python, using all the cleverness that the languages provide to them, that I think I’d be okay with a less expressive language that forces programmers to be really boring.