Sunday, March 2, 2008

Finding primes using a grid of browsers

It's been a few months since I talked about my idea of combining the power of a lot of browsers into a so-called computing grid. Since writing that post I've been working on such a browser based grid. Progress has been slow, but steady. And I'm learning a lot in the process, so I'm not complaining at all.

My current prototype grid software has a server-side and a client-side part. Tasks and results get distributed over clients as needed. It's actually pretty cool to see one node start a task and another node finish it. But one thing that my grid lacks is a data storage service. Until I add such a service (in a scalable way) I am stuck with a grid that can only handle tasks like finding prime numbers. But even from such a classic task, there is still a lot to be learned.

For example one of the problems we'll need to solve for a browser based grid is the fact that the nodes in the grid are not dedicated to being part of the grid. Whereas people that download SETI@home at least consciously decide to contribute their computer cycles to the project, for a browser based grid this decision is less explicit. Of course this is part of the beauty of a using the web for this grid: it becomes very easy for people to join the grid. But it also means that we can't do certain things. Like eat up all available CPU cycles...

So how do we find prime numbers without eating up too much CPU time? As you probably recall from school: a prime number is a number that can only be divided by 1 and by itself. So the simplest possible algorithm to determine if a given number is a prime is something like:


function IsPrimeNumber(number) {
for (var i=2; i < number-1; i++) {
if (number % i == 0) {
return false;
}
}
return true;
}

As you can see this is quite a literal translation of the definition of primes we all learned. And although there are plenty of optimizations you can do to this algorithm, we'll stick to to this very simple version for now.

What happens if we execute this function on a node? It'll work fine for small numbers. But once the number goes beyond a certain threshold, the loop will start eating up quite some time. And since executing JavaScript in the browser is unrestricted but single-threaded, this means the script will eat up all CPU cycles for a noticeable amount of time. As I said before, that is something that I simply don't want to happen on this grid.

So we need to modify our algorithm to not eat up all CPU time. We could optimize it by for example skipping all even numbers above 2. But that would only double the maximum number we can check with this function before the user starts to notice. What we need is something that will make this function work without noticeable effect on the CPU for any number.

My approach to this has been to split the task into many sub-tasks. So instead of having one loop like in the version above, we split it into multiple loops that each run over a smaller range. Let's say we have the following helper function:

function IsDividableBySomethingInRange(number, start, end);

This function will return true if number can be divided by one of the numbers in the range [start, end>. With this helper function, we can rewrite our original IsPrimeNumber function to use ranges.

function IsPrimeNumber(number) {
var CHUNK_SIZE = 1000;
for (var i=0; i < number; i += CHUNK_SIZE) {
if (IsDividableBySomethingInRange(number, Math.max(i, 2), Math.min(i + CHUNK_SIZE, number-1)));
return false;
}
}
return true;
}

Now if we leave the algorithm like this, we are still executing the work in one go. To actually remove that problem we need to convert the IsDividableBySomethingInRange calls into sub-tasks, which can be executed on the grid separately. The IsPrimeNumber function/task then just has to wait until all its sub-tasks have completed.

The splitting of a task into sub-tasks that can be executed separately and independent of each other is a typical fork operation that I learned back in my college days. We would fork off some work to separate sub-threads, so it could be completed in parallel instead of executing each one in turn. Waiting for the sub-tasks is called a join operation. By creating a fork/join algorithm we're not just making task execution more deterministic, we're also improving parallelism by allowing the sub-tasks to run independently from each other.

So what we really wanted to create all along is a fork/join version of our IsPrimeNumber function/task.

Let's say that we have a fork operation that we can call to somehow fork off a function into a separate sub-task. And let's say that whenever such a sub-task is completed, it will call back into the original task with the result:

class IsPrimeNumber {
var number;
var result;
function IsPrimeNumber(number) {
this.number = number;
this.result = true;
}
function forkSubTasks() {
var CHUNK_SIZE = 1000;
for (var i=0; i < number; i += CHUNK_SIZE) {
fork IsDividableBySomethingInRange(number, Math.max(i, 2), Math.min(i + CHUNK_SIZE, number-1)));
}
}
function joinSubTask(subtask) {
if (subtask.result == true) {
this.result = false;
}
}
}

This is a grid-enabled fork/join version of our IsPrimeNumber task. When we execute this task with a very large number on a grid of nodes, the task will be forked into sub-tasks and each of those sub-tasks can be executed on any of the nodes. When a sub-task is completed its result can be joined back into the IsPrimeNumber, which will assure that the combined result is correct.

IsPrimeNumber(2137)
forkSubTasks
|
|--------->IsDividableBySomethingInRange(2137, 2, 1000)
| |
|------------------+-------------->IsDividableBySomethingInRange(2137, 1000, 2000)
| | |
|------------------+---------------------+---------------->IsDividableBySomethingInRange(2137, 2000, 2136)
| | |
joinSubTask | | |
| | | |
|<-----------------+ | |
| | |
|<---------------------------------------+ |
| |
|<---------------------------------------------------------------+
|
true<--------+

If you know a bit about Google's Map/Reduce algorithm (or the open-source Hadoop implementation of it) you will probably see the similarities between join and reduce.

There is still many things that we can improve here (not swamping the grid with all sub-tasks at once, keep track of the number of missing sub-tasks, etc). But essentially we now have a grid-enabled way of determining whether any given number is a prime.

No comments: