Thursday, December 25, 2008

Which keyboard is best on mobile phones?

It seems that mobile phones are the main place where UI experiments are still being done. One of the types of UI experiment that recently drew my attention was in keyboard design.

Palm Treo


I just got a Palm Treo from work. That Treo has a built in QWERTY keyboard with horribly small keys. The keyboard is always available, together with the screen. And although that sounds convenient, it does mean that both the screen and the keyboard are small and the device is clunky.

GPhone


Compare this to the design of the first Google phone: the T-Mobile G1. When using this device as a phone the keyboard is stowed away under the screen. When you want to access one of its many PDA functions, you fold away the screen and the mini QWERTY keys become available. So while in phone mode the device is a lot smaller, but in PDA mode it is bigger.

iPhone


Lastly there's the iPhone solution: a virtual on-screen keyboard. There on-screen keyboards were pioneered by Windows PDAs about a decade ago, but leave it to Apple to breathe stylish new life into an existing concept. The screen on your iPhone is big and can be used for normal operation. Until you need a keyboard, then the screen is split into two: the top section remains a normal screen and the bottom section becomes a virtual keyboard. The virtual keys are as miniaturized as their physical counterpart. But they do miss the tactile feedback you get from the other devices. So while it looks endlessly better, inputting text on the iPhone is probably slower than on the others.

So what do you think? Which is the better way of inputting text on your mobile device? The Treo approach of an "always on" keyboard? The way the Google Phone hides the keyboard during normal operation? Or do you prefer the iPhone's virtual keyboard?

Saturday, October 25, 2008

Optimistic locking vs. pessimistic locking

If you have an education in computer science you typically learned about locking in databases. I was taught that the only safe way to update a database in a multi-user environment is to use a transaction:


  1. acquire a lock on the record

  2. read the current value from the record

  3. calculate the new value

  4. write the new value to the record

  5. release the lock


Simple. Reliable. Safe.

And horribly slow...

Imagine thousands of users concurrently hammering your database. Most of these users will be working on different records. So for most of them, the lock is completely useless. But you're still paying the price for acquiring this lock on every database update. That is why this type of locking is called a pessimistic lock. You assume that multiple people will want to update that same record and prevent them from doing so.

I thought this was the only way to do locking, until I encountered something completely different about 10 years ago. At my company at the time we used an internally built software application to track stock levels of our products. And that application used the pessimistic lock pattern described above. The application worked fine for some time, but as it got more users it started to have more and more performance problems. I was brought in as part of a team to solve those problems. I immediately went to work on some of the heaviest operations and was able to bring startup performance back to acceptable levels in a few days. But the updates to the database were still going as slow as ever.

That's when the technical lead of my team came with an interesting suggestion. "Why don't we remove the database locks?" Of course the entire team was in shock. You couldn't remove the locks from the updates. That meant that the data could become corrupt. That is what we told the tech lead: you need locks to keep your data from becoming corrupt. "So why don't we ensure that the data won't become corrupt when we update it?"

Nobody understood what he meant. Keep in mind that all of us were brought up with the concept of pessimistic locking. So the lead asked us to show him the queries that were being fired against the database:

  1. SELECT ID, StockCount FROM tblProducts WHERE ID=?

  2. application code calculates the new stock for the product

  3. UPDATE tblProducts SET StockCount=? WHERE ID=?


He thought about it for a while and then said "How about this?":

  1. SELECT ID, StockCount FROM tblProducts WHERE ID=?

  2. application code calculates the new stock for the product

  3. UPDATE tblProducts SET StockCount=? WHERE ID=? AND StockCount=?


"So we pass in both the new value for StockCount and the value that we expect to be there. That way the update will only be performed when the StockCount has not been modified by anyone else."

I can still remember the shock going through my mind. Does this mean that I don't need to lock the record before reading it? What will we do when the StockCount did actually change between our SELECT and our UPDATE?

All these questions were answered in the next few hours. If some other user had performed a conflicting UPDATE while our code was busy figuring out the new stock vale, we'd have to reperform the entire sequence above. So the process turned more into a loop:

  • while the stock value has not been updated:


    1. SELECT ID, StockCount FROM tblProducts WHERE ID=?

    2. application code calculates the new stock for the product

    3. UPDATE tblProducts SET StockCount=? WHERE ID=? AND StockCount=?



And sure the code got a bit more complex. And definitely in the worst case, this code will send a lot more queries to the database than the one with pessimistic locking. After all, it will fire multiple SELECT and UPDATE statements. Potentially an unlimited number of them, because of the loop.

But that is exactly why people call this the optimistic locking pattern. As I said earlier, most concurrent updates to a database are to different records. So most updates don't have a conflict. But with pessimistic locking all updates pay the price of acquiring a lock. So all updates pay get a penalty for a small fraction of them that do require the locking.

Optimistic locking is built on the assumption that most updates are without conflicts. When there is no conflict, there is no penalty. And only when there has been a conflicting update, do we go into the loop of retrying and pay the penalty. The penalty will be much higher than with pessimistic locking, but it will only be paid for the cases where it is needed.

So there you have it, the difference between optimistic and pessimistic locking and how I learned about them. Pick whichever one you think will work best for your project. Both of them are perfectly suitable for keeping your data safe. It all depends on the ratio of "updates with conflicts" vs. "updates without conflicts" for your application.

Sunday, July 13, 2008

Using PageRank to determine the most powerful people on LinkedIn

Last week I was reading the original paper by Larry Page and Sergey Brin on the PageRank algorithm. PageRank is an algorithm to determine the relative authority of pages that they came up with while at Stanford University. And it is of course also the algorithm they then just started using for a search engine at Stanford - a search engine they named Google.

If you just skip the math for a moment, the PageRank algorithm is surprisingly simple. The basic rules in determining the rank of a page are:
1) the links to a page determine its rank
2) a page spreads its rank over the links it contains

So if you have a page with three links on it, that page will contribute one third of its rank to each of those links. A page with a rank of 100 that has two links on it, will give 50 points to each of its targets.

A simplified PageRank calculation, showing that each page re-distributes its own PageRank over the pages it links to

One thing you may already have noticed is that the definition of the PageRank can't be determined in one go: you need to know the value of all incoming links to determine the rank of a page, which you need to know to determine the value of all outgoing links. According to the paper you can simply calculate all PageRanks in an iterative fashion until it converges after a quite limited number of steps. I'll gladly take their word for it, since that puts the algorithm back into the hands of us programmers. We can iterate over large collections of data like no other.

A converged PageRank calculation, showing that repeated refinement will result in a stable network of PageRanks

The PageRank algorithm is so simple that it not only works, it has turned the search engine world (and the internet in general) upside down. I'm quite sure that today's PageRank is determined in a much more complex fashion, but let's stick to this older and more simple model for now.

What else could we use PageRank for?

What would happen if you let PageRank loose on a completely different type of data? How about instead of taking all the hyperlinked documents on the web, we take a social network? So take the network of people at a site like LinkedIn, Facebook, Plaxo or Orkut and apply the same two basic rules:
1) a person's rank is determined by the incoming links
2) a person's rank is evenly divided over their outgoing links

So the more people link to you, the higher your rank. And if more people link to the people that link to you, your rank will get even higher.



If we let PageRank loose on a social network like this, what will happen? Who will get the highest PageRank? And what does the rank mean in such a scenario? Is your PageRank an indicator of the power of your social network? Will my boss indeed have a higher rank than me, as you would expect? And will his boss have an even higher rank?

There are of course differences between links on a social network site and the web in general. For example: in a social network like this the links between people are non-directed, when you are connected to someone that person is also connected to you. This will likely affect the results, but I am too bad at maths to foresee how.

Basically I have no idea what the outcome of such an experiment will be. It just seems to me that it would be incredibly interesting to see what the numbers are and then determine what they mean. Much like Page and Brin did with their original PageRank: first let the algorithm loose on the data, then see what you can do with the results. If the algorithm feels intuitively good, the results are bound to be interesting.

Does you feel the same? Can this be an interesting experiment? Or is this complete nonsense and is PageRank not applicable to this scenario? And do the social network sites have APIs that will allow you to do this? Or is your data access limited to the people in your own network (as they always promise in their privacy statement) and can only someone working for Facebook/LinkedIn/Plaxo create such an application?

Saturday, March 22, 2008

Three ingredients to a better bug report

At work I've recently been going through lots of bug reports. It is part of my job to determine the priority of each defect that gets reported and whether we should still fix it in the upcoming release. With multiple testers trying to break the product in all possible ways and a deadline approaching rapidly, analyzing the newfound defects seems to take more and more time.


What did you see?


People entering defects can actually do a lot to reduce the time it takes to analyze their entries. You'd be amazed at the number of defects that say something along the lines:

  • "when I do A, B happens"

There are of course cases when this is all there is to say about a defect. For example:

  • "when I click the Ok button, the browser shows an internal apache error"

Granted, it would be more useful if the report said a bit more about the error message. But it is at least clear that an internal error message should not be shown to the user.

What did you expect to see?


Unfortunately things are not always so clear:

  • "when try to I delete the last item from the list, I get a message saying at least one item is required"

When I get an error report like this, I'm not sure what to do with it. Most likely there is an internal rule in the program that this list may never be empty. And the program seems to enforce this rule by giving a message when you try to delete the last remaining item. So there is a "business rule" in the program and the developer wrote code to enforce that rule. Where is the defect?

In cases like these I ask the person who entered the defect why they think this behavior is wrong. Typically I get an answer like:

  • "if I can't delete the selected item, the delete button should be disabled"

This added information makes things a lot clearer. So the tester didn't disagree with the fact that there should always be at least one item in the list, they just didn't agree with the way it was handled.

Why do you think your expectation is better than the current behavior?


But the above leaves me with a problem. There is a clear rule in the program that the list must never be empty. The programmer implemented this one way, someone else thought it should have been implemented another way.

In cases like these I ask the tester (or whoever reported the defect) to explain why they think their expectation is better than the current behavior. In the example we've used so far, the reason could be something like:

  • "clicking the button when there is only one item in the list will always show an error message - the delete action will never be performed. Buttons that don't lead to an action being executed should be disabled."

This is a clear - albeit somewhat abstract - description of the reason why the person expected the behavior to be different.

Prior art



In this example I doubt whether anyone will disagree with the reasoning of the tester. But there are many cases where someone will disagree. Especially the developer that implemented the functionality will tend to defend the way it works.

That's why I normally prefer the defect to point to other places where similar functionality is available in the way the tester prefers it. So in the example defect:

  • "in screens B and C we have a similar list and there the delete button is disabled if there is only one item remaining in the list"

This type of argument works especially well when the functionality in screens B and C has already been in a released version of the product. The users of the product have experienced the functionality and they will expect the new screen to behave in the same way.

If no similar functionality is available in the application, I often look for other programs that have similar functionality. On Windows the notepad application is one of my favorite examples. Everybody has it and the functionality as not substantially changed for at least a decade. Of course the functionality your program has might not be in notepad. In those cases I often refer to programs like Microsoft Office, Outlook, Firefox or the Google home-page. Not because I think these are perfect programs, but because they're so ubiquitous that most users accept them as a reference point for the behavior they expose.

Summary


So a bug report should at least contain the following ingredients:

  1. What did you see?

  2. What did you expect to see?

  3. Why do you think that 2 is better than 1?


Now if everyone starts filing their bug reports like that, I will have to spend a lot less time on analyzing them and can get back to fixing those defects sooner. Who knows... maybe we'll make that deadline after all.

Sunday, March 2, 2008

Finding primes using a grid of browsers

It's been a few months since I talked about my idea of combining the power of a lot of browsers into a so-called computing grid. Since writing that post I've been working on such a browser based grid. Progress has been slow, but steady. And I'm learning a lot in the process, so I'm not complaining at all.

My current prototype grid software has a server-side and a client-side part. Tasks and results get distributed over clients as needed. It's actually pretty cool to see one node start a task and another node finish it. But one thing that my grid lacks is a data storage service. Until I add such a service (in a scalable way) I am stuck with a grid that can only handle tasks like finding prime numbers. But even from such a classic task, there is still a lot to be learned.

For example one of the problems we'll need to solve for a browser based grid is the fact that the nodes in the grid are not dedicated to being part of the grid. Whereas people that download SETI@home at least consciously decide to contribute their computer cycles to the project, for a browser based grid this decision is less explicit. Of course this is part of the beauty of a using the web for this grid: it becomes very easy for people to join the grid. But it also means that we can't do certain things. Like eat up all available CPU cycles...

So how do we find prime numbers without eating up too much CPU time? As you probably recall from school: a prime number is a number that can only be divided by 1 and by itself. So the simplest possible algorithm to determine if a given number is a prime is something like:


function IsPrimeNumber(number) {
for (var i=2; i < number-1; i++) {
if (number % i == 0) {
return false;
}
}
return true;
}

As you can see this is quite a literal translation of the definition of primes we all learned. And although there are plenty of optimizations you can do to this algorithm, we'll stick to to this very simple version for now.

What happens if we execute this function on a node? It'll work fine for small numbers. But once the number goes beyond a certain threshold, the loop will start eating up quite some time. And since executing JavaScript in the browser is unrestricted but single-threaded, this means the script will eat up all CPU cycles for a noticeable amount of time. As I said before, that is something that I simply don't want to happen on this grid.

So we need to modify our algorithm to not eat up all CPU time. We could optimize it by for example skipping all even numbers above 2. But that would only double the maximum number we can check with this function before the user starts to notice. What we need is something that will make this function work without noticeable effect on the CPU for any number.

My approach to this has been to split the task into many sub-tasks. So instead of having one loop like in the version above, we split it into multiple loops that each run over a smaller range. Let's say we have the following helper function:

function IsDividableBySomethingInRange(number, start, end);

This function will return true if number can be divided by one of the numbers in the range [start, end>. With this helper function, we can rewrite our original IsPrimeNumber function to use ranges.

function IsPrimeNumber(number) {
var CHUNK_SIZE = 1000;
for (var i=0; i < number; i += CHUNK_SIZE) {
if (IsDividableBySomethingInRange(number, Math.max(i, 2), Math.min(i + CHUNK_SIZE, number-1)));
return false;
}
}
return true;
}

Now if we leave the algorithm like this, we are still executing the work in one go. To actually remove that problem we need to convert the IsDividableBySomethingInRange calls into sub-tasks, which can be executed on the grid separately. The IsPrimeNumber function/task then just has to wait until all its sub-tasks have completed.

The splitting of a task into sub-tasks that can be executed separately and independent of each other is a typical fork operation that I learned back in my college days. We would fork off some work to separate sub-threads, so it could be completed in parallel instead of executing each one in turn. Waiting for the sub-tasks is called a join operation. By creating a fork/join algorithm we're not just making task execution more deterministic, we're also improving parallelism by allowing the sub-tasks to run independently from each other.

So what we really wanted to create all along is a fork/join version of our IsPrimeNumber function/task.

Let's say that we have a fork operation that we can call to somehow fork off a function into a separate sub-task. And let's say that whenever such a sub-task is completed, it will call back into the original task with the result:

class IsPrimeNumber {
var number;
var result;
function IsPrimeNumber(number) {
this.number = number;
this.result = true;
}
function forkSubTasks() {
var CHUNK_SIZE = 1000;
for (var i=0; i < number; i += CHUNK_SIZE) {
fork IsDividableBySomethingInRange(number, Math.max(i, 2), Math.min(i + CHUNK_SIZE, number-1)));
}
}
function joinSubTask(subtask) {
if (subtask.result == true) {
this.result = false;
}
}
}

This is a grid-enabled fork/join version of our IsPrimeNumber task. When we execute this task with a very large number on a grid of nodes, the task will be forked into sub-tasks and each of those sub-tasks can be executed on any of the nodes. When a sub-task is completed its result can be joined back into the IsPrimeNumber, which will assure that the combined result is correct.

IsPrimeNumber(2137)
forkSubTasks
|
|--------->IsDividableBySomethingInRange(2137, 2, 1000)
| |
|------------------+-------------->IsDividableBySomethingInRange(2137, 1000, 2000)
| | |
|------------------+---------------------+---------------->IsDividableBySomethingInRange(2137, 2000, 2136)
| | |
joinSubTask | | |
| | | |
|<-----------------+ | |
| | |
|<---------------------------------------+ |
| |
|<---------------------------------------------------------------+
|
true<--------+

If you know a bit about Google's Map/Reduce algorithm (or the open-source Hadoop implementation of it) you will probably see the similarities between join and reduce.

There is still many things that we can improve here (not swamping the grid with all sub-tasks at once, keep track of the number of missing sub-tasks, etc). But essentially we now have a grid-enabled way of determining whether any given number is a prime.

Saturday, February 2, 2008

Scrum: story points, ideal man days, real man weeks

My team completed its seventh sprint of a project. Once again all stories were accepted by the product owner.

While one of the team member was giving the sprint demo, I started looking in more detail at some of the numbers. Because with seven sprints behind us, we've gathered quite some data on the progress from sprint to sprint. That's the velocity, for XP practitioners.

Looking at the data



So far we've had 131 story points of functionality accepted by the product owner, so that's an average of 18-19 per sprint. The distribution has not really been all that stable though. Here's a chart showing the number of accepted story points per sprint:


Although it is a bit difficult to see the trend in sprint 1 to 5, we seemed to be going slightly upward. This is in line with what you'd expect in any project: as the team gets more used to the project and to each other, performance increases a bit.

The jump from sprint 5 to sprint 6 however is very clearly visible. This jump should come as no surprise when I tell you that our team was expanded from 3 developers to 5 developers in sprint 6 and 7. And as you can clearly see, those additional developers were contributing to the team velocity right from the start.

But how much does each developer contribute? To see that we divide the number of accepted story points per sprint by the number of developers in that sprint:

Apparently we've been pretty consistently been implementing 5 story points per developer per sprint. There was a slight drop in sprint 6, which is also fairly typical when you add more developers to a project. But overall you can say that our velocity per developer has been pretty stable.

Given this stability it suddenly becomes a simple (but still interesting) exercise to try and project when the project will be completed. All you need in addition to the data from the previous sprints, is an indication of the total estimate of all stories on the product backlog. We've been keeping track of that number too, so plotting both the work completed vs. the total scope gives the following chart:

So it looks like we indeed will be finished with the project after one more sprint. That is of course, if the product owner doesn't all of a sudden change the scope. Or we find out that our initial estimates for the remaining stories were way off. After all: it's an agile project, so anything can happen.

Story points vs. ideal man days vs. real man weeks



Whenever I talk about this "number of story points per developer per sprint" to people on other projects, they inevitably ask the same question. What is a story point? The correct Scrum answer would be that it doesn't matter what unit it is. It's a story point and we do about five story points per developer per sprint.

But of course there is a different unit behind the story points. When our team estimates its stories, we ask ourselves the question: if I were locked into a room with no phone or other disturbances and a perfect development setup, after how many days would I have this story completed? So a story point is a so-called "ideal man day".

From the results so far we can see that apparently this is a pretty stable way to estimate the work required. And stability is most important, way more important than for example absolute correctness.

A classic project manager might take the estimate of the team (in ideal man days) and divide that by 5 to get to the ideal man weeks. Then divide by the number of people in the team to get to the number of weeks it should take the team to complete the work. And of course they'll add some time to the plan for "overhead", being the benevolent leaders that they are. This will give them a "realistic" deadline for the project. A deadline that somehow is never made, much to the surprise and outrage of the classic project manager.

I'm just a Scrum master on the project. So I don't set deadlines. And I don't get to be outraged when we don't make the deadline. All I can do is study the numbers and see what it tells me. And what it tells me for the current project is that the number are pretty stable. And that's the way I like it.

But there is a bit more you can do with the numbers. If you know that the developers in the team estimate in "ideal man days", you can also determine how many ideal man days fit into a real week. For that you need to know the length of a sprint.

Our team has settled on a sprint length of four weeks. That's the end-to-end time between the sprints. So four weeks after the end of sprint 3, we are at the end of sprint 4. In those four weeks, we have two "slack days". One of those is for the acceptance test and demo. The other is for the retro and planning of the next sprint.

So there is two days of overhead per sprint. But there is a lot more overhead during the sprint, so in calculations that span multiple sprints I tend to think of those two days as part of the sprint.

So a sprint is simply four weeks. And in a sprint a developer on average completes 5 story points, which is just another way of saying 5 ideal man days. So in a real week there is 1.25 ideal man days!

I just hope that our managers don't read this post. Because their initial reaction will be: "What? What are you doing the rest of the time? Is there any way we can improve this number? Can't you people just work harder?"

Like I said before: I don't believe in that logic. It's classic utilization-focused project management. It suggests that you should try to have perfect estimates and account for all variables so that you can come to a guaranteed delivery date. The problem with that is that it doesn't work! If there's anything that decades of software engineering management should have taught us, is that there are too many unknown factors to get any kind of certainty on the deadline. So until we get more control of those variables, I'd much rather have a stable velocity than a high utilization.

Sunday, January 20, 2008

A WinSCP replacement for the Mac

About half a year ago I wrote a piece on the inability to use my iMac as a web development machine. The reason was very simple: the lack of a utility on OSX with a feature set similar to WinSCP on Windows. It doesn't need to have all features of WinSCP, but at least being able to browse a remote SSH/SCP file system as if it is local and being able to edit remote files without being bothered too much by the latency and remote-ness are a must-have for me.

Apparently I'm not the only one looking for something to replace good old WinSCP on OSX, as this post is one of the most viewed and most commented on this site. In the comments visitors suggested many replacements: Cyberduck, Fugu, Smultron, Filezilla and Transmit. I tried all of them.

But none of these tools seem to do what I want, as -until last week- I still found myself at my XP laptop whenever I wanted to work on a site. Until last week? Yes, because about a week ago my trial license of Transmit ran out. And although it's no WinSCP replacement, Transmit is a decent SCP program. So I went to the vendor website to see what it would cost. On the website I ran into their "One-window web development" program, called Coda. After all the tries one more couldn't hurt, so I decided to give it a go.



After installing Coda you have to get used to the fact that it is not a WinSCP clone. So it doesn't have a dual pane Norton Commander like interface, but instead just has the list of remote files on the left hand side. And when you open one of those files -instead of getting the WinSCP editor popup- Coda opens the files in a large panel on the right hand side. Open a second file and it adds a tab on that panel, where WinSCP would open a second popup window.



Looking at it like that, it seems like Coda tries to mimic an IDE more than anything else. But then an IDE for remote web site development, which is exactly what I use it for. I'm now about half-way through my trial period for Coda and I must say I love it so far.

The file transfers are incredibly fast, way faster than with WinSCP it seems. They're also handled more gracefully. With WinSCP file transfers are handled in the main window, which is typically covered by my editing popup. So I have to alt-tab to the main window to see whether the transfer is complete. With Coda the file transfer is indicated by a throbber animation on the file listing on the left. When the transfer is complete, a brief notification animation is shown in the top right.

The editor is decent enough for my needs. It highlights most of the languages you typically encounter during web development and it follows most standard editing conventions. But it could do with some more keyboard shortcuts for faster editing and especially navigation. It also would be nice if it could integrate with external or online help files for supported languages. There's tons of great documentation out there for Java, JavaScript, HTML, CSS, etc. Unfortunately Coda limits its built in help system to the help that comes with the installation. So for anything beyond that, I still need to tab to my browser and search for the help there. This seriously breaks their "one-window web development" mantra and should be addressed if they want to stick to that claim.

That said: you might notice that the editor has lots of features that the internal WinSCP editor doesn't have, so I shouldn't be complaining. Syntax highlighting is great of course, as are other features like the overview of functions in the current file. The fact that I'm asking for features that you'd normally find in real IDE's and not in WinSCP are actually a big compliment to the Panic people: Coda really feels like a web development IDE!

The biggest downside of Coda I've found so far is the price tag. Like pretty much all software on the Mac it's commercial. Panic decided to set the price tag to $79. Which definitely doesn't make it an impulse buy for me. So I'll evaluate it a bit longer and see what happens when my trial runs out. I'll probably go back to my XP laptop and WinSCP. But if I then find myself missing Coda features, I'll give in and buy it. After all... that would mean it's better suited to my WinSCP. Who would have ever thought that would be possible?