Sunday, December 30, 2007

Grid computing - using web browsers

Grid computing is one of those areas that seems to have a magic appeal to software developers. There is something very attractive about taking some relatively simple computers and wielding their combined power to perform seemingly infinitely large computing tasks within reasonable times.


I've always been attracted to grids too. But like many developers, I thought this type of power was not within my reach. Only since Google started documenting the "cloud of commodity PCs" behind their vast computing power has it started to seem feasible for even merely "large" companies to have their own computing cloud.


But my problem remains the same. I don't work for Google, Yahoo or IBM and I'm not a large company myself. So I don't have access to a set of commodity PCs that I can combine into a grid. So for years I decided that I'd never get a chance to work on a grid, unless I went to work for one of those big boys.

Recently I've been thinking about an alternate setup for a grid computer, more along the lines of the SETI@Home project and all its successors. Those programs all allow home PCs of users all over the world to take part in a giant computer network - a global grid in essence. So the people creating these programs get a lot of computing power, yet they don't have to manage the hardware. A powerful setup.

But such setups already exist. And they have one downside that keeps them from even more mainstream adoption: they require the user to install software to put their computer into the grid. And although the threshold isn't very high, it's still too high for many people. So a lot of potential computing power is not used, because the barrier of installing software is too high.

Now that got me thinking: is there an existing platform on modern PCs that we can just embed our own grid applications in? Preferably a platform that's been around for a few years, so all its quirks are known. And it would be nice if the platform comes with built-in internet connectivity.

Here's the idea that popped into my head: web browsers! They used to be nothing more than HTML viewers, but those days are long gone. Nowadays our browsers are hosting more and more complete applications, like GMail, Popfly and Yahoo Pipes. These applications prove that there is a lot of computing power in the web browser. Is it possible to use the web browsers that people have open on their PCs all the time and turn those into nodes in the grid?

It is a very simple concept: every browser that has a certain URL open is a node in the grid. For a computer to join the grid, its user just surfs to the URL. To leave the grid again, they navigate away from the URL. It doesn't get much easier than that, right? No software to install, just a page you have to visit. Put it in your favorites in the office, open it every morning when you boot your PC and that's one more node in the grid. Even within my own limited reach, I know of at least 5 machines that I could "grid enable" in this way. Those are all PCs and Macs that are on for a large part of the day, just waiting for me or my family to use them. Or that's what they used to be... now I can't stop thinking about them as nodes in my "web based grid".
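Just to make the idea concrete, here's a rough sketch of what the script on such a page could look like. The /task and /result URLs and the JSON task format are made up for illustration, and browser differences, security and error handling are conveniently ignored:

function workOnGrid() {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/task", true);
    xhr.onreadystatechange = function() {
        if (xhr.readyState == 4 && xhr.status == 200) {
            // the server hands us a task, e.g. {"id": 42, "numbers": [1, 2, 3]}
            var task = eval("(" + xhr.responseText + ")");
            var result = compute(task);
            // report the result back, then ask for the next task
            var post = new XMLHttpRequest();
            post.open("POST", "/result?id=" + task.id, true);
            post.send(String(result));
            setTimeout(workOnGrid, 1000);
        }
    };
    xhr.send(null);
}

// stand-in for the real work: sum the numbers in the task
function compute(task) {
    var sum = 0;
    for (var i = 0; i < task.numbers.length; i++) {
        sum += task.numbers[i];
    }
    return sum;
}

Every open browser window running this loop is one node in the grid; close the window and the node is gone. That's the entire deployment model.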

If you're a software developer reading this, then your mind probably started wandering while reading the last few paragraphs. Is this possible? How would the nodes get their tasks? How would they report their results back? How would you manage the nodes in the grid? Where do you keep the data that is needed for/generated by the nodes? How do you handle XSS issues? Wouldn't the nodes quickly overload the server that manages them? The list of challenges is seemingly endless and definitely too much for me to deal with in one go.

All I know is that ever since this idea popped into my head, I can't stop thinking about it. And for every problem, I can see at least a few potential solutions. I have no idea whether they'll work or which one is best, but the only way to figure that out is to actually start building the platform.

Oh man... I really need to make this my 20% project. Or more likely... I really need a lot of people to make this their 20% project. Help?

Saturday, December 22, 2007

The origin of the name Apache web server

I read a lot. Not much literature or novels unfortunately, as those seem to suffer under my more professional reading habits. I read lots of technical articles, white papers, blog posts and specifications. It's part of what I do to keep up to date with the things happening in the CS field. But in part I also read all kinds of stuff to gain a broader understanding of our profession.

Some of the longer things I read this year include "PPK on JavaScript" and "The No Asshole Rule", but also the venerable "Art and Science of Smalltalk". And some colleagues even caught me reading an OS9 AppleScript manual dated somewhere around 1999. They're still making fun of their discovery almost every day, but I don't mind... having read that manual has given me a better understanding of how the now much-heralded Apple engineers thought about making an end-user programming language almost a decade ago.

Recently I read the bulk of Roy Thomas Fielding's thesis Architectural Styles and the Design of Network-based Software Architectures, in which he introduces the principles of REST. As with any thesis it is a bit too abstract for my taste, but it did give me a better introduction to the background and theory behind REST.

Aside from that, I made one stunning discovery when I read about Fielding's involvement in the creation of the Apache HTTP server:

  • At the time, the most popular HTTP server (httpd) was the public domain software developed by Rob McCool at the National Center for Supercomputing Applications, University of Illinois, Urbana-Champaign (NCSA). However, development had stalled after Rob left NCSA in mid-1994, and many webmasters had developed their own extensions and bug fixes that were in need of a common distribution. A group of us created a mailing list for the purpose of coordinating our changes as "patches" to the original source. In the process, we created the Apache HTTP Server Project
Please read that last part again, and again... and again. Until it hits you where it finally hit me. What hit me? Well... I finally understood that the name of the Apache web server might (originally) have had nothing to do with the Apache tribe. The server was created by taking an existing code base and then applying all sorts of patches. So in a sense it was a patchy web server. A patchy... Apache...!

Brilliant! In all my years of knowing the Apache web server and the brand that was created around the Apache name, I never realized where it came from.

The Apache website itself has this to say about it:
  • The name 'Apache' was chosen from respect for the Native American Indian tribe of Apache, well-known for their superior skills in warfare strategy and their inexhaustible endurance. It also makes a cute pun on "a patchy web server" -- a server made from a series of patches -- but this was not its origin.
For the moment I'll take their word for it and accept that the name sounding like "a patchy web server" is pure coincidence. I bet it's also more convenient for them in selling the Apache brand: "we named our web server after its inexhaustible endurance" sounds a lot better than "we named our web server after the fact that it was created from a bunch of unrelated patches".

Saturday, November 17, 2007

Why DevX needs a better print function

I like to read technology articles during my daily commute. And since the train is too crowded for a laptop and I don't have an ebook reader (yet), I still print articles that seem interesting to read during the train ride.

A lot of web sites still have a Print button. What happens when you click that button differs from site to site, but it roughly falls into these categories:

  • Show all pages at once
    Many sites break articles into multiple pages. The print version of the article puts all of these pages together again, to allow them to be printed in one go.
  • Re-layout the site to print better
    Tables seem to be notoriously difficult to print. That's why many sites revert to a table-less layout in their print version.
  • Remove navigation elements
    Global and local navigation elements are pretty useless on paper. So they're removed from the print layout.
  • Images - click to see full size version
    Some graphics-intensive sites show images of reduced size in their normal view, showing the full version in a popup when you click some link. Since you can't click a link in the Print version, the full size images should always be shown there.
These are some things that I wish more sites would do:
  • Replace animated ads with text ads
    I don't mind showing ads next to good content. I do mind the ignorance of including animated ads in a print layout. I'm pretty sure no printer will deal with these in a useful way.
  • Use images that are more appropriate for B&W
    Most people still use B&W printers. So it would be nice if sites allowed the option of replacing their colored images with versions that are better suited to printing on a B&W printer.
    A common example of this is mostly-black screenshots, like those from command prompts/shell windows. When printed, these really eat through toner at high speed. It would be nice if a site would allow me to replace those images with ones that are mostly white, making my toner last longer.
That's a pretty long list. And most of these things can actually be accomplished on a website without needing a special print version of the articles. Hiding navigation elements, showing non-animated ads and other layout tricks in a print version can easily be accomplished using CSS media types. And why do most sites still use tables for their layouts? Just remove those tables and you have one less difference between the screen and the print version. And I also think it would make sense to show all content on a single page.
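For example, hiding navigation and animated ads on paper can be as simple as a print-only stylesheet (the element names here are made up for illustration):

<link rel="stylesheet" media="print" href="print.css">

/* print.css: hide what's useless on paper */
#navigation,
.animated-ad {
    display: none;
}

No separate print version needed; the browser applies these rules only when printing.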

So that actually leaves just one reason for having a Print button: showing full-sized images inline. And that finally brings us to the title of the article: the print function of DevX.

DevX is a nice development site that sometimes has very interesting content. And one of the reasons their content is good is that they usually include quite a lot of screenshots and diagrams. This just makes their articles so much easier to follow. On screen the articles show the images at a reduced size. Which makes sense, because the images are often full screen screenshots which would otherwise leave hardly any room for text.

But if you've ever printed an article from www.devx.com you've probably noticed their print versions still only show the images at a reduced size. They're not replaced by the full-resolution version. They're not printed in a larger box. They're not even added at the end of the article, like appendices. The images in the print version are exactly the same as in the screen version: reduced to sometimes a tenth of the original size.

So whenever I find an article in DevX that I want to read on the train, I start up Word and open the print version in there. Then I remove all tables, because they also don't print very well from Word. Then I go back to the browser and open each image, copy it to the clipboard, paste it in Word and then remove the useless downsized version.

And although I normally like the high volume of screenshots that DevX uses in their articles, this is actually a reason why I'd like them to use fewer screenshots and more text. Because this conversion to Word is not just a lot of mindless work; I sometimes forget to do it and print a DevX article as is. And by the time I realize what I've done, I'm already on the train. So I do my best and squint my eyes trying to read the text in there.

So there you have it: please DevX fix your @$@%&^# Print function.

Saturday, October 6, 2007

Viewing and editing Scrum project management data with Google Mashup Editor

Welcome to my first post on the Google Mashup Editor. In this article we'll create a tool for entering and storing data using Google's new mashup editor. Depending on available time, the evolution of Google Mashup Editor and the availability of alternative tools, I might improve on this basic data management application in later articles.

Scrum project management

The application we'll be creating is a Scrum project management tool. If you don't know Scrum yet, it's an agile project management framework. Please do yourself a huge favor and read this 90-page story about Scrum (pdf). It's a good and fun read and has already won over many organizations to at least give Scrum a try.

My reasons for wanting to create this type of application are many. One of them is that there seems to be no tool that satisfies my needs with the right price tag. XPlanner is good, but very basic. Mingle looks nice, but is too expensive and a real resource hog. ExtremePlanner also looks nice, but again: it seems a bit expensive for my taste. But one other reason is probably more important than the price issue: building this data model seems doable and gives me a chance to get to know Google Mashup Editor a bit more.

Google Mashup Editor

Mashup tools seem to be a dime a dozen these days. These tools try to take programming to the masses, allowing everyone to create complex web applications based on existing data or logic.

Yahoo was the first big player in this field, with their Yahoo Pipes. They're aiming for a visual programming environment where the user manipulates blocks rather than writing code. Microsoft followed suit with Popfly, an even richer mashup creation environment combined with what seems to be the next generation of their MSN Spaces platform.

Google was the last entrant into this field (if I recall correctly) and my first glance at their offering left me rather disappointed. No drag-and-drop programming, no cool default widgets, just a pretty basic text editor and some basic tags.

But if you look below the surface you can see that Google Mashup Editor (GME) is actually quite different from the other two. Where Yahoo and Microsoft just seem to focus on allowing you to read and combine data from various sources, Google also allows you to create new applications from scratch. In that respect GME is more of an application creation (and hosting) platform than a mashup editor.

Many of these additional possibilities seem to originate from Google's adoption of the Atom Publishing Protocol, exposed through the Google Data (GData) APIs. This API is what makes GME not only a mashup editor, but also a valid tool for creating completely standalone applications. These applications are then hosted on Google's servers, using Google's servers for data storage and the GME to create and update the applications. Some people might not like to put so much in the hands of Google. But it will certainly lower the bar for creating scalable web 2.0 applications.

That's enough of the background talk. Let's get to work on the application.

Initial data model

We'll start by defining the basic entities and relations in our application. We'll probably expand on these later, but we can get pretty far with just the following.

A project is something on which a team works in sprints to create a product or a release of a product. This is all intentionally very vague, as our application doesn't need to know the details of the projects it manages.

A project has a product owner and a scrum master. Aside from that there are other team members, but we'll leave them out of the equation for now.

A sprint is a time period during which the team implements certain stories. A sprint has a start date and end date and a description of the general goal of the sprint.

A story is a piece of functionality that the team creates. It has a name, a description of how to demonstrate it and an estimate of the effort it will take to create the functionality. Stories can be either user-focused or technical in nature.

All stories combined are called the product backlog. Stories from the product backlog are planned into sprints. So each project has one product backlog and some of the stories in this product backlog are planned into each sprint.

This all translates into the following very simple data model:
Let's see how we can translate this data model into GME.

Creating the project list in GME

The first step is to create a new project in GME. This will show you a nice empty application with just a pair of gm:page tags.

<gm:page title="Scrum Project Manager" authenticate="true">

</gm:page>

Everything for our application will be inside the gm:page tags. If you want your application to have multiple pages, just add some more files to it. But for this application a single page will do.

Getting data into GME consists of two steps: defining the data itself and defining the GUI for it. The data itself takes the form of a gm:list tag:

<gm:list id="Projects" data="${app}/Projects" template="projectList" />

The gm:list tag defines a list of data that is used in the application. In many applications the data will be pulled from an external -RSS or Atom- feed. But we want to store the data inside the application, right in Google's mashup servers.

The data of our project list is stored under the ${app} feed. This is a location (a "feed" in GME terms) where the data of all users of the application is stored. If we don't want to share the data between users, we can store it under ${user}, which is data that is kept per user. Currently there is no way to share data between some users (but not all users of the application), although this feature will probably be added in the future.
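For instance, a project list that is private to each user would use the same tag with just a different feed (the id here is made up):

<gm:list id="MyProjects" data="${user}/Projects" template="projectList" />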

To display the data in the list, the page needs a template. A template determines what fields to display and how to display them. It's easiest to use an HTML table, so we'll do that for now.

<gm:template id="projectList">
<table class="gm-table">
<thead><tr>
<td width="200">Name</td>
<td width="100">Product owner</td>
<td width="100">Scrum master</td>
<td width="45"> </td>
</tr></thead>
<tr repeat="true">
<td><gm:text ref="atom:title" hint="Project name"/></td>
<td><gm:text ref="gmd:productOwner"/></td>
<td><gm:text ref="gmd:scrumMaster"/></td>
<td><gm:editButtons/></td>
</tr>
<tfoot><tr>
<td colspan="4" align="right"><gm:create label="New project"/></td>
</tr></tfoot>
</table>
</gm:template>

As you can see we're mixing standard HTML tags with GME-specific tags like gm:text, gm:editButtons and gm:create. Also notice the non-HTML repeat attribute on the second tr (a normal HTML table row). This tells GME to repeat that tr for every item in the ${app}/Projects feed.

If we now compile and test this application, we get an empty table with a "New project" button. Pressing the button adds an empty row to the table, with fields to fill in the values for a project.
Note that editing and creation functionality come for free with GME. Although they're not very flexible, they allow you to quickly get started.
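By the way, my mental model of what GME stores for each row is a plain Atom entry with the custom fields in the gmd namespace, roughly like this (a sketch based on my understanding of GData, not necessarily GME's exact storage format):

<entry>
  <atom:title>Website relaunch</atom:title>
  <gmd:productOwner>John</gmd:productOwner>
  <gmd:scrumMaster>Mary</gmd:scrumMaster>
</entry>

That is also why the ref attributes in the template read atom:title and gmd:productOwner: they simply point at fields of such an entry.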

Creating the list of stories in GME

Next is a list of stories for a project. Since stories are always part of a project, we store the data under the feed of a project.

<h2>Stories for selected project</h2>
<gm:list id="Stories" data="${Projects}/Stories" template="storyList" />

This is where GME really adds a lot of logic automatically. The data location ${Projects}/Stories refers to a child feed of the Projects list we defined earlier. Each project in the Projects list will have its own list of Stories.

This list also needs a template to display it, which is really similar to the one for the projects.

<gm:template id="storyList">
<table class="gm-table">
<thead><tr>
<td width="200">Title</td>
<td width="75">Type</td>
<td width="25">Estimate</td>
<td width="100">How to demo</td>
<td width="45"></td>
</tr></thead>
<tr repeat="true">
<td><gm:text ref="atom:title" hint="Story title"/></td>
<td>
<gm:select ref="gmd:storyType">
<gm:option value="user" selected="true">User</gm:option>
<gm:option value="tech">Tech</gm:option>
</gm:select>
</td>
<td><gm:number ref="gmd:estimate"/></td>
<td><gm:text ref="gmd:howToDemo"/></td>
<td><gm:editButtons/></td>
</tr>
<tfoot><tr>
<td colspan="5" align="right"><gm:create label="New story"/></td>
</tr></tfoot>
</table>
</gm:template>

Now the only tricky bit we still need for the list of stories is that it needs to be refreshed when the user selects a different project. This is quite easy to do by setting an event handler.

<h2>Unplanned stories for selected project</h2>
<gm:list id="ProjectStories" data="${Projects}/Stories" template="storyList">
<gm:handleEvent src="Projects"/>
</gm:list>

This tells the story list to refresh itself when an event happens in the Projects list we defined before. So selecting a different project will display the stories for that project.

So after adding the story list and adding some projects and stories, our application looks like this:
We can easily do the same for the list of sprints for the project. Since this is really similar to the list of stories, I won't show the code here. If you want to have a look at the code, look at the finished project on http://scrummer.googlemashups.com.

Last is the list of stories for the selected sprint. Note that stories can either be part of the project or part of the sprint. So for now we'll call the first type "unplanned stories". Later we'll want to share the stories between the project and the sprints.

Since the list of stories is -again- really similar to the list of unplanned stories, we won't show the code here. But when we now run our mashup it looks like this:
At the bottom you can see that I am entering a story. This is almost a usable application, at least for entering and browsing the data. To make it something you'd really want your entire team to use for your daily managing of Scrum projects, it would require more work.

That's it for now. If you want to have a look at the finished code or play with the application, go to http://scrummer.googlemashups.com.

Saturday, August 25, 2007

Online burndown chart generator

One of the aspects of Scrum is its focus on transparency - getting all information out in the open. And one of the areas that enables the transparency is the burndown chart. It's a public posting of the progress of the team throughout its current sprint.

On the horizontal axis you see the days of this sprint. The vertical axis describes the amount of work. At the top of the vertical axis is the number of "ideal man hours" we committed to for this sprint. The straight diagonal line is the "ideal burndown" that we're aiming for. The slightly less straight line is our actual burndown. As you can see this chart is from somewhere during the third week of our four-week sprint and we're slightly above target. But things don't look as desperate as a few days before, thanks to some colleagues getting back from their holidays (which is what it says in the small scribbling that you probably can't read).

As a Scrum master I like to post this information as publicly as I can. So just having it on the wall of our team room isn't good enough, since there are many people that don't visit our team room. Ideally I'd like to have the burndown chart projected on a wall in the central hallway of our office, so everyone can see it first thing when they come in in the morning. But as a nice step along the way to this, I chose to publish the chart (and the rest of our product backlog) on our project wiki.

In the first sprints I did this by taking a photograph of the burndown chart every morning, right after updating it. I'd then upload the photo to our wiki. The only problem is... uploading them every day turned out to be too much of a hassle. So the wiki actually only got updated once a week. And that's not good for transparency of course.

So this time around we went searching for a simple tool that would lower the threshold of updating the burndown chart on our wiki. We searched for an extension to MediaWiki that allows you to create a chart by just entering the numbers in your wiki text. That turned out to be quite a challenge. There are many charting and drawing extensions for MediaWiki, but they either didn't do what I wanted or we couldn't get them to work on our wiki.

In the end I just gave up and wrote a simple web page that -when fed with the right parameters- will return a PNG image of the burndown chart. You call the page like this:

  • burndown.jsp?days=1,2,3,6,7,8,9,10,13,14&work=200,170,165,150,125,95
And the page will return the following image:
So the days parameter indicates the day numbers shown on the bottom. I entered all of them for the entire sprint right away. The work parameter is the work remaining. I just entered the values that I know, which is why the green line stops halfway through.

The generated chart is really simple and not very pretty. But it is very easy to keep up to date and that's what counts most. I just add the remaining hours at the end of the URL every morning... and that's it.
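For example, if tomorrow morning the team has 90 hours left, I just append that number to the work parameter and the URL becomes:

  • burndown.jsp?days=1,2,3,6,7,8,9,10,13,14&work=200,170,165,150,125,95,90

(The 90 is made up here of course; it's whatever the physical chart says that day.)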

Although I consider this generator a stopgap solution until I find something better, I imagine it might also be useful to other budding Scrum masters. For that reason I've put the page online for public use at http://apps.vanpuffelen.net/charts/burndown.jsp. Just click the link and you'll get some usage examples.

Let me know if this generator is useful to you in the comments section. Also let me know if there's something wrong with it and I'll do my best to fix it.

Update (January 1st, 2010): in my company we've created a custom version of this same tool and used it in many projects over the last few years. This public burndown generator has drawn over 60,000 charts in 2009 alone, so apparently we're not the only ones who use burndown charts. That's why I've now updated the tool with the best features that we've added over time at my company. Check the latest version on http://apps.vanpuffelen.net/charts/burndown.jsp for all the features and let me know what you think of them.

Sunday, August 19, 2007

Scrum: utilization vs. velocity

At work we've recently started using Scrum for running some projects. As expected we need to slowly learn the lessons. One of the things we've been having a lot of discussion about recently is the meaning of the focus factor. Let me begin by explaining what a focus factor is, at least in my company.

To determine how much work you can do in a sprint, you need to estimate the top stories. We estimate these stories in "ideal man days" using planning poker. This means that each developer answers the question: if we lock you into a room each day without any distractions, after how many days would you have this story finished?

After these estimates we determine people's availability for the project. After all, they might also be assigned to other projects, if only for part of their time. Even people that have no other projects tend to have other activities. Like answering questions from customer support or consultants, department meetings, company-wide meetings, job interviews with candidates or just playing a game of foosball, table tennis or bowling on the Wii. So basically nobody is available to a project 100% of the time. At most it's 80% - 90% and on average it seems to be about 60% - 70%.

So the first stab at determining how much work someone can complete is:

  • available hours = contract hours * availability
But when you're working on the project, you're not going to always be contributing towards the goals that you've picked up. Within Scrum there is the daily Scrum meeting. It lasts no more than 15 minutes, but those are minutes that nobody in the team is working towards the goal. And after the meeting a few team members always stick around to discuss some problem further. Such time is very well spent, but it probably wasn't included in the original estimate. So it doesn't bring the "remaining hours" down very much. I see all this meeting, discussion, coaching and tutoring as necessary work. But work that doesn't bring the team much closer to the goal of the sprint. I used to call this overhead, but that sounded like we were generating waste. So, taking a cue from the agile world, I switched to using the term focus factor. So now we have:
  • velocity = contract hours * availability * focus factor
So the speed at which we get things done (velocity) is the time we're working, minus the time we lose to non-project work, minus the time we lose on work that doesn't immediately get us closer to the goal. In the past I probably would have included a few more factors in there, but in an agile world this is already accurate enough to get a decent indication of how long it will take us to get something done.
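To make that concrete with some made-up numbers: a developer with a 40-hour contract, 70% availability and a focus factor of 0.7 has a velocity of 40 * 0.7 * 0.7 = about 20 ideal hours per week. So a story that we estimated at 10 ideal man days (80 ideal hours) will take that developer about four weeks of calendar time, no matter what the contract says.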

If there's one thing I've learned from the agile movement and Scrum it's to focus on "when will it be done" instead of "how much time will it take". So to focus on velocity instead of utilization.

Utilization is the territory of classic project management. It's trying to make sure that every hour of every employee is fully accounted for. So if they're programming, they should have a time-writing slot for programming; if they're meeting, there's a slot for meeting; if they're reviewing designs, there's a slot for that and if they're drinking coffee or playing the Wii... you get the picture. Of course no project manager wants quite that level of detail. But in general they are focused on what you're spending your time on.

Agile thinkers see this really differently. They say: it doesn't really matter how much time you spend, what matters is when it is done. This sounds contradictory so let's see if a small example can make it clearer what I'm trying to say.

If I tell my boss that some feature he wants will be done at the end of next week, he is interested in only one thing: that it is done next week. If we get it done on time, he doesn't care whether I spent two hours per day on it or whether it was twelve hours per day. I care about it of course, because I don't want to work late every night. And there's also a limit to the amount of gaming I like to do during a day, so two hours per day will leave me bored quickly. But to my boss, all that matters is when I deliver, not how much effort it took.

This is why the focus for Scrum projects is on velocity and not on utilization. So in Scrum you want to know how many hours you still need to spend on a job, not how many you've already spent on it. A classic project manager might be really proud that you worked late all week and clocked in 50+ hours. An agile project manager will note that you reduced the "hours remaining" by 10 hours and nothing more. If you're looking for compliments on all your hard work, then Scrum might not be for you.


Saturday, August 11, 2007

Will wireless work?

Friends often call me a geek. They mean no offense, so I try to take none. As Chris Pirillo once put it: "geek used to be a four letter word, now it's a six figure one". Well... that last part isn't exactly true for me, but that's probably only so because I get paid in euros instead of dollars.

Part of what makes me a geek is the fact that I tend to be an early adopter of new technologies. I got my first always-on internet connection in 1996, paying somewhere around 40 euros for a speed that never seemed to top 1.5 kbps. Yes, that's kbps as in kilobits per second - so about 40 times slower than an analog modem. But it was always on... so I was one of those people that knew they had email a few seconds after the other party had sent it.


I bought an XDA in 2001, which was the first touch-screen phone/PDA with an internet connection. It was what you'd call an iPhone these days, although it had to do with a lot less marketing. So for me: six years have brought us better marketing and a multi-touch screen. Still... I'll probably buy an iPhone when they're actually available here. Not because I need it. Just because I'm an early adopter.

As an early adopter you of course run the risk of buying things that will never catch on. Or becoming the involuntary beta tester of a device, which means the technology is not yet ready for prime time. One area where the latter happened to me is with wireless networking.
I bought my first wireless access point and card somewhere in 2001. It wasn't completely new back then, but it hadn't been adopted by the masses yet. So usability and interoperability left something to be desired. USB wasn't as ubiquitous as it is now, so for a desktop PC I had to use a PCI to PCMCIA (now called PC-card and almost extinct) adapter. But hey... it worked... at times. But about half of the time it didn't work and I had to roll out a UTP cable again. I've tried to get a reliable wireless network over the years, but the devices either didn't work reliably or just broke down within a few months of service.

Somewhere in 2004 I just gave up on it and restored the cables to their full and permanent glory. So while whole tribes, states and even countries started using wireless networking, my house was completely wired. It's been like that for years now; first with really long UTP cables running down the hallways. They might not be pretty, but at least they work most of the time.




Two years ago I started using ethernet-over-powerline adapters, which use the powerline wiring in my house as a network. These adapters turned out to be as reliable as using direct UTP cables. So in my study I connect the ADSL modem/router to the powerline through one of these adapters. And in other rooms I connect computers to the powerline through another adapter. And it just works.

The adapters are ridiculously expensive and not very rugged, but they do give me the true plug-and-play experience that I never got with wifi. And even though having blue adapters in many wall sockets is not very pretty, it's a lot better than all those colorful UTP cables lining the floor. So -although expensive- I was pretty happy with it. Most people may prefer wifi, but I've been sticking with ethernet-over-powerline.


A few weeks ago I saw a new device from Devolo: a wireless extender. So you plug this adapter into a socket and not only can you plug in a UTP cable, but it also provides wireless networking. So once again I couldn't resist and ordered one. If the whole world is using wireless without problems, I can't stay behind - can I?

Friday, August 3, 2007

Both, either or any

I recently was adding functionality to a part of our product that was full of lines like this:

    lBool_vert_in = this.IsBoathBitsSet(iArrAllow[key], iArrDeny[key],
                                        top.mTDSDefines_CheckInAction);
At first I was just annoyed by the obvious typo in there. I'm sure a boath is something (maybe a very expensive boat) but it has no place in our code. Making a typing error is not strange, it's actually quite normal. But not correcting it when you're copy/pasting it at least a few dozen times is a kind of laziness that I don't like very much.

Later I also needed to know what the function does. At first I thought the name was actually pretty self-describing. But I couldn't explain the behavior I was seeing in a part of the code that invoked this function.
    this.IsBoathBitsSet = function(iBit1Value, iBit2Value, iDefineType)
    {
        return (
            (
                (iBit1Value & iDefineType) ||
                (iBit2Value & iDefineType)
            ) == iDefineType )
    }
Now that strikes me as odd... the function name suggests that both bits need to be set. This would normally translate into an && and not into an || like in this code. So instead of just a simple typo, this function name is actually plain wrong. And it also immediately explained the behavior I was getting.

I was thinking of renaming the function. But what should I rename it to? At first I was thinking IsEitherBitSet, since it returns true if either bit is set. But that is also not entirely correct, since it also returns true if both bits are set.

So maybe it should be called IsAnyBitSet. What do you think? How do you translate boolean operators into English?
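For what it's worth, here's a sketch of the two honestly-named variants as I would write them (the names are my suggestions, not what's in our code base):

    this.IsBothBitsSet = function(iBit1Value, iBit2Value, iDefineType)
    {
        // true only when all bits of iDefineType are set in both values
        return ((iBit1Value & iDefineType) == iDefineType) &&
               ((iBit2Value & iDefineType) == iDefineType);
    }

    this.IsAnyBitSet = function(iBit1Value, iBit2Value, iDefineType)
    {
        // true when at least one value has all bits of iDefineType set
        return ((iBit1Value & iDefineType) == iDefineType) ||
               ((iBit2Value & iDefineType) == iDefineType);
    }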

Sunday, July 29, 2007

Why is there no WinSCP for the Mac?

Ever since Mac OSX came out, I've understood that the Mac is the dream machine for software developers. It's a UNIX-like system with a great GUI built on top of it. But how come, then, that even after owning an iMac for a few months, I find myself hardly doing any development work on it?

I try to do my personal development work on the iMac. And recently I even found it quite nice for developing a Java application that's been in my head for a few years now. But most of the development work I do at home is not Java work, it's web development work. And for that I still find myself behind my trusty old Windows XP laptop.

The reason for that is that I do most work through the remote editing feature of WinSCP. For those of you that don't know that program: it provides a Norton Commander style (dual pane) interface, with one pane being the local system and the other a remote system. Like a remote web server. And with a single key press you can open any remote file in a very simple local text editor, make a few changes and then save the file back. Their embedded editor might not be the most feature rich IDE, but it just works, it's there when I need it and apparently it's good enough for me to get the job done with.

But then why doesn't a WinSCP exist for the Mac? I know it's called WinSCP, so it's for Windows. But the Mac must have something similar, right? Well, similar yes... On the Mac there is Fugu, which at first sight seems similar. But it's not at all the same for my use case.

Fugu doesn't come with a built-in editor and to me that makes a lot of difference. They chose to integrate with existing editors instead. Which wouldn't be a problem, if they'd actually integrate with the standard OSX editor: TextEdit. But they don't. Instead they offer a whole list of better and lesser known editors, from BBEdit via TextMate to VI and emacs.

Unfortunately I don't have about 75% of the editors that Fugu does integrate with. TextMate sounds great, but I haven't bought it yet, as I'll need to invest some time to learn to appreciate it. And for the editors that I do have and that Fugu supports, the integration doesn't work. Now if it doesn't work for those editors, do I want to risk installing yet another editor on my system to find out if the integration works there? Well, apparently I don't.

While typing this I already noticed that there are of course more options than just Fugu. There's Cyberduck, apparently Krusader also is nice and Disk Order sounds perfect if it would support SCP/SFTP. So I have some options ahead of me. But until I invest the time and find something that works at least as well as WinSCP, my development work on the Mac is limited to Java applications.

--- January 21, 2008 - Frank van Puffelen ---

I might have finally found my WinSCP replacement. Read more about it here.

Tuesday, July 24, 2007

Don't mistake the means for the goal


Last weekend my wife was balancing her checkbook. Not that we have a problem making ends meet, but after an embarrassing situation at an ATM she really wanted to figure out why these things sometimes happen to her. I agreed to help her, as long as she would do the actual work herself and draw her own conclusions.

As it quickly turned out, she had no idea where her salary was going. So I suggested she categorize her spending based on what she could find in the tele-banking application. She went to work on it. I knew that something was going wrong when I entered her room three hours later and found her still busy categorizing.

It turned out that she had found this online checkbook application, which could show all kinds of charts based on your spending. All you had to do was upload a dump of the tele-banking information and categorize it using their "super friendly" web interface. Three hours later and she still wasn't done categorizing just a few months' worth of spending.

I asked my wife whether this was really worth the time. After all, categorizing spending was not the goal. It was just supposed to be a means to quickly figure out where her money was going. She assured me that it was worth it; she was almost done and then she would know.

An hour and a half later she was indeed done and proudly called me in. "See. I spent this much on our Holidays. And that much on gas." Very interesting information I'm sure, but not what I was interested in. "I'm pretty sure I also paid part of those Holidays", I said. "Does this mean that I didn't pay my fair share?" She started clicking on the charts frantically. "It must be in there?" She couldn't find it and had to go back to the tele-banking application to look it up through a quick search.

Forty-five minutes later we had pretty much figured out where her money was going. And although the online checkbook gave some nice charts, we frequently had to go back to the "source" (the tele-banking application) for additional details. The online checkbook seemed like a nice, quick way of doing the categorizing. But when it turned out it wasn't very fast, she should have stopped entering data into it. Categorizing the spending was just a means to figure out what type of things her money was being spent on; it wasn't the goal.

It turned out that I indeed didn't pay my fair share of the Holidays and that she was paying for some insurance policies that really should come from our shared account. With that and a solid resolution to determine where all the cash withdrawals are going, she's pretty sure that those embarrassing ATM incidents should not occur anymore. Or at least not too frequently...

Tuesday, July 17, 2007

Using the iMac for music playback

Now that we've had an iMac in our living room for a few months, we've started using it for many things. One of the most obvious ones is that -after spending almost a month ripping our CD collection- we now use it to play our music. Sure, we already had a mediacenter in the living room. But by using the Mac, we only have to switch on one device instead of two.

But something that really annoys me is all these small sounds that get added to the music. Just last night we were listening to the new Crowded House album and right at the end of the song Silent Trees I heard a woosh sound that was not supposed to be there. And at the start of the next song, there was a tring-like sound that I did not recall hearing before.

So here is one simple request to my family, friends and co-workers: can you please stop signing in and out of iChat while the music is playing?

Thursday, July 12, 2007

My first game of planning poker

Yesterday I took part in the first sprint planning meeting of my life. We have started up a new development project and we decided to use Scrum as the process. The project is actually quite small, so we have just two developers (myself included) and a product owner for it.

The product owner had prepared nicely and had a quite extensive product backlog. He had even filled in a "how to demo" field for a lot of the stories, which I'm not sure he's supposed to do before the sprint planning. In any case, it wasn't very handy to have the "how to demo" already in place, as it makes it harder to discuss alternative solutions for the same functionality.

After the product owner had explained each story, we were to come up with an estimate of how much work (in ideal man days/story points) it would be to implement the story. I have done many of these estimation sessions before, but this time we decided to play a game of planning poker. Being the good scrum master that I am, I had brought two packs of (rather improvised) planning poker cards.

The other developer and I talked through the story, determining what it would take. We were basically already breaking the story down in tasks, which was a nice head start for the actual breaking down we planned to do later. After agreeing on the tasks, we would go into our poker deck and select the card matching our estimate. When we both had selected a card, we'd pull it out of the deck at the same time - revealing our estimate.

Now I must admit that I wasn't too impressed with the transparency that this estimating method brought. I guess -just as with real poker- you shouldn't play with just two players. There was actually only one story where we seemed to have a big difference in estimate: 8 vs 13 points. But as it turns out, our decks just didn't have any numbers in between 8 and 13. We had both wanted to select a 10, but since that wasn't there we just had to pick something slightly higher or lower. Being the planning pessimist that I am, I of course picked the 13. :-)

So there you have it: I played the game of planning poker. It wasn't anything special or extremely different from the ways I've done estimations before. But I guess that contrary to popular belief, being extremely different is not what Scrum is about. What is it about then, you ask? I'll let you know when I find out. Because if I answered that question now, I'd just be repeating the Scrum/Schwaber mantra.

Wednesday, July 11, 2007

a List vs. IList

One of the advantages of being a programming language polyglot is that you get to see the difference of how people work in all those languages. Of course it might take a while before you're comfortable enough with a language to be able to appreciate it, yet step back far enough to see the patterns of people (including yourself) using that language. But once you do, it takes your understanding of programming languages to a whole new level.

That's all very nice and abstract, but I recently encountered a very practical example of the difference between the C# collection classes and the Java collection classes. While reviewing some of the code of a new product, I noticed that a lot of methods in the API were accepting and returning List objects.

    List filterInstructions(List instructions);

Now without reading further, tell me: which language is this, C# or Java? Don't worry, I'll wait while you think about it...





Ok, that wasn't very nice of me. You can't really tell which language it is, since the syntax is valid in both.

But there is a huge difference in what it actually means in both languages. In C# List is a concrete implementation of the IList interface. In Java List is an interface, which is implemented by classes like LinkedList and ArrayList. Do you notice the subtle difference? In C# List is a concrete implementation, in Java it is an interface. So the above code sample in C# would accept only instances of the List class (or a subclass), while the Java implementation would accept any implementation of its List interface.

Now I don't want to start a debate on whether classes should accept or return interfaces or concrete classes. There are plenty of good discussions out there. What I want to do is show what might cause the difference in what I often see in Java code and what I see in C# code.

The example above was from a C# class in one of our products. So the developer chose to expose a concrete List implementation, instead of the IList interface. I noticed this and -coming from a Java background- wondered whether our Java developers would normally expose e.g. an ArrayList in Java. They wouldn't. They would expose the List interface instead. So I asked the C# developer why he chose to expose the concrete class, rather than the interface. He didn't have a specific reason, he just did what was easiest: expose the List class.

Keep in mind that a good developer is a lazy developer. So actually this developer was taking the right approach (be lazy) yet he got a different result than what I'm used to seeing in Java code. But then why wouldn't a Java developer expose his ArrayList, but instead choose to expose the List interface?

Well... I know one possible reason. Both of them are exposing the same thing: a list. And actually they are both saying exactly that in their contract: we're accepting and returning a list. It's just that when you literally say that in C# ("I am exposing a List") it translates into a concrete class, while if you say the same thing in Java ("I am exposing a List") it translates into an interface.

It's really all a matter of understanding how a developer thinks and how your choices in class library design influence that thinking. In this case it is very simple: Microsoft seems to put the List class first in your mind, while Sun puts the List interface first. A small detail to many, but I found it an interesting difference between the platforms. What a difference an I makes. :-)

Saturday, July 7, 2007

Background compilation vs background unit test runner

A few months ago I posted about how to get background compilation in Visual Studio by using Resharper. It's really remarkable how background tools like these can improve your productivity. Sure, you'll have to learn to ignore the red wiggly lines until you're done typing. In that respect it is no different than the spell checking in Word or Firefox. But once you know when to check the results and when to ignore them, background compilation with "in code" feedback is a real time saver. I probably would have worn out my ctrl-shift-B combo long ago without it.


This week I noticed that even though compilation is automatic, you still have to start unit tests manually. And once they're running, you have to wait for them to complete before you can continue working on your code. If you have a substantial set of tests, like the good test-driven developer, running them may take a few minutes. So running the tests is quite disruptive to the development process.

And as you probably know, if something breaks the flow of your development process you probably won't do it as often as might be useful. Why are continuous build systems so good? Because they work automatically when you check something into your version control system. Why is background compilation so good? Because it works automatically as you're making changes to the code.

So why can't we then have the unit tests running in the background? Sure, it'll eat up some resources. But it's not like my machine needs all its megahertz to keep up with my typing anyway. And I can only imagine how great it would be to automatically see a purple line appear under some code I just changed that broke one of the unit tests.

Sounds like cool stuff to me. Does anyone know if this already exists?

Friday, July 6, 2007

Darth Vader's sword

I just wondered about something yesterday:

Jedi knights use light sabers since they are on the good side.
But shouldn't Darth Vader then be using a dark saber?

I know, it has nothing to do with technology - except for being extremely geeky. But sometimes these questions just pop into your mind.

Saturday, June 30, 2007

Pinball machines

Yesterday evening I went with my brother to pick up a pinball machine he had bought. He already has one in his home: a worn down 1975 Bally Kick Off machine.

It's still fun to play at times, but it could use some restoration. So... my brother decided to buy a second machine. After some searching on the second hand sales sites he finally bought a 1990 Williams Funhouse.
It's a pretty well known machine, since it's very colorful and has a moving head in it.

While talking to the seller I once again realized that pinball is really dying. Twenty years ago we'd have at least four or five arcades nearby where we lived. And each arcade would have somewhere between 10 and 15 pinball machines.
The biggest arcade had at least 50 of them. That was so much fun. More fun than all the other video games they had, even though those drew the bigger crowds.

But with video games moving into the living room, the arcades were not making enough money anymore. So these days most of them have closed down. And the ones that are still there have switched to only having slot machines.

Which isn't exactly the same thing as a pinball. Especially not when you try to nudge them a bit, which is really frowned upon by the arcade owners.

So with no arcades to host the machines, vendors shut down one after the other. Today there is only one big vendor left: Stern. They're a relatively new company, so not a survivor from the old days. But still, Stern makes pretty well-designed pinball machines based around things like movies. They lack some of the character and originality of the older machines, but they're real pinballs and as long as Stern is the only company still making pinball machines I won't be complaining too much.

But there are also a lot of pinball aficionados out there, keeping the hobby alive. They buy old machines, restore them with a lot of skill, time and dedication and then send or sell them off to someone else - hoping they too will pick up the love for pinball. In this case the receiver was my brother, who doesn't need to pick up the love. He likes pinballs almost as much as I do. But he does have two little kids (ages 5 and 7) that still need to learn to appreciate it.

One thing I didn't really realize until yesterday was how heavy pinball machines are. They actually need to be quite heavy to withstand all the kicking, shoving and nudging that used to go on at arcades. But nudging a 140 kilo machine isn't exactly the same as trying to lift it into a car. My arms are still hurting....

Tuesday, June 26, 2007

It's all about the data, not about the code

As I've probably told before, I administer (and contribute to) a few photo blogs. I wrote the software for the blogs a few years ago, so most of the time I spend on it nowadays goes into small tweaks to the code. Adding a feature here and there. Nothing big, but enough to keep the editors happy and myself busy.

But recently I noticed that I've been putting off adding one of the requested features. And I wondered why. The feature in itself isn't very spectacular. The blogs work with a scheduling system, where all editors can see everyone else's posts. Since this is the holiday season, editors are sometimes planning their posts weeks ahead of time. And this can get confusing to the editors, because it sometimes isn't very clear anymore where we are today. And something that also slightly bothered me is that some of our editors like to read posts from the administrative interface before they appear on the site. We even get answers to some of our "guess what" photos before they're available on the public site.

It is time to do something about this. Like I said, it is pretty simple: don't allow the editors to see each other's future posts in the administrative interface. So they should still see each other's "old" posts, which is a great way to quickly find something you want to link to. But into the future, they should only be able to see their own posts.

The heart of the administrative interface is a table with a row for each post. It's very basic and the code is like:

    for each post
        write post to grid
Any half-decent programmer will know how to add the feature:
    for each post
        if (postdate <= now || currentUser == author)
            write post to grid
But this is where it becomes problematic. The editors can sometimes also post on behalf of other (guest) authors. And of course, they should be able to see the posts that they created on behalf of guest authors. So the condition should be more like:
    for each post
        if (postdate <= now || currentUser == editor)
            write post to grid
In here we added the concept of an editor: the person who created the post in the system. And this is not necessarily the same as the author: the person who created the content of the post.

This would indeed very easily implement the feature. One additional if statement and I'm done.

There is only one problem with it: I don't keep track of the editor of the post!

The information was never needed, so it has never been recorded. That means that I have about a year and a half of posts for which I don't know who the editor is. Which means that I have to figure out a way to either manually gather that data now (something I don't look forward to), add an exception for the cases where the data isn't available (resulting in uglier code) or somehow programmatically extract the information from the data that we already have. That last option sounds like the least manual work (both now and in later maintenance). But it means that I'll have to write code that touches all 500+ posts we have in the system.
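If I do go the programmatic route, the sketch in my head looks something like this - in the same pseudocode as above, and with the big assumption that for old posts the author is the best default guess for the editor:

    for each post
        if (post has no editor)
            post.editor = post.author
        save post

Of course that assumption is exactly what makes me nervous: it's wrong for every post that an editor created on behalf of a guest author.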

When I realized this, I finally knew why I'd been putting off adding this feature. I don't have any problem modifying the code, even though I hardly use version control or backups for it. Why? Well, simply because I know that if I break it I can just as easily fix it again. That's the benefit of a one-man software project. I wrote the code, so I know how everything works. But I am reluctant to touch the data. Why? Well, I didn't write most of the data, so I have no idea how to restore it if I "break" something in that area.

This realization reminds me of something I always say to fellow developers: "it's all about the data, not about the code". The code can be rewritten without any problem. All it takes is time and a developer. It doesn't even need to be the original developer, because a good set of data allows for lots of reverse engineering. But the data is often gathered from many sources, and if it's lost, it's lost. It will take much more work to find all those sources of data again, if that's possible at all.

Now that I've talked about it this much, I'll probably bite the bullet and add the feature anyway. I'll even do it the right way: by modifying/augmenting the existing data to include information about the editor alongside the author field we already have. But I'll be sure to do this in a development environment first. And even when I know it works, I'll make some extra backups before applying it to the live environment. It might be extra work that turns out not to be needed, but I'd rather do the extra work than run the risk of corrupting some of our precious data.
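
Those backups are cheap enough that there's no excuse to skip them. Something as simple as this will do (the database file name is, again, made up for the example):

    import shutil
    import datetime

    # Cheap insurance before touching the data: copy the database file
    # aside with a timestamp in the name, so every migration attempt
    # has its own restore point.
    stamp = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
    shutil.copy2('photoblog.db', 'photoblog-%s.bak' % stamp)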

Saturday, June 23, 2007

The Mac startup sound

I switched on my Mac this morning. While it booted I walked to the kitchen to grab some coffee and I heard myself singing:

  • It's been seven hours and fifteen days
In case you don't recognize it, that is the opening line of "Nothing Compares 2 U" by Sinéad O'Connor.

Now it's not unusual for me to sing small fragments of songs. But I always wonder what triggers a certain song, especially one as old as this.

It didn't take long to figure it out in this case. The song was triggered by the sound the Mac makes when you switch it on, which sounds remarkably similar to the start of the song.
Has anyone else noticed this? All I could find in a quick scan was a comment in this post.

And is it a coincidence? Or is it part of the tribute that the Mac engineers pay to their creation? It sounds like the thing any Mac-head would say to his/her machine: "nothing compares to you".

Flashback to regexes

Sorry for the lack of updates in the past week. Normally I'd blame work, but in this case it wasn't unusually busy there. Actually, my lack of updates was caused by the amount of feedback I've gotten on one of my earlier posts.

The post is about optimizing regular expressions, one of the last black arts of programming. And this black art is mostly lost on me. Luckily one of our readers seems very fluent with regexes and has been extremely helpful. I haven't gotten all the kinks out yet while integrating his suggestions into the project, but they seem to improve performance immensely.
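
I won't repeat his suggestions here (follow the link for those), but to give you a feel for why this matters so much, here is a classic illustration of my own, not taken from that discussion: a pattern with nested quantifiers backtracks exponentially on input that almost matches, while an equivalent flat pattern scans it in linear time.

    import re
    import time

    # Nested quantifiers like (a+)+ can try exponentially many ways to
    # split the input before failing; the equivalent a+ fails after one
    # linear scan.
    slow = re.compile(r'^(a+)+$')
    fast = re.compile(r'^a+$')

    text = 'a' * 25 + '!'  # the trailing '!' forces the worst case

    t = time.perf_counter()
    slow.match(text)       # millions of backtracking steps before giving up
    print('nested quantifier:', time.perf_counter() - t, 'seconds')

    t = time.perf_counter()
    fast.match(text)       # a single linear scan
    print('flat quantifier:', time.perf_counter() - t, 'seconds')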

If you want to know more, follow the link above and see how it plays out. If you don't care about regular expressions, don't worry: I'm busy writing a regular post, which should be up later today or early tomorrow.

Sunday, June 17, 2007

Help! My mediacenter is dying!

Yesterday I told you about the positive impact our mediacenter has had on the way we watch TV. Unfortunately it hasn't all been positive.

From day one our mediacenter has been having hardware problems.

  • Motherboard replacement.
    Within two months of buying the mediacenter, its motherboard had to be replaced. And this is a name-brand machine, mind you. Apparently that doesn't say anything about quality anymore. At least the service center was pretty close by.
  • Single tuner or twin tuner.
    We went for the mediacenter with two tuners in it. But about half of the time, only one of the tuners seems to work. It's annoying, but apparently not annoying enough to send the machine back to the repair shop.
  • Hibernate or always awake.
    One of the things that is really important is that your mediacenter should "go to sleep" when it's not doing anything and "wake up" when it needs to record a show. At times that actually worked reliably for us.
    But most of the time, our mediacenter is much like a 15-year-old kid: it wakes up when it feels like it, stays awake as long as it likes and refuses to go to sleep when you tell it to.
  • What's that noise?
    We went for the more expensive VCR form factor mediacenter, because it was supposedly less noisy. And I guess a standard desktop PC would indeed make more noise than our mediacenter. But I still find the amount of noise it makes too much. It could really do with some less power-hungry parts, so at least some of the fans could be removed.
  • Wireless keyboard range
    The range at which I can use the wireless keyboard is ridiculous. I basically have to sit on the floor right in front of the mediacenter to get it to work, and even then it is intermittent. I would much rather have a reliable wired keyboard, but apparently I'm too lazy to get one that looks nice enough in my living room.
Ok, so much for the hardware problems. There are also plenty of software problems. Our mediacenter came with Windows XP Mediacenter Edition. That is a whole Windows version dedicated to supporting mediacenter functionality. So it can't be bad, can it? Well...
  • Installation assumes you have a monitor attached
    I bought a mediacenter to hook it up to my TV. Then why does the installation assume that I'm on a high-resolution monitor? Entering that 20-digit license code really is no fun when you can't read half of what's on the screen.
  • Windows Mediacenter is an application, not a special version of Windows
    When I saw Windows Mediacenter Edition, I expected the whole Windows OS to center around the mediacenter functionality. Well, I was wrong. Mediacenter is nothing more than an application on top of regular Windows, and unfortunately that's how it feels. When I start (or wake up) my mediacenter, I first see the PC's POST screen, then I see Windows booting and finally I see the Mediacenter application starting. Imagine your TiVo doing something like that.
  • PDC? What's that?
    In Europe we have something called program delivery control (PDC), which broadcasts extra signals when a program starts and ends. The result is that you never miss the end of a program that is running late. Admittedly this is available for a limited number of channels, but for those that have it, it is a great feature. Unfortunately Windows Mediacenter doesn't support it whatsoever.
  • Teletext support is laughable
    Again, in Europe many TV channels carry teletext pages with useful information. Windows Mediacenter can display these fine, but why (on a PC with plenty of memory) do I still have to wait for it to cycle through the pages? Have the developers never heard of caching the pages? More and more TVs do this, but Windows Mediacenter... doesn't. (See the sketch below this list for the kind of caching I mean.)
  • Why isn't there a web interface to the EPG?
    Programming recordings through the EPG on your TV screen is a real improvement over old-style, time-based VCR programming. But this is one area where a higher resolution screen helps a lot. So why isn't there a standard web interface to the EPG data on my mediacenter? There's a great third-party program to do this, but why doesn't it work out of the box? Microsoft's own (add-on) solution is worse than the third-party one and doesn't even work outside of the US.
  • Why isn't the EPG data up to date?
    The quality of the data in the EPG used to be really bad. For some channels we used to miss about 50% of the programs, because the data wasn't in the EPG. This has improved considerably since last year, but it still happens sometimes.
  • Failing recordings
    The mediacenter either doesn't start recording a show it is scheduled to record, or it records a few minutes, then stops, then starts again, then stops... you get the picture. The result is shows that I really can't watch.
    So I lied yesterday: I have missed quite a few episodes of CSI lately. This is the biggest issue for us at the moment. All the other software issues are annoyances, but this one is making me think of unplugging my mediacenter, or at least of reinstalling it completely. I'd happily sacrifice the 40+ hours of recordings that are still on the hard disk to get it to function properly again.
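To be clear about that teletext complaint: the fix really is this simple. A minimal sketch of the idea (the function names and the driver hook are hypothetical; this is not how Windows Mediacenter works internally):

    # Teletext pages are broadcast in a continuous cycle. Instead of
    # waiting for the requested page to come around, remember every page
    # as it arrives and serve requests from memory.
    cache = {}

    def on_page_received(page_number, content):
        # Called by the (hypothetical) tuner driver for every broadcast page.
        cache[page_number] = content

    def show_page(page_number):
        if page_number in cache:
            return cache[page_number]  # instant: served straight from memory
        return None  # not seen yet; only in this case would we have to wait
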
I guess a lot of our problems have to do with the fact that we are early adopters. Mediacenter adoption has even now not taken off in full, so we should expect to find some "children's diseases" (a Dutch expression for teething problems) in there. But on the other hand... if I buy something that looks like a VCR, I expect it to work somewhat like a VCR. I wasn't buying a beta version of some new application; at least that's not what it looks like. If something looks like a consumer electronics device, I expect it to work like one.

So that's the current story. As you can see, it really is two-sided. On the one hand, having a mediacenter has really changed the way I watch TV. On the other hand, our mediacenter has been a pretty constant source of frustration too. Let's just hope Microsoft cleans up its act in the next version of the Mediacenter software (the Vista version seems only somewhat better). Or let's hope that Apple adds a receiver to its Apple TV or the Mac Mini. I really hope for the latter, because a few more Macs would add some more technological styling to my living room. :-)