Regular expressions are in the toolbox of most developers. They provide an easy way to match patterns that are just a bit too complex for IndexOf. A friend of mine once said: "regular expressions must have been invented by the devil. They first lure you in with their beauty and terseness, allowing you to do complex things with seemingly simple code. But later, when you encounter problems with your expressions, you find yourself in debugging hell."
And after having fought regular expressions last week, I tend to agree with him.
In a project I'm involved with, we have to rewrite the URLs in HTML pages. The process is pretty simple, at least in pseudo code:
for each URL in the HTML
    determine the new destination
    replace the URL with the new destination
The trickiest part is of course finding the URLs in the HTML in the first place. Parsing HTML is not exactly an exact science, something the browser vendors at least seem to agree on. But for this project we couldn't afford to write our own HTML parsing logic, and apparently there aren't many good HTML parsers available for .NET. So one of our developers decided to use regular expressions to find the URLs.
And as my friend had already predicted, it started out very well. Recognizing href and src attributes in HTML is not that difficult:
(?<=\ssrc|\shref)\s*=\s*([^>\s]*)([^>]*)(>|/>)+
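In .NET, that pseudo code maps fairly directly onto Regex.Replace with a MatchEvaluator. The sketch below is only an illustration of that plumbing, not our production code; DetermineNewDestination is a hypothetical stand-in for whatever mapping logic you actually have.

    using System;
    using System.Text.RegularExpressions;

    class UrlRewriter
    {
        // Stand-in for the real URL-mapping logic of the project.
        static string DetermineNewDestination(string url)
        {
            return "http://new.example.com/" + url.TrimStart('/');
        }

        static string RewriteUrls(string html)
        {
            // The unquoted-attribute expression from above; group 1 holds the URL.
            var regex = new Regex(@"(?<=\ssrc|\shref)\s*=\s*([^>\s]*)([^>]*)(>|/>)+");

            return regex.Replace(html, match =>
            {
                string url = match.Groups[1].Value;
                if (url.Length == 0) return match.Value; // nothing to rewrite
                return match.Value.Replace(url, DetermineNewDestination(url));
            });
        }

        static void Main()
        {
            Console.WriteLine(RewriteUrls("<img src=logo.gif><a href=/page.html>link</a>"));
        }
    }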
It of course becomes a bit more tricky when you also need to handle quotes around the URLs:
(?<=\ssrc|\shref)\s*=\s*(["'])([^\1]*?)\1
I was very surprised that I couldn't find a single construct that matches both quoted and unquoted values, but apparently it really can't be done. So we decided to use separate expressions for them, doubling the required processing time but at least allowing us to rewrite all URLs.
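Concretely, "separate expressions" means two passes over the same document. The sketch below shows roughly what that looks like; the rewrite callback is a placeholder, and the unquoted pass has to skip values the quoted pass already handled, because the unquoted expression happily matches quoted values as well (quotes included).

    using System;
    using System.Text.RegularExpressions;

    class TwoPassRewrite
    {
        static readonly Regex Quoted = new Regex(
            @"(?<=\ssrc|\shref)\s*=\s*([""'])([^\1]*?)\1");

        static readonly Regex Unquoted = new Regex(
            @"(?<=\ssrc|\shref)\s*=\s*([^>\s]*)([^>]*)(>|/>)+");

        // Placeholder rewrite: prefix every URL with a new host.
        static string Rewrite(Match m, int urlGroup)
        {
            string url = m.Groups[urlGroup].Value;
            if (url.Length == 0 || url[0] == '"' || url[0] == '\'')
                return m.Value; // empty, or a quoted value the other pass handles
            return m.Value.Replace(url, "http://new.example.com" + url);
        }

        static void Main()
        {
            string html = "<a href=\"/a.html\">a</a><img src=/b.gif>";

            // Pass 1: quoted attribute values; pass 2: unquoted ones.
            html = Quoted.Replace(html, m => Rewrite(m, 2));
            html = Unquoted.Replace(html, m => Rewrite(m, 1));

            Console.WriteLine(html);
        }
    }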
Things got even trickier once we found out that we needed to treat base tags differently from other tags and exclude them from the result:
(?<!<\s*base [^>]*)(?<=\ssrc|\shref)\s*=\s*(["'])([^\1]*?)\1
This last one uses a construct called "zero-width negative lookbehind assertion", which I had never heard of before. But it seemed to do exactly what we needed: only match when the URL is not inside a base tag.
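As an aside: this construct only works because .NET is one of the few regex engines that allows variable-length lookbehinds. A toy example, unrelated to our production expression, to show what it does:

    using System;
    using System.Text.RegularExpressions;

    class LookbehindDemo
    {
        static void Main()
        {
            // Zero-width negative lookbehind: match "href" only when it is not
            // preceded, within the same tag, by "<base ".
            const string pattern = @"(?<!<base\s[^>]*)href";

            Console.WriteLine(Regex.IsMatch("<a href='x'>", pattern));    // True
            Console.WriteLine(Regex.IsMatch("<base href='x'>", pattern)); // False
        }
    }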
Observant readers might have noticed that this expression matches hrefs anywhere, including in things like HTML comments. We decided not to call that a bug, but to label those as unusual HTML constructs. Anything to keep the project moving forward. :-)
Until of course... we ran into performance problems. As it turns out, the last regular expression takes over 600 ms to find 20 matches in a 21 KB HTML document. And if you run a sequence of similarly complex regular expressions to handle different constructs against that HTML document, the time to rewrite the HTML quickly becomes unacceptable (over 3 seconds in our case). And since the original developer who wrote these regular expressions had since left the project, I was tasked with finding a way to optimize them. Oh lucky me!
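If you want to reproduce that kind of number yourself, a Stopwatch around the match call is all it takes. A minimal sketch (it assumes the page has been saved locally as page.html; MatchCollection is lazy, so you have to ask for Count to force it to find every match):

    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Text.RegularExpressions;

    class RegexTiming
    {
        static void Main()
        {
            string html = File.ReadAllText("page.html");

            var regex = new Regex(
                @"(?<!<\s*base [^>]*)(?<=\ssrc|\shref)\s*=\s*([""'])([^\1]*?)\1");

            var watch = Stopwatch.StartNew();
            int matches = regex.Matches(html).Count; // Count forces full evaluation
            watch.Stop();

            Console.WriteLine("{0} matches in {1} ms", matches, watch.ElapsedMilliseconds);
        }
    }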
If you don't know regular expressions very well, the process of optimizing them can turn into a very mind-numbing activity. Remove a character here or there, re-run, check whether the results are the same as before and whether it is faster. By now I know this process pretty well, because I tried it for a day and a half. And the results have not been too promising: about a 25% performance improvement. The trick was to replace the zero-width negative lookbehind assertion with a zero-width negative lookahead assertion. To you and me that means removing the < (less than) after the first question mark:
(?!<\s*base [^>]*)(?<=\ssrc|\shref)\s*=\s*(["'])([^\1]*?)\1
A 25% performance improvement might sound good, but the response times are still unacceptable, and I have the feeling that there is still plenty of room for improvement in there.
I recall having to optimize a pretty complex XSLT a few years ago. It quickly turned into a similar exercise of "change and test", without any idea of the likelihood of success. That is... until I discovered the option to compile the XSLT into code (Java code in this case). The idea behind allowing you to do that is that the Java code would execute faster and take less memory while running. But an additional benefit I found was... the fact that Java code can be profiled. I still recall the excitement of the first few runs of my XSLT in a profiler: analyzing the call stacks, finding out that named templates become methods named template[name] and that match templates are translated into methods named template[index]. It was great, and it allowed me to tune the XSLT within days.
Now if only something similar existed for regular expressions. I know .NET has the ability to compile them into an assembly. But I'm using the Professional edition of Visual Studio, so I don't have access to the profiler. And I'm not aware of any good, free or cheap profiler for .NET.
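For completeness, this is roughly what the two compilation flavours look like. RegexOptions.Compiled compiles the expression to IL inside the current process, while Regex.CompileToAssembly writes it out as a separate assembly on disk (full .NET Framework only). The type and assembly names below are made up for the sketch:

    using System.Reflection;
    using System.Text.RegularExpressions;

    class CompileRegexes
    {
        static void Main()
        {
            const string pattern =
                @"(?!<\s*base [^>]*)(?<=\ssrc|\shref)\s*=\s*([""'])([^\1]*?)\1";

            // In-process compilation to IL: slower to construct, faster per match.
            var compiled = new Regex(pattern, RegexOptions.Compiled);
            compiled.IsMatch("<a href='/x.html'>");

            // Compile into a standalone assembly (type UrlAttributeRegex in namespace
            // Rewriting), which ends up as Rewriting.Regexes.dll on disk.
            var info = new RegexCompilationInfo(pattern, RegexOptions.None,
                                                "UrlAttributeRegex", "Rewriting", true);
            Regex.CompileToAssembly(new[] { info }, new AssemblyName("Rewriting.Regexes"));
        }
    }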
Meanwhile, does anyone have an idea of how to further optimize this regular expression? If not, I'll probably see whether I can reduce the number of expressions we run by merging some of them together. It will probably hurt readability even further, but at this stage I'll sacrifice readability to get closer to shipping. God, I know I will hate myself for that in a few months' time.