HTML5 vs XHTML2 vs DoNothing

Simon Willison points to David “liorean” Andersson’s article on HTML5 vs XHTML2. This debate about the evolution of HTML has become confusing. In a nutshell, the W3C wanted to fix HTML by making it proper grown-up XML, hence XHTML, which was meant to succeed HTML 4. Unfortunately XHTML never really caught on. One of its inherent problems is nicely put by Andersson:

Among the reasons for this is the draconian error handling of XML. XML parsing will stop at the first error in the document, and that means that any errors will render a page totally unreachable. A document with an XML well formedness error will only display details of the error, but no content. On pages where some of the content is out of the control of XML tools with well-designed handling of different character encodings—where users may comment or post, or where content may come from the outside in the form of trackbacks, ad services, or widgets, for example—there’s always a risk of a well-formedness error. Tag-soup parsing browsers will do their best to display a page, in spite of any errors, but when XML parsing, any error, no matter how small, may render your page completely useless.
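
To make that draconian behaviour concrete, here is a minimal sketch using PHP’s standard DOM extension (my own test, not from the article): the strict XML parse rejects the whole document at its first well-formedness error, while the tag-soup HTML parse repairs the markup and carries on.

<?php
// One broken tag: <b> is never closed before </p>.
$broken = '<p>An unclosed <b>tag</p>';

libxml_use_internal_errors(true); // collect warnings instead of printing them

// Strict XML parsing: one error and the whole document is rejected.
$xml = new DOMDocument();
var_dump($xml->loadXML($broken));   // bool(false) - no content survives

// Tag-soup parsing: the parser guesses a repair and carries on.
$html = new DOMDocument();
var_dump($html->loadHTML($broken)); // bool(true) - the page still renders
?>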

So nobody took much notice of XHTML; the W3C’s influence declined; and a rival anything-but-Microsoft group called WHATWG commenced work on its own evolution of HTML which it called HTML 5.

In the meantime the W3C eventually realised that XHTML was never going to catch on and announced that it would revive work on HTML. Actually it is still working on XHTML2 in parallel. I suppose the idea, to the extent it has been thought through, is that XHTML will be the correct format for the well-formed Web, and HTML for the ill-formed or tag-soup Web. The new W3C group has its charter here. In contrast to WHATWG, this group includes Microsoft; in fact, Chris Wilson from the IE team is co-chair with Dan Connolly. However, convergence with WHATWG is part of the charter:

The HTML Working Group will actively pursue convergence with WHATWG, encouraging open participation within the bounds of the W3C patent policy and available resources.

In theory then, WHATWG HTML 5 and W3C HTML 5 will be the same thing. Don’t hold your breath though, since according to the FAQ:

When will HTML 5 be finished? Around 15 years or more to reach a W3C recommendation (include estimated schedule).

I suppose the thing will move along and we will see bits of HTML 5 being implemented by the main browsers. But will it make much difference? Although HTML is a broken specification, it is proving sufficient to support AJAX and to host other interesting stuff like Flash/Apollo, WPF and WPF/E, and so on. Do we need HTML 5? It remains an open question. Maybe the existence of a working group where all the browser vendors are talking is reward in itself: it may help to fix the most pressing real-world problem, which is browser inconsistency.


Microsoft will move your server to the cloud

The excellent Mary Jo Foley has a key quote from Microsoft’s Steve Berkowitz, VP of online services, speaking at the Search Engine Strategies conference in New York yesterday:

“Basically, we’re moving the server from your office to cloud,” Berkowitz said

This is the right strategy; but I have not heard it before from Microsoft. At one briefing a year or so ago I asked how Microsoft was positioning its Live products versus its Small Business Server (SBS) offerings, and got no answer worth reporting. The problem is that those SBS customers are exactly the ones who will be moving first to cloud-based services, yet they also form an important and highly successful market for old-style Windows servers. Microsoft cannot create a new market without cannibalising its old one. Another factor is that when a business adopts SBS, they are hooked into Microsoft Office as well; SBS includes SharePoint and Exchange, both of which link directly to Office applications on the clients. Disrupting this cosy cash-cow is dangerous; yet it is being disrupted anyway, by the likes of Google and Salesforce.com, so in reality Microsoft has no choice.

The opportunity for Microsoft is to offer its LAN-based customers a smooth transition to on-demand services, maintaining the features that work best with Microsoft Office without losing the benefits of zero maintenance and anywhere access to data.

Has it got the vision and courage to pursue such a strategy? Is its Live technology even up to the job? Or will it continue to focus on servers for your LAN and watch its business slowly but surely erode?


Orange is undecided about Flash on mobile devices

I spoke to Steve Glagow, director of the Orange Partner Programme, in advance of the Orange Partner Camp at Cape Canaveral next week. I asked him what trends he is seeing in development for mobile devices. He was guarded, saying that Orange is seeing growth in all three of the core platforms it supports: Symbian Series 60, Microsoft Windows Mobile, and Linux. He says that “Linux is dramatically increasing”, but of course it is doing so from a small base in this context; Symbian is the largest platform for Orange in absolute terms, and Java the most prominent language. Palm’s adoption of Windows Mobile has given Microsoft a boost, especially in the US. What about Flash, which is less widely deployed on mobile devices than on the desktop? Will Orange be pre-installing the Flash runtime? “The reason I won’t answer that is that we’ve been looking at Flash for some time now, and we’ve not made a formal decision,” he told me.

It’s an intriguing answer. Many of us think that Flash/Flex/Apollo (all of which use the Flash runtime) is set to grow substantially as a rich client platform, supported by XML web services or Flex Data Services on the server. Extending this to mobile devices makes sense, but only if the runtime is deployed. Adobe needs to break into this Java-dominated space. The Apple iPhone could also be an influence here: as far as I’m aware, it is not initially going to include either runtime, but I have the impression that Steve Jobs is warmer towards Flash than towards Java, which he called “this big heavyweight ball and chain.”

My prediction: Flash will get out there eventually. As fast data connections become more common, the Flash runtime will be increasingly desirable.


Making search better: smarter algorithms, or richer metadata?

Ephraim Schwartz’s article on search fatigue starts with a poke at Microsoft (I did the same a couple of months ago), but goes on to look at the more interesting question of how search results can be improved. Schwartz quotes a librarian called Jeffrey Beall who gives a typical librarian’s answer:

The root cause of search fatigue is a lack of rich metadata and a system that can exploit the metadata.

It’s true up to a point, but I’ll back algorithms over metadata any day. A problem with metadata is that it is never complete and never up-to-date. Another problem is that it has a subjective element: someone somewhere (perhaps the author, perhaps someone else) decided what metadata to apply to a particular piece of content. In consequence, if you rely on the metadata you end up missing important results.

In the early days of the internet, web directories were more important than they are today. Yahoo started out as a directory: sites were listed hierarchically and you drilled down to find what you wanted. Yahoo still has a directory; so does Google; another notable example is dmoz. Directories apply metadata to the web; in fact, they are metadata (data about data).

I used to use directories, until I discovered AltaVista, which as Wikipedia says was “the first searchable, full-text database of a large part of the World Wide Web.” AltaVista gave me many more results; many of them were irrelevant, but I could narrow the search by adding or excluding words. I found it quicker and more useful than trawling through directories. I would rather make my own decisions about what is relevant.

The world agreed with me, though it was Google and not AltaVista which reaped the reward. Google searches everything, more or less, but ranks the results using algorithms based on who knows what – incoming links, the past search habits of the user, and a zillion other factors. This has changed the world.
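
Google’s actual recipe is secret, but the published PageRank idea behind “incoming links” is simple enough to sketch. What follows is my own toy illustration in PHP, not Google’s code: each page repeatedly passes a share of its score to the pages it links to, so a link from a high-scoring page is worth more.

<?php
// Toy link-based ranking in the PageRank style - an illustration only.
$links = array(                 // who links to whom
    'a' => array('b', 'c'),
    'b' => array('c'),
    'c' => array('a'),
);
$pages = array_keys($links);
$rank  = array_fill_keys($pages, 1.0 / count($pages));
$d     = 0.85;                  // damping factor, as in the PageRank paper

for ($i = 0; $i < 20; $i++) {   // iterate until the scores settle
    $next = array_fill_keys($pages, (1 - $d) / count($pages));
    foreach ($links as $page => $outlinks) {
        foreach ($outlinks as $target) {
            $next[$target] += $d * $rank[$page] / count($outlinks);
        }
    }
    $rank = $next;
}
arsort($rank);                  // highest-scoring pages first
print_r($rank);
?>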

Even so, we can’t shake off the idea that better metadata could further improve search, and therefore improve our whole web experience. Wouldn’t it be nice if we could distinguish homonyms like pipe (plumbing), pipe (smoking) and pipe (programming)? What about microformats, which identify rich data types like contact details? What about tagging – even this post is tagged? Or all the semantic web stuff which has suddenly excited Robert Scoble:

Basically Web pages will no longer be just pages, or posts. They’ll all be split up into little objects, stored in a database (a massive, scalable one at that) and then your words can be displayed in different ways. Imagine a really awesome search engine that could bring back much much more granular stuff than Google can today.

Maybe, but I’m a sceptic. I don’t believe we can ever be sufficiently organized, as a global community, to follow the rules that would make it work. Sure, there is and will be partial success. Metadata has its place; it will always be there. But in the end I don’t think the clock will turn back; I think plain old full-text search combined with smart ranking algorithms will always be more important, to the frustration of librarians everywhere.


Infinitely scalable web services

Amazon’s Jeff Barr links to several posts about building scalable web services on S3 (web storage) and EC2 (on-demand server instances).

I have not had time to look into the detail of these new initiatives, but the concept is compelling. This is where Amazon’s programmatic approach pays off in a big way. Let me summarise:

1. You have some web application or service. Anything you like. Football results; online store; share dealing; news service; video streaming; you name it.

2. Demand of course fluctuates. When your server gets busy, the application automatically fires up new server instances and performance does not suffer. When demand tails off, the application automatically shuts down server instances, saving you money and making those resources available to other EC2 users. (There is a sketch of this loop in code after the list.)

3. Storage is not an issue; S3 has unlimited expandability.
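
For the curious, here is the kind of control loop point 2 implies, sketched in PHP. The four helpers are invented stubs standing in for EC2 web service calls; the underlying RunInstances and TerminateInstances API actions are real, but my wrappers and thresholds are made up for illustration.

<?php
// Hypothetical auto-scaling loop for point 2 above. The helpers are
// invented stubs, not functions from any real Amazon library.
function get_average_load()        { return 0.5; } // stub: poll your servers
function count_running_instances() { return 2; }   // stub: query EC2
function launch_instance()         { echo "scaling up\n"; }   // stub: RunInstances
function terminate_instance()      { echo "scaling down\n"; } // stub: TerminateInstances

define('SCALE_UP_LOAD',   0.75); // assumed thresholds - tune for your application
define('SCALE_DOWN_LOAD', 0.25);
define('MIN_INSTANCES',   1);

while (true) {
    $load      = get_average_load();
    $instances = count_running_instances();

    if ($load > SCALE_UP_LOAD) {
        launch_instance();        // demand spike: fire up another instance
    } elseif ($load < SCALE_DOWN_LOAD && $instances > MIN_INSTANCES) {
        terminate_instance();     // demand lull: release an instance
    }
    sleep(60);                    // re-check once a minute
}
?>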

This approach makes huge sense. Smart programming replaces brute force hardware investment. I like it a lot.


120 days with Vista

Is there any more to say about Vista? Probably not; yet after reading 30 days with Vista I can’t resist a few comments.

The author, Brian Boyko, says:

On two separate computers I had major stability problems which resulted in loss of data. This is an unforgivable sin …. Additionally, Vista claims backwards compatibility, but I’ve had major and minor problems alike with many of my games, more than a few third-party applications, my peripherals, and, in short, I encountered problems that actively prevented me from getting my work done. Based on my personal experiences with Vista over a 30 day period, I found it to be a dangerously unstable operating system, which has caused me to lose data.

As for me, I installed Vista RTM on four computers shortly after it was released to manufacturing in November last year. Two plain desktops, one media center, one laptop. Just for the record, my experience is dull by comparison with Boyko’s. No lost data; all my important apps run fine; I am not plagued by UAC prompts; the OS is stable.

Have there been hassles? Yes. Tortoise SVN crashes Explorer from time to time; a perfectly good Umax scanner has no driver; Vista on the laptop had severe resume problems which only recently seem to have been fixed by a BIOS update. And Creative’s X-Fi drivers for Vista are terrible. There are also annoyances, like Vista’s habit of thinking your documents are music.

At the same time, I’ve seen nothing to change my opinion that the majority of Vista’s problems are driver-related. Overall I like it better than XP; it doesn’t get in the way of my work and I would hate to go back.

When I do use XP, some of the things I miss are the search box in the Start menu (the Vista Start menu is miles better in other ways as well); the thumbnail previews in the task bar and in alt-tab switching; and copy and paste which doesn’t give up at the first hurdle. I also miss Vista’s more Unix-like Home directories, sensibly organized under Users rather than buried in Documents and Settings.

Security-wise, I consider both User Account Control and IE’s protected mode to be important improvements.

Forget the “Wow”. This is just the latest version of Windows; and it’s not as good as it should be, five years on from XP.

Nevertheless, it is a real improvement, and I’ve been happy with it over the last four months.


MP3 device runs .NET – but in Mono guise

I’ve long been interested in Mono, the open-source implementation of Microsoft .NET. It seems to be maturing; the latest sign is the appearance of an MP3 player using Linux and Mono. Engadget has an extensive review. Miguel de Icaza says on his blog:

The Sansa Connect is running Linux as its operating system, and the whole application stack is built on Mono, running on an ARM processor.

I had not previously considered Mono for embedded systems; yet here it is, and why not?

The device is interesting too. As Engadget says:

… you can get literally any music in Yahoo’s catalog whenever you have a data connection handy

This has to be the future of portable music. It’s nonsense loading up a device with thousands of songs when you can have near-instant access to whatever you like. That said, wi-fi hotspots are not yet sufficiently widespread or cheap for this to work for me; but this model is the one that makes sense, long-term.

I wonder if iPhone/iTunes will end up doing something like this?


Why the change of CEO at CodeGear?

CodeGear has a new CEO. But why? There’s the usual bland stuff in the press release:

Today we made a change to the leadership team at CodeGear.  Jim Douglas is joining as CEO of CodeGear.  Jim will be responsible for driving CodeGear to the next level, building on the solid foundation and momentum achieved by the CodeGear team under Ben Smith’s leadership.

Departing CEO Ben Smith has a blog entry that is no more revealing.

Judging by comments on the Borland newsgroups, developers are fearing the worst. The problem: a change of CEO is a sign of instability, just when CodeGear customers need reassurance that their preferred tools are in good hands. I didn’t see any previous suggestion that Smith’s appointment was intended to be short-term.

To make matters worse, there are signs that both Delphi for PHP (see here) and Delphi 2007 (see here) were released too quickly – especially Delphi for PHP. Strategically unwise.

There’s still nothing to touch Delphi for native Windows (if you don’t need 64-bit). And tackling PHP tools is a great idea. But in a difficult market the company cannot afford many slip-ups.


The search for the new client runtime

Some interesting posts recently about the connected client wars:

Ray Ozzie interview from Knowledge@Wharton.

Commentary from Ryan Stewart – subscribe to his blog if this stuff interests you, and it should.

Commentary from David Berlind

Why a new client runtime? It’s because of certain desirables:

  1. Designer freedom – think multimedia, effects, custom controls.
  2. Zero deployment – It Just Works, not arduous setup routines with weird error messages.
  3. Web storage – most data belongs in the cloud; it’s safer there.
  4. Local storage – for offline use and performance.
  5. Cross-platform – for all sorts of reasons: Apple resurgence, Linux desktop improving, inherent client agnosticism of the Web. Windows-only doesn’t cut it.

I’d add, and this is a techie point, an XML UI. XML makes huge sense for defining a user interface. Think of the history here: in the beginning we had text (DOS etc). Then we got pixels (Windows API), supplemented by arcane ideas like dialog units to make it vaguely scalable. Then we got layout managers – Java’s AWT and Swing. Fundamentally right but awkward to code. Now we combine XML and layout managers – easier to code, better for visual designers. The best yet.
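
For a flavour, here is a scrap of WPF’s XAML; Mozilla’s XUL and Adobe’s MXML follow the same pattern. The layout manager is just an XML element, with the controls it manages nested declaratively inside it (the control names are my own example):

<!-- A scrap of WPF XAML: the layout manager (StackPanel) is an XML
     element, and the controls it manages are nested inside it. -->
<StackPanel Orientation="Vertical" Margin="8">
  <TextBlock Text="Name:" />
  <TextBox Name="nameBox" Width="200" />
  <Button Content="OK" Width="80" HorizontalAlignment="Right" />
</StackPanel>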

I don’t care as much about the language. Java, C#, JavaScript (ECMAScript 4.0, ActionScript 3.0) are all workable. Just-in-time compilation is important; but all of these have that.

Of course the new client runtime is an old client runtime. Flash, transmuted with Flex and Apollo. Microsoft .NET, transmuted with WPF and given some belated cross-platform appeal with WPF/E. And not forgetting Mozilla XUL, which ticks most of the boxes but lacks the marketing effort and tools that are making waves for Adobe and Microsoft.

In some ways this looks like a battle that is Adobe’s to lose. It has designer hearts and minds, runtime deployment and cross-platform support all sewn up. That said, I really like WPF; it has been mostly lost in the Vista fog but will emerge; maybe Mix07 will help (now sold out, apparently). Good WPF apps are amazing; and Microsoft has armies of .NET developers out there, and a great tool in Visual Studio – but it stumbles on point 5 above, cross-platform.

Watch this space.


Delphi for PHP first impressions

I tried out Delphi for PHP for the first time this weekend.

Install on Vista was smooth. The setup installs its own copy of Apache 2 and PHP 5. A few minutes later and I was up and running.

The IDE is Delphi-like. Here is a scrunched-up image to give you a flavour:

[Screenshot: the Delphi for PHP IDE]

I have a standard application I build when trying out a new development tool. It is a to-do list with a listbox, a textbox, and buttons to add and remove items from the list. I started well, and soon had the controls placed, though they are tricky to line up nicely. I resorted to setting the Left property, as the snap-to-grid did not work for me.

Then I double-clicked the Add button. As expected, I was greeted with an empty Click handler. What to type? After a little experimentation I came up with this:

$this->lstItems->AddItem($this->ebItem->Text,null,null);

When you type ->, the editor pops up autocomplete choices. Nice. I clicked the run button and the application opened in my web browser. I set a breakpoint on the line; that worked nicely, especially after I displayed the Locals window so I could see the value of variables.

The next step is to implement removing an item. This is fractionally more challenging (I realise this is little more than Hello World), since I need to retrieve the index of the selected item and then work out how to remove it.

I am embarrassed to admit that it took me some time. Yes, I tried the documentation, but it is terrible. Unbelievably bad. Someone ran a thing called Doc-O-Matic over the code. Here’s the entire description of the ListBox control:

A class to encapsulate a listbox control 

There’s also a reference which lists methods, again with a one-line description if you are lucky. Here’s the one for ListBox.getItems:

This is getItems, a member of class ListBox.

I gave up on the docs. I had figured out AddItem; I had discovered that the itemindex property has the index of the selected item; but there is no RemoveItem or DeleteItem. I went back to basics. The ListBox has an _items member field which is an array. In PHP you remove an item from an array with unset. I resorted to editing the VCL for PHP by adding a RemoveAt method to CustomListBox:

function RemoveAt($index)
{
    // Remove the entry at $index. Beware: unset() leaves a gap in the
    // array's integer keys rather than reindexing (array_splice() would
    // reindex).
    unset($this->_items[$index]);
}

Note that I am not proposing you do the same. There must be a better way to do this. I just couldn’t work it out quickly from the docs; and I was determined to get this up and running.

Here’s my code for removing an item:

// itemindex is -1 when nothing is selected
$selindex = $this->lstItems->itemindex;

if ($selindex > -1)
{
    $this->lstItems->RemoveAt($selindex);
}

Now my app worked fine. What about deployment? I used the deployment wizard, which essentially copies a bunch of files into a directory, ready for upload. There are a lot. 44 files to be precise, mostly of course the VCL for PHP. Still, it was painless, and you can configure a web server to share these files between different applications.

All I needed to test it was a web server running PHP 5.x (it will not work with PHP 4). Fortunately I had one available, so I uploaded my first Delphi for PHP application. It looked good, but although it worked on my local machine, the deployed app throws an error when you click a button:

Application raised an exception class Exception with message ‘The Input Filter PHP extension is not setup on this PHP installation, so the contents returned by Input is *not* filtered’

I note that this user has the same problem. My hunch is that Delphi for PHP requires PHP 5.2 – I only have 5.1 at the moment.*

In addition, I don’t like the way the default deployment handles errors, by publishing my callstack to the world, complete with the location of the files on my web server.
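
Whatever the framework itself provides, the standard PHP-level mitigation is to log errors rather than display them. Something like the following belongs in any internet-facing PHP deployment; the log path is an example, and I have not tested whether VCL for PHP’s own exception page respects these settings.

<?php
// Production hygiene: record errors for the administrator,
// never echo them (or a callstack) to the visitor's browser.
ini_set('display_errors', '0');
ini_set('log_errors', '1');
ini_set('error_log', '/var/log/php_errors.log'); // example path - adjust to suit
error_reporting(E_ALL);
?>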

How secure are all these VCL for PHP files anyway? What assurance do I have about this? Will they be patched promptly if security issues are discovered?

Important questions.

There will be plenty more to say about Delphi for PHP. For the moment I’m reserving judgment. I will say that the release looks rushed, which is a shame.

Update: I’ve now seen a fix posted to the Borland newsgroups for the input filter exception, showing how to remove the code which raises it. However I suggest you do not apply this fix, for security reasons, unless you are deploying on a trusted intranet. It is vital to sanitize PHP input on the internet.

*PHP 5.2 is not the answer. It could even be a problem. Delphi for PHP ships with PHP 5.1. There is an input filter extension which you can add to PHP 5.x; see http://pecl.php.net/package/filter. The filter functions are built into PHP 5.2, but the version used by VCL for PHP is old and seems to be incompatible. What a mess.
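
For what it’s worth, if you do end up running without the filter extension, the least you can do is sanitize user input yourself before echoing it back. A bare-bones sketch, with an invented field name:

<?php
// Minimal manual sanitization: treat everything from the user as hostile.
// 'comment' is an invented field name for illustration.
$comment = isset($_POST['comment']) ? $_POST['comment'] : '';

// Encode HTML metacharacters so an injected <script> renders as plain text.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo '<p>' . $safe . '</p>';
?>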
