Knol questions

The internet is buzzing about Knol. Google no longer wishes merely to index the web’s content. Google wishes to host the web’s content. Why? Ad revenue. Once you click away from Google, you might see ads for which Google is not the agent. Perish the thought. Keep web users on Google; keep more ad revenue.

Snag is, there is an obvious conflict of interest. Actually, there is already a conflict of interest on Google. I don’t know how many web pages out there host AdSense content (mine do), but it is a lot. When someone clicks an AdSense ad, revenue is split between Google and the site owner. Therefore, it would pay Google to rank AdSense sites above non-AdSense sites in its search results. Would it do such a thing? Noooo, surely not. How can we know? We can’t. Google won’t publish its search algorithms, for obvious reasons. You have to take it on trust.

That question, can we trust Google, is one that will be asked again and again.

Knol increases the conflict of interest. Google says:

Our job in Search Quality will be to rank the knols appropriately when they appear in Google search results. We are quite experienced with ranking web pages, and we feel confident that we will be up to the challenge.

Will Google rank Knol pages higher than equally good content on, say, Wikipedia? Noooo. How will we know? We won’t. We have to take it on trust.

On balance, therefore, I don’t much like Knol. It is better to separate search from content provision. But Google is already a content provider (YouTube being an obvious example), so this is not really new ground.

I also have some questions about Knol. The example article (about insomnia) fascinates me. It has a named author, and Google’s Udi Manber highlights the importance of this:

We believe that knowing who wrote what will significantly help users make better use of web content.

However, it also has edit buttons, like a wiki. If it is a wiki, it is not clear how the reader will distinguish between what the named author wrote, and what has been edited. In the history tab presumably; but how many readers will look at that? Or will the author get the right to approve edits? When an article has been edited so thoroughly that only a small percentage is original, does the author’s name remain?

Personally, I would not be willing to have my name against an article that could be freely edited by others. It is too risky.

Second, there is ambiguity in Manber’s remark about content ownership:

Google will not ask for any exclusivity on any of this content and will make that content available to any other search engine.

Hang on. When I say, “non-exclusive”, I don’t mean giving other search engines the right to index it. I mean putting it on other sites, with other ads, that are nothing to do with Google. A slip of the keyboard, or does Google’s “non-exclusive” mean something different from what the rest of us mean?

Finally, I suggest we should not be hasty in writing off Wikipedia. The first mover has a big advantage. Has Barnes &amp; Noble caught up with Amazon? Did Yahoo Auctions best eBay? Has Microsoft’s MSN Video unseated YouTube? Wikipedia is flawed; Knol will be equally flawed; and at least Wikipedia tries to avoid this kind of thing:

For many topics, there will likely be competing knols on the same subject. Competition of ideas is a good thing.

Then again, Wikipedia knows what it is trying to do. Knol is not yet baked. We’ll see.

Update

Danny Sullivan, who has been briefed by Google, has some answers. Partial answers, anyway. Here’s one:

Google Knol is designed to allow anyone to create a page on any topic, which others can comment on, rate, and contribute to if the primary author allows

The highlighting of “if the primary author allows” is mine. Interesting. I wonder what the dynamics would/will be. Will editable pages float to the top?

Second:

The content will be owned by the authors, who can reprint it as they like

You can guess my next question. If as the primary author I have enabled editing, do any contributions become mine? What if I want to include the article in a printed book? The GNU Free Documentation License used by Wikipedia seems a simpler solution.

Fun: Wikipedia already has an article on Knol.

Amazon SimpleDB: a database server for the internet

Amazon has announced SimpleDB, the latest addition to what is becoming an extensive suite of web services aimed at developers. It is now in beta.

Why bother with SimpleDB, when seemingly every web server on the planet already has access to a free instance of MySQL? Perhaps the main reason is scalability. If demand spikes, Amazon handles the load. Second, SimpleDB is universally accessible, whereas your MySQL may well be configured for local access on the web server only. If you want an online database to use from a desktop application, this could be suitable. It should work well with Adobe AIR once someone figures out an ActionScript library. That said, MySQL and the like work fine for most web applications, this blog being one example. SimpleDB meets different needs.

This is utility computing, and prices look relatively modest to me, though you pay for three separate things:

Machine Utilization – $0.14 per Amazon SimpleDB Machine Hour consumed.

Data Transfer – $0.10 per GB for all data transfer in; from $0.18 per GB for data transfer out.

Structured Data Storage – $1.50 per GB-month.

In other words, a processing time fee, a data transfer fee, and a data storage fee. That’s reasonable, since each of these incurs a cost. The great thing about Amazon’s services is that there are no minimum costs or standing fees. I get billed pennies for my own usage of Amazon S3, which is for online backup.
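To put rough figures on that: a hypothetical small application consuming 10 machine hours in a month, transferring 1GB in and 1GB out, and storing 1GB of data would run up 10 × $0.14 + $0.10 + $0.18 + $1.50 = $3.18 for the month.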

There are both REST and SOAP APIs, and there are example libraries for Java, Perl, PHP, C# and VB.NET (what, no JavaScript or Python?).

Not relational

Unlike MySQL, Oracle, DB2 or SQL Server, SimpleDB is not a relational database server. It is based on the concept of items and attributes. Two things distinguish it from most relational database managers:

1. Attributes can have more than one value.

2. Each item can have different attributes.

While this may sound disorganized, it actually maps well to the real world. One of the use cases Amazon seems to have in mind is stock for an online store. Maybe every item has a price and a quantity. Garments have a Size attribute, but CDs do not. The Category attribute could have multiple values, for example Clothing and Gifts.
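To make the model concrete, here is a plain C# sketch of my own – it models the shape of the data rather than calling Amazon’s API, and the item names, attributes and values are all illustrative:

    using System;
    using System.Collections.Generic;

    class SimpleDbModel
    {
        static void Main()
        {
            // Each item is a bag of attributes; every attribute maps to one
            // or more string values, and items need not share a schema.
            Dictionary<string, List<string>> garment = new Dictionary<string, List<string>> {
                { "Price",    new List<string> { "24.99" } },
                { "Quantity", new List<string> { "12" } },
                { "Size",     new List<string> { "S", "M", "L" } },       // multiple values
                { "Category", new List<string> { "Clothing", "Gifts" } }  // multiple values
            };

            Dictionary<string, List<string>> cd = new Dictionary<string, List<string>> {
                { "Price",    new List<string> { "9.99" } },
                { "Quantity", new List<string> { "40" } },
                { "Artist",   new List<string> { "Example Band" } }       // no Size attribute at all
            };

            // Note that everything, including Price and Quantity, is a string.
            Console.WriteLine(string.Join(", ", garment["Category"].ToArray()));
            Console.WriteLine(cd["Artist"][0]);
        }
    }

In SimpleDB itself you would create a domain and call PutAttributes for each item, but the structure is the same.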

You can do such things relationally, but it requires multiple tables. Some relational database managers do support multiple values for a field (FileMaker for example), but it is not SQL-friendly.

This kind of semi-structured database is user-friendly for developers. You don’t have to plan a schema in advance. Just start adding items.

A disadvantage is that it is inherently undisciplined. There is nothing to stop you having an attribute called Color, another called Hue, and another called Shade, but it will probably complicate your queries later if you do.

All SimpleDB attribute values are strings. That highlights another disadvantage of SimpleDB – no server-side validation. If a glitch in your system gives an item a Price of “Red”, SimpleDB will happily store the value.
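There is a subtler consequence too: since everything is a string, comparisons in queries are lexicographic, so numeric values need zero-padding to sort as you would expect – as I read the documentation, that is the recommended workaround. A quick illustration in plain C# (nothing SimpleDB-specific here):

    using System;

    class PaddingDemo
    {
        static void Main()
        {
            // Compared as strings, "9" sorts after "10" – rarely what you
            // want for a Price or Quantity attribute.
            Console.WriteLine(string.CompareOrdinal("9", "10") > 0);  // True
            // Zero-padding restores the expected numeric ordering.
            Console.WriteLine(string.CompareOrdinal("09", "10") > 0); // False
        }
    }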

Not transactional or consistent

SimpleDB has a feature called “Eventual Consistency”. It is described thus:

Amazon SimpleDB keeps multiple copies of each domain. When data is written or updated (using PutAttributes, DeleteAttributes, CreateDomain or DeleteDomain) and Success is returned, all copies of the data are updated. However, it takes time for the update to propagate to all storage locations. The data will eventually be consistent, but an immediate read might not show the change.

Right, so if you have one item in stock you might sell it twice to two different customers (though the docs say consistency is usually achieved in seconds). There is also no concept of transactions as far as I can see – that is, no way to make a sequence of actions succeed or fail as a block. Well, it is called SimpleDB.

This doesn’t make SimpleDB useless. It does limit the number of applications for which it is suitable. In most web applications, read operations are more common than write operations. SimpleDB is fine for reading. Just don’t expect your online bank to be adopting SimpleDB any time soon.

Another pro musician gives up on Vista audio

I occasionally highlight interesting comments to this blog, because they are less visible than new posts. This one for example:

After months of struggling with Vista, I have now completely removed it from my quad-core, purpose-built audio recording PC. With all the same hardware, XP 64 bit edition is working as I had hoped Vista 64 would. The machine now records and plays back flawlessly.

The question I originally posed was whether Vista audio problems are primarily to do with poor drivers, or indicate more fundamental problems. Initially I was inclined to blame the drivers, especially as Microsoft put a lot of effort into improving Vista’s audio. However, read this post by Larry Osterman. He mentions three problems with audio in XP, and says:

Back in 2002, we decided to make a big bet on Audio for Vista and we committed to fixing all three of the problems listed above.

However, only one of his three problems is unequivocally about improving audio. The first is actually about Windows reliability:

The amount of code that runs in the kernel (coupled with buggy device drivers) causes the audio stack to be one of the leading causes of Windows reliability problems.

Therefore, Microsoft moved the audio stack:

The first (and biggest) change we made was to move the entire audio stack out of the kernel and into user mode.

though he adds that

In Vista and beyond, the only kernel mode drivers for audio are the actual audio drivers (and portcls.sys, the high level audio port driver).

So, not quite the entire audio stack. Some pro musicians reckon the removal of the audio stack from the kernel is the reason for Vista’s audio problems.

However you look at it, it is to my mind a depressing failure that a year after Vista’s release you can find pro musicians giving up, and even vendors (who have an interest in Vista working properly) making comments like this one from Cakewalk’s Noel Borthwick:

Vista X64 (and X86 to some extent as well) is known to have inherent problems with low latency audio. We have been in touch with Microsoft about this and other problems for over a year now so it’s not that Cakewalk hasn’t done their bit. There are open case numbers with MS for this issue as well.

It does look as if, for all the talk of “a big bet on audio in Vista”, Microsoft does not care that much about this aspect of the operating system.

Anyone tried audio in Vista SP1 yet?


450 fixes in Office 2007 service pack 1

Microsoft has released Office 2007 service pack 1. But what does it fix? If you go to this page you can download a spreadsheet listing around 450 fixes. The list is a little misleading, since many of the fixes reference pre-existing knowledge base articles, which I reckon means you may already have the fix. SP1 is still worth it (presuming it works OK) – there are plenty of other issues mentioned.

Of course I went straight to the Outlook 2007 section, as this is the app I have real problems with. This one will be interesting to some readers of this blog:

  • POP3 sync is sometimes slow.  An issue that contributed to this issue was fixed in SP1.

I believe I have noticed this one too:

  • A large number of items may fail to be indexed.

As to whether Outlook 2007 will perform noticeably better after SP1, I am sceptical but will let you know.

As it happens, the top four search keywords for visitors to this blog who come via search engines, for this month, are as follows:

  1. 2007
  2. outlook
  3. vista
  4. slow

It is similar most months. Hmmm, seems there may be a pattern there.

Wired votes for Zune over iPod

Wired Magazine, home of Cult of Mac, has declared the Zune 2 a better buy than the iPod Classic.

This may prove any number of things. One possibility is that Microsoft has a winner. After all, it is the company’s modus operandi: Windows 1.0, rubbish; Windows 3.0, world-beating.

Then again, perhaps articles with unexpected conclusions just get more links. Like this one.

Not that I care – there is no Zune for the UK.


Live Workspace: can someone explain the offline story?

I showed the Asus Eee PC to a friend the other day. She liked it, but won’t be buying. Why? It doesn’t run Microsoft Office (yet – an official Windows version is planned).

It reminded me how important Office is to Microsoft. No wonder it is fighting so hard in the ODF vs OOXML standards war.

Therefore, if anything can boost Microsoft’s Web 2.0 credentials (and market share), it has to be Office. I’ve not yet been able to try out Office Live Workspace, but it strikes me that Microsoft is doing at least some of the right things. As I understand it, you get seamless integration between Office and web storage, plus some extras like document sharing and real-time collaboration.

I still have a question though, which inevitably is not answered in the FAQ. What’s the offline story? In fact, what happens when you are working on a document at the airport, your wi-fi pass expires, and you hit Save? Maybe a beta tester can answer this. Does Word or Excel prompt for a local copy instead? And if you save such a copy, how do you sync up the changes later?

If there’s a good answer, then this is the kind of thing I might use myself. If there is no good answer, I’ll stick with Subversion. Personally I want both the convenience of online storage and the comfort of local copies, with no-fuss sync between the two.

That said, I may be the only one concerned about this. When I Googled for Live Workspace Offline, the top hit was my own earlier post on the subject.

Microsoft Volta: magic, but is it useful magic?

Microsoft has released an experimental preview of Volta, a new development product with some unusual characteristics:

1. You write your application for a single machine, then split it into multiple tiers with a few declarations:

Volta automatically creates communication, serialization, and remoting code. Developers simply write custom attributes on classes or methods to tell Volta the tier on which to run them.

2. Volta seamlessly translates .NET bytecode (MSIL) to JavaScript, on an as-needed basis, to achieve cross-platform capability:

When no version of the CLR is available on the client, Volta may translate MSIL into semantically-equivalent JavaScript code that can be executed in the browser. In effect, Volta offers a best-effort experience in the browser without any changes to the application.

The reasoning behind this is that single-machine applications are easier to write. Therefore, if the compiler can handle the tough job of distributing an application over multiple tiers, it makes the developer’s job easier. Further, if you can move processing between tiers with just a few declarations, then you can easily explore different scenarios.

Since the input to Volta is MSIL, you can work in Visual Studio using any .NET language.
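To picture what those declarations look like, here is a C# sketch. The attribute is a stand-in of my own devising – I have not checked the exact names the preview uses – but the shape matches the description quoted above: mark a class or method with a tier attribute, and Volta generates the plumbing.

    using System;

    // Illustrative stand-in for Volta's tier attribute; the real preview
    // supplies its own.
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
    class RunAtOriginAttribute : Attribute { }

    // On this account, the attribute is all the developer writes; Volta
    // generates the serialization and remoting code behind the scenes.
    [RunAtOrigin]
    class PriceService
    {
        public decimal GetPrice(string sku)
        {
            // Imagine a database lookup here, running on the server tier.
            return 9.99m;
        }
    }

    class Client
    {
        static void Main()
        {
            // The calling code is written as if everything ran on one
            // machine; the split into tiers is purely declarative.
            PriceService service = new PriceService();
            Console.WriteLine(service.GetPrice("B000123"));
        }
    }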

Visionary breakthrough, or madness? Personally I’m sceptical, though I have had a head start, since this sounds very like what I discussed with Erik Meijer earlier this year, when it was called LINQ 2.0:

Meijer’s idea is that programmers should be able to code for the easiest case, which is an application running directly on the client, and be able to transmute it into a cross-platform, multi-tier application with very little change to the code.

What are my reservations? It seems hit-and-miss, not knowing whether your app will be executed by the CLR or as JavaScript; and leaving it to a compiler to decide how to make an application multi-tier, bearing in mind issues like state management and optimising data flow, sounds like a recipe for inefficiency and strange bugs.

It seems Microsoft is not sure about it either:

Volta is an experiment that enables Microsoft to explore new ways of developing distributed applications and to continue to innovate in this new generation of software+services. It is not currently a goal to fit Volta into a larger product roadmap. Instead, we want feedback from the community of partners and customers to influence other Live Labs technologies and concepts.

Mono and C# on an Asus Eee PC

I am having a lot of fun with the Asus Eee PC. In its way, it is a game changer. I wondered if it would run Mono applications, bringing the open source version of Microsoft .NET to the device. The news is partially good:

[screenshot: mono_ee]

Unfortunately, I’ve not been able to do much more than that so far. I tried compiling a basic forms application, but got a pkg-config error. This may be because of a kernel module called binfmt_misc, which lets you register interpreters for different binary formats. It is normally present in Linux, but seems to be omitted from the Eee kernel. If I am right, then fixing this means figuring out how to recompile the kernel on the Eee. You can still execute Mono applications by running mono as in the screenshot, but the compiler seems to expect binfmt_misc to work. I am sure someone will figure this out.

Update – getting better – we have GUI:

[screenshot: mono_ee2]

Still can’t use -pkg though.

Update

The problem with -pkg is easy to fix. Just install pkg-config 🙂

I’m not clear yet whether the absence of binfmt_misc could cause other problems.

Further update

Everything is working. I can compile and run the Hello World examples here. Note that the Gtk example there does not quit properly, so I suggest you use this modified version.
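For anyone who cannot follow the link, the essence of the fix is to wire the window’s DeleteEvent to Application.Quit, so that closing the window actually ends the process. My reconstruction below may differ in detail from the linked version, but it compiles and quits cleanly:

    using Gtk;

    public class Hello
    {
        static void Main()
        {
            Application.Init();

            Window window = new Window("Hello Mono World");
            // Without this handler the process keeps running after the
            // window is closed – the quit problem mentioned above.
            window.DeleteEvent += delegate { Application.Quit(); };
            window.ShowAll();

            Application.Run();
        }
    }

Compile with gmcs -pkg:gtk-sharp-2.0 hello.cs and run the result with mono hello.exe.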

To get this working, I did as follows:

1. Added a Xandros repository to /etc/apt/sources.list:

deb http://xnv4.xandros.com/xs2.0/upkg-srv2 etch main contrib non-free

2. Installed mono-gmcs (.NET 2.0 compiler). (I think that is the minimum but I’m not 100% sure)

3. Installed pkg-config

4. Installed gtk-sharp2
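
For what it is worth, once the repository is added, steps 2 to 4 come down to a couple of commands (assuming sudo works for your user):

    sudo apt-get update
    sudo apt-get install mono-gmcs pkg-config gtk-sharp2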

I’ve also installed jEdit for editing. It is not in the repository, so I installed it using the jar installer on the jEdit site.

df shows 30% used, not too bad.


CodeRage sessions available for download

You can now download the content from last week’s CodeRage, the virtual developer conference laid on by CodeGear. The downloads use Camtasia and Flash and work well.

A few that I recommend are Ravi Kumar’s session on JBuilder Application Factories from Day 5, and Joe McGlynn on 3rdRail, an IDE for Ruby on Rails, from Day 3. For Delphi futures (64-bit, generics, concurrent programming, hints about cross-compilation to other operating systems) check out Nick Hodges’ session on Day 1. I’ve not viewed everything, so there are no doubt other excellent sessions.

Nevertheless, I have mixed feelings about this CodeRage. The keynotes were weak, with too much high level waffle about how CodeGear is committed to developers etc etc. The conferencing software was no more than adequate, did not work properly for me on Vista, and did not support Mac or Linux. That may explain why attendee numbers in some sessions were embarrassingly small.

I am struggling to make sense of this. CodeGear claims to have 7.5 million registered users; yet only 2,100 registered to attend the free CodeRage, and some of those no doubt never turned up. If that is representative of the level of interest in new CodeGear products, as opposed to the legacy versions its registered users already own, then it is a worrying sign.

Eee PC vs Origami UMPC: D’Oh

I loved this comment to Kevin Tofel’s post on the Asus Eee PC:

You kinda get the feeling that the Origami team is saying “D’OH” right about now.

So it should. Origami (officially known as Ultra Mobile PC) is an attempt to re-define the ultra-portable market. It uses a touch screen, no keyboard. It is typically more expensive than a traditional laptop. It usually comes with just basic Windows software.

The Asus Eee PC is a bunch of mass-market components thrown together. It is the classic clamshell design. It comes with a bundle of free software that encompasses a large percentage of what people actually do on their portable computers: word processing, spreadsheet, presentation, email, web, music playback. It throws in a webcam for good measure. It is cheaper than almost any laptop on the market.

Result: UMPC has pretty much flopped, while Eee PC is a runaway bestseller and nobody can get enough stock.

This isn’t primarily a Windows versus Linux thing. In fact, the Eee PC is set up to be Windows-friendly, and OpenOffice.org is set to save in Microsoft Office formats by default. Further, the Eee PC can run Windows XP, and most of its applications are also available for Windows. An Asus Windows XP Eee PC is planned.

To my mind, the success of Eee PC proves that the Origami team got at least one thing right. There is a market for full-function ultra-portables.

What the Origami team got wrong is, first, that ultra-portables costing more than laptops are never going to be a mass-market proposition; second, that users like keyboards, and would rather have a cheaper device than a touch screen; and third, that a bundle of software which does everything you want is a great advantage.

As it is, Eee PC is bringing desktop Linux to the mass market. Interesting times.
