Tim Anderson’s ITWriting

Tech writing blog

August 7th, 2008

SQL Server 2008 is done

Microsoft has announced that SQL Server 2008 is released to manufacturing – ie. the bits are done, even if you can’t buy it yet. MSDN subscribers can download it now.

This is the product that was “launched” back in February; it’s been a long delay but I get the impression that the SQL team likes to wait until its release really is ready.

SQL Server 2008 is more like a suite of products than a single product now. It has a large range of editions from Compact to Enterprise, and product areas like Analysis Services and Reporting Services are distinct from the core engine.

The pieces that interest me most are the spatial data types, sparse columns, FILESTREAM data type, and the various object-relational layers including LINQ, Entity Framework, ADO.NET Data Services, and the ongoing work with SQL Server Data Services (which is far from done yet).

DBAs will likely have a very different view of what is important, as will Business Intelligence specialists.

SQL Server has prospered by being cheaper than than the likes of Oracle and DB2, and by integrating smoothly with Windows and Active Directory. I wonder if it will feel pressure from even more cost-effective open source offerings like MySQL, as they become more Enterprise-ready?

June 20th, 2008

Where is your SQL Server CE Database?

Maybe not where you think. Now, I admit I am three years late with this bug strange feature of Visual Studio but it wasted some of my time today so it is still worth reporting.

I’ve been writing about creating database applications in Visual Studio. Specifically, I was looking at what happens if you download Visual Basic Express and take the quickest, easiest route to knocking together a database application.

The default local database engine these days is SQL Server Compact 3.5:

When you create applications, the preferred local database is SQL Server Compact 3.5.

says MSDN.

OK, so you add a new database to your project and accept various defaults. The wizard then asks you whether you would like to “copy the file to your project and modify the connection”?

Sounds reasonable, if you can figure out what it means. Default is Yes, so OK do it.

Mistake. Don’t do that. Not, at least, without reading and understanding this document. But I digress. Next up, you are asked another question:

Storing connection strings in your application configuration file eases maintenance and deployment … do you want to save the connection string to the application configuration file?

It’s another option that sounds good. OK, do it.

Now you set up a little table or two, add some data-bound controls, and off you go. Run the app, enter some data, save it. Run the app again … and all your data has disappeared. Why?

Well, it has to be either that the updates are silently failing; or that the database file is getting overwritten. It’s the latter. It turns out that VB is treating your database like any other resource, and copying to bin/debug when you run the app. This is the copy you are connecting to, you update it, but next time you build and run it gets overwritten.

None of this is obvious, because when you look at the connection string in the application settings (which VB hides by default, sigh), it shows the database file in the root of your project folder. Click Test Connection, all is fine. The only warning sign is that the connection string looks like this:

Data Source=|DataDirectory|\test.sdf

So where is |DataDirectory| set? That’s not obvious either. Read here for the answer. It’s an application property that is not visible anywhere, that gets set to different values depending on how you deploy the app. I can see why someone thought this was a smart idea; but the implementation is horrible. It gives you the illusion of having one database file, when in fact you have multiple copies (source, debug, release etc) overwriting one another, and during testing you are never editing the correct one.

Once you have worked this out you can fix it, of course. But here’s another problem. You are the single user of a database. You insert a record and save it, using all the generated data-bound stuff that Visual Studio provides. Works fine. Then you edit the record you just inserted, and save again. Boom. Concurrency exception. Why?

It is all do to with a limitation of SQL Server Compact 3.5. It can’t handle multiple SQL statements. This means that a feature of the ADO.NET TableAdapter, called Refresh the Data Table in the configuration wizard, is not available. This option kicks in when you have an identity column that auto-increments, which is by the easiest way to create a primary key. In this scenario, the actual value of the identity column is not known until after you make the insert, because it is generated by the database engine. Normally, the TableAdapter would retrieve it with a Select statement immediately after the Insert statement. However, with SQL Server Compact 3.5 that does not work.

The result is that saving a record works fine, but next time around the row has an incorrect primary key in the DataSet. No wonder you get a concurrency exception.

You can work around this in code, of course. But what surprises me is just how hard Microsoft has made all this for the kind of newbie programmer who might pick up VB Express. In fact, easy database programming in VB has marched backwards since Visual Basic 3 back in 1993.

By the way, I also dislike the way VB adds so much database gunk to your main form, again by default. What if you add another form to your app? What if you want to delete the first form? It all gets messy fast.

Look at Ruby on Rails. It has simple database handling that works. OK, you are going to have to modify that code eventually; and I accept that database apps have an inherent complexity that no amount of wizards, O/R layers or even "Convention over Configuration” can remove. I still think that simple, single table, single user apps should be, well, simple. Not in VB, unfortunately.

June 10th, 2008

Bet on Entity Framework, not LINQ to SQL

So says Roger Jennings in his post Is the ADO.NET team abandoning LINQ to SQL? His main points in favour of ADO.NET Data Services (formerly Astoria) Entity Framework:

  • It is the focus of more energetic development
  • It already has richer features
  • It supports multiple database engines, not just SQL Server

As Andres Aguiar, software architect at Infragistics, notes in a comment, this has a lot to do with internal politics at Microsoft:

The Data Programmability Team never owned LinQ to SQL, it was owned by the C# team. That’s why we have two O/R mappers, both teams wanted to ship theirs. The C# team looks to be thinking about functional programming now. The Data Programmability will always be thinking about data. That’s why the EF [Entity Framework] is the safe choice.

Although LINQ to SQL is now (apparently) owned by the SQL Server team, it still doesn’t seem plausible or sensible that both will get equal attention. We also learn from Matt Warren that LINQ to SQL was deliberately tied to SQL Server only:

LINQ to SQL was actually designed to be host to more types of back-ends than just SQL server. It had a provider model targeted for RTM, but was disabled before the release. Don’t ask me why. Be satisfied to know that is was not a technical reason.

Note that this wasn’t necessarily a plot in favour of SQL Server world dominance; keeping the entire stack as a Microsoft stack no doubt makes support easier. That said, to me this is the big weakness of LINQ to SQL.

I was impressed by Astoria when I first saw it at the European Tech Ed in 2007. I am not surprised it is gaining ground.

May 16th, 2008

WinFS reborn: SQL Server as a file system

Fascinating interview with Quentin Clark, who led the cancelled WinFS project at Microsoft. Jon Udell is the interviewer.

Clark talks about how technology from WinFS is now emerging as the Entity Framework in ADO.NET (part of .NET 3.5 SP1) and the FileStream column type in SQL Server 2008 - a connection I’d already made at the Barcelona TechEd last year. He also mentions the new HierarchyID column type that enables fast querying of paths, the concept of rows which contain other rows. He adds that a future version of SQL Server will support the Win32 API so that it can support a file system:

In the next release we anticipate putting those two things together, the filesystem piece and the hierarchical ID piece, into a supported namespace. So you’ll be able to type //machinename/sharename, up pops an Explorer window, drag and drop a file into it, go back to the database, type SELECT *, and suddenly a record appears.

Put that together with the work Microsoft is doing on synchronization, and you get offline capability too - something more robust than offline files in Vista. Clark says SharePoint will also benefit from SQL Server’s file system features.

Note that Live Mesh does some of this too. I guess SQL Server is there in the Live Mesh back end, but it strikes me Microsoft is at risk of developing too many ways to do the same thing.

The piece of WinFS that shows no sign of returning is the shared data platform, which was meant to enable applications to share data:

… all that stuff is gone. The schemas, and a layer that we internally referred to as base, which was about the enforcement of the schemas, all that stuff we’ve put on the shelf. Because we didn’t need it.

|