Category Archives: sql

SQL Server 2014 is done: Hekaton, Azure integration

Microsoft has released SQL Server 2014 to manufacturing (an odd phrase in these diskless days) but which signifies that it is code complete for the initial release. General availability is April 1st.

What do you do if hardware trends enable you to stuff vast amounts of RAM into your server, along with many CPU cores? The answer is that you optimize applications to work mostly in RAM, with disk important as a persistence layer. This contrasts to the approach when you have large amounts of disk space and little RAM, when you focus on loading only as much data into memory as you absolutely need.

The implications for a database server are profound. Instead of a logic that goes something like “read from disk, do something, write to disk” you can address the data directly; it is just a memory pointer.

Now combine that with stored procedures compiled to native code. Performance leaps up, and by much more than you get simply by caching data in RAM, or using fast SSD storage, but still using the old disk-based approach in the database engine.

This is the reasoning behind “Hekaton”, properly known as In-Memory OLTP (online transaction processing), which is a new in-memory database engine that comes with SQL Server 2014.

It is fully integrated. You just have to add a filegroup to a a SQL Server database with the keyword CONTAINS MEMORY_OPTIMIZED_DATA and then create a table with the keyword WITH (MEMORY_OPTIMIZED=ON). And for the stored procedures, use WITH NATIVE_COMPILATION.

The speed-up is as great as you would expect. I have seen demonstrations of 30x or more performance increases, like this one in a demo based on one from the SQL Pass conference, but which I did for myself in one of Microsoft’s “Hands On Labs”:

image

In another demo, on an Azure VM, I got a speed up of 7x. Only seven times faster! Still, hard to complain about those sorts of numbers.

Unfortunately, in-memory OLTP is spoilt by some rather severe limitations in this release. The first problem is that a combination of the need to support native compilation of stored procedures, and other limitations, means that only a subset of T-SQL (the query and management language of SQL Server) is supported. You can see the list of what is not supported here; and it is depressing reading, with lots of keywords that you likely do use at the moment; even IDENTITY is on the list of what does not work.

Another issue is that the ability of In-Memory OLTP to take advantage of hardware is not as extensive as you might hope. Lead program manager Kevin Liu told me at a recent press workshop that the team recommends restricting total data size to 256GB, and that the recommended number of CPU sockets is two. You can get servers today with much more memory and more sockets. It gets complicated though: in a multi-socket server memory has processor affinity and there is a thing called NUMA (Non-Uniform Nemory Access) that describes the way memory is shared between processors.

According to Liu, Microsoft expects to lift these limitations in future releases, as well as improving T-SQL support, but things like this remind you that it is a version one release.

What else is in SQL Server 2014? There is some neat Azure integration, including a managed backup tool that is almost one click to have your data backed up to Azure storage; a brilliant facility for small businesses. You can also use Azure for high availability, creating always-on replicas in Azure VMs.

Data warehouse users will like the new clustered columnstore indexes, which allow you do use a column-oriented table structure for much faster processing of typical report and analysis queries. Columnstore indexes first appeared in SQL Server 2012 but were not updateable. Now they are.

SQL Server is well liked, licensing hassles aside; and even on licensing, Microsoft can always point at Oracle and claim, rightly, to be cheaper and less complex. It has earned a reputation for solid performance. SQL Server 2014 looks as good as ever, even if the management tools now look rather dated – the shell for SQL Server Management Studio uses an old version of Visual Studio, which is one of the reasons. I also suspect the SQL Server team lacks a dialog designer, but doubt that the average database admin cares one jot.

That said, it is difficult to describe this as a must-have upgrade, unless you can make good use of “Hekaton” in-memory OLTP. The porting effort will be worth it presuming you can get it to work. One of the good fits for the technology is managing web app session data, or, as in the example above, rapid processing to display recommendations or customisations on a web site.

I can imaging though that many users will look at Hekaton and decide that it is too much work or too immature for immediate use. What is left for them, apart from some nice Azure integration?

Not a huge amount, it seems to me, making this to my mind a transitional release.

Are you planning to upgrade? I would be interested to know your reasons why or why not.

Microsoft SQL Azure versus SQL Server on Amazon AWS

Amazon RDS for Microsoft SQL Server offers cloud instances of SQL Server. Amazon’s offering even supports “License Mobility”, Microsoft jargon that lets volume licensing customers use an existing SQL Server license for an Amazon’s instance. But how does Amazon’s cloud SQL Server compare with Microsoft’s own offering, SQL Database running on Azure?

Peter Marriott has posted on the subject here (registration required). The key point: despite the obvious similarity (both are SQL Server), these two offerings are radically different. Amazon’s RDS SQL is more IaaS (infrastructure as a service) than PaaS (platform as a service). You choose an edition of SQL Server and rent one or more instances. The advantage is that you get full SQL Server, just like the on-premise editions but hosted by Amazon.

Microsoft’s Azure-hosted SQL on the other hand is more abstracted. You do not rent a SQL Server instance; you rent a database. Under the covers Microsoft provides multiple redundant copies of the data, and if traffic increases, it should scale automatically, though the database size is limited to 150GB. The downside is that not all features of SQL Server are available, as I discovered when migrating data.

Marriott adds that SQL Azure supports encrypted connections and has a more usable administration interface.

A further twist: you can also install SQL Server on an Azure Virtual Machine, which would get you something more like the Amazon approach though I suspect the cost will work out higher.

Moving a database from on-premise SQL Server to SQL Azure: some hassle

I am impressed with the new Windows Azure platform, but when I moved a simple app from my local machine to Azure I had some hassle copying the SQL Server database.

The good news is that you can connect to SQL Azure using SQL Server Management studio. You need to do two things. First, check the server location and username. You should already know the password which you set when the database was created. You can get this information by going to the Azure portal, selecting the database, and clicking Show connection strings on the dashboard.

Second, open the SQL firewall for the IP number of your client. There is a link for this in the same connection string dialog.

Now you can connect in SQL Server Management Studio. However, you have limited access compared to what you get as an admin on your local SQL Server.

Here is the Tasks menu for an on-premise SQL Server database:

image

and here it is for a SQL Server Azure database:

image

Still, you can start Export Data or Copy Database from your on-premise connection and enter the Azure connection as the target. However, you should create the destination table first, since the Export Data wizard will not recreate indexes. In fact, SQL Azure will reject data imported into a table without at least one clustered index.

I tried to script a table definition and then run it against the SQL Azure database. You can generate the script from the Script Table as menu.

image

However even the simplest table will fail. I got:

Msg 40514, Level 16, State 1, Line 2
‘Filegroup reference and partitioning scheme’ is not supported in this version of SQL Server.

when attempting to run the script on SQL Azure.

The explanation is here.

Windows Azure SQL Database supports a subset of the Transact-SQL language. You must modify the generated script to only include supported Transact-SQL statements before you deploy the database to SQL Database.

Fortunately there is an easier way. Right-click the table and choose Generate Scripts. In the wizard, click the Advanced button for Set Scripting Options.

image

Find Script for the database engine type, and choose SQL Azure:

image

You may want to change some of the other options too. This generates a SQL script that works with SQL Azure. Then I was able to use the Export Data wizard using the new table as the target.  If you use Identity columns, don’t forget Enable identity insert in Edit Mappings.

image

What’s new in SQL Server 2012?

Microsoft’s SQL Server 2012 is released next month and available to download now (I am not sure what the distinction is). I have a high regard for Microsoft’s database server; it seems to me that the team mostly gets it right. The product has become somewhat diffuse though, especially as the Business Intelligence aspect has grown, and this may account for what to me is a rather unfocused launch for SQL Server 2012, even though its name suggests that it is the most significant release since SQL Server 2008.

The following slide summarises the new features, presumably with the type size suggesting the importance of each one.

image

But is the ODBC Driver for Linux really more important than the SQL Server Data Tools, for example? Not in my view; but that reflects how SQL Server represents different things to different people.

So what are the key new feature? Here’s my quick take.

Always On

A new feature called Availability Groups that is an improved version of database mirroring

Improved failover clustering which supports multi-site clustering across subnets – above to failover across datacentres.

ColumnStore Index

A new type of index for data warehouses. This is actually pretty simple: the name says it all. Here is Microsoft’s illustration:

image

and explanation:

A columnstore index stores each column in a separate set of disk pages, rather than storing multiple rows per page as data traditionally has been stored.

Why do this? Because it is more efficient when the query only requests a a few columns from the table. Microsoft claims performance improvements from 6X to 100X in cases where the the data can be cached in RAM, and thousand-fold improvements where the working set does not fit in RAM.

SQL Server Data Tools

This is my favourite feature, probably because it is developer-focused. These are the tools that were code-named “Juneau” and which install into Visual Studio 2010. There are some visual tools, but this is essentially a code-centric approach to database design, where you design your database with all its tables, queries, triggers, stored procedures and so on. You can then build it and test it against a private “localdb” instance of SQL Server. What I like is that the database project includes the entire design of your database in a form that can be checked into source control and compared against other schema versions. Here is the Add New Item dialog for a database project:

image

Data Quality Services

Data Quality Services (DQS) lets you check your data against a Data Quality Knowledge Base (DQKB), the contents of which are specific to the type of data in the database and may be created and maintained by your business or obtained from a third-party. If your data includes addresses, for example, the DQKB might have all valid city names to prevent errors. Features of DQS include data cleansing, de-duplication through data matching, profiling a database for quality, and monitoring data quality.

image

Illustration and more details are here.

Updated SQL Server Management Studio

SQL Server Management Studio now runs in the Visual Studio 2010 shell.

LocalDB

LocalDB is a local instance of SQL Server aimed at developers and for use as an embedded database in single-user applications. It is a variant of SQL Server Express, but different in that it does not run as a service. Rather, the LocalDB process is started on demand by the SQL native client and closed down when there are no more connections. You can attach database files at runtime by using AttachDBFileName in the connection string. LocalDB is intended to replace user instances which are now deprecated.

FileTables

This is the most intriguing feature in SQL Server 2012. It is described here:

The FileTable feature brings support for the Windows file namespace and compatibility with Windows applications to the file data stored in SQL Server … In other words, you can store files and documents in special tables in SQL Server called FileTables, but access them from Windows applications as if they were stored in the file system, without making any changes to your client applications.

and the purpose:

Enterprises can move this data from file servers into FileTables to take advantage of integrated administration and services provided by SQL Server. At the same time, they can maintain Windows application compatibility for their existing Windows applications that see this data as files in the file system.

Integration of the file system and the database is not a new idea, and Microsoft has tried variants before, such as the “M” drive that was once part of Exchange, the aborted WinFS feature planned for Windows Longhorn (Vista), and SharePoint, which can store documents in SQL Server while presenting them as Windows file shares through WebDAV.

That said, FileTables in SQL Server 2012 are not an attempt to reinvent the file system, but presented more as a way of supporting legacy applications while managing data in SQL Server. It is an interesting feature though, and it would not surprise me if users find some unexpected ways to exploit it.

Power View

Codenamed “Project Crescent”, this is a web-based reporting client for businesses that have embraced Microsoft’s platform, because it has several key dependencies:

  • SharePoint Server Enterprise Edition
  • SQL Server Reporting Services
  • Silverlight on the client

In fact, Power View is described as:

a feature of SQL Server 2012 Reporting Services Add-in for Microsoft SharePoint Server 2010 Enterprise Edition

Power View reports that I have seen do look good, and have an Office ribbon style designer for designing customising the report. That said, I would guess that Microsoft now wishes it had used HTML 5 rather than Silverlight for this – there are those Apple iPad and Windows 8 Metro users to think of, after all.

Microsoft emphasises that Power View is not a replacement for Report Designer or Report Builder, but an ad-hoc reporting tool.

Closing thoughts

There is more in SQL Server 2012, as a glance back at the initial slide will tell you, but the above is a starting point if you are wondering what it is all about. It is also worth noting that Microsoft still gives away SQL Server Express which supports up to 10GB per database and includes many of same features as the paid-for versions; it is the same product at heart.

Someone who finds that SQL Server Express actually meets all their needs asked me why Microsoft gives it away. My guess is that this is a consequence of all the other free database engines available such as MySQL, PostgreSQL, interesting  newer NoSQL options like mongoDB, and of course equivalent free versions of Oracle and IBM DB2. A proportion of customers who start with SQL Server Express will grow into the paid-for editions.

This does make SQL Server Express an excellent choice for smaller scale applications and small businesses, particularly since it integrates smoothly into Microsoft’s developer stack. Having said which, I am becoming something of an Entity Framework sceptic, but that is a story for another day – and fortunately you do not have to use EF if you do not want to.

Microsoft LocalDB: another option for local databases

Microsoft is launching SQL Server 2012 on  March 7th 2012. In Microsoft’s world “launches” do not always coincide with the availability of release code, which may come before or after, but they are usually not far apart.

The big news in SQL Server 2012 is in new BI (Business Intelligence) features and the ability to import and export from the open source Hadoop framework. Microsoft is also supporting Hadoop on Windows Server and Windows Azure. Robert Sheldon has an excellent article on TechTarget which describes the Hadoop integration.

At the other end of the scale though there is a new approach to local databases, which interests me as this is the kind of thing an application developer might use for local applications. SQL Express LocalDB uses the full SQL Server Express engine but does not require a SQL Server service to be running or even installed. In summary:

  • The LocalDB binaries can be installed with a separate installer or as part of the SQL Server Express.
  • LocalDB instances are isolated to the user.
  • The LocalDB system databases are buried deep in AppData in the user profile. The default location for user databases is the root of the user profile.
  • The old SQL Server User Instances are now deprecated

A driver for LocalDB has to know how to fire up the SQL Server binaries if they are not running, which means that old drivers will not work. Microsoft has patched System.Data.SqlClient in .NET 4 to work with LocalDB.

LocalDB Pros and cons

The advantage of LocalDB over the cut-down Compact Edition is that you get full access to SQL Server features including transactions, stored procedures, geographical data types and so on. It is meant to improve on the old user instances by simplifying matters for the user: no need to run a service, and management of SQL Server completely hidden.

The disadvantage is that your app still has the overhead of SQL Server running in a separate process. A SQL Server LocalDB install also takes around 140MB, which bumps up the download size if your app is distributed on the web.

If you need a local database, it seems to me that Microsoft still has nothing that quite matches SQLite, which runs in-process, is lightning fast, and which does not require any hidden system databases.

On the other hand, it might make sense to use SQL Server if you want to integrate with a server database, or if you are familiar with coding for SQL Server.

I would like to see Windows ship with a local database engine documented as something developers can rely on being there, as with Core Data on the Mac. It would also help if the SQL Server team got together with the Office team and worked out how to get Access and SQL Server Express to use the same database engine – yes, I know Access can use SQL Server data, but it still defaults to its own .ACCDB format and JET database engine.

SQL Server 2011 Denali publishes tables as Windows network folders

I’ve been testing the new Community Tech Preview of SQL Server 2011, codenamed “Denali”.

Here is an intriguing feature. You can now create a new kind of table called a FileTable. A FileTable is mapped to a folder on the filesystem, though you are not meant to access it directly once it is managed by SQL Server. However, you can access the folder in Windows Explorer, or over the network, as a network share. When you do this, a SQL Server component intercepts the Windows API calls and updates the FileTable. FileTables build on the existing FILESTREAM feature in SQL Server 2008, and the documents in the folder are stored as FILESTREAM data.

The illustration shows a folder in Windows Explorer that is also a SQL Server FileTable.

image

Is this the return of WinFS, the fabled relational file system which was originally planned for Windows Longhorn, but abandoned? Not really. According to the docs:

FileTables remove a significant barrier to the use of SQL Server for the storage and management of unstructured data that is currently residing as files on file servers. Enterprises can move this data from file servers into FileTables to take advantage of integrated administration and services provided by SQL Server. At the same time, they can maintain Windows application compatibility for their existing Windows applications that see this data as files in the file system.

Guardian.co.uk enthuses about MongoDB, plans to ditch Oracle

The Guardian’s Mat Wall has spoken here at Qcon London about why it is migrating its web site away from Oracle and towards MongoDB.

He also said there are moves towards cloud hosting, I think on Amazon’s hosted infrastructure, and that its own data centre can be used as a backup in case of cloud failure – an idea which makes some sense to me.

So what’s wrong with Oracle? The problem is the tight relationship between updates to the code that runs the site, and the Oracle database schema. Significant code updates tend to require schema updates too, which means pausing content updates while this takes place. Journalists on a major news site hate these pauses.

MongoDB by contrast is not a relational database. Rather, it stores documents in JSON (JavaScript Object Notation) format. This means that documents with new attributes can be added to the database at runtime.

Although this was the main motivation for change, the Guardian discovered other benefits. Developer productivity is significantly better with MongoDB and they are enjoying its API.

Currently both MongoDB and Oracle are in use. The Guardian has written its own API layer to wrap database access and handle the complexity of having two radically different data stores.

I enjoyed this talk, partly thanks to Wall’s clear presentation, and partly because I was glad to hear solid pragmatic reasons for moving to a NoSQL data store.

IE9 in Windows Phone will be good for cross-platform JavaScript and HTML5 apps

Microsoft CEO Steve Ballmer, accompanied by Nokia’s Stephen Elop, showed coming updates for Windows Phone 7 at a Mobile World Congress keynote last night.

A minor update due in early March will add copy and paste, and CDMA support is also coming in the first half of 2011.

The more interesting update is planned for the second half of 2011 – I’m guessing late this year – and will have multi-tasking for 3rd party apps, as well as a mobile version of Internet Explorer 9. We were told that this will feature the same HTML 5 standards support and hardware acceleration as in the desktop version.

Windows Phone VP Joe Belfiore showed the fish demo running on Windows Phone with IE9 alongside Safari on the iPhone. The iPhone fish jerk slowly across the screen.

image 

Note that Apple will likely have a new iPhone out before mobile IE9 is ready, which may well equal or exceed IE9’s graphics performance.

Nevertheless, this is interesting for developers since it means that the fast “Chakra” JavaScript runtime will be available on the device. HTML and JavaScript is one route to cross-platform mobile applications.

Silverlight on Windows Phone includes a WebBrowser control which has access to isolated storage. This means you could write most of your app in cross-platform JavaScript and HTML, but wrap it in Silverlight for access to native phone features.

It is a shame though that Microsoft does not include the Sqlite local database engine found in WebKit-based mobile browsers. Sqlite is in the public domain so this may be an example of the “not invented here” syndrome. Microsoft does not even have SQL Compact Edition in Windows Phone 7, though it would not surprise me if this also appears in the autumn update. Full details are being held back until the Mix conference in April.

Although it has not been stated, it would make sense for this update to be used in the first Windows Phones from Nokia. On Sunday evening, Nokia stated its desire to deliver a Windows Phone device before the end of the year.

Functional programming, NOSQL themes at QCon London

One reason I enjoy the QCon London software development conference is that it reflects programming trends. Organiser Floyd Marinescu described it as by practitioners for practitioners. In previous years I’ve seen themes like disillusionment with enterprise Java, the rise of Agile methodologies, the trend towards dynamic languages, and the benefits of REST.

So what’s hot this year? A couple of trends are striking. One is functional programming. Don Syme, Principal Researcher at Microsoft Research and co-inventor of F#, gave a lively presentation on functional approaches to parallelism and concurrency. He shows screen after screen of equivalent F# and C# code, illustrating how F# is more concise and expressive, as well as being better suited to concurrent development.

F# is one of the languages included by default in Visual Studio 2010, which should be released shortly.

I asked Syme what sort of problems are not well suited to F#. In his reply he described the state of play in Visual Studio 2010, where you can easily create F# libraries but there is no designer support for user interface code, such as Windows Forms or Windows Presentation Foundation. That is merely a tooling issue though.

Syme’s point is that functional programming, and F# in particular, is ideal for today’s programming challenges, including concurrency and asynchronous code.

If nothing else, he convinced me that every .NET programmer should at least be looking at F# and learning what it can do.

The functional programming track at QCon is not just about F#, of course, though in some ways it seems to be the functional language of the moment.

The other theme that has made a big impression is NoSQL, or what the QCon track calls “Non-relational database managers and web-oriented data”.Geir Magnusson from Gilt Groupe talked about the challenge of running a web site which has extreme peaks in traffic, and where every user needs dynamic data and transaction support so simple caching does not work. They were unable to get their relational database store to scale to handle thousands of transactions a second. They solved the problem with an in-memory non-relational database.

In another talk, the BBC discussed their use of CouchDB for highly scalable web sites.

PDC day one: Windows in the cloud

Today was cloud day at PDC. Microsoft announced that Windows Azure will become a production platform on January 1st, with billing starting from February 1st. It also announced the beta of Windows Server AppFabric role, for on-premise apps that can either stay on-premise or be deployed to Azure later; and some new developments like the Windows Server Virtual Machine role on Azure, a pre-configured Windows Server VM into which you will be able to deploy an application.

Azure was first announced at the 2008 PDC, and had a stuttering start, with a CTP (Community Tech Preview) that was difficult to use, major changes to SQL Server Data Services – a simplified cloud database that was scrapped and replaced with full SQL Server – and generally poor marketing from Microsoft. I was not sure whether the company was serious about Azure, or merely trying to tick the cloud box.

I do now think it is serious, and delivering some interesting technology for easily scalable cloud-hosted applications. Microsoft does not sees its cloud services as replacing your in-house servers (no surprise there), but more as a way of deploying certain kinds of web applications. A great feature is that thanks to Active Directory Federation Services in combination with the new .NET library called Windows Identity Foundation you can relatively easily have use your Azure applications authenticated against your internal Active Directory.

The surprise of the day was when Matt Mullenweg of WordPress fame turned up to demo WordPress running on Azure, which now supports PHP and MySQL as well as Java applications. Another unexpected guest was Loic Le Meur of Seesmic, who introduced Seesmic for Windows and also talked about a coming Silverlight version.

That said, the keynote did not exactly crackle with excitement. Microsoft seemed almost to downplay what is now possible with Azure, perhaps sensing that it could be disruptive to its own business model. A telling moment came during a press briefing when Doug Hauger, Azure General Manager, denied that Windows or Office were in any sort of decline. Despite his position he seems to be under the illusion that we will happily continue with our fragile on-premise, single platform, micro-managed IT systems.

I enjoyed the day though. The beauty of PDC is that Microsoft rolls out its best speakers; it was great to hear Mark Russinovich explain the kernel changes in Windows 7 and Server 2008 R2 – same kernel of course – and I will be writing more about the session shortly.

I’m expecting more focus on Office, Silverlight and Visual Studio tomorrow, when Steven Sinofsky, Scott Guthrie and Kurt DelBene will be giving the keynote, and hoping for some compelling announcements.