Tag Archives: unicode

Using SQLite 3 with Unicode in Delphi

I’ve updated my SQLite3 Delphi wrapper for Unicode in Delphi 2009 and higher. Previous versions of the wrapper ducked the issue by using Ansi strings throughout.

image

I actually used Embarcadero Delphi XE for the development, but I would expect it to work in Delphi 2009 and higher, since it was in that version that Delphi first properly supported Unicode. Converting older Delphi projects is meant to be seamless, except in cases where you are using pointers or doing interop with native DLLs; of course this wrapper does both.

The SQLite 3 API expects either UTF8 or Unicode strings. To be more precise, some functions have Unicode versions indicated with a “16” suffix, and some do not, in which case they expect UTF8 if they accept string values. Although UTF8 strings support Unicode characters, most characters generally occupy a single byte just as in Ansi strings, so one of the things I discovered was that I could not simply rely on PChar, Delphi’s null-terminated string type, which from Delphi 2009 is a Unicode type with double-byte characters. Instead, for cases where the SQLite 3 API expects a UTF8 string, I have used PAnsiChar as before.

It is all somewhat confusing, and there are a few cases where Delphi does not do quite what you would expect. I recommend Marco Cantu’s paper Delphi and Unicode [pdf], one of the best resources I found. This article by Nick Hodges on Unicodifying your code is handy too.

Finally, in the example I keep an object in memory and it is easy to end up with code paths that do not free it. I love this feature of Delphi (since at least 2007) which informs you of your mistake when the application closes:

image

I have uploaded the code and you can find it linked here.

Update: My assumption that Delphi 2010 would work the same way as Delphi XE was incorrect. I have some code that reads a blob field containing a UTF8 string and returns it as a string. I read the stream into a byte array and then cast it to a UTF8String:

str := UTF8String(bytes);

where str is a UTF8string variable. This worked in Delphi XE, but in Delphi 2010 I got garbage. I have modified the code to use a TStringStream and now it works in both.