Category Archives: c++

Embarcadero All-access: a better way to deploy developer tools?

I have a call lined up with Embarcadero today, and wanted to catch up with their latest tools. It reminded me of something I’d intended to post about for some time: the Embarcadero All-Access system, which allows no-touch installation of many of its tools. Here is how it works. First, you run the All-Access client:

[Screenshot: the All-Access client with its list of available tools]

I’m not showing all the available tools here: I count 17 currently. You’ll notice many of them are marked InstantOn. Let’s say I want to take a look at DBArtisan. I click the link and get a dialog:

[Screenshot: the InstantOn download prompt for DBArtisan]

This invites me to start a download. Click Yes and I get a download thermometer:

[Screenshot: the download progress dialog]

Once downloaded, I have to pass a license screen and enter a serial number. Presuming you have a current subscription, you can get a serial number by logging on to your Embarcadero account and requesting it there, where it is supplied instantly. This part of the process is similar to that used by Microsoft for MSDN subscriptions. It is a shame it is not built into the All-Access desktop client, but it is only a minor inconvenience.

Then the application runs.

[Screenshot: DBArtisan up and running]

No further setup, no install options, or any of the other complications that often accompany installing developer tools.

To be fair, I can think of other development tools that are pretty much download and run. Eclipse is usually good in this respect, at least until you try to get updates. Further, even with All-Access there can be additional steps. InstantOn 3rdRail, for example, does not install a Ruby runtime, so it is not really click-and-run: the Eclipse-based IDE runs, but you cannot start a project without getting a Ruby interpreter from somewhere.

Nevertheless, this is the closest I’ve seen to on-demand developer tools, short of the interesting browser-hosted tools that are emerging. Embarcadero now also calls it the ToolCloud. It is not just an easy install; this is application virtualisation:

Aimed at simplifying deployment, enabling side-by-side versioning of products, and breaking down the barriers to use, InstantOn is also great in locked-down desktop environments, since the product does not affect any system files or system registry settings.

says the FAQ.

Alongside the technical aspects, All-Access simplifies license management for a development team. You can install the server piece on your own network for full control.

This comes at a price, of course. There are four subscription levels, from Bronze to Platinum, though even Bronze gives you use of a wide range of tools including Delphi, C++ Builder, JBuilder, Rapid SQL, some parts of ER/Studio, 3rdRail and Delphi for PHP. Example pricing from Grey Matter in the UK starts at £3188.57 for a one-year Bronze concurrent license.

The interesting question: when can this be made into a generic tool that developers can use for deploying their own applications?

How Intel’s compiler underperforms on other CPUs: artificial impairment versus failure to optimise

Last month the US Federal Trade Commission sued Intel for anti-competitive practices, and in my post on the subject I tried to make sense of part of the FTC’s complaint, that:

Intel secretly redesigned key software, known as a compiler, in a way that deliberately stunted the performance of competitors’ CPU chips.

A few days ago Agner Fog wrote an article that sheds some light on the subject:

The reason is that the compiler or library can make multiple versions of a piece of code, each optimized for a certain processor and instruction set, for example SSE2, SSE3, etc. The system includes a function that detects which type of CPU it is running on and chooses the optimal code path for that CPU. This is called a CPU dispatcher. However, the Intel CPU dispatcher does not only check which instruction set is supported by the CPU, it also checks the vendor ID string. If the vendor string says “GenuineIntel” then it uses the optimal code path. If the CPU is not from Intel then, in most cases, it will run the slowest possible version of the code, even if the CPU is fully compatible with a better version.
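
Fog’s description maps onto a simple pattern. Here is a hypothetical C++ sketch of what such a dispatcher might look like (not Intel’s actual code; the worker routines and function names are invented for illustration):

```cpp
#include <cstring>

#ifdef _MSC_VER
#include <intrin.h>
#else
#include <cpuid.h>
#endif

// Query a CPUID leaf into regs[0..3] = EAX, EBX, ECX, EDX.
static void cpuid(unsigned regs[4], unsigned leaf) {
#ifdef _MSC_VER
    __cpuid(reinterpret_cast<int*>(regs), static_cast<int>(leaf));
#else
    __get_cpuid(leaf, &regs[0], &regs[1], &regs[2], &regs[3]);
#endif
}

static bool is_genuine_intel() {
    unsigned r[4];
    cpuid(r, 0);
    char vendor[13] = {};
    std::memcpy(vendor + 0, &r[1], 4);  // vendor string lives in EBX,
    std::memcpy(vendor + 4, &r[3], 4);  // then EDX,
    std::memcpy(vendor + 8, &r[2], 4);  // then ECX
    return std::strcmp(vendor, "GenuineIntel") == 0;
}

static bool has_sse2() {
    unsigned r[4];
    cpuid(r, 1);
    return (r[3] & (1u << 26)) != 0;    // CPUID.01H:EDX bit 26 = SSE2
}

void compute_generic(float* v, int n) { // portable baseline path
    for (int i = 0; i < n; ++i) v[i] *= 2.0f;
}

void compute_sse2(float* v, int n) {    // imagine SSE2 intrinsics here
    compute_generic(v, n);
}

// The contentious step: the fast path is gated on the vendor string,
// not just on the instruction set the CPU actually reports.
void compute(float* v, int n) {
    if (is_genuine_intel() && has_sse2())
        compute_sse2(v, n);      // Intel CPUs get the optimised code
    else
        compute_generic(v, n);   // others fall back, even if SSE2-capable
}
```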

Fog also notes a clause in Intel’s November 2009 settlement with AMD [PDF] in which the company undertakes not to:

include any Artificial Performance Impairment in any Intel product, or require any Third Party to include an Artificial Performance Impairment … “Artificial Performance Impairment” means an affirmative engineering or design action by Intel (but not a failure to act) that (i) degrades the performance or operation of a Specified AMD product (ii) is not a consequence of an Intel Product Benefit and (iii) is made intentionally to degrade the performance or operation of a Specified AMD product.

It is a fine distinction. At what point does failure to optimise constitute “affirmative engineering”? What riles developers is that even when a non-Intel CPU reports support for an optimisation such as SSE, the Intel-compiled code will not use it unless the CPU is an Intel one. You could argue that this is inaction (failure to optimise) rather than action (deliberately running slower); but the end result is the same: worse performance on non-Intel processors.

The obvious practical solution is to use other compilers, but for certain types of work Intel’s compiler is considered the best.

Has Intel now agreed to do a better job? You tell me; I don’t think the clause quoted above tells us one way or another. I do think it is legitimate for the government to press Intel at least to take advantage of obvious optimisations on third-party processors, since this benefits everyone. Even so, Intel will always optimise best for its own CPUs and that is to be expected.

Performance tests are another issue. It is deceptive to produce test results showing performance differences without also revealing that in one case the code is optimised, and in another it is not. That said, if Intel has a smart optimisation that is specific to its own CPUs, there is no reason why it should not trumpet the fact. This is a matter of disclosure.

Finally, developers take note: if you are compiling for a general market that might or might not be using Intel CPUs, maybe the Intel compiler is not the best choice.


Parallel Programming: five reasons for caution. Reflections from Intel’s Parallel Studio briefing.

I’m just back from an Intel software conference in Salzburg where the main topic was Parallel Studio, a new suite which adds Intel’s C/C++ compiler, debugging and profiling tools into Visual Studio. To some extent these are updates to existing tools like Thread Checker and VTune, though there are new features such as memory checking in Parallel Inspector (the equivalent of Thread Checker) and a new user interface for Parallel Amplifier (the equivalent of VTune). The third tool in the suite, Parallel Composer, comprises the compiler and libraries, including Threading Building Blocks and Intel Integrated Performance Primitives.

It is a little confusing. Mostly Parallel Studio replaces the earlier products for Windows developers using Visual Studio; though we were told that there are some advanced features in products like VTune that mean you might want to stick with them, or use both.

Intel’s fundamental point is that there is no point in having multi-core PCs if the applications we run are unable to take advantage of them. Put another way, you can get remarkable performance gains by converting appropriate routines to use multiple threads, ideally as many threads as there are cores.
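
To make that concrete, here is a minimal sketch of the kind of conversion Intel has in mind, using OpenMP (one of the models included with Parallel Composer); the loop body is invented for illustration:

```cpp
// Compile with /openmp (MSVC) or -fopenmp (gcc).
#include <cmath>
#include <vector>

void scale_all(std::vector<double>& data) {
    // Each iteration touches only data[i], so iterations are independent
    // and OpenMP can divide the range among the available cores.
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(data.size()); ++i)
        data[i] = std::sqrt(data[i]) * 2.0;
}
```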

James Reinders, Intel’s Chief Evangelist for software products, introduced the products and explained their rationale. He is always worth listening to, and did a good job of summarising the “free lunch is over” argument and explaining Intel’s solution.

That said, there are a few caveats. Here are five reasons why adding parallelism to your code might not be a good idea:

1. Is it a problem worth solving? Users only care about performance improvements that they notice. If you have a financial analysis application that takes a while to number-crunch its data, then going parallel is a big win. If your application is a classic database forms client, it is probably a waste of time from a performance perspective. You care much more about how well your database server is exploiting multiple threads on the server, because that is likely to be the bottleneck.

There is another reason to do background processing, and that is to keep the user interface responsive. This matters a lot to users. Intel said little about this aspect; Reinders told me it is categorised as “convenience parallelism”. Nevertheless, it is something you probably should be doing, though it requires a different approach from parallelising for performance.
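
For illustration, here is a minimal sketch of convenience parallelism: the long job runs on a worker thread while the main thread, standing in for a UI thread, stays free. It uses C++11’s std::thread purely for brevity; a codebase of this era would use CreateThread, a thread pool or a library instead:

```cpp
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<bool> done(false);

void long_running_job() {
    std::this_thread::sleep_for(std::chrono::seconds(2)); // stand-in for real work
    done = true;  // signal completion rather than blocking the caller
}

int main() {
    std::thread worker(long_running_job);
    while (!done) {  // the "UI" thread keeps servicing events meanwhile
        std::cout << "UI still responsive...\n";
        std::this_thread::sleep_for(std::chrono::milliseconds(400));
    }
    worker.join();
}
```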

2. Will it actually speed up your app? There is an overhead in multi-threading, as you now have to manage the threads as well as performing your calculations. The worst case, according to Reinders, is a dual-core machine, where you have all the overhead but only one additional core. If the day comes when we routinely have, say, 64 cores on our desktop or laptop, then the benefit becomes overwhelming.
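
A back-of-envelope calculation with Amdahl’s law (the formula is standard; the numbers are my own) shows why the dual-core case is marginal. If a fraction p of a routine parallelises perfectly across n cores, the best possible speedup is:

S(n) = 1 / ((1 − p) + p/n)

With p = 0.5, two cores give a ceiling of just 1.33x before any thread-management overhead is subtracted, while 64 cores manage only about 1.97x; the extra cores pay off in proportion only as p approaches 1.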

3. Is it actually desirable on a multi-tasking operating system? Consider this: an ideally parallelised application, from a performance perspective, is one that uses 100% CPU across all cores until it completes its task. That’s great if it is the only application you are running, but what if you started four of these guys (same or different applications) simultaneously on a quad-core system? Now each application is contending with others, there’s no longer a performance benefit, and most likely the whole system is going to slow down. There is no perfect solution here: sometimes you want an application to go all-out and grab whatever CPU it needs to get the job done as quickly as possible, while sometimes you would prefer it to run with lower priority because there are other things you care about more, such as a responsive operating system, other applications you want to use, or energy efficiency.

This is where something like Microsoft’s concurrency runtime (which Intel will support) could provide a solution. We want concurrent applications to talk to the operating system and to one another, to optimize overall use of resources. This is more promising than simply maxing out on concurrency in every individual application.

4. Will your code still run correctly? Edward Lee argues in a well-known paper, The Problem with Threads, that multi-threading is too dangerous for widespread use:

Many technologists are pushing for increased use of multithreading in software in order to take advantage of the predicted increases in parallelism in computer architectures. In this paper, I argue that this is not a good idea. Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. Threads, as a model of computation, are wildly nondeterministic, and the job of the programmer becomes one of pruning that nondeterminism. Although many research techniques improve the model by offering more effective pruning, I argue that this is approaching the problem backwards. Rather than pruning nondeterminism, we should build from essentially deterministic, composable components. Nondeterminism should be explicitly and judiciously introduced where needed, rather than removed where not needed.

I put this point to Reinders at the conference. He gave me a rather long answer, saying that it is partly a matter of using the right libraries and tools (Parallel Studio, naturally), and partly a matter of waiting for something better:

Lee articulates the dangers of threading. Did we magically fix it, or do we really know what we’re doing in inflicting this on the masses? It really comes down to determinism. If programmers make their program non-deterministic, getting out of that mess is something most programmers can’t do, and if they can it’s horrendously expensive.

He’s right, if we stayed with Windows threads and Pthreads and programming at that level, we’re headed for disaster. What you need to see is tools and programming templates that avoid that. The evil thing is what we call shared mutable state. When you have things happening in parallel, the safest thing you can do is that they’re totally independent. This is one of the reasons that parallelism on servers works so well, in that you do lots and lots of transactions and they don’t bump into each other, or they only interface through the database.

Once we start opening up shared mutable state, encouraging threading, we set ourselves up for disaster. Parallel Inspector can help you figure out what disasters you create and get rid of them, but ultimately the answer is that you need to encourage people to use programming like OpenMP or Threading Building Blocks. Those generally guide you away from those mistakes. You can still make them.

One of the open questions is: can you come up with programming techniques that completely avoid the problem? We do have one that we’ve just started talking about called Ct … but I think we’re at the point now where OpenMP and Threading Building Blocks have proven that you can write code with that and get good results.
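
The shared mutable state problem Reinders describes is easy to demonstrate. In this minimal sketch (C++11’s std::thread is used purely for brevity; the same bug arises with Windows threads or Pthreads), two threads increment an unsynchronised counter and updates are silently lost:

```cpp
#include <iostream>
#include <thread>

int counter = 0;  // shared mutable state: no lock, no atomic

void bump() {
    for (int i = 0; i < 1000000; ++i)
        ++counter;  // read-modify-write is not atomic; updates get lost
}

int main() {
    std::thread t1(bump), t2(bump);
    t1.join();
    t2.join();
    // Should print 2000000; on most runs it prints something smaller,
    // and a different value each time: Lee's nondeterminism in action.
    std::cout << counter << '\n';
}
```

This is exactly the kind of data race Parallel Inspector is meant to flag; the structured fix is a reduction in OpenMP or Threading Building Blocks, or at minimum an atomic counter.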

Reinders went on to distinguish between three types of concurrent programming, referring to some diagrams by Microsoft’s David Callaghan. The first is explicit, unsafe parallelism, where the developer has to do it right. The second is explicit, safe parallelism. The best approach according to Reinders would be to use functional languages, but he thinks it unlikely that they will catch on in the mainstream. The third type is implicit parallelism that’s safe, where the developer does not even have to think about it. An example is the math kernel library in IPP (Intel Integrated Performance Primitives) where you just call an API that returns the right answers, and happens to use concurrency for its work.

Intel also has a project called Ct (C/C++ for Throughput), a dynamic runtime for data parallelism which Reinders considers also falls into the implicit parallelism category.

It was a carefully nuanced answer, but proceed with caution.

5. Will your application need a complete rewrite? This is a big maybe. Intel’s claim is that many applications can be updated for parallelism with substantial benefits. A guy from Nero did a presentation though, and said that an attempt to parallelise one of their applications, a media transcoder, had failed because the architecture was not right, and it had to be completely redone. So I guess it depends.

This brings to mind another thing which everyone agrees is a hard challenge: how to design an application for effective parallelism. Intel has a tool in preparation called Parallel Advisor, to be part of Parallel Studio at a future date, which is meant to identify candidates for parallelism, but that will not be a complete answer.

Go parallel, or not?

None of the above refutes Intel’s essential point: that effective concurrent programming is essential to the future of computing. This is an evolutionary process though, and at this point there is every reason to be cautious rather than madly parallelising every piece of code you touch.

Additional Links

Microsoft has a handy Parallel Computing home page.

David Callaghan: Design considerations for Parallel Programming

Delphi and C++ Builder 2009 are available to order

I’m choosing my words carefully, because although the CodeGear/Embarcadero site is now showing Delphi 2009 as the current version, if you click through to the order page it only offers a pre-order. Still, it must be all but done. US prices are as follows:

Delphi 2009: Pro $874.00; Enterprise $1974.00; Architect $3474.00

C++ Builder 2009 costs the same; or you can get a bundle with both for a relatively small extra cost, e.g. $1074.00 for Delphi and C++Builder Professional.

Curiously, an upgrade to Delphi 2009 Pro costs only $374.00 (a 57% discount), but an upgrade to Enterprise is $1274.00 (a 35% discount). I can’t make sense of this except on the basis that any product labelled “Enterprise” is presumed not to be price sensitive.

So what’s not in the Pro version? The Enterprise edition adds drivers for server databases in the dbExpress database framework, the DataSnap multi-tier application framework, and a full range of modeling diagrams. Architect bundles ER/Studio Developer Edition, Embarcadero’s database modeling tool, with support for a wide range of database servers.

In other words, the majority of Delphi’s features are in the Pro edition, which is easily the best value; though if you need DataSnap or client-server dbExpress then I guess you have no choice.

The big features here strike me as Unicode in the Visual Component Library, and the new language features: generics and anonymous methods. I’ve not yet looked at the product though, so watch this space.

Visual C++ Feature Pack includes major MFC update

The Visual C++ 2008 Feature Pack is now available for download. It includes an update to MFC (Microsoft Foundation Classes), with classes for an Office Ribbon-style interface (if you can live with the legal aspect) and other new GUI controls. There is also an implementation of TR1 (Technical Report 1) (PDF), an important set of extensions to the ISO C++ 2003 standard library.
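
For a flavour of what that adds, here is a minimal sketch using two TR1 components, shared_ptr and regex; the Feature Pack exposes them in the std::tr1 namespace within the ordinary standard headers:

```cpp
#include <iostream>
#include <memory>   // std::tr1::shared_ptr in the Feature Pack
#include <regex>    // std::tr1::regex in the Feature Pack

int main() {
    std::tr1::shared_ptr<int> p(new int(42)); // reference-counted smart pointer
    std::tr1::regex digits("\\d+");           // ECMAScript-flavour regex
    std::cout << *p << " "
              << std::boolalpha
              << std::tr1::regex_match("12345", digits) << '\n'; // prints: 42 true
}
```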