How Hyper-V can seem to lose your data

I’m sure it can really lose your data as well, but in this case “seem” is the appropriate word. I’ve been messing around with Hyper-V and one of my test machines is a SharePoint server. I started this up and found I could not access it over the network. On further investigation, it turned out to be a broken trust relationship with the Domain Controller. In other words, on attempting to log on with domain credentials I got the message:

The trust relationship between this workstation and the primary domain failed

The official advice when confronted with this problem is to remove the machine from the domain and re-join it, creating a new computer account. I did so, logged on, and was disappointed to discover that SharePoint was now empty. Worse still, even checking the SQL Server databases did not uncover my content. All my documents had vanished.

It turned out that I had done the wrong thing. What had really happened is that Hyper-V had been saving my changes on that virtual hard drive to a “differencing disk”, a file with an .avhd extension. This is part of the Hyper-V snapshot system. Somehow, Hyper-V had forgotten the differencing disk, and started up my SharePoint VM using the last fully merged copy of the drive, which was over a month old. My drive had gone back in time, so the data had gone.

The solution was to restore the old parent .vhd from backup, and then manually merge it with the differencing file. Step by step instructions are here. Since I had deleted the original computer account, I then had to remove and rejoin the machine to the domain a second time. All was well and my data reappeared.

The bug here is how Hyper-V managed to start with an old version of the virtual hard drive in the first place. I can imagine this causing panic if it occurs in production – and once you start writing new, important data to the old version you are really in trouble. I was lucky that the discrepancy was severe enough that Active Directory complained.

Virtualization may be wonderful; but it also introduces new problems of its own.

The other lesson is that those .vhd files in C:\Users\Public\Public Documents\Hyper-V\Virtual Hard Disks do not necessarily contain your latest data. You also need to consider the .avhd files stored handily at C:\ProgramData\Microsoft\Windows\Hyper-V\Snapshots.
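
If you want a quick way to see whether a host has differencing disks lying around, something like the following Python sketch would do it. It is only an illustration: the function name find_differencing_disks is my own invention, the folders to search are whatever you pass on the command line, and it goes purely by the .avhd file extension rather than asking Hyper-V which disks a VM is actually using.

```python
import sys
from pathlib import Path


def find_differencing_disks(*folders):
    """Return the .avhd files found under the given folders, newest first.

    Hypothetical helper: it only matches on the file extension and does not
    query Hyper-V about which differencing disk belongs to which VM.
    """
    found = []
    for folder in folders:
        root = Path(folder)
        if not root.is_dir():
            continue  # skip folders that do not exist on this host
        for path in root.rglob("*"):
            if path.is_file() and path.suffix.lower() == ".avhd":
                found.append(path)
    # Most recently modified first, since that is where the latest data lives.
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    # Pass the folders to search as command-line arguments.
    for disk in find_differencing_disks(*sys.argv[1:]):
        print(disk)
```

Run it with the virtual hard disk and snapshot folders as arguments. Any .avhd file it prints means the matching .vhd on its own is not the whole story.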

6 thoughts on “How Hyper-V can seem to lose your data”

  1. Tim,

    I’m having a discussion with a colleague and we are wondering…

    Are you saying that “in doing the wrong thing” that you should have first merged the differencing files and therefore you would not have needed to rejoin the domain and also SharePoint and SQL would’ve been intact and that this is normal?

    Or, are you saying “what had really happened” is that Hyper V truly forgot about the differencing files upon the initial boot which produced the “The trust relationship between this workstation and the primary domain failed” and the problems with SQL and SharePoint?

  2. Hi Greg

    I’m saying that I needn’t have deleted the computer account. If I’d realised what had happened, I could just have merged the differencing files and it would have been fine.

    This is not normal though. So your second statement is true as well.

    There wasn’t really a problem with SQL and SharePoint – the VM had just reverted to its state at an earlier date, without being asked.

    Tim

  3. I had a standalone SQL/MOSS VM running Server 2008 (let’s call it the Test1 VM) that had snapshots I would roll back to occasionally (used for testing). After a month or so, the rollback was too old for my DC and the trust relationship broke. I resolved it by going to the DC (Windows Server 2008-based) and under AD Users and Computers saying “Disable Account” for the Test1 machine on the domain. With the Test1 machine still on the network (and logged into it as local admin) I went to join it to the same domain it was already joined to (when challenged – I provided my domain admin creds). This “rejoined” the VM to the domain and flipped the “disabled” bit for the machine account to “enabled”. I did NOT unjoin the machine or place it in workgroup. I believe this approach saved me from pulling down new SIDs and royally screwing up SharePoint. With the trust re-established I took a new baseline VM snapshot which should be good for another few months…

  4. Thanks for the tip; I wonder if this generally works when the trust relationship breaks.

    Tim

  5. Applied a 3 month old VM snapshot today, resulting in broken domain trust. Disabling the machine account in ADS & rejoining the VM worked as advertised.

    Amateurs such as myself may find that coercing the OK button of the VM’s Computer Name dialog into an enabled state when “joining the same domain” will seem impossible until one remembers the FQN & NETBIOS domain names are different strings.

  6. For the broken trust issue, you don’t need to drop it from the domain and re-join it; you can simply run this command at the command prompt of the now untrusted workstation: netdom resetpwd /server:YourDcName /userd:YourDomainName\administrator /passwordd:*
