Category Archives: Notes from the field

Notes from the field: Windows Time Service interrupts email delivery

A business with Exchange Server noticed that email was not flowing. The internet connection was fine, all the servers were up and running including Exchange 2016. Email has been fine just a few hours earlier. What was wrong?

The answer, or the beginning of the answer, was in the Event Viewer on the Exchange Server. Event ID 1035, only a warning:

Inbound authentication failed with error UnexpectedExchangeAuthBlobCheckForClockSkew for Receive connector Default Mailbox Delivery

Hmm. A clock problem, right? It turned out that the PDC for the domain was five minutes fast. This is enough to trigger Kerberos authentication failures. Result: no email. We fixed the time, restarted Exchange, and everything worked.

Why was the PDC running fast? The PDC was configured to get time from an external source, apparently, and all other servers to get their time from the PDC. Foolproof?

Not so. If you typed:

w32tm /query /status

at a command prompt on the PDC (not the Exchange Server, note), it reported:

Source: Free-running System Clock

Oops. Despite efforts to do the right thing in the registry, setting the Type key in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters to NTP and entering a suitable list of time servers in the NtpServer key, it was actually getting its time from the server clock. This being a Hyper-V VM, that meant the clock on the host server, which – no surprise – was five minutes fast.

You can check for this error by typing:

w32tm /resync

at the command prompt. If it says:

The computer did not resync because no time data was available.

then something is wrong with the configuration. If it succeeds, check the status as above and verify that it is querying an internet time server. If it is not querying a time server, run a command like this:

w32tm /config /update /manualpeerlist:”0.pool.ntp.org,0x8 1.pool.ntp.org,0x8 2.pool.ntp.org,0x8 3.pool.ntp.org,0x8″ /syncfromflags:MANUAL

until you have it right.

Note this is ONLY for the server with the PDC Emulator FSMO role. Other servers should be configured to get time from the PDC.

Time server problems seem to be common on Windows networks, despite the existence of lots of documentation. There are also various opinions on the best way to configure Hyper-V, which has its own time synchronization service. There is a piece by Eric Siron here on the subject, and I reckon his approach is a safe one (Hyper-V Synchronization Service OFF for the PDC Emulator, ON for every other VM).

I love his closing remarks:

The Windows Time service has a track record of occasionally displaying erratic behavior. It is possible that some of my findings are not entirely accurate. It is also possible that my findings are 100% accurate but that not everyone will be able to duplicate them with 100% precision. If working with any time sensitive servers or applications, always take the time to verify that everything is working as expected.

Kaspersky encrypted connection scanning breaks ADFS login, internet-facing Dynamics CRM

I was asked to look at a case where a user could not log in to Dynamics CRM. This is an internet-facing deployment which uses ADFS (Active Directory Federation Services). The user put in valid credentials but received a 401 – unauthorized: Access is denied due to invalid credentials.

The odd thing from the user’s perspective is that everything worked fine on other PCs; but switching web browsers did not fix it.

I noticed that Kaspersky anti-virus was installed.

image

Pausing Kaspersky made no difference to the error. However I came back to this after eliminating some other possible problems. I noticed that if you looked at the certificate on the ADFS site it was not from the site itself, but a Kaspersky certificate.

image

The reason for this is that Kaspersky wants to inspect encrypted traffic for malware.

I understand the rationale but I dislike this behaviour. Your security software should not hide the SSL certificate of the web site you are visiting. Of course it is particularly dislikeable if it breaks stuff, as in this case. I found the setting in Kaspersky and disabled both this feature, and another which injects script into web traffic (though this proved not to be the culprit here), for the sake of Kaspersky’s “URL Advisor”).

Personally I feel that encrypted traffic should only be decrypted in the recipient application. Kaspersky’s feature is an SSL Man-in-the-Middle attack and to my mind reduces rather than increases the security of the PC. However you made the decision to trust your anti-virus vendor when you installed the software.

There are other anti-virus solutions that also do this so Kaspersky is not alone. As to why it breaks ADFS I am not sure, but regard this as a good thing since the user’s SSL connection is compromised.

image

As it turns out, it isn’t essential to disable the feature entirely. You can simply set an exclusion for the ADFS site by clicking Manage exclusions.

Posted here in case others hit this issue.

Unhealthy Identity synchronization Notification: a trivial solution (and Microsoft’s useless troubleshooter)

If you use Microsoft’s AD Connect, also known as DirSync, you may have received an email like this:

image

It’s bad news: your Active Directory is not syncing with Office 365. “Azure Active Directory did not register a synchronization attempt from the Identity synchronization tool in the last 24 hours.”

I got this after upgrading AD Connect to the latest version, currently 1.1.553.

The email recommends you run a troubleshooting tool on the AD Connect server. I did that. Nothing wrong. I rebooted, it synced once, then I got another warning.

This is only a test system but I still wanted to find out what was wrong. I tweaked the sync configuration, again without fixing the issue.

Finally I found this post. Somehow, AD Connect had configured itself not to sync. You can get the current setting in PowerShell, using get-adsyncscheduler:

image

As you can see, SyncCycleEnabled is set to false. The fix is trivial, just type:

set-adsyncscheduler –SyncCycleEnabled $true

Well, I am glad to fix it, but should not Microsoft’s troubleshooting tool find this simple configuration problem?

How to remove the WINS server feature from Windows Server

The WINS service is not needed in most Windows networks but may be running either for legacy reasons, or because someone enabled it in the hope that it might fix a network issue.

It is now apparently a security risk. See here and Reg article here.

Apparently Microsoft says “won’t fix” despite the service still being shipped in Server 2016, the latest version:

In December 2016, FortiGuard Labs discovered and reported a WINS Server remote memory corruption vulnerability in Microsoft Windows Server. In June of 2017, Microsoft replied to FortiGuard Labs, saying, "a fix would require a complete overhaul of the code to be considered comprehensive. The functionality provided by WINS was replaced by DNS and Microsoft has advised customers to migrate away from it." That is, Microsoft will not be patching this vulnerability due to the amount of work that would be required. Instead, Microsoft is recommending that users replace WINS with DNS.

It should be removed then. I noticed it was running on a server in my network, running Server 2012 R2, and that although it was listed as a feature in Server Manager, the option to remove it was greyed out.

I removed it as follows:

1. Stop the WINS service and set it to manual or disabled.

2. Remove the WINS option in DHCP Scope Options if it is present.

3. Run PowerShell as an administrator and execute the following command:

uninstall-windowsfeature wins

This worked first time, though a restart is required.

Incidentally, if Microsoft ships a feature in a Server release, I think it should be kept patched. No doubt the company will change its mind if it proves to be an issue.

Note: you can also use remove-windowsfeature which is an alias for uninstall-windowsfeature. You do need Windows Server 2008 R2 or higher for this to work.

Fixing “couldn’t parse private ssl key” in Dovecot

I run Debian Linux including a mail server, and part of the system is Dovecot, an open source IMAP and POP3 server which has always worked well for me.

Unfortunately it stopped working after an upgrade. With Linux I am in the habit of doing:

apt-get update

apt-get upgrade

to keep the system patched, and normally everything works fine. Occasionally it does not, and then I need to dig in and work out what is wrong and how to fix it. The upgrade to Apache 2.4, for example, was somewhat painful because of changed configuration directives.

This time it was Dovecot that broke. I use Thunderbird to pick up POP3 mail, and nothing was flowing. Eventually I found the problem logged in syslog:

Fatal: Couldn’t parse private ssl_key: error:0906D06C:PEM routines:PEM_read_bio:no start line: Expecting: ANY PRIVATE KEY

I puzzled over this for some time. The path to the private key was correct in dovecot.conf. The permissions were OK. I regenerated the certificate (it’s self-signed) but still the same.

Eventually I found the solution here. The path to the SSL certs used to look like this:

ssl_cert = /etc/ssl/certs/dovecot.pem
ssl_key = /etc/ssl/private/dovecot.pem

Now it must look like this:

ssl_cert = </etc/ssl/certs/dovecot.pem
ssl_key = </etc/ssl/private/dovecot.pem

Yes, you need that angle bracket, otherwise you get the error.

It used to work, so at some point the Dovecot coders took out the compatibility code that allowed the old-style directive.

Mentioned here in case it helps someone find the solution.

Disabling automatic update restarts in Windows Server 2016

Windows Server 2016 is in effect the Windows 10 version of the server OS. If you look in Settings it seems to have the same attitude to updates; in other words, you get them automatically whether you like it or not. Currently my server is even offering me Windows 10 Creators Update:

image

However, I prefer to have servers just download updates and let me decide when to install them. There can be good reasons for this. For example, I run Exchange Server on a machine that is not really up to spec, and the Exchange services have to be manually started every time it reboots. Well, there are ways round this, but it makes the point.

It turns out that you can after all set Windows Server 2016 to download-only. Just run sconfig from the command line and choose option 5:

image

The sconfig menu will be familiar if you have worked with Server Core or other variants of Windows Server without a GUI.

Incidentally, I tried to install Exchange 2016 on Server 2016 without a GUI but it appears not to be supported. A shame.

Returning to the subject of updates, Brendan Power at Microsoft popped up on Reddit to say that this is a bug in in the settings:

The "Available updates will be downloaded…" text in the UI is a bug that doesn’t represent the actual automatic update settings.

To verify the actual server settings, you can open the command prompt and run sconfig.cmd; in the menu, you should see option 5 set to Manual.

A bug? I am not sure. If so, it seems an odd and obvious one. I think Microsoft is keen to have us update automatically. That said, Windows Server 2016 is meant to follow the Long Term Servicing Branch (LTSB) model rather than the “Windows as a service” approach in Windows 10, unless you run Nano Server, according to this post. So compulsory update to retain a supported configuration does not apply here.

Of course you should patch your Windows Server installations in a timely manner, however you choose to do it.

How to get a Bitlocker recovery key on Android

Scenario: you are out and about with your laptop and phone. Laptop protected with Bitlocker encryption. You start up your laptop and it decides for no obvious reason to demand your Bitlocker key before booting up.

If your laptop is domain-joined, this key is normally stored in Active Directory. If is not domain-joined, it is normally stored in the OneDrive linked with your Microsoft account. These are options when the Bitlocker encryption was applied.

This laptop is not domain-joined and the key is in OneDrive. So I pick up my phone and find the link, which is here.

Android helpfully opened this link in the OneDrive app, but not to the location of the keys, just the home page. No idea how to find the recovery keys in there.

The solution I found was to copy the link and paste it directly into the browser. You might need to get a code sent to your phone if you have 2-factor authentication. I found I needed to check the box for “I log on frequently with this device” before it worked.

Then I could see the key in the Android web browser. Phew.

Publishing Exchange with pfSense

pfSense is a FreeBSD-based firewall which you can find here.

I wanted to publish Exchange through pfSense. I installed the Squid plugin which includes specific reverse proxy support for Exchange.

If you search for help with publishing Exchange on pfSense you will find this document by Mohammed Hamada.

Unfortunately the steps given seem to be incorrect in some places, certainly for my version which is 2.3.2.

Here’s what I had to do to get it working:

1. Simple one not mentioned in his steps, you have to enable the Squid Proxy Server otherwise Squid will not run

2. Hamada sets a NAT rule to forward HTTPS traffic to his Exchange server:

image

If you do this, it will bypass your reverse proxy. What you should do instead is to create a Firewall rule to accept HTTPS:

image

You should also verify that the pfSense web GUI is not using the same port (443), in System/Advanced/Admin Access. If it is set to HTTP rather than HTTPS that is OK too. Normally access to the web GUI from the WAN is blocked. One other thing: in order to use port 443 in Squid Reverse Proxy General Settings, I set net.inet.ip.portrange.reservedhigh to 0 in System/Advanced/System Tunables

3. I did this, as well as setting up Exchange in Squid Reverse Proxy General Settings, whereupon OWA worked but remote Outlook and mobile clients did not, or at least not reliably. The main problem was this setting in Squid Reverse Proxy / General:

image

This must be set to Intermediate rather than Modern (the default).

Now it works – though if pfSense experts out there have better ways to achieve the above I would be interested.

Update: one other thing to check, make sure that your pfSense box can resolve the internal hostname of your Exchange server. By default it may use external DNS servers even if you put internal DNS servers in General Setup. This is because of the setting Allow DNS server list to be overridden by DHCP/PPP on WAN.

What to do when Outlook is stuck on “processing”

I have seen this a couple of times recently, both cases where Outlook 2016 is installed. You start Outlook, it loads plug-ins, then presents a dialog that says “Processing”.

image

It does this for a long time. What is is processing? Who knows. Will it complete in its own good time? Not sure, but for sure it takes longer than you want to wait in order to get your email.

Here is the fix that worked for me. Close Outlook by clicking the X at top right. If that doesn’t work, you can use Task Manager to end the Outlook process.

Now hold down Ctrl and click the Outlook shortcut on the taskbar, presuming it is pinned. This dialog appears:

image

Click Yes. If you get further dialogs such as First things First, click Accept:

image

In both cases I have seen, Outlook now opens immediately, though in safe mode which means no plug-ins are loaded.

Close Outlook and restart it. Again it opens quickly, this time complete with plug-ins.

What is going on here? Not sure, but it may be related to automatic updates for those of us with the Pro Plus version of Office installed via Office 365 or other entitlement.

Observation: this is poor from Microsoft. One of the issues is that showing a generic busy dialog with no indication of what the software is actually doing makes poor UI. Users are more accepting of a long process if they can see evidence of it, even if the technical details of what is displayed make no sense. Maybe something like “Verifying nodes nnn of nnn” with the number incrementing.

This would also help if in fact the software is stuck in a loop, since the user can see that nothing is really happening.

Another issue of course is that this looks like a bug. Most users will end up calling support, despite the trivial fix above.

There may be other reasons for this problem which require different fixes. If that is the case with you, apologies!

The case of the disappearing Azure AD application registration

Some time ago I wrote a simple web application which runs on Microsoft Azure and uses Azure Active Directory for authentication. The application is used constantly and has proved reliable; however yesterday it stopped working. A quick debug session showed that the problem was an Azure AD permissions error.

In order to use Azure AD, applications have to be registered in the Azure management portal. I use the old portal for this; I am not sure that the functionality exists in the new portal yet. There is a nice how-to here.

image

One of the elements in the registration is a key which has a maximum lifetime of 2 years:

image

My application was deployed about two years ago so I went to the portal to see if it had expired.

What I found surprised me. The application was not listed at all. It had disappeared.

Instead of simply obtaining a new key and updating my application config, I had to create a new application registration and update several keys in the config, which was an annoyance.

There is a wider point here, in the whole category of dealing with “things that expire”. Some time ago, Microsoft suffered an extended Azure outage because of an expired certificate. It is a shame that Microsoft insists on a maximum 2 year lifetime for this key but does not provide a check box for “alert me when this key is about to expire”, how difficult would that be?

Problems like this also mean that things which “just work” may not continue to do so. Of course a well organised enterprise setup can deal with this type of problem, but imagine, for example, the case of a small business with an application running on Azure where the developers have gone out of business, perhaps, or are no longer available. In fact the only code I needed to change was in web.config, but I can imagine it could take some time to figure out what to do and what to change.