How Microsoft is like an abusive boyfriend

Disclaimer: this is my opinion only, and is worth every penny you paid me for it. 😉

This weekend, the organization I work for is moving 1/2 of us off of NCP (Novell Core Protocol) file servers, to Microsoft CIFS (Common Internet File System) file servers.  I’m in the testing group. Windows Explorer, showing me file shares, is noticeably faster.

My boyfriend has decided to stop beating me.

I no longer have that other boyfriend, Novell, so everything is nice now, right?  My Microsoft boyfriend is happy now, he no longer needs to beat sense into me every night, for keeping up that relationship with that loser, Novell.

I’ve been doing computers for a long time; what follows are five times Microsoft shipped code to put the hurt on me and my users, for having the gall to also be Novell customers.

  • IFSHLP.386 and the anti-virus demo
  • Windows 3.10 –> Windows for Workgroups 3.11 setup.exe
  • Windows 95 Get Nearest Server
  • Outlook access via MAPI of a GroupWise mailbox
  • Windows 2000 network multiplexor

This phenomenon is not new; in fact, someone came up with a clever phrase: “DOS isn’t done until Lotus won’t run”.

This refers to an incident where Microsoft shipped a new version of DOS that implemented LIM EMS (Lotus, Intel, Microsoft Expanded Memory Specification) differently.  Lotus had 1-2-3, the first “killer app” – the spreadsheet – and Microsoft wanted all that money.  They were just shipping Windows, and because it was graphical, it was slower.  In speed tests, Lotus 1-2-3 calculated faster than Microsoft Excel inside Windows.  So Microsoft shipped a new version of DOS, with LIM EMS that implemented memory access on word boundaries, instead of byte boundaries (or something like that).  The upshot is that before, you could install DOS, install Windows, install Lotus 1-2-3, and everything ran fine.  After, with the new DOS, when you went to launch Lotus 1-2-3, Windows immediately erred out with a big ugly GPF (General Protection Fault) due to illegal memory access, and a dialog box that told you you should contact your vendor (Lotus) for a version of the program that doesn’t crash Windows.

At the same time, Microsoft was running advertising in all the trade publications, with a picture of a jet fighter pilot and his crash helmet.  The subtext of the ads was “You should get your spreadsheet for Windows from the vendor who wrote Windows.  Fewer crashes that way” (something to that effect).

This was essentially the first big evidence that Microsoft was the jealous boyfriend who would beat his girlfriend (you) who also dated someone else (Lotus, in this case).

IFSHLP.386 and the anti-virus demo

Some background on the Installable File System (IFS) Helper for the Intel 386 architecture: the idea was that programmers ought to get to the file system via a standardized operating system call.  Prior to IFSHLP.386, software that needed disk access hooked into Interrupt 21 – and several pieces of software would need to hook it on any one PC.

As this idea grew, Novell asked Microsoft if they should get on board.  Of course, they said.  Hooking into the file system to provide storage over the network was what NCP did – its reason for existence.  You installed the Novell client software stack, you logged in, and now your PC had a drive E: where it didn’t before, and that drive E: was on the other side of the network cable.  Novell was the perfect candidate to be an installable file system.

As this idea grew, anti-virus software vendors asked Microsoft if they should get on board.  And Microsoft told them: No, Interrupt 21 will always work, you can count on it, and there is no reason to make your life more complicated by mucking around with installable- this, and dynamic- that.  Just hook into Interrupt 21 like you always have, and things will be fine.

After a while, the IFS idea grew, and shipped, and was promoted as a generally good thing.

And then, Microsoft held a press conference.

They put a virus on a PC, and ran various partners’ anti-virus software on the PC.  They all found and cleaned the virus.  They put the virus on a Windows NT file server.  As soon as the PC accessed the file (going through Interrupt 21), the anti-virus software triggered and protected the PC from the virus on the network file server.  And then they put the virus on a Novell NetWare file server, and accessed the file.  The virus was not detected, because the Novell software stack on the PC had been configured to use IFSHLP.386 to get to the files on the network!  Microsoft made a huge deal of the fact that if you used Novell NetWare for your file server, you were putting your company at risk.  What a bunch of terrible programmers Novell were; they hid viruses from your anti-virus programs.

Thankfully, the computer press was aware of Microsoft’s abusive boyfriend behavior, and instead of flooding all channels of communication with “OMG! Novell! What a bunch of losers!”, they took a weekend off, and asked Novell to explain themselves.  The computer press confirmed with the anti-virus vendors the story about IFS versus Interrupt 21.  Novell agreed to re-add the Interrupt 21 support they had dropped in the presence of IFS.  So then the computer press either didn’t run the story, or ran the story of the anti-virus demo with the caveat that Novell promised to support the older access method in the future.

Windows 3.10 –> Windows for Workgroups 3.11 setup.exe

Back in the day, we used to move PCs around a lot.  Also, there was a saying “The 3 R’s of troubleshooting: Reboot, Reinstall, Reformat”.  First, reboot.  Is the problem solved? If no, reinstall whatever software package was troubling the user.  Is the problem solved? If no, reformat the hard drive and install everything from fresh again.  It was brutal, but effective.  Our methods developed around this practice, and it became somewhat easy.

Step 1: boot the machine from a floppy disk drive.  Issue the Format command in DOS to wipe the C: drive.

Step 2: install DOS from the floppy (all of DOS fit on one floppy).

Step 3: Reboot from the C: drive, and use the Novell stack from the floppy to log in to the network.

Step 4: Switch to the F: drive, and change directory to the subdirectory (folder) where Windows 3.10 was.

Step 5: Launch setup.exe, with the command line switch to tell it to install Windows to the C: drive.

A short while later, the machine had a brand new, fresh install of Windows on it; “R3” of The 3 R’s of Troubleshooting was complete.  (We always told our users to store their documents on the H: drive, on the Novell network, so no-one ever lost any files this way).

It is worth noting that all the floppies from Windows 3.10 had been copied to the one network location, so we didn’t need to carry six additional floppies with us.  It made a lot of sense to put them on the network, log in to the network, and run the installers off the network.  It was much faster than floppies, too.  We didn’t worry about licenses, because we never bought a PC that didn’t come with Windows on it.

And then, Microsoft shipped an update to Windows; Windows for Workgroups (WfW), version 3.11.  This was “the first version of Windows, built with the network in mind”.

One of my co-workers copied all six floppies to the network, and told us of the new location.  We were going to have to keep track of which machines had Windows 3.10 versus WfW 3.11 for license reasons, but otherwise we had no qualms – until it came time to 3R a client’s computer.

Steps 1 – 4 were identical.  Step 5, however, had a landmine built into it.

Microsoft shipped WfW 3.11 setup.exe with code that reached into the install media and did something nasty.  If the install was from physical media (a hard drive or floppies) the physical drive ignored it or otherwise didn’t care.  But if setup.exe was running from a virtual (Novell network mapped) drive, it reached into the drive and crashed the entire server.

It. crashed. the. entire. server.

The server suffered an “ABEND” (abnormal end), and broadcast to everyone on the network that it had crashed, your files are lost, the end times have arrived, too bad, so sad, log back in after the file server comes back up, hope you had backups…

I crashed a server, with 80 users on it.  My co-worker crashed a server with 50 users on it.  We learned, the hard way, not to run setup.exe from the F: drive.

I, being the clever guy I am, came up with a work-around.  I renamed setup.exe to setup.not, and made a DOS batch file, setup.bat, that took setup.exe’s place.  It did five things:

  1. Copied the F: drive WfW folder contents to the C: drive
  2. Changed to the C: drive
  3. Renamed setup.not to setup.exe
  4. Deleted setup.bat
  5. Launched setup.exe

Setup.exe still did whatever nasty thing Microsoft programmed it to do; but because the install media was a physical hard drive (C: drive), the nastiness had no effect.
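For the curious, here is roughly what that workaround did, sketched in Python rather than the original DOS batch file (which I no longer have).  The function name, its arguments and the `run` switch are made up for illustration; the only things taken from the story are the file names and the order of the five steps.

```python
import shutil
import subprocess
from pathlib import Path


def stage_and_run_setup(network_dir: Path, local_dir: Path, run: bool = True) -> Path:
    """Copy the install files to local disk and start setup.exe from there."""
    # 1. Copy the network folder contents to the local drive.
    shutil.copytree(network_dir, local_dir, dirs_exist_ok=True)

    # 2. "Change to the C: drive": from here on, only local paths are touched.
    setup_exe = local_dir / "setup.exe"

    # 3. Rename the local setup.not back to setup.exe.
    (local_dir / "setup.not").rename(setup_exe)

    # 4. Delete the local copy of the stand-in setup.bat.
    (local_dir / "setup.bat").unlink(missing_ok=True)

    # 5. Launch the real installer, now running from physical media.
    if run:
        subprocess.run([str(setup_exe)], cwd=local_dir, check=True)
    return setup_exe
```

The point of the design is that the network copy is only ever read.  The rename and delete happen on the local copy, so the next technician still finds setup.bat and setup.not waiting on the F: drive.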

Back to my point of Microsoft being the abusive boyfriend, the analogy for this instance is the jealous boyfriend seeing you borrow another friend’s truck to help the jealous boyfriend move in; but because the truck belonged to some other guy, your boyfriend deliberately crashes the other guy’s truck into a brick wall.

Later, I was talking with a Novell product manager, and his comment was that they were thankful to Microsoft for this sabotage.   Before, they were happy that things just worked at all.  After, they learned they needed to practice defensive programming, as if a malicious actor was trying to crash their server.

Windows 95 Get Nearest Server

Early Novell NetWare servers had an easy (if simplistic) way of helping to set stuff up: broadcasts.  The type of packet was called a SAP packet, for Service Advertising Protocol.  Note that SAP was specific to Novell IPX/SPX networking, and had nothing to do with TCP/IP (the dominant protocol today).  IPX/SPX and SAP were a Novell thing.  When a server started up, it broadcast a SAP packet: “I’m a server, if you need me, here’s my address.”  Some things, like Hewlett-Packard JetDirect print servers, would send a SAP packet every 30 seconds: “I’m a printer, if you need me, here’s my address.”

A client PC booting up on the network would broadcast a SAP Get Nearest Server packet.  This was the other side of the coin: “I’m a client, and I need to know what servers are available.  I’m asking which servers are nearest me.”  And all the NetWare boxes on the network would send a packet back “I’m a server, here’s my address.”  The network connection would be established and communication would flow.

An interesting assumption that Novell programmers made was that the file server was the fastest machine on the network.  If you have 200 PCs on a network, serving all of them isn’t going to be the job of some underpowered hand-me-down piece of junk that no-one wants any more.  It’s going to be the biggest, fastest, best box money can buy.  And “fastest” meant “answers quickest”.

The “nearest server” was the box that was a server, and replied the quickest.  If there were several servers on the network, all would reply; but the client would choose the first one with an answer as the “nearest”.
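That selection rule is simple enough to model in a few lines.  This is a toy Python sketch, not Novell’s client code: the `Reply` class, the server names and the latencies are all invented for illustration.  The only thing taken from the description above is the rule itself, which is that the first reply to arrive wins.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    name: str          # hypothetical responder name
    latency_ms: float  # time between the broadcast and this reply arriving


def pick_nearest(replies: list[Reply]) -> Reply:
    """Return the first reply to arrive, i.e. the one with the lowest latency."""
    if not replies:
        raise ValueError("no server answered the broadcast")
    return min(replies, key=lambda r: r.latency_ms)


replies = [
    Reply("FS1", 4.0),
    Reply("FS2", 6.5),
    Reply("PRINTSRV", 9.0),
]
print(pick_nearest(replies).name)  # FS1
```

Notice what the rule does not check: whether the responder can actually serve anything.  Whoever answers first is trusted, full stop.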

So how could Microsoft abuse the network?  By adding SAP Get Nearest Server replies to Windows 95.  Mind you, Windows 95 didn’t provide Novell NetWare network support; they just answered the Get Nearest Server broadcasts with “I’m a server, here’s my address.”

It was all lies, of course.  The client PC would then attempt to establish the network connection, but the Windows 95 box would sit there silently, non-responsive.  I imagine some programmer in Redmond, Washington, watching  the network failure play out, was grinning like the proverbial Cheshire Cat.

So of course, one of our power users had just gotten a really fast PC – faster than his file server, in fact.  Microsoft had the big shindig, announcing Windows 95.  Our power user stood in line to buy that retail copy of Windows 95 on opening day, came back to work, and upgraded his PC.

Which promptly began answering Get Nearest Server broadcasts with “I’m a server, here’s my address (whispering only to itself ‘but you can go stuff yourself’).”  All the printers on the network vanished.  All the file servers to log in to vanished.  All the servers in Remote Console vanished (how we administrators administered servers).

Kind of like the abusive boyfriend standing on your doorstep, and to everyone who shows up at your door, getting confrontational with them, to make sure they can’t reach you, nor can you reach them.  All that poor mail carrier wanted to do was deliver a freaking letter.

Outlook access via MAPI of a GroupWise mailbox

One wonderful resource we had in the GroupWise environment was a mailing list named NGW List.  Anyone could subscribe, and if you administered Novell GroupWise (NGW) you ought to have.  If you had a problem, we could tell you how we solved it, or what work-arounds there were.  At its height, it had more than 200 messages per day.  We sometimes discussed feature requests.  But a problem was that someone at Microsoft had the job of lurking on the list, looking for ideas.

So, a new version of GroupWise dropped, and Novell told us how great it was; that it was now fully MAPI compliant.

MAPI was a Microsoft standard: “Messaging Application Programming Interface”.  Developers wanted to be able to send email from their programs, and Windows might have MS Mail installed on it, or MS Outlook installed on it.  How to send email, if different mail providers are available?  The answer was MAPI, which Microsoft made, and provided as a standard.

On the list, we asked “because the new GroupWise was fully MAPI compliant, does that mean one can install Microsoft Outlook on the PC, and Outlook would access the mailbox on the GroupWise server?”  Yes, the answer was: just fine.  Outlook never knew it wasn’t talking to MS Mail or MS Exchange.  It just worked.  Life was good.

Until that Microsoft lurker reported upstairs that people were successfully using Outlook as a client to GroupWise servers.

Magically, one day not very long later, a Windows Update appeared, and things were patched.  Outlook was patched.  Outlook accessed the GroupWise mailbox via MAPI, and seeing it was GroupWise, created a new, duplicate, folder at the root level named “mailbox”.  And if you only ever used Outlook, life was grand.  Every time you logged in, Outlook would look at the mailbox, see two system folders at the root level named “mailbox”, pick the right one (the one with mail in it), and continue merrily along its way.

The GroupWise client, on the other hand, was completely unprepared for a duplicate system folder at the root level.  Like, GPF unprepared.  Like, wow that was ugly unprepared.

So although the GroupWise server was 100% MAPI compliant, GroupWise had additional features that MAPI didn’t support.  Specifically, To Do lists.  MAPI was an email protocol, and To Do items aren’t in any email spec, so MAPI doesn’t support them.  Doesn’t really matter, if all you are trying to do is email, and all Outlook was needing to do was email.

But if you did want to also do To Do lists, you needed to crank up the GroupWise client.  After the Windows Update which patched Outlook (and Outlook fouled over your GroupWise mailbox),  your GroupWise mailbox was completely unusable by the GroupWise client.

Novell had to scramble, and soon found the problem.  They had to issue an emergency patch to the database maintenance routine “gwcheck”.  It now (still to this day) includes a fix “deldupfolders”.  Run a gwcheck with the deldupfolders option, and your mailbox becomes un-f*cked.  Don’t run Outlook against it again, until Novell can issue a fix to the GroupWise client that does not crash when duplicate system folders magically appear in the folder structure.

It’s like your boyfriend works at Nestle, you invite him over for dessert, and when he finds you bought Hershey chocolate syrup for the ice cream, he takes an axe to your dinner table.  Weird thing was: why did he bring an axe with him, to dessert in the first place?

Windows 2000 network multiplexor

You try to access a file on a file server.  You are doing this from Windows.  Windows handles the call to the network, and gets to do with it what it may, as it takes care of your request.  If the file you want to get is on an NCP server, your experience may be sub-optimal.

Where we experienced this the worst, was again after one of those magical Windows Updates that got applied to a ton of machines on a Patch Tuesday.  All of our users on Windows 2000, after the patch, were taking three minutes to log in.

The day before, it took only a few seconds.  What happened?

Microsoft, in typical jealous boyfriend fashion, decided to play passive-aggressive in serving up network resources.  “Where’s the server? Hang on while we find out.”

After the patch, when a PC asked for an NCP server, the user’s PC (through the Windows 2000 network multiplexor), threw the name resolution request out to all CIFS servers first, and then waited 30 seconds for any of them to respond.  After 30 seconds, since none of the CIFS servers answered the call for the NCP service, it threw the name resolution request at the NCP servers (which of course, responded instantly).  Then it went on to resolve the next server name request.

Which took another 30 seconds.

To. the. same. server.

We typically mapped E:, F:, G:, H:, I:, and Z: all to the same server (but to different subdirectories on disk).  Six drive letters, at 30 seconds of timeout each, turned every login from a few seconds before the patch to three minutes after the patch.  Our users were howling at how painful this was.  All we could tell them was “don’t reboot, or you will have to log in (and wait) again”.

Novell again had to scramble for a fix, which was to roll out a new version of the Novell Client.  It would avoid asking the Windows 2000 network multiplexor to resolve NCP server names, by keeping a cache.  Find it once, with the 30 second penalty, but you’ll never have to find it again, ever again.
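The arithmetic is worth spelling out.  Here is a back-of-the-envelope model in Python; the function and the server name are made up, and it only counts the 30-second timeouts described above, not the real work of mapping a drive.  The cache here lives only for one login, which is a simplification of what the new Novell Client did.

```python
TIMEOUT_S = 30  # wait on the CIFS servers before NCP gets asked (from the story above)


def login_wait(mapped_servers: list[str], use_cache: bool) -> int:
    """Seconds spent waiting on name resolution while mapping drives."""
    cache: set[str] = set()
    total = 0
    for server in mapped_servers:
        if use_cache and server in cache:
            continue  # already resolved once; skip the multiplexor entirely
        total += TIMEOUT_S  # pay the full timeout for this lookup
        cache.add(server)
    return total


# E:, F:, G:, H:, I: and Z: all map to one server (the name is invented).
drives = ["FS1"] * 6
print(login_wait(drives, use_cache=False))  # 180
print(login_wait(drives, use_cache=True))   # 30
```

Six lookups at 30 seconds each is the three minutes our users were howling about.  With the cache, the same six drive mappings cost one timeout.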

Their long term solution, by the way, was to add CIFS support to the NCP file servers.  Sure, Microsoft could (and from history, would) foul over anything NCP related, but if the server became a CIFS server; well, Microsoft couldn’t sabotage that without sabotaging their own.

Which brings me to today

Today, my PC will no longer ever again make an NCP call to an NCP file server.  All evidence points to it being speedy.  Very speedy.

My abusive boyfriend, Microsoft, who I now vow to be with forever and ever, is showing me his magnanimous generosity by not beating me up today.

And you know damn well that I had better be grateful.  Damned grateful.