Disclaimer: This is my opinion only and is worth every penny you paid me for it. 😉
This weekend, the organization I work for is moving half of us from NCP (NetWare Core Protocol) file servers to Microsoft CIFS (Common Internet File System) file servers. I’m in the testing group. Windows Explorer is noticeably faster at showing me file shares.
My boyfriend has decided to stop beating me.[1]
I no longer have that other boyfriend, Novell, so everything is nice now, right? My Microsoft boyfriend is happy now; he no longer needs to beat sense into me every night for keeping up that relationship with that loser, Novell.
I’ve been doing computers for a long time; what follows are five times Microsoft shipped code to put the hurt on me and my users for having the gall to also be Novell customers.
- IFSHLP.386 and the anti-virus demo
- Windows 3.10 –> Windows for Workgroups 3.11 setup.exe
- Windows 95 Get Nearest Server
- Outlook access via MAPI of a GroupWise mailbox
- Windows 2000 network multiplexor
This phenomenon is not new; in fact, someone came up with a clever phrase: “DOS isn’t done until Lotus won’t run”.
DOS isn’t done until Lotus 1-2-3 won’t run
This refers to an incident where Microsoft shipped a new version of DOS that implemented LIM EMS (Lotus, Intel, Microsoft Expanded Memory Specification) differently. Lotus had 1-2-3, the first “killer app” – the spreadsheet – and Microsoft wanted all that money. They were just shipping Windows, and because it was graphical, it was slower. In speed tests, Lotus 1-2-3 calculated faster than Microsoft Excel inside Windows. So Microsoft shipped a new version of DOS with LIM EMS that implemented memory access on word boundaries instead of byte boundaries (or something like that). The upshot is that before, you could install DOS, install Windows, install Lotus 1-2-3, and everything ran fine. After, with the new DOS, when you went to launch Lotus 1-2-3, Windows immediately erred out with a big ugly GPF (General Protection Fault) due to illegal memory access, and a dialog box that told you you should contact your vendor (Lotus) for a version of the program that doesn’t crash Windows.
At the same time, Microsoft was running advertising in all the trade publications, with a picture of a jet fighter pilot and his crash helmet. The subtext of the ads was, “You should get your spreadsheet for Windows from the vendor who wrote Windows. Fewer crashes that way” (something to that effect).
This was essentially the first big evidence that Microsoft was the jealous boyfriend who would beat his girlfriend (you), who also dated someone else (Lotus, in this case).
IFSHLP.386 and the anti-virus demo
IFSHLP.386 is the Installable File System (IFS) Helper for the Intel 386 architecture; background here: https://en.wikipedia.org/wiki/Installable_File_System. The idea was that programs ought to reach the file system through a standardized operating system call. Before IFSHLP.386, software that needed disk access hooked into Interrupt 21 (INT 21h, the DOS system-call interrupt).
As this idea grew, Novell asked Microsoft if they should get on board. “Yes, of course”, Microsoft said. Hooking into the file system to provide storage over the network was what NCP did – it was the reason for its existence. You installed the Novell client software stack, you logged in, and now your PC had a drive E: where it didn’t before, and that drive E: was on the other side of the network cable. Novell was the perfect candidate to be an installable file system.
As this idea grew, antivirus software vendors asked Microsoft if they should get on board. And Microsoft told them, “No, Interrupt 21 will always work.” The antivirus vendors can count on it, and there is no reason to make their lives more complicated by mucking around with installable this and dynamic that. Just hook into Interrupt 21 like they always have, and things will be fine.
After a while, the IFS idea grew, and shipped, and was promoted as a generally good thing.
And then, Microsoft held a press conference.
They put a virus on a PC, and ran various partners’ antivirus on the PC. They all found and cleaned the virus. They put the virus on a Windows NT file server. As soon as the PC accessed the file (going through Interrupt 21), the antivirus software triggered and protected the PC from the virus on the network file server.
Then they (Microsoft) put the virus on a Novell NetWare file server, and accessed the file. The virus was not detected because the Novell software stack on the PC had been configured (per Microsoft’s advice) to use IFSHLP.386 to access the files on the network!
Microsoft made a huge deal of the fact that if you used Novell NetWare for your file server, you were putting your company at risk. What a bunch of terrible programmers Novell were; they hid viruses from your antivirus programs.
Thankfully, the computer press was aware of Microsoft’s abusive boyfriend behavior. Instead of flooding all channels of communication with anti-Novell propaganda, they took a weekend off, and asked Novell to explain themselves. The computer press confirmed with the antivirus vendors the story about IFS versus Interrupt 21. Novell agreed to re-add the Interrupt 21 support they had dropped in the presence of IFS.
The computer press followed up by either not running the story at all, or running the story of the antivirus demo with the caveat that Novell promised to support the older access method in the future.
Windows 3.10 –> Windows for Workgroups 3.11 setup.exe
Back in the day, we used to move PCs around a lot. Also, there was a saying “The 3 R’s of troubleshooting: Reboot, Reinstall, Reformat”. First, reboot. Is the problem solved? If no, reinstall whatever software package was troubling the user. Is the problem solved? If no, reformat the hard drive and install everything from fresh again. It was brutal, but effective. Our methods developed around this practice, and it became moderately easy.
Step 1: Boot the machine from a floppy disk drive. Issue the Format command in DOS to wipe the C: drive.
Step 2: Install DOS from the floppy.
Step 3: Install the Novell stack onto the C: drive.
Step 4: Reboot from the C: drive and log in to the network.
Step 5: Switch to the F: drive, and change directory to the subdirectory (folder) where Windows 3.10 was.
Step 6: Launch setup.exe with the command line switch to tell it to install Windows to the C: drive.
A short while later, the machine had a brand new, fresh install of Windows on it. “R3” of “The 3 R’s of Troubleshooting” was complete.
Because we always told our users to store their documents on the H: drive (the Novell network), no one ever lost any files this way.
It is worth noting that all the floppies from Windows 3.10 had been copied to the one network location, so we didn’t need to carry six additional floppies with us. It made a lot of sense to put them on the network, log in to the network, and run the installers off the network. It was much faster than floppies, too. We didn’t worry about licenses because we never bought a PC that didn’t come with Windows on it.
And then, Microsoft shipped an update to Windows: Windows for Workgroups (WfW), version 3.11. This was “the first version of Windows, built with the network in mind”.
One of my co-workers copied all six floppies to the network and told us of the new location. We were going to have to keep track of which machines had Windows 3.10 versus WfW 3.11 for license reasons, but otherwise we had no qualms – until it came time to R3 a client’s computer.
Steps 1-5 were identical. Step 6, however, had a landmine built into it.
Microsoft shipped WfW 3.11 setup.exe with code that checked where it was running from. If the install was from physical media (a hard drive or floppies), setup.exe ran fine. But if setup.exe was running from a virtual drive (Novell network mapped drive), it reached into the drive and made a call that crashed the entire server.
It. crashed. the. entire. server.
The server suffered an “ABEND” (abnormal end), and broadcast to everyone on the network that it had crashed, your files are lost, the end times have arrived, too bad, so sad, log back in after the file server comes back up, hope you had backups…
I crashed a server with 80 users on it. My co-worker crashed a server with 50 users on it. We learned, the hard way, not to run setup.exe from the F: drive.
I, being the clever guy I am, came up with a work-around. I renamed setup.exe to setup.not and made a DOS batch file, setup.bat, that took setup.exe’s place. It did five things:
- Copied the F: drive WfW folder contents to the C: drive
- Changed to the C: drive
- Renamed setup.not to setup.exe
- Deleted setup.bat
- Launched setup.exe
Setup.exe still did whatever nasty thing Microsoft programmed it to do, but because the install media was a physical hard drive (C: drive), the nastiness had no effect.
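The original was a DOS batch file, and I no longer have it. As a rough modern sketch of the same five steps, here is the idea in Python. The function name `stage_installer` and the directory arguments are made up for illustration; only the file names setup.not, setup.bat, and setup.exe come from the story above.

```python
import shutil
from pathlib import Path


def stage_installer(network_dir: Path, local_dir: Path) -> Path:
    """Copy the install files to local disk and restore the real installer's name."""
    # 1. Copy the network folder's contents to the local disk.
    shutil.copytree(network_dir, local_dir, dirs_exist_ok=True)
    # 2. "Changing to the C: drive" is implicit here: every path below is local.
    # 3. Rename setup.not back to setup.exe in the local copy only.
    installer = local_dir / "setup.exe"
    (local_dir / "setup.not").rename(installer)
    # 4. Delete the local copy of the wrapper script.
    (local_dir / "setup.bat").unlink()
    # 5. Hand back the local installer path for the caller to launch.
    return installer
```

The point of the design is that the network copy is never touched: setup.exe only ever exists, and only ever runs, on the local disk.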
Back to my point of Microsoft being the abusive boyfriend, the analogy for this instance is the jealous boyfriend seeing you borrow another friend’s truck to help the jealous boyfriend move in, but because the truck belonged to some other guy, your boyfriend deliberately crashes the other guy’s truck into a brick wall.
Later, I was speaking with a Novell product manager, and his comment was that they were thankful to Microsoft for this sabotage. Before, they were happy that things just worked at all. After, they learned they needed to practice defensive programming. A malicious actor was trying to crash their server. Microsoft might have been the first, but they wouldn’t be the last.
Windows 95 Get Nearest Server
Early Novell NetWare servers had an easy (if simplistic) way of helping to set stuff up: broadcasts. The type of packet was called a SAP packet, for Service Advertising Protocol. Note that SAP was specific to Novell IPX/SPX networking, and had nothing to do with TCP/IP (the dominant protocol today). IPX/SPX and SAP were a Novell thing. When a server started up, it broadcast a SAP packet: “I’m a server; if you need me, here’s my address.” Some things, like Hewlett-Packard JetDirect print servers, would send a SAP packet every 30 seconds: “I’m a printer; if you need me, here’s my address.”
A client PC booting up on the network would broadcast a SAP Get Nearest Server packet. This was the other side of the coin: “I’m a client, and I need to know what servers are available.” All the NetWare boxes on the network would send a packet back: “I’m a server, here’s my address.” The network connection would be established, and communication would flow.
An interesting assumption that Novell programmers made was that the file server was the fastest machine on the network. If you have 200 PCs on a network, serving all of them will not be the job of some underpowered hand-me-down piece of junk that no one wants anymore. It’s going to be the biggest, fastest, and best box money can buy. And “fastest” meant “answers quickest”.
The “nearest server” was the box that was a server and replied the quickest. If there were several servers on the network, all would reply, but the client would choose the first one with an answer as the “nearest”.
So how could Microsoft abuse the network? By adding SAP Get Nearest Server replies to Windows 95. Mind you, Windows 95 didn’t provide Novell NetWare network support; it just answered the Get Nearest Server broadcasts with “I’m a server, here’s my address.”
It was all lies, of course. The client PC would then attempt to establish the network connection, but the Windows 95 box would sit there silently, non-responsive. I imagine some programmer in Redmond, Washington, watching the network failure play out, was grinning like the proverbial Cheshire Cat.
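To make the failure concrete, here is a toy model of the "first reply wins" rule. It is not real SAP or IPX code; the `SapReply` class, the host names, and the reply times are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SapReply:
    host: str
    reply_ms: float        # hypothetical time for the reply to reach the client
    accepts_logins: bool   # whether the host completes the connection afterwards


def nearest_server(replies):
    # The client treats whichever reply arrives first as the "nearest" server.
    return min(replies, key=lambda r: r.reply_ms)


replies = [
    SapReply("FS1", 4.0, True),
    SapReply("FS2", 6.5, True),
]
print(nearest_server(replies).host)  # FS1

# A fast desktop that answers the broadcast but never serves anything.
replies.append(SapReply("WIN95-PC", 1.2, False))
chosen = nearest_server(replies)
print(chosen.host, chosen.accepts_logins)  # WIN95-PC False
```

Nothing in the selection rule checks whether the responder can actually do the job, so one quick box that answers and then goes silent wins every time.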
So of course, one of our power users had just gotten a really fast PC – faster than his file server, in fact. Microsoft had the big shindig, announcing Windows 95. Our power user stood in line to buy that retail copy of Windows 95 on opening day, came back to work, and upgraded his PC.
Which promptly began answering Get Nearest Server broadcasts with “I’m a server, here’s my address” (whispering only to itself, “But you can go stuff yourself”). All the printers on the network vanished. All the file servers to log in to vanished. All the servers in Remote Console vanished (how we administrators administered servers).
Kind of like the abusive boyfriend standing on your doorstep and, to everyone who shows up at your door, getting confrontational with them to make sure they can’t reach you, nor can you reach them. All that poor mail carrier wanted to do was deliver a freaking letter. All your mom wanted to do was to drop by and visit her daughter. Microsoft wrote, tested, and delivered software to deliberately fuck over anyone on a NetWare network by answering Psych! to all machines looking for a server.
Ultimately, Novell started moving to SLP (Service Location Protocol), developed in part by Erik Guttman at Sun Microsystems but made popular by Apple.
Outlook access via MAPI of a GroupWise mailbox
One wonderful resource we had in the Novell GroupWise environment was a mailing list named NGW List. Anyone could subscribe, and if you administered GroupWise, you ought to have. If you had a problem, we could tell you how we solved it or what workarounds there were. At its height, it had more than 200 messages per day. We sometimes discussed feature requests. But a problem was that someone at Microsoft had the job of lurking on the list, looking for ideas.
So, a new version of GroupWise dropped, and Novell told us how great it was; it was now fully MAPI compliant.
MAPI was a Microsoft standard: “Messaging Application Programming Interface”. Developers wanted to be able to send email from their programs, and Windows might have MS Mail or MS Outlook installed on it. How to send email if different mail providers are available? The answer was MAPI, which Microsoft made and provided as a standard.
On the list, we asked, “Since the new GroupWise supports MAPI, can you install Microsoft Outlook on a PC and use it to access a GroupWise mailbox?”
The answer was yes, just fine. We tried it, and it just worked. Life was good.
Until that Microsoft lurker reported upstairs that people were successfully using Outlook as a client to GroupWise servers.
Magically, one day not very long later, a Windows Update appeared, and things were patched. Outlook was patched. Outlook accessed the GroupWise mailbox via MAPI, and seeing it was GroupWise, created a new, duplicate folder at the root level named “mailbox”.
If you only ever used Outlook, life was grand. Every time you logged in, Outlook would look at the mailbox, see two system folders at the root level named “mailbox”, pick the right one (the one with mail in it), and continue merrily along its way.
The GroupWise client, on the other hand, was completely unprepared for a duplicate system folder at the root level. Like, GPF unprepared. Like, wow, that was ugly unprepared. The GroupWise programmers had not anticipated that a malevolent client would purposefully create a duplicate system folder named “mailbox”.
GroupWise had additional features that MAPI didn’t support. Specifically, to-do lists. MAPI was an email protocol, and to-do items aren’t in any email spec, so MAPI doesn’t support them. That doesn’t really matter if all you are trying to do is email, and all Outlook needed to do was email.
But if you did want to also do to-do lists, you needed to crank up the GroupWise client. After the Windows Update patched Outlook (and it fouled over your mailbox), your mailbox was completely unusable by the GroupWise client.
Novell had to scramble and soon found the problem. They had to issue an emergency patch to the database maintenance routine “gwcheck”. It now (still to this day) includes a fix “deldupfolders”. Run a gwcheck with the deldupfolders option, and your mailbox becomes un-f*cked.
Don’t run Outlook against it again until Novell can issue a fix to the GroupWise client. They did later ship a client that does not crash when duplicate system folders magically appear in the folder structure.
It’s like your boyfriend works at Nestle, you invite him over for dessert, and when he finds you bought Hershey chocolate syrup for the ice cream instead of Nesquik, he takes an axe to your dinner table. The weird thing was, why did he bring an axe with him for dessert in the first place?
Windows 2000 network multiplexor
You try to access a file on a file server. You are doing this from Windows, although you have a NetWare client on your PC. Windows handles the call to the network and gets to do with it what it may, as it takes care of your request.
Again, after one of those magical Windows Updates that got applied to a ton of machines on a Patch Tuesday, all of our users on Windows 2000 were taking three minutes to log in.
The day before, it took only a few seconds. What happened?
Microsoft, in typical jealous boyfriend fashion, decided to play passive-aggressive in serving up network resources. “Where’s the server? Hang on while we find out.”
During login, a PC would ask for a server name. It threw that request at the Windows 2000 network multiplexor.
Before the patch, the Windows 2000 network multiplexor used SAP or SLP to get an NCP server name and then passed that server to the client for further investigation.
After the patch, when the user’s PC asked for a server name, the Windows 2000 network multiplexor threw the name resolution request out to all CIFS servers first and then waited 30 seconds for any of them to respond. After 30 seconds, since none of the CIFS servers answered the call for the NCP service, it threw the name resolution request at the NCP servers (which, of course, responded instantly). Then it went on to resolve the next server name request.
Which took another 30 seconds.
To. the. same. server.
It’s like your boyfriend moved himself into your home and installed locks on the bathroom, bedrooms, basement, and even closets. Every morning, he wakes up and locks all the doors. Every time you try to go through a door, he blocks you and tells you that if only you moved into his apartment with him, the doors would remain unlocked.
We typically mapped E:, F:, G:, H:, I:, and Z: all to the same server (but to different subdirectories on disk). Six drive letters, at 30 seconds of timeout each, stretched every login from a few seconds before the patch to three minutes after it. The login script screen would finally map the first drive and then sit, waiting. Then the next drive letter was mapped, followed by more waiting.
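The arithmetic is simple enough to write down. In this sketch the 30-second timeout and the six drive letters come from the story; the half-second NCP reply time is my assumption, picked so the pre-patch total lands at "a few seconds".

```python
CIFS_TIMEOUT_S = 30   # wait before falling back to NCP, as described above
NCP_REPLY_S = 0.5     # assumed per-lookup time when NCP is asked directly


def login_seconds(drive_letters, cifs_first):
    """Total time to map every drive when each mapping resolves the server name."""
    per_lookup = NCP_REPLY_S + (CIFS_TIMEOUT_S if cifs_first else 0)
    return len(drive_letters) * per_lookup


drives = ["E", "F", "G", "H", "I", "Z"]
print(login_seconds(drives, cifs_first=False))  # 3.0
print(login_seconds(drives, cifs_first=True))   # 183.0
```

Because each drive mapping paid the timeout separately, the cost scaled with the number of drive letters, not the number of servers.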
Our users were howling at how painful this was. All we could tell them was, “Don’t reboot, or you will have to log in (and wait) again”.
Novell again had to scramble for a fix, which was to roll out a new version of the Novell Client. It would avoid asking the Windows 2000 network multiplexor to resolve NCP server names by keeping a cache. Find it once, with the 30-second penalty, and you’ll never have to find it again.
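I never saw how the Novell Client implemented its cache, so the following is only a sketch of the general technique: remember the answer to a slow lookup so it is paid for once. The class name `CachingResolver`, the server name, and the fake lookup function are all hypothetical.

```python
class CachingResolver:
    """Wrap a slow name lookup so each server name is resolved only once."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup
        self._cache = {}
        self.slow_calls = 0   # how many times the slow path was actually taken

    def resolve(self, server_name):
        if server_name not in self._cache:
            self.slow_calls += 1
            self._cache[server_name] = self._slow_lookup(server_name)
        return self._cache[server_name]


# Stand-in for the lookup that cost 30 seconds after the patch.
resolver = CachingResolver(lambda name: f"address-of-{name}")

# Six drive letters, all mapped to the same server.
for _ in ["E", "F", "G", "H", "I", "Z"]:
    address = resolver.resolve("FS1")

print(resolver.slow_calls)  # 1
```

With all six drive letters pointing at one server, the penalty is paid once instead of six times.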
Their long-term solution, by the way, was to add CIFS support to the NCP file servers. Sure, Microsoft could foul over anything NCP-related, but if the server became a CIFS server, well, Microsoft couldn’t sabotage that without sabotaging their own.
Which brings me to today
Today, my PC will never again make an NCP call to an NCP file server. All evidence points to it being speedy. Very speedy.
My abusive boyfriend, Microsoft, who I now must avow to be with forever and ever, is showing me his magnanimous generosity by not beating me up today.
And you know damn well that I had better be God damned grateful.
[1] This is an incorrect gender reference: I’m male, but “my girlfriend stopped beating me” doesn’t deliver the idea with the same punch.