Once you’ve been bamboozled ….

Once you’ve been bamboozled, it is almost impossible to become un-bamboozled.

This was from an AskReddit question about “What was the best quote or life changing saying or most profound advice people had heard?” Something like that; but my search did not find the exact entry to cite. One of the answers was this one. It’s great. I mentioned this to a friend of mine; he thought Carl Sagan had said it. Well, essentially yes, but not exactly. Carl Sagan’s quote goes like this:

“One of the saddest lessons of history is this: If we’ve been bamboozled long enough, we tend to reject any evidence of the bamboozle. We’re no longer interested in finding out the truth. The bamboozle has captured us. It’s simply too painful to acknowledge, even to ourselves, that we’ve been taken. Once you give a charlatan power over you, you almost never get it back.”

https://www.goodreads.com/quotes/85171-one-of-the-saddest-lessons-of-history-is-this-if

This is well said, but also really wordy, plus it throws in a ten-dollar word: charlatan. I like the short and sweet version.

It seems to me that almost the entire USA has been bamboozled about politics.

The Left has been bamboozled that Donald Trump Is A Bad Man.

The Right has been bamboozled that Donald Trump Is A Good Man.

I remember seeing a cartoon not that long ago (within a year or two) that had a King on a balcony with an advisor, overlooking an angry mob. Here it is (I linked to the original source, so you can get to that web page – credit where credit is due):

The advisor was saying to the King: “Oh, You don’t need to fight them – you just need to convince the pitchfork people that the torch people want to take away their pitchforks.”

When I went looking, Google search failed to find this cartoon. I mentioned it to a friend, and he saw it on Facebook. I asked him to forward it to me. From there, I was able to upload it to Google Image Search, and then finally find the original publisher. The conspiracy theorist spoiler alerter in me thinks the search engines of the day have de-ranked or removed this image in search results because it spoils the narrative.

The idea here is an intersection between the two old sayings “A house divided against itself cannot stand” and “The People restrain themselves and anxiously hope for just two things: bread and circuses.”

The Left is thoroughly convinced that The Right has been bamboozled. The Right is thoroughly convinced that The Left has been bamboozled.

I am convinced both have been bamboozled by the deep state and its unholy alliance with mass media. When I say mass media, I’m also looking at you: Facebook and Google and Twitter.

Here’s the thing about Donald Trump: he was never supposed to be President.

The deep state mass media planned to get Hillary Clinton. They thought they earned Hillary Clinton. By knocking out every good opposing candidate, there was no way that Hillary could lose. There was no way that Hillary Clinton could lose against Donald Fucking Trump. Knock out every other candidate, and the election was a done deal.

This was perfect for the deep state, because Bill and Hillary Clinton were already players. They’d played ball before, and were happy to play again. As insiders, their keepers had leverage on them, and as players, they knew their keepers would be comforted with them as lackeys. It was a win-win situation.

But (“oh by the way”) Hillary Clinton was the worst possible candidate for President.

Which is proven out, because she lost to Donald Fucking Trump, dontcha know. Fair and square, she was simply that BAD of a candidate. And to be fair, Donald was actually a very good campaigner, and a master of Twitter trolling. His campaign speeches were super entertaining. The deep state completely underestimated how well Donald would perform.

Donald was an outsider. This was a disaster for the deep state.

Chuck Schumer delivering the deep state warning to Donald Trump to play ball or else (after election but before inauguration).

  • Donald bristled at being told to take his role of lackey. Now the deep state is on his shit list.
  • If Donald made it to a second term, there was no remaining leverage to keep him from ravaging the deep state.

Does it appear to you that Donald rolled over and became a lackey?

The only choice the deep state had was to backstab the sitting President every chance they could get.

Wow did they ever.

The Commander In Chief: that is who the deep state is supposed to obey. Instead, they did everything they could to subvert CIC/POTUS. They became traitors to the rule of law.

And you, dear reader, got taken in by the charlatans that Donald Trump Is A <‽> Man.

rsync is wonderful, but ….

rsync /datastore/61/E4 /newserver/61/E4

is wrong and will mess you up!

Imagine if you will, that you have a whole bunch of data stored on an old server, and you need to copy it to a new server. The rsync utility would be an obvious way to go. There are things about the job and rsync that you might want to tweak, though, and that’s where things get ugly. Part of this is bash’s fault.

Imagine if you will, that your data store is 120 million small files (emails) stored in 256**3 directories. 256 cubed is 16,777,216 sub directories.

The programmer that created the data store to hold all these files needed subdirectories to put the files in. Linux doesn’t really like 20,000+ files in one directory. It would be better to have more subdirectories, with fewer files per subdirectory. So the programmer started with a loop:

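for i in $(seq 0 255); do mkdir "$(printf '%02X' "$i")"; done

Then the programmer did a change directory into each of those directories he just made, and did the exact same thing.

cd 00; for i in $(seq 0 255); do mkdir "$(printf '%02X' "$i")"; done; cd ..
cd 01; for i in $(seq 0 255); do mkdir "$(printf '%02X' "$i")"; done; cd ..
cd 02; for i in $(seq 0 255); do mkdir "$(printf '%02X' "$i")"; done; cd ..
...
cd FF; for i in $(seq 0 255); do mkdir "$(printf '%02X' "$i")"; done; cd ..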

That gets you to 256 squared, which is 65,536.

And then the programmer did a change directory into each of those directories he just made, and did the exact same thing. All 65,536 second level subdirectories got a third level of another 256 subdirectories. That gets you to 16,777,216 which is 256 cubed.

So your file server directory structure might contain this:

/datastore/61/E4/7D

Inside good old 61/E4/7D there might be twenty to thirty files, each one holding the content of an email, or a metadata file about the email. The programmer was pretty good about filling all of the datastore subdirectories to nineteen files each, then twenty files each, then twenty one files each. No Linux system is going to have a problem with twenty one files in a subdirectory.

The only real problem here is if you need to traverse everything in /datastore: this takes forever.

Back to the problem of copying everything from /datastore to /newserver. Let’s assume that /newserver is on a different machine, and we are using a remote file system mount to make the remote machine appear to be a local disk (mount point).

You might think the rsync command ought to look like this:

rsync --archive /datastore /newserver

There are two things that make this sub-optimal. First, it is single-threaded. Second, there is no progress feedback.

The single threaded part isn’t so bad; it just means that we are losing speed due to rsync overhead. The server has twelve cores, the network is 10 Gbps Fibre Channel, the /datastore disk has multiple spindles, but rsync was designed for slow networks way back when in the bad old days.

At this point, you might ask “why not do a straight cp -r” (copy command, recursive)? It’s not a terrible idea; but, what if there were a network glitch? The entire cp -r would have to be started over, and every bit already copied would be copied again. This is where rsync shines: if the files in the destination are the same as the source, the copy is skipped. cp -r also suffers from the same lack of progress feedback.

Did I mention that the 120 million files are also 9.3 terabytes of files? I really don’t want to get to 98% done and then have a network glitch cause me to copy another 9.3 TB over, which would be the case with cp -r.

The tests I’ve done indicate that four rsync commands, running simultaneously, copied the most data in the shortest period of time in my environment*. More than four rsync commands at once, and I started to saturate the disk channel. Fewer than four rsync commands, and something is waiting around, twiddling its thumbs, waiting for rsync to get busy with the copying again, which it will do, as soon as it finishes up with the overhead it’s working on.

The other problem is a lack of progress feedback. The copy is going to take multiple days. It would be nice to know if we are at 8% complete or 41% complete or 93% complete. It would be nice to be able to compute what the percentage complete is.

Well, how about 64K rsync commands, each with a print statement of the directory it is processing? And if we could run four of them in parallel, we could get the multiple jobs speedup too.

You might think the rsync commands ought to look like this:

rsync --archive /datastore/00/00 /newserver/00/00
rsync --archive /datastore/00/01 /newserver/00/01
rsync --archive /datastore/00/02 /newserver/00/02
rsync --archive /datastore/00/03 /newserver/00/03
rsync --archive /datastore/00/04 /newserver/00/04
...
rsync --archive /datastore/FF/FF /newserver/FF/FF

but WOW would you ever be wrong!

Remember old /datastore/61/E4/7D up there? This format for rsync would put E4 in the source under E4 in the destination! In other words, although the source looks like this: /datastore/61/E4/7D the destination would look like this: /newserver/61/E4/E4/7D

To be done right, the command needs to look like this:

rsync --archive /datastore/00/00/* /newserver/00/00/
rsync --archive /datastore/00/01/* /newserver/00/01/
rsync --archive /datastore/00/02/* /newserver/00/02/
rsync --archive /datastore/00/03/* /newserver/00/03/
rsync --archive /datastore/00/04/* /newserver/00/04/
...
rsync --archive /datastore/FF/FF/* /newserver/FF/FF/

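The source needs a trailing slash to tell rsync to copy the stuff underneath the source (not the source itself) to the destination (which is finished with a slash). The asterisk is not something rsync sees: the shell expands it into the list of entries under the source before rsync starts, which gets the same result for these directories, although it skips any dot-files. A trailing slash on its own (/datastore/00/00/) would be enough.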
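The difference between the two spellings is easy to check in a scratch directory. This is a minimal sketch, assuming rsync is installed; the function name demo_trailing_slash, the scratch directories naive and fixed, and the file msg.eml are made up for illustration.

```shell
# Hypothetical demo: copy the same source twice, with and without a trailing slash.
demo_trailing_slash() (
    tmp=$(mktemp -d) || exit 1
    mkdir -p "$tmp/datastore/61/E4/7D" "$tmp/naive/61" "$tmp/fixed/61"
    echo "hello" > "$tmp/datastore/61/E4/7D/msg.eml"

    # No trailing slash: rsync copies the directory E4 itself into the destination.
    rsync --archive "$tmp/datastore/61/E4" "$tmp/naive/61/E4"

    # Trailing slash: rsync copies the contents of E4 into the destination.
    rsync --archive "$tmp/datastore/61/E4/" "$tmp/fixed/61/E4/"

    # Show where the file landed, relative to the scratch directory.
    (cd "$tmp" && find naive fixed -type f | sort)
    rm -rf "$tmp"
)
```

Calling demo_trailing_slash should list the file under fixed/61/E4/7D/ for the trailing-slash copy and under naive/61/E4/E4/7D/ for the other one, which is the doubled E4 described above.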

Enter the problem where bash is a pain in the ass.

Well, before I go there, let me mention that it wasn’t too bad to write a Perl script to write this bash script, and do three things per source and destination pair:

echo "rsync --archive /datastore/00/00/* /newserver/00/00/"
rsync --archive /datastore/00/00/* /newserver/00/00/
echo "/newserver/00/00/" > /tmp/tracking_report_file

The first line prints the current status to the screen. The second line launches the rsync. The third line overwrites a file, tracking_report_file, with the last rsync finished.
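The generator itself was a Perl script, which is not shown here. As a rough equivalent, here is a sketch in plain shell that prints the same three lines for every second-level pair in sorted order. The names emit_stanza and generate_script, and the SRC, DST and TRACK variables, are assumptions for illustration only.

```shell
# Hypothetical generator for the three-line stanzas described above.
SRC=${SRC:-/datastore}
DST=${DST:-/newserver}
TRACK=${TRACK:-/tmp/tracking_report_file}

# Print one stanza; $1 and $2 are the two directory levels as integers 0..255.
emit_stanza() {
    printf 'echo "rsync --archive %s/%02X/%02X/* %s/%02X/%02X/"\n' \
        "$SRC" "$1" "$2" "$DST" "$1" "$2"
    printf 'rsync --archive %s/%02X/%02X/* %s/%02X/%02X/\n' \
        "$SRC" "$1" "$2" "$DST" "$1" "$2"
    printf 'echo "%s/%02X/%02X/" > %s\n' "$DST" "$1" "$2" "$TRACK"
}

# Walk 00/00 through FF/FF in order, 65,536 stanzas in total.
generate_script() {
    a=0
    while [ "$a" -lt 256 ]; do
        b=0
        while [ "$b" -lt 256 ]; do
            emit_stanza "$a" "$b"
            b=$((b + 1))
        done
        a=$((a + 1))
    done
}
```

Something like generate_script > copy_all.sh (the file name is arbitrary) would then write out the script to launch inside screen. This sketch only covers the uppercase directory names.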

So, crank up screen first, launch the bash script, and some number of days from now, the copying will be done.

That /tmp/tracking_report_file gives me a pair of hexadecimal pairs, which I can then use to compute percentage complete. For example, when /newserver/7F/FF updates to /newserver/80/00, then we are going to be just over 50% done.
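The arithmetic can be sketched as a small shell function. The name pct_done is made up, and the sketch assumes the tracking file holds a path ending in two hexadecimal pairs and that the rsyncs run in sorted order from 00/00 to FF/FF.

```shell
# Hypothetical helper: turn a path like /newserver/7F/FF/ into a percentage.
pct_done() {
    p=${1%/}        # drop a trailing slash, if there is one
    lo=${p##*/}     # last path component, e.g. FF
    p=${p%/*}
    hi=${p##*/}     # the component before it, e.g. 7F
    # Directories finished so far, counting the one named in the file.
    finished=$(( 0x$hi * 256 + 0x$lo + 1 ))
    # Integer math in hundredths of a percent, rounded down.
    hundredths=$(( finished * 10000 / 65536 ))
    printf '%d.%02d\n' $((hundredths / 100)) $((hundredths % 100))
}
```

For example, pct_done "$(cat /tmp/tracking_report_file)" prints 50.00 once /newserver/7F/FF/ has been recorded and 100.00 after /newserver/FF/FF/. Because it rounds down to two decimals, the step from 7F/FF to 80/00 is too small to show up.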

Heck, I can detach from screen, and I don’t even have to watch the rsyncs happen. I mean that I do need to, but I don’t have to. Better yet, I can take the same routine that converts the pair of hexadecimal pairs into percentage complete and wrap that inside a cron job that sends an email. Progress status tracking accomplished!

But this does not solve the single-threaded rsync problem.

And ultimately, I could not get it done.

What looked to be an okay solution was using the find command, to feed into xargs which could do shell stuff in parallel. I even got as far as getting bash shell variables to create the rsync --archive /datastore/00/00/00 /newserver/00/00/00 part.

Okay, that would be 16 million smaller rsyncs instead of 64 thousand larger ones, but I might even be able to bump up the parallelism to six or eight or nine.

But the serious problem the rsync --archive /datastore/00/00/00 /newserver/00/00/00 command has is the naive problem: the missing trailing slash and asterisk are going to put the source underneath the destination. I need to put the trailing slash and asterisk on there.

And bash says “that’s a nope”

Trailing slashes and asterisks are automatically culled from output, because (reasons).
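One possible way around this, not tested against the migration described here, is to hand xargs only the bare pair (such as 00/00) and add the trailing slashes inside the child shell that xargs starts. The sketch below assumes an xargs that supports -P (GNU and BSD versions do). The name copy_in_parallel and the SRC, DST, JOBS and RSYNC variables are made up for illustration. RSYNC defaults to a dry run that only prints the command.

```shell
# Hypothetical parallel runner: reads pairs like 00/00 on stdin, one per line.
SRC=${SRC:-/datastore}
DST=${DST:-/newserver}
JOBS=${JOBS:-4}
RSYNC=${RSYNC:-echo rsync}   # dry run by default; set RSYNC=rsync to copy
export SRC DST RSYNC

copy_in_parallel() {
    # -P runs up to $JOBS children at once; -n 1 gives each child one pair.
    # The slashes are appended inside the child, so nothing strips them.
    # Assumes the parent directory (for example $DST/00) already exists.
    xargs -P "$JOBS" -n 1 sh -c '
        $RSYNC --archive "$SRC/$1/" "$DST/$1/"
    ' sh
}
```

In dry-run mode, printf '00/00\n00/01\n' | copy_in_parallel prints the two rsync commands it would run, in whatever order the children finish. xargs hands out its input in order, so feeding it a sorted list keeps the start order sorted, but jobs can still complete out of order. The last pair finished is therefore only an approximate progress marker.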

Oh well. The find command also spits out the directories it finds in rather random order. My bash script with sequential rsyncs by sorted order means that the last one complete really is some-percentage-of-the-total done. But if find chooses to spit out /datastore/b3/8e/76 instead of /datastore/00/00/00 then my status tracking doesn’t actually work. I would be forced to traverse all of /newserver/ and count which of the 17 million are complete; which would take freaking forever.
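The ordering half of that problem could probably be handled by sorting what find produces before anything consumes it. This is a minimal sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD versions do); the name list_sorted_dirs is made up.

```shell
# Hypothetical helper: list directories at an exact depth, in byte order.
# $1 is the root (e.g. /datastore), $2 is the depth (2 for the 61/E4 level).
list_sorted_dirs() {
    # LC_ALL=C sorts by byte value, so uppercase names come before lowercase.
    find "$1" -mindepth "$2" -maxdepth "$2" -type d | LC_ALL=C sort
}
```

list_sorted_dirs /datastore 2 would print /datastore/00/00 first and any lowercase names such as /datastore/b3/8e last. The find still has to walk the directory levels above the requested depth, so this is cheaper at depth 2 than at depth 3.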

Yes, I said 17 million. Did you notice that the programmer that created subdirectories did some of them in lowercase hexadecimal? That happened when we brought in another email system (Exchange). Lovely.

*the last time I did this migration, although it was on a four core box, then.