www::mechanize | David's Blog

If the name of the file it is dumping does not end in .html, mech-dump will spit out an error saying that the file content is not text/html but text/plain. And of course, it immediately quits without doing anything helpful.

And then you go and look inside the file, and this is right at the top:

content="text/html"

You ask yourself, “What the hell?”

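It’s a terrible error message; that’s the hell of it. The likely cause is that the content type of a local file is guessed from the extension in its name rather than from what is inside the file, so that content="text/html" line is never consulted. The error message should say “Input file name does not terminate with the string .html”.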

I use Linux a lot, and in Linux, files do not have to have extensions in their names. Over on the Windows side, it is expected that a file name has an extension. Windows uses that extension to figure out which program should be associated with the file type. But in Linux, tools usually work out what a file is by looking inside it, at a shebang line or a magic number at the start of the file, rather than at its name.

This has two effects. First, files in Linux don’t need extensions in their names. Second, you can leave the extension off a file’s name and the file works anyway.

So, if I’m writing a Perl script on Linux and I want to dump out something I’ve just pulled down from a web server using WWW::Mechanize, I might be inclined to name the file I’m dumping this web form into www_mechanize_web_form_dump.

And this would be a mistake, because when I later run

mech-dump www_mechanize_web_form_dump

I’m going to get spit at with the message that the file does not contain HTML, it contains only plain text.

It would have saved me a bunch of time if the error message had been “mech-dump does not interpret files with names that do not end in .html”.

That might seem kind of a silly input file name constraint, but at least the error message wouldn’t be misleading.
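
Until the message gets better, the workaround is to hand mech-dump a name it likes. Here is a rough sketch of how I might wrap that up in the shell. The function names html_name and dump_forms are made up for this post, they are not part of WWW::Mechanize, and the whole thing assumes the only thing mech-dump objects to is the missing .html at the end of the name.

```shell
#!/bin/sh
# Sketch only: html_name and dump_forms are hypothetical helpers,
# not anything shipped with WWW::Mechanize.

# Print the given file name with ".html" appended,
# unless the name already ends in ".html".
html_name() {
    case "$1" in
        *.html) printf '%s\n' "$1" ;;
        *)      printf '%s\n' "$1.html" ;;
    esac
}

# Run mech-dump on a file whose name may lack the .html extension.
# The file is copied into a temporary directory under a .html name,
# so the original is left untouched. The body runs in a subshell,
# which keeps the variables from leaking into the caller.
dump_forms() (
    src=$1
    case "$src" in
        *.html) mech-dump "$src"; exit $? ;;
    esac
    tmpdir=$(mktemp -d) || exit 1
    copy="$tmpdir/$(html_name "$(basename -- "$src")")"
    cp -- "$src" "$copy" || { rm -rf -- "$tmpdir"; exit 1; }
    mech-dump "$copy"
    status=$?
    rm -rf -- "$tmpdir"
    exit "$status"
)
```

With those two functions loaded into the shell, running dump_forms www_mechanize_web_form_dump should get mech-dump to read the file. The simpler fix, of course, is to name the dump file www_mechanize_web_form_dump.html in the first place.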


mech-dump (part of Perl WWW::Mechanize) is incredibly stupid about its input file name