Saturday, May 16, 2009

Clones Ate My Files: A Sci-Fi Geek Thriller.

OK, so I'm banging out code for the betterment of my corporate overlords. It's a conceptually simple program: You give it a command and a list of servers and it runs the command against each server.

Not impressed? Well, Monkey Boy doesn't get the corporate shekels without doing major mojo. This program runs in parallel. It juggles 64 instances of the command at a time. It can ping all the servers on a netmare in less than a minute, keep track of the results and print them out in the order that they were input. I get wet just thinking about it!

All praise Monkey Boy, right? Well, I did run into a problem. A sneaky problem that involved failed clones, suicidal files and a forgotten inheritance. It's good stuff. The problem is, if you don't program in Perl, you're not going to give a crap. Oh well, I've never let my complete lack of an audience slow me down before. Why start now?

Like I said before, the beast is written in Perl. Perl's not an elegant language, but if you need to leap out of the bushes, rape and strangle a problem and get on with your life, then Perl's your language of choice.

The basic layout is: Start a sub-process for each of the first 64 servers and have each one write to its own temp file. When one sub-process finishes, read its results from the temp file and let the temp file disappear. Then add a new sub-process for the next server. Once all the servers are done, print out the results in order and accept smoochies from hot code groupies/naked underwear models. Simple.
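For the curious, that layout can be sketched roughly like this. It is a minimal sketch, not the actual program: the pool size of 4, the server names and the use of echo as a stand-in for the real command are all made up for illustration, and it bails out of a failed exec() with POSIX::_exit, which is only one way to handle that case.

```perl
use strict;
use warnings;
use File::Temp;
use POSIX ();

my $max   = 4;                                  # the post juggles 64
my @hosts = qw(alpha beta gamma delta epsilon); # hypothetical server names
my @cmd   = ('echo', 'checked');                # harmless stand-in for ping

my %running;   # child pid => { index, tmp }
my @results;   # command output, stored by input position
my $next = 0;  # index of the next server to start

while ($next < @hosts || %running) {
    # Top up the pool until $max children are running.
    while ($next < @hosts && keys(%running) < $max) {
        my $tmp = File::Temp->new;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: send STDOUT to the temp file, then become the command.
            open(STDOUT, '>', $tmp->filename) or POSIX::_exit(126);
            exec(@cmd, $hosts[$next]) or POSIX::_exit(127);
        }
        $running{$pid} = { index => $next, tmp => $tmp };
        $next++;
    }

    # Block until any one child finishes, then collect its output.
    my $pid = wait();
    my $job = delete $running{$pid} or next;
    open(my $in, '<', $job->{tmp}->filename)
        or die "Couldn't open file '", $job->{tmp}->filename, "': $!";
    $results[ $job->{index} ] = do { local $/; <$in> };
    close $in;
    # $job goes out of scope here, so File::Temp removes the file.
}

# Children finish in any order; the output is printed in input order.
print "$hosts[$_]: $results[$_]" for 0 .. $#hosts;
```

Storing each result at its input index is what lets the children finish in any order while the output still comes out in the order the servers were given.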

The Boy of Monkey knew that he'd need lots of temporary files to hold command results. He'd also like the files to go away by themselves when he's done with them. Perl leapt to his aid with File::Temp. File::Temp is kind of the anonymous underage prostitute of programming. When you say "Gimme" it gives you access to the goodies and provides an assumed name. When you're done using it, it disappears into the aether. It all works great, until someone, or something, starts killing the temps prematurely. Then you get a mystery to solve. Foreshadowing!
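Here's a small sketch of that disappearing act, using File::Temp's object interface. The template and suffix are guesses chosen to resemble the file name that shows up later in the story; the real program's options aren't shown anywhere in this post.

```perl
use strict;
use warnings;
use File::Temp;

my ($path, $existed);
{
    # Hypothetical template; the X's are replaced with random characters.
    my $tmp = File::Temp->new(
        TEMPLATE => 'multi.XXXXX',
        SUFFIX   => '.pid',
        TMPDIR   => 1,
    );
    $path = $tmp->filename;               # the assumed name
    print {$tmp} "ping results go here\n";

    $existed = -e $path;
    print "while held:  ", ($existed ? "exists" : "missing"), "\n";
}   # $tmp goes out of scope here and the file is removed

print "after scope: ", (-e $path ? "exists" : "missing"), "\n";
```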

I got the code working and was getting ready to document (Yes, Monkey Boy is a pro, not documenting makes you a douche bag), when I thought, what happens if the server command can't run? If the user misspells "ping" as "pong" will they get a reasonable message?

Does
Couldn't open file '/tmp/multi.1d834.pid': No such file or directory
strike you as reasonable? Me neither. Nuts!

The actual error message made sense to Monkey Boy. He spawned the beast. '/tmp/multi.1d834.pid' is one of the randomly generated temp file names. When a sub-process finished, the program asked the temp handle for its file name. It tried to open the file, but for some reason the file was gone! Somehow the files were being killed before Monkus Boyus could get to them. No one should know about these files. They have specially constructed names, known only to the monkey... or the monkey's clone.

The way you run a sub-process in Linux is with the fork() call. You're running along happy as a clam, as if a mucus-coated bivalve is your apotheosis of happiness, and then you hit fork(). At that point your program is cloned. You have 2 running copies of the code. The only difference is that fork() tells the parent the ID of its child. The child is handed an ID of 0, which tells it that it's the clone.
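In Perl the cloning looks something like the sketch below. It only shows the two return values of fork(); nothing here is taken from the real program.

```perl
use strict;
use warnings;

$| = 1;   # flush STDOUT right away so the clone doesn't inherit buffered output

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # fork() returned 0: this copy is the clone.
    print "child:  my own pid is $$\n";
    exit 0;
}

# fork() returned the child's process ID: this copy is the parent.
print "parent: my child is $pid\n";
waitpid($pid, 0);
print "parent: child exited with status ", $? >> 8, "\n";
```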

This is kind of Star Trekkie at this point, ain't it? We've got parents making 64 clones (top that, Octomom!) and we've got children that are one bit away from being perfect copies of their parent. It gets better. The clone's next job is to call exec(), which completely obliterates it and replaces it with another program. It's this second program, the server command, which does the real work I want done.

A clone has one job. Its job is to die and be forgotten. Programming ain't for wussies!

When all goes well, the program runs like a well-oiled roach motel. The clones check in, but never check out. They disappear on the spot and are never heard from again.

That's when all goes well. What happens when the exec() call fails?

It's simple really, the clone lives on! It also keeps its copy of the temp files, which it believes it owns. When it dies, it takes the temp files with it. Clones can be selfish little pricks.
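You can watch a clone outlive a failed exec() with a sketch like this one. The command name is deliberately bogus and assumed not to exist on your machine, and leaving through POSIX::_exit (which skips END blocks and destructors, so the clone cleans up nothing) is one defensive option, not what the original program did.

```perl
use strict;
use warnings;
use POSIX ();

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    no warnings 'exec';   # we report the failure ourselves
    # Hypothetical misspelled command, assumed not to be installed.
    exec('pong-no-such-command', 'localhost') or do {
        # exec() only returns on failure, so the clone is still alive here,
        # still running a copy of the parent's code.
        print STDERR "clone $$ lives on: exec failed: $!\n";
        POSIX::_exit(127);   # die without running destructors or END blocks
    };
}

waitpid($pid, 0);
my $status = $? >> 8;
print "child exit status: $status\n";
```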

Eventually the grieving parent checks on the child. It notices that it's died and then tries to check the temp file for the reason. The temp file is gone, baby, gone. It died at the hands of junior. You can only die once in temp file land.

As for the reason the exec() failed? It was written into the temp file. You know, the temp file that's in temp file heaven? Hmm. What to do? What to do?

Suddenly Monkey Boy (you remember Monkey Boy, he's the hero of this epic) has an insight. When you delete a file it doesn't really disappear until the last program that has a hold of it lets go. It's removed from the directory, so it can't be seen, but it's still out there, in limbo, waiting for the sweet kiss of digital death.
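That limbo is easy to demonstrate on a Unix-like system. In this sketch the file is deleted by hand with unlink (UNLINK => 0 keeps File::Temp from doing any cleanup of its own), and the message written into it is just sample text.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# tempfile() returns an open read/write handle plus the file's name.
my ($fh, $path) = tempfile(UNLINK => 0);
print {$fh} "exec failed: No such file or directory\n";
$fh->flush;

# Remove the directory entry while the handle is still open.
unlink $path or die "unlink failed: $!";
print "name still in the directory? ", (-e $path ? "yes" : "no"), "\n";

# The data is still reachable through the open handle.
seek($fh, 0, 0) or die "seek failed: $!";
my $line = <$fh>;
print "still readable: $line";
close $fh;   # last holder lets go; now the file is really gone
```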

Who else is holding on to the file? The parent of course! The question was, could Monkey Boy get the parent to cough up the handle and could it be used for reading?

Detective Monkey Boy began investigating. He checked the usual suspects. "perldoc File::Temp" didn't provide much. It was higher level than a kite. The Internet tubes were blocked by flame wars and almost naked pictures of some platinum blonde from California. No go. Detective Monkey Boy knew what he had to do. He had to go (non-prequel) Jedi. "Use the Source, Luke!" is the rallying call of the Open Source movement. But Monkey Boy's name ain't Luke.

Into the source goes the hero. Past lines of documentation. Past obscure code references. Further he goes, until he finds what he knows in his heart must be there: "use IO::Handle". Rockin'!

For those that don't know, deep in the belly of the beast, a file comes down to little more than a number. When you open up a file, voodoo happens, and an entry is put in a table called the File Descriptor Table. What you deal with, either directly or through some Perl interface, is an entry in this table. If you can figure out the index number into this table, you can find your file. File::Temp was a cold fish, but what about its ancestors? File::Temp inherits from IO::Handle. To get to the real power, you got to seduce grandma.

"Hey there Granny!"

Once I got my hands on Granny's nodes (yech!) she gave up IO::Handle. IO::Handle has the fileno() method. You got the number, you get the data.

After that it was just a hop, skip and a file dupe to get to the data so ingloriously killed off by the wayward clone. It takes more than the death of a temp file to stop a motivated Monkey Boy!
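The hop, skip and dupe might look something like this sketch. The real code isn't shown in this post, so treat every detail as an assumption: the child here is a stand-in that records an error, deletes the file by hand and dies (File::Temp's own cleanup in a forked copy may behave differently from version to version), and the parent dups its descriptor with the '<&' open mode.

```perl
use strict;
use warnings;
use File::Temp;
use POSIX ();

my $tmp  = File::Temp->new;     # the parent holds this handle throughout
my $path = $tmp->filename;

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # Stand-in for the wayward clone: record the error, kill the file, die.
    open(my $out, '>>', $path) or POSIX::_exit(1);
    print {$out} "exec failed for 'pong'\n";
    close $out;
    unlink $path;
    POSIX::_exit(127);
}
waitpid($pid, 0);

# The name is gone, but the parent's descriptor still points at the data.
my $fd = $tmp->fileno;                      # inherited from IO::Handle
open(my $in, '<&', $fd) or die "dup failed: $!";
seek($in, 0, 0) or die "seek failed: $!";   # a dup shares the file offset
my @lines = <$in>;
close $in;

print "file visible by name? ", (-e $path ? "yes" : "no"), "\n";
print "recovered: @lines";
```

The seek back to the start is there because a dup'd descriptor shares its file position with the original, so it's safer not to assume where that position is.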

All praise Monkey Boy.

2 comments:

chorny said...

"Perl's not an elegant language" - you should try things like Perl::Critic, perltidy, Moose, MooseX::Declare, signatures.pm.

Dale Wiles said...

I'm not saying that you can't trowel a big pile of elegance on top of Perl. With enough work I could come up with object oriented COBOL, but it would pervert the paradigm.

At its enlarged heart, Perl is a "get it done, then count the bodies" type of language. That's why it's so useful.

Of course there are right ways and wrong ways to chop up a baby.