How I Caught the Spam and What I Did With it When I Caught it - page 3

An Unpleasant Surprise

  • October 14, 1999
  • By Mark-Jason Dominus
Our filter program will be run by the mail transport agent (MTA), which is the program that is responsible for receiving mail from the network and for delivering it to the right place. With sendmail, for example, you can put a line like this one into your .forward file:

    "| /home/mjd/bin/mailfilter"

When mail arrives, sendmail will run the mailfilter program and hand it the mail message on its standard input. mailfilter can then decide whether to deliver the message by writing it to the mailbox, or whether to throw it away, or whether to do something else. Most MTAs have an option to deliver mail to a program in this way. My system was using the superb qmail MTA, so I would put the line

    | /home/mjd/bin/mailfilter

into my ~/.qmail file. (Actually my filter program is named deliver.aol.q2. Please don't ask why, because I don't remember.)

Reading the Message

What does this mailfilter program need to do? Obviously, the first thing it must do is read the mail message in from the standard input. Code to read in an email message is very simple in Perl:

    1   { local $/ = "";
    2     $header = <STDIN>;
    3     undef $/;
    4     $body = <STDIN>;
    5   }

This reads the header of the message into $header and the body into $body. What's going on here? The Perl <...> operator reads a line of input from some filehandle. But what's a line? Normally, it's any sequence of characters that's terminated by a newline character. Why a newline? Because that's the default setting of the Perl $/ special variable. If you change $/, that changes Perl's idea of what a line looks like. If you changed it to contain a period, then Perl would think that a `line' was any sequence of characters that ended with a period.
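To see the period example in action, here is a minimal sketch. The sample text and variable names are made up for illustration, and an in-memory filehandle stands in for real input so the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample text; the in-memory filehandle stands in for a real file.
my $text = "First sentence. Second sentence. Third";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = ".";      # a `line' now ends at each period
    @records = <$fh>;    # list context: read every record
}
close($fh);

# Brackets make the record boundaries (and leading spaces) visible.
print "[$_]\n" for @records;
```

Each record keeps its terminating period, just as an ordinary line keeps its newline. The last record has no period because the input ran out first.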

There are two special settings for $/, however. If you set $/ to the empty string, as I did on line 1, the <...> operator reads by paragraphs instead of lines; consecutive paragraphs are separated by a blank line. Each call to <...> reads in one complete paragraph. Since the header of a mail message is a paragraph, separated from the body by a blank line, line 2 reads the entire header into $header.
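Paragraph mode can be tried out on a small made-up message. This is only a sketch: the message text and the address in it are invented, and the message is read from a string rather than from standard input.

```perl
use strict;
use warnings;

# Hypothetical message: a two-line header, a blank line, then a two-paragraph body.
my $message = join "\n",
    'From: alice@example.com',
    'Subject: Hi',
    '',
    'First paragraph',
    'of the body.',
    '',
    'Second paragraph.',
    '';
open(my $fh, '<', \$message) or die "Cannot open in-memory file: $!";

my @paragraphs;
{
    local $/ = "";           # paragraph mode
    @paragraphs = <$fh>;     # one element per paragraph
}
close($fh);

print "Number of paragraphs: ", scalar(@paragraphs), "\n";
print "The first one is the header:\n$paragraphs[0]";
```

The header comes back as the first paragraph, which is why a single read in paragraph mode is enough to capture it.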

The local on line 1 confines the changes to $/ to the block, so that when control reaches line 5, the original value of $/ is automatically restored. We might be doing file I/O in other parts of our program, and if we didn't put $/ back to normal we'd get weird results when the <...> operator didn't behave the way we expected. Using local ensures that we won't forget to put it back the way it was.
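The restoring effect of local is easy to check for yourself. The following sketch (the variable names are just for illustration) records the value of $/ before, inside, and after a block that localizes it.

```perl
use strict;
use warnings;

# Turn a separator value into something printable.
sub describe {
    my ($sep) = @_;
    return 'undefined'    unless defined $sep;
    return 'empty string' if $sep eq "";
    return 'newline'      if $sep eq "\n";
    return "'$sep'";
}

my $before = $/;          # the default separator
my $inside;
{
    local $/ = "";        # temporary change, visible only until the block ends
    $inside = $/;
}
my $after = $/;           # back to the original value, with no explicit reset

print "before: ", describe($before), "\n";
print "inside: ", describe($inside), "\n";
print "after:  ", describe($after),  "\n";
```

Nothing in the block puts $/ back by hand; leaving the block is what restores it.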

On line 3 we see the other special setting of $/. If $/ is undefined, then there is no line termination sequence, and Perl's <...> operator reads the entire rest of the input all at once. This is sometimes called `slurping' the input. Line 4 reads the entire message body into the variable $body.
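Here is the same technique as a self-contained sketch you can run without a mail system. The read_message subroutine and the sample message are hypothetical; the real filter reads from STDIN, while this version takes any filehandle so that it can be tried on a string.

```perl
use strict;
use warnings;

# read_message is a hypothetical helper name, not part of the original filter.
sub read_message {
    my ($fh) = @_;
    local $/ = "";            # paragraph mode: the first read returns the header
    my $header = <$fh>;
    undef $/;                 # slurp mode: the next read returns everything left
    my $body = <$fh>;
    return ($header, $body);  # $/ is restored when the subroutine returns
}

# Made-up message for illustration.
my $message = join "\n",
    'From: alice@example.com',
    'Subject: Hello',
    '',
    'First body paragraph.',
    '',
    'Second body paragraph.',
    '';
open(my $fh, '<', \$message) or die "Cannot open in-memory file: $!";
my ($header, $body) = read_message($fh);
close($fh);

print "HEADER:\n$header";
print "BODY:\n$body";
```

Note that the blank line inside the body does not split it: once $/ is undefined, the read takes everything that remains, blank lines included.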