Rule-based filtering

Content filtering techniques rely on lists of words or regular expressions that are disallowed in mail messages. Thus, if a site receives spam advertising "herbal Viagra", the administrator might place this phrase in the filter configuration. The mail server would then reject any message containing the phrase.
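A minimal sketch of such a phrase filter, written in Python for illustration, might look as follows. The rule list `BLOCKED_PATTERNS` and the function names are hypothetical and not taken from any particular mail server; a real site would maintain its own configuration.

```python
import re

# Hypothetical rule list: each entry is a phrase or regular expression
# that the administrator has chosen to disallow.
BLOCKED_PATTERNS = [
    re.compile(r"herbal\s+viagra", re.IGNORECASE),
    re.compile(r"\bmortgage\s+refinanc\w*", re.IGNORECASE),
]


def matching_rules(body: str) -> list[str]:
    """Return the pattern text of every rule that matches the message body."""
    return [p.pattern for p in BLOCKED_PATTERNS if p.search(body)]


def should_reject(body: str) -> bool:
    """Reject the message if at least one disallowed phrase is present."""
    return bool(matching_rules(body))
```

Under these assumptions, `should_reject("Buy Herbal Viagra today")` is true, but so is `should_reject("Your bank's mortgage refinancing statement")`, which is a legitimate message caught by the same rule.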

Header filtering inspects the header of the email, the part of the message that contains information about the message itself. Spammers often spoof header fields in order to hide their identities or to make the email look more legitimate than it is; many of these spoofing methods can be detected. Messages whose headers violate the RFC 2822 standard for header format are also frequently rejected.
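As an illustration of the structural side of header filtering, the sketch below uses Python's standard `email` package to check two requirements of RFC 2822: that the `Date` and `From` fields each appear exactly once, and that the `Date` value can be parsed. The function name `header_problems` is hypothetical, and the checks cover only a small part of the standard; they do not attempt to detect spoofing.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# RFC 2822 requires exactly one origination date and one From field.
REQUIRED_ONCE = ("Date", "From")


def header_problems(raw_message: str) -> list[str]:
    """Return a list of human-readable header violations (empty if none)."""
    msg = message_from_string(raw_message)
    problems = []

    for name in REQUIRED_ONCE:
        count = len(msg.get_all(name, []))
        if count != 1:
            problems.append(f"{name} header appears {count} times")

    date_value = msg.get("Date")
    if date_value is not None:
        try:
            parsedate_to_datetime(date_value)
        except (TypeError, ValueError):
            # Older Python versions raise TypeError, newer ones ValueError.
            problems.append("Date header is not parseable")

    return problems
```

A filter built this way could reject a message outright when the list is non-empty, or merely add to a spam score; which policy is appropriate is a site decision.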

Filtering has three disadvantages. First, it can be time-consuming to maintain. Second, it is prone to false positives. Third, these false positives are not equally distributed: content filtering tends to reject legitimate messages on topics related to products frequently advertised in spam. For example, a system administrator who attempts to reject spam messages advertising mortgage refinancing, credit, or debt may inadvertently block legitimate e-mail on the same subjects.

Spammers frequently change the phrases and spellings they use. This can mean more work for the administrator, but it also has some advantages for the spam fighter. If the spammer starts spelling "Viagra" as "V1agra" (a leet spelling) or "Via_gra", the message becomes harder for the spammer's intended audience to read. If the spammer tries to trip up the phrase detector by, for example, inserting an HTML comment that is invisible to the user in the middle of a word ("Via<!---->gra"), this sleight of hand is itself easily detectable and is a good indication that the message is spam. And if the spammer sends spam that consists entirely of images, so that anti-spam software cannot analyze the words and phrases in the message, the absence of readable text in the body can be detected, which raises the likelihood that the message is spam.
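The two detectable tricks described above, an HTML comment hidden inside a word and a body that contains images but no readable text, can be sketched with Python's standard library. The names `obfuscation_signals` and `_TextAndImages` are hypothetical, and the heuristics are deliberately simple; for instance, text inside `<style>` or `<script>` elements is counted as readable text here.

```python
import re
from html.parser import HTMLParser

# An HTML comment with word characters directly on both sides,
# as in "Via<!-- x -->gra".
COMMENT_IN_WORD = re.compile(r"\w<!--.*?-->\w", re.DOTALL)


class _TextAndImages(HTMLParser):
    """Count visible text characters and <img> tags in an HTML body."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.images = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images += 1

    def handle_data(self, data):
        self.text_chars += len(data.strip())


def obfuscation_signals(html_body: str) -> set[str]:
    """Return the set of obfuscation signals found in an HTML message body."""
    signals = set()
    if COMMENT_IN_WORD.search(html_body):
        signals.add("comment-inside-word")

    parser = _TextAndImages()
    parser.feed(html_body)
    parser.close()
    if parser.images > 0 and parser.text_chars == 0:
        signals.add("image-only-body")
    return signals
```

Each signal would typically raise a message's spam score rather than trigger rejection on its own, since legitimate mail can occasionally be image-only.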

Content filtering can also examine the URLs present in an email message, that is, the sites being spamvertised. URLs are much harder to disguise, because they must resolve to a valid domain name. Extracting a list of such links and comparing them to published sources of spamvertised domains is a simple and reliable way to eliminate a large percentage of spam via content analysis.
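The extract-and-compare step can be sketched as follows. The in-memory set `SPAMVERTISED_DOMAINS` is a hypothetical stand-in for a published list of spamvertised domains, and the URL pattern is a simplification that only recognizes `http` and `https` links.

```python
import re
from urllib.parse import urlsplit

# Hypothetical local blocklist; a real deployment would consult a
# published source of spamvertised domains instead.
SPAMVERTISED_DOMAINS = {"pills.example", "cheap-loans.example"}

# Simplified pattern: http(s) URLs up to whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_hosts(body: str) -> set[str]:
    """Return the lower-cased host names of all URLs found in the body."""
    hosts = set()
    for url in URL_PATTERN.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. an unbalanced IPv6 bracket
        if host:
            hosts.add(host.lower())
    return hosts


def listed_hosts(body: str, blocklist=SPAMVERTISED_DOMAINS) -> set[str]:
    """Return hosts whose name, or any parent domain, is on the blocklist."""
    hits = set()
    for host in extract_hosts(body):
        labels = host.split(".")
        # Check the full host, then each parent domain down to two labels.
        for i in range(len(labels) - 1):
            if ".".join(labels[i:]) in blocklist:
                hits.add(host)
                break
    return hits
```

Matching on parent domains means that a listed domain cannot be evaded simply by adding a new subdomain in front of it.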