ANTI SPAM: July 2007

Anti-spam techniques.

Spam Harassment Reduction (SHRED)

SHRED is a recent proposal for a sender-at-risk protocol that aims to avoid some of the defects of earlier reputation and postage-based systems. There are a number of proposals for sideband protocols that will assist SMTP operation. The Anti-Spam Research Group (ASRG) of the Internet Research Task Force (IRTF) is working on a number of e-mail authentication and other proposals for providing simple source authentication that is flexible, lightweight, and scalable. Recent Internet Engineering Task Force (IETF) activities include MARID (2004), leading to two approved IETF experiments in 2005, and DomainKeys Identified Mail in 2006.

Bonds or Sender-at-risk

A refinement to stamp systems was the idea of requiring that the micropayment be retained only if the recipient considered the email to be abusive. This addressed the principal objection to stamp systems: popular free legitimate mailing list hosts would be unable to continue to provide their services if they had to pay postage for every message they sent out.

Bill Gates announced that Microsoft is working on a solution requiring so-called “unknown senders”, i.e. senders not on the Accepted List of the recipient, to post “the electronic equivalent of a” stamp whose value would be lost to the sender only if the recipient disapproves of the email. Gates said that Microsoft favors other solutions in the short term, but would rely on the contingent payment solution to solve the spam problem over the longer run. Microsoft, AOL, and Yahoo! have recently introduced systems that allow commercial senders to avoid filters if they obtain a certificate or certification, which is lost to the sender if recipients complain.

The intent of such "sender-at-risk" solutions, which impose a significant cost to the sender only if the recipient rejects the message subsequent to receiving the email, is to deter spam by making it economically prohibitive to send unwanted email messages, while allowing legitimate emailers to send messages at little or no expense.

Proof-of-work systems

Proof-of-work systems such as hashcash require that a sender pay a computational cost by performing a calculation that the receiver can later verify. Verification must be much faster than performing the calculation, so that the computation slows down a sender but does not significantly impact a receiver. The point is to slow down the machines that send most spam, often millions and millions of messages. While every user that wants to send email to a moderate number of recipients suffers just a few seconds' delay, sending millions of emails would take an unaffordable amount of time. This approach suffers when a sender maintains a computation farm of their own or uses zombies.
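The asymmetry between minting and verifying can be illustrated with a minimal sketch. It is not the real hashcash stamp format: the "resource:counter" layout, the function names, and the 12-bit default difficulty are assumptions chosen to keep the example short.

```python
import hashlib
from itertools import count

def leading_zero_bits(stamp: str) -> int:
    """Count the leading zero bits of the SHA-1 hash of the stamp."""
    value = int.from_bytes(hashlib.sha1(stamp.encode()).digest(), "big")
    return 160 - value.bit_length()

def mint(resource: str, bits: int = 12) -> str:
    """Sender side: brute-force a counter until the hash has enough zero bits."""
    for counter in count():
        stamp = f"{resource}:{counter}"  # simplified, hypothetical stamp layout
        if leading_zero_bits(stamp) >= bits:
            return stamp

def verify(stamp: str, resource: str, bits: int = 12) -> bool:
    """Receiver side: a single hash checks the stamp and the intended recipient."""
    return stamp.startswith(resource + ":") and leading_zero_bits(stamp) >= bits
```

Minting takes about 2^bits hash evaluations on average, while verification takes one, so raising `bits` increases the sender's cost without affecting the receiver.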

Techniques for researchers & law enforcement

Increasingly, anti-spam efforts have required co-ordination between law enforcement, researchers, major consumer financial service companies and Internet service providers who need e-mail spam, identity theft and phishing evidence to track and monitor the risks and activities.

Honeypots

Another approach is simply an imitation MTA which gives the appearance of being an open mail relay, or an imitation TCP/IP proxy server which gives the appearance of being an open proxy. Spammers who probe systems for open relays/proxies will find such a host and attempt to send mail through it, wasting their time and potentially revealing information about themselves and the source of spam to the unexpected alert entity (in comparison to the anticipated careless or unskilled operator typically in charge of open relay MTA systems) that operates the honeypot. Such a system may simply discard the spam attempts, submit them to DNSBLs, or store them for analysis.

Spam report feedback loops

By monitoring spam reports from places such as SpamCop, AOL's feedback loop, Network Abuse Clearinghouse, the domain's abuse@ mailbox, etc., ISPs can often learn of problems before they seriously damage the ISP's reputation and get its mail servers blacklisted.

Strong AUP and TOS agreements

Most ISPs and web e-mail providers have either an Acceptable Use Policy (AUP) or a Terms of Service (TOS) agreement that discourages spammers from using their system and allows a spammer's account to be terminated quickly.

Port 25 interception

Network address translation can be used to intercept all port 25 (SMTP) traffic and direct it to a mail server that enforces rate limiting and egress spam filtering. This is commonly done in hotels, but it can cause e-mail privacy problems, as well as making it impossible to use STARTTLS and SMTP-AUTH if the port 587 submission port isn't used.

Rate limiting

Machines that suddenly start sending lots of e-mail may well have become zombie computers. By limiting the rate at which e-mail can be sent to around what is typical for the computer in question, legitimate e-mail can still be sent, but large spam runs can be slowed down until manual investigation can be done.
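One common way to implement such a limit is a token bucket, sketched below. The class name, the parameters, and the choice of a token bucket are assumptions for illustration; the clock is passed in explicitly so the behaviour is easy to follow.

```python
class RateLimiter:
    """Token bucket: allows short bursts but caps the sustained sending rate."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec      # tokens (messages) refilled per second
        self.capacity = burst         # largest burst allowed at once
        self.tokens = float(burst)
        self.last = 0.0               # time of the previous call

    def allow(self, now: float) -> bool:
        """Return True if one more message may be sent at time `now`."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A host that normally sends a handful of messages never notices the limit, while a sudden run of thousands is throttled to `rate_per_sec` once the burst allowance is used up.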

Port 25 blocking

Firewalls and routers can be programmed to not allow SMTP traffic (TCP port 25) from machines that are not supposed to run Mail Transfer Agents or send e-mail.[3] This practice is somewhat controversial when ISPs block home users, especially if the ISPs do not allow the blocking to be turned off upon request. E-mail can still be sent from these computers to designated smart hosts via port 25 and to other smart hosts via the e-mail submission port 587.

Egress spam filtering

E-mail senders can do the same type of anti-spam checks on e-mail coming from their users and customers as can be done for e-mail coming from the rest of the Internet.

Limit e-mail backscatter

If any sort of bounce message or anti-virus warning gets sent to a forged email address, the result will be e-mail backscatter.

Problems with sending challenges to forged e-mail addresses can be greatly reduced by not creating a new message that contains the challenge. Instead, the challenge can be placed in the Bounce message when the receiving mail system gives a rejection-code during the SMTP session. When the receiving mail system rejects an e-mail this way, it is the sending system that actually creates the bounce message. As a result, the bounce message will almost always be sent to the real sender, and it will be in a format and language that the sender will usually recognize.

Confirmed opt-in for mailing lists

A difficulty in implementing opt-in mailing lists is that many means of gathering user e-mail addresses remain susceptible to forgery. For instance, if a company puts up a Web form to allow users to subscribe to a mailing list about its products, a malicious person can enter other people's e-mail addresses, to harass them or to make the company appear to be spamming. (To most anti-spammers, if the company sends e-mail to these forgery victims, it is spamming, albeit inadvertently.)

To prevent this abuse, MAPS and other anti-spam organizations encourage that all mailing lists use confirmed opt-in (also known as verified opt-in and (by spammers themselves) as double opt-in). That is, whenever an address is presented for subscription to the list, the list software should send a confirmation message to that address. The confirmation message contains no advertising content, so it is not construed to be spam itself, and the address is not added to the list unless the recipient responds to the confirmation message. See also the Spamhaus Mailing Lists vs. Spam Lists page. All modern mailing list management programs (such as GNU Mailman, Majordomo, and qmail's ezmlm) support confirmed opt-in by default.
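The confirmation step can be sketched as follows. The class and method names are hypothetical, and actually mailing the token to the address is left out; the point is only that an address joins the list after its owner returns a token that was sent to that address.

```python
import secrets

class MailingList:
    def __init__(self):
        self.pending = {}         # confirmation token -> address awaiting confirmation
        self.subscribers = set()  # confirmed addresses only

    def request_subscription(self, address: str) -> str:
        """Record the request and return the token to be mailed to `address`."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address.lower()
        return token

    def confirm(self, token: str) -> bool:
        """Add the address only if a valid, unused token comes back."""
        address = self.pending.pop(token, None)
        if address is None:
            return False
        self.subscribers.add(address)
        return True
```

Someone who enters a victim's address in the Web form never sees the token, so the victim is not subscribed unless they respond themselves.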

Background checks

Since spammers are frequently kicked off the network, they are constantly trying to create new accounts. Many spammers are able to make even a few hours profitable for them and can cause many days of damage to the reputation of the services they abused. As a result, many ISPs and web e-mail providers use CAPTCHAs on new accounts, try to verify that credit cards are not stolen before accepting new customers, check the Spamhaus Project ROKSO list, and do other background checks.

X-ASVP eXtensible Anti-spam Verification Protocol

X-ASVP is a method for enabling peer to peer scalable authentication between holders of an Internet e-mail address, with redundancy and reliability provided by secondary (tld) and tertiary (global) providers.

Automated techniques for e-mail senders

There are a variety of techniques that e-mail senders use to try to make sure that they do not send spam. Failure to control the amount of spam sent, as judged by e-mail receivers, can often cause even legitimate email to be blocked and the sender to be put on DNSBLs.

Tarpits

A tarpit is any server software which intentionally responds pathologically slowly to client commands. By running a tarpit which treats acceptable mail normally and known spam slowly or which appears to be an open mail relay, a site can slow down the rate at which spammers can inject messages into the mail facility. Many systems will simply disconnect if the server doesn't respond quickly, which will eliminate the spam. However, a few legitimate e-mail systems will also not deal correctly with these delays.

Transparent SMTP proxy

Transparent SMTP proxies make it possible to combat spam in real time by combining controls on sender behavior, giving legitimate users immediate feedback, and eliminating the need for a quarantine.

Statistical content filtering

Statistical filtering was first proposed in 1998 by Mehran Sahami et al., at the AAAI-98 Workshop on Learning for Text Categorization. A statistical filter is a kind of document classification system, and a number of machine learning researchers have turned their attention to the problem. Statistical filtering was popularized by Paul Graham's influential 2002 article A Plan for Spam, which proposed the use of naive Bayes classifiers to predict whether messages are spam or not – based on collections of spam and nonspam ("ham") email submitted by users.

Statistical filtering, once set up, requires no maintenance per se: instead, users mark messages as spam or nonspam and the filtering software learns from these judgements. Thus, a statistical filter does not reflect the software author's or administrator's biases as to content, but it does reflect the user's biases as to content; a biochemist who is researching Viagra won't have messages containing the word "Viagra" flagged as spam, because "Viagra" will show up often in his or her legitimate messages. Spam emails containing the word "Viagra", however, do get filtered because of their unique content compared to legitimate messages.

A statistical filter can also respond quickly to changes in spam content, without administrative intervention. Statistical filters should also look at message headers, thereby considering not just the content but also peculiarities of the transport mechanism of the email. Spammers have attempted to fight statistical filtering by inserting many random but valid "noise" words or sentences into their messages while attempting to hide them from view, making it more likely that the filter will classify the message as neutral. (See Word salad (computer science).) Attempts to hide the noise words include setting them in tiny font or the same colour as the background. However, these noise countermeasures seem to have been largely ineffective.

Software programs that implement statistical filtering include Bogofilter, DSPAM, SpamBayes, the e-mail programs Mozilla and Mozilla Thunderbird, Mailwasher, and later revisions of SpamAssassin. Another interesting project is CRM114, which hashes phrases and does Bayesian classification on the phrases. There is also the free mail filter POPFile, which sorts mail into as many categories as you want (family, friends, co-workers, spam, whatever) with Bayesian filtering.
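The way a naive Bayes filter learns from user judgements can be shown with a small sketch. It is a textbook multinomial naive Bayes with Laplace smoothing, not the exact method of any of the programs named above; the class name and whitespace tokenisation are assumptions, and at least one spam and one ham message must be trained before scoring.

```python
import math
from collections import Counter

class NaiveBayesFilter:
    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Learn from a message the user has marked as 'spam' or 'ham'."""
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def spam_probability(self, text: str) -> float:
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total_messages = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.words[label].values())
            score = math.log(self.messages[label] / total_messages)  # class prior
            for word in text.lower().split():
                # Laplace smoothing keeps unseen words from zeroing the product.
                count = self.words[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_score[label] = score
        # Turn the two log scores into P(spam | text).
        return 1.0 / (1.0 + math.exp(log_score["ham"] - log_score["spam"]))
```

Because the word counts come from the user's own mail, a word such as "viagra" that also appears in that user's legitimate messages carries less weight than it would for someone who only ever sees it in spam.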

SMTP callback verification

Since a large percentage of spam has forged and invalid sender ("from") addresses, some spam can be detected by checking that this "from" address is valid. A mail server can try to verify the sender address by making an SMTP connection back to the mail exchanger for the address, as if it was creating a bounce, but stopping just before any e-mail is sent.

Callback verification is compliant with SMTP RFCs, but it has various drawbacks, mainly that it relies on external mail servers whose responses are not reliable, as well as the fact that it's using other people's resources to combat spam. At the same time, there will be numerous false negatives (spammers abusing real addresses) and some false positives (legitimate e-mail with undeliverable sender address).

Sender-supported whitelists and tags

There are a small number of organizations which offer IP whitelisting and/or licensed tags that can be placed in email (for a fee) to assure recipients' systems that the messages thus tagged are not spam. This system relies on legal enforcement of the tag. The intent is for email administrators to whitelist messages bearing the licensed tag.

A potential difficulty with such systems is that the licensing organization makes its money by licensing more senders to use the tag, not by strictly enforcing the rules upon licensees. A further concern is that senders whose messages are more likely to be considered spam would accrue a greater benefit by using such a tag. Together, these factors form a perverse incentive for licensing organizations to be lenient with licensees who have offended. However, the value of a license would drop if it was not strictly enforced, and financial gains due to enforcement of a license itself can provide an additional incentive for strict enforcement. The Habeas mail classing system attempts to further address this issue by classing email according to origin, purpose, and permission. The purpose is to describe why the email is likely not spam but permission-based email.

Rule-based filtering

Content filtering techniques relied on the specification of lists of words or regular expressions disallowed in mail messages. Thus, if a site receives spam advertising "herbal Viagra", the administrator might place these words in the filter configuration. The mail server would then reject any message containing the phrase.

Header filtering is the means of inspecting the header of the email, the part of the message that contains information about the message. Spammers will often spoof fields in the header in order to hide their identities, or to try to make the email look more legitimate than it is; many of these spoofing methods can be detected. Also, headers that violate the RFC 2822 standard on how the email header is to be formed are frequently rejected.

Disadvantages of filtering are threefold. First, it can be time-consuming to maintain. Second, it is prone to false positives. Third, these false positives are not equally distributed: content filtering is prone to reject legitimate messages on topics related to products frequently advertised in spam. A system administrator who attempts to reject spam messages which advertise mortgage refinancing, credit or debt may inadvertently block legitimate e-mail on the same subject.

Spammers frequently change the phrases and spellings they use. This can mean more work for the administrator. However, it also has some advantages for the spam fighter. If the spammer starts spelling "Viagra" as "V1agra" (see leet) or "Via_gra", it makes it harder for the spammer's intended audience to read their messages. If they try to trip up the phrase detector by, for example, inserting an invisible-to-the-user HTML comment in the middle of a word ("Via<!-- -->gra"), this sleight of hand is itself easily detectable, and is a good indication that the message is spam. And if they send spam that consists entirely of images, so that anti-spam software can't analyze the words and phrases in the message, the fact that there is no readable text in the body can be detected, making that message a higher risk of being spam.

Content filtering can also be implemented to examine the URLs present (i.e. spamvertising) in an email message. This form of content filtering is much harder to disguise as the URLs must resolve to a valid domain name. Extracting a list of such links and comparing them to published sources of spamvertised domains is a simple and reliable way to eliminate a large percentage of spam via content analysis.
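Extracting linked domains and comparing them with a list of spamvertised domains might look like the sketch below. The regular expression is deliberately simple, the function names are hypothetical, and the blocklist is assumed to be a plain set of domain names obtained elsewhere.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

def linked_domains(body: str) -> set[str]:
    """Return the lower-cased host names of all http(s) URLs in the body."""
    domains = set()
    for url in URL_RE.findall(body):
        try:
            host = urlsplit(url).hostname
        except ValueError:  # malformed URL, e.g. a broken IPv6 literal
            continue
        if host:
            domains.add(host)
    return domains

def is_spamvertised(body: str, blocklist: set[str]) -> bool:
    """True if any linked host, or one of its parent domains, is on the blocklist."""
    for host in linked_domains(body):
        labels = host.split(".")
        for i in range(len(labels) - 1):  # stop before the bare top-level domain
            if ".".join(labels[i:]) in blocklist:
                return True
    return False
```

Checking parent domains means a listing for pills.example also catches www.pills.example and other subdomains the spammer may rotate through.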

PTR/Reverse DNS checks

The PTR DNS records in the reverse DNS can be used for a number of things, including:

Most e-mail Mail Transfer Agents (server software) use a FCrDNS verification and if there is a valid domain name, put it into the "Received:" trace header field.

Some e-mail Mail Transfer Agents will perform FCrDNS verification on the domain name given on the SMTP HELO and EHLO commands. This can violate RFC 2821 and so e-mail is usually not rejected by default.

Some mail servers check the domain names in the rDNS to see if they are likely from dial-up users, dynamically assigned addresses, or home-based broadband customers. Since the vast majority, but by no means all, of e-mail that originates from these computers is spam, many mail servers also refuse e-mail with missing or "generic" rDNS names.

A Forward Confirmed reverse DNS (FCrDNS) verification can create a weak form of authentication that there is a valid relationship between the owner of a domain name and the owner of the network that has been given an IP address. While weak, this authentication is strong enough that it can be used for whitelisting purposes because spammers and phishers cannot usually bypass this verification when they use zombie computers to forge the domains.
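An FCrDNS check can be sketched in a few lines: look up the PTR names for the connecting IP address, then keep only the names whose forward lookup returns that same address. The function names are hypothetical, the resolver functions are injectable so the logic can be exercised without network access, and addresses are compared as plain strings, which is a simplification for IPv6.

```python
import socket

def reverse_names(ip: str) -> list[str]:
    """PTR lookup: names the reverse DNS claims for this address."""
    try:
        name, aliases, _ = socket.gethostbyaddr(ip)
    except OSError:
        return []
    return [name, *aliases]

def forward_addresses(name: str) -> list[str]:
    """Forward lookup: addresses the name actually resolves to."""
    try:
        infos = socket.getaddrinfo(name, None)
    except OSError:
        return []
    return [info[4][0] for info in infos]

def fcrdns_names(ip: str, reverse=reverse_names, forward=forward_addresses) -> list[str]:
    """Return the PTR names that are confirmed by a matching forward lookup."""
    return [name for name in reverse(ip) if ip in forward(name)]
```

An empty result means either there is no reverse DNS at all or the reverse name does not point back at the connecting address.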

Greeting delay & Hybrid filtering

A greeting delay is a deliberate pause introduced by an SMTP server before it sends the SMTP greeting banner to the client. The client is supposed to wait until it has received this banner before it sends any data to the server (per RFC 2821, section 3.1). Many spam-sending applications do not wait to receive this banner, and instead start sending data once the TCP connection is complete. The server can detect this, and drop the connection.

There are some legitimate sites that play "fast and loose" with the SMTP specifications, and may be caught by this mechanism. It also has a tendency to interact badly with sites that perform Callback Verification, as common callback verification systems have timeouts that are much shorter than those mandated by RFC2821 4.5.3.2.

Hybrid filtering, such as is implemented in the open source programs SpamAssassin and Policyd-weight, uses some or all of the various tests for spam, and assigns a numerical score to each test. Each message is scanned for these patterns, and the applicable scores tallied up. If the total is above a fixed value, the message is rejected or flagged as spam. By ensuring that no single spam test by itself can flag a message as spam, the false positive rate can be greatly reduced.
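The scoring idea can be sketched as follows. The three tests, their weights, and the threshold are invented for illustration and are not taken from SpamAssassin or Policyd-weight; what matters is that no single weight exceeds the threshold on its own.

```python
# Hypothetical tests: (name, score, predicate over a message dictionary).
TESTS = [
    ("helo_not_fqdn", 2.0, lambda m: "." not in m["helo"]),
    ("no_reverse_dns", 1.5, lambda m: m["rdns"] is None),
    ("mentions_viagra", 2.5, lambda m: "viagra" in m["body"].lower()),
]
THRESHOLD = 5.0  # assumed cut-off; larger than any single test score

def classify(message: dict):
    """Return (total score, is_spam, names of the tests that fired)."""
    fired = [(name, points) for name, points, check in TESTS if check(message)]
    total = sum(points for _, points in fired)
    return total, total > THRESHOLD, [name for name, _ in fired]
```

A legitimate message that happens to trip one test, such as a biochemist's mail mentioning Viagra, stays below the threshold, while a message that trips several is flagged.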

Fake MX Records

Virus infected spam bots ignore requirements that the email start at the lowest numbered MX record and move up the list if a failure occurs. They try the highest numbered MX records first thinking that the backup servers have less spam filtering than the low numbered MX servers. Spam bots usually do not retry on failure but move on to the next email address in their list. Thus adding fake high numbered MX records is an effective way to reduce incoming spam.

One can also reduce spam by having a fake lowest MX record as well. This causes real email to have to retry as well but it only adds a second to the delivery time. Some people report as much as 90% reduction in spam bot spam using this method.

mx1.example.com - 10 - dead IP
mx2.example.com - 20 - real server
mx3.example.com - 30 - dead IP
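The effect of the layout above can be simulated with a small sketch. The two delivery functions are hypothetical models of the behaviour described, not real SMTP clients: a compliant sender walks the records in ascending preference order, while the bot tries only the highest-numbered record and gives up.

```python
# (preference, host, accepts_connections) mirroring the records listed above.
MX_RECORDS = [
    (10, "mx1.example.com", False),  # dead IP
    (20, "mx2.example.com", True),   # real server
    (30, "mx3.example.com", False),  # dead IP
]

def compliant_delivery(records):
    """Try each MX from lowest to highest preference until one accepts."""
    for _, host, alive in sorted(records):
        if alive:
            return host
    return None

def bot_delivery(records):
    """Try only the highest-numbered MX and never retry."""
    _, host, alive = max(records)
    return host if alive else None
```

With these records the compliant sender ends up at mx2.example.com after one failed attempt, and the bot delivers nothing.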

Greylisting

SMTP allows for temporary rejection of incoming messages. Greylisting is the technique of temporarily rejecting messages from unknown sender mail servers. A temporary rejection is designated with a 4xx error code that is recognized by all normal MTAs, which then proceed to retry delivery later.

Greylisting is based on the premise that spammers and spambots will not retry their messages. Instead, they will move on to the next message and next address. This is effective because a retry requires the message and the state of the delivery attempt to be stored, which inherently increases the cost incurred by the spammer, whereas such queuing is a standard component of any legitimate sender's server.
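A minimal greylisting decision can be sketched as below. Keying on the (client IP, envelope sender, envelope recipient) triplet and using a 300-second minimum delay are common choices but are assumptions here, as are the class name and the exact reply strings.

```python
class Greylist:
    def __init__(self, min_delay: float = 300.0):
        self.min_delay = min_delay  # seconds a new triplet must wait
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        """Return the SMTP reply for a delivery attempt at time `now`."""
        key = (client_ip, sender.lower(), recipient.lower())
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "250 OK"
        # 4xx tells a normal MTA to queue the message and retry later.
        return "451 4.7.1 Greylisted, please try again later"
```

A sender that queues and retries after the delay gets through on a later attempt; a spambot that never retries is never accepted.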

HELO/EHLO checking

Spam can be greatly reduced by a number of simple checks confirming compliance with standard addressing and MTA operation.

In many situations, simply requiring a valid FQDN in the SMTP EHLO statement is enough to block 25% of incoming spam.

Refusing connections from hosts that begin transmission prior to presentation of the receiving host's HELO banner
Refusing connections from hosts that give an invalid HELO - for example, a HELO that is not an FQDN or is an IP address not surrounded by square brackets

Invalid HELO localhost
Invalid HELO 127.0.0.1
Valid HELO domain.tld
Valid HELO [127.0.0.1]

Refusing connections from hosts that give an obviously fraudulent HELO - for example, issuing a HELO using the FQDN or an IP address that doesn't match the IP address of the connecting host

Fraudulent HELO friend
Fraudulent HELO -232975332

Refusing to accept email claiming to be from a hosted domain when the sending host has not authenticated
Refusing to accept email whose HELO/EHLO argument does not resolve in DNS. Unfortunately, some email system administrators ignore section 3.6 of RFC2821 and administer the MTA to use a nonresolvable argument to the HELO/EHLO command.

All of the examples above are fairly simple checks, all conform to existing standards and RFCs, and all are missing from most commercial MTA implementations available today.
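The syntactic part of these checks, accepting an FQDN or a bracketed address literal and nothing else, can be sketched as follows. The function name is hypothetical and the rules are a simplification of the RFC grammar; the check does not test whether the name resolves or matches the connecting host.

```python
import ipaddress
import re

LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def helo_is_valid(arg: str) -> bool:
    """Accept 'domain.tld' or '[127.0.0.1]'; reject bare IPs and single labels."""
    if arg.startswith("[") and arg.endswith("]"):
        literal = arg[1:-1]
        if literal.lower().startswith("ipv6:"):
            literal = literal[5:]
        try:
            ipaddress.ip_address(literal)
            return True
        except ValueError:
            return False
    labels = arg.rstrip(".").split(".")
    if len(labels) < 2:
        return False           # 'localhost', 'friend', '-232975332'
    if labels[-1].isdigit():
        return False           # bare IP address without square brackets
    return all(LABEL.match(label) for label in labels)
```

Applied to the examples listed above, this rejects "localhost", "127.0.0.1", "friend" and "-232975332" and accepts "domain.tld" and "[127.0.0.1]".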

Enforcing RFC standards

Enforcing technical requirements of the Simple Mail Transfer Protocol (SMTP) can be used to block mail coming from systems that do not comply with the RFC standards. A lot of spammers use poorly written software or are unable to comply with the standards because they do not have legitimate control of the computer sending spam (zombie computer). By setting restrictions on the MTA a mail administrator can reduce spam significantly.

DNS-based Blackhole Lists

DNS-based Blackhole Lists, or DNSBLs, are used for heuristic filtering and blocking. A site publishes lists (typically of IP addresses) via the DNS, in such a way that mail servers can easily be set to reject mail from those sources. There are literally scores of DNSBLs, each of which reflects different policies: some list sites known to emit spam; others list open mail relays or proxies; others list ISPs known to support spam. Other DNS-based anti-spam systems list known good ("white") or bad ("black") IPs, domains, or URLs, including RHSBLs and URIBLs. For history, details, and examples of DNSBLs, see DNSBL.
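An IPv4 DNSBL lookup is an ordinary DNS query for a name built from the reversed octets of the address followed by the list's zone. The sketch below assumes a hypothetical zone name and treats any successful resolution as "listed"; real lists may attach meanings to specific return values.

```python
import ipaddress
import socket

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the query name, e.g. 192.0.2.99 -> 99.2.0.192.<zone>."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """True if the query name resolves, i.e. the list has an entry for the IP."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:  # covers the resolver's 'name not found' errors
        return False
    return True
```

The `resolve` parameter defaults to the system resolver and can be replaced with a stub for testing.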

Country-based filtering

Some e-mail servers expect to never communicate with particular countries from which they receive a great deal of spam. Therefore, they use country-based filtering - a technique that blocks e-mail from certain other countries such as India. This technique is based on country of origin determined by the sender's IP address rather than any trait of the sender.

Checksum-based filtering

Checksum-based filtering exploits the fact that spam messages are sent in bulk, that is, that they will be identical apart from small variations. Checksum-based filters strip out everything that might vary between messages, reduce what remains to a checksum, and look that checksum up in a database which collects the checksums of messages that email recipients consider to be spam (some people have a button on their email client which they can click to nominate a message as being spam); if the checksum is in the database, the message is likely to be spam.

The advantage of this type of filtering is that it lets ordinary users help identify spam, and not just administrators, thus vastly increasing the pool of spam fighters. The disadvantage is that spammers can insert unique invisible gibberish—known as hashbusters—into the middle of each of their messages, thus making each message unique and having a different checksum. This leads to an arms race between the developers of the checksum software and the developers of the spam-generating software.
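A toy version of such a checksum is sketched below. The normalisation rules (drop markup, URLs, digits, punctuation, whitespace, and case) and the use of SHA-256 are assumptions for illustration; deployed systems use more robust fuzzy hashing.

```python
import hashlib
import re

def fuzzy_checksum(body: str) -> str:
    """Hash only the parts of the message that rarely vary between bulk copies."""
    text = body.lower()
    text = re.sub(r"<[^>]*>", "", text)       # HTML tags and comments
    text = re.sub(r"https?://\S+", "", text)  # per-recipient tracking URLs
    text = re.sub(r"[^a-z]+", "", text)       # digits, punctuation, whitespace
    return hashlib.sha256(text.encode()).hexdigest()
```

Two copies that differ only in spacing, case, or an order number produce the same checksum, whereas appending a few random letters (a hashbuster) changes it, which is exactly the weakness described above.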

Challenge/response systems

Another method which may be used by internet service providers, by specialized services or enterprises to combat spam is to require unknown senders to pass various tests before their messages are delivered. These strategies are termed challenge/response systems or C/R. Some view their use as being as bad as spam since they place the burden of spam fighting on legitimate email senders.

Automated techniques for e-mail administrators

There are a number of appliances, services and software systems that e-mail administrators can use to reduce the load of spam on their systems and mailboxes. Some of these depend upon rejecting email from Internet sites known or likely to send spam. Others rely on automatically analyzing the content of email messages and weeding out those which resemble spam. These two approaches are sometimes termed blocking and filtering.

There is an increasing trend of integration of anti-spam techniques into MTAs whereby the mail systems themselves also perform various measures that are generally referred to as filtering, ultimately resulting in spams being rejected before delivery (or blocked).

Many filtering systems take advantage of machine learning techniques, which improve their accuracy over manual methods. However, some people find filtering intrusive to privacy, and many e-mail administrators prefer blocking to deny access to their systems from sites tolerant of spammers.

Authentication and Reputation (A&R)

A number of systems have been proposed to allow acceptance of email from servers which have authenticated in some fashion as senders of only legitimate email. Many of these systems use the DNS, as do DNSBLs; but rather than being used to list nonconformant sites, the DNS is used to list sites authorized to send email, and (sometimes) to determine the reputation of those sites. Other methods of identifying ham and spam are still used. A&R allows much ham to be more reliably identified, which allows spam detectors to be made more sensitive without causing more false positive results. The increased sensitivity allows more spam to be identified as such. Also, A&R methods tend to be less resource-intensive than other filtering methods, which can be skipped for messages identified by A&R as ham.

Reporting spam

Some have argued that the most effective way to put an end to spam is by contacting the service providers that are responsible for bringing the content to your desktop. Of those, the registrar is often the most effective administrative body to contact. This is because often the spams are sent from hacked machines, whose administrators may be slow or unable to respond to the problem. Registrars on the other hand, as ICANN-accredited administrative organizations, are obliged to uphold certain rules and regulations, and have the resources necessary for dealing with abuse complaints.

Tracking down a spammer's ISP and reporting the offense can lead to the spammer's service being terminated. Unfortunately, it can be difficult to track down the spammer, and while there are some online tools to assist, they are not always accurate. Examples of these online tools are SpamCop and Network Abuse Clearinghouse. They provide automated or semi-automated means to report spam to ISPs. Some spam-fighters regard them as inaccurate compared to what an expert in the email system can do; however, most email users are not experts. Occasionally, spammers employ their own netblocks. In this case, the abuse contact for the netblock can be the spammer itself, and a report sent there can confirm your address to the spammer.

A useful free tool that may be used in the reporting of spam is also available (Complainterator). The Complainterator will send an automatically-generated complaint to the registrar of the spamming domain and the registrar of its name servers.

Historically, reporting spam in this way has not seriously abated spam, since the spammers simply move their operation to another URL, ISP or network of IP addresses.

Consumers may also forward "unwanted or deceptive spam" to an email address (spam@uce.gov) maintained by the FTC. The database collected is used to prosecute perpetrators of scam or deceptive advertising.

Responding to spam

Some advocate responding aggressively to spam—in other words, "spamming the spammer". The basic idea is to make spamming less attractive to the spammer, by increasing the spammer's overhead. There are several ways to reach a spammer, but besides the caveats above, it may lead to retaliations by the spammer.

1. Replying directly to the spammer's email address. Just clicking "reply" will not work in the vast majority of cases, since most of the sender addresses are forged or made up. In some cases, however, spammers do provide valid addresses, as in the case of Nigerian scams.

2. Targeting the computers used to send out spam. In 2005, IBM announced a service to bounce spam directly to the computers that send out spam. Because the IP addresses are identified in the headers of every message, it would be possible to target those computers directly, sidestepping the problem of forged email addresses. In most cases, however, those computers do not belong to the real spammer, but to unsuspecting users with unsecured or outdated systems, hijacked through malware and controlled at distance by the spammer.

3. Leaving messages on the spamvertised site. Spammers selling their wares need a tangible point of contact so that customers can reach them. Sometimes it is a telephone number, but most often it is a web site containing web forms through which customers can fill out orders or inquiries, or even "unsubscribe" requests. Since positive response to spam is probably much less than 1/10,000, if just a tiny percentage of users visit spam sites just to leave negative messages, the negative messages could easily outnumber positive ones, incurring costs for spammers to sort them out, not to mention the cost in bandwidth. This was the approach used by the now-defunct Blue Security's Blue Frog service: it looked for web forms on those sites and filled in one form on behalf of the subscriber for each spam received, thus overwhelming spammer sites (the Blue Frog service had up to half a million users). Ironically, it was Blue Frog's effectiveness that brought it down, by attracting a massive, retaliatory denial-of-service attack from a spammer who refused to fold. After the demise of Blue Frog, other services have been trying to replicate its ideas, while avoiding its shortcomings.

Disposable e-mail addresses

Many email users sometimes need to give an address to a site without complete assurance that the site will not send out spam. One way to mitigate the risk is to provide a disposable email address: a temporary address which forwards email to a real account, and which the user can disable or abandon. A number of services provide disposable address forwarding. Addresses can be manually disabled, can expire after a given time interval, or can expire after a certain number of messages have been forwarded.
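The three ways an address can stop forwarding can be modelled in a few lines. The class and attribute names are hypothetical and do not describe any particular forwarding service.

```python
class DisposableAddress:
    def __init__(self, alias, target, expires_at=None, max_messages=None):
        self.alias = alias                # address given out to the site
        self.target = target              # real account that receives the mail
        self.expires_at = expires_at      # optional expiry time (seconds)
        self.max_messages = max_messages  # optional cap on forwarded messages
        self.enabled = True               # cleared when the user disables the alias
        self.forwarded = 0

    def forward_to(self, now: float):
        """Return the real address to forward to, or None if the alias is dead."""
        if not self.enabled:
            return None
        if self.expires_at is not None and now >= self.expires_at:
            return None
        if self.max_messages is not None and self.forwarded >= self.max_messages:
            return None
        self.forwarded += 1
        return self.target
```

Once the alias is disabled or expired, mail sent to it is dropped and the real address is never exposed.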

Disable HTML in e-mail

Many modern mail programs incorporate Web browser functionality, such as the display of HTML, URLs, and images. This can easily expose the user to offensive images in spam. In addition, spam written in HTML can contain web bugs which allows spammers to see that the e-mail address is valid and that the message has not been caught in spam filters. JavaScript programs can be used to direct the user's Web browser to an advertised page, or to make the spam message difficult to close or delete. Spam messages have contained attacks upon security vulnerabilities in the HTML renderer, using these holes to install spyware. (Some computer viruses are borne by the same mechanisms.)

Mail clients which do not automatically download and display HTML, images or attachments have fewer risks, as do clients that have been configured not to display these by default.

Contact Forms

Contact forms allow users to send email by filling out forms in a web browser. The web server takes the form data and forwards it to an email address. The user never sees the email address. Contact forms have the drawback that they require a website that supports server-side scripts. They are also inconvenient for message senders, who cannot use their preferred e-mail client. Finally, if the software used to run the contact forms is badly designed, the forms can become spam tools in their own right. Additionally, many spammers have taken to using contact forms to send spam to the intended recipient.

Avoid responding to spam

Spammers often regard responses to their messages—even responses like "Don't spam me"—as confirmation that an email address is valid. Likewise, many spam messages contain Web links or addresses which the user is directed to follow to be removed from the spammer's mailing list. In several cases, spam-fighters have tested these links, confirming they do not lead to the recipient address's removal—if anything, they lead to more spam.

It must be noted that sender addresses are often forged in spam messages, so that responding to spam may result in failed deliveries or may reach innocent e-mail users whose addresses have been abused. In many countries providing a false identity in that way is a criminal offense. Criminal spammers sometimes send their messages from purposely compromised computers in order to hide their real identity. Benign spammers reveal their identity, allowing recipients to respond.

In Usenet, it is widely considered even more important to avoid responding to spam. Many ISPs have software that seeks and destroys duplicate messages. Someone may see a spam and respond to it before it is cancelled by their server, which can have the effect of reposting the spam for them; since it is not a duplicate, the reposted copy will last longer.

Address munging

Posting anonymously, or with a fake name and address, is one way to avoid "address harvesting," but users should ensure that the fake address is not valid. Users who want to receive legitimate email regarding their posts or Web sites can alter their addresses so that humans can figure them out but spammers cannot. For instance, joe@example.net might post as joeNOS@PAM.example.net.invalid, or display his email address as an image instead of text. Address munging, however, can cause legitimate replies to be lost. If the munged address is not the user's valid address, it has to be truly invalid; otherwise someone or some server will still get the spam sent to it. See http://www.2kevin.net/munging.html
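The joe@example.net example can be reproduced with a short sketch. The function names `munge` and `unmunge` and the default marker are assumptions made for illustration; real users munge by hand in many different ways, which is part of why harvesters cannot easily undo it.

```python
def munge(address: str, marker: str = "NOSPAM") -> str:
    """Split a marker around the '@' and append the reserved '.invalid' TLD."""
    local, _, domain = address.partition("@")
    half = len(marker) // 2
    return f"{local}{marker[:half]}@{marker[half:]}.{domain}.invalid"


def unmunge(munged: str, marker: str = "NOSPAM") -> str:
    """What a human reader does mentally: remove the marker and '.invalid'."""
    if len(marker) < 2:
        raise ValueError("marker must have at least two characters")
    half = len(marker) // 2
    suffix, prefix = marker[:half], marker[half:] + "."
    local, _, domain = munged.partition("@")
    if not (local.endswith(suffix) and domain.startswith(prefix)
            and domain.endswith(".invalid")):
        raise ValueError("address is not munged with this marker")
    return f"{local[:-len(suffix)]}@{domain[len(prefix):-len('.invalid')]}"
```

Because the result ends in `.invalid`, no mail server can ever be reached for it, so a harvested copy produces no spam for anyone.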

End-user techniques

There are a number of techniques that individuals can use to restrict the availability of their e-mail addresses, reducing their chances of receiving spam.

Anti-spam techniques.

To prevent e-mail spam, both end users and administrators of e-mail systems use various anti-spam techniques. Some of these techniques have been embedded in products, services and software to ease the burden on users and administrators. No one technique is a complete solution to the spam problem, and each has trade-offs between incorrectly rejecting legitimate e-mail vs. not rejecting all spam, and the associated costs in time and effort.

Anti-spam techniques can be broken into four broad categories: those that require actions by individuals, those that can be automated by the email administrator, those that can be automated by e-mail senders and those employed by researchers and law enforcement officials.

Reporting spam

Registrars on the other hand, as ICANN-accredited administrative organizations, are obliged to uphold certain rules and regulations, and have the resources necessary for dealing with abuse complaints.

Tracking down a spammer's ISP and reporting the offense can lead to the spammer's service being terminated. Unfortunately, it can be difficult to track down the spammer, and while there are some online tools to assist, they are not always accurate. Occasionally, spammers employ their own netblocks. In this case, the abuse contact for the netblock can be the spammer itself, and a report can confirm your address.

Examples of these online tools are SpamCop and Network Abuse Clearinghouse. They provide automated or semi-automated means to report spam to ISPs. Some spam-fighters regard them as inaccurate compared to what an expert in the email system can do; however, most email users are not experts.

Historically, reporting spam in this way has not seriously abated spam, since the spammers simply move their operation to another URL, ISP or network of IP addresses.

Responding to spam

2. Targeting the computers used to send out spam. In 2005, IBM announced a service to bounce spam directly to the computers that send out spam [3]. Because the IP addresses are identified in the headers of every message, it would be possible to target those computers directly, sidestepping the problem of forged email addresses. In most cases, however, those computers do not belong to the real spammer, but to unsuspecting users with unsecured or outdated systems, hijacked through malware and controlled at a distance by the spammer.

Automated techniques for e-mail administrators

Authentication and Reputation (A&R)

Further information: E-mail authentication, DomainKeys, and SPF

Challenge/response systems

Main article: Challenge-response spam filtering

Another method which may be used by internet service providers, by specialized services or enterprises to combat spam is to require unknown senders to pass various tests before their messages are delivered. These strategies are termed challenge/response systems or C/R. Some view their use as being as bad as spam since they place the burden of spam fighting on legitimate email senders.

Checksum-based filtering

Checksum-based filters exploit the fact that spam messages are sent in bulk, that is, that they will be identical apart from small variations. Checksum-based filters strip out everything that might vary between messages, reduce what remains to a checksum, and look that checksum up in a database which collects the checksums of messages that email recipients consider to be spam (some people have a button on their email client which they can click to nominate a message as being spam); if the checksum is in the database, the message is likely to be spam.

The advantage of this type of filtering is that it lets ordinary users help identify spam, and not just administrators, thus vastly increasing the pool of spam fighters. The disadvantage is that spammers can insert unique invisible gibberish (known as hashbusters) into the middle of each of their messages, thus making each message unique and giving it a different checksum. This leads to an arms race between the developers of the checksum software and the developers of the spam-generating software.
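The strip, checksum and look-up steps can be sketched as follows. The normalisation rules here (lower-casing, dropping digits and punctuation) and the names `fuzzy_checksum` and `ChecksumDatabase` are assumptions chosen for illustration; deployed systems use their own, more elaborate normalisation.

```python
import hashlib
import re


def fuzzy_checksum(body: str) -> str:
    """Strip what tends to vary between copies, then hash what remains."""
    text = body.lower()
    text = re.sub(r"\d+", "", text)        # order numbers, counters, dates
    text = re.sub(r"[^a-z]+", " ", text)   # punctuation and whitespace runs
    return hashlib.sha256(" ".join(text.split()).encode("ascii")).hexdigest()


class ChecksumDatabase:
    """Shared store of checksums that recipients have reported as spam."""

    def __init__(self) -> None:
        self._reported = set()

    def report_spam(self, body: str) -> None:
        self._reported.add(fuzzy_checksum(body))

    def is_known_spam(self, body: str) -> bool:
        return fuzzy_checksum(body) in self._reported
```

Two copies that differ only in case, punctuation or an order number map to the same checksum, while a copy carrying a random word (a hashbuster) does not, which is the weakness the text describes.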

Checksum based filtering methods include:

Country-based filtering

DNSBLs

Main article: DNSBL

Enforcing RFC standards

Further information: SMTP RFC standards

HELO/EHLO checking

For example, spam can be greatly reduced by a number of simple checks confirming compliance with standard addressing and MTA operation.

In many situations, simply requiring a valid FQDN in the SMTP EHLO statement is enough to block 25% of incoming spam.

Refusing connections from hosts that begin transmission prior to presentation of the receiving host's HELO banner
Refusing connections from hosts that give an invalid HELO - for example, a HELO that is not an FQDN or is an IP address not surrounded by square brackets

Invalid HELO localhost
Invalid HELO 127.0.0.1
Valid HELO domain.tld
Valid HELO [127.0.0.1]

Refusing connections from hosts that give an obviously fraudulent HELO - for example, issuing a HELO using the FQDN or an IP address that doesn't match the IP address of the connecting host

Fraudulent HELO friend
Fraudulent HELO -232975332

Refusing to accept email claiming to be from a hosted domain when the sending host has not authenticated
Refusing to accept email whose HELO/EHLO argument does not resolve in DNS. Unfortunately, some email system administrators ignore section 3.6 of RFC 2821 and configure the MTA to use a nonresolvable argument to the HELO/EHLO command.

All of the examples above are fairly simple checks, all conform to existing standards and RFCs, and all are missing from most commercial MTA implementations available today.
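The syntactic part of these checks can be sketched as below. The function name `classify_helo` is hypothetical, the sketch handles only IPv4 address literals, and it deliberately does not attempt the DNS or connecting-address comparisons mentioned above.

```python
import ipaddress
import re

# One DNS label: letters, digits and hyphens, not starting or ending with '-'.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def classify_helo(arg: str) -> str:
    """Return 'valid' or 'invalid' for a HELO/EHLO argument (syntax only)."""
    # An address literal is an IP address surrounded by square brackets.
    if arg.startswith("[") and arg.endswith("]"):
        try:
            ipaddress.IPv4Address(arg[1:-1])
            return "valid"
        except ValueError:
            return "invalid"
    # A bare IP address without brackets is not acceptable.
    try:
        ipaddress.ip_address(arg)
        return "invalid"
    except ValueError:
        pass
    labels = arg.rstrip(".").split(".")
    # Require at least two labels, so single words are not treated as FQDNs.
    if len(labels) < 2 or not all(_LABEL.match(label) for label in labels):
        return "invalid"
    # A purely numeric last label cannot be a top-level domain.
    if labels[-1].isdigit():
        return "invalid"
    return "valid"
```

Applied to the examples in this section, `localhost`, `127.0.0.1`, `friend` and `-232975332` are classified as invalid, while `domain.tld` and `[127.0.0.1]` are classified as valid.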

Greylisting

Main article: Greylisting

Fake MX Records

mx1.example.com - 10 - dead IP
mx2.example.com - 20 - real server
mx3.example.com - 30 - dead IP

Greeting delay

Hybrid filtering

Hybrid filtering, such as is implemented in the open source programs SpamAssassin and Policyd-weight, uses some or all of the various tests for spam, and assigns a numerical score to each test. Each message is scanned for these patterns, and the applicable scores tallied up. If the total is above a fixed value, the message is rejected or flagged as spam. By ensuring that no single spam test by itself can flag a message as spam, the false positive rate can be greatly reduced.
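The score-and-threshold idea can be sketched in a few lines. The test names, point values and threshold below are invented for illustration and are not the rules or defaults of SpamAssassin or Policyd-weight; messages are modelled as plain dictionaries for brevity.

```python
import re

# (name, points, check) triples; every value here is hypothetical.
TESTS = [
    ("HELO_NOT_FQDN", 2.0, lambda m: "." not in m.get("helo", "")),
    ("NO_RDNS", 1.5, lambda m: not m.get("rdns")),
    ("SUBJECT_ALL_CAPS", 1.0, lambda m: m.get("subject", "").isupper()),
    ("BODY_MENTIONS_PILLS", 2.5,
     lambda m: re.search(r"\bpills\b", m.get("body", ""), re.I) is not None),
]

# Higher than any single test's points, so one hit alone never flags a message.
THRESHOLD = 5.0


def score_message(msg: dict, tests=TESTS):
    """Return the total score and the names of the tests that fired."""
    hits = [(name, points) for name, points, check in tests if check(msg)]
    return sum(points for _, points in hits), [name for name, _ in hits]


def is_spam(msg: dict, threshold: float = THRESHOLD) -> bool:
    total, _ = score_message(msg)
    return total > threshold
```

A legitimate message with an upper-case subject scores only 1.0 and passes, whereas a message that fails several independent tests crosses the threshold.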

PTR/Reverse DNS checks

The PTR DNS records in the reverse DNS can be used for a number of things, including:

Most e-mail Mail Transfer Agents (server software) use an FCrDNS verification and, if there is a valid domain name, put it into the "Received:" trace header field.

Some e-mail Mail Transfer Agents will perform FCrDNS verification on the domain name given in the SMTP HELO and EHLO commands. Rejecting mail on this basis can violate RFC 2821, so e-mail is usually not rejected by default.

Checking the domain names in the rDNS to see whether they are likely to belong to dial-up users, dynamically assigned addresses, or home-based broadband customers. Since the vast majority, but by no means all, of the e-mail that originates from these computers is spam, many mail servers also refuse e-mail with missing or "generic" rDNS names.

A Forward Confirmed reverse DNS (FCrDNS) verification can create a weak form of authentication that there is a valid relationship between the owner of a domain name and the owner of the network that has been given an IP address. While weak, this authentication is strong enough that it can be used for whitelisting purposes because spammers and phishers cannot usually bypass this verification when they use zombie computers to forge the domains.
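An FCrDNS check amounts to a reverse lookup followed by a forward lookup that must lead back to the original address. The sketch below uses Python's standard `socket` resolver calls and is limited to IPv4; the function names are assumptions, and the lookups are passed in as parameters so that the logic can be exercised without network access.

```python
import socket
from typing import Callable, List, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """PTR lookup: IP address to host name, or None if there is no record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(name: str) -> List[str]:
    """A lookup: host name to its list of IPv4 addresses (possibly empty)."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def fcrdns(ip: str,
           reverse: Callable[[str], Optional[str]] = reverse_lookup,
           forward: Callable[[str], List[str]] = forward_lookup) -> Optional[str]:
    """Return the confirmed host name for ip, or None if confirmation fails."""
    name = reverse(ip)
    if name is None:
        return None                      # missing rDNS
    return name if ip in forward(name) else None
```

A zombie computer can claim any name in HELO, but it cannot normally make both the network owner's PTR record and the domain owner's A record agree with that claim.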

Rule-based filtering

Sender-supported whitelists and tags

SMTP callback verification

Statistical content filtering

Tarpits

Transparent SMTP proxy

Transparent SMTP proxies allow spam to be combated in real time: they combine controls on sender behavior, give legitimate users immediate feedback, and eliminate the need for quarantine.

X-ASVP eXtensible Anti-spam Verification Protocol

Automated techniques for e-mail senders

Background checks on new users and customers

Confirmed opt-in for mailing lists

Main article: Opt in e-mail One difficulty occurs in implem

Egress spam filtering

E-mail senders can apply the same type of anti-spam checks to e-mail coming from their users and customers as can be applied to e-mail coming from the rest of the Internet.

Limit e-mail backscatter

If any sort of bounce message or anti-virus warning gets sent to a forged email address, the result will be e-mail backscatter.

Port 25 blocking

Port 25 interception

Network address translation can be used to intercept all port 25 (SMTP) traffic and direct it to a mail server that enforces rate limiting and egress spam filtering. This is commonly done in hotels [4], but it can cause e-mail privacy problems, as well as making it impossible to use STARTTLS and SMTP-AUTH if the port 587 submission port isn't used.

Rate limiting

Spam report feedback loops

Strong AUP and TOS agreements

Techniques for researchers & law enforcement

Honeypots

Another approach is simply an imitation MTA which gives the appearance of being an open mail relay, or an imitation TCP/IP proxy server which gives the appearance of being an open proxy. Spammers who probe systems for open relays or proxies will find such a host and attempt to send mail through it, wasting their time and potentially revealing information about themselves and the source of the spam to the honeypot's operator, who is unexpectedly alert compared with the careless or unskilled operators typically in charge of open relay MTA systems. Such a system may simply discard the spam attempts, submit them to DNSBLs, or store them for analysis.

Ongoing research

Several approaches have been proposed to improve the e-mail system.

Ham passwords

Another approach for countering spam is to use a "ham password". Systems that use ham passwords ask unrecognised senders to include in their email a password that demonstrates that the email message is a "ham" (not spam) message. Typically the email address and ham password would be described on a web page, and the ham password would be included in the "subject" line of an email message. Ham passwords are often combined with filtering systems, to counter the risk that a filtering system will accidentally identify a ham message as a spam message.

The "plus addressing" technique appends a password to the "username" part of the email address.
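Both variants, a password in the subject line and a password appended to the user name with a plus sign, can be sketched with the standard `email` package. The password, the sender list and the function names are hypothetical values chosen for illustration.

```python
from email.message import EmailMessage
from email.utils import parseaddr
from typing import Optional, Set

HAM_PASSWORD = "tangerine"                 # hypothetical published password
KNOWN_SENDERS = {"alice@example.org"}      # hypothetical recognised senders


def subject_has_password(msg: EmailMessage, password: str) -> bool:
    return password.lower() in str(msg.get("Subject", "")).lower()


def plus_tag(recipient: str) -> Optional[str]:
    """Return the part after '+' in the user name, e.g. joe+tag@... gives 'tag'."""
    local = recipient.partition("@")[0]
    _, sep, tag = local.partition("+")
    return tag if sep else None


def accept(msg: EmailMessage,
           known: Set[str] = KNOWN_SENDERS,
           password: str = HAM_PASSWORD) -> bool:
    """Accept recognised senders, or anyone presenting the ham password."""
    sender = parseaddr(str(msg.get("From", "")))[1].lower()
    if sender in known:
        return True
    if subject_has_password(msg, password):
        return True
    recipient = parseaddr(str(msg.get("To", "")))[1]
    return plus_tag(recipient) == password
```

Messages that fail `accept` need not be discarded; as the text notes, they can be handed to an ordinary filter instead.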

Cost-based systems

Since spam occurs primarily because it is so cheap to send, a proposed set of solutions requires that senders pay some cost in order to send e-mail, making it prohibitively expensive for spammers.

Stamps

Certified e-mail

Some gatekeeper would sell electronic stamps and keep the proceeds. Alternatively, a micropayment, such as electronic money, would be paid by the sender to the recipient, their ISP, or some other gatekeeper.

Proof-of-work systems

Proof-of-work systems such as hashcash require that a sender pay a computational cost by performing a calculation that the receiver can later verify. Verification must be much faster than performing the calculation, so that the computation slows down a sender but does not significantly impact a receiver. The point is to slow down the machines that send most spam, often millions and millions of them. While every user who wants to send email to a moderate number of recipients suffers just a few seconds' delay, sending millions of emails would take an unaffordable amount of time. This approach is weakened when the sender maintains a computation farm of their own or draws on zombies.
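The asymmetry between minting and verifying can be shown with a simplified sketch in the spirit of hashcash. It is not the real hashcash stamp format (which carries a version, date and random field and uses SHA-1); the `resource:counter` layout, the use of SHA-256 and the function names are assumptions made to keep the example short.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the start of a hash digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def mint(resource: str, bits: int) -> str:
    """Sender's work: try counters until the hash has enough leading zeros.

    Expected cost is about 2**bits hash evaluations.
    """
    for counter in count():
        stamp = f"{resource}:{counter}"
        if leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int) -> bool:
    """Receiver's work: a single hash, however long minting took."""
    if stamp.rpartition(":")[0] != resource:
        return False                     # stamp was minted for another address
    return leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits
```

Binding the stamp to the recipient's address prevents one stamp from being reused across a mailing run, so the cost grows with the number of recipients. A receiver would also need to remember stamps already seen, which this sketch omits.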

Bonds or Sender-at-risk

A refinement of stamp systems was the idea of requiring that the micropayment be retained only if the recipient considered the email to be abusive. This addressed the principal objection to stamp systems: popular free legitimate mailing list hosts would be unable to continue to provide their services if they had to pay postage for every message they sent out.

Bill Gates announced that Microsoft is working on a solution requiring so-called “unknown senders”, i.e. senders not on the Accepted List of the recipient, to post “the electronic equivalent of a” stamp whose value would be lost to the sender only if the recipient disapproves of the email [7]. Gates said that Microsoft favors other solutions in the short term, but would rely on the contingent payment solution to solve the spam problem over the longer run. Microsoft, AOL and Yahoo! have recently introduced systems that allow commercial senders to avoid filters if they obtain a certificate or certification, which is lost to the sender if recipients complain.
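The contingent payment can be pictured as an escrow that is refunded unless the recipient objects. The sketch below is purely illustrative: the class `BondLedger`, the integer stamp value, and the choice to credit a forfeited bond to the recipient are all assumptions, and it does not describe Microsoft's or any other vendor's system.

```python
class BondLedger:
    """Hypothetical escrow for sender-at-risk stamps (amounts in whole units)."""

    def __init__(self, stamp_value: int) -> None:
        self.stamp_value = stamp_value
        self.balances = {}        # party -> available funds
        self.escrow = {}          # message id -> (sender, recipient)

    def deposit(self, party: str, amount: int) -> None:
        self.balances[party] = self.balances.get(party, 0) + amount

    def post_bond(self, message_id: str, sender: str, recipient: str) -> None:
        """An unknown sender puts the stamp value at risk before delivery."""
        if self.balances.get(sender, 0) < self.stamp_value:
            raise ValueError("sender cannot cover the bond")
        self.balances[sender] -= self.stamp_value
        self.escrow[message_id] = (sender, recipient)

    def resolve(self, message_id: str, recipient_approves: bool) -> None:
        """Refund the sender, or forfeit the bond if the recipient disapproves."""
        sender, recipient = self.escrow.pop(message_id)
        payee = sender if recipient_approves else recipient
        self.deposit(payee, self.stamp_value)
```

A legitimate mailing list whose messages are approved gets every bond back and so pays nothing in the long run, which is the property that distinguishes this scheme from plain stamps.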

SHRED

Spam Harassment Reduction via Economic Disincentives
