Blog

Email Marketing, Business & Monkeys

Most Common Spam Filter Triggers

February 4th, 2009 | by Ben

We’re working on an experiment in the MailChimp Lab to help us automatically detect when someone’s about to send something too spammy from MailChimp (no, this is not what the supercomputer is for). We’re using Cloudmark, Barracuda, and SpamAssassin (and possibly Postini in the near future). We picked those because they’re the most commonly used, and most vexing, spam filters.

We’re not planning to expose any secret formulas, or help customers “get around spam filters.” It’s more of a behind-the-scenes, “big brother” tool to help us catch exceptionally bad campaigns before they get sent. That’s the idea, at least, and we’re not sure when this’ll go live.

For now, we’re doing research. We’re currently scanning a few hundred thousand campaigns sent through MailChimp over the years, to see how many “false positives” we might trigger.

In the process, we’re uncovering a lot of innocent mistakes made by senders, plus a few surprises.

We’ve written about How Spam Filters Work in the past. Basically, spam filters look for certain “spammy criteria” in your messages. Each criterion gets a different score. Your message’s total score determines whether or not you’re blocked.
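
To make the scoring idea concrete, here’s a minimal sketch in Python. Everything in it is an assumption for illustration: the rule names, the point values, and the 5.0 cutoff are made up, and real filters use far larger (and frequently changing) rule sets.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    score: float
    matches: Callable[[dict], bool]


# Hypothetical rules and point values, for illustration only.
RULES = [
    Rule("SUBJECT_ALL_CAPS", 1.5, lambda m: m["subject"].isupper()),
    Rule("SUBJECT_HAS_VIAGRA", 3.0, lambda m: "viagra" in m["subject"].lower()),
    Rule("HTML_ONLY", 1.0, lambda m: m.get("text") is None),
]

THRESHOLD = 5.0  # assumed cutoff; each filter sets its own


def score_message(message):
    """Return the total score and the names of the rules that fired."""
    hits = [rule for rule in RULES if rule.matches(message)]
    total = sum(rule.score for rule in hits)
    return total, [rule.name for rule in hits]


def is_blocked(message):
    """A message is blocked once its total reaches the threshold."""
    total, _ = score_message(message)
    return total >= THRESHOLD


campaign = {"subject": "BUY VIAGRA NOW", "text": None, "html": "<p>hi</p>"}
print(score_message(campaign), is_blocked(campaign))
```

The point is that no single rule blocks you. Several small hits add up until the total crosses the line.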

For example, putting the word “viagra” in your subject line is dangerous, for obvious reasons.

There are other, not-so-obvious criteria used by spam filters too. Like poorly coded HTML (spammers are notoriously bad coders). Or my personal favorite, using Microsoft FrontPage. Ha. Also, simply using the word “Oprah” will get you a few points (for the record, the spam filters probably have nothing against Oprah; methinks her name is just used a lot by spammers).

If this is new and fascinating to you, I encourage you to read How Spam Filters Work.

Anyway, we’re looking at the most common triggers that MailChimp customers have been setting off.

Some of them are pretty surprising.

Top 10 Most Common Spam Filter Triggers

By far, the most common reason MailChimp customers have been flagged by spam filters is “too many images, not enough text.” This is a very common mistake (see: Stupid HTML Email Design Mistakes), and I’ve blogged about this in the past. Over and over. (See: How Your Email Designs Can Get You Blacklisted, and this and this).

Anyway, here’s the top 10 list of spam filter criteria that MailChimp users are most guilty of. I’ve included the corresponding number of detected matches (keep in mind the system is not done scanning—it might take another week to finish):

  1. BODY: HTML has a low ratio of text to image area    (1,217 matches)
  2. BODY: Message only has text/html MIME parts    (971)
  3. BODY: HTML has a low ratio of text to image area    (729)
  4. BODY: HTML and text parts are different    (625)
  5. Subject is all capitals    (324)
  6. BODY: HTML and text parts are different    (279)
  7. BODY: HTML: images with 2400-2800 bytes of words    (211)
  8. BODY: HTML: images with 2000-2400 bytes of words    (194)
  9. BODY: HTML: images with 1200-1600 bytes of words    (178)
  10. BODY: HTML: images with 1600-2000 bytes of words    (178)

Number 5 is just idiotic. TYPING IN ALL CAPS = SCREAMING AND IS RUDE. Don’t type in all caps in your emails, please. Who does that?

Number 2 means somebody was lazy, and only included the HTML or the plain-text version of their emails, instead of both. I think that’s what it means. Spam filter rules can be cryptic sometimes (intentionally, perhaps).
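
For the curious, here’s a rough sketch of what sending both versions looks like, using Python’s standard email library. The function name, addresses, and body text are placeholders we made up. The idea is simply that the plain-text part goes in first and the HTML part is added as an alternative to it.

```python
from email.message import EmailMessage


def build_campaign(subject, sender, recipient, text_body, html_body):
    """Build a message carrying both a plain-text and an HTML version."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # The plain-text version is the baseline every mail reader can show.
    msg.set_content(text_body)
    # Adding the HTML version turns the message into multipart/alternative.
    msg.add_alternative(html_body, subtype="html")
    return msg


msg = build_campaign(
    "Monthly newsletter",
    "news@example.com",
    "reader@example.com",
    "Hello! Here is this month's news.",
    "<html><body><p>Hello! Here is this month's news.</p></body></html>",
)
print(msg.get_content_type())
print([part.get_content_type() for part in msg.iter_parts()])
```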

But the rest of the detections on that list basically mean that the senders sent way, way too many images, and not enough readable text. Spam filters can’t read images. Spammers know that, so they often send spam that’s nothing but a big, ginormous image. And spam filters know that, so they in turn block email that they can’t read.
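
If you want a rough feel for how image-heavy a campaign is, something like the sketch below can help. It’s our own illustrative measure, not how any particular spam filter calculates its ratio: it counts visible text characters and adds up image area from the width and height attributes (images without numeric dimensions count as zero).

```python
from html.parser import HTMLParser


class TextImageCounter(HTMLParser):
    """Tally visible text characters and declared image area in pixels."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.image_area = 0
        self._skip_depth = 0  # >0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            attributes = dict(attrs)
            try:
                width = int(attributes.get("width", 0))
                height = int(attributes.get("height", 0))
            except (TypeError, ValueError):
                return  # missing or non-numeric dimensions: ignore
            self.image_area += width * height

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def measure(html):
    """Return (visible text characters, total declared image area)."""
    counter = TextImageCounter()
    counter.feed(html)
    counter.close()
    return counter.text_chars, counter.image_area


def text_chars_per_1000px(html):
    """Hypothetical ratio: text characters per 1,000 pixels of image."""
    chars, area = measure(html)
    return None if area == 0 else chars / (area / 1000)


print(measure('<p>Hello</p><img src="a.png" width="600" height="400">'))
```

A campaign that is one giant image with a line of text underneath will score close to zero on a measure like this, which is the pattern the filters are reacting to.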

The battle between spam filters and spammers is brutal and never ending, and sometimes legit marketers get caught in the crossfire. Understand how both sides work, and do your best to cope.

But don’t try too hard to appease the spam filters. They don’t like that either (looks needy).

Not-So-Common Spam Filter Triggers

During our user research, we found some surprising spam filter triggers. Here are some examples:

  • The phrase “extra inches” will get you a score of 3.1 from SpamAssassin. The phrase sounds like it came from some kind of “appendage enhancement” pharma-spam, right? Turns out it popped up 4 times in MailChimp, from relaxation & beauty spas. As in, “if your New Year’s resolution is to shed some extra inches off your waistline, come in and…”
  • Dear FNAME, = “not very dear at all!” Do you merge the recipient’s FNAME into your messages? If so, don’t use the d-word. Turns out “Dear” will get you 2.7 spam points. That’s about halfway to getting your email blocked. Use something else, like “Howdy.” At MailChimp, we use “dear” in just about all our demo videos and tutorials, because it’s the easiest way to explain mail merge tags. When we say, “Dear *|FNAME|*,” people just get it. We might stop using this example. I’ve written about how salutations can waste valuable space anyway.
  • “Stop Further Distribution” – In your footer, when you give people that unsubscribe link, don’t try to be all official and corporate sounding. The phrase “stop further distribution” will get you 3.1 spammy points. By the way, “distribution”? Nobody says that.
  • “You registered with a partner” – If the body of your email contains that phrase, chances are very good that your email list is not permission-based. This actually sets off a few red flags in MailChimp’s list setup process (we get alerted when people enter that into their permission reminder), and I was pleasantly surprised to see that spam filters look for it too.
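
If you’d like to check your own copy for the phrases above, here’s a small sketch. The point values are the ones quoted in this post. The matching is our own simplification (a case-insensitive whole-word search), and real filters use their own, more specific patterns that change over time.

```python
import re

# Scores as quoted in the post; treat them as a snapshot, not a spec.
PHRASE_SCORES = {
    "extra inches": 3.1,
    "dear": 2.7,
    "stop further distribution": 3.1,
}


def find_triggers(body):
    """Return (phrase, score) pairs for each listed phrase found in body."""
    hits = []
    for phrase, score in PHRASE_SCORES.items():
        # Word boundaries keep "dear" from matching inside "endears".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, body, flags=re.IGNORECASE):
            hits.append((phrase, score))
    return hits


for phrase, score in find_triggers(
    "Dear Anna, shed some extra inches off your waistline."
):
    print(phrase, score)
```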

As you can see, your emails can get flagged as spam, even if you’re not a spammer. Your email delivery can suffer, even from an innocent mistake. If enough innocent mistakes happen, MailChimp’s overall deliverability can suffer. So we’re working on preventing that. Hopefully, you won’t be hearing from us soon.

35 Comments

    • Ad Hustler says:

      This post is beyond awesome. I am always wondering what to do and what not to do so I will follow some of these tips.

    • Slaton says:

      Very helpful tips. I had no idea that using the “Dear” salutation will result in spam pts. Also, an interesting point about Thunderbird that I’ve recently observed: if you use full URLs in the text of an email, they had better match the underlying URL! If not, it’s tagged as potential spam.

    • Tim says:

      Number 2 is not generally that you just sent an all-HTML email (although it potentially could set off alarms). There’s an email type called multipart/alternative. This is from back when there were all sorts of different email readers that could and couldn’t read different sorts of email. You put your email message in different formats so that all the mail readers can read it. So you’d have a section where your message is just in plain text, then a section in HTML, etc. The main “other” format was text/enriched, which was a sort of HTML-lite that was supported by Eudora.

      But the spec is open. You can put in all sorts of things if you wanted, word document, PDF, whatever. Most email clients are going to ignore it if you do and just use the HTML. The kicker is that you’re always supposed to put in plain text since that’s the most basic. Anyone, even if their email program was from the dark ages can read an email that has a plain text message in it.

      Spammers, knowing that most email clients like to show HTML, would not put in this plain text section (’cause it’s a lot of work to type your ad in again, right?) and so it was a reliable way to figure out who was legitimate. It still is. I have no clue why spammers haven’t completely caught on to this. There are ways to easily convert your HTML automatically to plain text (which I won’t go into here, but was in another MailChimp blog post I believe). The fact it’s still #2 means it’s obviously still a good way to detect spammers. And Outlook or MailChimp isn’t going to be stupid enough not to send it.

    • Ben says:

      Thanks, Tim! That explains it.

    • Ben says:

      @Slaton – Yep, that security feature in Thunderbird scared me a while back too:

      http://www.mailchimp.com/blog/new-security-feature-in-thunderbird-triggered-by-click-tracking/

    • Tanya Stesen says:

      Great post! We had no idea about the “Dear” greeting being spammy. Will have to come up with a better salutation.

    • Joel Davies says:

      Ben, nice post. But I’m afraid exhorting email designers to adhere to best practices is doomed. Doooomed I tell you.

      I just last night had a client who forwarded me an email newsletter from Borders and asked, why can’t you do nice looking emails like this for me?

      (I’ve posted a JPEG of this ad to the web so you can see it in all its glory:)
      http://zgraphicsdev.com/zgr_samples/email/borders_testcase.html

      Okay, so I’m going to try to make an effort to educate the client as to why the Borders Shortlist is an email abomination.

      You know what I’m anticipating she’ll say? “If a big retailer like Borders does it, why shouldn’t I?”

      How do you answer that?

      This is addressed to anyone who might be reading this comment. How do you keep your client from going to some other provider who says, “no problem, I can make it look just like that Borders newsletter!”?

      • Ben says:

        @Joel – A lot of the very, very large retailers are paying for certification (look up Goodmail and ReturnPath’s SenderScore Certified) to help them get past image blockers and certain spam filters (of course, they still have to follow permission best practices). Here’s a really old article we wrote on Email Certification (I think most of it is still valid). So that can partially explain why the big senders can do what they do.

        In terms of convincing clients to follow best practices, it’s not easy. Hard numbers tend to help (temporarily). A/B testing two designs might help some. You can also use our ecommerce360 plugin to track ROI on your different campaigns. If you can show their ROI dipped because they didn’t listen to you, it might help.

        You can fight pictures with pictures. Run an inbox inspection and show them what their image-heavy campaign looks like when images are turned off.

        The inbox inspector will also tell you if their campaign will get blocked by major spam filters (before you send).

        FYI, I have that exact same email campaign from Borders printed and hanging on my wall. It’s 4 sheets long, and it keeps peeling off the wall and falling on my floor (it’s so heavy). It’s a complete p.i.t.a. for me, too. FWIW, we’re working on similar templates to this one (and many others) to help you do this for your client.

    • Henrik says:

      Great informative post!
      I am looking forward to seeing more reports from the ongoing scan of sent campaigns. The “distribution” trigger was news to me, I’ll tell you.
      Thanks!

    • Kevin says:

      As far as the word “partner” triggering the Spam points in some circumstances…

      Is the syntax important? I work with distributors, and I often (read: always) mention in our Permission Reminder that they are a “valuable” partner.

      Think that’s a bad idea?

    • Douglas Karr says:

      Outstanding advice! Thanks for sharing this. I wrote a while back that people really don’t understand what SPAM is compared to what ISPs think SPAM is… this confirms it even further!

    • Samara says:

      Too funny – I found this article via an email sent from MailChimp that ended up in my Mac Mail spam box. ;)

    • Landing Page says:

      This is an excellent and very useful post. Just one comment: is there any reason why Most Common SPAM Filter Triggers Nbr 1 and Nbr 3 are the same? Is that because it was referenced more than once by the various SPAM filtering services/software, or is it to make the point that this is a fairly serious, easy-to-avert trigger, hence the Nbr 1 position?

      Quick note: is there any indication what a solid ratio might be? Oftentimes it is very hard to convince clients to go with less graphic-heavy email and focus on Relevant Copy with Strong Supporting Images.

    • Banana says:

      Just really love you guys… what fantastic and well researched information. Thank you bunches.

    • beast from the depths says:

      Another superb post – thanks folks. I’m with Landing Page – would love to know the answer to both issues.

      [The use of caps in "Relevant Copy with Strong Supporting Images", implies that LP has the same sort of clients as me.]

    • Rick says:

      Also, if you use “friend” in place of a missing first name as in “Dear Friend,” you will get dinged 2 to 3 SA points. Thank Nigerian spammers for that one.

    • HostPipe Web Design says:

      We’ve helped a few of our clients with email marketing campaigns (we always recommend MailChimp). I would have had no idea that using ‘Dear’ would be recognised as being spammy. I wonder what other phrases and words might pop up in ordinary-looking marketing emails and add points in the likes of SpamAssassin and Cloudmark (both of which we use). I guess if that information was readily available then it would defeat the object!

      A really useful article, thank you.

    • Anna says:

      Great post- reminds me of early blog posts a few years back on email marketing. I learned some stuff too- the Oprah bit and “Dear Anna” !

      Anna

      • Ben says:

        Thanks, Anna! It indeed felt like a blast from the past writing that article. Next, I’ll write about “sniffers” that detect when to display plain-text emails, and why open tracking should be taken with a grain of salt. Seriously, I hoped it would be a slightly different spin since it was focused specifically on what our own customers were triggering. The scanning is still ongoing. “Image-heavy” is still the most common trigger, and spas and yoga studios, etc., are popping up a *lot* because of their references to losing weight.

    • Amanda says:

      As far as #5, using all caps, there is a competitor company of ours who uses subject lines with all CAPS, every single time. Every word. Plus lots of exclamation points at the end of the subject, AND their emails are all images. Yet they always reach my inbox. I’m still scratching my head over that.

    • biggie says:

      Very nice article. A lot of people (myself included) wonder about keywords that catch individual emails to clients in spam folders. Any sources for that info?

    • sara says:

      Great post! Was wondering about ALT tags on those large images everyone seems to love… Do the spam filters read the ALT tags and base their scores on those as well?

    • LB says:

      Thanks for writing this! I’m new at this game so am trying to catch up. Q: Is there a way to tell if an email sent out to clients was caught in a spam filter? Does it bounce back?

      • Ben says:

        There’s no easy way to tell if it was caught in a spam filter. Sometimes, they do send bouncebacks that tell you “blocked for content” or something like that.

    • Jen says:

      Hi,

      We have an internal controversy going on about whether or not a graphic in our outgoing Outlook email signature blocks is being seen as Spam by our recipients (and potentially blocking our messages from being delivered). Some people have had problems; others have not, even when sending to the same organizations.

      Any insight into this would be greatly appreciated!

      Thank you!

      • Ben says:

        I don’t think it should cause huge deliverability problems. They’re pretty common, and usually just tiny little attachments. Especially if we’re talking about one-to-one messages. If you’re trying to embed a logo in your signature *and* send it en masse, you’re probably more likely to see issues. Personally, I think signature logos are annoying, because I’ll search for “that email with the important attachment from that one guy” and they *all* have attachments because of the logo.
