Want 700,000 HTML email templates?

March 17th, 2009 | by Ben

When we launch MailChimp v4.1 later this month, there will be more email template options to choose from. A lot more. I’m not really sure exactly how many templates there will be, because we’re still counting them.

Basically, we came up with over 700,000 HTML email template options, and we’re narrowing it down using Amazon’s Mechanical Turk.

Here’s how…

As you know, MailChimp has never offered “pre-built” HTML email templates.

Our philosophy has always been to give you nice, modular layouts, plus powerful design tools like our automagic email designer, header designer, and magic color picker to help you build your own beautiful template. Think “Photoshop for email” instead of “Microsoft Word stationery.”

But our customers have been asking us for more choices. They don’t necessarily want fully pre-built templates. They just want more choices that can help them get started, which they can then customize (instead of starting from scratch). Good point. But we didn’t want to just hire someone to sit here and “come up with a bunch of craptacular templates that can be re-purposed.” We needed something more scalable and automatic.

We have over 100 beautiful header graphics for different occasions (here’s an example for St. Patrick’s Day). We took each one of them and ran it through a color analyzer (which we developed in the MailChimp Lab) so that we could automatically generate color palettes for your template that complement each header graphic.

It’s what I was talking about in this post: Color Experiments in the MailChimp Lab.

The initial results were pretty good:

[Image: initial results from the color analyzer]

By the way, this technology will also be live in MailChimp v4.1. Basically, whenever you upload your logo into a MailChimp template, we’ll analyze the colors and suggest a few color combinations for the rest of your template.
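To give a rough idea of what "analyzing the colors" of an uploaded logo could involve, here is a minimal Python sketch. It is not the MailChimp Lab analyzer (the post doesn't describe its internals). It assumes the image has already been decoded into a list of (r, g, b) tuples, and it simply counts colors after snapping them to a coarse grid. The function name `dominant_colors` and the `bucket` size are made up for illustration.

```python
from collections import Counter

def dominant_colors(pixels, top_n=3, bucket=32):
    """Return the top_n most common colors after snapping each channel to a coarse grid.

    pixels: iterable of (r, g, b) tuples with values 0-255 (assumed input format).
    """
    def snap(value):
        # Map a channel value to the center of its bucket so near-identical
        # shades are counted as the same color.
        return min(255, (value // bucket) * bucket + bucket // 2)

    counts = Counter(tuple(snap(c) for c in pixel) for pixel in pixels)
    return ["#%02X%02X%02X" % color for color, _ in counts.most_common(top_n)]

# Toy "logo": mostly green, some near-white, a little red.
logo_pixels = (
    [(10, 120, 30)] * 50
    + [(12, 118, 28)] * 30
    + [(250, 250, 250)] * 15
    + [(200, 30, 30)] * 5
)
print(dominant_colors(logo_pixels))
```

A real analyzer would also need to turn those dominant colors into harmonious combinations, which this sketch doesn't attempt.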

Anyway, for each header graphic, we found out we could generate roughly 400 color palette possibilities (or “themes”).

We narrowed that down with certain rules, like “Bright #FF0000 red should not be used for titles” and “default body text shouldn’t be bright blue,” and “don’t let the colors for backgrounds and fonts be within x% similarity, or there won’t be enough contrast.”
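Rules like these are easy to picture as small checks run against each palette. Here's a hedged Python sketch of the idea. The palette layout (`title`, `body`, `background`), the definition of "bright blue," the similarity measure, and the 75% threshold are all assumptions for illustration; the post doesn't give the real value of x or the lab's actual code.

```python
# Hypothetical sketch of rule-based palette filtering. Field names,
# thresholds, and the similarity measure are assumptions, not MailChimp's code.

def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def similarity(a, b):
    """Return a 0.0-1.0 similarity based on Euclidean distance in RGB space."""
    max_dist = (3 * 255 ** 2) ** 0.5
    dist = sum((x - y) ** 2 for x, y in zip(hex_to_rgb(a), hex_to_rgb(b))) ** 0.5
    return 1.0 - dist / max_dist

def is_bright_blue(color):
    r, g, b = hex_to_rgb(color)
    return b >= 200 and r <= 80 and g <= 80  # assumed definition of "bright blue"

def passes_rules(palette, max_similarity=0.75):
    """palette: dict with 'title', 'body', and 'background' hex colors."""
    if palette["title"].upper() == "#FF0000":
        return False  # bright red should not be used for titles
    if is_bright_blue(palette["body"]):
        return False  # default body text shouldn't be bright blue
    for text_key in ("title", "body"):
        if similarity(palette[text_key], palette["background"]) > max_similarity:
            return False  # text too close to the background: not enough contrast
    return True

palettes = [
    {"title": "#FF0000", "body": "#333333", "background": "#FFFFFF"},
    {"title": "#224466", "body": "#0000FF", "background": "#FFFFFF"},
    {"title": "#224466", "body": "#EEEEEE", "background": "#FFFFFF"},
    {"title": "#224466", "body": "#333333", "background": "#FFFFFF"},
]
kept = [p for p in palettes if passes_rules(p)]
print(len(kept), "of", len(palettes), "palettes survive")
```

Only the last palette survives here: the first breaks the red-title rule, the second the blue-body rule, and the third has body text that's nearly the same color as its background.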

That narrowed things down to about 200 possible themes per header graphic.

So we went from roughly 700,000 options to around 25,960.

Then we started to apply them to actual email templates. And we were kinda shocked how good they looked, considering they were automatically generated. Here’s a batch:

[Image: first batch of generated themes]

Here’s another batch:

[Image: second batch of generated themes]

And here’s a batch that didn’t turn out so well (IMHO):

[Image: batch of generated themes that turned out badly]

The results were better than we expected, but we still didn’t want to post them all and overwhelm our users with too many choices.

For the record, I actually did want to overwhelm our users with too many choices (just so that I could write a blog article with a ridiculous template count in my title).

Luckily, our product team is smarter and nicer than me, and wanted to narrow things down a little more intelligently so that we’d have a more usable interface. Party poopers.

They wanted some kind of human review. Sure, we learned a lot about programmatic design harmony in all our lab experiments, but color and design are so subjective.

But how could we possibly review so many email themes?

How to review 25,960 designs in 19 hours

So our engineers turned to Amazon’s Mechanical Turk, which is a “global, on-demand, 24×7 workforce.” It’s composed of “Turkers” who complete micro tasks really, really fast.

They’ve got an API that allowed us to basically post all our design combinations, and ask their users to judge them. Users get paid (think pennies, not dollars) each time they complete a task.

A typical screen looks something like this:

[Image: Mechanical Turk task screen]

Each Turker scrolls through 3 options and tells us which one they think is best. For each completed task, we pay 2 cents. We can choose not to pay a Turker if we think they’re cheating or gaming the system, which apparently can be an issue.

Reviewing Turk Work

Because of the possibility of cheating, there’s actually a review process where we have to analyze all the results we got back from Turkers, and decide, “this guy gets 2 cents, this guy doesn’t, etc.”

The only issue we faced here was sifting through all the results. We ran just shy of 100,000 tasks through the system, so we got back a HUGE .csv file that showed us how much time each Turker spent on each task, and all kinds of other useful data. FYI, running this large job actually prompted a call from Amazon, where a very, very nice person kindly asked us to please talk to them before running something that big again. Heh. Now we know why. There’s so much data, we’d need a bunch of Turkers to review the Turkers!

Anyway, we’re not inclined to be “strict” and deny anyone a couple pennies for their work. Something about that feels dirty and mean.

So all we did was look for heuristics in the data to tell if someone is cheating. For example, if they review a dozen themes in just a few seconds, and they happen to vote for “Theme #1” every single time, they’re probably cheating. The same goes if we can discern a repeating pattern, like voting “1,2,3” over and over again.
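For the curious, heuristics like these might look something like the Python sketch below. This is an illustration, not the script our engineers ran: the input format (a list of `(seconds_spent, vote)` pairs per worker), the function names, and the thresholds (`min_tasks`, `max_total_seconds`, `max_period`) are all assumptions.

```python
# Hypothetical sketch of the two cheat heuristics described above.
# Input layout and thresholds are assumptions for illustration only.

def has_repeating_cycle(votes, max_period=3):
    """True if the votes are one short, non-constant sequence repeated (e.g. 1,2,3,1,2,3)."""
    if len(set(votes)) < 2:
        return False  # constant voting is handled by the speed check instead
    for period in range(2, max_period + 1):
        if len(votes) >= 2 * period and all(
            votes[i] == votes[i % period] for i in range(len(votes))
        ):
            return True
    return False

def looks_like_cheating(tasks, min_tasks=12, max_total_seconds=10.0):
    """tasks: list of (seconds_spent, vote) tuples for one worker, in order."""
    if len(tasks) < min_tasks:
        return False  # too little data to judge
    total_seconds = sum(seconds for seconds, _ in tasks)
    votes = [vote for _, vote in tasks]
    fast_and_constant = total_seconds <= max_total_seconds and len(set(votes)) == 1
    return fast_and_constant or has_repeating_cycle(votes)

workers = {
    "fast_same_vote": [(0.5, 1)] * 12,
    "cycling_votes": [(8.0, v) for v in [1, 2, 3] * 4],
    "mixed_votes": [(8.0, v) for v in [1, 3, 2, 2, 1, 3, 3, 1, 2, 1, 2, 3]],
    "slow_same_vote": [(8.0, 1)] * 12,
}
flagged = sorted(name for name, tasks in workers.items() if looks_like_cheating(tasks))
print(flagged)
```

Note that the slow worker who always picks the same theme is not flagged. Someone who takes their time and genuinely prefers the first option gets the benefit of the doubt.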

Otherwise, this is completely subjective work, so we’re not going to be harsh.

The catch was that the scripts we wrote to sift through this data kept crashing our engineer’s desktop computer. They would run for 15 minutes, then just crash. That’s when we remembered the good old Nvidia Tesla supercomputer that we bought a while back (we talked about it in this newsletter), precisely for crunching massive amounts of data (for a certain “Project Omnivore,” which we may or may not deny the existence of).

The supercomputer ran it in under 1 minute. Ha.

The results

In total, we found 27 Turkers who fit the profile of “possibly gaming the system,” and we unfortunately had to reject their 8,700 tasks. On the bright side, we approved the 85,000 tasks that the other 503 workers performed.
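A quick back-of-the-envelope check on those numbers, as a Python sketch. The task counts are the rounded figures quoted above, so treat the results as approximate. The payout ignores any Amazon fees, which aren't covered here.

```python
# Figures quoted in the post (rounded); payout excludes any Amazon fees.
rejected_tasks = 8_700
approved_tasks = 85_000
flagged_workers = 27
other_workers = 503
pay_per_task_cents = 2

total_tasks = rejected_tasks + approved_tasks      # "just shy of 100,000"
total_workers = flagged_workers + other_workers
approval_rate = approved_tasks / total_tasks
approved_payout_dollars = approved_tasks * pay_per_task_cents / 100

print(f"{total_tasks:,} tasks from {total_workers} workers")
print(f"approval rate: {approval_rate:.1%}")
print(f"payout for approved tasks: ${approved_payout_dollars:,.2f}")
```

The approval rate comes out a little under 91%, which lines up with the "about 90% quality" figure mentioned in the comments.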

We’re going to run the remaining templates that were deemed “prettiest” through a “March Madness” style bracket, and we think we’ll end up with around 600 beautiful templates.
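If you haven't seen a bracket before, the idea is simple: pair the entries off, keep the winner of each pair, and repeat until the field is small enough. Here's a rough Python sketch under some assumptions. The `run_bracket` function, its `keep` parameter, and the made-up vote counts are purely illustrative; in the real thing, the "winner" of each matchup would come from human votes, and we haven't settled on how many templates to keep per header.

```python
import random

def run_bracket(entries, pick_winner, keep=1, seed=None):
    """Single-elimination bracket: play rounds until at most `keep` entries remain."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    rng = random.Random(seed)
    field = list(entries)
    rng.shuffle(field)  # random seeding of the bracket (an assumption)
    while len(field) > keep:
        next_round = []
        if len(field) % 2 == 1:
            next_round.append(field.pop())  # odd entry out gets a bye
        for a, b in zip(field[0::2], field[1::2]):
            next_round.append(pick_winner(a, b))
        field = next_round
    return field

# Made-up vote counts standing in for human judgments.
votes = {f"theme-{i}": i * 7 % 11 for i in range(10)}

def more_votes(a, b):
    return a if votes[a] >= votes[b] else b

print(run_bracket(votes, more_votes, keep=1, seed=0))
```

Because each round roughly halves the field, the bracket can overshoot and finish with fewer than `keep` entries; a fancier version could stop early or run a play-in round.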

Give or take a few hundred bajillion.

14 Comments

    • Christopher says:

      Genius!

      I’m surprised Mechanical Turk is still in beta after all these years though…

    • Rob Bell says:

      That’s some impressive surveying going on there, I’m looking forward to seeing the templates that make the final cut – they’ll have to be pretty special to get through this selection procedure!

    • Britany says:

      I think this is going to be amazing! I can’t wait!

    • Harry says:

      I see you got from 26,000 total to 600 total, but how many themes per header graphic did you get it down to?

      • Ben says:

        @Harry – I don’t think that we know yet. We’re still left with quite a lot of themes to sift through, and I think that the team wants to run them through a “March Madness” style bracket competition to pick a handful per header graphic. But I’m not really sure what the “target” number of options is (or if we have a target).

        We’ve learned a lot from this experiment. Namely, just because you think some people might not really be looking, and you reject their data, it doesn’t mean you should reject *paying* them. Because if you do, they get really, really, really angry (apparently, it hurts their eBay-style turker-reputation). Oops. We just assumed that if we reject their data, we also reject their payment.

        A handful of turkers really hate me over at the Turker Nation forum, and have been emailing me.

        Oh, another lesson – don’t sign up for Mechanical Turk with your personal name. Use the company name. Heh.

        We ended up just hitting the “approve all” button. No idea if it’ll be retroactive and pay the people we rejected, so we’ll see.

        All in all, we’re super happy. We feel like we got about 90% quality results from the experiment, and according to the person from Amazon that we spoke to, people usually get around 75 – 85% quality on their first go at this (a lot depends on the type of tasks you’re asking turkers to complete, too).

        We also learned how to *word* a request. Since this was a subjective, “design” related task, we told people, “There is no right or wrong answer.” Some Turkers took that as, “OK, I can just click anything I want, really really fast.”

        The crazy thing was that there’s a little field where Turkers can enter comments while they vote. We got an *amazing* amount of great feedback from people about designing headers, why they didn’t like certain ones, and even tips on how to conduct better Mechanical Turk experiments.

        We plan to do some actual usability studies (on the MailChimp application) with Mechanical Turk, now that we know how it all works.

    • Nicole says:

      WOW… this is amazing stuff. I am so jealous!

    • Kali7 says:

      OK, I have to say it… I am finding the template thing problematic. I’d much prefer to do the lot in Dreamweaver (and yes, I know the e-mail rules).
      I don’t want my stuff shoved into one of your templates, and I want to see ALL the source code.
      Can we please have a developer section where we don’t have things handed on a plate for us? Please?
      The automation thing keeps wanting to take control of most of the stuff I do for a university, to the point that we are contemplating building our own server to send e-mails.
      I love having someone else do the marketing reports, host images, and send stuff. But the rest, great for some, annoying as all hell for the others.

      • Ben says:

        Hi Kali7, there are lots of options in MC for advanced designers. You can totally bypass the templates and the WYSIWYG if you want. Check out the “advanced design tools” demo on this page: http://www.mailchimp.com/power_features/

      • Sam says:

        I don’t know that “developer” would be the appropriate word for a Dreamweaver user or an email template designer. =P

        MailChimp offers complete freedom over the overall design process, including plain-text view of the HTML and CSS. The automation is just an awesome, but optional bonus feature.

    • Benji says:

      On the opposite end of the spectrum from @Kali7, will you be making some of those filtered-out designs available for those who want to use them? I’m thinking for those who like a certain header, but don’t quite like the way the design came out, could they browse further through other designs with the same header? They may not want to tweak the colors themselves. Tweaking colors can make a monkey go bananas, you know. :P

    • jonathan says:

      I think that you could have computed it in the cloud really fast also. http://aws.amazon.com/ec2/

      • Ben says:

        Thanks, Jonathan. We used our supercomputer (Nvidia Tesla) for the first pass, then wanted human opinion (on a massive scale) for the second pass. But yes, we are indeed using EC2 for some exciting stuff to come soon. Muahahaha.
