Split Test Your Emails: A or B?

Posted by Brad J. Ward | Posted in Analytics, Concepts, Email, Thoughts, Usability | Posted on 08-11-2008

At eduWeb I chimed in about A/B testing on emails during the Q&A Session of Kyle James’ presentation (that’s me around 27:40).  I just wanted to share an example of a recent test that I did.

The email was to announce that our online app was available.  I knew from last year’s send that the subject line was fairly effective (“Butler’s Online App Now Available!”), so I wanted to take a look at the content and see where I could actually push more students to click through and take action.

For the 2007 send, it went to 14,650 students, and the results are below.

For 2008, we have 17,566 students to email. Rather than send the same email as last year, I first ran an A/B test: I left the first email the same as last year, and for the second I added a button graphic to see if it would increase clickthrough rates.

Test A: Same email as last year.

Test B: Added a visual clickthrough.

Results: Each test was sent to 3,500 random students on 8/7/08.  The winning test after 36 hours would then be sent to the remaining 10,566 students.

Test A: 3,282 successful. 339 opens (10.3%), 72 click throughs (21.2%) as of 8/11
Test B: 3,292 successful. 719 opens (21.8%), 273 click throughs (38.0%) as of 8/11
2008 Send: 9,454 successful. 1,378 opens (14.6%), 449 click throughs (32.6%) as of 8/11

And for comparison,
2007 (1): 14,650 successful. 5,137 opens (35%), 853 click throughs (16%) after 1 month
2007 (2): 9,513 successful. 1,232 opens (13%), 270 click throughs (22%) after 1 month

So in the first 4 days, we've had 47% of last year's first-send opens (2,436) and 93% of its click throughs (794). These numbers will continue to rise as the days and weeks go on. Based on early numbers, I can say Test B has been a success: a 34.4% click-through rate to date, compared to a 17.6% click-through rate over the course of last year's entire campaign. Those are results.
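As a quick sanity check on results like these, a split this size can be tested for statistical significance with a standard two-proportion z-test. This sketch is not part of the original post; it uses only Python's standard library, the function name is mine, and the figures are the Test A/Test B counts reported above:

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-proportion z-test: how many standard errors apart are the two rates?
    |z| > 1.96 indicates significance at the 95% confidence level."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Opens: Test A 339 of 3,282 delivered vs. Test B 719 of 3,292 delivered
z_opens = two_proportion_z(339, 3282, 719, 3292)

# Clicks (as a share of delivered): Test A 72/3,282 vs. Test B 273/3,292
z_clicks = two_proportion_z(72, 3282, 273, 3292)

print(z_opens, z_clicks)  # both far above 1.96, so B's lift is significant
```

With samples of 3,000+ per arm, even a few points of difference clears the 1.96 bar easily; here both differences are an order of magnitude past it.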

Even if we don’t do a 2nd send this year, I think we’ll get close to the amount of opens we had last year cumulatively.  I might do a 2nd send with a different title to see how that affects open rates, using the content from Test B to continue to push click throughs higher.

I started thinking about the button after reading Designing The Obvious on a flight last week. It was a really good book and made me think more about how I can incorporate more design-friendly aspects into emails.

I’d encourage you to consider an A/B test in the future and see how you can make the most of your email campaigns.  This isn’t a new technique, but it’s usually overlooked.

Comments posted (12)

So you didn’t change the subject line? Odd that the open rates would be so significantly different with no subject line change… and since the random group had a high propensity to open, maybe they’d have a higher propensity to click? Still, great info though.

Also, the click rate you have in there is really a click-to-open rate. Click rates are usually a factor of the messages delivered, rather than the ones opened. Click-to-open rates compare clicks against how many people actually opened the email. It’s probably a much more accurate stat to use in terms of effectiveness, so I’m not knocking it… just wanted to note the difference.

Learn something new every day! I didn’t know that type of click had a specific name. The backend of my system does click divided by total sent, which as you said is not as accurate. I’d rather know what % click through after actually seeing the design! So I just divide clicks / opens in my own spreadsheet. Delivra is pretty lackluster when it comes to reporting. Any email past ~6 months disappears, so I have to keep all data on my own.
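The two metrics being discussed here are easy to mix up, so here is a minimal sketch of the distinction (function names are mine; the figures are Test B's numbers from the post):

```python
def click_rate(clicks, delivered):
    """Click rate: clicks as a share of messages delivered."""
    return clicks / delivered

def click_to_open_rate(clicks, opens):
    """Click-to-open rate: clicks as a share of recorded opens,
    i.e. how the message performed among people who actually saw it."""
    return clicks / opens

# Test B from the post: 3,292 delivered, 719 opens, 273 clicks
print(f"{click_rate(273, 3292):.1%}")         # 8.3%
print(f"{click_to_open_rate(273, 719):.1%}")  # 38.0%
```

The same 273 clicks read very differently depending on the denominator, which is why it matters to label which rate a report is showing.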

I thought it was strange that Group B’s opens were so much higher too, but percentage-wise it still outperformed, and the click percentages translated closely to the overall 10K+ send, so I wasn’t sure what to make of it.

Thanks for your thoughts!

Coolness. We need to get more people thinking about A/B testing and other things. I’ll admit I’m bad about it too, because we just don’t have time. It’s free to do and can be very valuable for website testing as well. You can get into multivariate testing, but unless you have the time (meaning months and years to test) you might as well keep it simple with A/B and keep tweaking the loser.

Very cool – thanks for posting the results! I’m a total novice in email marketing: so it’s OK to have a “!” in the subject line? I’ve always avoided !?* but, again, novice!

I don’t think it’s a problem. My emails never suffer as a result. I would avoid using ‘spammy’ words in conjunction with !, such as ending a subject with free!, scholarships!, money!, etc.

A lot might depend on who you’re sending through and if they are whitelisted with most email providers. Delivra seems to be pretty good at this.

I’m sure there is research for both sides, it just depends on which side you want to be on. A lot of times I think people think way too much when it comes to all of this.

“Delivra is pretty lackluster when it comes to reporting. Any email past ~6 months disappears, so I have to keep all data on my own.” Yikes!

Total click-throughs or unique recipient click-throughs? There is a difference. The distinction? Some recipients may have clicked on a link multiple times. (They do this – I see it in the metrics.) Some recipients may have clicked on every link in the message…

Also, which version did you send first? You may have tipped a spam filter at some point, thus causing some of the first and much/all of the second message to be filtered to the bulk folder. That obviously would skew the results.

In terms of last year vs. this year comparisons, was the message sent at about the same date as last year? Same day of the week? Were there more messages (print and/or e-mail) sent before this message this year than last year? These can have huge impacts. Multivariate testing extends beyond a single e-mail send or Web page, of course.

Karlyn is correct about the stats being an open-to-click ratio, but they likely also reflect how the e-mail service provider (ESP) is recording opens.

Some e-mail tracking 101:
- Opens rely on display of 1×1 pixel tracking image
- Images in e-mail are often blocked by default (Yahoo, Gmail, Outlook, etc.)
- Many recipients that open a message are not recorded as opens since the tracking image does not display. (Conversely, preview panes might record an open even if the person merely clicked on the message to delete it.)

To combat this, some (but not all) ESPs count every message that has a click as an open even if the tracking image does not display. (The assumption: if the recipient clicked on a link, they must have opened the message.) If your ESP does this, it is little wonder why the version with more clicks also has more opens. The higher number of opens is, at least in part, because of the higher number of clicks. (Some spam filters follow links before the recipient even receives the message. When that is true, an open is recorded. E-mail metrics are a murky field.)
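The click-implies-open accounting Rob describes can be illustrated with a small sketch. The numbers below are hypothetical, not from the post; they just show how back-filling clicks into the open count moves the reported open rate:

```python
# Hypothetical send (not from the post): 3,292 delivered,
# 500 opens recorded via the 1x1 tracking pixel, and 120 recipients
# who clicked a link but never loaded the pixel (images blocked).
delivered = 3292
pixel_opens = 500
clicks_without_pixel = 120

# Some ESPs add click-but-no-pixel messages to the open count,
# on the assumption that a click implies the message was opened.
adjusted_opens = pixel_opens + clicks_without_pixel

print(f"pixel-only open rate:   {pixel_opens / delivered:.1%}")     # 15.2%
print(f"click-adjusted rate:    {adjusted_opens / delivered:.1%}")  # 18.8%
```

Under this accounting, a version that earns more clicks automatically reports more opens too, which is one reason the open gap between two test cells can partly echo the click gap.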

On the subject of image blocking:
- Picture the above message with all images blocked. Not so pretty, but thankfully the text is able to get the message through and the app link also exists apart from the button. (You may have seen worse results if the only app link were the button, since that button wouldn’t display by default for many recipients.)
- Alt text for images is extremely important. Given the prevalence of image blocking in e-mail clients, what you had for alt text on the apply button might have been as helpful as the button itself, or more so. It is a good idea to always use image alt text to give some context to blocked images. BIGGER TIP #1: If you use a lead-off banner image as this message appears to do (assuming it doesn’t have the “if you prefer to view this message in a browser” spiel, wasting valuable preview pane space imho), the alt text for that image functions as the snippet that follows the subject line in the Gmail inbox, as well as in some other e-mail clients. Further, if images are blocked but alt text is displayed, it functions as teaser text within the message. If you don’t use a lead-off banner image, you can use a 1×1 “invisible” image as the lead-off and give it alt text, though some argue this looks spammy to filters.

Other reasons why version B may have had more clicks:
- The stats may reflect total clicks and not unique recipient clicks. The second message contained two visible links, including the button. (You have the answer to this, obviously.)
- There were two places to click through to the app page, and that alone can increase the likelihood of clicks (even unique recipient clicks).
- The first link is buried within a sentence/paragraph. BIGGER TIP #2: Don’t overlook how big this tip is, because it is huge. A link that stands on its own in an e-mail (i.e. it isn’t part of a sentence or paragraph) is more likely to be clicked. Your button stands on its own. You honestly may have seen similar results with version B if the split test had kept one link in the paragraph as in version A, and made the second link a standalone, just not in button format.

(And by now the TargetX folks reading this are taking notes to make a marketing tip e-mail out of it…)

@Melissa, there is a lot potentially spammy about the message:
- Yes, the exclamation point in the subject line
- The high ratio of text within the images
- The high image-to-text ratio
- The high html-to-text ratio
- The use of the word free in the body of the e-mail, particularly since it is in bold. Free is a powerful word, so using it can be good. Not sure about the risk of bolding, though. You can verbally work around it, too, and still bold your point.
- Having a url in the message that doesn’t match the hyperlinked url underneath it (tracking was on, so the url underneath would have been a tracking url generated by the ESP, not http://go.butler.edu/apply). It is safer to make it link text, i.e. “apply online for admission.”
- Simply having open and click tracking turned on ups the spam score.

All of these things, even in combination, are not necessarily fatal in getting the message filtered as spam, depending on the sensitivity of the spam filter doing the filtering. More important is the reputation of the sender — in this case the particular sending IP of the ESP being used, not the Butler e-mail address the message appeared to come from. The ESP’s sending infrastructure (sends per connection, connections per hour, etc.) can have a major impact too, as can other factors controlled by the ESP (use of authentication and a whole lot more).

That said, I think too many e-mail marketing pros downplay the impact content can still have on spam filtering nowadays. There was some credible research done in the last year to support this. Sorry I don’t have a reference – I have it filed away somewhere.

It is a good idea to maintain a seed list of Yahoo, Hotmail, etc., accounts and add them to every send. You can’t be sure a message wasn’t filtered as spam just because it reached your particular seed account’s inbox (Yahoo may have filtered the send further down the recipient list, for instance), but you can at least see the red flag when it does hit your seed account’s spam folder.

Brad, I have some ideas to improve the message in your example, but I think my employer prohibits moonlighting for a competitor ;) Plus I think my reply is longer than the original post lol

“Plus I think my reply is longer than the original post lol”

I think that’s usually the case ;) Thanks for chiming in.

Forget TargetX….*I’M* taking notes on this ;-)

Rob’s back! Welcome back from vacation buddy! It’s good to have your insight back in the conversation!

I don’t have anything that strong to contribute, and I’ve been enjoying vacation myself and haven’t gotten a links-of-the-week post out for a while, but here’s one of those valuable posts that fits right into this conversation.

Top 14 power words for email, advertising and communications.

I don’t blog and rarely post comments – probably explains the length of those rare comments.

@Kyle, I checked out the “Top 14 power words” link. Definitely some powerful (perhaps overused, though still effective?) words – I felt like I was reading the Sunday sales fliers.

Some of the sentences used on the page are enough to make an English major cringe: “Your emails or advertising is missing to get answered ?” It appears to be a European company. Either English isn’t the native tongue of the person who wrote that content, or it is a case of SEO gone very wrong!

Hmm. All 14 words in one or two sentences… that could be entertaining.

