Patrick Sullivan
Most competitive players have “play-tested” at one time or another. Actually, most of them probably do it a ton. By “play-test,” I mean sit down with a bunch of decks, grind out a ton of games, and figure out which cards and decks are the best given the expected metagame. I know in my time playing TCGs competitively, I have spent a lot of hours figuring out the equivalent of “Should I play Dispel Magic or Disperse Magic in this specific metagame?”, time that perhaps would have been better spent reading, going on a hike, or in the company of loved ones. But that’s neither here nor there.
Since players “play-test” and R&D “play-tests”, there is an assumption that these are roughly the same thing. They aren’t. In fact, not all of my time is spent doing what most people call “play-testing”. I’ve got a lot of masters to serve, a finite amount of time, and too much testing of top-tier Constructed can be harmful anyway. I’ll get into all of that in a bit. The best starting point, I think, is to illuminate what “play-testing” means in R&D, then to talk about specifically what testing top-level play looks like, and what pitfalls to avoid.
To begin with, as I discussed in my previous blog, our sets are about things. Specific things. One of the first places to go, when play-testing with a new file, is to throw all of “mechanic X” into a deck and see what happens. This is what happens at the early stages of a file, and it’s rarely about figuring out the raw power level of something. When you look at all of the cards at once, you get the best sense of what’s fun, what’s not fun, what’s too complicated, and whatever else you want to analyze. A good example of this is Death Rattle in Icecrown.


By this point, Scourgewar and Wrath Gate were done deals, and obviously both of those sets contained a reasonably large amount of the mechanic. At one point in the file, there was a lot more Death Rattle than what ended up going out the door. What happened was, I decided to build a deck with basically as many Death Rattle cards and as many ways to trigger them at instant speed as possible (Donation and Into the Maw-style Quests, Alchemist’s Stone, etc.) and see how annoying it was for my opponent. The issue wasn’t about raw power, although the deck could do some powerful things. The bigger issue, to me, was, “How annoying is this? Is this too much to keep track of? Are there too many permutations to these plays?” And, as we found out, multiple Death Rattle triggers happening on my turn and yours are pretty annoying. So, not only did we ratchet back the amount of Death Rattle generally, we were very careful about cards that you would want to Death Rattle during your opponent’s turn. I’m pretty sure Ben Cichoski was about to kill me when I kept stopping him before his draw step on his turn because I was considering which discard outlet I wanted to use and which card I wanted to discard.
On top of that, the Epics require a lot of attention compared to other cards. Not all of them are going to be players in top-level Constructed; people can only cram so many high-cost cards into their decks, and our Epics are going to be biased towards big allies and such. Still, we have an obligation to make sure that our Epics are cool to at least one type of player, and ideally more than one. Plus, they have to pass the Drew Walker test (he’s our resident super-casual and lore master): are we doing the MMO justice? It’s important for them to “feel” Epic, which involves watching how they operate in play and how people react to playing them or having them played against them.

Big, splashy, thematic, and OMG LIGHTNING-- mission accomplished!
There’s some other stuff too: general set complexity, making sure the cards feel right and are working as intended, and so on. Still, the primary focus of our testing has to be within the context of the set that we’re working on and its primary themes, rather than testing arbitrary power cards and how they interact with other power cards from previous sets. Put another way:
- While working on Worldbreaker, if you built a deck with a ton of Worgen to see how powerful King Genn Greymane is, or built a deck around nature damage abilities and allies with Earth and Moon, that would be a good use of time.
- While working on Worldbreaker, if you built a stock Death Wish deck with Stance Mastery so your Jin’Rohk would hit for seven instead of six or whatever, that would be a bad use of time.
It is impossible for us to fully “balance” Constructed. We have our opinions about what’s good and what’s bad, and we make a good faith effort to ensure that no particular strategy is overwhelming. But promising balance is impossible, and striving for it is not a good use of time. For one thing, there are a couple of us working on this game, and thousands of you building decks. If it took a few of us a couple of weeks to figure out what all the “good” decks are, I imagine it would take the public about two seconds. Secondly, one of the biggest mistakes a developer can make is justifying mistakes by assuming people are going to figure out the solutions.
Let’s say there was a busted resource ramp deck when Worldbreaker hit. Let’s say we knew this, that the average deck had very little shot of beating it, but the ramp deck could never beat Close Quarters Combat beatdown, no matter how the ramp player modified their deck. A competitive player’s instinct might be to say, “Sounds fine. It’s hard for that deck to get overwhelming in the long term because a solution exists.” That is a huge fallacy. Consider:
- What if no one discovers the Close Quarters Combat deck you assume they are going to find?
- What if people really hate playing with or against Close Quarters Combat?
- What if the prevalence of Close Quarters Combat means everyone writes off every high-resource cost card you make for the next few years?
Any of the above outcomes represents a potential disaster, and even in the “optimistic” scenario it’s not like things are super awesome in your metagame. To avoid this, the best thing to do is to test all of your decks against a spectrum of strategies. If anything strikes you as obnoxious, or lock-outy, or too powerful unless people play with narrow or specific cards, you should nerf that card or deck even if you don’t think it’s going to be dominant in top-level play. Of course, even if a deck is fun to play, if it’s batting an overwhelming percentage in testing, it should be taken down a peg or two just to maintain general balance. Decks are almost always fun in small doses and almost always annoying in large ones. Striking that balance is the tricky part, and that balance is best found by running everything against everything and seeing how you feel, rather than jamming hundreds of games with the same five decks, making sure nothing bats more than 55% against the field.