The next step in our playtesting journey is deciding what you should measure. Measurement comes in many forms, so today we’ll stick to what you’ll be measuring. We’ll look at how to collect that data next time.
Measuring during playtesting is pretty damned important. It’s one of the major differentiators between a good playtest and a bad one – you can set everything up perfectly, but if you aren’t collecting the right data, you’ll miss something important.
The data you need will (shock horror!) depend a lot on what your test purpose is. As a general rule, early stage, on-location testing will benefit more from qualitative data (partly because it’s harder to come to conclusions from small amounts of quantitative data with a small sample), while later stage, online testing will benefit more from quantitative data (because wading through that qualitative data will take much more time).
This isn’t to say that you should have only qualitative or only quantitative data at any point: there’s no point at which a ‘What was your favourite/least favourite part?’ question will be wasted, and no point at which monitoring the number of deaths won’t be supremely useful. In the end, you’ll probably want a mix of each.
So, now for a long list of things you might want to measure, starting with the qualitative:
- What they found fun and boring
- What they found difficult and easy
- What parts were frustrating
- Did the player learn everything they needed to?
- What didn’t the player understand?
- Did the difficulty curve increase too dramatically?
- Do hardcore gamers find it boring/too easy?
- Do novice gamers find it too difficult?
And now for the quantitative:
- Number of deaths (and where/what level they were in)
- A good way of determining challenge (alternatively, where they lose health can be useful)
- Use these to create heat maps in order to determine if there are any winning strategies
- Great to find out winning strategies, or if your weapons are even useful…
- Another way of monitoring challenge
- This is really important for working out how long your game will actually take, deciding on timing etc. Also gives an idea of challenge, though other measures are probably a bit better.
- Valid for nonlinear games etc. You probably don’t want everyone taking the same branches.
- Most useful for online testing, as people are less likely to quit when you’re standing right there. Gives you an idea of whether a certain level is really frustrating/difficult
- Get players to rate things (in terms of difficulty, fun, understanding, etc.) for a quick and easy data grab, and a way of turning qualitative questions into more quantitative ones. A single rating is usually rubbish, as everyone has a different scale (which is influenced by how you word the question). You want to get ratings per level so you can compare them (how you compare these is something we’ll discuss later).
- Get players to rank levels/mechanics/game objects from easiest to hardest, least fun to most fun, understandable to confusing.
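
To make the death-counting and heat map ideas a little more concrete, here’s a rough Python sketch. It assumes (and this is purely my assumption for illustration) that each death gets logged as a level name plus an x/y position. The event format, the function names and the `cell_size` value are all hypothetical, so adapt them to whatever your game actually records.

```python
from collections import Counter

# Hypothetical log: one (level, x, y) tuple per player death.
deaths = [
    ("level_1", 12.0, 3.5),
    ("level_1", 13.2, 3.9),
    ("level_2", 40.1, 8.0),
    ("level_1", 12.7, 4.4),
]


def deaths_per_level(events):
    """Count how many deaths were logged in each level."""
    return Counter(level for level, _x, _y in events)


def death_heat_map(events, level, cell_size=5.0):
    """Bin death positions for one level into square grid cells.

    Returns a Counter keyed by (column, row); a higher count
    means more deaths landed in that cell.
    """
    cells = Counter()
    for lvl, x, y in events:
        if lvl == level:
            cells[(int(x // cell_size), int(y // cell_size))] += 1
    return cells


print(deaths_per_level(deaths))
print(death_heat_map(deaths, "level_1"))
```

A coarse grid like this is usually enough to spot a cluster of deaths in one spot. Shrink `cell_size` if you have plenty of data and want finer detail.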
What you’ll notice from these two lists is that much of the qualitative data can be covered, at least partially, by the quantitative data. However, quantitative data at best helps you guess when someone is happy, sad, frustrated etc., while qualitative data usually gives you a much more reliable read on how the player is feeling, and some better suggestions on how to actually fix things.
Again, I’ve probably left out some of the data you might want to consider: this is supposed to be a formative list to give you some ideas to start with rather than an exhaustive one. If there’s anything major missing, however, I’ll be sure to update the post.
Next time, we look at how you’re gonna collect this precious, precious data: starting with the quantitative.