When people talk about running playtests, the focus tends to go straight to logistics.
- How many players do we need?
- How quickly can we get results?
- What platform should we use?
Those things matter, but they’re not what determines whether the research is actually useful.
What actually determines the quality of research is much simpler. It comes down to four things:

- the question you’re asking
- the sample you’re testing with
- the method you use to capture data
- the way you interpret the results

If any one of these is off, the value drops off very quickly.
Everything starts with the question.
This is where most research goes wrong.
You’ll often see questions like:

- “Is the game fun?”
- “Do players like it?”

They sound reasonable, but they’re not particularly useful.
“Fun” isn’t a stable concept. It changes depending on the type of game and the intended experience. A horror game shouldn’t feel comfortable. A strategy game might feel demanding. A narrative game might aim for something more reflective.
If you don’t anchor the question, you can’t meaningfully interpret the answer.
A better way to think about it is in terms of intended experience. That reframes the question into something like:

- “Does the opening of the horror campaign build dread the way we intend?”
- “Does the mid-game feel demanding without tipping into frustration?”

These are grounded in what the game is trying to do, and they lead somewhere actionable.
In practice, good research questions are:

- anchored to the intended experience
- specific about which part of the game they concern
- actionable, in that the answer points at something the team can change
If the question isn’t clear, everything that follows becomes harder.
Once the objective is clear, the next step is the sample.
This isn’t just about demographics. It’s about whether the people you’re testing with make sense for the question you’re asking.
If you’re testing usability, you don’t necessarily need your exact target audience. You need players who can reasonably engage with the genre and help you identify where things break.
If you’re testing engagement or retention, that changes. Now you care much more about whether players resemble your intended audience. Their expectations and motivations start to matter a lot more.
You might also be looking at specific groups:

- newcomers to the genre versus experienced players
- fans of similar titles
- players who dropped off and might come back
Each of these shifts the sample.
There’s also a scale element here.
For usability, smaller samples work because you’re identifying issues.
For experience and sentiment, you need larger samples because you’re dealing with variation between players.
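To put rough numbers on the usability side: a widely cited rule-of-thumb model from the usability literature (Nielsen and Landauer) estimates the share of issues surfaced by n testers as 1 − (1 − p)^n, where p is the chance that any single tester hits a given issue. Here is a minimal sketch using their often-quoted average of p ≈ 0.31; real detection rates vary by product and task:

```python
def discovery_rate(n: int, p: float = 0.31) -> float:
    """Expected share of usability issues surfaced by n testers, assuming
    each tester independently encounters a given issue with probability p."""
    return 1 - (1 - p) ** n

for n in (3, 5, 8, 12):
    print(f"{n:>2} players -> ~{discovery_rate(n):.0%} of issues")
```

Under those assumptions, five players already surface roughly 84% of issues, which is why small usability samples go a long way. Averaged sentiment scores behave very differently: they only settle down as the sample grows.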
The key point is that the sample should follow the question.
Once you know what you’re trying to understand and who you’re testing with, the next question is how you actually capture the data.
Different questions need different methods.
If you’re looking at usability, observation is the most valuable. Watching players interact with the game, especially with think-aloud, gives you direct visibility into where things break down.
If you’re looking at experience or sentiment, you need to combine that with surveys so you can understand how players feel across a larger sample.
If you’re trying to understand reasoning or motivation, interviews can add depth.
Where this often goes wrong is trying to answer everything with the same approach.
For example, trying to understand enjoyment with five players and no survey data. You’ll get something, but it’s going to be very unstable.
Or relying entirely on survey scores without understanding what actually happened during play.
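To make the five-player instability concrete, here is a small simulation with entirely invented ratings on a 1–7 scale. The exact numbers don’t matter; the pattern does: a mean from five players swings widely from one hypothetical test to the next, while fifty players pin it down far more tightly.

```python
import random

random.seed(1)

# Entirely made-up "population" of enjoyment ratings on a 1-7 scale.
population = [min(7.0, max(1.0, random.gauss(4.5, 1.3))) for _ in range(10_000)]

def spread_of_means(n: int, draws: int = 1_000) -> float:
    """How widely the sample mean varies across repeated tests of n players."""
    means = [sum(random.sample(population, n)) / n for _ in range(draws)]
    return max(means) - min(means)

for n in (5, 50):
    print(f"n={n:>2}: means span ~{spread_of_means(n):.1f} points on the 7-point scale")
```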
In practice, you’re usually combining methods:

- observation to see what actually happens during play
- surveys to quantify how players feel across the sample
- interviews to understand the reasoning behind the behavior
The method should match the question you’re asking.
The final piece is interpretation.
You can have the right question, the right players, and the right data, and still end up in the wrong place if this part isn’t handled properly.
There are a couple of common issues.
One is over-weighting individual players. Especially in smaller samples, a single strong opinion can skew how the results are interpreted.
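A toy example with hypothetical ratings shows how much leverage one voice can have in a small sample:

```python
# Seven hypothetical players rate a session on a 1-7 scale:
# six are broadly positive, one is strongly negative.
ratings = [5, 6, 5, 6, 5, 6, 1]

mean = sum(ratings) / len(ratings)
median = sorted(ratings)[len(ratings) // 2]

print(f"mean={mean:.2f}, median={median}")  # mean=4.86, median=5
```

The mean suggests a lukewarm session; the median, and six of the seven players, say otherwise. Reading the distribution rather than a single summary number keeps one strong opinion from speaking for the group.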
Another is taking player feedback too literally.
Players are very good at describing their experience. They’re much less reliable when it comes to diagnosing the cause or suggesting solutions.
For example, a player might say something is too difficult, but the actual issue might be clarity, feedback, or onboarding rather than difficulty itself.
What you’re trying to do here is move from:

“players said the game is too difficult”

to:

“players struggled in the opening because the mechanics weren’t clearly taught, and they described that as difficulty.”
This is where research connects back to design. You’re not just reporting issues, you’re helping the team understand them in context.
Good research isn’t about tools or scale. It’s about alignment between these four components.
When these line up, even small studies can be very effective.
When they don’t, you can run large, expensive playtests and still struggle to get something actionable out of them.
In practice, the difference between good and bad research is rarely effort. It’s whether these fundamentals have been thought through properly.
And once they are, everything else becomes much easier.