Playtesting Capsized

Last Saturday I ran a playtesting session for Capsized. When I approach this type of public testing I tend to do it in a very scientific manner. I have a script, I have a list of items that will be examined, and I have variables I'm going to measure. I also tend to use a very qualitative method. I feel that you can learn a great deal about a game from very few sample points: the point of diminishing returns is reached quickly, and additional players aren't going to tell you anything new. There are those who will argue that you can't say anything conclusive with only a few samples. However, when you're asking questions like 'how' and 'why', you're looking for disagreement, rather than agreement, among players. You can look for agreement among players later, in the larger alpha/beta testing phases.

Let's look at an example. Suppose a player becomes stuck at a specific point in a level. Let's also suppose that you've tested a total of 10 people, with the other 9 having no problems at all. It's easy to look at the results and conclude that 1/10 is fine, and that the vast majority seems to have no problem with that level. However, we don't know how or why the player became stuck, or whether the problem is unique or systemic. Personally, these are the questions I strive to answer both during and after a playtesting session.
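
As a rough illustration of why 1/10 is not reassuring on its own, the sketch below computes how likely it is to see at most one stuck player out of ten for a given underlying rate. It assumes players get stuck independently and with the same probability, which is a simplification; the function name and the 20% rate are hypothetical and chosen only for illustration.

```python
from math import comb


def prob_at_most_stuck(max_stuck, testers, true_rate):
    """Binomial probability that no more than `max_stuck` of `testers`
    get stuck, assuming each one gets stuck independently with
    probability `true_rate` (a simplifying assumption)."""
    return sum(
        comb(testers, i) * true_rate**i * (1 - true_rate) ** (testers - i)
        for i in range(max_stuck + 1)
    )


# Hypothetical scenario: 20% of all players would get stuck at this point.
chance = prob_at_most_stuck(max_stuck=1, testers=10, true_rate=0.2)
print(f"Chance of seeing 0 or 1 stuck players in 10: {chance:.2f}")
```

Under these assumptions the result is about 0.38, so a problem that would affect one in five players would still look like "1/10 or better" in more than a third of ten-person sessions. The count alone can't separate a one-off from a systemic issue, which is why the how and why matter.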

When I run a playtesting session I break the process down into 3 sections.

  1. Play and observation
  2. Survey
  3. Interview/discussion

My sessions usually run about 2 hours: the first hour is play and observation, and the second hour is for the survey and interview. Obviously this time changes based on how much you are trying to test.

I’ve found that the survey is best used as a method to measure the overall experience of the player, while the interview allows me to really explore the observations made during the play session. Combined, the three sections provide a rich data set that can be used to analyze and fix the observed problems.

Here’s a real example from the session on Saturday: one player was observed having difficulty navigating through a level. The level in question required a significant amount of vertical traversal, and it was clear that the player was becoming frustrated.

Me: "How are you feeling right now?"

Him: "A bit annoyed right now. I can't get... ahh, up there." - he gives up

Me: "Why were you trying to get up there?" - I noted that he was trying to reach a cache of items.

Him: "I need to get that jetpack fuel, so I can fly to the top again."

The problem wasn't that he couldn't reach the items (in fact, he could). The problem was that he had missed the training in the first level that taught him to use his grappling hook to move around the levels, a conclusion drawn from the post-session data analysis. The player was unaware of the cause of his problem, and if we had simply asked him, without observing the problem firsthand, we might have dismissed it or incorrectly assumed it was a problem in the current level.

I'm happy with the process I've been using. It seems to work very well within the limitations of an indie title.


New publication available

I've added my most recent publication for everyone to read. It's a short workshop paper that describes a project I've been working on. You can download the article here, or you can find the reference on the publications page.

The project is a cool little initiative describing the expanded use of data collected from video game reviews as a tool to inform future design decisions and to help assess quality during the testing process. The project goes hand-in-hand with my previously published work, which described a heuristic evaluation technique that I call Critic-Proofing. If you're interested in novel evaluation techniques designed specifically for video games, then this is worth a read.

