Gearing up for paper-writing

The last couple of days have been about prototyping user interface and rendering techniques for improving the accessibility of old video games. I wanted to see if the basics—drawing things in different colors, tweaking contrast and saturation, labeling affordances, et cetera—were achievable in a reasonable amount of time, and indeed they are. I want to get some screenshots together for a demo post tomorrow, but for today my goal was to decide whether it made sense to outline a research paper, and I’m pretty sure it does. The challenge for an FDG paper will be showing that it’s a significant enough improvement over or new application beyond, e.g., Batu’s work, and I think I can get there by demonstrating generality and the fact that some labels (and potentially more in the future) are generated automatically.
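
As a point of reference for what “the basics” amount to, here is a minimal sketch of the kind of per-region pixel operation involved, assuming the framebuffer is available as an H×W×3 numpy array; it illustrates the idea rather than reproducing the code from my prototype.

```python
import numpy as np

def adjust_region(frame: np.ndarray, x: int, y: int, w: int, h: int,
                  saturation: float = 1.0, contrast: float = 1.0) -> None:
    """In-place saturation/contrast tweak of a rectangular region of an RGB frame.

    frame: HxWx3 uint8 array (e.g. the emulator's framebuffer, an assumption here).
    saturation < 1 washes the region out; contrast < 1 flattens it.
    """
    region = frame[y:y+h, x:x+w].astype(np.float32)
    # Per-pixel luma: the "gray" target for desaturation and, via its mean,
    # the pivot point for contrast scaling.
    luma = region @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = luma[..., None]
    out = gray + saturation * (region - gray)            # lerp each pixel toward gray
    out = luma.mean() + contrast * (out - luma.mean())   # scale around the mean luma
    frame[y:y+h, x:x+w] = np.clip(out, 0, 255).astype(np.uint8)
```

Outlines, tints, and crop/zoom are similar short array operations once the regions of interest are known.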

To guide the paper, I want to showcase a problem and my contributions. The problem is that many video games are inaccessible to people with vision impairments; while new games can be made to support larger groups of players, this depends on the cooperation of the game developer. For other games, we want to find ways to increase their accessibility after the fact. For the large class of Nintendo Entertainment System (NES) games, I have developed a technological approach and tool for efficiently labeling their graphics with key affordances during play; these labels are then used to modulate the display to make it more accessible to players with different visual needs. Some of these tags are inferred automatically from play, while others are given explicitly by the user. As a side effect, this also produces a rich dataset of game images tagged with fine-grained affordance data, which will be useful for bootstrapping this work to unseen games or onto new, non-NES platforms in a machine learning regime. I have a natural tendency to want to make this bigger—add in motor assists and point-to-move, add in visual cues for sound effects, learn all tags automatically—but I will try to keep it short this time, for a change.

I’ll need to double-check whether recent FDG reviewers have accepted such “tools papers”—they did publish Eric Kaltman’s and my GISST work a few years back, so I think it’s plausible—or whether I need to put more effort into showing that it indeed improves accessibility, creating a set of accessible games, generating affordance tags automatically, etc.

Life Outside of Work

I haven’t gotten much work done since last time. Last week (and the week before), my family all fell like dominoes to sinus infections and colds. That meant other tasks were more important than research work: caring for family members, caring for myself, even playing Splatoon all took priority. When life isn’t normal, you can’t pretend that it is.

A little distance helped me in other ways, too. For example, over the weekend I was able to complete some refactoring I had been meaning to do for a while, which substantially reduced the complexity of the interactive Mappy player and its debug outputs. I realized that I should be using 3D convolutions in the scroll detection application, and I remembered that there’s a Rust Jupyter kernel (evcxr) that I could use to prototype and visualize my work there. I also got a chance to think about the accessibility work beyond manually tagging individual sprites and tiles with affordances.
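
For my own notes, the core of the scroll-detection idea is frame-to-frame correlation; a batched 3D convolution over (x, y, time) is essentially that operation applied to many frame pairs at once. The sketch below is my mental model of the building block in Python (the actual application is in Rust), not what Mappy currently does.

```python
import numpy as np

def scroll_offset(prev: np.ndarray, curr: np.ndarray) -> tuple[int, int]:
    """Estimate the (dy, dx) translation of `curr` relative to `prev`.

    Both frames are 2D grayscale arrays of the same shape. Positive values mean
    the content moved down/right; the camera scrolled the opposite way.
    """
    # Zero-mean the frames so flat background regions don't dominate the correlation.
    a = prev.astype(np.float32) - prev.mean()
    b = curr.astype(np.float32) - curr.mean()
    # Cross-correlation via the convolution theorem; the peak marks the shift.
    corr = np.fft.ifft2(np.fft.fft2(b) * np.conj(np.fft.fft2(a))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices wrap around, so map the upper half back to negative offsets.
    h, w = corr.shape
    dy = dy - h if dy > h // 2 else dy
    dx = dx - w if dx > w // 2 else dx
    return int(dy), int(dx)
```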

Taken together, slowing down for a week meant that I could use my time more efficiently when I was well enough to work again. Today I have recommendation letters and search committee stuff to do, but I am confident that I can catch up on that and push the accessibility work to a Foundations of Digital Games submission within a few weeks.

Overview, 2022-09-27

Xinyi Therese Xu ’24 and I were talking for a while about adding accessibility features to retro games, especially aids for players with visual and motor disabilities. This picks up a thread of research that I first saw carried by Batu Aytemiz et al. in the paper “Your Buddy, the Grandmaster: Repurposing the Game-Playing AI Surplus for Inclusivity”. I got pretty into this idea around the time I applied for CAREER in the summer of 2021, but 2021–2022 was a really intense academic year and I wasn’t able to explore it much further. While more and more games made today ship with accessibility features, many games of major cultural import will never be patched to include such features. So, how can we hack existing games to add them?

There are a few areas here I want to investigate:

  1. Visual supports, probably with some game-specific information—as in modern PC games, can we highlight game entities with positive or negative valence, de-emphasize and reduce contrast of unimportant details, augment text displays with text-to-speech support, or crop and zoom the screen where that would be helpful?
  2. Motor supports, again with game-specific information about character control. Can we add click-to-move features to Super Mario Bros., use slow-motion or other time travel features to make games accessible to people with differing reaction times, and so on? (A rough sketch of the slow-motion pacing idea follows this list.)
  3. Can the information needed by the supports above—the valence of game entities, the control scheme and character physics, et cetera—be defined in a way which is convenient for non-specialist players, potentially by the same people who want to play with these supports?
  4. Can the information needed by the supports above be learned automatically to support arbitrary new games?
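
Of these, the slow-motion piece of item 2 is mechanically the simplest, so here is a rough sketch of what its pacing loop could look like. The emulator interface (`emu.step_frame()`, `emu.render()`, `emu.running`) is a placeholder I am inventing for illustration, not an existing API.

```python
import time

def run_slowed(emu, speed: float = 0.5, base_fps: float = 60.0) -> None:
    """Run an emulator at a fraction of real time (speed=0.5 means half speed)."""
    frame_budget = 1.0 / (base_fps * speed)    # stretched wall-clock time per frame
    while emu.running:                         # hypothetical attribute
        start = time.perf_counter()
        emu.step_frame()                       # advance one emulated frame (placeholder)
        emu.render()                           # draw it (placeholder)
        # Sleep off whatever remains of the stretched frame budget.
        leftover = frame_budget - (time.perf_counter() - start)
        if leftover > 0:
            time.sleep(leftover)
```

Slowing the loop down gives players more time to react without touching the game’s own logic; rewind and other time-travel features would additionally need save-state support.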

Back in 2021, I added some Python scripting support to Mappy. It exposes the sprites and tiles on screen (in world coordinates) and gives write access to the screen pixels, so it’s a good starting point for adding visual supports. I think this is a project which is both interesting and useful, and it’s something that feels close to being publishable, so I’m hoping to pursue it over the next couple of months.
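
To make that concrete, a visual-support script written against that interface might look roughly like the following. Every name here (`on_frame`, `mappy.sprites`, `world_x`, `screen`, and so on) is a stand-in I am inventing to show the shape of the thing, not the actual scripting API.

```python
# Placeholder pattern ids for sprites tagged as hazards; the real ids would
# come from the tagging workflow described in the plan below.
HAZARDS = {0x70, 0x71}

def on_frame(mappy):
    """Hypothetical per-frame hook: highlight hazardous sprites in red."""
    cam_x, cam_y = mappy.camera                  # world-space scroll position (assumed)
    for sprite in mappy.sprites:                 # sprites reported in world coordinates
        if sprite.pattern_id in HAZARDS:
            x, y = sprite.world_x - cam_x, sprite.world_y - cam_y
            outline(mappy.screen, x, y, sprite.width, sprite.height, (255, 0, 0))

def outline(screen, x, y, w, h, color):
    """Draw a 1-pixel rectangle into an HxWx3 framebuffer (bounds checks elided)."""
    screen[y, x:x + w] = color
    screen[y + h - 1, x:x + w] = color
    screen[y:y + h, x] = color
    screen[y:y + h, x + w - 1] = color
```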

I started today by re-reading Batu’s paper, and picking up an email thread with a researcher at UC Irvine. I haven’t published in accessibility before, and I feel like my work is more likely to land in the general area of Batu’s—a technological intervention with some examples motivated by existing work, which could be evaluated with a real population in the future.

This is one of those projects where I feel pretty comfortable with the basic idea (change the visual appearance of things based on their semantics), and it doesn’t involve coming up to speed on machine learning techniques or other unfamiliar technologies. So I think my first goal will be to figure out, for Super Mario specifically, how I can identify background tiles, foreground terrain tiles, enemies, and powerups; then, how I can best communicate, e.g., the state of Mario (big, small, jumping, etc.) and the salient aspects of the game state. Since I’m OK doing this manually, I’ll add a way to click on a sprite on screen to assign it some valence or affordances (maybe calling back to my work with Gerard Bentley ’19) and collect game data that way. Then, I’ll probably add some convenience functions for saturating, desaturating, tweaking the contrast, etc., of rectangular regions of the screen (maybe later supporting an alpha mask to keep the shapes crisp) and map those onto object valences. For the paper, I think it’s enough to do just a few levels of Super Mario Bros., the first dungeon of The Legend of Zelda, etc., and keep track of how long the tagging task takes.
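
The bookkeeping for that tagging workflow can stay small. Below is a sketch of what I have in mind: a click handler records affordances per sprite/tile pattern id, affordances map onto display tweaks, and the session (including elapsed time, for the timing numbers in the paper) serializes to JSON. The names and the particular affordance strings are placeholders.

```python
import json
import time
from collections import defaultdict

# pattern id -> set of affordance strings (e.g. "danger", "powerup", "background")
tags: dict[int, set[str]] = defaultdict(set)
session_start = time.monotonic()

def on_click(pattern_id: int, affordance: str) -> None:
    """Record an affordance for whatever sprite or tile was clicked."""
    tags[pattern_id].add(affordance)

def effect_for(pattern_id: int) -> dict:
    """Map a pattern's affordances onto display tweaks (values are guesses)."""
    t = tags.get(pattern_id, set())
    if "danger" in t:
        return {"outline": (255, 0, 0), "saturation": 1.4}
    if "powerup" in t:
        return {"outline": (0, 255, 0), "saturation": 1.4}
    if "background" in t:
        return {"saturation": 0.3, "contrast": 0.6}   # de-emphasize scenery
    return {}

def save(path: str) -> None:
    """Persist tags plus elapsed tagging time for the timing measurements."""
    payload = {
        "elapsed_seconds": time.monotonic() - session_start,
        "tags": {pid: sorted(affs) for pid, affs in tags.items()},
    }
    with open(path, "w") as f:
        json.dump(payload, f, indent=2)
```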

The paragraphs above are essentially an outline of a research paper already: name an area, scope it out, relate it to prior work, make an intervention, and show its utility. I watched Simon Peyton Jones’s “How to Write a Great Research Paper” recently—as eager as I am to start hacking, I think I might actually try to write the paper first this time around.