How to Design a Competition

To understand what we want to design, we need some goals and direction, and these will depend on what you want to do. Some typical goals for a competition are accessibility and fun, as well as testing a programmer's skills.

I will define a few terms before proceeding:

Competition: The collection of competitors, the competition design, and the processes that allow competitors to compete (e.g. the agent submission page, rankings page, visualizers, etc.), along with the interactions among them.

Competition design: The exact game a user's agent plays, whether against itself, the environment, or other users. This is usually fully described by a design spec documenting what one can and cannot do in the game.

Agent: The code and files that comprise a user's strategy, to be employed in the competition design against themselves or others (and usually ranked on a leaderboard).

Assuming a generic competition aimed at a general audience and a wide range of abilities, the following are some good general design principles for a great competition:

  • Easy to Start
  • Rich in Strategy
  • Strong Signifiers
  • Excellent Feedback

The first two points are general concepts about competition design. The remaining points apply design principles, many of which I learned from competing in past competitions and from reading "The Design of Everyday Things" by Don Norman (which I highly recommend reading!). I will abbreviate this book as DOET whenever I reference it.

It's also really important to consider who your competition audience is. If everyone competing is a seasoned competitor and / or is highly skilled at these kinds of competitions, the easy-to-start point won't be entirely applicable. Likewise, if everyone competing has little to no experience, you may want to avoid highly complex competition designs.

Easy to Start

An easy-to-start competition design makes the competition much more accessible and fun for competitors. This way most competitors won't be frustrated by setup or leave because it's hard to write a working AI bot. This is a problem often faced by more hardcore / difficult competitions like Battlecode. While Battlecode offers complex, fun, and strategy-rich competitions, it often inadvertently deters newcomers. A good example of something that is easy to start is the Halite AI competition, where both the rules and the programming API are simple. Moreover, the language-agnostic system adopted by Halite makes it easier for competitors to join and start, as they do not need to switch away from a programming language they're comfortable with. (Edit: they unfortunately no longer do this in current iterations of Halite.)
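
To make the language-agnostic idea concrete, a bare-bones bot in such a system is just a loop over standard input and output. The sketch below assumes a hypothetical protocol (one line of game state in, one line with a move out, per turn); it is not the actual Halite API.

```python
import sys

# Hypothetical language-agnostic protocol: each turn the engine writes one
# line of game state to our stdin and reads one line (our move) from stdout.
for state_line in sys.stdin:
    state = state_line.strip().split()  # e.g. positions, resources, ...
    move = "STAY"                       # trivial baseline strategy
    print(move, flush=True)             # flush so the engine sees it immediately
```

Because the contract is just text over stdin / stdout, competitors can write the same loop in whatever language they are comfortable with.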

But making something easy to start is itself not easy, as complexity can explode. Typically, a good way to estimate the complexity of your competition is to look at its action space. The action space refers to all the actions a user's agent can take at any given point in time. A game like checkers, where a piece can only move in a few directions, has a minimal action space; something like StarCraft has a very large one. By carefully controlling the size of the action space, you can better control the complexity and keep the competition simple enough to start. A minimal sketch of what a small action space looks like in code is given below.
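
As a minimal sketch of a small action space (the names here are hypothetical, not from any particular competition), a checkers-like game might expose only four actions per piece:

```python
from enum import Enum

class Action(Enum):
    # A deliberately tiny action space: four diagonal moves, as (drow, dcol).
    UP_LEFT = (-1, -1)
    UP_RIGHT = (-1, 1)
    DOWN_LEFT = (1, -1)
    DOWN_RIGHT = (1, 1)

def apply(position, action):
    """Apply a move to a (row, col) position; boundary/legality checks omitted."""
    row, col = position
    drow, dcol = action.value
    return (row + drow, col + dcol)

# An agent only ever has to choose among len(Action) == 4 options per piece.
print(apply((4, 4), Action.UP_RIGHT))  # -> (3, 5)
```

An agent facing four choices per piece is far easier to get working on day one than one facing a StarCraft-sized action space.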

Rich in Strategy

While an easy-to-start competition helps address accessibility for less experienced competitors, it's also important to ensure the competition design isn't instantly solvable. This means simple games like tic-tac-toe or games of pure chance (drawing cards where the higher number wins) aren't rich. If a competition is too easy, it won't be interesting, and most of the bots on the leaderboard will likely be the same and have the same performance. Without room for growth in strategy, any competition will quickly dull.

But how do you make it rich in strategy? I highly recommend taking a look at some of Battlecode's past design specs. They are almost always extremely rich in strategy. There are a number of common dimensions one can use to build richness. Some examples include resource management, movement on a grid, and in general just variety in each dimension. However, the best parts usually come from more innovative and creative ideas. Being creative is very difficult, but I presume this comes with experience and from competing in other competitions.

What unfortunately makes this even more difficult is that it is hard to balance richness with a low barrier to entry. The two goals more often than not collide, meaning you will need to find unique ways to add a minimal number of actions while opening up a great many more possible strategies.

Strong Signifiers

A signifier communicates where actions / interactions should take place. In the context of an AI programming competition, these interactions range from using the agent programming API that lets users control their agent, to interacting with the platform / website that lets them compete against others or themselves. Importantly, signifiers help bridge the gulf of execution, which represents the amount of effort required to figure out how to perform some action in a competition.

Some of the most common signifiers of these competitions are design specs, API documentation, and the overall website flow for submitting an agent and interpreting its performance.

  1. Design specs signify exactly how a user's new strategy can interact in the competition and exactly what actions are available. A good design spec is probably one of the most important parts of a competition. Without one, most users would be left confused and asking many questions, leading to users leaving the competition, the creation of FAQs, and many more problems; we thus fail to bridge the gulf of execution. (FAQs in my opinion aren't good; long FAQs are just evidence of poor design.) But what is a good design spec? A good design spec communicates quickly and unambiguously how the competition works. Some of the many ways to improve communication and signifiers include visuals, bolded text, information hierarchy, etc. In my opinion, visuals are probably the best signifier for users in the context of AI programming competitions. AI programming competitions usually come with a visualizer to show users their agent in action and give them feedback on the performance and behavior of their agent. Visually connecting the behavior of the agent with the design spec will greatly improve clarity.

  2. API documentation signifies to users exactly how they are allowed and able to code an agent that works in the competition. Such documentation needs to be concise and clear. This also ties in neatly with keeping the action space minimal, since a minimal action space tends to imply that less needs to be written in the API documentation. Without clarity, we run into the same problems faced by poor design specs, described above. Documentation in software has a very long history of being difficult to write well, and is often enforced as a requirement in computer science classes around the world. I personally recommend getting used to how popular open source projects document their code and following those standards: see https://github.com/PharkMillups/beautiful-docs for a list of well documented software. Good documentation will easily help bridge the gulf of execution, allowing competitors to move easily from idea to code. (A small sketch of what documented agent API code might look like follows this list.)

  3. A good website / platform for interacting with the competition is crucial. One component of this is good signifiers communicating key aspects of your competition, from where to find the design specs to how to submit an agent. If a user can't figure out how to enter a competition and submit something, they can 1. ask endless questions and be given the same answers, 2. use those answers and still fail, or 3. just quit altogether. The history of other websites suggests that 1, 2, and 3 happen very often on a poorly designed website. There are many ways to approach the issue of signifying key components. For example, for bot submission there needs to be 1. a clear indication of where to submit and 2. a clear indication of how to submit. Often point 2 is tricky: a submission process might require a user to zip all their files, rename the folder to a specific name, and then upload it, steps which, if not communicated well, will cause users to fail when trying to submit an agent. A better solution would be to reduce the number of steps required so less communication needs to be done (or, if that is not technologically possible, explain those steps clearly).
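
As a sketch of what concise agent API documentation might look like (the function, game, and helper my_units() are all hypothetical, and the body is elided since the point is the docstring), a well-documented call states its contract, its valid inputs, and its failure mode up front:

```python
def move_unit(unit_id: int, direction: str) -> bool:
    """Queue a move order for one of your units this turn.

    Args:
        unit_id: The id of a unit you own (see ``my_units()``).
        direction: One of ``"NORTH"``, ``"SOUTH"``, ``"EAST"``, ``"WEST"``.

    Returns:
        True if the order was queued, False if the move is illegal
        (e.g. off the map or onto an occupied tile). Illegal orders are
        ignored rather than crashing your agent.
    """
    ...
```

Spelling out the failure mode ("illegal orders are ignored") is exactly the kind of detail that keeps results matching expectations.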

For more information, read the first chapter in DOET about signifiers and more.

Excellent Feedback

Excellent feedback allows competitors to have a smooth competition experience and, most importantly, bridges the gulf of evaluation, which represents the effort required to interpret the state of the system and figure out what happened as a result of some action.

Good feedback allows users to keep competing, as it improves their ability to go from observations to ideas to code. What comprises feedback in a typical AI programming competition? A few important sources of feedback are listed and explained here:

  1. Usually an AI competition will have some way to rank submitted agents against each other (or against themselves) and display this publicly so users know how their agent is performing. This is a "leaderboard", a common way of displaying relative skill and showing how one's changes impact one's agent. (A minimal sketch of a rating update for such a leaderboard follows this list.)

  2. A visualizer for the competition. Like many RL environments, these AI programming competitions generally need some visual way to show users what their agent is doing, to debug it, and to help them improve upon it. This speeds up development, lowers frustration (imagine coding an agent without knowing what it is actually doing), and keeps more users engaged.

  3. An agent programming API is one of the first barriers between ideas and working code. Such an API is meant to allow the user to engage with the competition design, interact with the environment (which may include their opponent), and perform actions that help them score higher or win. The API is a source of feedback for competitors because they read the documentation, run the code, and see the results. If the API is faulty, results do not match expectations, we fail to bridge the gulf of evaluation, and users are confused.
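
As a rough, competition-agnostic illustration of the leaderboard point above, many ranking systems boil down to nudging ratings after each match. This sketch assumes a simple Elo-style update (real platforms often use TrueSkill or similar):

```python
def elo_update(rating_a, rating_b, score_a, k=32):
    """Return updated (rating_a, rating_b) after one match.

    score_a is 1.0 if agent A won, 0.5 for a draw, 0.0 if A lost.
    """
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    change = k * (score_a - expected_a)
    return rating_a + change, rating_b - change

# Example: a 1500-rated agent upsets a 1600-rated agent.
print(elo_update(1500, 1600, 1.0))  # -> roughly (1520.5, 1579.5)
```

The exact rating system matters less than the feedback it gives: users can see their rating move shortly after a change to their agent.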

But how do we make these sources of feedback "good"? There are some common design principles, some of which are listed below:

  1. Having feedback. I put this first because it sounds like common sense, but in practice it is often not done. For example, I've often found it difficult to determine which version of my submitted agent was running in a competition. The only way I could check was either 1. downloading the file I uploaded some time ago, or 2. checking the id hidden internally in the JavaScript code of the website. A lack of feedback will often leave users confused about what exactly happened, which leads to them asking lots of questions, wasting time redoing something that may have already worked, or not realizing that some action did not succeed. (A sketch of one way to surface this kind of confirmation follows this list.)

  2. Timely feedback. Feedback that is slow or not immediate will lead to a whole host of problems. One such problem is that slow feedback can frustrate users and cause them to give up. Another is that untimely feedback can waste resources, as users expend extra effort finding other ways to get timely feedback. Timely feedback is crucial for bridging the gulf of evaluation and for giving users a smooth development flow.

  3. Informative feedback. Poor feedback like messy charts, unlabeled numbers in a visualizer, or confusing error messages does nothing to help the user. Poor feedback serves more as clutter and can lead to a constant loop of FAQs and questions from users that wouldn't be necessary if the competition were designed well. It's always worth spending a bit of extra effort to add appropriate labels, hide uninformative clutter, and show only the relevant feedback.
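
As a hypothetical sketch of the "which version is running?" confirmation mentioned in the first point, the submission handler could hash the uploaded archive and echo back a short version id that also appears next to the agent on the leaderboard:

```python
import hashlib
from pathlib import Path

def confirm_submission(archive_path: str) -> str:
    """Hash the uploaded archive and report a short version id to the user.

    Displaying the same id on the leaderboard lets competitors verify
    exactly which version of their agent is currently running.
    """
    digest = hashlib.sha256(Path(archive_path).read_bytes()).hexdigest()[:12]
    print(f"Received {archive_path}, version id {digest}")
    return digest
```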

Notes

While writing this, I was surprised to find that many concepts from DOET apply to AI programming competitions. While DOET uses examples that are mostly far removed from something like an AI programming competition, the concepts and ideas are widely applicable.

Also, this document is a WIP. More to be added some time! But the content here right now should be a decent guideline.