-
Notifications
You must be signed in to change notification settings - Fork 9
/
BGG-Analysis-Part-3.html
164 lines (148 loc) · 17.4 KB
/
BGG-Analysis-Part-3.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="refresh" content="0; URL='https://dvatvani.com/blog/bgg-analysis-part-3'" />
<link href='//fonts.googleapis.com/css?family=Source+Sans+Pro:300,400,700,400italic' rel='stylesheet' type='text/css'>
<link rel="stylesheet" type="text/css" href="/theme/stylesheet/style.min.css">
<link rel="stylesheet" type="text/css" href="/theme/stylesheet/pygments.min.css">
<link rel="stylesheet" type="text/css" href="/theme/stylesheet/font-awesome.min.css">
<link href="https://dvatvani.github.io/feeds/all.atom.xml" type="application/atom+xml" rel="alternate" title="Dinesh Vatvani Atom">
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="robots" content="" />
<meta name="author" content="Dinesh Vatvani" />
<meta name="description" content="Using the correlation of game ratings at an individual user-level, we can find games that share a common core appeal and use this to create a map of the board games landscape" />
<meta name="keywords" content="Board games">
<meta property="og:site_name" content="Dinesh Vatvani"/>
<meta property="og:title" content="An analysis of board games: Part III - Mapping the board game landscape"/>
<meta property="og:description" content="Using the correlation of game ratings at an individual user-level, we can find games that share a common core appeal and use this to create a map of the board games landscape"/>
<meta property="og:locale" content="en_US"/>
<meta property="og:url" content="https://dvatvani.github.io/BGG-Analysis-Part-3.html"/>
<meta property="og:type" content="article"/>
<meta property="article:published_time" content="2020-09-04 00:10:00+01:00"/>
<meta property="article:modified_time" content="2020-09-04 00:10:00+01:00"/>
<meta property="article:author" content="https://dvatvani.github.io/author/dinesh-vatvani.html">
<meta property="article:section" content="Analysis"/>
<meta property="article:tag" content="Board games"/>
<meta property="og:image" content="/static/Logo/radial_barh.png">
<title>Dinesh Vatvani – An analysis of board games: Part III - Mapping the board game landscape</title>
</head>
<body>
<aside>
<div>
<a href="https://dvatvani.github.io">
<img src="/static/Logo/radial_barh.png" alt="Dinesh Vatvani" title="Dinesh Vatvani">
</a>
<h1><a href="/">Dinesh Vatvani</a></h1>
<p>A Python and data analysis blog</p>
<nav>
<ul class="list">
<li><a href="/pages/about.html#about">About</a></li>
</ul>
</nav>
<ul class="social">
<li><a class="sc-github-square" href="https://github.com/dvatvani" target="_blank"><i class="fa fa-github-square"></i></a></li>
<li><a class="sc-twitter-square" href="https://twitter.com/d_vatvani" target="_blank"><i class="fa fa-twitter-square"></i></a></li>
<li><a class="sc-rss-square" href="http://dvatvani.github.io/feeds/all.atom.xml" target="_blank"><i class="fa fa-rss-square"></i></a></li>
</ul>
</div>
</aside>
<main>
<nav>
<a href="/">Home</a>
<a href="/archives.html">Archives</a>
<a href="/categories.html">Categories</a>
<a href="/tags.html">Tags</a>
<a href="https://dvatvani.github.io/feeds/all.atom.xml">Atom</a>
</nav>
<article>
<header>
<h1 id="BGG-Analysis-Part-3">An analysis of board games: Part <span class="caps">III</span> - Mapping the board game landscape</h1>
<p>Posted on Fri 04 September 2020 in <a href="/category/analysis.html">Analysis</a></p>
</header>
<div>
<p>This is part <span class="caps">III</span> in my series on analysing BoardGameGeek data. Other parts can be found here:</p>
<ul>
<li><a href="./BGG-Analysis-Part-1.html">Part I: Introduction and general trends</a></li>
<li><a href="./BGG-Analysis-Part-2.html">Part <span class="caps">II</span>: Complexity bias in <span class="caps">BGG</span></a></li>
<li>Part <span class="caps">III</span>: Mapping the board game landscape</li>
</ul>
<hr />
<h1>Introduction</h1>
<p>Previous posts in this series cover how we generated a dataset from BoardGameGeek, explored general trends in the tabletop games landscape over time and looked at complexity bias inherent in the <span class="caps">BGG</span> dataset. This post explores a comparison of game ratings at an individual user level to determine which games are similar and use that to create a map of the board games landscape. </p>
<h1>Data collection</h1>
<p>Before we are able to perform any user-level ratings analysis, we need to collect a dataset that contains game ratings at an individual user level since the previous dataset in parts I and <span class="caps">II</span> used an average rating for each game. Extracting individual-account-level information from Board Game Geek is possible using their <span class="caps">XML</span> <span class="caps">API</span>, but is more challenging and time-consuming than extracting game-level aggregates due to some constraints in the <span class="caps">API</span> (e.g. limited to 100 user-level ratings per request). As such, obtaining a comprehensive list of all game ratings by each user for all games in the <span class="caps">BGG</span> database was not considered a viable approach. Instead, the individual user level ratings were obtained for <code>500</code> of the most populer (by Ownership) games on <span class="caps">BGG</span>, with an additional <code>53</code> hand-picked to sample some of the more recent successful titles, including Wingspan, Res Arcana, etc. Those criteria bring down the total number of user-level ratings to be collected considerably, but still amounts to <code>7.5 million</code> individual game ratings at a user level. Those <code>7.5 million</code> user-level game ratings covering <code>553</code> successful games were collected and were found to contain ratings from <code>265,374</code> unique <span class="caps">BGG</span> accounts. </p>
<p>The dataset is currently in a SQLite database. If anyone would like a copy of the data, please let me know.</p>
<h1>User-Driven Similarity</h1>
<p>Having collected individual game ratings per <span class="caps">BGG</span> user, we can take any pair of games, find the users that have provided ratings for both of these games and see how the ratings across games are related. There are a few examples below showing that <span class="caps">BGG</span> users who tend to like <code>Monopoly</code> tend to also like <code>Risk</code>. Similarly, users who like <code>Yahtzee</code> tend to also like <code>UNO</code>. On the other hand, users that like <code>Monopoly</code> aren’t any more likely to enjoy <code>Twilight Struggle</code>, and users liking <code>UNO</code> tells us nothing about their affinity for <code>Scythe</code>. The extent to which ratings of games are correlated indicates how likely it is that users will like one game if they like the other. It’s important to highlight that when we say a user “likes” a game here, we are always talking in relative terms. It means that users that like game A <em>more than average</em> are likely to enjoy game B <em>more than average</em> too if their ratings are positively correlated. </p>
<p><center><img alt="Sample rating correlations" src="https://dvatvani.github.io/static/BGG-analysis/sample_rating_correlations.png" /></center></p>
<p>The correlation of user-level ratings between games can be interpreted as some form of similarity between games. After all, if the users who tend to like one tend to also like the other, there will presumably be something similar between the games. However, the similarity between the games may not be obvious based on a traditional board games classification taxonomy. What the correlation captures is essentially games that “scratch the same itch” or tap into a similar core appeal. This could be the feeling of solving an abstract puzzle, a thematic appeal, the social-component, the rewarding feeling of building an elegant engine, the feeling of cooperating with friends, or any other. The games may have very different mechanics, themes, complexity levels or even overall average ratings, but likely tap into a similar core appeal, and that core appeal will resonate with some groups of <span class="caps">BGG</span> users more than others.</p>
<h1>Scaling up the comparisons</h1>
<p>Now that we’ve introduced the concept of game similarity based on user-rating correlations, we can calculate the pairwise correlations for all <code>152,628</code> unique pairs of games in our dataset. Despite the user-level correlation approach to assessing how similar games are knowing nothing about the games’ type, genre, mechanics, complexity level, rating, designers, or anything tangible about the game, the similarity approach is able to identify that remakes or alternate versions of the same game are very similar (e.g. <code>Codenames</code>, <code>Codenames: Pictures</code> and <code>Codenames: Duet</code>, or <code>Brass: Lancashire</code> and <code>Brass: Birmingham</code>). This approach also finds, rather reassuringly, that games that we would intuitively class as being broadly similar tend to have high user-level rating correlations as well. For example, <code>One Night Ultimate Warewolf</code>, <code>Secret Hitler</code>, <code>Coup</code>, and <code>The Resistance</code> are all light party games based on communication and deception. They all end up with high correlations with one another. Similarly, word-games like <code>Boggle</code>, <code>Scrabble</code>, <code>Taboo</code>, <code>Scattergories</code>, <code>Pictionary</code>, <code>Bananagrams</code> also group together in the same way. Another example is the “Easy to learn. Hard to master” strategy cluster of <code>Chess</code>, <code>Go</code> and <code>Diplomacy</code>. These correlations and their general alignment with games that we’d intuitively consider similar allows us to build a simple recommendation system that displays the most similar games to any other game (refer to Dashboard below for an implementation of this)</p>
<h1>Mapping the board game landscape</h1>
<p>The full grid of <code>152,628</code> game similarities is non-trivial to visualise in its native form. To accurately display the similarities between all games in that matrix as distances between points, we would need a 552-dimensional (N-1) graph. Obviously, that’s not really a tractable solution. Fortunately for us, there are machine learning techniques that provide us with an adequate solution to this problem. A technique known as t-Distributed Stochastic Neighbour Embedding (commonly abbreviated as t-<span class="caps">SNE</span>) allows us to create a lower-dimensionality projection, or more correctly, a manifold, of the 553 x 553 correlation matrix that attempts to keep points that are close together in the high-dimensionality space close together in the reduced-dimensionality space too. What this means is that we can obtain a set of points in 2D that best preserves adjacency between points close together in high-dimensional space, therefore keeping similar games together. Below is an interactive visualision of the results using this approach. You can hover over any point to get more information on it and a list of its most similar games. There is also an interactive dashboard (see next section) where you can search for individual games to highlight them in the plot.</p>
<script src="https://dvatvani.github.io/static/BGG-analysis/test.js" id="5aba6f58-365f-4605-977d-ac1ca5aee322"></script>
<p>We can see that the games that we mentioned as being similar above are close to each other in this visualisation. This visualisation also shows that games percieved to be good “Gateway games” such as <code>Catan</code>, <code>Carcassone</code> and <code>Ticket to Ride</code> are also in close proximity to each other (bottom of the light blue group), despite not having many common themes or mechanics between them. Similarly, many pre-1960s traditional family games such as <code>Monopoly</code>, <code>Risk</code>, <code>Battleship</code> or <code>Clue</code> cluster together as well (bottom of the red group). Navigating the plot reveals several groups of games that are intuitively grouped together e.g. Economic Games, Visual party games, communication-based party games, hidden information card games. Interestingly, I found 2 game designers whose games tend to cluster together: Vlaada Chvátil (near the top right of the orange area) and Reiner Knizia (top left of dark blue area). It’s also interesting that in both of these cases, despite there being a very distinct cluster for their games, they each have games that do not belong in their own cluster e.g. <code>Codenames</code> does not appear to belong with the other Vlaada Chvátil games. Similarly, <code>The Quest for El Dorado</code> does not belong with the other Reiner Knizia games. There are many other interesting features in the plot, but they are best left for the readers to explore and discover.</p>
<h1>Interactive Dashboard</h1>
<p>I’ve built a basic interactive dashboard with more control over the visualisation of the <span class="caps">BG</span> landscape seen above, as well as a basic recommendation system that lists the most similar games to any game of interest. It can be found using the link below:</p>
<p><center><a href="https://bgg-similarity-dashboard.herokuapp.com/">Link to dashboard</a></center></p>
<p><center><a href="https://bgg-similarity-dashboard.herokuapp.com/"><img alt="Dashboard Image" src="https://dvatvani.github.io/static/BGG-analysis/dashboard_image.png" /></a></center></p>
<h1>Closing remarks</h1>
<p>I hope that the framework presented here helps nudge the discussion around tabletop games and their classification towards a consideration of the games’ core appeal rather than a classification based on some of the games’ trappings and mechanics e.g. “Wargame”, “Thematic game”, “Hex and Counter game”, etc. This analysis also had a useful byproduct of allowing us to create a rudimentary game recommender system based on user-level ratings correlations (recommender available in the dashboard) that will hopefully be useful to some people, despite the limited scope of <code>553</code> games.</p>
<hr />
<p><em>Thanks to</em>:</p>
<ul>
<li><em><a href="https://twitter.com/elizhargrave">Elizabeth Hargrave</a>: Elizabeth Hargrave suggested that it might be interesting to do a gender-level analysis on the <span class="caps">BGG</span> dataset following my previous analysis on board games. That motivated the collection of a user-rating-level dataset, which eventually sparked this idea.</em></li>
<li><em><a href="https://twitter.com/ColmSeeley">Colm Seeley</a> for introducing me to the world of modern board games, countless discussions and ideas on interesting things to do with the dataset, and for helping me identify and name many of the clusters in the mapped board game landscape.</em></li>
<li><em><a href="https://twitter.com/PresstoFan">Yihui Fan</a> for suggesting some interesting neural-network-based analysis ideas that could be performed on this data.</em></li>
</ul>
</div>
<div class="tag-cloud">
<p>
<a href="/tag/board-games.html">Board games</a>
</p>
</div>
<div id="disqus_thread"></div>
<script type="text/javascript">
var disqus_shortname = 'dvatvani';
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript" rel="nofollow">comments powered by Disqus.</a></noscript>
</article>
<footer>
<p>© Dinesh Vatvani </p>
<p>Built using <a href="http://getpelican.com" target="_blank">Pelican</a> - <a href="https://github.com/alexandrevicenzi/flex" target="_blank">Flex</a> theme by <a href="http://alexandrevicenzi.com" target="_blank">Alexandre Vicenzi</a></p> </footer>
</main>
<!-- Google Analytics -->
<script type="text/javascript">
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-75651339-1', 'auto');
ga('send', 'pageview');
</script>
<!-- End Google Analytics -->
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"name": "An analysis of board games: Part III - Mapping the board game landscape",
"headline": "An analysis of board games: Part III - Mapping the board game landscape",
"datePublished": "2020-09-04 00:10:00+01:00",
"dateModified": "2020-09-04 00:10:00+01:00",
"author": {
"@type": "Person",
"name": "Dinesh Vatvani",
"url": "https://dvatvani.github.io/author/dinesh-vatvani.html"
},
"image": "/static/Logo/radial_barh.png",
"url": "https://dvatvani.github.io/BGG-Analysis-Part-3.html",
"description": "Using the correlation of game ratings at an individual user-level, we can find games that share a common core appeal and use this to create a map of the board games landscape"
}
</script></body>
</html>