Before reading, open this link in another window and turn up the volume. Ok, we can go on.
If I really have to explain to you what is about this mass phenomenon, you probably have been trapped in a mayo jar for weeks. 25 million users only in USA and counting, and it will be probably reminded as a foundational milestone of a subgenre, just as Angry Birds.
In any case, the game shows the user a real map with a series of points, called Pokéstops, where he has to physically move to get objects or capturing Pokémon. Moving. Travelling. Walking. A concept strangely similar to exercising. For gamers. Astonishing.
Also, Pokémon can appear in other unexpected points, and there’s the possibility of meeting in selected places as gyms with other players to fight our Pokémon against each other.
Pokémon Go’s infrastructure is cloud-based, so it seemed a funny exercise explaining how to deploy an architecture to support the game requirements and this number of simultaneous users. Niantic (the company behind the product) utilises Google platform, but I have more expertise with AWS, so I will choose this one for the exercise.
Chapter I: Pokémon Go map
Obtaining geographical information
We are going to work on real world, so we will need cartography data. We can obtain them in OSM project.
The file includes a widespread representation of our home planet in graph shape: a series of nodes (points over a surface) which, in occasions, we can gather forming paths and that are classified by tags.
We have to determine Pokéstops. In order to do that, we will analyse the downloaded file. Depending on the format we use, we will need to process up to 1TB of data, so we better will make a concurrent execution of this part from the very beginning. The goal is to look over the data locating the so-called POI (points of interest) through the associated tags and save their positions as Pokéstops.
The most efficient way to accomplish this is uploading the file to S3: it is an especially reliable storage service. If you upload 10 million files, the average file loss 10.000 years later would be only of 1 file. On average. And you don’t have 10.000 years.
Once we have uploaded all the data, we will code a little MapReduce to read each element detecting which tags are interesting and to emit its coordinates as result.
EMR is a service that provides a managed Hadoop cluster. Using spot instances we will run our little program simultaneously in a great number of machines. For example, now we can launch 40 m3.xlarge with an 80% discount in Virginia region and ask EMR to redistribute the workload, retry mistakes, replace defective nodes, etc. We should have completed the work in minutes.
Supporting a hundred thousand queries per second
Where do we save Pokéstops? Even if we consider the whole planet, the information volume is not especially high, it could be managed by any given traditional database. But there is a critical feature: throughput, say, the number of simultaneous queries we could deal with. It can overtake a hundred thousand of reads per minute. Our system scalability is key, so we will discard a relational database.
We will choose DynamoDB. It is a managed database, so it is not necessary to be concerned about maintenance or high availability. Its key feature is that it can be as fast as you need it to be, only depending on how much do you want to spend. In exchange, we no longer will be able to think in classic relational terms, but this is not a problem in this case.
When modelling data with DynamoDB, you have to consider that on the queries you always need to indicate a partition key, meaning this that in order to retrieve information you have to use as a minimum an equality condition over a field. If you do that, it will be very efficient to retrieve all the data that includes the same value. We have to be very smart to achieve that all the nearby Pokéstops could be accessed in the lowest number possible of operations. In order to do that, you will need a little knowledge on coordinates systems.
We are generally used to longitude and latitude to indicate a place in our world, and most of us never dedicated too much time and effort to wonder what those numbers indicate. We have a vague notion of latitude is zero in the equator and +90 or -90 degrees just as you get close to the poles, while longitude takes an arbitrary line that crosses Greenwich as inception, from -180 to +180. If we want to create a Pokémon Go clone, we will need a little more knowledge about this.
In particular, it’s crystal clear it is mandatory to send the user only the correspondent planet zone. A practical form to solve this is dividing our planet in chunks of more or less a square kilometre. We will determinate which chunks the user need in order to visualise its map depending on the position he is located, and we will only send this information.
It is very tricky to achieve this with lat/lon: a longitude degree does not even represent the same distance lengthwise the planet; the nearer to the poles, the closer the meridians.
There are many different ways to get coordinates (different projections of the planet) and it is easy to convert from one to another. A very popular one is UTM, and utilising it makes really easy to name each of the chunks we spoke lately.
Say, the coordinates in WGS84 of Plaza Catalunya at Barcelona are “41.3866, 2.17”, when converted to UTM are “31T 4582014 430605”. A square kilometre can be identified as “31T 4582 430” just eliminating digits at the end of the number.
So, our DynamoDB table would look like this:
When the user is in the square, we can retrieve all the Pokéstops by indicating its square kilometre (and maybe some adjacent squares if he is at the border). Efficient. Next topic.
We also have to know which Pokémon and objects are on them, obviously. The ideal way to accomplish this would be adding this data in each stop within the own registry (item in DynamoDB slang), but this means that every now and again we should bring them up to date. This operation is expensive in this kind of database, so we will aim for another more sophisticated alternative.
Do you see the type attribute we added? If you have ever played to Dungeons & Dragons you will remember the event tables: interesting happenings that can happen when you reach a given point. In the RPG –role playing game, just in case- you had to throw a dice, but in this case, we want all the players to encounter with the same creature. To explain better this point here you have the first chunk of a table of this kind:
Thus, if we toss the dice and the result is 4, then we have found a hyperpotion. But we need to be more deterministic, and in order to achieve it, we can leverage that your smartphone inner clock is synchronised with the servers in most cases.
Let’s say now it is 11:36:48 of 8/4/2016. Let’s reduce the precision to the time (just to avoid changes one second at a time), so we get 11:00. The corresponding epoch time is 1470308400 and the MD5 of this number is ca1955b07f828f4727d27994100b41a0.
We will save the last number, which is 0, as much in decimal as hexadecimal. Let’s look to our alleged event table and we will see a Growlithe appears.
I personally would fill the whole table of Growlithes, in order to get the same Pokémon always: it’s not enough. Generally, if you want there more Pokémon of the same type you only have to replicate the same value in different slots of the event table.
Obviously, when a user interacts of any which way (for instance, capturing our Grow) we will have to check the coherence between the sent data and the Pokéstop state in that precise moment, or we will opening the field to the cheaters, hackers, know-it-alls and funny people all over the internet. You are included in the last group, so do not delude yourself.
¡Yay! Map thing is close to being up and running, except for the part related to drawing them. My dear friends, we will outsource it: fortunately for ourselves, osmbuildings is able to draw 3D maps based on the same dataset we utilised to generate Pokéstops, so we can put away the work.
Nothing especially new in here: the user uses R53 to retrieve the address of an ELB that distributes the petitions between the managed fleet by an autoscaling group, which looks after the EC2 instances, responsible for access API implementation attacking our DynamoDB.
On the next episode (link en inglés) we will see how we can manage the immense data flow an app like this can generate. Don’t miss it!