
MySQL/PHP on Amazon AWS (or alternative)

7 comments, last by Blixt Gordon 8 years, 4 months ago

I store match and player data in a MySQL database, with PHP scripts that handle the POST requests. My iOS game app reports match and player data after each match, and retrieves match and player data when statistics are shown in some game scenes. I have about 6 different POST commands, and each POST is about 20 bytes in each direction, so it's quite a limited amount of data. On average, during game play, I guess that each app will make about 1 POST every 10 seconds.
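
To give an idea of what those scripts look like, here's a minimal sketch of one such endpoint. The table and column names (matches, player_id, score) and the connection details are placeholders, not my real schema:

```php
<?php
// report_match.php -- sketch of a POST endpoint that records one match result.
// Table/column names and credentials below are placeholders.

$pdo = new PDO(
    'mysql:host=localhost;dbname=game;charset=utf8mb4',
    'game_user',
    'secret',
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);

// Validate the few fields the app POSTs; reject anything malformed.
$playerId = filter_input(INPUT_POST, 'player_id', FILTER_VALIDATE_INT);
$score    = filter_input(INPUT_POST, 'score', FILTER_VALIDATE_INT);
if (!is_int($playerId) || !is_int($score)) {
    http_response_code(400);
    exit('bad request');
}

// Prepared statement: avoids SQL injection and keeps each request small.
$stmt = $pdo->prepare(
    'INSERT INTO matches (player_id, score, played_at) VALUES (?, ?, NOW())'
);
$stmt->execute([$playerId, $score]);

echo 'ok';
```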

I will try to release a game within a couple of months. My current service provider is one.com, but I don't think I can increase the server capacity at one.com if, and that's a big if, my app were to be a success. Therefore my plan is to move to AWS before the app is released. I've studied AWS today, but I haven't found any pre-packaged MySQL/PHP solution. I have seen comments about setting up such a solution, though, so it seems to be possible.

I'm inexperienced in database systems, and have some questions. I want to do it right from the start:

  • Is AWS suited for a MySQL/PHP solution, or should I use another setup for my database functionality with a POST interface?
  • Are there service providers with a more integrated MySQL/PHP solution (like the one I use at one.com) that is scalable? Any suggestions?
  • Any other input is much appreciated.

I don't want to manage the hardware directly, but I do want scalable higher-layer functionality.


I guess that each app will make about 1 POST every 10 seconds.


That's almost the worst interval for mobile games :-) It's long enough that the phone MAY decide to disconnect from the towers, which means re-spinning up the connection each time you POST. (Spinning up the connection is even worse battery-wise and latency-wise than just keeping it on.)

It may be that 10 seconds is enough to not time out the connection and keep it online, though -- I haven't looked into this in a long while.

Is AWS suited for a MySQL/PHP solution


Yes, many apps do this. There are a number of tutorials and images you can play with:
tutorial about LAMP on AWS

Are there service providers with a more integrated MySQL/PHP solution (like the one I use at one.com) that is scalable?


Scalability has more to do with your application than the services. That being said, if you want really scalable databases, go with Google App Engine and their BigTable database back-end, or use SimpleDB on Amazon AWS instead of a relational database.
enum Bool { True, False, FileNotFound };

I've said this before, but a big thank you, hplus! Most valuable.

Scalability has more to do with your application than the services.

If you have time, do you mind elaborating on this?

  • How shall I design my application for scalability?
  • And what do you mean by "application", for that matter? Do you mean the whole iOS app/PHP/MySQL system, or something else?

I read your linked tutorial now and realised that it is based on EC2. I guess that means it's hardly scalable at all for a MySQL application. Am I correct? Maybe I'm missing something?

My understanding: in the LAMP solution I will run the MySQL DB and the PHP scripts on a single virtual machine, and if it gets overloaded I'm in deep shit. Am I correct? Or maybe that EC2 VM is very scalable in itself, or maybe multiple VMs can interact with the same MySQL DB?

I really suck at web servers and databases; I'm an embedded programmer. I've got no idea how many POSTs I will be able to handle per second.

How shall I design my application for scalability?
And what do you mean by "application", for that matter? Do you mean the whole iOS app/PHP/MySQL system, or something else?


How to build scalable back-ends, 101:

A typical web scalability stack looks like:

1) Load Balancer (typically, a pair, for redundancy) -- Amazon Elastic Load Balancer can do this for you

2) Application Servers -- these are stateless, and just take web requests (POST, GET, or whatever) and talk to a storage back-end for persistence. These scale very easily, as you just add more of them, running the same code, and the Load Balancers will spread incoming requests to available servers.

3) Storage Back-end -- these are database servers. Here, the data structure of your application matters! That being said, you can use the Amazon RDS service to run MySQL, hosted by Amazon, with a pay-for-performance price list that lets you go pretty high in performance. It's unlikely that your particular game will need more than what you can get out of that -- and if it does, you'll hopefully have enough success and revenue to throw engineers at the problem :-)
Also, you'll often want some kind of in-RAM caching as part of your storage back-end -- Memcached, Redis, or similar.
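
As a concrete illustration of (2) and (3), a stateless request handler in PHP might look roughly like the sketch below; the Memcached and MySQL hostnames, credentials, and the player_stats table are made up for the example:

```php
<?php
// Read-through cache sketch: check Memcached first, fall back to MySQL (e.g. RDS),
// then populate the cache. Hostnames, credentials, and table names are placeholders.

$cache = new Memcached();
$cache->addServer('cache.example.internal', 11211);

$pdo = new PDO(
    'mysql:host=mysql.example.internal;dbname=game;charset=utf8mb4',
    'game_user',
    'secret',
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);

function getPlayerStats(int $playerId, Memcached $cache, PDO $pdo): array
{
    $key = "player_stats:$playerId";

    $cached = $cache->get($key);
    if ($cached !== false) {
        return $cached;                                  // cache hit, no DB round trip
    }

    // Cache miss: read from the database...
    $stmt = $pdo->prepare('SELECT wins, losses, rating FROM player_stats WHERE player_id = ?');
    $stmt->execute([$playerId]);
    $stats = $stmt->fetch(PDO::FETCH_ASSOC) ?: ['wins' => 0, 'losses' => 0, 'rating' => 0];

    // ...and keep it in RAM for a minute so repeated reads don't hit MySQL.
    $cache->set($key, $stats, 60);
    return $stats;
}

$stats = getPlayerStats(1234567, $cache, $pdo);
```

Because the handler keeps no state of its own (everything lives in Memcached or MySQL), you can run as many copies of it behind the load balancer as you need.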


Now, that being said, how you structure the data in the database matters. If you do "JOIN" between two tables, then those tables need to live on the same database server. Thus, if you do a "JOIN" between two separate players, then all players need to live on a single database server for that to work. You want to spread the data such that it doesn't need JOINs most of the time.
For example, a player, their login history, their inventory, and their stats can all live on a particular server for that player. You then allocate players to database servers in some even fashion. For example, players with player ID 1,000,000 - 1,999,999 go to server 1, players with player ID 2,000,000 - 2,999,999 go to server 2, and so on. (Don't allocate new player IDs sequentially; instead, hand out IDs from whichever database server is least loaded right now, so new players land on the lowest-loaded server.) Each such database is called a "shard," and the concept is called "horizontal sharding."
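
A sketch of that routing, with made-up ranges, hostnames, and credentials:

```php
<?php
// Horizontal sharding sketch: route a player ID to the database server that owns
// its ID range. The ranges, hostnames, and credentials are illustrative only.

const SHARD_MAP = [
    // [first ID, last ID, DSN of the shard that owns the range]
    [1000000, 1999999, 'mysql:host=shard1.example.internal;dbname=game'],
    [2000000, 2999999, 'mysql:host=shard2.example.internal;dbname=game'],
];

function shardFor(int $playerId): PDO
{
    foreach (SHARD_MAP as [$first, $last, $dsn]) {
        if ($playerId >= $first && $playerId <= $last) {
            return new PDO($dsn, 'game_user', 'secret',
                [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
        }
    }
    throw new RuntimeException("No shard configured for player $playerId");
}

// All of this player's tables (inventory, stats, login history) live on the same
// shard, so queries that JOIN them never have to cross servers.
$db = shardFor(1234567);
```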

Then, when you have operations that absolutely need transactions across tables, put only that data on a separate server. For example, you may have a trade system. To implement a trade, objects need to be transactionally moved to the trade system (typically using a transfer queue on each player shard) where the trade is actually settled. Allocating IDs, auditing things in order, and accounting for failure along each step are important to avoid item duplication bugs or lost items. Similarly, high scores are typically put on a system of their own, where all player scores can be sorted. Because that system doesn't deal with other player data (login/password/inventory/etc.), it will scale further before it runs out of capacity.
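
For the high-score case, a sketch of what that separate system might look like (hostname, credentials, and table name are made up; assumes player_id is the table's primary key):

```php
<?php
// High-score sketch: scores live in their own database, so sorting across all
// players never touches the per-player shards. All names below are placeholders.

$scoresDb = new PDO(
    'mysql:host=scores.example.internal;dbname=scores;charset=utf8mb4',
    'game_user',
    'secret',
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);

// Record a score, keeping only the player's best one.
$stmt = $scoresDb->prepare(
    'INSERT INTO high_scores (player_id, score) VALUES (?, ?)
     ON DUPLICATE KEY UPDATE score = GREATEST(score, VALUES(score))'
);
$stmt->execute([1234567, 9000]);

// The top-10 query only ever runs against this one database.
$top = $scoresDb
    ->query('SELECT player_id, score FROM high_scores ORDER BY score DESC LIMIT 10')
    ->fetchAll(PDO::FETCH_ASSOC);
```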


However: I really think you'll do fine on Amazon AWS with load balancing, some number of application servers (start with 2, so if one dies, there's another one still running,) and the Amazon RDS for MySQL. You can go very, very, far on that setup, unless your game is crazy and writes the player state to the database every second or somesuch.


Separately, real-time simulations need another structure (crossbar, shared bus, or similar) and player-to-player chat also needs another structure, because they don't scale well in the "stateless app server" model -- because they aren't stateless. If your server needs simulation, you'll need to take another tack for that. But it doesn't sound like that's what you're doing.


The other good news is that you can start with a single server instance, running both app server and database, without load balancing. Store the database back-end on the Elastic Block Store (so it doesn't go away when the instance dies,) and if your game takes off, you can move the data and code to a larger number of servers.
enum Bool { True, False, FileNotFound };

hplus:

In case you have time...

Just want to check before I change back. I got the impression that it was good to split up databases so that each database can run on a separate server, but maybe I misunderstood you? The guys answering my question on Stack Overflow don't seem very enthusiastic :). Maybe I should wait with the split until it's actually necessary?

http://stackoverflow.com/questions/35441330/one-table-per-database

If you ever get to Gothenburg, make a post in this thread and I'll buy you a couple of beers. I really appreciate your help.

For sharding of databases, it depends on how much data you've got and how you are using it.

You're talking about scalable, but you're also asking questions about a system that is unlikely to grow that big. Scalable stateless servers are not too difficult, but scaling databases through sharding is an architectural effort. I wouldn't recommend it unless you are realistically going to bump against the limits of the database system.

We recently split a server of data into shards when we were approaching the blob count limit. We partitioned the data into 100 buckets (based on the final two digits of the record key) and then assigned the buckets to seven shard machines, plus another 'master shard' machine for some values that cannot easily be split but don't add much load. We built it so we can add or modify machine mappings by temporarily locking one bucket holding 1% of the values, copying that 1% of the data to a new database, and then updating the master shard to indicate the data is now on the other server.
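
Roughly, the bucket mapping works like the sketch below. It's simplified: in practice the mapping lives on the master shard, and the hostnames here are made up.

```php
<?php
// Bucket-based sharding sketch: the final two digits of the record key pick one of
// 100 buckets, and a mapping says which shard machine owns each bucket.

$bucketToShard = [];
for ($bucket = 0; $bucket < 100; $bucket++) {
    // Spread the 100 buckets across seven shard hosts (placeholder hostnames).
    $bucketToShard[$bucket] = 'shard' . (($bucket % 7) + 1) . '.example.internal';
}

function bucketFor(int $recordKey): int
{
    return $recordKey % 100;   // the final two digits of the record key
}

$shardHost = $bucketToShard[bucketFor(4711)];

// Rebalancing = temporarily lock one bucket (about 1% of the data), copy it to a
// new machine, then point that single map entry at the new host.
```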

For us it is not one table per database. There are several tables on each database, but all the tables are related to a common record table. There is also a small number of tables on the master shard with data to coordinate who is where, such as the master list of all IDs to help with synchronization and locking and ensuring unique IDs remain unique across all shards.

There is a lot of logic involved in sharding a database, but fortunately there is also a lot of good documentation online about the risks and common patterns. Actions like adding records gain more steps. As one of many examples, instead of just creating a record and inserting into the necessary tables as a single transaction, you need to create a record marked inactive on the master shard (which generates the new record ID), create the record and all its data on the proper shard, and then update the master shard to mark the record as active. On any failure you roll back the entire set of transactions, and you also need mechanisms in place to resume/repair in case of connections failing or machines going down.
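
In code, those steps look roughly like this sketch. $master and $shard are PDO connections to the master shard and to the data shard; the table and column names are placeholders:

```php
<?php
// Sketch of the "create a record across shards" steps described above.
// Table and column names are placeholders, not our real schema.

function createRecord(PDO $master, PDO $shard, array $data): int
{
    // 1) Reserve an ID on the master, marked inactive.
    $master->prepare('INSERT INTO records (active) VALUES (0)')->execute();
    $recordId = (int)$master->lastInsertId();

    try {
        // 2) Create the record and its data on the shard that owns this ID.
        $shard->beginTransaction();
        $stmt = $shard->prepare('INSERT INTO record_data (record_id, payload) VALUES (?, ?)');
        $stmt->execute([$recordId, $data['payload']]);
        $shard->commit();

        // 3) Flip the master entry to active only once the shard write succeeded.
        $master->prepare('UPDATE records SET active = 1 WHERE id = ?')->execute([$recordId]);
    } catch (Throwable $e) {
        // 4) On any failure, undo what we can; a repair job can later clean up
        //    inactive master rows whose shard write never completed.
        if ($shard->inTransaction()) {
            $shard->rollBack();
        }
        $master->prepare('DELETE FROM records WHERE id = ? AND active = 0')->execute([$recordId]);
        throw $e;
    }
    return $recordId;
}
```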

Designing a small system that fits in a single database is definitely easier than designing a large one.

If you ever get to Gothenburg


Be careful what you wish for! My wife inherited 20% of a summer house in Åsa from her grandmother ... :-)

try to split up databases to be able to run each database on a separate server


The best kind of sharding is horizontal sharding: Users with user ID between 1,000,000 and 1,999,999 go to server 1; users with user ID 2,000,000 to 2,999,999 go to server 2, ...
Then allocate user IDs based on which server is the least loaded. And when all current servers have high load at peak times, spin up a few new servers with new ranges of user IDs.

The one problem here is if you do user-to-user trade (or other such two-party economy-based interactions,) because MySQL can't do cross-database transactions with any degree of efficiency. For those cases, you may want to use a centralized "trade" or "master" database that contains the important information about each user, and/or an escrow system. Trade would then be:
- each user marks the tradeable stuff as moving-to-escrow on the user shard
- each user adds the stuff into actual-escrow on the master
- trade happens on the master
- users make appropriate updates on the user shards

If something dies/goes wrong, an item may temporarily be left in "escrow" state on the user shards, so when other code finds such data (or when a user logs in, say,) you'd check with the master on what the outcome of the escrowed operation was, and update the user shards accordingly.
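
A sketch of the first two steps of that flow, with made-up table names and an assumed status column on each item row:

```php
<?php
// Escrow sketch for the cross-shard trade described above. Each item row on a user
// shard has a status column; the master database settles the trade. Table and
// column names are assumptions for illustration.

function offerItemForTrade(PDO $userShard, PDO $master, int $tradeId, int $itemId, int $ownerId): void
{
    // 1) Mark the item as moving-to-escrow on the owner's shard.
    $userShard->prepare(
        "UPDATE items SET status = 'moving_to_escrow', trade_id = ?
         WHERE id = ? AND owner_id = ? AND status = 'owned'"
    )->execute([$tradeId, $itemId, $ownerId]);

    // 2) Record the item as actually in escrow on the master.
    $master->prepare(
        'INSERT INTO escrow (trade_id, item_id, from_user) VALUES (?, ?, ?)'
    )->execute([$tradeId, $itemId, $ownerId]);
}

// 3) The trade itself settles on the master, inside a single transaction there.
// 4) Each user shard is then updated: the item either moves to its new owner or
//    goes back to 'owned' for the old one.
//
// If something dies in between, items stuck in 'moving_to_escrow' are resolved the
// next time they're touched (e.g. on login) by asking the master what the outcome
// of that trade_id was.
```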

Anyway, a system with millions of monthly active users runs fine with per-user shards for inventory, messaging, etc, plus a central "master" shard that contains the user name/password/email/in-game-credit balance. This scales just fine. (We do have some number of read-only replicas for the master database to serve additional read load.)
Step #1 is to actually ship a working game, and at least THINK about sharding, but not necessarily implement it before you need it.
Step #2 is actually updating the system to be more scalable, if and when you actually need it. Chances are, you won't need it, so don't spend time on it until you have to!

Or, if you can afford a slightly higher operating cost per-user, using canned solutions like SimpleDB or BigTable is a pretty easy way of letting someone else solve the problem :-)
enum Bool { True, False, FileNotFound };

Thanks guys. I've got one database with two tables now (again). I understand that it's very unlikely that my hobby project will take off, but I like to learn along the way. I see it as part of my CV, and I like learning new things.

And BTW hplus, my beer offer stands!
