One Million Chessboards

337 points by chunkles 2 months ago

https://eieio.games/blog/one-million-chessboards/

eieio 2 months ago

Ah hello! I made this :)

My blog describing it is pretty sparse, sorry about that. Happy to answer any questions that folks have about the architecture.

Not that it was necessary, but I got really into building this out as a single process that could handle many (10k+/sec) moves for thousands of concurrent clients. I learned a whole lot! And I found golang to be a really good fit for this, since you mostly want to give tons and tons of threads concurrent access to a little bit of shared memory.

panic 2 months ago

If you don’t mind explaining, I’m curious how you test something like this before it goes live. It seems like it would be hard to simulate all the things that could happen at scale.
- eieio 2 months ago
  
  So sometimes I don't test these projects that much but I did this time. Here are a few thoughts:
  My biggest goal was "make sure that my bottleneck is serialization or syscalls for sending to the client." Those are both things I can parallelize really well, so I could (probably) scale my way out of them vertically in a pinch.
  So I tried to pick an architecture that would make that true; I evaluated a ton of different options but eventually did some napkin math and decided that a 64-million uint64 array with a single mutex was probably ok[1].
  To validate that I made a script that spins up ~600 bots, has 100 of them slam 1,000,000 moves through the server as fast as possible, and has the other 500 request lots of reads. This is NOT a perfect simulation of load, but it let me take profiles of my server under a reasonable amount of load and gave me a decent sense of my bottlenecks, whether changes were good for speed, etc.
  I had a plan to move from a single RWMutex to a row-locking approach with 8,000 of them. I didn't want to do this because it's more complicated and I might mess it up. So instead I just measure the number of nanos that I hold my mutex for and send that to a loki instance. This was helpful during testing (at one point my read lock time went up 10x!) but more importantly gave me a plan for what to do if prod was slow - I can look at that metric and only tweak the mutex if it's actually a problem.
  I also took some free wins like using protobufs instead of JSON for websockets. I was worried about connection overhead so I moved to GET polling behind Cloudflare's cache for global resources instead of pushing them over websockets.
  And then I got comfortable with the fact that I might miss something! There are plenty more measurements I could have taken (if there was money on the line I would have measured some things like "number of TCP connections sending 0 moves this server can support" but I was lazy) but...some of the joy of projects like this is the firefighting :). So I was just ready for that.
  Oh and finally I consulted with some very talented systems/performance engineer friends and ran some numbers by them as a sanity check.
  It looks like this was way more work than I needed to do! I think I could comfortable 25x the current load and my server would be ok. But I learned a lot and this should all make the next project faster to make :)
  [1] I originally did my math wrong and modeled the 100x100 snapshots I send to clients as 10,000 reads from main memory instead of 100 copies of 100 uint64s, which lead me down a very different path... I'm not used to thinking about this stuff!
  - phatskat 2 months ago
    
    > To validate that I made a script that spins up ~600 bots
    Funny, when I went there were just over 600 active players and things were running super smoothly, even on my mobile. Kudos!
    Do you see this project and the things you’ve tried applying to other future projects?
    
    eieio 2 months ago
    
    Hah, yes, but for testing I removed all my rate limits so I pushed 1 million moves in 2 or 3 seconds, whereas now I think I rate limit people to like 3 or 4 moves a second (which is beyond what I can achieve on a trackpad going as fast as I can!) so the test isn't quite comparable!
    I definitely learned a lot here. Most of my projects like this are basically just "give the internet access to my computer's memory but with rules." And now I think I've got a really good framework for doing that performantly in golang, which should make the next set of projects like this much quicker to implement.
    I also just...know how to write go now. Which I did not 6 weeks ago. So that's nice.
    
    cheekyfleek 2 months ago
    
    You ain't the only one who's removed the rate limits lol. Some of these queens are clearing a whole board in like 3s, must've written something to keep a piece selected. This is turning into a race to the godliest piece hackathon.
    
    eieio 2 months ago
    
    The rate limits aren't that aggressive and have a decent amount of burst, you can get about 10 moves done in 1 second before you hit them and start getting throttled[1]. And of course you can run multiple clients (I account for this too, but I'm not that aggressive because I don't want to punish many people NAT'd behind a single IP)
    [1] down to something like 3/sec?
    
    cheekyfleek 2 months ago
    
    I figured the multiple people per ip would be an issue, was wondering if that might be at play here. I thought you said it was already at 3-4/s and I doubted it based on some of what I'm seeing. 10/s tracks a little better.
    As to what you should change, I can't say. It's in the wild now lol.
    
    PetahNZ 2 months ago
    
    Yea seems the rate limit should be stricter, I made a quick script to do 1 move (capture) every 150ms, doesn't seem to get throttled.
    I like how generals.io did it where they explicitly made a bot only server/version.
    
    phatskat 2 months ago
    
    Six weeks is pretty quick! Can I ask what editor you use (always curious), and what other languages you have a background in?
    
    eieio 2 months ago
    
    these days I mostly use vscode / cursor, although I still really like vim and use it for languages that I know really well (mostly python these days) and quick edits.
    I spent much of my professional career at Jane Street Capital, which means that I spent a long time just writing OCaml and some bash (and a tiny bit of C). I'm very comfortable with Python, and over the last year I've gotten pretty comfortable with frontend javascript. And now golang!
    I could probably write semi-reasonable java, ruby, or perl if you gave me a few days to brush up on them. And it'd take me a while before I was happy putting C on the internet. Not sure otherwise.
  - sebmellen 2 months ago
    
    Makes sense that you’re a Jane Street alum. Damn cool stuff.
weiliddat 2 months ago

> The frontend optimistically applies all moves you make immediately. It then builds up a dependency graph of the moves you’ve made, and backs them out if it receives a conflicting update before the server acks your move.
The dependency graph is between pieces you’re interacting with? Meaning if you move a queen and are trying to capture a pawn, and there’s potentially a rook that can capture your queen, those 3 are involved in that calculation, and if you moved your queen but the rook also captures your queen at the same time one of them wins? How do you determine that?
- eieio 2 months ago
  Ah yes good question! Here's some context for you. First off, the way moves work:
  (edit: I realized I didn't answer your question. If we receive a captured for a piece we're optimistically tracking that always takes precedence, since once a piece is captured it can't move anymore!)
  * clients send a token with each move * they either receive a cancel or accept for each token, depending on if the move is valid. If they receive an accept, it comes with the sequence number of the move (everything has a seqnum) and the ID of the piece they captured, if applicable * clients receive batches of captures and moves. if a move captured a piece, it's guaranteed that that capture shows up in the same batch as the move
  So when you make a move we:
  * Write down all impacted squares for the move (2 most of the time, 4 if you castle) * Write down its move token * If you moved a piece that is already tracked optimistically from a prior not-yet-acked-or-canceled move, we note that dependency as well
  We maintain this state separate from our ground truth from the server and overlay it on top.
  When we receive a new move, we compare it with our optimistic state. If the move occupies the same square as a piece that we've optimistically moved, we ask "is it possible that we inadvertently captured this piece?" That requires that the piece is of the opposite color and that we made a move that could have captured a piece (for example, if you moved a pawn up that is not a valid capturing move).
  If there's a conflict - if you moved a piece to a square that is now occupied by a piece of the same color, for example - we back out your optimistically applied move. We then look for any moves that depended on it - moves that touch the same squares or share the same move token (because you optimistically moved a piece twice before receiving a response).
  So concretely, imagine you have this state:
  _ _ _ _ K B _ R
  You move the bishop out of the way, and then you castle
  _ _ B _ _ R K _
  Then a piece of the same color moves to where your bishop was! We notice that, revert the bishop move, notice that it used the same square as your castle, and revert that too.
  There's some more bookkeeping here. For example, we also have to track the IDs of the pieces that you moved (if you move a bishop and then we receive another move for the same bishop, that move takes precedence).
  Returning the captured piece ID from the server ack is essential, because we potentially simulate after-the-fact captures (you move a bishop to a square, a rook of the opposite color moves to that square, we decide you probably captured that rook and don't revert your move). We track that and when we receive our ack, compare that state with the ID of the piece we actually captured.
  I think that's most of it? It was a real headache but very satisfying once I got it working.
  - weiliddat 2 months ago
    
    Amazing thanks for the explanation!
    Would be really cool to read about the different designs and consideration and how you arrived at this in your blog post!
weiliddat 2 months ago

> I use a single writer thread, tons of reader threads, and coordinate access to the board with a mutex
On this I found Go to be at the right balance of not having to worry about memory management yet having decent concurrency management primitives and decent performance (memory use is especially impressive). Also did a multiplayer single server Go app with pseudo realtime updates (long polling waiting for updates on related objects).
- eieio 2 months ago
  
  Yes exactly!
  My goal with the board architecture was "just be fast enough that I'm limited by serialization and syscalls for sending back to clients" and go made that really easy to do; I spend a few hundred nanos holding the write lock and ~15k nanos holding the read lock (but obviously I can do that from many readers at once) and that was enough for me.
  I definitely have some qualms with it, but after this experience it's hard to imagine using something else for a backend with this shape.
NooneAtAll3 2 months ago

I hope you'll share the high score after it ends
highest kills by each piece, longest distance travelled (in cells, not in moves)
I've seen promoted-55, I wonder if anyone managed to do promoted-1000
NooneAtAll3 2 months ago

did you reset the board? whites are back to 990k+ kings
TheKraai 2 months ago

I'm inspired to make something with comparable requirements! Do you need some beefy server to handle the load?
- eieio 2 months ago
  
  I'm using a moderately beefy server but I don't think I've hit double-digit CPU usage; this could have run on a much smaller box
sbassi 2 months ago

Where are the instructions or rules?

nomilk 2 months ago

Someone barricaded their king with about 40 rooks, I hopped in with a knight, and they immediately captured me (with their king), then plugged the gap with another rook so I couldn't do it again. That was amusing lol.

mdaniel 2 months ago

> You can move between boards.

Evidently move between boards but not capture between boards :-( It's extra weird because it's not that the movement isn't projected (e.g. queen blue lines all point correctly across board boundaries just the lines always stop at every piece on the other board, regardless of color)

So, I guess as an exercise in scale, well done! As one million chess boards, caveat gamator

eieio 2 months ago

I didn't like the idea of a queen on one board capturing the king on another board as her first move. And then I tried this rule and thought it created really fun counterplay when you're trying to capture a piece someone else is controlling, since you can move to a new board to be safe (and can lay traps this way).
I'm sorry you don't like that decision! But I think that I stand by it.
- re 2 months ago
  
  > I didn't like the idea of a queen on one board capturing the king on another board as her first move
  If you did want to experiment with supporting cross-board captures, an alternate way to address that could be by rotating the board 180° every other row, so that white pieces have other white pieces behind their home rank.
  - eieio 2 months ago
    
    oh this is a pretty fun idea! If I decide to keep this one around for a bit maybe I'll play with doing this after wiping the board
    
    trenchgun 2 months ago
    
    This would be fun!
- CompuHacker 2 months ago
  
  There exists a player, as of today, who can capture across borders with no apparent in-between cross-color moves or delay.
Turneyboy 2 months ago

In particular that means that if you fill the border up to a depth of 2 on one board with pieces, it becomes a theoretically inpenetrable fortress.
For an overdone example see 2024,4219.

pavel_lishin 2 months ago

(Also submitted here: https://news.ycombinator.com/item?id=43822992)

Neat, though I expected every individual board to have "turns" - I didn't expect that I could just pick a random board, liberate the black queen, and have her clean up every single white piece on the board without my "opponent" getting to do anything in return.

eieio 2 months ago

oh huh, I'm not sure why this one made it to the front page and not the link to my site!
Anyway, yeah, I guess I could have gone with turns here but I thought that building a more realtime MMO thing where pieces could cross boards would be a little more interesting and novel. I also didn't feel like a version of this that was turn based would ever complete.
certainly a queen can go wipe out a whole board, but the game tries to place you next to other active players when you join, which hopefully promotes some interesting counterplay to that. And I think playing chess in realtime like this against someone is pretty fun. But I understand why it might not be for everyone!
- chunkles 2 months ago
  
  Individuals here on hacker news tend to gravitate towards blog posts about games rather than links to actual games, generally. Not a hard fast rule, but it's what I've observed over the years.
  - eieio 2 months ago
    
    I have had some success on Hacker News with similar games before[1]! But I'll certainly do a proper writeup of the tech behind this one because I think it's pretty neat.
    [1] https://news.ycombinator.com/item?id=40800869 for example
  - stevage 2 months ago
    
    As I get older, I find I'm often more interested in reading about things than trying the thing myself. I often go straight to the comments about articles rather than even reading them.
- dang 2 months ago
  
  As you wish! We've swapped out the link from https://eieio.games/blog/one-million-chessboards/ to your site. But I also put the blog link in the top text, so hopefully readers will look at both.
  (Oh and I still owe you an email. I haven't forgotten!)
  - eieio 2 months ago
    
    Ah thank you dang! I meant to submit a Show HN for this one with a little context/a few technical details, but someone beat me to submitting the link. So this is perfect :)
    (and thanks, I'm in no rush!!)
- JusticeJuice 2 months ago
  
  I'd love to know what individual piece has moved the furthest from it's starting point.
tantalor 2 months ago

Agreed. It doesn't make sense. The game here is not to play chess, it is to find a board that has no other player and wipe them out as fast as possible.
- krupan 2 months ago
  
  Or it's just to have fun and create stuff. Already someone made board full of rooks and called it Rooklyn, and another person made a board full of queens and called it Queens
- NooneAtAll3 2 months ago
  
  the game here is to have fun :)
  get the most kills, make a cool shape
  or take a piece veeeeery far away from home

NelsonMinar 2 months ago

This game has gotten interesting, for instance people have figured out a single board with pieces filling the outside edges two squares deep is impervious.

I love how we're seeing emergent gameplay. That's the genius in eieio's projects. He's inventing game systems that on the surface seem simple, but at this mass scale they have interesting possibilities that people discover. And they're entirely new and invented, so we have no idea what to expect until the community figures it out.

ryukoposting 2 months ago

Chasing people around is a lot of fun. Very enjoyable, if not for chess reasons.

pmontra 2 months ago

It works well on my Android phone with Firefox. Well done.

moffkalast 2 months ago

Linux and Firefox, the most brutal test of website reliability haha.

sltkr 2 months ago

It crashed for me:

    Uncaught TypeError: Cannot read properties of null (reading 'type')
        at $80b91fc9d2f468ec$export$4abc8fab4139dfcd (index.e2b13a6c.js:1:406898)
        at $c692b767326c99ec$var$PieceHandler.getMoveableSquares (index.e2b13a6c.js:4:4373)
        at index.e2b13a6c.js:4:13557
        at oQ (index.e2b13a6c.js:1:60912)
        at o0 (index.e2b13a6c.js:1:61761)
        at oZ (index.e2b13a6c.js:1:60941)
        at Object.useState (index.e2b13a6c.js:1:72377)
        at Object.q (index.e2b13a6c.js:1:10352)
        at $a2d3bef833187ce9$export$474cd6ee072cf5a4 (index.e2b13a6c.js:4:12305)
        at oF (index.e2b13a6c.js:1:58673)

darepublic 2 months ago

I was only able to move the black pieces. And I was able to move black consecutively on the same board. So I didn't fully understand. Are the rules being enforced or is it just updates.

I enjoyed the sc2 UI when selecting pieces

pests 2 months ago

It shows your color bottom right. It’s not turn based.

cheekyfleek 2 months ago

We definitely have some cheaters, playing as the other color. Thought I saw it last night but I know I saw it today. I think I saw vindictive use of it too, both times. One black fortress was destroyed and next thing you know the nearest white fortress found its pieces moving into the worst possible positions, often next to a waiting black bishop or rook. I'm not sure how exactly you handle the colors, but in a world where I can RDC onto a dozen computers on a half dozen continents in seconds I suppose this was inevitable.

maxmcd 2 months ago

If you missed it, there was a nice story around One Million Checkboxes: https://eieio.games/blog/the-secret-inside-one-million-check...

I wonder if something similar will happen here.

@eieio please open source the Go code, would be fun to poke at.

eieio 2 months ago

I'll certainly open source the code! I just want the flexibility to change my rate limiting logic in the short term to counteract abuse. Happy to answer questions though!
- nodox 2 months ago
  
  Yes please open source. I tried something similar based one your checkboxes game! I never worked with websockets so I’m curious how you designed for scale and stopped spammers. I game was click the button 10M times and of course the script kiddies started immediately which is fun! But not my server keeps getting hammered with requests long after the initial interest. I did not know how to rate limit bots without blocking whole IP ranges.
  - eieio 2 months ago
    
    fwiw I think the biggest single trick there is to group IPV6 addresses at the /48 or /64 level before applying rate limits (you can rate limit IPV4s on a per-ip basis).
    It's kind of annoying and expensive to get a bunch of IPv4s to evade limits, but it's really easy to get a TON of IPv6s.
    The other Big Trick I know is to persist rate limits after a client disconnects so that they can't disconnect -> reconnect to refresh their limits.
kinduff 2 months ago

I love this story so much.

sashank_1509 2 months ago

I spent a good half an hour making an impenetrable fortress for black, hopefully it lasts till the end (it will unless someone in black breaks it)

felipemesquita 2 months ago

I've achieved what seems to be a stable structure outside the edges, https://onemillionchessboards.com/#1347,3624

yayitswei 2 months ago

I predict this will evolve into bot warfare! For example, here's a snippet to advance pawns: https://gist.github.com/yayitswei/442cd5128b2dbfbd95a101b70f...

ChessviaAI 2 months ago

Really impressive engineering work her, especially running it all on a single server with in-memory board state and optimistic rollback.

I work in chess tech, but in a very different direction (structured games, coaching, serious play). It's inspiring to see chess reimagined like this!

tantalor 2 months ago

I saw a black piece moving around very quickly. Is there a secret way to move without clicking twice?

junon 2 months ago

Was it a queen or rook?
- tantalor 2 months ago
  
  Queen I think.

NooneAtAll3 2 months ago

I made a Promoted-32 queen

I won

NooneAtAll3 2 months ago

herding around ~25 queens was also fun
building fortresses (since enemy can't cross board border by capture) is also. fun.
- NooneAtAll3 2 months ago
  
  theoretically there's a design for "indestructible" fortress (rhoburb, you are a genius)
  1) fill corner board with pieces 2) cover the border (from inside) with pawns 3) cover the promotion square on the border with the king
  king can't exit the board, pawns can't walk backward to leave space, filler doesn't allow these 2 to make way
  ---
  the real holy grail that will never be achieved I fear
  - sc970 2 months ago
    
    https://onemillionchessboards.com/#4362,7
    
    casey2 2 months ago
    
    Nice.
    A lot of people are still making flawed fortresses, just popped a black fortress with the help of tor and they were actually around and tried to defend it. So sad. muwahahah
    
    NooneAtAll3 2 months ago
    
    looks like black side also made several
    https://onemillionchessboards.com/#4404,7993
    https://onemillionchessboards.com/#2177,7993
    https://onemillionchessboards.com/#1787,7994
    and one more for white
    https://onemillionchessboards.com/#3765,4
  - chessCheeser 2 months ago
    
    [dead]

GenshoTikamura 2 months ago

It's interesting how the game will end. Will it end up in a deadlock, or in a fierce duel of two power pieces with hundreds of players behind each struggling to gain control concurrently?

GenshoTikamura 2 months ago

After further reflection I guess it will be a fierce duel of bots, not humans
babynicky 2 months ago

[dead]

Mr_Modulo 2 months ago

Do the boards regenerate at some point? If not, it seems like most of the pieces will be depleted after a while.

mtoner23 2 months ago

nope, just like his original site, one million checkboxes https://onemillioncheckboxes.com/

amelius 2 months ago

These days I keep wondering what a 4d or 3d chess board looks like.

logtempo 2 months ago

https://multiversechess.com/
and someone did it!

fredo2025 2 months ago

This was a strange game. I could only move the black pieces, and I could take the opponent’s king. What next?

Also, the skull button seemed to do a lot of damage and shake things up.

cheekyfleek 2 months ago

The bots are alive!

gitroom 2 months ago

insane seeing so many wild ideas happen at once - i keep wondering, stuff like this, what keeps people coming back and building even weirder moves over time you think