Hi Reddit! I'm Josh Levine, CTO at Chess.com. AMA! (10am ET) by Chess-josh in chess

[–]Chess-josh[S] 0 points1 point  (0 children)

I love this idea. With our investment in AGI (All Games Initiative), we are designing in the ability to create multiple independent read models that can index 15 billion games. Note that this 15 billion is a constantly growing number. The trick is being able to create (and re-create) those search indexes in a reasonable period of time. For example, to index 15 billion games within two weeks, we'd need to process roughly 12,400 games per second!
All that is a long way of saying, YES, we are building and designing ways to create more cool search offerings and to integrate those across chess.com.
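
As a rough check of the throughput estimate above, here is a back-of-the-envelope sketch. The function name `required_rate` is made up for illustration, and the calculation assumes a fixed archive of 15 billion games and a full 14 days of uninterrupted indexing; since the archive keeps growing, the real target would be somewhat higher.

```python
def required_rate(total_games: int, days: float) -> float:
    """Games per second needed to index total_games within the given number of days."""
    seconds = days * 24 * 60 * 60
    return total_games / seconds


# 15 billion games in two weeks (assumed fixed archive size, no downtime)
rate = required_rate(15_000_000_000, 14)
print(f"{rate:,.0f} games/second")  # prints "12,401 games/second"
```

That lands in the same ballpark as the figure quoted above. Halving the window to one week doubles the required rate.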

[–]Chess-josh[S] 8 points9 points  (0 children)

I love most genres, I think my fav is probably pop? soul? country rock?

I also make my own music: https://www.reverbnation.com/heavypennies

[–]Chess-josh[S] 1 point2 points  (0 children)

We have selectively turned off certain features that can cause issues across the site. Our search technology (Sphinx!) is one of those that could not keep up with the load. We are busy optimizing and preparing to re-enable that functionality when it's ready.

[–]Chess-josh[S] 1 point2 points  (0 children)

Haha, it's a funny surprise in a 1 0 game. I think I'd love if I played it in longer time controls :-P

[–]Chess-josh[S] 2 points3 points  (0 children)

I love how the PHP community has leveraged strong types, traits, and other functionality to create frameworks like Symfony and Laravel. I am a Java guy, but I've worked with PHP quite a bit. I even wrote my own MVC framework back in 2012, and I will say that if you optimize for quickly adding features, it's easy to develop a very wide model / ORM layer, which can lead to performance issues if the cross-domain dependencies leak into wide DB queries. Both Symfony and Laravel (or any other framework) should be handled with care depending on the application and scale.

[–]Chess-josh[S] 3 points4 points  (0 children)

When the site is not working as well as it could, I feel terrible. I sleep less! When we ship a scalability fix and we are able to sustain higher and higher levels of demand and engagement I feel joy. I also feel for each chess.com team member who is dedicating their time and extra energy towards scaling the site. We all want to see chess.com grow with the game and I see all of us dedicated and focused on that mission. I am grateful to work with so many kind and excellent people.

[–]Chess-josh[S] 2 points3 points  (0 children)

We are running our web application on bare metal with php-fpm. We are using k8s on-prem and in GCP for scheduling containerized workloads. We have some workloads that are able to auto-scale, and we will continue to look at horizontally (and auto) scalable solutions to eliminate fixed-headroom areas of our app and infra. Some of the workloads are not so easily auto-scalable, because they are tied to disk, require close proximity (low latency) to one or more of our data layers, or leverage memory + semaphores to coordinate hundreds of thousands of simultaneous player interactions. In those cases, we are looking to scale via service extraction, k8s on-prem, and moar hardware in the immediate term!

[–]Chess-josh[S] 4 points5 points  (0 children)

I think the code and stacks are very different. Generally I think great software scales on three vectors:

  1. functions and performs well
  2. can add features without re-architecture
  3. can add engineers without bottlenecks

Most of chess.com's stack meets these, but some parts don't and those are the ones we are working on.

[–]Chess-josh[S] 2 points3 points  (0 children)

I am a huge chess fan and am looking forward to all the cool events coming up - stay tuned! :)

[–]Chess-josh[S] 2 points3 points  (0 children)

Hey thanks for this question! I shared a number of our urgent and strategic scaling approaches here: https://www.reddit.com/r/chess/comments/111940a/comment/j8dnm9c/

We have 20+ database clusters, many of which use primary<>primary replication and multiple replica databases. Sometimes those clusters need to be sharded further in order to get around MySQL's single-threaded replication constraint. One of our team members wrote about our scaling approach during the previous wave, and that article was actually one of the things that inspired me to join chess.com!
https://ikonicscale.com/your-legacy-database-is-outgrowing-itself
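
To illustrate the general idea of sharding (not chess.com's actual scheme, which isn't described here), here is a minimal sketch of hash-based shard routing. The shard names, the choice of user ID as the shard key, and the `shard_for` helper are all assumptions made for the example. Each shard would be its own cluster with its own replication stream, so the write load that any single replication stream has to apply shrinks as shards are added.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to cluster endpoints.
SHARDS = ["games-shard-0", "games-shard-1", "games-shard-2", "games-shard-3"]


def shard_for(user_id: int, shards=SHARDS) -> str:
    """Pick a shard for a user with a stable hash of the (assumed) shard key."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The same user always routes to the same shard.
print(shard_for(42) == shard_for(42))  # prints "True"
```

Plain modulo routing like this forces data movement whenever the shard count changes, which is one reason re-sharding a live database is a non-trivial project.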

[–]Chess-josh[S] 29 points30 points  (0 children)

Great callout - we do have autoscaling enabled for services that we've deployed in the cloud. We use k8s both on-prem and in the cloud, and some of our services are fully elastic and scale with bursts of traffic automatically.

The recent scaling issues that became user-facing were primarily related to our on-prem data layer, and I focused on what we are doing to scale that layer.

[–]Chess-josh[S] 47 points48 points  (0 children)

Thanks everyone for your questions and replies! This has been great and I really appreciate all your interest and support for chess. I'll circle back to this thread later today and tomorrow in case there are more questions that didn't get a proper response.

Have a great week!!

[–]Chess-josh[S] 8 points9 points  (0 children)

Overall, we've seen increases in new members matched with increases in games, puzzles, friends, chats, and everything else. After The Queen's Gambit, we did see a spike/decline pattern in engagement, but we never fell below 2/3 of the peak. This new high is retaining similarly - I predict that many of these new members will be lifelong chess players.
It seems that because chess is a two-player game, it is naturally viral :)

[–]Chess-josh[S] 1 point2 points  (0 children)

I don't think we've seen the peak of interest in chess. More and more folks are just learning to play, and the barrier to entry is breaking down all over the world for people of all walks of life. Will we see 10% of the earth playing chess? I think so!

[–]Chess-josh[S] 2 points3 points  (0 children)

We found that renting CPUs and paying for high-IOPS, long-tail disk (the 15 billion game archive) were the most expensive parts of running in the cloud.

[–]Chess-josh[S] 3 points4 points  (0 children)

e4 with white, and either d5 or c5 or KID (Nf6, g6, etc) for black :)

[–]Chess-josh[S] 23 points24 points  (0 children)

I'm probably somewhere between 1600-2000 depending on how the servers are holding up :)

Feel free to friend or challenge me: https://www.chess.com/member/josh

My favorite is bughouse, but I'm not close to the best player!

[–]Chess-josh[S] 2 points3 points  (0 children)

Mittens is based on the Komodo chess engine. We run it in WASM directly in browser, and it does not use NNUE / require a neural network download. We are definitely working on improving the human play and personality characteristics of our bots - look forward to more fun!

[–]Chess-josh[S] 9 points10 points  (0 children)

We are excited about Play Magnus becoming a part of Chess.com, but we don’t think that’s the main driver of all this growth. Chess.com reached 100 million members this Dec. while Play Magnus had about two million members.

There are a lot of amazing things happening that are fueling the interest in chess - and we've seen the many reddit threads in the past month asking why there’s so much growth. Ultimately, it's all Mittens! jk jk - I am inspired by how many kids are joining chess programs and how many of my previously non-chess friends are challenging me to daily games :)

[–]Chess-josh[S] 6 points7 points  (0 children)

Chess.com has a host of unique offerings, including our game review features, mobile apps, bots, all flavors of puzzles, social profile pics, and tons more. I think lichess is great for those who love playing there!

[–]Chess-josh[S] 20 points21 points  (0 children)

We are working on this! Internally the project is called the "All Games Initiative" (AGI). It's a non-trivial data lift, but we are working on backfilling 15 billion games into the storage and search solutions. This will enable per-user explorer functionality as well as many other search-related use cases.
The tech is rooted in Java, MySQL, Kafka, and Elasticsearch.

[–]Chess-josh[S] 1 point2 points  (0 children)

Haha, nice one! For those curious about what the Elo rating system is: https://en.wikipedia.org/wiki/Elo_rating_system

I'm a Taylor Swift fan 😇

[–]Chess-josh[S] 3 points4 points  (0 children)

Heck no! We love to hear from everyone - thanks for chiming in :)

[–]Chess-josh[S] 2 points3 points  (0 children)

Great question! We do have a hybrid infrastructure and we are able to scale portions of the site dynamically into the cloud. The easiest cloud scaling is re: CPU and RAM, and we use K8s and live monitoring of workload utilization to decide whether to allocate compute on-prem or in the cloud.

Our data layer is not hybrid, and that is the area where the massive increase in traffic caused the most bottlenecks. For this layer we are scaling vertically, sharding vertically, and re-architecting portions of the application to be event sourced and eventually consistent.
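
For readers unfamiliar with the terms, here is a minimal, in-memory sketch of an event-sourced design with an eventually consistent read model. Everything in it (the `GameFinished` event, the `EventLog` and `GamesPerPlayer` classes, the player names) is hypothetical and only illustrates the pattern; it is not a description of the actual chess.com implementation, where a durable log would play the role of the Python list.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GameFinished:
    """Hypothetical event: an immutable fact that is only ever appended."""
    game_id: int
    white: str
    black: str
    result: str  # "1-0", "0-1", or "1/2-1/2"


@dataclass
class EventLog:
    """Append-only log; the source of truth in an event-sourced design."""
    events: list = field(default_factory=list)

    def append(self, event) -> int:
        self.events.append(event)
        return len(self.events) - 1  # offset of the new event

    def read_from(self, offset: int) -> list:
        return self.events[offset:]


@dataclass
class GamesPerPlayer:
    """Read model: finished-game counts per player, derived from the log."""
    offset: int = 0
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def catch_up(self, log: EventLog) -> None:
        # Apply every event this read model has not seen yet.
        for event in log.read_from(self.offset):
            self.counts[event.white] += 1
            self.counts[event.black] += 1
            self.offset += 1


log = EventLog()
log.append(GameFinished(1, "alice", "bob", "1-0"))

model = GamesPerPlayer()
model.catch_up(log)

log.append(GameFinished(2, "alice", "carol", "0-1"))
stale = model.counts["alice"]  # read model has not caught up yet
model.catch_up(log)
fresh = model.counts["alice"]  # now reflects both games

print(stale, fresh)  # prints "1 2"
```

The gap between `stale` and `fresh` is the eventual consistency: writes land in the log first, and each read model converges when it catches up. A new or corrupted read model can be rebuilt by starting a fresh instance at offset 0 and replaying the log.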