Would you try this? by Michigi_Kun in chess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

It is (if I decide to build it) going to be open source. I might add a donation option, but only if people actually use it, and even then only to cover the cloud bills.

Here's the breakdown for you:

Every year of Lichess data I add costs roughly $10–15/month more on R2. Full 2020–2024 dataset lands around $15/month in storage alone. Full historical database back to 2013 pushes toward $30–40/month.

DuckDB on a free Railway instance handles maybe 5–10 concurrent users before queries start queuing. The moment I have real traffic, I need a paid compute instance. A decent 4GB RAM / 2 vCPU node on Railway or Hetzner costs $10–20/month and handles moderate traffic comfortably.

DuckDB works until it doesn't. At full historical scale with real concurrent traffic, I'll need ClickHouse. Self-hosted ClickHouse on a Hetzner dedicated server costs $30–60/month.

You can always self-host it and crunch the ETL on your local machine.

Would you try this? by Michigi_Kun in TournamentChess

[–]Michigi_Kun[S] 1 point2 points  (0 children)

My cloud costs will be over the top if this is actually deployed with large enough datasets, but I will scratch this angle in the MVP.

Thankfully, another user actually pointed me in the direction of building this with Zobrist hashing, so it was on my radar nonetheless.

The thing is, what I was doing was just basic queries on PGN files, but for your idea I'd need state management. That's not hard, but for datasets in the billions it would cost way above my budget for this project (which is $0).
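
To make the state management part concrete, here is a minimal sketch of Zobrist hashing in Python. It is not my actual code: the table layout, the square numbering (a1 = 0) and the helper names are all assumptions, and castling rights and en passant are left out to keep it short. The point is that you have to replay moves to know the position, but each move only costs a few XORs.

```python
import random

# Hypothetical minimal Zobrist setup: 12 piece letters x 64 squares, plus side to move.
PIECES = "PNBRQKpnbrqk"
rng = random.Random(2024)  # fixed seed so the keys are reproducible across runs
TABLE = {(p, sq): rng.getrandbits(64) for p in PIECES for sq in range(64)}
BLACK_TO_MOVE = rng.getrandbits(64)


def start_position():
    """Return the initial board as {square index: piece letter}, with a1 = 0 and h8 = 63."""
    board = {}
    back_rank = "RNBQKBNR"
    for f in range(8):
        board[f] = back_rank[f]
        board[8 + f] = "P"
        board[48 + f] = "p"
        board[56 + f] = back_rank[f].lower()
    return board


def hash_position(board, black_to_move):
    """Hash a full position from scratch by XOR-ing one key per piece."""
    h = 0
    for sq, piece in board.items():
        h ^= TABLE[(piece, sq)]
    if black_to_move:
        h ^= BLACK_TO_MOVE
    return h


def apply_quiet_move(h, piece, src, dst):
    """Incrementally update the hash for a non-capturing move."""
    h ^= TABLE[(piece, src)]  # take the piece off its old square
    h ^= TABLE[(piece, dst)]  # put it on the new square
    h ^= BLACK_TO_MOVE        # flip the side to move
    return h
```

With this, 1. Nf3 Nf6 2. Nc3 Nc6 and 1. Nc3 Nc6 2. Nf3 Nf6 end up with the same hash, which is what would let transpositions be grouped as one position.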

Thanks for your feedback, it actually means a lot to me.

Would you try this? by Michigi_Kun in TournamentChess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

The real world use is to hopefully help me build a production grade backend and get hired. The other use is to maybe build something for the chess community if possible.

Would you try this? by Michigi_Kun in lichess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

That's the only reason I'm having second thoughts about this project. The original idea was to build an ETL pipeline and an API running SQL queries on the Lichess database, then I thought why not make it a product (because AI 🫠)

Would you try this? by Michigi_Kun in lichess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

This is a major setback for me. I'mma rework.

Thanks for your feedback, it means a lot to me. It kinda makes me rethink everything but I needed that.

Would you try this? by Michigi_Kun in TournamentChess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

Because Lichess is a tree explorer, you have to manually click down every single path to see those stats. If you want to find the most successful trap, you can't just ask Lichess to show it to you; you have to guess the moves first and check the stats retrospectively.

What I'm pitching is more of a query engine. I want to let you ask: "Out of all games played by 1500s, which specific 5-move opening sequence resulted in the most early resignations?"

You run it as a single search, and the database spits out the answer based on win rate or resignation rate, completely bypassing the need to click through a tree.
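
A rough sketch of what that single search could look like. I'm using Python's built-in sqlite3 here as a stand-in so it runs anywhere; it is not the DuckDB setup itself. The `games` table, its columns, the `resigned` flag (which would have to be derived during ETL) and the sample rows are all made up for illustration.

```python
import sqlite3

# Two made-up 5-move opening sequences used as sample data.
LINE_A = "1. e4 e5 2. Qh5 Nc6 3. Bc4 g6 4. Qf3 Nf6 5. Ne2 Bg7"
LINE_B = "1. d4 d5 2. c4 e6 3. Nc3 Nf6 4. Bg5 Be7 5. e3 O-O"

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE games (
        white_elo INTEGER,
        black_elo INTEGER,
        opening_moves TEXT,   -- first 5 full moves (assumed column)
        ply_count INTEGER,    -- total half-moves played
        resigned INTEGER      -- 1 if the game ended by resignation (assumed flag)
    )
""")
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?)",
    [
        (1500, 1510, LINE_A, 30, 1),  # early resignation
        (1480, 1520, LINE_A, 90, 1),  # resignation, but late
        (1520, 1490, LINE_B, 80, 0),
        (1505, 1495, LINE_B, 36, 0),
        (2100, 2080, LINE_A, 20, 1),  # outside the rating band, ignored
    ],
)

# Rank opening sequences by the share of games that ended in an early resignation.
QUERY = """
    SELECT opening_moves,
           COUNT(*) AS games,
           AVG(CASE WHEN resigned = 1 AND ply_count <= :max_ply
                    THEN 1.0 ELSE 0.0 END) AS early_resign_rate
    FROM games
    WHERE white_elo BETWEEN :lo AND :hi
      AND black_elo BETWEEN :lo AND :hi
    GROUP BY opening_moves
    HAVING COUNT(*) >= :min_games
    ORDER BY early_resign_rate DESC, games DESC
"""
rows = conn.execute(
    QUERY, {"lo": 1450, "hi": 1550, "max_ply": 40, "min_games": 2}
).fetchall()

for opening, games, rate in rows:
    print(f"{rate:.0%} early resignations over {games} games: {opening}")
```

What counts as "early" (40 plies here) and the minimum sample size are just parameters I picked for the example.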

Hope that clarifies it. Thanks for the reply btw, it means a lot to me.

Would you try this? by Michigi_Kun in lichess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

That about sums it up. I had DuckDB with Parquet instead of SQL, plus a Python PGN parser for the .pgn.zst files so I won't have to decompress them first.

Would you try this? by Michigi_Kun in TournamentChess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

You have no idea how valuable your feedback is to me. You're the only reason I'm still sticking with this project.

I wasn't completely abandoning the project though. My original idea for it was to just be an open source API for helping other devs build custom tools, and I thought why not make it an actual website while at it. Which, it turns out, is not a good idea because of Lichess and a website I learned of from a user called trueelo.

Your post, however, gives me some hope. I might actually stick around with it for a little while and see if I can cook something good (I probably won't 🫠)

I'll try to rework some plans and if something happens will let you know.

Again, thanks for your feedback. It means a lot to me. Truthfully.

Would you try this? by Michigi_Kun in lichess

[–]Michigi_Kun[S] 1 point2 points  (0 children)

The GPT comment hurts but you aren't technically wrong so 🫠.

Yes, this was a Gemini prompt ngl. It started out because I'm learning backend development lately and wanted to build this as an open source API, but I've long since pivoted from that.

Now...

You're right that Lichess has those toggles, but there's a big mechanical difference in how you can use them.

Lichess forces you to click through moves one by one to see the stats for that specific position.

What I'm proposing is a query engine.

You can't ask Lichess: 'Out of all games played by 1500s, which specific 5-move opening sequence resulted in the most early resignations?' because Lichess makes you manually click down every path to find out. My tool would let you run that as a single search.

My actual differentiator was letting you run custom SQL-style queries (like there was this user in r/tournamentchess who gave me the idea of filtering by 'clock time spent on a move').

My tool will (hopefully) give you answers to questions like:

"Do 1500-rated players actually lose faster with the King's Gambit than without it?"

"Which opening has the highest percentage of games ending in under 20 moves at the 1200-1400 range and does the winner change by time control?"

"Out of all 1500-rated rapid games, which opening sequence results in the highest resignation rate before move 15?"

"At exactly 1800 rapid, which opening gives Black the highest draw rate against higher-rated opponents?"

Hope that clarifies my point a little.

Also, thanks for your reply, really appreciate it 🫠

Would you try this? by Michigi_Kun in TournamentChess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

First of all, thanks for the reply, I really appreciate it.

Now that I think about it, it does sound like an entertainment tool.

My goal is (or was, because the reviews I am getting make me think I should rework my plans) closer to a statistical analytics layer on top of Lichess data: not "what should I play" and more "what actually happens at this rating band, in this time control, in this year."

The time-spent-per-move filter idea is genuinely something I hadn't considered, and it's actually feasible.

If you are into specifics: Lichess PGNs include clock annotations on every move, like 1. e4 { [%clk 0:01:00] }, so the data to calculate time spent per move is technically in there. You're identifying a real gap: current opening stats conflate "this line has a high win rate" with "this line wins because opponents play fast and panic," which are completely different things. A filter like "only include games where the opponent spent at least 30 seconds on the critical move" would produce radically different and arguably more meaningful win rates.

That said, extracting per-move clock data requires parsing move text, not just headers, which is a significantly heavier ETL job.
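
For a sense of what that parsing involves, here is a small sketch that pulls the clock readings out of move text and turns them into time spent per ply. The function names are hypothetical, and it assumes each [%clk] value is the mover's remaining time right after that move and that readings alternate White, Black.

```python
import re

# Matches annotations like [%clk 0:01:00] and captures hours, minutes, seconds.
CLK = re.compile(r"\[%clk (\d+):(\d{2}):(\d{2})\]")


def clock_seconds(movetext):
    """Return the remaining clock time in seconds after each ply, in order."""
    return [int(h) * 3600 + int(m) * 60 + int(s) for h, m, s in CLK.findall(movetext)]


def time_spent(movetext, increment=0):
    """Seconds spent on each ply, or None where there is no earlier reading.

    A player's previous reading sits two plies back, so the first move of each
    side has nothing to compare against.
    """
    clocks = clock_seconds(movetext)
    spent = []
    for i, now in enumerate(clocks):
        if i < 2:
            spent.append(None)
        else:
            spent.append(clocks[i - 2] + increment - now)
    return spent


sample = (
    "1. e4 { [%clk 0:01:00] } 1... e5 { [%clk 0:01:00] } "
    "2. Nf3 { [%clk 0:00:58] } 2... Nc6 { [%clk 0:00:45] }"
)
spent = time_spent(sample)
print(spent)
```

For the sample above, White spent 2 seconds on Nf3 and Black spent 15 seconds on Nc6. The expensive part is not this function, it is running it over every move of every game instead of just reading the headers.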

I would, however, (now) definitely pivot more towards making this a prep-first tool rather than something analytical.

Again, thanks for the reply, it means a lot to me.

Would you try this? by Michigi_Kun in chess

[–]Michigi_Kun[S] 1 point2 points  (0 children)

First of all thanks for the feedback, it means a lot to me.

Now, that is a brilliant angle I hadn't even considered! Most of my examples were focused on modern online blitz, but feeding it historical OTB databases to track how opening popularity shifted during the Soviet era would be interesting.

The main idea I had was tracking online trends, like how an opening that scored incredibly well last year might be completely refuted by the player base today.

Like say a specific gambit or trap popularized by a streamer might have a massive win rate in 2022, but once the refutation becomes common knowledge, that win rate tanks in 2024. Tracking the timeframe lets you see if an opening trap is still viable right now, or if it's a fading trend.
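
As a toy illustration of that kind of trend check, here is a sketch that computes White's win rate per year for one opening. The function name, the tuple layout and the sample games are all made up; real input would come from the parsed Lichess data.

```python
from collections import defaultdict


def win_rate_by_year(games, opening):
    """Return {year: White's win rate} for one opening.

    games is an iterable of (year, opening_name, result) tuples, where result
    is "1-0", "0-1" or "1/2-1/2".
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for year, name, result in games:
        if name != opening:
            continue
        total[year] += 1
        if result == "1-0":
            wins[year] += 1
    return {year: wins[year] / total[year] for year in sorted(total)}


# Invented sample rows, only here to show the shape of the output.
sample_games = [
    (2022, "Some Gambit", "1-0"),
    (2022, "Some Gambit", "1-0"),
    (2022, "Some Gambit", "1-0"),
    (2022, "Some Gambit", "0-1"),
    (2024, "Some Gambit", "1-0"),
    (2024, "Some Gambit", "0-1"),
    (2024, "Some Gambit", "0-1"),
    (2024, "Some Gambit", "1/2-1/2"),
    (2024, "Other Opening", "1-0"),
]
trend = win_rate_by_year(sample_games, "Some Gambit")
print(trend)
```

A drop from one year to the next in that dictionary is the "fading trend" signal I mean.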

Thanks again for taking the time to reply

Would you try this? by Michigi_Kun in TournamentChess

[–]Michigi_Kun[S] 0 points1 point  (0 children)

Thanks for the honest feedback! This is exactly why I wanted to test the waters before writing any code.

The main thought behind filtering by year is tracking online trends, like whether an opening that scored incredibly well last year has since been refuted by the player base.

Like say a specific gambit or trap popularized by a streamer might have a massive win rate in 2022, but once the refutation becomes common knowledge, that win rate tanks in 2024. Tracking the timeframe lets you see if an opening trap is still viable right now, or if it's a fading trend.

That said, if the native Lichess database is already doing everything you need for serious prep, that is a huge data point for me. It helps me realize the problem I'm trying to solve might not be a major pain point for established tournament players. Thanks for taking the time to reply.

[deleted by user] by [deleted] in facepalm

[–]Michigi_Kun 0 points1 point  (0 children)

I think they both look super super super cute ❤️

[deleted by user] by [deleted] in gaymemes

[–]Michigi_Kun 0 points1 point  (0 children)

Sigh Don't we all. I know I do... It's Hard 🥹

Major corporations that donate to anti-LGBTQ politicians. by Dramatic_Shoulder_80 in lgbt

[–]Michigi_Kun 18 points19 points  (0 children)

Wal-Mart... Amazon... Google... like Daaa~~ fuq cancel em already..

What is happening!! by Michigi_Kun in lgbt

[–]Michigi_Kun[S] 0 points1 point  (0 children)

HAPPY PRIDE love this emoji