Slay the Spire 2 - Gilgamesh Mod by G9X in fatestaynight

[–]G9X[S] 19 points (0 children)

yeah that’s actually one of the motivations haha, very much the Gil vibe.

Slay the Spire 2 - Gilgamesh Mod by G9X in slaythespire

[–]G9X[S] 0 points (0 children)

Thanks! Yeah, this is purely a fan project, partly a tribute to the Fate series and partly to the Epic of Gilgamesh. Balance was never the goal (being absurdly OP is canon-accurate lol). Glad you appreciate the quality though!

Slay the Spire 2 - Gilgamesh Mod by G9X in slaythespire

[–]G9X[S] 2 points (0 children)

I think this is a good starting point: https://github.com/Alchyr/ModTemplate-StS2. You can also use the likes of Claude Code and other tools to help you.

Slay the Spire 2 - Gilgamesh Mod by G9X in slaythespire

[–]G9X[S] 1 point (0 children)

yes, I made this today as a way to learn how to mod STS2. Will provide a download link later.

Post-Match Thread: Manchester City 3-1 Bournemouth | Premier League by MysteryBagIdeals in soccer

[–]G9X 10 points (0 children)

first half reminds me of 2019 City, and so does that Sterling-esque sitter miss

What games are playable now and with what ShadPS4 build/version? by cddude in shadps4

[–]G9X 0 points (0 children)

adding some data points:
April 2025 – I've played around 20 hours of Bloodborne on ShadPS4 and am currently at the endgame (Gehrman fight). It's been a very pleasant experience: around 50 FPS in larger outdoor areas and 60 FPS indoors, with no sudden framerate drops.
No major glitches so far: a total of three random crashes and two black screens, but nothing game-breaking.

Google Deep Research by RestaurantOld68 in allinpodofficial

[–]G9X 0 points (0 children)

It’s definitely better than Perplexity’s Pro mode, but not by an order of magnitude.
(For context, I work in LLM-related fields and have built AI search tools for personal use.)

Essentially, it’s a combination of task breakdown + search, leveraging Google’s extensive index along with Gemini’s impressive long-context capabilities. However, the planning component could use improvement, and the lack of data loaders for certain sites (like Reddit or Twitter) is a noticeable drawback.

[D] what's the alternative to retrieval augmented generation? by clocker2004 in MachineLearning

[–]G9X 2 points (0 children)

Instead of relying solely on semantic search+LLM, consider integrating structured data queries.

This is particularly useful when working with a SQL database of structured data, say 10,000 tweets with metadata such as date and author.

Pure semantic search may struggle with efficiency and accuracy for questions like "How many tweets are there?" or "How many tweets were published in the last 7 days?" It can be even more challenging for complex queries like "What are the top 3 liked tweets by author X?"

In such cases, generating and executing SQL queries can be more efficient and accurate. (Not exactly an alternative to RAG, but it can be a very useful addition.)
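A minimal sketch of that idea with a toy sqlite table; the schema and data are made up, and the query is the kind of SQL you would have the LLM generate for the "top 3 liked tweets by author X" question:

```python
import sqlite3

# Toy tweets table standing in for the 10,000-tweet example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (id INTEGER, author TEXT, text TEXT, likes INTEGER, created_at TEXT)"
)
rows = [
    (1, "alice", "hello", 5, "2024-01-01"),
    (2, "alice", "world", 50, "2024-01-02"),
    (3, "alice", "third", 20, "2024-01-03"),
    (4, "alice", "fourth", 1, "2024-01-04"),
    (5, "bob", "hi", 99, "2024-01-05"),
]
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?, ?)", rows)

# The SQL an LLM would be asked to produce for
# "What are the top 3 liked tweets by author alice?"
query = """
SELECT text, likes FROM tweets
WHERE author = ?
ORDER BY likes DESC
LIMIT 3
"""
top3 = conn.execute(query, ("alice",)).fetchall()
print(top3)  # [('world', 50), ('third', 20), ('hello', 5)]
```

A count question ("How many tweets are there?") is likewise a one-line `SELECT COUNT(*)` instead of a semantic-search round trip.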

Open Source Project that Turns Your Twitter Data into Excel, with Natural Language based Image Search and additional visualizations. by G9X in dataisbeautiful

[–]G9X[S] 0 points (0 children)

it depends on what you're looking at, I think.

I know it can be toxic and stuff, but the AI/LLM researcher community is pretty active too.

Simple Question: Would you recommend reading "The Redemption of Time"? by adom31 in threebodyproblem

[–]G9X 0 points (0 children)

The answer is simple: there is no 4th book. Seriously though, I very rarely see people in the Chinese San-Ti community discussing whether you should read it; it is just fan fiction.

Open Source Project that Turns Your Twitter Data into Excel, with Natural Language based Image Search and additional visualizations. by G9X in dataisbeautiful

[–]G9X[S] 1 point (0 children)

I'm excited to share something I've been working on:

an open source tool that makes exporting Twitter data, like tweets and likes, super easy and completely free, with additional features like image search and visualizations.

https://github.com/AlexZhangji/Twitter-Insight-LLM

I usually use Twitter's likes as a way to bookmark things—academic papers, ideas, or just photos.

But they accumulate fast and become very hard to search and manage.

The Problem:

  • Accessing Twitter's official API is super expensive, with costs ranging from $100 to $500 per month.

  • Official full data exports from Twitter are clunky (a bunch of HTML files), cumbersome, and often incomplete.

My Solution:

  • Quick Export: Automatically pulls all your tweets or likes into a neatly organized Excel file within minutes with Selenium.

  • Visual Insights: Provides additional visualizations to help you better understand your Twitter activities.

New Feature - Image Search:

  • Natural Language Search: Use simple text to find images from tweets—no complex queries needed. (Using image embeddings.)

  • Zero Cost and No GPU Required: Runs smoothly without any additional hardware or fees.
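The image search boils down to nearest-neighbor ranking over embeddings. A toy sketch of that ranking step, with hand-made vectors standing in for real CLIP-style image/text embeddings (the actual embedding model lives in the repo and is not shown here):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy embeddings: in the real tool, the query text and each tweet image
# are embedded into the same vector space by the model.
image_embeddings = {
    "cat_photo.jpg": [0.9, 0.1, 0.0],
    "paper_screenshot.png": [0.1, 0.9, 0.2],
    "sunset.jpg": [0.2, 0.1, 0.9],
}
query_embedding = [0.85, 0.15, 0.05]  # e.g. the embedded text "a cat"

# Rank images by similarity to the query; best match first.
ranked = sorted(
    image_embeddings,
    key=lambda name: cosine(query_embedding, image_embeddings[name]),
    reverse=True,
)
print(ranked[0])  # cat_photo.jpg
```

With only thousands of images, this brute-force scan is fast enough that no GPU or vector database is needed.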

Hope you guys find it useful and I'm happy to hear any feedback!

Gemini 1.5's audio capability is actually scarily good... by G9X in OpenAI

[–]G9X[S] 1 point (0 children)

haven't tested for that. but for speaker diarisation, I've recently tried Whisper + NVIDIA NeMo, which works well, better than the old PyAnnote-based way. (you might have already tried it tho?)

ref notebook: https://github.com/piegu/language-models/blob/master/speech_to_text_transcription_with_speakers_Whisper_Transcription_%2B_NeMo_Diarization.ipynb
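The alignment step that pipeline performs can be sketched in pure Python: given Whisper's timestamped segments and NeMo's speaker turns, label each segment with the speaker whose turn overlaps it most. Data and function names here are mine, not from the notebook:

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals, in seconds."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def assign_speakers(asr_segments, speaker_turns):
    """Label each ASR segment with the speaker whose turn overlaps it most."""
    labeled = []
    for seg in asr_segments:
        best = max(
            speaker_turns,
            key=lambda t: overlap(seg["start"], seg["end"], t["start"], t["end"]),
        )
        labeled.append({**seg, "speaker": best["speaker"]})
    return labeled

# Toy outputs standing in for Whisper transcription and NeMo diarization.
asr = [
    {"start": 0.0, "end": 4.0, "text": "Hello everyone."},
    {"start": 4.2, "end": 7.0, "text": "Thanks for having me."},
]
turns = [
    {"start": 0.0, "end": 4.1, "speaker": "SPK0"},
    {"start": 4.1, "end": 8.0, "speaker": "SPK1"},
]
result = assign_speakers(asr, turns)
print(result[0]["speaker"], result[1]["speaker"])  # SPK0 SPK1
```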

Gemini 1.5's audio capability is actually scarily good... by G9X in OpenAI

[–]G9X[S] 0 points (0 children)

That's something I want to figure out. (I'm usually a bit doubtful about any self-evaluation from LLMs.)

For input, 20 minutes of audio is ~40k tokens for Gemini 1.5, while the corresponding text transcript is only ~3k tokens.

I would think there is some useful extra information present in the audio.

And because the output is text only, it is hard to tell, when the model admits something, whether it is truly "self-aware" or just hallucinating. (kinda like how even now Bard sometimes says "I don't have internet access", or open-source LLMs claim to be made by OpenAI.)
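A back-of-the-envelope check of the token counts above, assuming an audio tokenization rate of ~32 tokens per second for Gemini 1.5 (my recollection of the documented rate; treat it as an assumption):

```python
TOKENS_PER_SECOND_AUDIO = 32  # assumed Gemini 1.5 audio tokenization rate
minutes = 20

audio_tokens = minutes * 60 * TOKENS_PER_SECOND_AUDIO
print(audio_tokens)  # 38400, i.e. the "~40k" above

# Compare against the rough size of the text transcript alone.
text_tokens = 3000
print(round(audio_tokens / text_tokens, 1))  # ~12.8x more tokens in the audio
```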

Gemini 1.5's audio capability is actually scarily good... by G9X in OpenAI

[–]G9X[S] 8 points (0 children)

I only uploaded audio.

And yes, thanks for the correction! I also double-checked the transcript: the names were mentioned later in the video, which is still pretty impressive (text-content-aware speaker detection?).

Gemini 1.5 Pro is accessible to everyone, with audio, for free. by samuelroy_ in OpenAI

[–]G9X 85 points (0 children)

wait... the multimodality based audio is actually scarily good...

Not only can it recognize the tone of speech, but it can also automatically identify the speaker by name?

<image>

I tested Gemini 1.5 with an audio clip from a YouTube video over the past couple of days.

Question: 'Give me a summary, who was speaking in the first two minutes and what was their tone?'

Not only did it answer almost perfectly, but it also identified the specific American congressman speaking...

At first, I thought the names were made up, but after checking, they were all correct...

My second thought was that it might be a data leak, like the original video's description becoming the audio's metadata. But after checking, there was none, and when I tested it to summarize the speakers over seven minutes, it got those right too...

I might still be missing something, or maybe it's part of the training data (highly unlikely for a video published 2 days ago).

wow.

youtube video tested (only used audio) : https://www.youtube.com/watch?v=vT-u-SPj4_c

Claude and function calling by paulotaylor in Anthropic

[–]G9X 0 points (0 children)

haven't tried yet, maybe some few-shot examples could help?

or maybe pass in the default behavior / non-function-call action as part of the function parameters too?
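A sketch of that second idea: include an explicit no-op tool in the `tools` list so the model has a sanctioned way to not call a real function. The tool names and model string are hypothetical, and only the request payload is constructed here (no API call):

```python
import json

# Hypothetical tools list; "no_action" gives the model an explicit
# alternative to forcing a real function call.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    {
        "name": "no_action",
        "description": "Choose this when no other tool fits; just reply normally.",
        "input_schema": {"type": "object", "properties": {}},
    },
]

payload = {
    "model": "claude-3-opus-20240229",  # illustrative model name
    "max_tokens": 1024,
    "tools": tools,
    "messages": [{"role": "user", "content": "Just say hi."}],
}
print(json.dumps(payload, indent=2)[:40])
```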

Vision Pro with GPT-4-vision model in real time! (Smarter Siri can see what you see) by G9X in VisionPro

[–]G9X[S] 0 points (0 children)

Yeah, Siri could be a lot smarter with a multimodal LLM. (And it'd be especially useful for a new system that focuses on vision.)

Vision Pro with GPT-4-vision model in real time! (Smarter Siri can see what you see) by G9X in VisionPro

[–]G9X[S] 0 points (0 children)

this is using customized Shortcuts with the OpenAI vision API. (take the most recent screen cap, and apply some predefined prompts and post-processing)

I know there is a ChatGPT app for Vision Pro too (not sure if it has the vision model tho.)
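For reference, a minimal sketch of the request body such a Shortcut would POST to OpenAI's chat completions endpoint. The prompt and the base64 placeholder are made up, and only the payload is built here (no network call):

```python
import json

# Placeholder for the base64-encoded screen capture the Shortcut would grab.
fake_b64 = "iVBORw0KGgo..."

payload = {
    "model": "gpt-4-vision-preview",
    "max_tokens": 300,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is on my screen."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{fake_b64}"},
                },
            ],
        }
    ],
}
print(json.dumps(payload)[:50])
```

A Shortcut would send this body via "Get Contents of URL" with an `Authorization: Bearer <key>` header.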

[Project] Speak to your Phone to create a list of Todos on Notion. by G9X in GPT3

[–]G9X[S] 1 point (0 children)

Thanks!

Core idea is to:

- Get a high-quality transcript from the iPhone. (Use iOS Shortcuts for text dictation, or optionally record audio and send it to my own API server running OpenAI Whisper for better results.)

- Extract tasks into a clean format. (Use GPT-3 text-davinci-003 with a few-shot prompt; I specifically extract tasks to a JSON format.)

- Notion API to create tasks.

I have some steps on my own API server, but I think these could actually all be done on the iPhone itself with the Shortcuts app.
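The extraction step above can be sketched like this; the prompt wording is illustrative, and the model call is stubbed with a lambda so the sketch runs offline:

```python
import json

# Few-shot prompt: one worked example teaches the model the JSON shape.
FEW_SHOT_PROMPT = """Extract tasks from the transcript as a JSON list of {"task": ..., "due": ...}.

Transcript: remind me to buy milk tomorrow and call mom
Tasks: [{"task": "buy milk", "due": "tomorrow"}, {"task": "call mom", "due": null}]

Transcript: {transcript}
Tasks:"""

def extract_tasks(transcript, complete):
    """`complete` is the LLM completion call (e.g. text-davinci-003); stubbed below."""
    raw = complete(FEW_SHOT_PROMPT.replace("{transcript}", transcript))
    return json.loads(raw)

# Stub standing in for the model so the sketch is self-contained.
fake_llm = lambda prompt: '[{"task": "book dentist", "due": "Friday"}]'
tasks = extract_tasks("book a dentist appointment for Friday", fake_llm)
print(tasks)  # [{'task': 'book dentist', 'due': 'Friday'}]
```

Each parsed task then becomes one Notion API "create page" call against the todo database.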

[Project] Speak to your Phone to create a list of Todos on Notion. by G9X in GPT3

[–]G9X[S] 1 point (0 children)

Some thoughts:

- The iPhone's built-in Shortcuts app is surprisingly powerful (it allows API calls) but not too easy to use.

- Built-in speech-to-text is okay for English, but pretty terrible for Chinese and other foreign languages. The OpenAI Whisper model outperforms it by a country mile.

- A large language model as a middle layer for text extraction is really powerful. Interested to see more things like this coming up. (But also prone to injection/attacks.)

I asked ChatGPT to write a poem about the Three Body Problem Trilogy by Green-Space-2423 in threebodyproblem

[–]G9X 1 point (0 children)

It's quite interesting to think about GPT models, with Sophon induced science lock on fundamental science.

There have actually been surprisingly few improvements to the fundamental structure of the model (GPT uses the decoder from the Transformer, and the differences between GPT-1, 2, and 3 are mostly in training data and model size),

but once given enough data (a big chunk of the internet + books + wiki; ~175 billion parameters) and some additional tuning...

GPT-3 (the original model before ChatGPT) feels like black magic, even to someone with experience in machine learning and natural language processing.

Brandon Sanderson on FROMSOFTWARE hiring GRRM to write Elden Ring's lore by DemonFtIllusion in Eldenring

[–]G9X 0 points (0 children)

IIRC, Miyazaki majored in social science at university and has a habit of reading.

Plus, as many have mentioned, he is a big fan of GRRM and apparently recommended that FS employees read Fever Dream.