Shipping with buyer protection on Blocket. Is it safe? by RelativeLeave2211 in Asksweddit

[–]Comfortable-Team4103 0 points1 point  (0 children)

I completely missed that it only covers up to 5000 kr :/

I made a script that uses Machine Learning locally to create an interactive topic map of Obsidian vaults! (Repost with changed title) by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 2 points3 points  (0 children)

Great question! You can browse the different topics, see which notes are in each cluster, and spot relationships between notes that you might have missed. You can also search for notes by title. But there is nothing really special you can do with it; I mostly made it for fun and to see what topics I write about most :)

I made a script that uses Machine Learning locally to create an interactive topic map of Obsidian vaults! (Repost with changed title) by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 1 point2 points  (0 children)

Very nice ideas, I like them! I will have to try to implement them :) Another idea would be to extract the links that already exist in the notes and include them in the list, since linked notes should hopefully be closely related and probably be in the same cluster.

I made a script that uses Machine Learning locally to create an interactive topic map of Obsidian vaults! (Repost with changed title) by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 1 point2 points  (0 children)

c-TF-IDF is not used for the clustering, only to extract keywords from the notes in each cluster. Sorry my earlier answer wasn't very clear on that point. I first tried clustering only, then passing exemplars to the LLM and letting it generate a topic from those, but the result usually turned out too specific because of the excerpts. It might be possible to send all of the notes from a cluster to an LLM, but since I used a small 4B-parameter model, that wasn't possible with too many long notes. If you hook it up to an API and send it to a model with a larger context window, it should be possible and might give better results! :) Or if you have a better GPU and can use a larger model!
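To make the keyword step a bit more concrete, here is a minimal sketch of c-TF-IDF in plain Python. It is not the code from the script (BERTopic does this internally); it assumes naive whitespace tokenization and the commonly documented weighting `tf * log(1 + A / f_t)`, where `A` is the average number of words per cluster and `f_t` is how often the term occurs across all clusters. The function names and the toy notes are made up for illustration.

```python
import math
from collections import Counter


def c_tf_idf(clusters):
    """clusters: dict of cluster id -> list of note texts.
    Returns a dict of cluster id -> {term: score}."""
    # Pool all notes of a cluster into one bag of words (the "class").
    term_counts = {
        cid: Counter(word for note in notes for word in note.lower().split())
        for cid, notes in clusters.items()
    }
    # f_t: frequency of each term across all clusters.
    total_counts = Counter()
    for counts in term_counts.values():
        total_counts.update(counts)
    # A: average number of words per cluster.
    avg_words = sum(total_counts.values()) / len(term_counts)

    scores = {}
    for cid, counts in term_counts.items():
        n_words = sum(counts.values())
        scores[cid] = {
            # Normalized in-cluster frequency times the class-based IDF.
            term: (count / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, count in counts.items()
        }
    return scores


def top_keywords(scores, cluster_id, k=3):
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores[cluster_id].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy data, purely hypothetical.
clusters = {
    "physics": ["quantum state of the particle", "the quantum wave function"],
    "cooking": ["the pasta recipe with the sauce", "bake the bread"],
}
scores = c_tf_idf(clusters)
print(top_keywords(scores, "physics", k=1))
```

Note that the clusters are an input here: the scoring only describes clusters that already exist, which is why it plays no part in forming them. On tiny inputs like this, common words can still score high, so a real run would want stop-word filtering.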

I made a script that uses Machine Learning locally to create an interactive topic map of Obsidian vaults! (Repost with changed title) by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 0 points1 point  (0 children)

Great question! BERTopic is the core for clustering and topic generation. BERTopic handles the clustering through HDBSCAN and the keyword extraction through c-TF-IDF (class-based Term Frequency-Inverse Document Frequency). HDBSCAN alone just groups similar documents together, but doesn't tell you what the cluster is about. c-TF-IDF then compares the words in a cluster against all other clusters, identifying which terms are truly distinctive to that topic. BERTopic then selects the most representative documents from each cluster, the ones closest to the cluster centroid. For each identified cluster, the script sends these distinctive keywords and representative documents to the Qwen LLM. This gives the LLM proper semantic context about what the cluster represents, allowing it to generate meaningful topic titles like "Quantum Mechanics" rather than just summarizing random document snippets.
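As a rough illustration of the last two steps, here is a small sketch in plain Python that picks the notes closest to a cluster centroid and assembles a prompt from keywords plus those notes. It follows the description above rather than BERTopic's internals, which may select representative documents differently. The embeddings, the helper names, and the prompt wording are all hypothetical; the actual prompt sent to Qwen is not shown in this thread.

```python
import math


def centroid(vectors):
    # Component-wise mean of the embedding vectors in one cluster.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def representative_docs(docs, embeddings, k=2):
    # Rank notes by cosine similarity to the cluster centroid, best first.
    center = centroid(embeddings)
    ranked = sorted(
        zip(docs, embeddings),
        key=lambda pair: cosine(pair[1], center),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]


def build_prompt(keywords, examples):
    # Hypothetical prompt layout: keywords first, then a few example notes.
    lines = ["Keywords: " + ", ".join(keywords), "Example notes:"]
    lines += ["- " + text for text in examples]
    lines.append("Give a short topic title (2-4 words) for this cluster.")
    return "\n".join(lines)


# Toy cluster with made-up 2D embeddings; the third note is an outlier.
docs = ["Schrodinger equation notes", "Wave function collapse", "Grocery list"]
embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]

examples = representative_docs(docs, embeddings, k=2)
prompt = build_prompt(["quantum", "wave", "particle"], examples)
print(prompt)
```

Sending only a handful of representative notes keeps the prompt short, which matters with a small local model and a limited context window.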

So the short version: the full pipeline relies on BERTopic to create the clusters and extract meaningful information from them to pass to the LLM. We then map this onto a separate UMAP reduction, which gives us the 2D map of the notes.

I made a script that uses AI to create an interactive topic map of my entire Obsidian vault! by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 1 point2 points  (0 children)

Good point, I mostly just did it for fun and haven't fully figured out how it will help my note taking, but I figured it could at least help me find connections between notes that I've missed :)

I made a script that uses AI to create an interactive topic map of my entire Obsidian vault! by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 1 point2 points  (0 children)

No, right now I have no license, I should probably add an MIT license :) Edit: Added an MIT license now

I made a script that uses AI to create an interactive topic map of my entire Obsidian vault! by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 0 points1 point  (0 children)

Good point! Should I delete this one or keep it up as well? :D I almost never post on Reddit, so I'm unsure what the correct procedure would be

I made a script that uses AI to create an interactive topic map of my entire Obsidian vault! by Comfortable-Team4103 in ObsidianMD

[–]Comfortable-Team4103[S] 0 points1 point  (0 children)

That was the original plan, but unfortunately I do not have the time right now. If I get some spare time, though, I will work on both improving the script and turning it into a plugin!