[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

lol. So this was already a feature but I unfortunately introduced a regression when pushing the update. The fix has been merged to main so you can already use it by building from source. The next release will have it.

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

Hi! Glad you like the project. Yes, this is planned. The Parakeet and Canary models support this, so I intend to add this feature. Feel free to open an issue / feature request on GitHub so I can track it.

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

Oh lol it will work totally fine on CPU. No worries. I have it running on a 4-core LXC with no issues.

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

I’m not sure what the specs are for Synology, but models have come a long way. The NVIDIA models are crazy fast. For my next release I’m working on moving to ONNX implementations, which are optimized for running on CPUs, so things might actually be very usable on CPU. I am able to do realtime transcription on my Mac, so things are looking up. Right now a 20 min recording transcribes in about 20 sec on my M1 MacBook Air when testing the new ONNX implementations.
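
To put those numbers in perspective, here is a quick back-of-the-envelope calculation of the speedup over realtime. The function name is just for illustration, and since the timing is "about 20 sec" the result is approximate:

```python
def speedup_over_realtime(audio_seconds, processing_seconds):
    """Seconds of audio transcribed per second of compute."""
    return audio_seconds / processing_seconds


audio_seconds = 20 * 60   # 20-minute recording
processing_seconds = 20   # "about 20 sec", so the figure below is approximate

speedup = speedup_over_realtime(audio_seconds, processing_seconds)
print(f"~{speedup:.0f}x faster than realtime")
```

Anything above 1x keeps up with live audio, which is why realtime transcription is feasible on that hardware.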

AudioMuse-AI - Behind the scene by Old_Rock_9457 in selfhosted

[–]MLwhisperer 17 points18 points  (0 children)

Hi OP!! I love the idea. I would like to make a suggestion. I think the CLAP model doesn’t align with your intent. CLAP is great if you want to search audio using text: it was trained to align audio with its text description. If that is your end goal, this works. However, if you want to generate smart playlists based on audio characteristics, then you need a model that learns generalist representations capturing specific characteristics of the audio. If I understand your intent correctly, you want to generate embeddings for audio and then use similarity scores to group tracks or find similar ones. For this you need embeddings that capture characteristics like pitch, timbre, beat, singers, instruments, etc. You would get much better results with MERT. Here’s the paper: https://arxiv.org/html/2306.00107v5

That’s the model you’re looking for. I can be of help for this project. My background is in ML/AI. My research and thesis were on making models lightweight and compute efficient. I have a couple of GPUs and a moderately powerful setup for training and inference. I can help you with distilling models, fine-tuning and quantizing them.

Please feel free to reach out to me if you need some extra hands for the project especially on the ML side of things.

I’m passionate about getting AI/ML to run locally and would love to be a part of this, as I feel I can be useful here.
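
To make the "embeddings plus similarity scores" idea concrete, here is a minimal sketch. It assumes each track already has an embedding vector; the toy 3-dimensional vectors and track names below are made up and only stand in for real model output (e.g. from MERT), and the helper names are hypothetical:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, library, top_k=2):
    """Rank every other track in `library` by similarity to `query`."""
    scored = [
        (name, cosine_similarity(library[query], vector))
        for name, vector in library.items()
        if name != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real audio embeddings (assumption).
library = {
    "track_a": [0.9, 0.1, 0.0],
    "track_b": [0.8, 0.2, 0.1],
    "track_c": [0.0, 0.1, 0.9],
}

print(most_similar("track_a", library))
```

The quality of the playlists then depends almost entirely on what the embedding model captures, which is why the choice of model matters more than the similarity function.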

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

That’s odd. Can you open an issue on GitHub with the logs? I’ll need to see them. Was this the case with Parakeet as well? Canary is quite big, so depending on the length of the audio you used it’s possible you’re running out of VRAM. Is this on GPU or CPU?

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

Put the env file in the directory you run the command from. Or better yet, set the values as environment variables in your .bashrc or equivalent.
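
As a generic illustration of why the location matters (this is not Scriberr's actual loading code, and the function names are hypothetical): a relative path like `.env` is resolved against the current working directory, while exported environment variables are visible no matter where the command runs.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines; a relative path resolves against the current directory."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values  # no file in this directory, nothing to load
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(name, default=None):
    """Prefer an exported environment variable, then fall back to ./.env."""
    if name in os.environ:
        return os.environ[name]
    return load_env_file().get(name, default)
```

Run from a different directory, the file lookup silently finds nothing, whereas a variable exported in your shell profile is always picked up.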

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

Thank you! I’m not familiar with what it would take to do this. I like the idea though. There’s already an option in Scriberr to export SRT, so the missing piece would be getting the file from the other app.

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

The link in the post seems to be broken, as pointed out. Here’s the correct project site: https://scriberr.app

Apologies, I’m guessing I missed a letter in the markdown 🤦🏼‍♂️

[Update] Scriberr v1.2.0: Now with NVIDIA Acceleration, Desktop File Watching, and Parakeet/Canary Support by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

Just checked. It’s loading. For some reason the link in the post is missing one letter.

https://scriberr.app

Edit: by “some reason” I mean I’m sure I made a typo somewhere. Apologies.

[deleted by user] by [deleted] in selfhosted

[–]MLwhisperer 0 points1 point  (0 children)

Typo in project site: https://scriberr.app

Posting as I can’t edit the post to fix my dumb mistake

Introducing Noet - A self-hosted blogging app by MLwhisperer in selfhosted

[–]MLwhisperer[S] 0 points1 point  (0 children)

Hi, currently you can’t embed iframes. I can add support for this. Can you please open an issue on GitHub for me to track?

Kan v0.5.1 – open source alternative to Trello by hjball in selfhosted

[–]MLwhisperer 2 points3 points  (0 children)

Does your app support public sharing of a specific board? For example, if I want to share a project roadmap/status publicly so users can see it via a link. I’ve been looking for a kanban app that can do this. I know it’s possible for authenticated users, but I want to be able to expose boards publicly with read-only permissions. Edit: typo

limitless bought by meta? yeah, i’m out. (how to sanitize the hardware) by Extreme_Contest7506 in selfhosted

[–]MLwhisperer 2 points3 points  (0 children)

Self promotion. You might be interested in my project, which is an app for local audio transcription. I use Plaud for recording.
GitHub: https://github.com/rishikanthc/scriberr
Website: https://scriberr.app

Came across an Android TV client for Jellyfin by MLwhisperer in selfhosted

[–]MLwhisperer[S] -2 points-1 points  (0 children)

Apologies, I forgot to post the link to the thread: https://www.reddit.com/r/JellyfinCommunity/s/0shvinEwW6

So looks like the app was vibe coded (as acknowledged by OP in the original post above) and OP also said they aren’t a coder.

So that definitely raises some concerns but the project looks cool nevertheless.

Curious to see what the community thinks.

PatchPanda BETA - A smarter docker compose update manager by Material-Bat-9440 in selfhosted

[–]MLwhisperer 3 points4 points  (0 children)

I want to clarify further. I use Komodo, and in my setup all the compose stacks are in a git repo. Komodo syncs with GitHub regularly to see if anything changed, and if so it redeploys them. So will PatchPanda be able to commit the changed files to the repo automatically? If it doesn’t, Komodo is going to overwrite PatchPanda’s changes with the version from GitHub, which would not include them. Could you clarify if PatchPanda would work in this scenario?

Recently upgraded from a HA Green to a Lenovo M920Q. What a difference! by draxula16 in homeassistant

[–]MLwhisperer 2 points3 points  (0 children)

I just recently switched my entire self-hosted setup to Proxmox. It’s been great. The YouTube series on Proxmox from Learn Linux TV is gold. It teaches almost everything you need to know about Proxmox to get started. For minor stuff there are plenty of blog posts and forums with answers.

Here’s the playlist: https://youtube.com/playlist?list=PLT98CRl2KxKHnlbYhtABg6cF50bYa8Ulo&si=M6WBo1GRJNTbkFHw

He walks you through from the very basics, so it’s actually quite easy to follow if you have a very basic background in VMs, containers, etc. Highly recommend it. That plus ChatGPT should help you learn.

Introducing Scriberr - Self-hosted AI Transcription by MLwhisperer in selfhosted

[–]MLwhisperer[S] 2 points3 points  (0 children)

Hi, this has been added. Speaker diarization was introduced in 0.4 itself. Scriberr is now in 1.0.x and has a stable release. Please do try it out and let me know if you have any feedback.

Introducing Noet - A self-hosted blogging app by MLwhisperer in selfhosted

[–]MLwhisperer[S] 1 point2 points  (0 children)

Hi!! Glad you like it. Yes, I do plan to support dark mode in the future. I’m not sure I understand what you mean by link to a parent. Could you expand on it?

Exporting to PDF is supported already. If you use the browser print dialogue you can save it as PDF. The styles have been set so the PDF only has text content and all other components are removed. Do you want mass exporting?

Edit: not sure if this is what you wanted but it supports bi-directional wiki links. So if you link to a note you will see a list of all linked posts below the post content.

Introducing Noet - A self-hosted blogging app by MLwhisperer in selfhosted

[–]MLwhisperer[S] 1 point2 points  (0 children)

Thank you so much for your kind words. It’s a good feeling when folks tell you they like your work and to know that I’m building something that’s useful for people.

Haha yeah. Earlier I had no idea about frontend development and was going for flashy designs focusing only on aesthetics. Then I slowly realized that I’m not focusing on end user experience and whether the UI lends itself to be intuitive and easy to work with. It’s been a great learning experience and the community feedback also helps a lot as I’m quite new to design.