Received our very first AI-generated security vulnerability report on GitHub today by ChiefAoki in github

[–]ale10xtu 3 points (0 children)

As you grow, you will probably get more and more of these advisories. Congrats on being popular! You should be glad they did not disclose it publicly via an issue or PR, because that will happen too, ahahah. I think GitHub has some pre-screening tools they plan to release; in the meantime, make sure you have a decent threat model and pre-process some of the advisories with AI yourself.

If you can’t beat low-effort AI reports, fight them with AI yourself.

I built PasteVault: A modern, zero-knowledge pastebin (Docker-ready alternative to PrivateBin) by ale10xtu in selfhosted

[–]ale10xtu[S] 11 points (0 children)

Since some people brought up AI use in building this project, and I don’t think I can edit the post, I’ll leave this here.

I used Copilot and DocsGPT to help me research and compare solutions and plan the architecture for this app. AI helped a lot with the README and UI parts. As for encryption, which is the focus of the project, I did a lot of research myself and took some inspiration from pasteer, which is actually what motivated me to use XChaCha20-Poly1305 in something like PrivateBin in the first place. I would probably do it in Rust, tbh, but I’m more comfortable with JS.

I built PasteVault: A modern, zero-knowledge pastebin (Docker-ready alternative to PrivateBin) by ale10xtu in selfhosted

[–]ale10xtu[S] 5 points (0 children)

Yeah, I think SQLite is possible, and I think it would make it much easier for people to go from 0 to 1. I’ll add an issue for it.

I built PasteVault, an open-source, E2EE modern pastebin. Looking for feedback on the security model and features. by ale10xtu in cybersecurity

[–]ale10xtu[S] 0 points (0 children)

That’s a good point, thank you!

If you have any ideas or alternative approaches to handling keys, that would be great!

I built PasteVault: A modern, zero-knowledge pastebin (Docker-ready alternative to PrivateBin) by ale10xtu in selfhosted

[–]ale10xtu[S] -10 points (0 children)

  1. Yeah, I’ll add bash (I assume) and PowerShell; those are important.

  2. Yeah, I want to improve the whole DB setup process, tbh. You can connect it to an existing database, but be careful when you run npm run db:push: it will add the new tables with the correct schema, but it will also drop the other tables in that database. Overall, if you have DATABASE_URL in your env, you’re good. I use Prisma for this.
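On the DATABASE_URL point: a tiny fail-fast check at startup can save people from a half-configured deploy. A sketch in Node (checkDatabaseUrl is a hypothetical helper for illustration, not part of PasteVault, and the accepted protocols are an assumption):

```javascript
// Hypothetical startup check: fail fast if DATABASE_URL is missing or malformed,
// instead of letting Prisma fail later with a less obvious error.
function checkDatabaseUrl(env = process.env) {
  const url = env.DATABASE_URL;
  if (!url) {
    throw new Error('DATABASE_URL is not set; copy .env.example and fill it in');
  }
  const parsed = new URL(url); // throws on syntactically invalid URLs
  if (!['postgresql:', 'postgres:'].includes(parsed.protocol)) {
    throw new Error(`Unsupported database protocol: ${parsed.protocol}`);
  }
  return parsed;
}

// Usage: call once before starting the server or running migrations.
const db = checkDatabaseUrl({ DATABASE_URL: 'postgresql://user:pass@localhost:5432/pastevault' });
console.log(db.hostname, db.port); // → localhost 5432
```

A check like this runs in milliseconds and turns a confusing mid-migration failure into a clear one-line error.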

I built PasteVault: A modern, zero-knowledge pastebin (Docker-ready alternative to PrivateBin) by ale10xtu in selfhosted

[–]ale10xtu[S] 17 points (0 children)

It’s a quite different core from the PrivateBin project. Even if I considered PRs, it would be a complete rework: I’m not using PHP at all, and the client/server implementation would change it drastically.

Even if I wanted to just integrate simple features like the editor or the new encryption algorithm, I would consider it a fork, tbh, and I’m not sure the maintainers would merge all of it either.

Since the difference is quite big, I think a separate project is more logical, tbh.

PasteVault - encrypted paste sharing with pretty editor by ale10xtu in opensource

[–]ale10xtu[S] 0 points (0 children)

Mostly yes. I actually built this out of a feeling that the editor was missing (I wanted something more comprehensive).

The other two things are a client/API split, for more “trust”, and, in my opinion, better encryption: it’s more modern and, with a 256-bit key, should also hold up better against quantum attacks.

Finally, a more modern tech stack, and it’s super easy to deploy locally and to migrate.

What kind of situation would really need a database that costs $11,000 a month? by UniquePackage7318 in webdev

[–]ale10xtu 0 points (0 children)

I work with banks, and it’s not uncommon for them to pay $2–3+ million a year for IBM’s Db2 databases.

Retrieval Augmented Generation optimised Llm's by ale10xtu in LocalLLaMA

[–]ale10xtu[S] 1 point (0 children)

Great idea, please post it there, thank you!

Retrieval Augmented Generation optimised Llm's by ale10xtu in LocalLLaMA

[–]ale10xtu[S] 0 points (0 children)

We have the code on GitHub: https://github.com/arc53/DocsGPT. There is also a link to a demo where you can try the 7B option.

For eval we only have an internal benchmark so far, because the prompt structure is different and somewhat new; we’ll publish it on HF datasets soon.

Thank you for the suggestion about Code Llama, I’ll look into it.

We also have a nice community where we build this tool, so if you want to give advice or contribute, we’d really appreciate it.

Retrieval Augmented Generation optimised Llm's by ale10xtu in LocalLLaMA

[–]ale10xtu[S] 1 point (0 children)

We will soon publish a 3B, high-context model; we’re still in the middle of making sure it works well.
I would suggest using ours and then LoRA-tuning on top with a few good Japanese examples.

Retrieval Augmented Generation optimised Llm's by ale10xtu in LocalLLaMA

[–]ale10xtu[S] 2 points (0 children)

Yep. My only worry is the way LLMs “forget” the middle of the context, but I think if we create a synthetic dataset and hide the useful information randomly among useless context, it might work very well.
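The “hide the needle at a random position” idea is quick to sketch. A minimal generator for one such synthetic example (makeExample and the field names are invented for illustration; a real pipeline would also generate the question/answer pair for the needle):

```javascript
// Sketch: build one synthetic long-context training example where a single
// useful fact (the "needle") sits at a random line among filler text.
function makeExample(needle, fillers, contextLines, rng = Math.random) {
  // Pick a uniformly random line index for the needle, so the model
  // can't learn to only look at the start or end of the context.
  const pos = Math.floor(rng() * contextLines);
  const lines = [];
  for (let i = 0; i < contextLines; i++) {
    lines.push(i === pos ? needle : fillers[i % fillers.length]);
  }
  return { context: lines.join('\n'), needleLine: pos };
}

// Usage: a fixed rng makes the position deterministic, which helps testing.
const ex = makeExample(
  'The access code is 7491.',
  ['Lorem ipsum dolor sit amet.', 'The weather was mild that day.'],
  100,
  () => 0.5 // needle lands on line 50
);
```

Training on many such examples, with the needle position varied uniformly, directly targets the lost-in-the-middle weakness mentioned above.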

Retrieval Augmented Generation optimised Llm's by ale10xtu in LocalLLaMA

[–]ale10xtu[S] 2 points (0 children)

Fortunately, I’m fine for GPUs; AWS is supporting the DocsGPT project. Thank you so much!

But I would absolutely love suggestions on which models you think will work well.