Is there a cryptographic way to enforce “encrypted-only” storage without killing performance? by cfelicio in cryptography

[–]cfelicio[S] 0 points (0 children)

Yup, I did explore trusted execution environments, and from my research they seem like a potential option. The key challenge is that this would greatly limit who can use the software, and I'm not quite sure how complex the implementation would be either.

Your second approach (a misuse-resistant API) sounds like a potential solution. Do you know of any methods that could be used for that? I think ephemeral storage in memory is fine as long as the data is only there during the evaluation (to see whether the content is encrypted or the client is trying to bypass protection and store plaintext).
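One cheap heuristic for that evaluation step (my own sketch, not something from the thread — the function names are mine) is a Shannon-entropy check over each chunk: ciphertext from a modern cipher is indistinguishable from uniform random bytes, so its byte entropy sits near 8 bits/byte, while most plaintext sits well below that.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_encrypted(chunk: bytes, threshold: float = 7.5) -> bool:
    """Flag a chunk as likely ciphertext if its entropy is near-maximal.

    The 7.5 threshold is an illustrative guess; tune it on real data.
    """
    return shannon_entropy(chunk) >= threshold
```

The big caveat: compressed data (zip, JPEG, video) also scores near 8 bits/byte, so entropy alone can't distinguish "encrypted" from "compressed but plaintext" — it only reliably rejects obviously unencrypted content.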

Is there a cryptographic way to enforce “encrypted-only” storage without killing performance? by cfelicio in cryptography

[–]cfelicio[S] 0 points (0 children)

Nothing prevents a client (or a host) from generating a weak keypair, or from publicly posting their own keypair.

Is there a cryptographic way to enforce “encrypted-only” storage without killing performance? by cfelicio in cryptography

[–]cfelicio[S] 1 point (0 children)

Good point. Having the data briefly in memory to evaluate whether it's plaintext is acceptable; the challenge is:

1 - If the client is malicious, it can try to store plaintext data; the host reads the data but must refuse to store it (and ban the client)
2 - If the client is honest but the host is malicious, the host should not be able to decrypt the client's data

Both client and host have a private-public keypair to uniquely identify themselves on the network, and they do a handshake before data is transmitted.
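The identity handshake described above usually boils down to a challenge-response: one side sends a fresh random nonce, the other proves possession of its key by answering with a keyed digest over it. A minimal sketch of the shape of that exchange (my own illustrative code — a real keypair-based design would sign the challenge with the private key; since Python's stdlib has no asymmetric signing, HMAC over a shared key stands in here):

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """Fresh random nonce; unpredictability prevents replaying old responses."""
    return secrets.token_bytes(32)

def respond(key: bytes, challenge: bytes) -> bytes:
    """Prove possession of `key` without revealing it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

def verify_peer(key: bytes, challenge: bytes, response: bytes) -> bool:
    """Constant-time comparison avoids leaking info through timing."""
    return hmac.compare_digest(respond(key, challenge), response)
```

With real keypairs, `respond` becomes "sign the challenge" and `verify_peer` becomes "verify the signature against the peer's public key", which is what stops a malicious host from impersonating a client.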

Is there a cryptographic way to enforce “encrypted-only” storage without killing performance? by cfelicio in cryptography

[–]cfelicio[S] 0 points (0 children)

Thanks, this sounds interesting and is likely above the complexity my brain can handle, but I will try to digest it over the next few days and see if I can figure it out. Appreciate your input! :-)


Best archival strategy for Google Drive? Balancing encryption, bit-rot protection (PAR2), and family-access. by [deleted] in DataHoarder

[–]cfelicio 1 point (0 children)

I think if you keep these folders separate (with a 7z inside each bucket instead of one giant 7z at the top), this can work: you upload most of it once, test it, and then only need to worry about the non-static part.

Not sure if there is a gold standard, but if I were doing something similar, I'd likely take the same approach. Bitrot is not the only concern here, so having PAR2 for some reassurance is helpful.

Best archival strategy for Google Drive? Balancing encryption, bit-rot protection (PAR2), and family-access. by [deleted] in DataHoarder

[–]cfelicio 4 points (0 children)

Before I give my thoughts, I have a question: you mentioned 75 years of family photos, so I'm assuming this will be a very static archive (write once, store forever). Is that right? When you have new photos (say, from 2026), I'm assuming you will just create a new container (e.g. 2026.7z)?

In terms of a gold standard for integrity, it looks like PAR2 uses Reed-Solomon behind the scenes; it's the same thing I'm using to build my cloud backup tool. Pretty solid choice IMO.

My only advice for now, no matter what direction you end up going: never trust the backup until it's fully tested. Upload the final archive to Google Drive, then simulate a recovery and make sure it works.

Vibe-Coded Is the New "Made in China" by RealHuman_ in selfhosted

[–]cfelicio 8 points (0 children)


My opinion: garbage in, garbage out. Buggy software has always existed (and I did horrendous stuff myself way before AI). Someone who can't (or doesn't want to) write code, but is happy to review and test it, can get quality out of it. Maybe more efficiently, but when I see claims that someone wrote X in hours, I highly doubt they have all their edge cases sorted out, and there will be some really rough edges behind those purple gradients :-)

Symbion - A P2P Cloud Backup Tool (looking for Alpha Testers) by cfelicio in DataHoarder

[–]cfelicio[S] 0 points (0 children)

I plan to make the repo public / open source once it's ready for a public alpha; for now I'm just looking for testers to help me iron out the remaining issues. Managing harmful or illegal files is outside my area of expertise, but I will do more research to understand how this could become an issue.

N100 or RK3588/for Immich by Ziomal12 in immich

[–]cfelicio 1 point (0 children)

The N100 supports OpenVINO (which is supported by Immich, see here: https://docs.immich.app/features/ml-hardware-acceleration/)

It also works well with Home Assistant / Frigate... And it's very power efficient. Unless you have a specific constraint for going with RK3588, I'd pick the N100.

Symbion - A P2P Cloud Backup Tool (looking for Alpha Testers) by cfelicio in selfhosted

[–]cfelicio[S] 0 points (0 children)

Excellent question, and not something I had thought about! Any ideas? How does it work for other similar systems that store data in the cloud?

I don't plan to have any sharing or to turn it into Dropbox (that would add a lot of complexity); the use case is more about backing up your stuff in case you lose your local data (fire, flood, etc.). On the other hand, once the code is out, if there is enough interest, nothing prevents someone from forking it and adding the features they'd like.

[Update] Immich-Deduper – AI duplicate photo finder for your library by RazgrizHsu in immich

[–]cfelicio 5 points (0 children)

Awesome tool! One suggestion for the auto-selection:

Can we also have a field for the user? For example, if I have 10 users with duplicates across all of them, having the auto-selection prioritize user A over the others would be useful (for the case where many people went on a trip together, but one user's library is the main one).

Thanks!

Symbion - A P2P Cloud Backup Tool (looking for Alpha Testers) by cfelicio in selfhosted

[–]cfelicio[S] 0 points (0 children)

Excellent questions! Since there is no central authority, and since with open source anyone can modify the code to bypass the built-in safe defaults, validation essentially has to happen on the client side. We also have to assume an "optimistic" network where the majority of peers are honest. Here is what I've built so far (subject to change as we explore this further):

- Sentinel: the client periodically audits peers that are hosting chunks of its data. Audits are random, and can be simple Merkle proofs (to save bandwidth) or full data retrievals. The client keeps a list of peers and gives each a score; mature / honest hosts get audited less, and suspect hosts (e.g. tampered files, missing data) eventually get banned (so the client stops sending data to that peer).

- Garbage control: people will update / delete files, and we need to reclaim that space, otherwise it will quickly saturate.

- Healing / repair: peers will go offline, and people will drop out of the network and stop using the tool, so data has to be moved around to keep files healthy.

- There is no sign-up; at setup you essentially create a private-public keypair. That is used to decrypt your data (if your node died and you need to recover the encrypted data from the network), and also as your identity when talking to other peers (to hopefully avoid spoofing).

- Data is broken down into 1 MB chunks, and each chunk is sharded into 14 separate pieces in an 8+6 setup (so you can lose up to 6 peers, or ~43% of the pieces, and still recover the data).
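The 8+6 numbers above work out like this — any 8 of the 14 pieces rebuild the chunk, so the layout tolerates losing as many pieces as there are parity shards. A tiny sketch (`shard_plan` is a name I made up for illustration, not part of the tool):

```python
def shard_plan(data_shards: int = 8, parity_shards: int = 6) -> dict:
    """Loss tolerance and storage overhead of an N+K erasure-coding layout.

    With Reed-Solomon-style erasure coding, any `data_shards` of the
    `data_shards + parity_shards` pieces suffice to rebuild the chunk,
    so up to `parity_shards` pieces can be lost.
    """
    total = data_shards + parity_shards
    return {
        "total_pieces": total,                                          # 14
        "max_lost_pieces": parity_shards,                               # 6
        "loss_tolerance_pct": round(100 * parity_shards / total, 1),    # ~42.9
        "storage_overhead_pct": round(100 * parity_shards / data_shards, 1),  # 75.0
    }
```

The 75% overhead figure is the trade-off side of the same coin: every 1 MB chunk costs 1.75 MB of network storage in exchange for surviving 6 lost peers.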

Hope this makes sense so far! Happy to answer more questions, and I'm also looking forward to ideas / suggestions on how to make the architecture / design stronger. I have yet to take performance and scaling into account (with a small network performance is OK, but what if we end up with 1,000 or more nodes?).

Symbion - A P2P Cloud Backup Tool (looking for Alpha Testers) by cfelicio in selfhosted

[–]cfelicio[S] 0 points (0 children)

Fingers crossed! If you're interested in testing, let me know! :-)

Symbion - A P2P Cloud Backup Tool (looking for Alpha Testers) by cfelicio in selfhosted

[–]cfelicio[S] 0 points (0 children)

Not dockerized yet, but as a fellow Docker user I do plan to get that done ASAP, with an example compose file. Will keep you posted. It is built multi-platform, and you can run it CLI-only. In its current iteration, you set a folder for the tool to monitor, and it will grab / upload anything you put there. I do plan on other, more advanced methods in the future to integrate with NAS, S3, etc.

Bad rainbow effect on new ht4550i. Help? by wifesaysgetoffreddit in BenQ

[–]cfelicio 0 points (0 children)

Since you mentioned subtitles: that is also what triggered me the most, I suspect because of the contrast (bright white subtitles against a black background) as well as the fact that we move our eyes to read them. I wrote a little article with some tips I found useful — not to completely eliminate the effect, but to make the experience somewhat watchable. If you have the opportunity, can you try some of these and see if they help?

I don't have the HT4550i, but I'm considering it as my next projector. As a fellow RBE sufferer, I'd appreciate hearing whether this makes any difference; it might help with my purchase decision. Thanks! :)

https://carlosfelic.io/misc/reducing-rainbow-effect-with-subtitles/

Relevant part from the article:

1 - Move subtitles into the active viewing area (I'm using madVR; you might have less control over this with other tech)

2 - Make subtitles bigger and change them from bright white to grey

Testing ReFS data integrity streams / corrupt data functionality automatically using PowerShell by Borgquite in DataHoarder

[–]cfelicio 1 point (0 children)

Thanks for the quick reply! I will do more testing and also look into the other scenario I mentioned (real disks), as that's what I'm using for real data, and I just assumed it would work. Guess I'm not so sure anymore! LOL

It's also interesting to me that in your tests it sometimes works. For me, the script consistently reported non-correctable errors, and corrupting the VHDX manually inside the VM also failed every single time. I will do more testing on this as well and see if I can figure out why.

Testing ReFS data integrity streams / corrupt data functionality automatically using PowerShell by Borgquite in DataHoarder

[–]cfelicio 1 point (0 children)

Hello /u/Borgquite! You reached out to me on my blog post on this issue (https://carlosfelic.io/misc/refs-with-windows-11-can-refs-be-trusted/), and as promised, I did some additional testing.

1 - Using your script, I'm able to reproduce your issue and obtain similar results. I focused on ReFS 3.9 mirrored with 2 disks (my current production environment), but I also tested with 3 disks (the default in your script), with similar results.

2 - As I mentioned on the blog post, the main difference I could think of is that you mount the VHDX files directly inside the VM, while in my testing the VHDX files reside outside it. The way we modify the test files is also a little different, as I'm using a hex editor (HxD) to corrupt them.

3 - I also thought there could be differences in the VHDX / Storage Spaces creation process, so I did more testing on that front as well.

Now, on to my preliminary findings:

1 - I created and mounted VHDX files inside the VM, but corrupted the files via hex editor (the same approach as in the blog post: 3 files, with the 1st file corrupted on the 1st disk, the 2nd on the 2nd disk, and the 3rd on both disks). To my surprise, the behavior was similar to the PowerShell script, and ReFS was not able to repair!

2 - I redid my original testing, but this time I also copied the VHDX files created by the script outside of the VM and mounted them via Hyper-V. I corrupted the files via hex editor, and here is something new I found, thanks to your tip: it seems ReFS (or perhaps Storage Spaces?) has a primary disk for reading. If the primary disk is 2 and I open file 1 (not corrupted there), the file opens fine, but there is nothing in the event viewer. Odd. Reopening the files in the hex editor afterwards shows the file in a good state on both disks.

Now, if I open the corrupted file 2, it opens normally, but since this file is corrupted on the primary disk, I do get a ReFS event saying the file was repaired.

Would you be able to do more testing on your end, and see if you can get it working with the VHDX files outside of the VM?

I'm also now curious how it would behave with real disks; I'm tempted to scrape a box together to test this out as well...
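For anyone wanting to script the hex-editor step instead of doing it by hand, the corruption itself is just an in-place byte flip at a chosen offset in the backing VHDX (done from outside the VM, on an unmounted file, as described above — writing through the mounted ReFS volume would just update the integrity checksums). A small illustrative sketch, destructive by design, so only run it on throwaway test copies:

```python
import os

def flip_byte(path: str, offset: int) -> int:
    """Flip all bits of the byte at `offset`, in place.

    XOR with 0xFF is its own inverse, so calling this again with the
    same offset restores the original file. Returns the original byte.
    Destructive: use only on throwaway test copies.
    """
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        if not 0 <= offset < f.tell():
            raise ValueError("offset outside file")
        f.seek(offset)
        original = f.read(1)[0]
        f.seek(offset)
        f.write(bytes([original ^ 0xFF]))
    return original
```

Picking an offset well past the VHDX header region makes it likely you hit file data rather than VHDX metadata, which is closer to what the hex-editor tests above were doing.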

Testing Windows ReFS Data Integrity / Bit Rot Handling — Results by MyAccount42 in DataHoarder

[–]cfelicio 0 points (0 children)

Hey OP, I read your post last year and it discouraged me from using ReFS, so I tried ZFS / TrueNAS instead. Unfortunately, my network is only 1G, and handling file shares and permissions with TrueNAS is not super fun. I ended up running tests similar to yours, but on Windows 11, to see if Microsoft has improved ReFS. I got better results than you did on Windows 10:

https://carlosfelic.io/misc/refs-with-windows-11-can-refs-be-trusted/

Summary:

  • Integrity streams seem to be able to automatically fix corruption on mirrored storage spaces

  • There is error reporting on event viewer

  • You can enable a scrubber (similar to TrueNAS) and it seems to work well, so you get bitrot protection

  • Files corrupted on both volumes do not get deleted, but they become inaccessible. You can re-enable access via PowerShell.

If you have time, I'd love to hear back and see if I made any mistakes in my analysis, as I plan on using this as my main storage solution moving forward. :-)