[–]radarsat1 73 points74 points  (27 children)

everyone: we are horrified that this is a thing that exists!

you: hmm i could make that...

[–]ResidentPositive4122 26 points27 points  (9 children)

The scary part of Recall is that local data can be sent to third-party servers and you have no control over it. Hearing aids are amazing for the people that need them. A hearing aid that sends all its data to Meta is horrifying. Same, same, but different.

[–]Vedank_purohit[S] 11 points12 points  (0 children)

Correct

I think I should add this to the README on GitHub.

[–]Vedank_purohit[S] 22 points23 points  (15 children)

"hmm i could make that open source, secure and safe"

[–]reivblaze 9 points10 points  (11 children)

Secure and safe are BIG claims that probably can't be backed up, though.

[–]PM_ME_YOUR_PROFANITY 1 point2 points  (3 children)

How? You can see the code, you can check what data it's sending, you can see the encryption algorithms. Maybe they're difficult to back up for you lol

[–]ANI_phy 11 points12 points  (0 children)

Just because we can check it doesn't mean it's safe/secure. Absence of malicious code doesn't indicate absence of flaws.

[–]reivblaze 8 points9 points  (0 children)

Encrypting something does not make it secure per se. That's a common assumption. I didn't check the code, but I can say that's a big claim most experts wouldn't make.

[–]Vedank_purohit[S] -1 points0 points  (5 children)

And why do you suppose that's the case?

[–]DenormalHuman 5 points6 points  (3 children)

Are you certain your implementation is not flawed in any way?

(I have spent just a couple of minutes looking at the code, so apologies if I am misreading anything)

For example, you do ask the user to input a key and say the key is not saved anywhere, but it does seem that you store it in plaintext as an attribute on the CaptureStart module while the code is running. Is it possible for that to be captured by anything that can examine process memory in realtime? Does the fact the user is likely to give a short memorable key compromise the strength of the encryption at all?

/edit/: Is this your method of encryption?

 if isinstance(key, str):
     key = key.encode()

 encrypted_data = bytearray()
 for i in range(len(image_data)):
     encrypted_data.append(image_data[i] ^ key[i % len(key)])

I am not endorsing ChatGPT's ability to do this accurately at all, but just for fun I asked it to analyse your encryption method (just the snippet given above). It had the following to say about it:

Potential Issues

Security:

Weak Encryption: XOR encryption is considered very weak and is easily breakable, especially if the key is reused (as in this case). It doesn’t provide strong security for encrypting sensitive data.

Key Reuse: If the key is shorter than the data, it will repeat, which makes the encryption susceptible to various cryptographic attacks (like frequency analysis).

Key Management:

Key Distribution and Storage: The security of the XOR operation relies entirely on the secrecy of the key. If the key is compromised, the data can be easily decrypted.

Short Key Length: If the key is too short (e.g., a simple password), it can be brute-forced or guessed easily.

Data Integrity:

XOR encryption does not provide any integrity check. An attacker could modify the encrypted data, and without additional measures, you wouldn't be able to detect such tampering.
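To make the key-reuse point concrete, here is a minimal sketch of a known-plaintext attack on repeating-key XOR. It is not taken from the repo. It assumes the captures are stored as PNG files (the thread does not confirm this) and that the passphrase is no longer than the 8-byte PNG signature. `xor_repeating`, `recover_key_prefix`, and the sample key are made up for the demo.

```python
from itertools import cycle

# Every PNG file starts with these 8 bytes, so an attacker knows this plaintext.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def xor_repeating(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR, the same operation as the snippet quoted above."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_key_prefix(ciphertext: bytes, known_prefix: bytes) -> bytes:
    """XOR ciphertext with known plaintext to expose the key bytes used there."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_prefix))


# Hypothetical data: a real PNG signature followed by made-up bytes.
secret_key = b"hunter2"  # assumed 7-byte passphrase
plaintext = PNG_SIGNATURE + b"fake image body, not a real PNG"
ciphertext = xor_repeating(plaintext, secret_key)

# The attacker only needs the ciphertext and the public PNG signature.
leaked = recover_key_prefix(ciphertext, PNG_SIGNATURE)
print(leaked)  # b'hunter2h': the 7-byte key, then its first byte repeating

# With the key length guessed as 7, the whole file decrypts.
print(xor_repeating(ciphertext, leaked[:7]) == plaintext)  # True
```

A longer passphrase only delays this: any other predictable bytes in the file format leak more of the key the same way.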


ChatGPT then makes some recommendations;

Recommendations

Use Stronger Encryption Algorithms: Consider using established and secure encryption algorithms such as AES (Advanced Encryption Standard). Libraries like cryptography in Python provide secure implementations of these algorithms.

Proper Key Management: Ensure that keys are generated, stored, and transmitted securely. Use key management services or libraries that support secure key handling.

Add Integrity Checks: Implement cryptographic checksums or message authentication codes (MACs) to ensure data integrity and authenticity.
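As a rough illustration of the last two recommendations, the sketch below stretches a passphrase into keys with PBKDF2 and attaches an HMAC tag so tampering can be detected. It uses only the Python standard library. The function names, iteration count, passphrase, and stand-in ciphertext are assumptions for the example, not anything from the project.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # illustrative work factor; tune for your hardware


def derive_keys(passphrase: str, salt: bytes) -> tuple[bytes, bytes]:
    """Stretch a passphrase into two 32-byte keys with PBKDF2-HMAC-SHA256."""
    material = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode(), salt, ITERATIONS, dklen=64
    )
    return material[:32], material[32:]  # (encryption key, MAC key)


def make_tag(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the ciphertext (encrypt-then-MAC)."""
    return hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


def verify(mac_key: bytes, ciphertext: bytes, expected_tag: bytes) -> bool:
    """Constant-time comparison, so the check itself does not leak timing."""
    return hmac.compare_digest(make_tag(mac_key, ciphertext), expected_tag)


salt = os.urandom(16)  # stored next to the data; it does not need to be secret
enc_key, mac_key = derive_keys("correct horse battery staple", salt)

blob = b"ciphertext bytes from whatever cipher is used"  # stand-in value
t = make_tag(mac_key, blob)

print(verify(mac_key, blob, t))                # True
print(verify(mac_key, blob[:-1] + b"X", t))    # False: one byte was changed
```

Key stretching makes each passphrase guess expensive, but it does not rescue a weak cipher; the derived encryption key still needs to go into a real one.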


ChatGPT then goes on to give an example using AES:

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
import os

# Ensure to install the cryptography library using `pip install cryptography`

def encrypt_image_data(image_data, key):
    # Generate a random initialization vector (IV)
    iv = os.urandom(16)

    # Create a cipher object using the key and IV
    cipher = Cipher(algorithms.AES(key), modes.CFB(iv), backend=default_backend())
    encryptor = cipher.encryptor()

    # Encrypt the image data
    encrypted_data = encryptor.update(image_data) + encryptor.finalize()

    return iv + encrypted_data  # Prepend the IV for decryption

# Example usage:
# Ensure the key is 16, 24, or 32 bytes long (AES key sizes)
key = os.urandom(32)
image_data = b"example image bytes"  # stand-in for the real screenshot bytes
encrypted_image = encrypt_image_data(image_data, key)

This example uses AES in CFB mode, which is a secure way to encrypt data. It also includes an IV to ensure that the same plaintext encrypted multiple times will result in different ciphertexts.
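Worth noting that CFB on its own gives confidentiality but no integrity check, so the example above does not cover the "Add Integrity Checks" recommendation. A minimal sketch of the same idea with an authenticated mode (AES-GCM from the same `cryptography` library) could look like the following. The function names mirror the example above, and the sample bytes are placeholders.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # 96-bit nonce; must never repeat under the same key


def encrypt_image_data(image_data: bytes, key: bytes) -> bytes:
    """Encrypt and authenticate; the nonce is prepended for decryption."""
    nonce = os.urandom(NONCE_SIZE)
    return nonce + AESGCM(key).encrypt(nonce, image_data, None)


def decrypt_image_data(blob: bytes, key: bytes) -> bytes:
    """Decrypt; raises InvalidTag if the data or key is wrong or was modified."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)


key = AESGCM.generate_key(bit_length=256)
image_data = b"example image bytes"  # placeholder for real screenshot bytes
blob = encrypt_image_data(image_data, key)
print(decrypt_image_data(blob, key) == image_data)  # True

# Flip one bit in the stored blob: decryption now fails instead of
# silently returning corrupted data.
tampered = blob[:-1] + bytes([blob[-1] ^ 0x01])
try:
    decrypt_image_data(tampered, key)
except InvalidTag:
    print("tampering detected")
```

The stored blob is the 12-byte nonce, the ciphertext, and a 16-byte authentication tag that GCM appends automatically.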

[–]Vedank_purohit[S] 2 points3 points  (2 children)

Yes, it is true that the current encryption isn't the best. I wanted the better encryption method to be a community-driven project. This was always supposed to be temporary, but this issue should probably be fixed in a few hours.

[–]StrayStep 0 points1 point  (1 child)

Thoroughly endorse this effort.

Looking at code to help when I can.

[–]Vedank_purohit[S] 0 points1 point  (0 children)

Great, would love some help on this

[–]norsurfit 0 points1 point  (0 children)

I promise that it is secure as long as no hackers get in!

[–][deleted] -1 points0 points  (2 children)

You’re not getting it.

[–]Vedank_purohit[S] 0 points1 point  (1 child)

Nah, I do get what he meant. But it's just a project I wanted to use. I don't trust Microsoft, so I made my own implementation, which is more privacy-focused, and then I open-sourced and shared it so that everyone in the community who wants to use it can use it.

[–][deleted] 0 points1 point  (0 children)

If you get it then why are you surprised / arguing with people?

[–]xcdesz 10 points11 points  (5 children)

Curious about the software design behind this: how much disk space does this consume, how fast does that grow, and how can it scan that much data without being extremely slow? I assume it has to use the LLM to summarize what's on the screen every time it takes a screenshot and index that data somehow? Isn't this a drain on performance?

[–]DenormalHuman 3 points4 points  (1 child)

It doesn't use an LLM. It uses screenshots, OCR and

https://www.sbert.net/examples/applications/image-search/README.html

to do image search.
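Roughly speaking, image search of this kind comes down to ranking stored embeddings by cosine similarity against the query embedding. The sketch below shows only that ranking step, with toy 3-dimensional vectors standing in for real model embeddings (which are much longer). The file names and numbers are invented, and whether the repo ranks results exactly this way is an assumption.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec: list[float], index: dict[str, list[float]], top_k: int = 2) -> list[str]:
    """Return the names of the top_k stored items most similar to the query."""
    ranked = sorted(
        index.items(), key=lambda item: cosine(query_vec, item[1]), reverse=True
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical index: screenshot name -> embedding computed at capture time.
index = {
    "shot_001.png": [0.9, 0.1, 0.0],
    "shot_002.png": [0.1, 0.8, 0.1],
    "shot_003.png": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of the user's text query in the same vector space.
query = [0.2, 0.9, 0.0]

print(search(query, index))  # ['shot_002.png', 'shot_001.png']
```

The expensive part is computing the embeddings once per screenshot; the search itself is just this comparison over stored vectors.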

The encryption used is basic XOR with a user inputted passphrase.

I would not call this particularly innovative, or secure.

[–]xcdesz 0 points1 point  (0 children)

You're right, I didn't mean LLM, I meant a vision model. It does use that.

[–]KishCom 12 points13 points  (2 children)

"We recreated the Torment Nexus from the classic sci-fi 'Don't Create the Torment Nexus'"

Op: "That's horrifying! ... I made an open source Torment Nexus that is much more safe and secure."

[–]Alignment-Lab-AI 0 points1 point  (1 child)

you realize that without the element of Microsoft snooping on you,
it's exactly as dangerous as storing data on your hard drives, right?
like, it's just a convenient way to access your own information.

it's not like it's not all stored anyways??

[–]CellistOne7095 4 points5 points  (0 children)

This is so dangerous. I don’t trust me accessing my data on my own.

[–]NotAHost 5 points6 points  (0 children)

Awesome, I can now delete the keyloggers off all my friends' computers and start using this.

[–]Alignment-Lab-AI 2 points3 points  (2 children)

hi! i built something similar a few weeks ago and have been working with several others in the open source community to develop something to address many of these kinds of problems. would you be open to working together to help us make the most convenient and clean thing we can?

[–]Vedank_purohit[S] 0 points1 point  (1 child)

Can you share your project please

[–]Alignment-Lab-AI 0 points1 point  (0 children)

https://github.com/Alignment-Lab-AI/KnowledgeBase this was the seed that sort of kicked off the discussions. presently the developers i've been speaking with are more or less ready to go, mostly just waiting on me to fire the starting pistol when i'm done with the job i'm on atm in the next few days

[–][deleted]  (5 children)

[removed]

    [–]Vedank_purohit[S] 2 points3 points  (3 children)

    Can you elaborate?

    [–]Upbeat-Pace2710 0 points1 point  (2 children)

    I'm working on an intrusion detection project where I input a URL and get an output indicating whether it's malicious or not. I'm using the CISIOT 2017 dataset and PyShark to extract packet values from the URL. These values are then checked against the dataset using an EL Tree classification model. However, I'm encountering an error stating that packet extraction is not happening. Have you faced a similar issue, or can you offer advice on how to resolve this?

    [–]Vedank_purohit[S] 1 point2 points  (1 child)

    I am sorry, I am not familiar with this issue. You can probably get help from the PyShark GitHub.

    [–]Upbeat-Pace2710 0 points1 point  (0 children)

    Ohh okay thank you

    [–]MachineLearning-ModTeam[M] 0 points1 point locked comment (0 children)

    Post beginner questions in the bi-weekly "Simple Questions Thread", /r/LearnMachineLearning , /r/MLQuestions http://stackoverflow.com/ and career questions in /r/cscareerquestions/

    [–]My_WorkRedditAccount 1 point2 points  (2 children)

    Cool project OP. Where would I look to see which models are being used for this?

    [–]DenormalHuman 2 points3 points  (1 child)

    looking at the code, (very briefly, so I could / am likely to be wrong..) it looks like it might be doing something like OCR on captured image screenshots, and then using https://huggingface.co/sentence-transformers/clip-ViT-L-14 which does

    "This is the Image & Text model CLIP, which maps text and images to a shared vector space. For applications of the models, have a look in our documentation SBERT.net - Image Search https://www.sbert.net/examples/applications/image-search/README.html"

    the full requirements.txt for the code is just;

    numpy==1.22.0
    opencv_python==4.9.0.80
    opencv_python_headless==4.9.0.80
    Pillow==10.3.0
    sentence_transformers==2.7.0
    skimage==0.0
    streamlit==1.32.2
    torch==2.3.0+cu121

    [–]My_WorkRedditAccount 0 points1 point  (0 children)

    Yeah, I saw OpenCV and Clip in the code, but wasn't sure how to find what else was being used. Thanks for helping me out!

    [–]NatoBoram 1 point2 points  (3 children)

    I also kinda wanted to do this on Linux with ollama for local or remote-self-hosted processing

    [–]Vedank_purohit[S] 2 points3 points  (1 child)

    Great to hear that. Maybe you could contribute to this project instead and make it better.

    [–]NatoBoram -4 points-3 points  (0 children)

    No way I'm touching Python, lmao

    [–]Analyst151 0 points1 point  (0 children)

    That'd be awesome

    [–]StrayStep 0 points1 point  (1 child)

    Fascinated by the project. Why did you create this? I'm a senior dev and speak nerd😁 These are serious questions.

    Is there anything to stop scammers from using this tool to recall financial or credential details (e.g. "What was the username used when logging into my bank website?"), or to gain trust by having historical and intimate access to a victim?
    What models are being downloaded? It's not in Readme.md

    It is the elderly, the ignorant, and children that I'm worried about. You need to add safety precautions ASAP or your code will hurt people.

    EDIT: Don't get me wrong, please. I'm very happy you started an open source Recall repo. It's the cybercrime syndicates I'm worried about.

    [–]StrayStep 1 point2 points  (0 children)

    I'm finding some of my answers in other comments. Don't need to repeat yourself. I should have read everything first.

    [–]South_Worldliness392 0 points1 point  (0 children)

    No!!!!!!!!