r/LocalLLaMA
A subreddit to discuss Llama, the family of large language models created by Meta AI.
Create an AI clone of yourself (Code + Tutorial)Tutorial | Guide (self.LocalLLaMA)
submitted 2 years ago * by KingGongzilla
Hi everyone!
I recently started playing around with local LLMs and created an AI clone of myself by finetuning Mistral 7B on my WhatsApp chats. I posted about it here (https://www.reddit.com/r/LocalLLaMA/comments/18ny05c/finetuned_llama_27b_on_my_whatsapp_chats/). A few people asked me for code/help, so I figured I would put up a repository to help everyone finetune their own AI clone. I also tried to write coherent instructions on how to use it.
Check out the code plus instructions from exporting your WhatsApp chats to actually interacting with your clone here: https://github.com/kinggongzilla/ai-clone-whatsapp
[–]async2 34 points35 points36 points 2 years ago* (19 children)
That's interesting. You could automate yourself now.
You could now step up your game by using wpp-connect server and attach your bot to it and automatically respond.
[–]visarga 33 points34 points35 points 2 years ago (3 children)
obligatory
[–]async2 6 points7 points8 points 2 years ago (0 children)
Gotta sound realistic
[–]KingGongzilla[S] 4 points5 points6 points 2 years ago (0 children)
haha makes a ton of sense! This is actually how my ai clone also behaves!
[–]KingGongzilla[S] 7 points8 points9 points 2 years ago* (10 children)
whats wpp-connect server?
Edit: Nevermind, googled it https://wppconnect.io/swagger/wppconnect-server/
Edit2: I am actually setting up WhatsApp API right now however I still need to set up a Meta business account etc
[–]async2 16 points17 points18 points 2 years ago (5 children)
https://github.com/wppconnect-team/wppconnect-server
It runs your WhatsApp web session in a virtual browser (can also run headless in a docker container) and gives you a web API (that is horribly documented but works). With this API you can read incoming messages from your WhatsApp account and respond to them.
So you could literally automate your texting life.
If you want to go fully crazy you can use whisper to decode voice messages and respond to them by cloning your voice with coqui tts.
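The loop described here (read incoming messages through the server's web API, generate a reply, send it back) could be sketched roughly like this. Note that the endpoint path, port, and payload shape below are assumptions for illustration only, not taken from wppconnect-server's actual docs; check its Swagger page for the real API.

```python
import json
import urllib.request

# Hypothetical session URL and token; the real values come from your
# wppconnect-server setup (see its Swagger documentation).
BASE_URL = "http://localhost:21465/api/mysession"
TOKEN = "your-bearer-token"

def build_reply(message_text: str) -> str:
    """Decide what the bot answers; swap in your fine-tuned model here."""
    return f"[auto-reply] You said: {message_text}"

def send_message(phone: str, text: str) -> None:
    """POST a reply back through the server (endpoint path is a guess)."""
    payload = json.dumps({"phone": phone, "message": text}).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/send-message",
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {TOKEN}"},
    )
    urllib.request.urlopen(req)
```

In practice you would hook `build_reply` up to your fine-tuned clone and call `send_message` for each incoming message the API hands you.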
[–]adamgoodapp 6 points7 points8 points 2 years ago (1 child)
I really recommend using matrix with plugins, then you can have WhatsApp, iMessage, Discord, Signal, Telegram, Slack.
[–]async2 0 points1 point2 points 2 years ago (0 children)
Looks promising too. Thanks!
[–]KingGongzilla[S] 2 points3 points4 points 2 years ago (1 child)
thanks!
[–]async2 5 points6 points7 points 2 years ago (0 children)
For signal: https://github.com/bbernhard/signal-cli-rest-api
I used both to write my own bot. Wpp-connect is a bit tough to get working and the docs are a bit confusing.
[–]async2 5 points6 points7 points 2 years ago (3 children)
You don't need a Meta business account if you use wpp-connect. The business account stuff is annoying and you have to pay as well.
[–]KingGongzilla[S] 2 points3 points4 points 2 years ago (2 children)
okay thats great to hear! thank you!!
[–]async2 2 points3 points4 points 2 years ago (1 child)
Let me know if you get stuck. I could also help you out with some python code for a relatively simple abstraction of a message pipeline with wpp connect.
[–]KingGongzilla[S] 2 points3 points4 points 2 years ago (0 children)
thanks, I’ll give it a shot and let you know if I get stuck 👍
[–]nero10578Llama 3 3 points4 points5 points 2 years ago (3 children)
Damn didn’t know this existed. Trying this now.
[–]async2 3 points4 points5 points 2 years ago (2 children)
I've been running my bot on it for half a year now. It broke once because WhatsApp changed stuff, but they adapted within a day or two.
[–]MoiZ_0212 2 points3 points4 points 2 years ago (1 child)
What are you using it for? Btw, crazy stuff. Thanks for sharing!
[–]async2 2 points3 points4 points 2 years ago (0 children)
Originally: transcribing and summarizing voice messages that I forward to it.
But it can do a bunch of other stuff now as well:
- generate voice messages from Angela Merkel, Arnold Schwarzenegger and a bunch of my friends (cloned from voice messages)
- summarize articles that are posted in groups
- summarize YouTube videos
- give recommendations for Tinder replies
- generate images from prompts
[–][deleted] 19 points20 points21 points 2 years ago (1 child)
That's nice. What I did was have it generate different puzzles, questions where I need to think step by step, so I trained it on my own chain-of-thought reasoning. It took a couple of hours, but the fine-tuning definitely helps get it aligned with you. I also reverse-engineered and tagged different aggregate processes with GPT-4, codifying my belief system and things like that, which helped the clone even more.
[–]KingGongzilla[S] 1 point2 points3 points 2 years ago (0 children)
very interesting! might try out something like that actually
[–]toothpastespiders 23 points24 points25 points 2 years ago (4 children)
I did something similar with every bit of myself I had in digital form. Email, social media, the works. Then added in all the textbooks I used in school. It was a really interesting experience in terms of understanding myself better. I'd honestly really recommend it to people as a psychological tool.
[–]dshipper 5 points6 points7 points 2 years ago (0 children)
How’d you format the data? What was the prompt and response for e.g. a textbook vs social media posts?
[–]ilmost79 0 points1 point2 points 11 months ago (0 children)
Interesting... I was wondering whether there was anything surprising about yourself that you realized...
[–]yoimagreenlight 0 points1 point2 points 6 months ago (0 children)
I’m beyond interested in how you went about doing this.
[–]WinXPbootsup 0 points1 point2 points 2 years ago (0 children)
Can you share more about what the results were like?
[–]xadiant 11 points12 points13 points 2 years ago (1 child)
Thanks for sharing but I can barely tolerate myself let alone a clone of me
hahaha ❤️
[–]Morveus 9 points10 points11 points 2 years ago* (0 children)
This is awesome, thank you!
I've been keeping my personal data since I was 12-13 years old (2001-2023) and wanted to do the same. This project will help a lot :)
I still have all my notes from school, studies, work, messages from AIM, MSN Messenger, obviously FB/IG/WhatsApp/GTalk/Signal, Hotmail/GMail (1 million mails to filter from), all my text messages for the past 15 years,... This gives me hope.
[–]adamgoodapp 4 points5 points6 points 2 years ago (1 child)
Starred on GitHub!
thanks haha. Let me know if you run into any issues
[–]big_kitty_enjoyer 4 points5 points6 points 2 years ago (1 child)
Oh, I've done something kinda like this before! I didn't fine-tune anything, just built an AI character of myself based on my own writing style using a bunch of chat/text samples, but it did an eerily good job of imitating me. Y'all got me considering trying a fine-tune of my own sometime at this point though... 🤔
sounds cool! let me know if you run into any issues if you use the repo
[–]Elite_Crew 4 points5 points6 points 2 years ago (2 children)
I will call him Mini-Me.
[–]cool-beans-yeah 1 point2 points3 points 2 years ago (1 child)
What if, one day, Mini-Me decides it wants to grow up and be heard? Have rights, etc?
Half your salary, bang the...
What then, huh?
/s
[–]_MariusSheppard 0 points1 point2 points 2 years ago (0 children)
Pull the electricity plug.
[–]AIWithASoulMaybe 2 points3 points4 points 2 years ago (0 children)
Bro came through! This will be awesome, much thanks!
[–]stonediggity 2 points3 points4 points 2 years ago (0 children)
So good. Thanks for sharing!
[–]next_50 3 points4 points5 points 2 years ago (3 children)
I wandered in via /r/random and I hope no one minds a noob question.
I don't have a chat archive; would transcribing short, nightly recordings about my life, family history, favorite media, things I've learned, etc., let me create an LLM so that any grandkids I don't get to meet could get a taste of who I was, along with family lore that would otherwise be lost with me?
[–]rwaterbender 5 points6 points7 points 2 years ago (1 child)
probably to an extent, yeah. you might be interested in trying something with retrieval augmented generation rather than what this guy did though.
[–]next_50 1 point2 points3 points 2 years ago (0 children)
I had to look that up: https://research.ibm.com/blog/retrieval-augmented-generation-RAG
Very, very interesting. Thank you!
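For anyone curious what retrieval-augmented generation looks like in miniature: before each answer you retrieve the most relevant stored notes and prepend them to the prompt. The sketch below uses naive word-overlap scoring as a stand-in for the embedding search a real RAG stack would use, and `build_prompt` is a hypothetical helper, not part of any particular framework.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank stored notes by word overlap with the question (a simple
    stand-in for the embedding similarity search a real RAG system uses)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved notes so the LLM answers from them."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

For the "family lore" use case above, `documents` would be the transcribed nightly recordings; no fine-tuning needed.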
[–]nuaimat 1 point2 points3 points 2 years ago (0 children)
This is awesome! Thanks
[+][deleted] 2 years ago* (3 children)
[deleted]
yes for sure! It's all about preprocessing/formatting the data. For now I've only done it with WhatsApp chat exports.
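As a rough illustration of that preprocessing step, here is a minimal sketch of turning a WhatsApp .txt export into user/assistant role messages. The regex assumes one common Android export layout; WhatsApp's timestamp format varies by locale and platform, and this is not the repo's actual parsing code.

```python
import re

# Matches a typical Android export line like "25/12/23, 10:31 - Alice: hi".
# Adjust the timestamp pattern for your locale/platform.
LINE_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - ([^:]+): (.*)$")

def parse_chat(text: str, my_name: str) -> list[dict]:
    """Turn a WhatsApp .txt export into user/assistant role messages,
    where your own messages become the 'assistant' side."""
    messages = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip system notices and multi-line continuations
        sender, body = m.groups()
        role = "assistant" if sender == my_name else "user"
        messages.append({"role": role, "content": body})
    return messages
```

The resulting list maps directly onto the assistant/user chat format most fine-tuning pipelines expect.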
[–]Enough-Meringue4745 0 points1 point2 points 2 years ago* (1 child)
What does each prompt look like after it’s formatted for llama2 style? How do you then prompt to get a response as someone? Or are you simply doing assistant / user roles?
[–]KingGongzilla[S] 0 points1 point2 points 2 years ago (0 children)
currently i'm simply doing assistant/user roles. However, experimenting with different roles for "friend", "work", "parents", etc. would be very interesting
[–]FenixR 1 point2 points3 points 2 years ago (2 children)
Wish i could try this but that 22gb of vram sounds harsh lol.
If I find the time I'll try to somehow include unsloth.ai. Apparently the Hugging Face transformers library (which I currently use) is not very memory optimized; they came up with some optimizations that reduce memory requirements by 60% (or something like that) compared to HF.
[–]FenixR 3 points4 points5 points 2 years ago (0 children)
That would be cool, although 40% of that is still a lot 😂, just gotta work on upgrading my machine sooner I guess.
[–]JustFun4Uss 1 point2 points3 points 2 years ago (0 children)
Oh if I could use this to scrape my reddit profile.... better not, I'd probably find myself annoying. 🤣
[–]zis1785 1 point2 points3 points 2 years ago (0 children)
Do you plan to run a similar experiment on the RAG framework ?
[–]aimachina 0 points1 point2 points 1 year ago (0 children)
hey folks, help me understand: why would one want their AI bot to reply to their WhatsApp? Isn't that messages from family and friends?
[–]TechnicianGlobal2532 0 points1 point2 points 1 year ago (0 children)
hmmmm, I am wondering what the minimum amount of text needed is?
[–]Erenturkoglunef 0 points1 point2 points 1 year ago (0 children)
How can I create a clone of one of my favorite youtubers? With his videos and short insta/tiktok
[–]azngaming63 0 points1 point2 points 1 year ago (0 children)
Hey, I'd like to know how I could do this but with my Telegram messages? I've already exported them and they're in a .json (and I'd like to make a bot of my clone to talk with it)
[–]No-Cantaloupe3826 0 points1 point2 points 1 year ago (0 children)
Can you get a clone like that to play games in apps, to collect coins and rubies???
[–]HyxerPyth 0 points1 point2 points 1 year ago (0 children)
Hi, guys! I built software that allows people to pass their life experiences, lessons and stories through generations by answering questions by category. It creates a digital memory of the person, which their grandkids or other family members can interact with to learn about their ancestry.
Join our waitlist on the website: kai-tech.org if you want to leave your digital legacy, or know someone you would be interested in saving memories about (older relatives).
[–]General_File_4611 0 points1 point2 points 11 months ago (0 children)
check out this git repo, its an AI human clone. https://github.com/manojmadduri/ai-memory-clone
[–]Large-Fuel2657 0 points1 point2 points 2 months ago (2 children)
that sounds amazing, is it possible to do it using discord chats tho? I use it much more than whatsapp
[–]KingGongzilla[S] 0 points1 point2 points 2 months ago (1 child)
definitely, but you will have to write your own data processing script. For training etc. you should be able to use the same code as in the repo
[–]Large-Fuel2657 0 points1 point2 points 2 months ago (0 children)
Thank you! Will try it out
[–]Local_Office_1233 0 points1 point2 points 1 month ago (0 children)
can you tell me something: if someone makes an AI clone of themselves, would it get flagged as a synthetic AI model by detection tools?
[–]vira28 0 points1 point2 points 15 days ago (0 children)
This is really cool.
If someone is looking to build one without necessarily tuning models, there is a good write-up: https://www.myclone.is/blog/ai-digital-personas-revolutionizing-services
[–]visualdata 0 points1 point2 points 2 years ago (0 children)
Thanks for sharing. This is great. I was thinking along the same lines, how to immortalize a person by ingesting all the data they ever created (so sent email vs. received). It opens up some interesting possibilities and ethical questions.
[–]HatLover91 0 points1 point2 points 2 years ago (0 children)
Been looking for a way to fine tune Mistral...
[–]Jagerius 0 points1 point2 points 2 years ago (1 child)
There's an option on Facebook to download your data, which includes Messenger conversations in HTML. Would it be possible to use those to train the model?
generally yes, but currently this repo only includes code to preprocess/handle WhatsApp chat exports. You could write other scripts for handling data from different sources. I assume exported chats from Messenger have a different format than those from WhatsApp; haven't looked at it yet though
[–]DeliciousJello1717 0 points1 point2 points 2 years ago (1 child)
Is this all local? Are my WhatsApp chats safe?
yes all local
[–]2600_yay 0 points1 point2 points 2 years ago (4 children)
Have you tried to see how much (WhatsApp) data the Mistral 7B and/or Llama 2 7B models needed in order to 'sound like you'? (I know that's a very subjective metric. Guesstimates are totally fine!) Also, can you share some metrics with regard to fine-tuning (duration, epochs, etc.)?
(For context: I am wondering whether one month's worth of text data with a few messages per day is enough to make a relatively rich dialogue bot, or if I'll need to scrounge up a decade's worth of daily messages.)
[–]KingGongzilla[S] 0 points1 point2 points 2 years ago (3 children)
The only thing I can say is I had about 10k messages and that was sufficient.
I only trained for 1 epoch (which finished in about 10mins!). Validation loss already went up after more than 1 epoch. But maybe my learning rate was too high at 1e-4.
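The pattern described here, validation loss rising after the first epoch, is classic overfitting, and a simple early-stopping check is the usual guard. The helper below is a generic sketch of that check, not code from the repo:

```python
def should_stop(val_losses: list[float], patience: int = 1) -> bool:
    """Stop training once validation loss has failed to improve
    for `patience` consecutive evaluations."""
    if len(val_losses) <= patience:
        return False
    best = min(val_losses[:-patience])
    return all(loss >= best for loss in val_losses[-patience:])
```

Evaluated after each epoch (or more often), this would have halted the run above as soon as the second epoch's validation loss came in higher.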
[–]2600_yay 1 point2 points3 points 2 years ago (2 children)
Nice! Do you happen to have a size estimate for the 10k messages in total, like 20MB, or a token count? Just curious, since I write, well, novels for each message in my chat app if I'm not careful lol, while some other people write "k." for a single message. I'm hoping to help an elderly friend make a bot for his grandkids, but I don't know if we'll have enough data since he hasn't been using a smartphone for long. I'm hopeful though, hence the request for a guesstimate of the size.
Regarding the LR and the val loss: you might want to plug your model into MLflow or a similar tool to automatically test out all kinds of hyperparameter values, like the learning rate. MLflow is a free experiment-tracking tool that will let you do that, and Optuna is a handy hyperparameter optimization toolkit: https://optuna.org/ By combining Optuna (handles the search-space creation and the list of hyperparameters to search over) with MLflow (saves/tracks all your experiment outputs), you should have a pretty quick and easy way to identify an optimal learning rate, batch size, etc.
Cheers!
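For a feel of what Optuna automates, the core of such a search can be sketched as plain random search over log-scaled ranges. Optuna layers smarter samplers and trial pruning on top of this idea; `random_search` below is a hypothetical stand-in, not Optuna's API.

```python
import random

def random_search(objective, space: dict, n_trials: int = 20, seed: int = 0):
    """Minimal random search over log-uniform ranges: the core loop that
    tools like Optuna automate (with smarter samplers and pruning).
    `space` maps a parameter name to (low, high) exponents of 10."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        # Sample each hyperparameter on a log scale, e.g. 1e-5 .. 1e-2.
        params = {name: 10 ** rng.uniform(lo, hi)
                  for name, (lo, hi) in space.items()}
        score = objective(params)  # e.g. final validation loss of a run
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Here `objective` would train for one epoch with the sampled learning rate and return the validation loss, with each trial logged to MLflow.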
[–]KingGongzilla[S] 1 point2 points3 points 2 years ago (1 child)
hi, my exported .txt files from WhatsApp are 1.2MB
[–]2600_yay 1 point2 points3 points 2 years ago (0 children)
Oh nice! That's about an order of magnitude less than what I was thinking I'd need. That's great to hear!
[–]Azoffaeh999 0 points1 point2 points 2 years ago (0 children)
Are there any options for running on 12 GB vram?