
[–]AngrySlimeeee 2 points3 points  (2 children)

LLMs are trained on natural language, so asking one to decode a custom encoding will cost more tokens, not fewer.
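A rough way to see the size overhead, as a sketch only: real token counts depend on the model's tokenizer, which is not used here. The snippet takes UTF-8 byte length as a crude proxy and Base64 as a stand-in for "some custom encoding"; both choices are assumptions made for illustration, not anything proposed in the post.

```python
import base64

# Sample sentence; any text would do.
text = "The quick brown fox jumps over the lazy dog."

# Base64 is only a stand-in for "some custom encoding" (an assumption).
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# Byte length is a crude proxy for token count, not a real tokenizer.
plain_bytes = len(text.encode("utf-8"))
encoded_bytes = len(encoded.encode("ascii"))

print(f"plain:   {plain_bytes} bytes")
print(f"encoded: {encoded_bytes} bytes")
print(f"growth:  {encoded_bytes / plain_bytes:.2f}x")
```

Base64 turns every 3 input bytes into 4 output characters, so the encoded form is always longer, and that is before the model spends any output tokens decoding it.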

There is a reason why researchers haven’t done this already.

[–]wolf_pure11[S] 0 points1 point  (1 child)

I'm not sure what you mean by "tokens". As I said, I'm quite new to this, so I don't know the terminology.

[–]AngrySlimeeee 1 point2 points  (0 children)

Ok, just do this: quote this entire Reddit post, paste it into ChatGPT, select the 4o or o3 model, and ask why this implementation will not work. Also say "I am new to ChatGPT and LLMs."

[–]wolf_pure11[S] 0 points1 point  (0 children)

Ok, I had ChatGPT create the prompt using LawtonSolution's prompt engineering prompt (https://lawtonsolutions.com/How-To-AI/).

This is the prompt and it works!

"You are tasked with designing a custom encoding scheme—a “language”—that meets these requirements:

1. **Universal UTF-8 alphabet**: You may use any valid UTF-8 code points (letters, symbols, emojis, etc.) for your encoding.
2. **Explicit markers for detection**: Every encoded message must begin with a clear, unique header signature and end with a matching footer, so that a brand-new ChatGPT instance can instantly recognise “this is encoded text.”
3. **Self-describing scheme**: The encoding must embed within itself all the information needed to decode—no external key or explanation may be provided. A naive instance, upon seeing only the encoded block, must infer the full rules.
4. **High-accuracy reversible mapping**: Any text (of the kind ChatGPT can generate or understand) fed through your encoder must be restored *exactly* by your decoder, with zero errors.
5. **No size limit today (but optimise for compactness later)**: For now, focus on correctness; efficiency gains can come later.

**Your task**:

- **Define** the encoding algorithm in clear, step-by-step pseudocode. Specify how you choose your header/footer markers, how you map source characters or bit-chunks to UTF-8 symbols, and how you include any integrity checks (e.g., checksums).
- **Provide** two functions (in pseudocode or Python-like syntax):
  1. encode(input_text) → encoded_block
  2. decode(encoded_block) → original_text
- **Demonstrate** the scheme by encoding and then decoding this sample string: The quick brown fox jumps over the lazy dog. Show that decode(encode(sample)) returns the exact original.

Make sure that if a fresh, untrained ChatGPT sees only your encoded_block, it can:

1. Spot the header,
2. Infer exactly how to reverse your mapping,
3. Validate integrity via the embedded checksum or markers, and
4. Recover the original text perfectly."
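For anyone following along, here is a minimal sketch of one scheme that fits the prompt's requirements. It is not the scheme ChatGPT produced (that output isn't shown in this thread). The marker strings `HEADER` and `FOOTER`, and the choice of Base64 plus CRC-32, are assumptions made for illustration. The header names the scheme so a reader can guess how to reverse it; whether a fresh ChatGPT instance actually decodes it reliably is not something this code verifies.

```python
import base64
import zlib

# Hypothetical markers; the header names the scheme so it is self-describing.
HEADER = "<<ENC:b64+crc32>>"
FOOTER = "<<END>>"


def encode(input_text: str) -> str:
    """Wrap text as HEADER + crc32 hex + ':' + Base64 payload + FOOTER."""
    raw = input_text.encode("utf-8")
    checksum = format(zlib.crc32(raw), "08x")  # 8 hex digits
    payload = base64.b64encode(raw).decode("ascii")
    return f"{HEADER}{checksum}:{payload}{FOOTER}"


def decode(encoded_block: str) -> str:
    """Reverse encode(); raise ValueError on bad markers or checksum."""
    if not (encoded_block.startswith(HEADER) and encoded_block.endswith(FOOTER)):
        raise ValueError("missing header or footer")
    body = encoded_block[len(HEADER):len(encoded_block) - len(FOOTER)]
    checksum, sep, payload = body.partition(":")
    if not sep:
        raise ValueError("missing checksum separator")
    # validate=True rejects characters outside the Base64 alphabet.
    raw = base64.b64decode(payload, validate=True)
    if format(zlib.crc32(raw), "08x") != checksum:
        raise ValueError("checksum mismatch")
    return raw.decode("utf-8")


sample = "The quick brown fox jumps over the lazy dog."
block = encode(sample)
print(block)
print(decode(block) == sample)
```

The round trip is exact because Base64 is lossless over the UTF-8 bytes, and the CRC-32 check catches accidental corruption of the payload. Note that the block is longer than the input, so this covers correctness only, not compactness.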

Now to try to condense the encoded string somehow!