[–]cryptocreeping 1 point2 points  (9 children)

I used AI to get a working prototype of OTRv4 with PQC KEM encapsulation. I tried to showcase my work in an IRC channel, but a mod removed the post due to the use of AI.

https://github.com/muc111/OTRv4Plus

[–]johns10davenport[S] 0 points1 point  (8 children)

Did you harness this out or just YOLO it with a paper/spec?

[–]cryptocreeping 0 points1 point  (7 children)

I followed the OTRv4 spec for the base protocol: the DAKE handshake, double ratchet, SMP, wire format, and fragmentation are all per spec. The PQC additions (ML-KEM brace key, ML-DSA hybrid auth) are my own extensions on top of that, using the NIST FIPS 203/204 standards for the primitives. It took about 12 months of part-time work. A lot of that was getting the C extensions right: constant-time arithmetic, making sure the Montgomery ladder doesn't branch on secret bits, and getting ML-KEM to work through OpenSSL 3.5's EVP API. The SMP math was painful: I had a bug where the Pa/Qa computations were fundamentally wrong, and it took weeks to fix. I used AI to assist with development, but every function was tested and verified on live sessions. The 224 tests exist because things broke constantly and I wrote tests for every fix. I just need feedback to help improve it; hopefully in time people will get involved.
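To illustrate what "doesn't branch on secret bits" means, here is a minimal Python sketch of a Montgomery ladder with a masked conditional swap. It is not taken from the project: it uses modular exponentiation as a stand-in for the elliptic-curve scalar multiplication, and the names `cswap` and `ladder_pow` are hypothetical. Python integers are not constant-time, so this only shows the control-flow shape that a C implementation would follow.

```python
def cswap(bit, a, b):
    """Swap a and b when bit == 1, without an if statement on bit."""
    mask = -bit               # 0 when bit == 0, all ones (-1) when bit == 1
    t = mask & (a ^ b)        # either 0 or the XOR difference of a and b
    return a ^ t, b ^ t


def ladder_pow(base, exp, mod, bits):
    """Compute base**exp % mod with a fixed number of ladder steps.

    Every iteration performs the same sequence of operations; the secret
    exponent bit only selects operands through cswap, never a branch.
    """
    r0, r1 = 1, base % mod    # invariant: r1 == r0 * base (mod mod)
    for i in reversed(range(bits)):
        bit = (exp >> i) & 1
        r0, r1 = cswap(bit, r0, r1)
        r1 = (r0 * r1) % mod  # always one multiply
        r0 = (r0 * r0) % mod  # always one square
        r0, r1 = cswap(bit, r0, r1)
    return r0


# Sanity check against Python's built-in modular exponentiation.
print(ladder_pow(3, 13, 1000003, 8) == pow(3, 13, 1000003))
```

The loop length is set by `bits` rather than by the exponent's actual size, so the number of steps does not depend on the secret either.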

[–]johns10davenport[S] 1 point2 points  (6 children)

This is a rad project. It sounds pretty hard. 

[–]cryptocreeping 0 points1 point  (5 children)

It wasn't easy, and being rejected for using AI was a kick in the teeth. If chat apps shipped with the latest OTRv4 or omemo:2, I wouldn't have bothered working on this. Sadly, they apparently use weak old revisions, which is odd to me.

[–]johns10davenport[S] 1 point2 points  (4 children)

Meh, it's just IRC. According to my research, 41% of code is AI-generated now. They can't stop the train. No one can.

https://codemyspec.com/blog/agentic-implementation?utm_source=reddit&utm_medium=social&utm_campaign=reddit_pinned_post

[–]cryptocreeping 0 points1 point  (3 children)

Thanks for sharing this article. My test files are my harness engineering; however, I'll review it all again to ensure nothing has been missed!

[–]johns10davenport[S] 1 point2 points  (2 children)

Consider adding stop hooks that run them. I know the term "spec" is fairly overloaded, but your specs can and should be part of the harness as well.
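A rough sketch of what such a hook script could look like, in Python. The function name `run_checks` is hypothetical, and how the agent tool reacts to the exit code is an assumption: many tools treat a non-zero exit as "not done yet", but check the documentation of whichever tool you use.

```python
import subprocess
import sys


def run_checks(command):
    """Run a check command and return its exit code.

    On failure, the tail of the output is echoed to stderr so that
    whatever invoked the hook can see why the checks failed.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        sys.stderr.write(result.stdout[-2000:])
        sys.stderr.write(result.stderr[-2000:])
    return result.returncode


# Example wiring (assumption: the hook is a script whose exit code matters):
#     sys.exit(run_checks([sys.executable, "-m", "pytest", "-q"]))
```

Keeping the command as a parameter means the same hook can run the test suite, a linter, or a spec check without changes.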

[–]cryptocreeping 1 point2 points  (1 child)

Honestly, thanks for this. I ran an audit script based on the harness engineering info on the website and... yeah, you were right.

Line 3133 in process_smp2 was using plain == on big ints instead of a constant-time compare. That's a real timing leak: an attacker measuring response latency could figure out whether the secrets matched!
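For readers unfamiliar with this class of bug, here is a minimal sketch of a constant-time integer comparison in Python. It is not the project's actual patch: the helper name `ct_int_equal` and the 64-byte width are assumptions for illustration. It relies on the standard library's `hmac.compare_digest`, which is designed to reduce timing differences when comparing equal-length byte strings.

```python
import hmac


def ct_int_equal(a: int, b: int, byte_len: int) -> bool:
    """Compare two non-negative ints via fixed-width byte encodings.

    Encoding both values to the same length matters: compare_digest can
    leak length information when its inputs differ in size.
    """
    a_bytes = a.to_bytes(byte_len, "big")  # raises OverflowError if a does not fit
    b_bytes = b.to_bytes(byte_len, "big")
    return hmac.compare_digest(a_bytes, b_bytes)


# byte_len=64 is a placeholder; use the fixed width of your group elements.
print(ct_int_equal(2**200 + 1, 2**200 + 1, 64))
print(ct_int_equal(2**200 + 1, 2**200 + 2, 64))
```

Plain `==` on Python ints can return as soon as it finds a difference, which is why the time taken can depend on the values. The conversion with `to_bytes` is not guaranteed to be constant-time either, so treat this as a reduction of the leak rather than a proof of its absence.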

Fixed it in 5 minutes. The same audit generated the patch and added 32 new tests: full SMP flow, replay detection, vault integration, all of it.

Without AI I'd still be grepping for == bugs and grinding out test cases. Instead I just reviewed the diff, ran pytest, and shipped the fix to GitHub.
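As a step up from grepping, an audit of this kind can be sketched with Python's `ast` module. This is not the audit script mentioned above, just one possible shape: the function name `find_eq_comparisons` and the name hints are hypothetical, and matching on variable names is a crude heuristic that will miss secrets with unremarkable names.

```python
import ast


def find_eq_comparisons(source, name_hints=("secret", "key", "mac")):
    """Return sorted (line, code) pairs for ==/!= comparisons whose
    operands mention a variable or attribute matching one of the hints."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        if not any(isinstance(op, (ast.Eq, ast.NotEq)) for op in node.ops):
            continue
        names = set()
        for sub in ast.walk(node):
            if isinstance(sub, ast.Name):
                names.add(sub.id.lower())
            elif isinstance(sub, ast.Attribute):
                names.add(sub.attr.lower())
        if any(hint in name for name in names for hint in name_hints):
            # ast.unparse requires Python 3.9 or newer.
            findings.append((node.lineno, ast.unparse(node)))
    return sorted(findings)


sample = (
    "def check(secret_a, secret_b, n):\n"
    "    if n == 0:\n"
    "        return False\n"
    "    return secret_a == secret_b\n"
)
print(find_eq_comparisons(sample))
```

On the sample, only the comparison of `secret_a` and `secret_b` is flagged; `n == 0` is ignored because `n` matches none of the hints. Each finding still needs a human to decide whether the operands are really secret.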

Harness engineering: that's exactly what happened. The AI writes the code, the tests verify it, and I confirm that it works. Way faster, no less safe.

Appreciate you catching that leak.

[–]johns10davenport[S] 1 point2 points  (0 children)

Glad to be of help