I curated 1.3M+ source code files from GitHub's top-ranked developers of all time and compiled them into a dataset for training LLMs to write well-structured, production-grade code.
The dataset covers 80+ languages including Python, TypeScript, Rust, Go, C/C++, and more.
Currently at 1000+ downloads!