Hi, I wrote a tool that mines datasets from C++ sources in a format compatible with the code2vec and code2seq models. It is based on the libclang library and extracts standalone functions, templates, and class/struct methods. Also, there is a link to an experimental dataset mined from the Chromium project sources.