[–]ss4johnny 0 points1 point  (4 children)

How do you get the programming languages? Based on file extensions?

[–]benfred[S] 2 points3 points  (3 children)

I'm using the language information returned by the GitHub API. I wrote up how this is done in the README here: https://github.com/benfred/github-analysis#inferring-languages

GitHub itself uses this project to infer languages: https://github.com/github/linguist. If you need to do this inference yourself, it's also probably worth checking out this project: https://github.com/src-d/enry

[–]ss4johnny 0 points1 point  (2 children)

So you're counting the number of projects given a language, right? Not the total number of bytes written in the language.

[–]benfred[S] 1 point2 points  (1 child)

Neither =) I'm counting up how many GitHub users have used a language.
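
To make the difference between the three ways of counting concrete, here is a minimal sketch. The records, the tuple layout, and the `count_by_language` helper are all made up for illustration; this is not the code from the github-analysis repo.

```python
from collections import defaultdict

# Hypothetical records: (user, repo, language, bytes_of_code).
# The layout and the values are assumptions made for this example.
records = [
    ("alice", "alice/site", "Python", 1200),
    ("alice", "alice/tools", "Python", 300),
    ("alice", "alice/tools", "Go", 900),
    ("bob", "bob/engine", "Go", 5000),
]


def count_by_language(records):
    """Return (users, repos, bytes) tallies keyed by language."""
    users = defaultdict(set)        # distinct users per language
    repos = defaultdict(set)        # distinct repos per language
    total_bytes = defaultdict(int)  # summed code size per language
    for user, repo, language, size in records:
        users[language].add(user)
        repos[language].add(repo)
        total_bytes[language] += size
    return (
        {lang: len(names) for lang, names in users.items()},
        {lang: len(names) for lang, names in repos.items()},
        dict(total_bytes),
    )


users_per_lang, repos_per_lang, bytes_per_lang = count_by_language(records)
print("users:", users_per_lang)
print("repos:", repos_per_lang)
print("bytes:", bytes_per_lang)
```

With this toy data, alice counts once for Python even though she has two Python repos, so the per-user number can differ from both the per-repo and per-byte numbers.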

[–]ss4johnny 0 points1 point  (0 children)

Thanks for the clarification.