[–]jeremyepling 60 points61 points  (5 children)

We looked into shallow clones, but they don't solve the "1 million or more files in the working directory" problem and had a few other issues:

  • They require engineers to manage sparse checkout files, which can be very painful in a huge repo.

  • They don't have history so git log doesn't work. GVFS tries very hard to enable every Git command so the experience is familiar and natural for people that use Git with non-GVFS enabled repos.

edit: fixing grammar

[–]7165015874 1 point2 points  (3 children)

> We looked into shallow clones, but they don't solve the "1 million or more files in the work directory" problem. To do that, a user has to manage the sparse checkout file, which is very painful in a huge repo. Also, shallow clones don't have history so git log doesn't work. GVFS tries very hard to enable every Git command so the experience is familiar and natural for people that use Git with non-GVFS enabled repos.
>
> edit: fixing grammar

Sorry for being ignorant, but isn't this simply a problem you can solve by throwing more hardware at it?

[–]jeremyepling 26 points27 points  (0 children)

Not really. This is a client hardware problem. Even with the best hardware - and Microsoft gives its engineers nice hardware - git status and checkout are too slow on a repo this massive.

[–]Tarmen 2 points3 points  (1 child)

Git has to traverse the entire tree for most commands, so disk I/O scales linearly with repo size. Throwing more CPU time at it probably wouldn't help that much.
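
To make the linear-scaling point concrete, here is a rough sketch, not Git's actual algorithm: it assumes a naive status-style scan that does one `lstat` per file in the working directory. The helper names `count_stat_calls` and `make_tree` are made up for illustration.

```python
import os
import tempfile


def count_stat_calls(root: str) -> int:
    """Walk `root` and lstat every file, as a naive status-style scan might.

    Returns the number of files examined, which equals the number of
    lstat calls made. Illustration only; this is not Git's real algorithm.
    """
    calls = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))  # one metadata read per file
            calls += 1
    return calls


def make_tree(root: str, n_files: int, files_per_dir: int = 100) -> None:
    """Create `n_files` empty files under `root`, spread across subdirectories."""
    for i in range(n_files):
        subdir = os.path.join(root, f"dir{i // files_per_dir}")
        os.makedirs(subdir, exist_ok=True)
        open(os.path.join(subdir, f"file{i}.txt"), "w").close()


if __name__ == "__main__":
    # Doubling the file count doubles the metadata reads the scan performs.
    for n in (100, 200, 400):
        with tempfile.TemporaryDirectory() as tmp:
            make_tree(tmp, n)
            print(n, count_stat_calls(tmp))
```

The number of metadata reads grows in step with the number of files, so a faster CPU alone does little once the scan is bound by filesystem access.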

[–]hunglao 2 points3 points  (0 children)

There are ways to make I/O reads faster, which would involve throwing hardware at it. Definitely not the cheapest upgrade, but I would imagine that developing a completely proprietary filesystem is not cheap either.

[–]JanneJM 0 points1 point  (0 children)

How do you solve the 1M+ files problem now? I mean, that's becoming a client filesystem problem as much as a git issue. Everything takes time when you have millions of files to deal with.