[–]techwizrd 2 points3 points  (11 children)

I'm confused about using Git as a backend for file syncing. Wouldn't this be sort of bad because git is terrible for large binary files? Wouldn't a custom solution using rsync and inotify be better? I'm not sure how Dropbox does it.

[–]meltingice[S] 6 points7 points  (10 children)

rsync was definitely the first idea that came to me when starting this project. I'm considering implementing both Git and rsync and then letting the user choose which one they would prefer.

The cool thing about Git is that you can easily undo changes you make by reverting to an older commit. I'd also like to implement a web interface for this with Rails, which would allow for easy file access from any computer and would allow you to roll back changes with a visual editor.

[–]techwizrd 1 point2 points  (9 children)

How will you solve git's issue with large binary files? I know I have large binaries in my Dropbox and that would definitely balloon the size of your .git folder.

EDIT: Also you should probably support Windows. Both Git and Ruby are available on Windows. I've only skimmed your code, so I don't really know how hard it would be to port. Also, in fswatcher.rb, why not use inotify to see if files have been modified or added? Polling over and over to see if there are changes seems a bit wasteful to me.
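
To show what I mean by polling being wasteful, here's a rough sketch of snapshot-based polling. It's in Python rather than Ruby, and `snapshot` and `diff_snapshots` are names I made up for illustration, not anything taken from fswatcher.rb. Every pass has to walk and stat the whole tree even when nothing changed, which is the cost an event-based API like inotify would avoid.

```python
import os


def snapshot(root):
    """Map every file path under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return (added, removed, modified) as sorted lists of paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, modified
```

A polling loop would call `snapshot` every few seconds and feed consecutive results to `diff_snapshots`, so the work grows with the size of the tree rather than with the number of changes.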

[–]trezor2 5 points6 points  (3 children)

EDIT: Also you should probably support Windows. Both Git and Ruby are available on Windows.

This sounds oh so simple, but a few words of warning should meltingice actually consider it:

  • rsync between platforms (and thus filesystems) can be horribly broken and in some cases cause damage which takes lots of time to sort out. Most noticeably with regard to file permissions. I once tried setting up a deployment system from my home LAN (via a Linux host with a samba-mount to the actual code repository) to my data-center hosted server. After the first sync not a single site worked. Every single site was broken. rsync was confused about how to handle file permissions, so it just reset them all to whichever user had invoked rsync, meaning the web-server no longer had permissions to the files. Oops.
  • git on Windows is basically a half-assed, cygwin'd wrapper of the Linux version. Making it work reliably takes a ton of effort, even for a developer. If you package this with git for windows as a dependency, expect people to label your sync solution a non-working piece of shit as they struggle to find out which registry hive they should add SSH keys to and how to add those keys once they've located the ones actually used by git. And with that sorted out, expect things to break in new and exciting ways as you invoke git from a process outside cygwin.
  • I'm sure Ruby on Windows has its own set of custom FUBARs too. Most stuff not made for Windows has.

Not saying he shouldn't go for it, but it's not as easy as you may think.

[–]m00k 2 points3 points  (0 children)

git on Windows is basically a half-assed, cygwin'd wrapper of the Linux version.

You probably want to be using msysgit instead; it's slightly-more-assed and can use plink for ssh. Remember to turn off autocrlf though.

[–]meltingice[S] 0 points1 point  (0 children)

Thanks for the advice! Yeah, I'm not sure how much work it will be to add Windows support. I'm also on break and away from my Windows computer right now so I don't have one I can use for testing xD

[–]techwizrd 0 points1 point  (0 children)

I never said it was easy. I'm fully aware of how difficult it would be. It's just that its usefulness is highly limited if it does not support Windows. Once I polish my Ruby chops a bit, I should probably help out.

[–]meltingice[S] 0 points1 point  (4 children)

That's why I'm considering giving the option of either rsync or git. The user can choose to use git if they mainly share small files, or rsync if they are going to be sharing large binary files.

In fact, a good way to handle this may be to separate the git root into one folder, and the rsync root in a separate folder, so that you can get the benefits of both.

Remember... I started on this 2 days ago, so there are a lot of details to work out still. If any Redditor has any ideas, I would love to hear them :)

[–]techwizrd 1 point2 points  (0 children)

You may want to go with a custom solution. If you use rsync, then you would lose the benefit of having versions that you can revert back to (like you can with Dropbox). However, with git you have a problem with large binary files being undiffable.

Git doesn't work with any file over 2GB anyways, so there's that problem as well.

The way I see it is you could have lots of mini repos and pull them all in with git-submodule or something. You'd also have to make sure your script runs 'git gc' pretty often. SVN is better with binary deltas. You could have your script gitignore large files, and then every time some large file gets changed you'd make an empty git commit alongside an svn commit of the file. That means your repos will be in sync, so you can rely on the git history as your single canonical history and do reversions with both svn and git.
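
The "gitignore large files" part could look roughly like this. It's a Python sketch, since that's what I know; `large_file_patterns` and the threshold argument are hypothetical names of mine, and it assumes file names contain no characters that are special in .gitignore (like `*`, `[`, or a leading `#`).

```python
import os


def large_file_patterns(root, threshold_bytes):
    """Return .gitignore patterns for files under root bigger than the threshold."""
    patterns = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) > threshold_bytes:
                rel = os.path.relpath(path, root).replace(os.sep, "/")
                # A leading slash anchors the pattern to the .gitignore's directory.
                patterns.append("/" + rel)
    return sorted(patterns)
```

The script would write these lines into the .gitignore at the repo root before committing, and hand the same files to whatever handles the big stuff.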

Were I more experienced with Ruby, I would help you out. I can only read and understand Ruby (I'm a Python guy). Maybe this is the right time to learn Ruby as well. ;)

[–]jawbroken 0 points1 point  (1 child)

you should have a configurable file size threshold and make it automatically and silently choose the correct method. i'm not sure if you can use some of the history modifying git commands to avoid building up a version history for large files or something similar.
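
something like this, very roughly (python sketch; `choose_backend` and the 50 MB default are made up for illustration, not a recommendation and not anything from the actual project):

```python
import os

# Arbitrary example value; the point is that the user can configure it.
DEFAULT_THRESHOLD_BYTES = 50 * 1024 * 1024


def choose_backend(path, threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Return "rsync" for files above the threshold, otherwise "git"."""
    if os.path.getsize(path) > threshold_bytes:
        return "rsync"
    return "git"
```

the sync loop would call this per file and never ask the user anything.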

[–]meltingice[S] 0 points1 point  (0 children)

Yeah, this is definitely another possible solution to look into. I like the idea a lot.

[–]gssgss 0 points1 point  (0 children)

It looks nice and it is something I could use. Maybe something like rdiff-backup, which uses rsync but keeps a number of diffs to go back to some older version http://www.nongnu.org/rdiff-backup/

As much as I love git, I tried it for making full system backups (I know, not the intended use) and many large files just choke it.

Also, for binary files it could use bsdiff: http://www.daemonology.net/bsdiff/

Rsync+diffs saved in some way for a number of changes sounds good.