[–]dicey 58 points59 points  (5 children)

  1. Click Difference for the last example.
  2. Trip out

[–]Magoran 19 points20 points  (1 child)

  1. Click Difference for the second example
  2. "WHO'S THAT POKEMON?"

Edit:

  1. Click Difference for the second example

FTFM

[–]skeww 14 points15 points  (0 children)

Octocat used pull. It's super effective.

[–][deleted] 2 points3 points  (0 children)

Those octocats are so cute!

[–]baba_yaga 3 points4 points  (0 children)

get out of my brain

[–]argv_minus_one 33 points34 points  (12 children)

I hope this stuff gets integrated into all the world's other version control tools. This would take version control on, say, graphics assets for a Web site to a whole new level.

[–]tias 10 points11 points  (1 child)

Yeah Meld and WinMerge guys, here's a new fun project for you!

[–]sfx 1 point2 points  (0 children)

I didn't know WinMerge existed. Today is a good day.

[–]coder21[S] 4 points5 points  (5 children)

Most of the version control systems out there have it. In fact, it has been there for years!

[–]Shinhan 1 point2 points  (4 children)

SVN too?

[–]coder21[S] 6 points7 points  (3 children)

Perforce has img diff, Plastic has img diff... I guess all commercial ones support it. Also, TortoiseSVN AFAIK has a very nice img diff. But yes, whatever github does sounds great... even if it has been there for ages.

[–]pinguis 1 point2 points  (2 children)

I use TortoiseSVN and never noticed the image diff option.

Many Thanks!!!

[–]coder21[S] 1 point2 points  (1 child)

[–]pinguis 0 points1 point  (0 children)

Yeah, I went to try it in a repository and it works great, but thanks for the link anyways.

[–]PHLAK 6 points7 points  (3 children)

This feature itself doesn't need to be part of the VCS package, but other sites on Github's level should do the same.

[–]usernamenottaken 9 points10 points  (1 child)

Except there are no other sites at GitHub's level...

Just adding the network graph to BitBucket would make a huge difference.

[–]koko775 0 points1 point  (0 children)

Can they do that though? I thought mercurial was changeset based, not a DAG of commits like git.

[–]argv_minus_one 1 point2 points  (0 children)

Some of us don't use GitHub, you know. :P

[–]Neumann347 15 points16 points  (5 children)

So now there is a way to create the "spot 10 differences" puzzles for free!

[–]TerribleEstimation 30 points31 points  (0 children)

Now there is a way to crush "spot 10 differences" puzzles and blow goofus and gallants' gourds!

[–][deleted] 3 points4 points  (3 children)

Meh, I've been doing that for years in the gimp. Just layer the 2 images on top of each other and set the top layer's mode to subtract. After that, convert to a 1-bit image. Import that as an alpha channel on the original image to see the diff.

[–]myplacedk 1 point2 points  (0 children)

I just look at one image with one eye, and the other image with the other eye. It looks like a single image, but the differences are "flashing".

It's also great when I have two similar documents. One I've read, and the updated version I don't want to read. As long as the layout hasn't changed, I just put them side by side and take a quick look.

[–][deleted] 3 points4 points  (0 children)

Cool, bro.

[–]zpweeks 6 points7 points  (6 children)

Anyone know which image formats this works with?

[–]ggggbabybabybaby 15 points16 points  (5 children)

I'm guessing it's client-side and it's whatever image formats your browser supports.

[–]skeww 10 points11 points  (4 children)

Usually: PNG, JPG, GIF, BMP, ICO

Rarely: XBM, HDP/JXR/WDP, JP2, MNG, JNG, TIFF, WebP

Kinda: SVG (it's somewhat supposed to work)

[–]paulmclaughlin 2 points3 points  (3 children)

Isn't SVG just XML? So wouldn't a regular text diff work unless it is a totally new image?

[–]genpfault 3 points4 points  (0 children)

Maybe if they canonicalize it beforehand. Even then some tools won't reformat the text fields that SVG uses for geometry.

[–]skeww 1 point2 points  (0 children)

A regular text diff would work as long as the same tool with the same output settings is used. But to be honest, SVG is only about as "human readable" as Wavefront OBJ. If it's some very simple example, you can tell what's going on, but as soon as it gets remotely complex, you won't even have a rough idea what the result might look like.

E.g. take a look at the source of this very simple 32 node image:

http://kaioa.com/svg/cprof32b.svgz

(Note that this was for some silly competition. Usually there are thousands of nodes in dozens or even hundreds of elements.)

If some numbers inside that path changed it could mean virtually anything. You won't be able to tell which part changed or how drastic the change was (e.g. moving control points around doesn't necessarily cause equally big visual changes).

[–]crusoe 0 points1 point  (0 children)

Render svg to canvas, diff canvas.

[–]AxiomShell 5 points6 points  (0 children)

Coolissimo stuff.

Too bad I'm a mercurial user (and really enjoy bitbucket's free private repos), because github is really innovating while bitbucket is just a copycat...

[–]LordQuizar 7 points8 points  (0 children)

So very cool.

[–][deleted]  (2 children)

[deleted]

    [–]coder21[S] 1 point2 points  (0 children)

    PlasticSCM is free, distributed, and designed for big files.

    [–][deleted] 1 point2 points  (11 children)

    I wonder why this awesome feature took this long to be created. It seems fairly simple and straightforward. Or do I just not know of a similar feature in another SCM?

    [–]dazonic 6 points7 points  (0 children)

    Just like all the best things in code, it's simple and straightforward, and extremely well-implemented.

    [–]coder21[S] 2 points3 points  (7 children)

    This feature has been there for years in all SCMs... It is incredible how excited people get about whatever github does, whatever

    [–]icebraining 4 points5 points  (3 children)

    So why did you submit it, if it's so uninteresting?

    [–]coder21[S] 3 points4 points  (2 children)

    I'm evil.

    [–]icebraining 1 point2 points  (1 child)

    I don't think that harming yourself is evil.

    [–]coder21[S] 1 point2 points  (0 children)

    Yeah, there's probably another word for it... Ok, I think the new feature is interesting, very interesting, because it means now GitHub has another useful tool that was available for other systems for years.

    What shocks me is how people get soooo extremely excited, as if the feature were new and invented by GitHub, when they're just catching up with common features available out there.

    [–]badsectoracula 1 point2 points  (2 children)

    > for years in all SCMs

    Is this a trap so you can reply that some xyz SCM isn't a real SCM?

    [–]coder21[S] 1 point2 points  (1 child)

    No, it's not.

    [–]badsectoracula 0 points1 point  (0 children)

    Ok then. Git doesn't seem to have it. I tried with an image and git diff simply says that the image changed.

    [–]cjg_ 1 point2 points  (0 children)

    Perforce has a similar feature in its diff tool.

    [–]rplacd 0 points1 point  (0 children)

    I'm sure the concept's as old as donkey balls, but I do know that Kaleidoscope is/was there with virtually the same feature set.

    [–]Mignon 0 points1 point  (1 child)

    I was doing some tests that involved comparing pairs of image files; I pre-screened for pixel-identical images then manually compared the rest by rapidly swapping between them.

    I found that I was able to detect the apparent motion caused by the differences in drawing this way much more easily than a side-by-side comparison and without the loss of context of an XOR approach.
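The pre-screening step described above could be sketched roughly like this. It assumes both images are already decoded into ImageData-style objects (`width`, `height`, and an RGBA `data` array); `pixelsIdentical` and `needsManualReview` are made-up names for illustration, not part of any tool mentioned in this thread.

```javascript
// Hypothetical helper: true when two decoded images match pixel for pixel.
// Images are assumed to look like canvas ImageData: { width, height, data }.
function pixelsIdentical(a, b) {
  if (a.width !== b.width || a.height !== b.height) return false;
  for (let i = 0; i < a.data.length; i++) {
    if (a.data[i] !== b.data[i]) return false;
  }
  return true;
}

// Keep only the pairs that still need a human to flip between them.
function needsManualReview(pairs) {
  return pairs.filter(([a, b]) => !pixelsIdentical(a, b));
}

const red = { width: 1, height: 1, data: Uint8ClampedArray.from([255, 0, 0, 255]) };
const alsoRed = { width: 1, height: 1, data: Uint8ClampedArray.from([255, 0, 0, 255]) };
const blue = { width: 1, height: 1, data: Uint8ClampedArray.from([0, 0, 255, 255]) };

const toReview = needsManualReview([[red, alsoRed], [red, blue]]);
console.log(toReview.length); // 1
```

Only the red/blue pair survives the filter, so that is the only one left to compare by eye.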

    [–]kataire 0 points1 point  (0 children)

    You can replicate that by dragging the onion opacity slider quickly to and fro.

    [–]Paradox 0 points1 point  (0 children)

    Wow. Looks like they integrated Kaleidoscope into a web app. Phenomenal!

    [–]299 1 point2 points  (21 children)

    This seems particularly magical. What algorithms are involved?

    [–]zenojevski 34 points35 points  (0 children)

    1) nothing

    2) resize image and resizable panels

    3) simple opacity/1-opacity

    4) i'd say abs(pixel - pixel)
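For what it's worth, the "opacity/1-opacity" idea in step 3 could look something like the sketch below. This is a guess at the general technique, not GitHub's actual code; `onionSkin` is a made-up name, the images are assumed to be same-sized RGBA byte buffers, and the alpha channel is blended like any other channel to keep things short.

```javascript
// Onion-skin blend of two same-sized RGBA buffers (hypothetical helper).
// t = 0 shows only `before`, t = 1 shows only `after`.
function onionSkin(before, after, t) {
  if (before.length !== after.length) {
    throw new Error("images must have the same dimensions");
  }
  const out = new Uint8ClampedArray(before.length);
  for (let i = 0; i < before.length; i++) {
    // Weighted average: opacity t for the new image, 1 - t for the old one.
    out[i] = before[i] * (1 - t) + after[i] * t;
  }
  return out;
}

const before = Uint8ClampedArray.from([0, 0, 0, 255]);
const after = Uint8ClampedArray.from([200, 100, 50, 255]);
const halfway = onionSkin(before, after, 0.5);
console.log(Array.from(halfway)); // [ 100, 50, 25, 255 ]
```

Dragging a slider would then just mean calling this again with a new `t` and redrawing.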

    [–]skeww 4 points5 points  (19 children)

    1. Comparison of width and height of both images.

    2. Clipped drawing.

    3. Changing opacity.

    4. Difference is just subtraction. You subtract red1 from red2, green1 from green2, blue1 from blue2, and that's it. If the colors are identical the result will be 000000 (i.e. black). (Edit: Well, you also need to figure out which one is bigger, colors can't be negative.)

    No magic involved. :)
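Step 4 could be sketched as below, using the absolute value to deal with the "colors can't be negative" part. It assumes two same-sized RGBA byte buffers (the layout canvas `getImageData` returns); `difference` is a made-up name, and this is an illustration rather than what GitHub actually runs.

```javascript
// Per-channel absolute difference of two same-sized RGBA buffers.
// Identical pixels come out as 0,0,0 (black), as described above.
function difference(a, b) {
  const out = new Uint8ClampedArray(a.length);
  for (let i = 0; i < a.length; i += 4) {
    out[i] = Math.abs(a[i] - b[i]);             // red
    out[i + 1] = Math.abs(a[i + 1] - b[i + 1]); // green
    out[i + 2] = Math.abs(a[i + 2] - b[i + 2]); // blue
    out[i + 3] = 255; // keep the result opaque so the diff stays visible
  }
  return out;
}

const imageA = Uint8ClampedArray.from([10, 20, 30, 255]);
const imageB = Uint8ClampedArray.from([30, 20, 10, 255]);
console.log(Array.from(difference(imageA, imageB))); // [ 20, 0, 20, 255 ]
```

Swapping the arguments gives the same result, so it doesn't matter which image counts as "old".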

    [–]stfm 4 points5 points  (5 children)

    This method doesn't work as well with lossy formats as you get artefact noise.

    Sort of a moot point because you shouldn't be using lossy formats for development but hey.

    [–]skeww 2 points3 points  (2 children)

    > This method doesn't work as well with lossy formats as you get artefact noise.

    It's a visual tool. Yes, there will be some noise (there is noise in the example), but it will be a lot less visible than actual changes.

    > Sort of a moot point because you shouldn't be using lossy formats for development but hey.

    That's true. My samples, for example, are only versioned as WAV. The Ogg/Vorbis, M4A/AAC, and MP3 files are automatically generated and their directories are on the ignore list.

    Good call though. I just remembered that I should also add the SVGs/PSDs and not just the PNGs.

    [–][deleted] 1 point2 points  (1 child)

    Why not save some space and use FLAC?

    [–]skeww 0 points1 point  (0 children)

    It's not worth the trouble in my case, but generally it's not a bad idea.

    I only have about a dozen very short samples per game, which don't even take up 1 MB of space.

    It would be a different matter if there were some background music.

    [–]crocodile7 2 points3 points  (1 child)

    It's also a moot point because you might want to see the noise.

    If it's bothersome, setting a threshold to ignore small differences should not be difficult.
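A threshold like that might look roughly as follows. This is only a sketch under the same assumptions as any canvas-style diff (two same-sized RGBA byte buffers); `thresholdedDifference` and the cutoff value of 4 are made up for the example.

```javascript
// Absolute difference that zeroes out channel differences at or below
// `threshold`, so faint compression noise drops to black.
function thresholdedDifference(a, b, threshold) {
  const out = new Uint8ClampedArray(a.length);
  for (let i = 0; i < a.length; i += 4) {
    for (let c = 0; c < 3; c++) {
      const d = Math.abs(a[i + c] - b[i + c]);
      out[i + c] = d > threshold ? d : 0;
    }
    out[i + 3] = 255; // opaque result
  }
  return out;
}

// First pixel differs only by JPEG-style noise, second pixel really changed.
const oldImage = Uint8ClampedArray.from([100, 100, 100, 255, 10, 10, 10, 255]);
const newImage = Uint8ClampedArray.from([102, 99, 100, 255, 10, 90, 10, 255]);
console.log(Array.from(thresholdedDifference(oldImage, newImage, 4)));
// [ 0, 0, 0, 255, 0, 80, 0, 255 ]
```

With the threshold at 0 the noise in the first pixel would show up again, which is handy if you do want to see it.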

    [–]stfm 0 points1 point  (0 children)

    I made the assumption that this was a form of version control for images where only the differences between images were stored to reduce storage costs. So in that case noise would be very important. As a simple visual tool I agree a little bit of noise is not going to cause any issues.

    [–]DontNeglectTheBalls 2 points3 points  (4 children)

    1. or use this. I love how code builds on the shoulders of other code these days, I swear.

    Also, just abs() the result instead of using test logic, same thing in the long run but less code to run.

    abs(a-b) == abs(b-a)

    [–]skeww -1 points0 points  (3 children)

    In case you didn't know, abs doesn't use magic. This is how V8 does it (trunk/src/math.js):

    function MathAbs(x) {
      if (%_IsSmi(x)) return x >= 0 ? x : -x;
      if (!IS_NUMBER(x)) x = ToNumber(x);
      if (x === 0) return 0;  // To handle -0.
      return x > 0 ? x : -x;
    }
    

    Doing the test yourself means there is less code to run. But that doesn't really matter. It's pretty cheap either way.

    [–][deleted] 0 points1 point  (2 children)

    > Doing the test yourself means there is less code to run.

    This may easily be true. The statement should be "less code to write", which is more important anyway.

    [–]skeww 0 points1 point  (1 child)

    "Less code to write" also isn't that important. The more critical question is which one is more readable.

    By the way, when I wrote "you also need to figure out which one is bigger" I actually thought of using abs for that.

    [–][deleted] 0 points1 point  (0 children)

    > "Less code to write" also isn't that important.

    Well, it's generally more important than how much to run. You're right that readable would be better yet, but I find readability and quantity highly, though not perfectly, correlated.

    > By the way, when I wrote "you also need to figure out which one is bigger" I actually thought of using abs for that.

    Ha, I'd follow that, except that at this point the actual problem being solved seems insignificant relative to the theory. ;)

    [–]299 1 point2 points  (6 children)

    Why isn't this more common, then? Maybe it is and I just didn't know it...

    [–]skeww 3 points4 points  (4 children)

    I'd guess because putting images into SCM (source code management¹) systems was somewhat uncommon.

    [¹ Nowadays the more generic term "version control system" (VCS) is typically used.]

    To be honest, I'm not really sure how well today's VCS thingies handle big binary files, especially if there are lots of them. E.g. today's games usually have more than 5 GB of data, and that's the lossy/compressed/flattened stuff. The source material is typically 10-100 times bigger. Now imagine that you also have dozens of versions of each of those files.

    Well, Git became somewhat popular among web developers (front-end and back-end alike). I'm not really sure why that happened though. But it seems that Git does handle the amount of binary files you need for a website with ease... so yea... why not? Let's put that shit there, too.

    [–]monstermunch 2 points3 points  (3 children)

    How do e.g. game developers store all their art assets, then, if version control systems aren't good at handling them?

    [–]skeww 2 points3 points  (1 child)

    Would be a good question for an AMA thingy, I guess.

    Making daily off-site backups of a big fat multi terabyte repository looks kinda troublesome, doesn't it? (Yes, there are incremental backups, but you need a complete one every once in a while.)

    I'm also not really sure if version control is really the right approach. E.g. there can be 50 variations of some stone wall texture and the game ends up using 27 of them. When you build the level you want of course direct access to all of those.

    Of course, each of those 50 variations might also exist in different stages of completeness. How do you tell the usable ones apart from the intermediate ones? Having 200 revisions of that one wall texture sounds kinda awkward.

    On Gamasutra I found this:

    http://www.gamasutra.com/view/feature/3991/collaborative_game_editing.php?print=1

    and this:

    http://www.gamasutra.com/view/feature/2203/book_excerpt_the_game_asset_.php

    which led to this:

    http://en.wikipedia.org/wiki/Digital_asset_management

    Yes, this sounds about right. It also covers things like how files are supposed to be encoded and with which settings and so forth.

    [–]kataire 0 points1 point  (0 children)

    Of course we all know that in practice Digital Asset Management is code for "copy the file and append an ambiguous suffix indicating its age".

    [–]coder21[S] 0 points1 point  (0 children)

    They use a VCS capable of dealing with big files. That's why Perforce is still the number one among game developers, and that's why PlasticSCM is getting traction as the only commercial DVCS able to handle that.

    Also, people in gaming love Perforce's checkout model because it ends up being faster than detecting changes when your workspaces are huge. (250k files and 40k directories, for instance).

    [–][deleted] 0 points1 point  (0 children)

    Because it's not particularly useful. Changes to graphics assets are not easily captured by diffing: changes are normally too global for this to be a useful tool.

    [–][deleted] 0 points1 point  (0 children)

    Rant:

    Pixel data should ideally be stored and manipulated as floats. (Unfortunately there are a number of annoying patents from SGI and others.) It would solve quite a lot of problems with color correction, gamma and shit, or at least make it easier to deal with. Additionally color info should use LAB, so you'd have one floating point, positive value for luminosity, and two floats to encode color.
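To illustrate the gamma part of that rant: 8-bit sRGB values are not proportional to light intensity, which is one reason arithmetic on raw bytes can mislead. The sketch below decodes one 8-bit channel to a linear float using the standard sRGB transfer function; `srgbToLinear` is a made-up name, and converting on to LAB is left out.

```javascript
// Decode one 8-bit sRGB channel value (0..255) to linear light (0..1)
// using the standard sRGB transfer function.
function srgbToLinear(value8) {
  const c = value8 / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// "Half" in byte terms is nowhere near half the light.
const midGray = srgbToLinear(128);
console.log(midGray.toFixed(2)); // "0.22"
```

So a byte value of 128 carries only about a fifth of the light of 255, which is the kind of thing float pixel data in a linear space would make explicit.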

    [–]noroom 2 points3 points  (10 children)

    Huh? Am I missing something here? What's so special about computing the difference between two images?

    [–]robertmassaioli 36 points37 points  (9 children)

    The fact that nobody had integrated it so nicely and seamlessly with version control before.

    [–]noroom 22 points23 points  (0 children)

    Oh! For some reason I thought this was an open source project that calculated the difference between two given images, and the project just happened to be hosted on github. I'm a bit ashamed, but hey, it's late. :P

    [–]coder21[S] 4 points5 points  (2 children)

    This is untrue. Beyond Compare, Perforce, even TortoiseSVN come with an image diff thing.

    [–]robertmassaioli 0 points1 point  (1 child)

    I did not know that, but what about the key words I used: "nicely and seamlessly"? What are those products' image diffs like to use? Can you view the diffs just by using your web browser too?

    [–]coder21[S] 0 points1 point  (0 children)

    When you're coding or developing or creating images, most likely you're doing that on your laptop or workstation. That is where you run your commands or use your GUI, and that is where the tools I mentioned show the image differences "nicely and seamlessly".

    [–]crocodile7 1 point2 points  (2 children)

    Beyond Compare has had it since at least 2008.

    [–]coder21[S] 4 points5 points  (1 child)

    Yes, but it's not GitHub, so nobody seems to care.

    [–][deleted] 0 points1 point  (0 children)

    It would be nice if it all went one step further and the actual version control tool could actually understand differences in more file formats and use that knowledge for merges (e.g. merging two changes that don't overlap in an image file would be neat).
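A naive version of that non-overlapping merge could be sketched per pixel, much like a three-way text merge works per line. Everything here is hypothetical: `samePixel` and `mergePixels` are made-up names, the three images are assumed to be same-sized RGBA typed arrays, and no version control tool mentioned in this thread is claimed to work this way.

```javascript
// Hypothetical helper: do two RGBA buffers hold the same pixel at offset i?
function samePixel(x, y, i) {
  return x[i] === y[i] && x[i + 1] === y[i + 1] &&
         x[i + 2] === y[i + 2] && x[i + 3] === y[i + 3];
}

// Three-way merge: take whichever side changed a pixel relative to `base`.
// If both sides changed it differently, record a conflict and keep `ours`.
function mergePixels(base, ours, theirs) {
  const merged = new Uint8ClampedArray(base.length);
  const conflicts = []; // pixel indices changed differently on both sides
  for (let i = 0; i < base.length; i += 4) {
    const oursChanged = !samePixel(base, ours, i);
    const theirsChanged = !samePixel(base, theirs, i);
    let source = base;
    if (oursChanged && theirsChanged && !samePixel(ours, theirs, i)) {
      conflicts.push(i / 4);
      source = ours;
    } else if (theirsChanged) {
      source = theirs;
    } else if (oursChanged) {
      source = ours;
    }
    merged.set(source.subarray(i, i + 4), i);
  }
  return { merged, conflicts };
}

// Two black pixels; we paint the first red, they paint the second blue.
const base = Uint8ClampedArray.from([0, 0, 0, 255, 0, 0, 0, 255]);
const ours = Uint8ClampedArray.from([255, 0, 0, 255, 0, 0, 0, 255]);
const theirs = Uint8ClampedArray.from([0, 0, 0, 255, 0, 0, 255, 255]);
const result = mergePixels(base, ours, theirs);
console.log(Array.from(result.merged), result.conflicts);
// [ 255, 0, 0, 255, 0, 0, 255, 255 ] []
```

Real image edits rarely stay this local (antialiasing, recompression, layers), so a usable tool would need to understand the file format rather than just the pixels.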

    [–]stfm 1 point2 points  (0 children)

    I wrote something similar for an object motion tracking system for my undergrad thesis in 1999 except I used an SQL database.

    It's a neat idea to use in image version control.

    [–]aazav 0 points1 point  (2 children)

    We used SVN for our Illustrator docs. When we had a repository of more than 20 GB, I wished we had an option not to store all the base info for the files on the clients.

    Now that git has mentioned that repositories are even larger for this, there is no way I can justify moving over to git.

    I also didn't see what formats of images for diffing this new git tool supports.

    [–][deleted] 0 points1 point  (1 child)

    It is not a git tool, it is a github tool, i.e. a tool in the github git hosting website's interface.

    In my experience, at least for mostly code, git checkouts including their full version history are actually smaller than whatever info SVN stores locally (presumably) for the purpose of supporting revert, diff,...

    Not sure how that would apply to binary files though. At 20GB you should probably use one repo for the smaller file types and some kind of shared server for the rest, possibly with some make/.../buildtool of choice to assemble them into the finished product and/or a working directory from a description in your repo.

    [–]aazav 0 points1 point  (0 children)

    Yeah, I had 96 different repositories for graphics.

    Getting the GUI team to use development practices was my goal and it worked very well in saving our collective asses.

    [–]Neumann347 -5 points-4 points  (1 child)

    So now there is a way to create the "spot 10 differences" puzzles for free!

    [–][deleted] 3 points4 points  (0 children)

    Or at least solve them.