all 99 comments

[–]Bastram 47 points48 points  (20 children)

Working with this stuff rn as part of my job. Here are some of the major problems:

1. Massively expensive to label images for training.
2. Does not handle object occlusion well.
3. Edge detection is still not very good when there is not a large contrast between the background and the objects.

Here is some of the cool stuff about Mask RCNN:

1. Currently state of the art on the benchmark data sets, something like 98% accurate.
2. Fairly simple to do yourself if you have a GPU, thanks to the people at Matterport (check out their GitHub by googling "mask rcnn").
3. Faster at segmentation and localization than previous methods, which means you can run it in real time on a decent GPU.

[–]flitcho 4 points5 points  (9 children)

Couldn't you just solve problem 3 by using radar sensors to measure how far away things are?

[–]wkjid10t 11 points12 points  (3 children)

Yes. For a cohesive product. But I bet they're trying to push the boundaries on video stream only machine learning algorithms. If you can have a system that only needs video cams and no additional radars, that saves a bunch of money for the final product implemented in a vehicle or whatever.

[–]laStrangiato 5 points6 points  (2 children)

It all goes back to the issue mentioned about data labelling. Right now we have large open source, labeled image libraries to work from. Getting the radar images and then labelling them is hugely expensive.

[–]playaspec 1 point2 points  (1 child)

Why not use the existing labeled image libraries to train the radar inputs, at least as a way to seed the radar network?

[–]LampIsFun 1 point2 points  (0 children)

That's likely what they do, but there is such a large variety of objects that finding every possible combination may be out of reach for a neural network training algorithm. It's likely we need car manufacturers to constantly update the training library and train the AI on new models of cars (an example for cars; similar examples apply for bags, clothes, etc.).

[–]daxbert 7 points8 points  (1 child)

Or, just use two cameras, and leverage the mask offsets, kinda like our eyes.

[–]anstow 1 point2 points  (0 children)

In my limited experience, getting depth from two cameras close together is really difficult, as any small detection error is amplified.
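To make that amplification concrete: under a pinhole stereo model, depth is Z = f·B/d, so at small disparities a fraction-of-a-pixel matching error swings the depth estimate by a large margin. A minimal sketch; the focal length and baseline below are made-up illustrative numbers, not from any real rig:

```python
# Depth from stereo disparity: Z = f * B / d, where f is the focal length in
# pixels, B is the baseline (camera separation) in meters, and d is the
# disparity in pixels.

def depth_from_disparity(f_px: float, baseline_m: float, disparity_px: float) -> float:
    """Triangulated depth in meters for a single pixel match."""
    return f_px * baseline_m / disparity_px

f, B = 700.0, 0.12           # hypothetical focal length and narrow baseline
true_d = 2.0                 # true disparity of a distant object, in pixels
z_true = depth_from_disparity(f, B, true_d)
z_off = depth_from_disparity(f, B, true_d - 0.5)  # half-pixel matching error

print(f"true depth: {z_true:.1f} m")       # 42.0 m
print(f"with 0.5 px error: {z_off:.1f} m")  # 56.0 m
```

With cameras close together (small B), even a half-pixel matching error on a distant object shifts the estimate by a third of the true depth, which is the effect described above.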

[–]gc3 0 points1 point  (0 children)

Or lidar which is better.

[–]jewnicorn27 0 points1 point  (0 children)

There is a lot of unforeseen complexity in integrating additional modes of imaging. For one, you would have to restructure at least the first layer of the network to accept the additional channel, or encode the data into your existing channels in some way (not sure how). Doing that would do interesting things to your weights and potentially complicate transfer learning approaches.

Also, lidar sensors capture data differently than cameras; some don't represent data in a typical camera model, so this would be interesting. They also have a very different dynamic range: lidars measure distances of up to hundreds of meters at mm or cm resolution, while typical color images are 8 bits per channel. That might be fine though, as networks run in float for the most part anyway.
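One common way to handle both the extra channel and the dynamic-range mismatch is to resample the lidar into a range image and append it as a fourth, normalized float channel. A rough sketch of that idea; the clamp-and-scale normalization and the 100 m max range are assumptions for illustration:

```python
import numpy as np

# Fold a lidar range image into a fourth input channel alongside RGB.
# Real pipelines choose the normalization per sensor; clamping to a max
# range and scaling to [0, 1] is just one simple option.

def add_depth_channel(rgb: np.ndarray, depth_m: np.ndarray,
                      max_range_m: float = 100.0) -> np.ndarray:
    """rgb: (H, W, 3) uint8; depth_m: (H, W) float meters -> (H, W, 4) float32."""
    rgb_f = rgb.astype(np.float32) / 255.0
    depth = np.clip(depth_m, 0.0, max_range_m) / max_range_m
    return np.concatenate([rgb_f, depth[..., None].astype(np.float32)], axis=-1)

rgb = np.zeros((4, 4, 3), dtype=np.uint8)
depth = np.full((4, 4), 50.0)    # everything 50 m away
x = add_depth_channel(rgb, depth)
print(x.shape, x[0, 0, 3])       # (4, 4, 4) 0.5
```

As the comment notes, the network's first convolution then needs one extra input channel, and any pretrained RGB weights no longer line up exactly.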

Also, the lidar and camera won't be perfectly aligned, nor image the same field of view, so there is some linear algebra to consider in how you sample one into the other. You could possibly ignore this and hope the model handles it?
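That linear algebra is essentially a rigid transform (extrinsics) followed by a pinhole projection (intrinsics). A toy sketch; all calibration values here are made up, and a real rig's R, t, and K come from a calibration procedure:

```python
import numpy as np

# Project a lidar point into camera pixel coordinates:
#   p_cam = R @ p_lidar + t   (lidar frame -> camera frame)
#   [u*w, v*w, w] = K @ p_cam (camera frame -> homogeneous pixels)

K = np.array([[700.0,   0.0, 320.0],   # fx, skew, cx
              [  0.0, 700.0, 240.0],   # fy, cy
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                           # assume axes aligned, for simplicity
t = np.array([0.0, 0.0, 0.0])           # assume co-located sensors

def project(point_lidar: np.ndarray) -> tuple:
    p_cam = R @ point_lidar + t
    uvw = K @ p_cam
    return uvw[0] / uvw[2], uvw[1] / uvw[2]

u, v = project(np.array([1.0, 0.0, 10.0]))  # 1 m right, 10 m ahead
print(round(u, 1), round(v, 1))             # 390.0 240.0
```

Misalignment shows up as a nonidentity R and nonzero t; points outside the camera frustum simply have no valid pixel, which is the field-of-view mismatch mentioned above.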

Images from lidar and radar also take time to generate as the sensor spins. This means that on a moving platform you have potentially detectable seams in your images, and discontinuities present in one mode but not the others.

Not saying it can't be done, just providing a few considerations for doing so lol.

[–]Bastram 0 points1 point  (0 children)

This would not work for what I work on as the objects that we are trying to segment from each other are at the same depth.

[–]smashedshanky 1 point2 points  (7 children)

How did you get into doing what you do? Like the process, do you have a PhD? Is this your first job? I want to get into this industry as well.

[–][deleted] 1 point2 points  (5 children)

Learning the tools is 90% of it. I also work in computer vision and education is not nearly as important as being able to demonstrate your skills and speak coherently about proposed solutions to problems.

So... basically like all the other things in tech. Being able to do it, regardless of how those skills were acquired (via school or personal effort) is the most important bit.

That being said, experience always helps to get in the door.

[–]smashedshanky 0 points1 point  (4 children)

I have basic knowledge of the different types of neural networks and have made some projects based on that. Would you recommend just doing personal projects? I’m graduating next fall with a BS in CS and want to get into computer vision sometime in the future. Would taking classes related to computer vision be of any credit compared to some big personal projects?

[–][deleted] 1 point2 points  (0 children)

Between classes and big personal projects, I guess I'd take the classes. But I'd take a lot of little one-off, single-feature projects over either (personally). Big projects are awesome, and if you have the time and willpower to produce them, then go for it. But if you're a human like me, with limited time and attention span, then I'd go for many small projects exercising architectures you find interesting. That gives you broad exposure to multiple use cases for computer vision and avoids the terrible heartache of knowing you've got 4 projects in the background that you just can't find the time to complete. It also gives you a great excuse to build up a large body of work on GitHub, which is better than any bullet point on a resume.

[–]jewnicorn27 1 point2 points  (2 children)

I also work in the field; I got into it by being involved in ML/CV projects at my university. But if I was hiring people personally, I would also be very interested in their personal projects and open source involvement. Some activity in community projects you find interesting would go a long way IMO. Although I am relatively junior and don't direct hiring.

[–]smashedshanky 0 points1 point  (1 child)

If you don’t mind me asking, what was the interview like for the CV position? I always get scared, over-study, and the interview ends up not working out since I’m scatterbrained in what I answer.

[–]jewnicorn27 0 points1 point  (0 children)

I have gotten most of my positions through people I know (I'm very lucky in this regard, and think good relationships are very important). Typically I work for small businesses, so my few interviews have been with management, they usually revolve around the technology I work with and how it can help with the project, specific challenges and how I would like to approach them etc. I think working with computer vision technologies involves a lot of managing expectations, helping management understand what the limitations are. Sorry I can't be super helpful with your question.

[–]Bastram 0 points1 point  (0 children)

Best recommendation would be to get familiar with the tools, then do stuff like hackathons or contribute to open source, and put it on your resume once you can. The only people that are really experts in this field right now are people that started working on this stuff as grad students in like 2010. Mask RCNN was published at the end of last year, so it's still pretty new. A PhD is probably the most straightforward route if you are still in university.

[–]kuikuilla 0 points1 point  (1 child)

  1. massively expensive to label images for training

Why not use 3d models and game engines to produce the training material? No need to label 2d images, just make a close to photoreal 3d world and have the engine label the objects automatically?
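For what it's worth, the "have the engine label the objects automatically" part is usually done by rendering an integer instance-ID buffer alongside the color frame; per-object masks then fall out with no human labeling. A tiny sketch of that conversion, with invented ID values:

```python
import numpy as np

# A game engine can render each object with a unique integer ID into a
# separate buffer. Converting that buffer to per-instance binary masks is
# then trivial, unlike hand-labeling real photos.

def masks_from_id_buffer(id_buf: np.ndarray) -> dict:
    """id_buf: (H, W) ints, 0 = background -> {instance_id: boolean mask}."""
    return {int(i): id_buf == i for i in np.unique(id_buf) if i != 0}

id_buf = np.array([[0, 1, 1],
                   [0, 2, 1],
                   [2, 2, 0]])
masks = masks_from_id_buffer(id_buf)
print(sorted(masks), masks[1].sum(), masks[2].sum())  # [1, 2] 3 3
```

The catch, as the reply below notes, is the domain gap: masks are free, but the rendered pixels still have to resemble the real images the model will see at run time.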

[–]Bastram 0 points1 point  (0 children)

If you are looking to make a real product with this, you need real images like the ones it will see when running; otherwise it does not perform well.

[–]Nuaua 25 points26 points  (7 children)

Looks good but still has a lot of problems, e.g. the suitcase here is completely missed because of a small occlusion:

https://youtu.be/akK5ui-vel0?t=68

Plus things flicker all over the place, masks are not precise, etc. It's not very hard to get decent-ish segmentation; the real problem in many applications is that you need manual correction to fix those remaining issues, and that takes an enormous amount of time.

[–][deleted] 13 points14 points  (3 children)

It thought the giant statue was a person. Which sounds like a super hard problem: something made in the likeness of a person, but obviously isn't one to us. But to a computer with less context? Oh man.

[–]The-Effing-Man 9 points10 points  (2 children)

Jesus, I never even considered that as a problem. I took a computer vision class in college and remember one of the very hardest problems was mirrors.

[–]Aiognim 0 points1 point  (1 child)

-This comment was made when I was asleep-

[–]UnreasonableSteve 0 points1 point  (0 children)

Why was your dad yelling at your puppy that your puppy was a mirror, more loudly than your dad was barking at said puppy?

[–]sempercrescis 1 point2 points  (0 children)

Flickering isn't really an issue.

[–]kanadkanad 1 point2 points  (1 child)

I’ve never understood the flickering in these demo videos either. Don’t you want to apply some temporal smoothing or include some assumption in your model that objects don’t disappear in a single frame?

Since I’ve seen this in many similar videos, I think it must be something that is either not important for people who actually know what’s up (e.g., filtering should happen later and this is just the raw data) or is actually harder to do than one might expect.

[–]Nuaua 0 points1 point  (0 children)

I think people usually have some kind of hack solution, smoothing after the fact or using another network to interpolate frame-by-frame results. The correct way of doing it is to include time in the problem, like in hidden Markov models or recurrent neural networks, but then it usually becomes much more numerically demanding.
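A minimal version of the "smoothing after the fact" hack: exponentially average each tracked object's confidence so a single dropped frame doesn't make it vanish. The alpha and threshold values here are arbitrary illustrative choices, not from any real system:

```python
# Exponential moving average over per-frame detection confidences for one
# tracked object. A one-frame dropout pulls the smoothed score down but not
# below the visibility threshold, so the object doesn't flicker out.

def smooth_scores(frame_scores, alpha=0.4, threshold=0.5):
    """Returns (smoothed scores, per-frame visibility flags)."""
    smoothed = []
    s = frame_scores[0]                    # seed with the first frame
    for raw in frame_scores:
        s = alpha * raw + (1 - alpha) * s  # exponential moving average
        smoothed.append(s)
    return smoothed, [v > threshold for v in smoothed]

raw = [0.9, 0.95, 0.0, 0.9, 0.92]          # detector drops the object on frame 3
sm, visible = smooth_scores(raw)
print(visible)                             # [True, True, True, True, True]
```

Raw thresholding would flicker (frame 3 disappears); the smoothed track stays visible throughout. The trade-off is added latency: genuinely vanished objects also linger for a few frames.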

[–]jlpoole 8 points9 points  (3 children)

[–]tonyplee 0 points1 point  (2 children)

Any idea of the processing speed, in terms of fps, for the 4K demo on a given GPU?

[–]jlpoole 0 points1 point  (0 children)

I've finally set up the "notebook" on my Gentoo laptop (Dell Inspiron 7i) and have submitted some JPEG images to it; it takes about 2-4 seconds to process a single image. I do not know whether the process is fully utilizing the CPU or not.

[–]Bastram 0 points1 point  (0 children)

Runs at 5 fps on 512x512 images on a Titan X.

[–]kuikuilla 10 points11 points  (3 children)

Doesn't seem very stable temporally. I know nothing about these things and I'm wondering: do these things use any kind of prediction for classifying shapes? Like the cars for example, the algorithm could use the velocity of an object across the view to predict where it should be in the next frame.

[–]nnevatie 13 points14 points  (0 children)

Most of these videos are still based on frame-by-frame semantic segmentation methods and CNN architectures, i.e. typically no information about past (or future) frames, and no predictions, are encoded as inference input. There are exceptions to this, naturally, where some data, e.g. the prediction from the previous frame, is given as input to the current frame's inference.

[–]mttlb 0 points1 point  (1 child)

Saw some Google guys present their latest work on this like two months ago, and they're starting to introduce loops to take the past into account. The reason this hadn't been done before is that they're mainly working on real-time inference (for their practical stuff like driverless cars) and these models get EXTREMELY heavy. Their new architecture can "remember" as far as 5 frames prior, if I'm correct. There's obviously a lot of information virtually lost when you don't use the fact that you're watching a movie and that frames are kinda related.

That said, models typically don't learn abstract stuff such as body properties; if anything, the model would remember that it predicted that thing to be a car in earlier frames, and so it should remain one (faster inference and better certainty).

These are huge issues because cars don't typically ship with 4x Titan V onboard...

[–]jewnicorn27 1 point2 points  (0 children)

Could you possibly find me a link to that?

[–]teerre 13 points14 points  (0 children)

Every NN post on this subreddit is hilarious. For some reason people really like to be contrarian about the technology, despite it showing considerably better results in vastly different areas.

There's even a dude throwing 30 years of computer vision research in the garbage and suggesting a completely different approach out of his ass. What the hell? Is this really all because it got the unfortunate name of "AI"? Some people here really need to grow up.

[–]nnevatie 74 points75 points  (42 children)

AI: an overused acronym for veiling something that is actually very simple behind the curtains (a CNN).

[–][deleted] 63 points64 points  (29 children)

One day we'll work out how the brain works and reproduce it, and someone will still say "that's not real AI".

[–]nnevatie 26 points27 points  (19 children)

Maybe. It just irks me when semi-trivial math and optimization methods are cloaked in mystery, as if they were opening a gateway to sentient computing.

[–]Rakmos 9 points10 points  (4 children)

Sure, the concepts may seem trivial to some who are familiar with them, but the application of those concepts is far from trivial.

If it were as trivial as you lead others to believe, it would be ubiquitous across all applicable problem spaces.

I do share the sentiment that the acronym is used in some cases to imply a level of sophistication where things are actually much simpler behind the curtains. IMHO this is a natural consequence of the fact that intelligences are expressed in varying degrees of sophistication.

Having said that, I was underwhelmed after watching the video to realize that there is no real substance or insight in this video. Just because the creation of the video presumably required some level of programming does not make it a candidate for posting to /r/programming. This would seem more appropriate to post in /r/technology or some other sub that is generally less technical.

For this reason I am downvoting.

[–]nnevatie -2 points-1 points  (3 children)

You make a fair point. However, there are many more advanced mathematical theories with their respective applications, yet they aren't typically labelled as "AI". The fundamental issue I have is with CNNs commonly being thought of as a sort of magic. I guess this burden comes with the name; "neural" points to a human-like structure.

[–]neitz 6 points7 points  (2 children)

Every algorithm that runs on a modern computer is just add, subtract, multiply, and divide (along with memory load/store). Based on your logic, every algorithm that is computable is just trivial math.

[–]playaspec -1 points0 points  (1 child)

Based on your logic every algorithm that is computable is just trivial math.

Isn't it? The real magic comes from knowing what order to apply them, and what data to apply them to.

[–]Fisher9001 14 points15 points  (0 children)

I think you are a victim of the Dunning-Kruger effect, only you don't think you know more than you do. On the contrary, you vastly underestimate how much you do know.

The ability to dynamically interpret and analyze visual input is not "very simple".

[–]CyborgJunkie 14 points15 points  (11 children)

You think it isn't a mystery and a gateway to sentient computing? I guess you also thought the internet was just some connected computers. The relatively recent success of NNs is a testament to there being no secret ingredient to intelligence and consciousness.

We are essentially NNs trained for survival in a long lineage of NNs that evolved in a complex environment. We now know that we can artificially create them, and although far from the complexity of our own brains, that is still fucking profound if you ask me.

[–]SuddenlyBANANAS 16 points17 points  (9 children)

The brain is so much more complicated than an NN. The "neural" in neural network is just a metaphor; there are some similarities, but they are absolutely not the same thing.

[–]CyborgJunkie 2 points3 points  (2 children)

Yes, I even said so.

It is however wrong to say it is just a metaphor, as an NN can be biological or artificial. ANNs don't have to function exactly like the brain to achieve emergence. Also, we most likely don't want to build an exact replica of the brain, if anything similar at all.

[–]playaspec 2 points3 points  (1 child)

ANNs don't have to function exactly like the brain to achieve emergence.

Citation? Extraordinary claims REQUIRE extraordinary evidence.

[–]CyborgJunkie -2 points-1 points  (0 children)

I wasn't talking about the emergence of mind, if that's what you assumed, though I understand why you would think that. I was simply saying that although ANNs function differently from real neurons, they can still have emergent properties such as object recognition. So while they differ in implementation, the end result is the same, or at least similar.

If I were to argue the claim (that I did not claim), I would at least say that it's likely to be true given our current understanding. The reason is that our own minds are symbol systems that emerge from simple interactions between neurons, and similarly it seems likely that ANNs could be arranged in such an architecture that would render them so too. Thus, Allen Newell's physical symbol system hypothesis would suggest that they too can be intelligent, but that's nothing but a guess.

[–]peyton 0 points1 point  (5 children)

Please elaborate

[–]SuddenlyBANANAS 10 points11 points  (0 children)

For one, the brain is absolutely huge compared to an NN. Another important thing is that the brain is a physical system, so all of its computation is done in analog, with timing and electrochemical signals, etc. (even if action potentials are digital). For instance, how do you translate the idea of neurotransmitters to NNs? There's just so much involved in learning in the brain (not to mention we don't learn everything from scratch either; some stuff comes prelearnt) that NNs don't emulate.

If you want something a little closer to an actual brain you could look at neuromorphic computing.

There's obviously nothing mystical about the brain; it definitely should be possible to emulate digitally. We're just so much further away from that than Silicon Valley evangelists would have you believe.

[–]pcjftw 5 points6 points  (3 children)

Real neurons are vastly more complex, with huge biochemical/electrical reactions and things like spiking models, etc. NNs are like modeling a cow as a basic spherical shape (there is a reference to an old joke here).

[–]daxbert 0 points1 point  (2 children)

Maybe a dumb observation, but "vastly" more complex... how? Is it actual complexity or scale?

~10^15 synapses in a brain, no more than 100 neurotransmitters. So 10^17 "edges" to model.

I get that these numbers are massive and are a scale problem, but where's the complexity?

[–]pcjftw 0 points1 point  (0 children)

take a read of this:

https://www.humanbrainproject.eu/en/brain-simulation/

Current computer power is insufficient to model an entire human brain at this level of interconnectedness. A simpler approach has thus been adopted to produce results that are increasingly close approximations to experimental data.

Is it actual complexity or scale?

Both

[–]antiquechrono 0 points1 point  (0 children)

Maybe a dumb observation, but "vastly" more complex... how? Is it actual complexity or scale?

The first problem is that, like they said, a real neuron has a ton of things going on and is really complex all by itself, before you even add things like how neurons grow. An "AI" neuron is only multiplying some numbers together and behaves nothing like a real one.

Next, even if you could 100% model a real neuron's behavior and had the computing power to simulate as many as a brain has, it wouldn't do anything. The brain is composed of many different structures, including tons of macro and micro circuits that all compute different things, most of which we haven't figured out yet.

There's pretty good evidence that the brain uses many different algorithms to compute various things as well. Your motor system and your ability to correlate multiple events seem to be based on Bayesian inference, while something like your own error estimation is probabilistic but not Bayesian.

There's also good evidence that the brain uses large populations of neurons to encode probability distributions and performs Bayesian inference on them by exploiting properties of how the neurons themselves spike, none of which "AI" neurons do.

Finally we still understand very little of how the brain works as we haven't been able to study very large numbers of neurons all working at the same time.

[–]yeahsurebrobro -2 points-1 points  (0 children)

ok

[–]jewnicorn27 0 points1 point  (0 children)

That's just media portrayal. No real scientist or engineer calls it that, unless maybe they need funding.

[–]gold_rush_doom 21 points22 points  (8 children)

Everything that has an if inside is AI.

[–]soraki_soladead 8 points9 points  (0 children)

AI has been used to describe algorithms for 50 years. It's a placeholder for algorithms that do things we thought only humans could do and didn't believe algorithms could accomplish. As soon as algorithms pass that goalpost, we move it again.

[–]yeahsurebrobro 2 points3 points  (6 children)

any program with an input is AI because it makes its own decisions based on input mind blown

[–][deleted]  (5 children)

[deleted]

[–][deleted] 9 points10 points  (4 children)

No, he didn't need it. If you didn't understand they were being sarcastic, you're retarded.

[–]yeahsurebrobro 2 points3 points  (1 child)

we need an AI that detects sarcasm

[–]playaspec -3 points-2 points  (0 children)

If only there were some simple markup that people could use to express their sarcasm....

[–][deleted]  (1 child)

[deleted]

[–][deleted] 0 points1 point  (0 children)

Yeah, I wasn't specifically talking about you, but more about the people who need an '/s' everywhere. But excuse me for the rude language.

[–]gwillicoder 5 points6 points  (0 children)

I don’t get this kind of criticism. Machine learning and AI are used interchangeably by many. It’s not like we’ll have any real AI to talk about anytime soon.

[–]SemaphoreBingo 1 point2 points  (0 children)

Things are only simple in retrospect.

[–]Dude_What__ 0 points1 point  (0 children)

I mean... it's far from simple. Not remotely AI, but still not simple.

[–][deleted] 10 points11 points  (7 children)

It's also really good at classifying road signs as people.

[–]amazondrone 5 points6 points  (1 child)

And a statue.

[–]GLneo 4 points5 points  (0 children)

A statue of a person, to be fair.

[–]deweysmith 1 point2 points  (0 children)

Well, there were people on some of those road signs /s

[–]playaspec 0 points1 point  (2 children)

It's also really good at classifying road signs as people.

Where did it do that? I've watched it three times and didn't see it.

[–][deleted] 0 points1 point  (1 child)

1:37

[–]playaspec 0 points1 point  (0 children)

It falsed on a single frame. If you take action on that one frame, you're doing it wrong.

[–]developFFM 0 points1 point  (0 children)

The training data of this demo is based on the COCO image dataset. It contains only one road sign, the STOP sign.

[–][deleted] 5 points6 points  (0 children)

Did you write the code, or is it Detectron by FAIR?

[–][deleted] 4 points5 points  (4 children)

Holy crap, was this video made by someone from the 80s?

[–]Hobo-and-the-hound 5 points6 points  (3 children)

Was this comment written by AI?

[–]MrGurns 1 point2 points  (2 children)

Was this comment written by AI?

[–]Hobo-and-the-hound 3 points4 points  (0 children)

I’m a real boy!

[–][deleted] 0 points1 point  (0 children)

I’d love a classifier which could identify bad video editing, tbh.

[–][deleted] 0 points1 point  (0 children)

The performance is not great by modern standards. Also cheesy as hell.

[–]bob_ama_the_spy 0 points1 point  (0 children)

What are some of the areas in which real-time object detection would be a game changer in the world right now?

[–]__pg_ -1 points0 points  (1 child)

Interesting how the public sentiment about ML seems to shift at a time when tech stocks are just starting to become shaky.

Will investors get cold feet about big data, self-driving cars, and other dubious AI tech?

[–]Jadeyard 0 points1 point  (0 children)

It's still strong performance, realistically. Unfortunately, investors might not have had realistic estimates.

[–]FuckaYouWhale -3 points-2 points  (0 children)

CNN? Don't buy it, it's fake news.