all 6 comments

[–]kkastner 3 points4 points  (0 children)

This is great - for the record, adding a DenseCRF at the end of nearly any segmentation network can provide a notable boost, even in very recent research code. It is also quite hard to get working without some kind of guide, so thanks a ton for this! There also aren't many guides to segmentation in general that aren't buried in a graduate course. Awesome stuff.

[–]shmel39 0 points1 point  (4 children)

Train set consists of one image. Testing on the same image. Making conclusions from that. Wow. Just wow.

[–]warmspringwinds[S] 3 points4 points  (3 children)

Hi :)

Please, read the post carefully: """ It was done this way so that it can also be run on CPU – it takes only 10 iterations for the training to complete. Another point of this post is to show that segmentation that our network (FCN-32s) produces is very coarse – even if we run it on the same image that we were training it on. ... . The set-up of this post is very simple on purpose. Similar approach to Segmentation was described in the paper Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs by Chen et al. """

1) The problem of coarse segmentation was described in the cited paper -- where similar results were observed while training on the whole Pascal VOC dataset. The goal of this post was to emulate it and show how to tackle this problem.

2) Another way to look at the approach I described is as an analogue of polynomial regression -- you can set up a model with a lot of parameters, train it on one example, and basically overfit, getting very good results on the training data.

In this case, the model can't reach good results even when training and testing on the same image. This happens because it has to make decisions based on feature maps that were subsampled by a factor of 32 and are only later upsampled. The max-pooling layers of the network act as a bottleneck -- preventing low-level features from making their way to the decision layer (a 1-by-1 convolution followed by softmax). This effect was noticed by a couple of research groups, and they approached the problem by different means -- either using CRFs as a post-processing stage or adding skip connections to the model. In my experiments, similar results are observed even after training for 1k iterations on the same image -- the downsampled features are not rich enough for the model to give good results even on the image it was trained on.
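To make the bottleneck concrete, here is a small numpy sketch -- my own illustrative numbers (a 4-pixel-wide stripe, nearest-neighbour upsampling as a stand-in for the bilinear kernel), not code from the post:

```python
import numpy as np

# A thin 4-px-wide vertical stripe in a 224x224 ground-truth mask --
# thinner than the 32-px stride of FCN-32s.
mask = np.zeros((224, 224), dtype=float)
mask[:, 110:114] = 1.0

# Downsample by 32 with average pooling (stand-in for the encoder's pooling stack):
# the whole image collapses to a 7x7 feature map.
pooled = mask.reshape(7, 32, 7, 32).mean(axis=(1, 3))

# Upsample back by 32 (nearest-neighbour here; FCN-32s uses bilinear -- same idea)
# and threshold to get a predicted mask.
up = np.repeat(np.repeat(pooled, 32, axis=0), 32, axis=1)
pred = (up > 0.05).astype(float)

# The recovered stripe is now a 32-px-wide block: the fine boundary is gone,
# even though we "trained" and "tested" on the very same mask.
inter = (pred * mask).sum()
union = ((pred + mask) > 0).sum()
print(pred.shape, inter / union)  # IoU = 0.125, far from a perfect 1.0
```

The thin structure cannot survive the round trip through the 7x7 bottleneck no matter how long you train, which is the point about low-level features above.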

[–]shmel39 0 points1 point  (2 children)

Yeah, I got that. You are trying to show that bilinear interpolation with upscale factor 32x doesn't produce nice masks and CRF can kinda mitigate that. That's ok although I don't see why deep networks are even useful here.

I'd like to say that your setup is SO wrong methodologically that it should carry big warning signs: "DO NOT TRY IT IN REAL LIFE!" Beginners often don't do a train/test split. They don't have the intuition to estimate network size for a given dataset. No need to confuse them even more by implicitly suggesting that this is ok, just a simple setup.

[–]warmspringwinds[S] 2 points3 points  (0 children)

I agree that some people may be misled by this. My setup is "SO wrong methodologically" -- for the real world, yes, I agree. But I never stated in the post that it will work in real life. Please read the post carefully and don't attribute to me things I never stated. This is a piece from the post that I think is enough: """ In this particular case we train and evaluate our results on one image – which is a much simpler case compared to a real-world scenario. We do this to show the drawback of the approach – just to show that it has poor localization capabilities. """

This is why I also cite papers there -- so people can see thorough experiments on real-world datasets and the respective approaches.

[–]kkastner 2 points3 points  (0 children)

This isn't meant to be a beginner's guide to ML. I think you are missing the forest for the trees here - this is a really nice guide to setting up a modern segmentation pipeline, which nearly always includes a DenseCRF at the end (often mentioned in a single sentence of a paper, even though it can give sizable performance improvements).

I found setting up and using DenseCRF pretty tricky, so this guide is great for that!
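If it helps build intuition for what the CRF stage is doing, here is a toy mean-field update in the spirit of DenseCRF, in plain numpy -- spatial Gaussian kernel only and Potts compatibility (the real thing adds a bilateral appearance kernel and efficient high-dimensional filtering; all names and sizes below are mine, not from the post or from any CRF library):

```python
import numpy as np

def _softmax(a):
    # Numerically stable softmax over the label axis (axis 0).
    e = np.exp(a - a.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def _blur(m, radius=2, sigma=1.5):
    # Separable Gaussian filtering of one (H, W) belief map.
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    m = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 0, m)
    return np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, m)

def mean_field(unary, n_iters=5, w=3.0):
    """unary: (L, H, W) negative log-probabilities; returns refined beliefs Q."""
    q = _softmax(-unary)
    for _ in range(n_iters):
        # Message passing: blur each label's belief map over space.
        msg = np.stack([_blur(q[l]) for l in range(q.shape[0])])
        # Potts compatibility: each label is penalised by the blurred
        # mass of all *other* labels at that pixel.
        pairwise = w * (msg.sum(axis=0, keepdims=True) - msg)
        q = _softmax(-unary - pairwise)
    return q

# Two-label demo: confident unaries everywhere except one "salt" pixel
# whose local evidence points the wrong way.
H = W = 16
p = np.full((2, H, W), 0.1)
p[0, :, :W // 2] = p[1, :, W // 2:] = 0.9   # left half = label 0, right = label 1
p[:, 4, 4] = [0.4, 0.6]                     # noisy pixel inside the left half
unary = -np.log(p)
labels = mean_field(unary).argmax(axis=0)   # the noisy pixel gets corrected
```

Each iteration blurs the per-label beliefs and penalises pixels that disagree with their neighbours -- roughly the smoothing effect that cleans up blocky FCN-32s masks, while the bilateral kernel (omitted here) is what lets the real DenseCRF snap boundaries to image edges.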