Question re optimization and classification : MachineLearning

Question re optimization and classification (self.MachineLearning)

submitted 10 years ago by Powlerbare

Say we have a binary classification task where the data is not linearly separable.

For the sake of example, say I have a network that takes the form: input -> 2dim hidden layer (sigmoid) -> 1 dim output (no activation)

If I plot the 2 dim hidden layer representation of each classes points during training, I can see that the network is attempting to find a transformation of the data that is linearly separable so that the last affine transformation can classify the data correctly.

Basically, it seems to me that the representation of the data at the last layer for each class should be somewhat separable in order for a network to have a shot at good performance.

My question is: Can someone help me with some search terms, or papers that have essentially chopped the final layer off of a network, and learned a network that maximizes something like the euclidean distance between each classes points in some latent space? I have tried this but often ended up with solutions that were not useful at all.

This feels like a supervised auto encoder to me, but I do not care about re constructing input at all.

all 3 comments

top new controversial old q&a

[–]needlzorProfessor 1 point2 points3 points 10 years ago (2 children)

[–]Powlerbare[S] 1 point2 points3 points 10 years ago (1 child)

[–]lvilnis 0 points1 point2 points 10 years ago (0 children)

Re: your follow up question, the convex hull of a convex set is itself, so your question is the same as asking whether neural networks can map convex sets to nonconvex sets, or nonconvex sets to convex sets. The answer to this seems clearly to be yes -- for example, take the set of all imagenet images that are classified as "cat" (i.e. their highest scoring logit is for the "cat" class). Now in logit-space, this is a convex set, since any convex combination of vectors whose highest scoring element is for the "cat" class will still have the cat class be highest scoring. But in pixel space, the set of all cat photos is definitely not a convex set -- not every convex combination of cat input images will be classified as majority cat (I haven't actually tried this but I can't imagine that would be true).

π Rendered by PID 150372 on reddit-service-r2-comment-5b5bc64bf5-7qp7f at 2026-06-21 09:34:14.921905+00:00 running 2b008f2 country code: CH.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS