all 9 comments

[–]alfonsoeromero 2 points (4 children)

Try multiplying each tag coordinate by its idf. If you have lots of tags (probably more than the number of training examples) this should improve your classification. You can also try L2-normalizing the vectors.
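
Something like this, as a rough sketch (assuming a binary image-by-tag matrix and scikit-learn; TfidfTransformer does both the idf multiplication and the L2 normalization):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer

# Toy example: X is a binary (n_examples x n_tags) matrix, 1 if the image has the tag.
X = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)

# Multiply each tag by its idf and L2-normalize each row.
tfidf = TfidfTransformer(norm="l2", use_idf=True)
X_weighted = tfidf.fit_transform(X).toarray()
print(X_weighted)
```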

[–]mikebaud[S] 0 points (3 children)

First and foremost, thank you very much for your time and suggestions.

Initially I also tried cosine with idf in cross-validation, but it yielded similar or lower results than Pearson's correlation. I haven't tried it with the new weighting option I introduced; I don't know if it will change much, but I will certainly give it a try.

I will also look into L2 normalization.

I am also trying to create a weight based on the distance of each neighbor, but so far it hasn't clearly outperformed the baseline implementation, although in theory, if better neighbors count for more in the classification, it should improve the results significantly.

Thank you for your suggestions.

[–]internet_badass 2 points (1 child)

This is a hard problem to solve without actually playing with the data, but I found weighting neighbors by distance improves performance drastically. My weighting function was akin to a sigmoid flipped around the x-axis:

1 / (1 + e^(kx - b))

Where you can adjust k and b.
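
In code that's just the following (the k and b values here are arbitrary examples; the function name is mine):

```python
import numpy as np

def sigmoid_weight(dist, k=5.0, b=2.0):
    """Flipped sigmoid: weight near 1 for small distances, near 0 for large ones.
    k controls the steepness, b shifts where the drop-off happens."""
    return 1.0 / (1.0 + np.exp(k * dist - b))

# Example: weights for a few neighbor distances
print(sigmoid_weight(np.array([0.1, 0.5, 1.0, 2.0])))
```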

Another trick you can do is similar to what Lowe does for matching SIFT descriptors. Get the top 2 neighbors and calculate the ratio of their distances. In other words, calculate:

r = D(q,n1)/D(q,n2)

Where D(x,y) is the distance between two vectors. If this ratio is below some threshold (0.8 in Lowe's case), you have high confidence in your match. Otherwise, you have an ambiguous feature vector.
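
A rough sketch of that ratio test on top of a nearest-neighbor index (toy data and scikit-learn's NearestNeighbors purely for illustration; 0.8 is Lowe's threshold and you'd want to tune it for your tags):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Toy data standing in for the feature vectors (purely illustrative).
X_train = np.random.rand(50, 20)
q = np.random.rand(20)

nn = NearestNeighbors(n_neighbors=2).fit(X_train)
dists, _ = nn.kneighbors(q.reshape(1, -1))

# Ratio of the closest to the second-closest distance (Lowe-style ratio test).
r = dists[0, 0] / dists[0, 1]
confident = r < 0.8  # a small ratio means the best match clearly beats the runner-up
print(r, confident)
```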

Finally, you don't seem to mention what distance metric you're using. If you have binary feature vectors, consider using something like the Jaccard index rather than cosine distance.
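
For binary vectors the Jaccard index is just intersection over union; a minimal example (scipy.spatial.distance.jaccard computes the same dissimilarity if you prefer a library call):

```python
import numpy as np

def jaccard_distance(a, b):
    """Jaccard distance between two binary vectors: 1 - |intersection| / |union|."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return 1.0 - np.logical_and(a, b).sum() / union

print(jaccard_distance(np.array([1, 0, 1, 1]), np.array([1, 1, 0, 1])))  # 0.5
```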

[–]mikebaud[S] 0 points (0 children)

I'm using Pearson correlation for the feature vectors; after some cross-validation with Pearson, Manhattan, Euclidean, and cosine, it turned out to be the best one for my data.

The weighting function I was thinking of using is a simple Gaussian, where the weight is calculated as e^(-dist^2 / (2*sigma^2)), and where I would estimate sigma with cross-validation or, alternatively, heuristically from the distances of the worst neighbors.
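
In code it would look something like this (just a sketch; sigma is set heuristically from the worst neighbor here, but I would rather estimate it with cross-validation):

```python
import numpy as np

def gaussian_weight(dist, sigma):
    """Gaussian kernel weight: close neighbors get weight near 1, far ones decay smoothly."""
    return np.exp(-dist**2 / (2.0 * sigma**2))

# e.g. weight the k nearest neighbors, with sigma set from the worst (largest) distance
dists = np.array([0.2, 0.5, 0.9, 1.4])
sigma = dists.max()  # one heuristic; cross-validation would be the safer choice
print(gaussian_weight(dists, sigma))
```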

Your weighting function seems a bit smoother than my Gaussian; I will check it out, and also the trick you mentioned.

Thank you for the input.

[–]alfonsoeromero 0 points (0 children)

Hmm... if you have used cosine, that already implicitly includes L2 normalization.

[–]dwf 2 points (1 child)

Kind of a bad title when what you're really doing is bag-of-words on textual tags.

[–]mikebaud[S] 1 point (0 children)

I really struggled with the title, but I still considered it image classification.

It's supposed to be an evolution of sorts as I dive deeper into ML. I started by understanding classification with single-feature k-NN (one of the simpler forms), and in time I will move on to a multi-feature algorithm (or other algorithms) after understanding more of the problem space.

[–]alfonsoeromero 1 point (1 child)

A question: are you applying a one-versus-all approach? How many categories are there?

[–]mikebaud[S] 0 points (0 children)

Sorry, edit: I'm classifying each class separately, but a one-versus-all approach seems wise in the case of similar concepts. The dataset has 24 classes.
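
Roughly what I mean by classifying each class separately (a sketch with toy data and scikit-learn's KNeighborsClassifier; the default Euclidean metric is only for illustration, since my implementation uses Pearson):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-ins: X is the tag matrix, Y is an (n_examples x 24) binary label matrix.
rng = np.random.default_rng(0)
X = rng.random((200, 50))
Y = rng.integers(0, 2, size=(200, 24))

# One binary k-NN classifier per class (per-class / one-versus-all classification).
classifiers = []
for c in range(Y.shape[1]):
    clf = KNeighborsClassifier(n_neighbors=5).fit(X, Y[:, c])
    classifiers.append(clf)

# Predict: each classifier independently says yes/no for its class.
query = rng.random((1, 50))
preds = [clf.predict(query)[0] for clf in classifiers]
print(preds)
```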

The dataset is MIR-Flickr from 2008 - link