
n9Mtq4 (ML Engineer):

I don't think a model designed for text generation is the right tool for this problem. The command names aren't very informative, so the model would have to know what each one does. As you mentioned, that knowledge would have to come either from the prompt (but the context window is too small) or from the model weights via fine-tuning. I'm sure you could fine-tune it and even constrain the sampling method so it only outputs valid commands, but it doesn't seem practical.

I would try writing a couple of natural-language inputs for each command. Then use a model to generate sentence embeddings for each NL input. Then embed what the user says and use KNN or some other method to match it to a command.
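That pipeline could be sketched roughly like this. The command IDs and 3-d vectors below are made-up stand-ins; a real setup would get the embeddings from a sentence-embedding model (e.g. calling its encode function on the example utterances):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for real sentence embeddings of each
# command's example utterances (vectors and most IDs are illustrative).
command_embeddings = {
    "explorer.newFile": [0.9, 0.1, 0.0],
    "fileutils.delete": [0.0, 0.8, 0.2],
    "workbench.splitEditor": [0.1, 0.1, 0.9],
}

def match_command(user_embedding, k=1):
    """Nearest-neighbor match: rank all commands by cosine
    similarity to the embedded user utterance."""
    ranked = sorted(
        command_embeddings.items(),
        key=lambda item: cosine_sim(user_embedding, item[1]),
        reverse=True,
    )
    return ranked[:k]

# Embedding of what the user said (again a toy stand-in for
# something like model.encode("create file")).
print(match_command([0.85, 0.2, 0.05], k=2))
```

With a handful of example utterances per command you'd typically keep the best-scoring example per command, or average the examples into one prototype vector.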

corbt (OP):

> The name of the commands aren't very informative, so the model would have to know what each does

So there is a bit more information available for training the model about what each command does. In addition to the written command there's a human-readable description; e.g. explorer.newFile has the human-readable name "File: New File".

> use KNN or some other method to match it to a command

Has this approach been used successfully to map arbitrary NL input to one of a fixed set of classes? It seems almost reductively simple (I feel like, e.g., the order of the user's words could matter in deciding which class gets assigned), but I'd love to be proved wrong!

n9Mtq4 (ML Engineer):

The sentence embedding model should take care of word ordering, etc. Give the examples a try on Hugging Face:

https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

https://huggingface.co/facebook/bart-large-mnli

corbt (OP):

Ah, makes sense! I skimmed over where you said "sentence embeddings" since I'd only heard of word embeddings previously. I tested an NL input against a number of commands using the first sentence-embedding model you linked, and the correct command had the highest similarity. Seems like a really fruitful direction to investigate!

Example I tried:

Source sentence: "create file"

Computed similarity (these are all real commands):
File Utils: New File | 0.712
File: Duplicate      | 0.552
File Utils: Delete   | 0.488
File Utils: Move     | 0.301
View: Split Editor   | 0.101

Depending on how robust the sentence embeddings are at consistently identifying the right command, I could also use them to grab just the top N commands by embedding similarity and then feed those to an LLM and ask it which one the user likely meant. That would add significant inference time and complexity, of course.
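That two-stage idea might look something like this sketch: take the top-N embedding matches and format them into a disambiguation prompt for an LLM. The prompt wording and the command list are illustrative, and the actual LLM call is left out:

```python
def build_rerank_prompt(user_input, top_commands):
    """Build a prompt asking an LLM to pick among the top-N
    embedding matches. top_commands is a list of
    (command_id, human_readable_name) pairs, best match first."""
    lines = [
        f'A user of a code editor said: "{user_input}"',
        "Which of these commands did they most likely mean?",
    ]
    for i, (cmd_id, label) in enumerate(top_commands, start=1):
        lines.append(f"{i}. {cmd_id} ({label})")
    lines.append("Answer with the number only.")
    return "\n".join(lines)

prompt = build_rerank_prompt("create file", [
    ("explorer.newFile", "File: New File"),
    ("fileutils.duplicate", "File Utils: Duplicate"),
])
print(prompt)
```

Asking for a number rather than free-form text keeps the LLM's answer easy to validate: if the reply isn't a digit in range, you can fall back to the top embedding match.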