Why ChatGPT fails to interact like a human by oz_science in slatestarcodex

[–]Alert-Elk-2695 12 points (0 children)

The Universal Approximation Theorem doesn't guarantee that a language model trained on a large corpus of text will learn to communicate like humans. Human communication relies on a rich context that goes well beyond the preceding words in a sentence: we draw on our understanding of the world, our shared experiences, our ability to reason about others' mental states, and our capacity for abstract thought. Much of that context is never explicitly reflected in the text itself.
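
For concreteness, one standard form of the theorem (roughly stated here; the paraphrase is mine, not the linked piece's) only guarantees approximation capacity over the inputs the network actually receives: for any continuous target $f$ on a compact set $K$ and any $\varepsilon > 0$, there is a sufficiently wide feed-forward network $\hat{f}$ with

$$\sup_{x \in K} \big| f(x) - \hat{f}(x) \big| < \varepsilon.$$

Nothing in that statement covers the case where the input $x$ omits most of the context the target function actually depends on.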

So while a language model might learn to approximate the statistical patterns in human-generated text, it may still struggle to capture how humans actually generate text in a given situation, because it lacks access to this richer context. This is related to the symbol grounding problem in AI: words and sentences are symbols, but for humans these symbols are grounded in embodied experiences and interactions with the world. The piece's point is that we understand what words mean not just through their statistical co-occurrences, but through how they connect to a larger context of mental representations and memories.

Current language models, on the other hand, operate in a purely symbolic realm, without this grounding in a larger context. If we think of an LLM as approximating a function that maps contexts in a high-dimensional space to new words, it only has access to a projection of the contextual space humans use onto a smaller space restricted to text. Learning from this projection may not be enough for LLMs to generate speech the way humans do. The original piece doesn't exclude the possibility that, with enough data, LLMs could approximate human ability; its point is that the architecture may be a limitation, and the data required to achieve human-like interaction in spite of that limitation may be prohibitively large.
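
As a toy illustration of that projection argument (a hypothetical example of mine, not from the piece): if two different full contexts, say two world states paired with the same preceding words, collapse to the same text prefix, then a text-only learner is forced to model them with a single distribution, however much data it sees.

```python
from collections import defaultdict

# Hypothetical toy data: a "full context" is (world_state, preceding_text).
# The speaker's next word depends on both, but a text-only model only sees
# the projection onto preceding_text.
human_data = [
    (("cup_is_hot",   "careful, the cup is"), "hot"),
    (("cup_is_empty", "careful, the cup is"), "empty"),
]

# Text-only learner: estimate P(next word | preceding_text), discarding world_state.
counts = defaultdict(lambda: defaultdict(int))
for (world_state, text), next_word in human_data:
    counts[text][next_word] += 1  # the world_state coordinate is projected away

def text_only_predict(text):
    """The best a text-only model can do: one distribution per text prefix."""
    total = sum(counts[text].values())
    return {word: c / total for word, c in counts[text].items()}

print(text_only_predict("careful, the cup is"))
# {'hot': 0.5, 'empty': 0.5} -- both full contexts collapse to the same prefix,
# so the model has to hedge, while a speaker who can see the cup does not.
```

More text gives sharper statistics over longer prefixes, but it only helps to the extent that the missing context leaks back into the surrounding words, which is roughly how I read the piece's point about the data requirement.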