This person is right. But I think the methods we use to train them are what’s fundamentally wrong. Brute-force learning? Randomised datasets pushed past the coherence/comprehension threshold? And the rationale is that this is done for the sake of optimisation and in the name of efficiency? I can see that overfitting is a problem, but did anyone look hard enough at this problem? Or did someone just jump a fence at the time, and then everyone decided to follow along and roll with it because it “worked”, and it somehow became the gold standard that nobody can question at this point?
The researchers in the academic field of machine learning who came up with LLMs are certainly aware of their limitations and are exploring other possibilities, but unfortunately what happened in industry is that people noticed that one particular approach was good enough to look impressive and then everyone jumped on that bandwagon.