For all its remaining flaws, hallucinations, and limitations, what it is able to do is fundamentally amazing.
People have been hypothesizing about building virtual computer brains since the sci-fi of the '50s and '60s. There is a reason they are often called neural networks: even from the beginning, researchers wanted them to be analogous to neurons.
I teach an introductory machine learning class to senior-level mechanical engineers. We focus mostly on vision-based neural networks because they are a bit easier to understand conceptually. I have mostly been a researcher *with* ML rather than a researcher *of* ML.
However, a few years ago I decided to take a deep dive into the Transformer/attention mechanism that underpins almost all state-of-the-art large language models. I came away from that process even more amazed by what they can do.
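If you have never seen it written out, the core of the attention mechanism is surprisingly compact. Here is a minimal NumPy sketch of single-head scaled dot-product attention, with no masking and no learned projection matrices; the toy shapes at the bottom are made up purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (sequence_length, d_k) matrices of query, key, and value vectors.
    d_k = Q.shape[-1]
    # Compare every query against every key; dividing by sqrt(d_k) keeps the
    # dot products from growing with the vector dimension.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns each row of scores into a weighting over the value vectors.
    weights = softmax(scores, axis=-1)
    return weights @ V

# Toy example: 4 tokens, 8-dimensional vectors (hypothetical sizes).
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```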
More recently, I have been horsing around with running a local instance of Llama 3.3 70B, Meta's latest model with 70 billion trained parameters. While Meta doesn't share the training process or training datasets, it does make the architecture and weights free to use. For me it has performed slightly better than GPT-4o (I never liked o1, and o3 won't be available to the rank and file for a while).
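For anyone who wants to tinker with something similar, a local run can be sketched roughly like this with the Hugging Face transformers library. The model id, dtype, and generation settings here are illustrative assumptions, and the 70B weights are license-gated and need a lot of GPU memory (or quantization) to run.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model id; access requires accepting Meta's license on Hugging Face.
model_id = "meta-llama/Llama-3.3-70B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to save memory
    device_map="auto",           # spread layers across available GPUs
)

messages = [{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```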
Even knowing the math under the hood, the training process, and the millions (or billions) of dollars invested so that training can finish in finite time, I am still absolutely stunned by the newer models.
That you can pick a random Wikipedia page, ask a pertinent question about it, and have the LLM spit out a correct answer much of the time (with occasional hallucinations and mistakes) leaves me flabbergasted. I can ask about Shakespeare, Napoleon, Joseph Smith, differential geometry, Python code, the nutritional information of a recipe, and virtually anything else, and get a mostly correct answer.
Tl;dr - if you aren’t in awe of what modern LLMs can do, you should be.