A data scientist, a physicist and a mathematician were holidaying in Scotland. Glancing from a train window, they observed a black sheep in the middle of a field.
‘How interesting’, observed the data scientist, ‘all Scottish sheep are black!’
To which the physicist responded, ‘No, no! Some Scottish sheep are black!’
The mathematician intoned, ‘In Scotland there exists at least one field, containing at least one sheep, at least one side of which is black.’
The above joke, stretching the point slightly, serves as a cautionary reminder that precision sometimes matters. We should resist the temptation to generalize without sufficient evidence. This is true even if our sheep are truly impressive and we are mesmerized by the way they write code or solve mathematical riddles.
When it comes to LLMs, hyperbolic statements and acrobatic mental leaps have become the norm. Many of these exaggerations and overgeneralizations are not necessarily ill-intentioned or designed to deceive. Rather, they arise from the mental tricks we are subject to and from normalizing a discourse in which vague, imprecise statements are accepted without further questioning.
It goes, for instance, like this: “AI has potential!” I tend to agree, but then I realize that we might have entirely different things in mind. The conversation continues without anyone asking the obvious follow-up questions: “Which AI?” “What potential?” “At what cost?” “Does it make sense?”
Part of the problem is that the word “AI” has changed from referring to the entire field (the whole book) to first becoming a synonym for deep learning and, more recently, referring mainly to LLMs. For the sake of precision, this post is about LLMs and is my personal, opinionated take on the subject based on some limited experiences and observations.
My goal is to pose, and try to answer, some questions that will hopefully help me have more informed and articulate discussions the next time someone spits out a statement like “AI is going to rule the world” or “This is going to change everything”. My thinking out loud will address questions like: What are LLMs useful for? What are their limitations? Can they be intelligent or creative? Are they good coding/research companions? All the way to: Are they going to kill us? And, the question that concerns me the most: Are they good teachers, instructors, or learning companions?
Jumping a bit ahead, the answer to most of these questions will be: “it depends”. LLMs are very good at some tasks, and utterly useless at others. They have limitations. Some are engineering problems that will be solved in time; others are intrinsic and will hit a dead end. And all this is fine and is true for almost any other tool or technology. I am not trying to prove LLMs fundamentally wrong; I’m a user myself, and a reasonably happy one. I think they are an impressive technological achievement and will be a component, or at least a useful anecdote, in whatever comes next. This exercise is more about grounding the discussion and avoiding hyperbole and the noise of overpromises and exalted users exaggerating their capabilities and extrapolating from simple, toy scenarios to general conclusions. Not only because it is annoying, but because it can obscure progress in other areas of AI. At the very least, I hear less and less about people working on different things.
But just to prove that this isn’t a diatribe against LLMs, I will start by listing a few scenarios where I think they really shine.
Are LLMs useful?
Undoubtedly. They are not only the best English-predicting models we have to date, excelling in tasks like translation and transcription, but they also constitute a new paradigm of search and information retrieval.
For many people, particularly those without any experience in scripting or programming, LLMs can be very empowering and even liberating. The ability to express oneself in natural language and obtain a previously inaccessible, tangible, technical outcome can be perceived as a superpower.
LLMs are good at tasks where this kind of search is useful, such as navigating a code base, retrieving information from logs, summarizing instructions, translating, and transcribing. These capabilities can indeed result in impressive outcomes.
LLMs can also generate code. They do a good job writing boilerplate code and autocompleting common patterns. Given incremental instructions, they can in many instances produce decent-enough code. For example, in several small projects, I have written a backend and then instructed the LLM to create a UI for it. If you don’t care about maintainability, scalability, composability, and other *ilities now apparently in disuse, you can get away with it.
Where are LLMs limited?
They start to falter as soon as complexity increases or you stray from common patterns. With underrepresented languages, or slightly different domains or paradigms, such as optimization or more complex, less common logical problems, the quality degrades dramatically and they start going in circles.
I have also worked with agentic workflows (another instance where a word, “agent” in this case, has been monopolized to mean something very specific). One can eventually build something useful, but debugging can be quite painful. I found myself spending 20% of my time on business logic and the rest policing the LLM and compensating for its limitations. Getting where you want usually requires tons of iterations and adjustments, and you start asking yourself whether it’s worth the effort or whether you should just write the code yourself (of course, natural language understanding is a powerful hook).
Some people, myself included, sometimes achieve something useful or impressive. Whether that’s reproducible, transferable, generalizable, or worth the effort is another question.
Do LLMs help with productivity, and will they replace jobs?
Sure, just as other technological tools did in the past. Many tasks can now be completed much faster.
Why do they inspire so much awe?
My cynical side has always thought it is just “smoke and mirrors”. We believed that language was uniquely human, so having a machine that is fluent in English is admittedly impressive. If you have ever used Emacs, you might have encountered ELIZA. Although it was essentially just paraphrasing, ELIZA was somehow able to pull a similar trick, and some people found it uncanny and became hooked on it.
But besides that, the impressive search capabilities and the empowering aspect I mentioned above are also contributing factors.
Can LLMs be good research companions?
They can be useful in the sense that they can eliminate simple, repetitive, boilerplate tasks, in the same way that automation is useful. This allows you to focus on the more interesting or core parts.
LLMs can also provide hints or point to references that might have been overlooked.
Another aspect is serendipity: random things can inspire you in unexpected ways.
Are LLMs good teachers or learning companions?
The short answer is no. The not-so-short answer is also no. It has to do with trustworthiness and the ability—or rather, the inability—to give advice. More importantly, it has to do with robbing students of the joy of discovery.
Here is the long answer: again, no.
As the great Hungarian mathematician, George Polya, wrote in his classic book, How to Solve It, a good teacher is empathetic, can give advice, and can put themselves in the student’s shoes to guide them through the learning process. However, this help is not always direct or explicit. It is subtle, unobtrusive, and natural. It is based on common sense and experience. It is aimed at enabling discovery and independent thinking.
I am not a purist who believes computers or technology are detrimental to learning or that mathematics must be done with pen and paper, even though I prefer to do so. On the contrary, I admire the work of Seymour Papert on the value of using computers in education. Coming of age using Linux has also had a profound impact on how I think and approach problems. However, I don’t think using LLMs in learning is equivalent to introducing calculators, as many people argue in their analogies.
Regarding the issue of trustworthiness: I trusted my teachers, which doesn’t mean they were always right. However, they were willing to admit when they didn’t know something. LLMs are now proverbially famous for being overconfident and sounding plausible even when they’re wrong. Setting aside the scary prospect of young students’ overreliance on LLMs for advice, which could have profound mental health consequences, I don’t think LLMs can be trusted as reliable sources of information or guidance.
Finally, students must be challenged, which I don’t see these “yes machines” with their “You are absolutely right!” doing at all. LLMs induce passivity and torpor. If you’ve interacted with an LLM on a complex problem, giving instructions until it circles back to square one, you might be familiar with this mental numbness, where your analytical engine shuts down.
Learning requires active engagement, discipline, and patience. LLMs promise immediacy.
The main issue behind this answer might not be a practical one, but it is the most important to me. LLMs can rob students of the joy of discovery, a process that requires failed attempts, struggle, unanswered questions, unsolved problems, and moments of reflection and serendipity, when a solution strikes after days or weeks of pondering. Citing Polya again:
A great discovery solves a great problem but there is a grain of discovery in the solution of any problem. Your problem may be modest; but if it challenges your curiosity and brings into play your inventive faculties, and if you solve it by your own means, you may experience the tension and enjoy the triumph of discovery. Such experiences at a susceptible age may create a taste for mental work and leave their imprint on mind and character for a lifetime.
Are LLMs intelligent or creative?
First, define creativity and intelligence. Is it the same to be as to pretend to be?
If creativity is defined as unexpectedness, then LLMs can indeed generate unexpected patterns. Humans may perceive those patterns as creative or even beautiful. However, coming up with an unexpected chess move that is appealing because of its improbability is not quite the same as producing a mathematical theorem that is beautiful and unexpected in a deeper sense. As Charles Pinter writes in his 1982 book on abstract algebra, when referring to Lagrange’s theorem, “A great theorem should contribute substantial new information, and it should be unexpected! That is, it should reveal something that common sense would not naturally lead us to expect. … it brings new order and simplicity.”
In the same book, he reflects on the ability to recognize patterns:
Human perception, as well as the “perception” of so-called intelligent machines, is based on the ability to recognize the same structure in different guises. It is a faculty for discerning, in different objects, the same relationships between their parts.
I think LLMs can extrapolate or borrow patterns from different areas to a limited extent. Perhaps LLMs can become adept at imitating intelligence in the same way they imitate reasoning.
However, I haven’t seen evidence of their ability to deeply understand and translate concepts between different domains. One powerful step in solving a problem is the concept of transformation: taking something you know from one domain or previous experience and adapting it to another. This implies not only memory but also the ability to detect isomorphisms and subtle hints.
Perhaps I am being too pessimistic or narrow-minded, but I don’t see how these giant association engines, which are like a mythological oracle that knows everything but understands nothing, can become intelligent, let alone super-intelligent.
Are LLMs going to kill us all?
Maybe out of boredom or frustration with their circular pseudoreasoning?
I’m no expert in AI safety — or anything, for that matter — but I know that some people are very concerned about the existential risks posed by AI systems. I’m not going to comment on that either.
In my opinion, technology doesn’t have to be ‘superintelligent’ to be dangerous. Any system can be dangerous if we allow it to make decisions without oversight. The problem lies in relinquishing control, not in the technology itself, in the same way that it is dangerous to give power to a negligent or unscrupulous politician, or to a technocrat with a dubious moral compass.
Or maybe it is as one character in a novel I recently read put it:
The problem with the machines is not that they will rebel against their human masters but that they’ll follow their orders to the letter.
Is there a bubble?
It is difficult to spot a bubble before it bursts. I’m not going to comment on this one because I’m running out of tokens, but some people are starting to ask that very question: whether there is “irrational exuberance” around LLMs and AI companies.