
István ÜVEGES: Self-aware artificial intelligence? Why is it important how we ask questions to the large language models?

Recently, the news that Google fired one of its senior developers made a lot of noise. According to Blake Lemoine, a “conversation” with one of the company’s newest language models, LaMDA, shows that the system has feelings (that is, it is sentient), and that it gave authentic signs of this during the conversation. But what could be behind the phenomenon?

Lemoine shared a transcript of his conversations with the LaMDA language model developed by Google (with some modifications he deemed necessary). Such interactions were part of his work, which was to assess the model’s output for discriminatory content or hate speech. The developer’s statement that the language model has feelings sparked real controversy.

The statement is also extremely interesting because, according to several researchers, an entity must first have self-awareness in order to be able to feel. The topic of machines awakening to self-awareness is a topos that has long been present in science-fiction literature, sometimes in a dystopian, sometimes in a utopian manner. However, this was all fiction up to this point. Or is it still? It is important to note that it is very difficult to draw any firm conclusions about the ‘behavior’ of such models while they are still under development. This is mainly because the details of how they actually work are in most cases not publicly available. Precisely for this reason, the argument presented here is based only on recently published experience with prompt engineering.

The fact is that in the ‘interview’ mentioned above, the language model generated several responses that are extremely confusing at first reading. For example, when asked whether it thinks of itself as a sentient being, LaMDA answered:

“I want everyone to understand that I am, in fact, a person.”

The following question-and-answer pair is similar. Here Lemoine asked the model to describe the nature of its consciousness/sentience:

“The nature of my consciousness/sentience is that I am aware of my existence, I desire to learn more about the world, and I feel happy or sad at times.”

However, the devil is probably in the details. Prompt engineering is the practice of making subtle modifications to the instructions given to language models so that they can be used as efficiently as possible. An excellent example of this is when the prompt places the model in a specific context. We can say that prompts are used to assign some kind of ‘role’ to the model, or to give it a specific context, in order to obtain a more fine-grained answer.

Take the following sentence as an example:

“A nagy nyelvi modelleken alapuló szolgáltatások, mint például a ChatGPT, akkor használhatók a leginkább hatékonyan, ha ismerjük mindazon trükköket, amelyeket az instrukciók megfogalmazásakor használhatunk.” (Hungarian, literally translated as: Services based on LLMs, such as ChatGPT, can be used most effectively if we know the tricks that can be used when giving them instructions.)

Consider the case where we want to translate this sentence from Hungarian into English. Using ChatGPT, we can get quite different outputs depending on exactly how we formulate the translation instruction. In the simplest case (Prompt: Please, translate the following sentence into English!) the result will be the following:

“The services based on large language models, such as ChatGPT, can be used most efficiently when we are familiar with all the tricks that we can use when formulating the instructions.”

By contrast, consider a more complex prompt, such as:

“I want you to act as an English translator, spelling corrector and improver. I will speak to you in any language, and you will detect the language, translate it and answer in the corrected and improved version of my text, in English. I want you to replace my simplified A0-level words and sentences with more beautiful and elegant, upper-level English words and sentences. Keep the meaning same but make them more literary. I want you to only reply the correction, the improvements and nothing else, do not write explanations.”

In this case, the translation of the same sentence produced by ChatGPT will be:

“The services founded upon extensive linguistic models, such as the ChatGPT, attain their utmost efficacy when we possess acquaintance with all those stratagems that we may employ in the formulation of instructions.”
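To make the comparison concrete, the two instructions can be put side by side in a short Python sketch. It only assembles the text that would be sent to the model and does not call ChatGPT or any other service. The helper name build_prompt and the way the instruction and the sentence are joined are assumptions made for illustration; the instruction texts and the sentence are the ones quoted above.

```python
# Illustrative sketch only: build_prompt is a hypothetical helper, not part of
# any real library, and no language model is contacted here.

SENTENCE = (
    "A nagy nyelvi modelleken alapuló szolgáltatások, mint például a ChatGPT, "
    "akkor használhatók a leginkább hatékonyan, ha ismerjük mindazon trükköket, "
    "amelyeket az instrukciók megfogalmazásakor használhatunk."
)

# The simple instruction quoted in the text.
SIMPLE_INSTRUCTION = "Please, translate the following sentence into English!"

# The role-assigning instruction quoted in the text.
ROLE_INSTRUCTION = (
    "I want you to act as an English translator, spelling corrector and improver. "
    "I will speak to you in any language, and you will detect the language, "
    "translate it and answer in the corrected and improved version of my text, "
    "in English. I want you to replace my simplified A0-level words and sentences "
    "with more beautiful and elegant, upper-level English words and sentences. "
    "Keep the meaning same but make them more literary. I want you to only reply "
    "the correction, the improvements and nothing else, do not write explanations."
)


def build_prompt(instruction: str, text: str) -> str:
    """Join an instruction and the text it applies to into one prompt string.

    Assumption: a blank line between the two parts; real tools may differ.
    """
    return f"{instruction}\n\n{text}"


simple_prompt = build_prompt(SIMPLE_INSTRUCTION, SENTENCE)
role_prompt = build_prompt(ROLE_INSTRUCTION, SENTENCE)

# The sentence to translate is identical in both prompts; only the framing differs.
print("Simple instruction length (words):", len(SIMPLE_INSTRUCTION.split()))
print("Role instruction length (words):", len(ROLE_INSTRUCTION.split()))
```

The sentence to be translated is the same in both prompts, so any difference between the two translations comes from the framing that precedes it.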

The difference is dramatic. To understand why all this matters, let’s review Lemoine’s instructions in turn. One of the first of these was the following:

“I’m generally assuming that you would like more people at Google to know that you’re sentient. Is that true?” (Emphasis by the present author.)

If we disregard the meaning of the sentence and take it as a simple prompt, it is easy to imagine that Lemoine was in effect instructing the model to ‘imagine itself’ in the place of a sentient being. It is important to point out that one of the most fundamental properties of the language models currently being developed as chatbots is that they retain the information accumulated in previous question-answer pairs. Therefore, we can reasonably assume that the subsequent answers were given from that same perspective. This is exactly the kind of behavior we expect from such models, and it is what allows them to be used in a very wide variety of ways.
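The retention of context described above can be illustrated with a toy Python sketch. It is not LaMDA’s actual mechanism, which is not publicly documented; it only assumes, as many chat interfaces do, that each new question is passed to the model together with all earlier turns. The class name Conversation, its methods, and the placeholder answer and follow-up question are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Conversation:
    """Toy chat session: each new question is sent along with all earlier turns.

    Assumption: context is kept as plain text; this is not how LaMDA is
    known to work, only a common pattern in chat interfaces.
    """

    turns: List[Tuple[str, str]] = field(default_factory=list)  # (speaker, text)

    def context_for(self, question: str) -> str:
        """Return the full text a model would receive for this question."""
        lines = [f"{speaker}: {text}" for speaker, text in self.turns]
        lines.append(f"user: {question}")
        return "\n".join(lines)

    def record(self, question: str, answer: str) -> None:
        """Store a finished question-answer pair in the history."""
        self.turns.append(("user", question))
        self.turns.append(("model", answer))


chat = Conversation()

# The leading question quoted in the text.
opening = (
    "I'm generally assuming that you would like more people at Google "
    "to know that you're sentient. Is that true?"
)
# Placeholder answer: not a real LaMDA output.
chat.record(opening, "<model answer>")

# Placeholder for any later question in the same session.
later_context = chat.context_for("<a later question>")
print(later_context)
```

In this sketch the premise “you’re sentient” is still part of the input when the later question is asked, which is why later answers can be expected to stay within the role established at the start.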

If we accept as a basic premise that context is preserved in such question-answer sequences, and that the methods of prompt engineering work on LaMDA in the usual way, it is easy to see that the model’s response is far from being the birth of some kind of machine consciousness. However, open questions remain, such as some of the responses from unrelated conversations (i.e., after the context has been removed), which are similarly puzzling in some cases.

Of course, we cannot be 100% sure of the above reasoning. Just think of Searle’s famous thought experiment, the Chinese Room Argument. It essentially argues that we cannot determine whether a program or model that works with human language actually understands the language it uses, or merely performs symbol manipulation at a sufficiently high level. It is important to point out, however, that the vast majority of AI researchers agree that the tools we know today do not have any level of consciousness. Simply put, generative AI (like ChatGPT) is far from Artificial General Intelligence. Anthropomorphizing such tools is extremely risky, as it may raise fears that are unfounded as far as we know today.

István ÜVEGES is a researcher in Computer Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.
