István ÜVEGES: As an AI language model… The dark side of AI’s democratization
The democratization of AI will undoubtedly promote transparency and accountability of the technology. But what happens when open-source AI falls into unauthorized hands, or is misused? What is the greater risk, development monopolies concentrated in the hands of large companies or uncontrolled use?
With the rise of generative language models, artificial intelligence has infiltrated areas of our lives that would have been unimaginable even a few years ago. In social media, AI algorithms are responsible for making the user experience as comfortable as possible and maximizing the time spent on these platforms. In e-commerce, it’s a common practice for users to receive suggestions for additional products they may be interested in based on their search history, previous transactions and known preferences. Tools developed for smart homes learn the user’s habits, which can help them optimize energy use, for example, but also significantly improve comfort.
Generative Artificial Intelligence (GAI) has become so dominant that, for example, the Vice-Chancellors of the Russell Group universities in the UK have issued a joint statement on the subject. According to the statement, it is essential that their students and staff are equipped with basic knowledge of artificial intelligence. Without this knowledge, they will not be able to take advantage of the opportunities that technological developments will create in teaching and learning. The statement also highlights the importance of promoting AI literacy, sharing best practices, and providing dedicated training on the ethical use of AI, especially GAI.
In such an environment, it is particularly important that the tools that enter education, from which we expect credible and reliable information, provide truly unbiased and reliable data to those who use them.
As the democratization of AI takes off, it is reasonable to believe that the range of tools that can be used will expand. Since the launch of GPT-3 and the ChatGPT developed from it, hardly a month has gone by without a technology giant coming up with a new solution, a large language model (LLM) or its own architecture. A fair number of these are completely open-source. An excellent example is Llama 2, developed by Meta.
However, while making such tools publicly available will undoubtedly help to increase transparency and trust in technology, it also carries risks. Dame Wendy Hall, co-chair of the British Government’s AI review, once said that such moves are like ‘giving people a template to build a nuclear bomb’.
It should be remembered that currently known solutions, including LLMs, have several vulnerabilities. Some of these arise when such models are created: biased training data, an inadequately constructed algorithm, and data leakage from models trained on sensitive data are all factors of this kind. The other side of the coin is the use of existing models. Here, problems can be caused by over-reliance on their output, which can lead to the propagation of false information; by insufficient or vague objectives when implementing ethical principles (AI alignment); or by the poisoning of newer training data during fine-tuning.
Another example of the risks of language models is ChatGPT. Who is not familiar with the famous ‘As an AI language model…’ response that comes as a reply to inappropriate prompts? This is common when a user asks for something that would violate the company’s ethical principles, such as hate speech or sexist or racist content, or that would otherwise facilitate the transmission of any kind of inappropriate content. However, we must remember that language models do not have any ethical sense, moral compass, or other such aid in generating answers. Although ‘moderation’ of questions and answers seems to be part of the internal workings of ChatGPT, little official information is available regarding this.
To filter the model’s responses, two cases are possible. In the first case, the question formulated by the user does not reach the language model in the first place. In the second case, the answer already generated by the language model is ‘stopped’ by a moderation process if it does not comply with the ethical principles of the operator.
Both steps are most easily illustrated using the OpenAI Moderation API. This is a service that uses a separate machine-learned model to decide if a text contains elements indicating self-harm, sexuality, harassment, or other prohibited activity. In this case, the investigated text may be a question addressed by the user to the model or a response to it by the model. If the answer is yes, the text will be moderated to prevent the creation of unwanted content.
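The two filtering cases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how ChatGPT or the OpenAI Moderation API is actually implemented: `toy_classifier`, `moderated_reply`, `REFUSAL` and the placeholder blocked term are all hypothetical names, and the keyword check merely stands in for a separately trained moderation model.

```python
from typing import Callable

# Hypothetical refusal message returned whenever moderation intervenes.
REFUSAL = "This request cannot be processed."


def toy_classifier(text: str) -> bool:
    """Hypothetical stand-in for a moderation model.

    A real service would use a separately trained classifier; this toy
    version only checks for a placeholder blocked term.
    """
    blocked_terms = {"blockedword"}
    lowered = text.lower()
    return any(term in lowered for term in blocked_terms)


def moderated_reply(
    prompt: str,
    generate: Callable[[str], str],
    is_flagged: Callable[[str], bool] = toy_classifier,
) -> str:
    """Wrap a text generator with input-side and output-side moderation."""
    # Case 1: the user's question is flagged, so it never reaches the model.
    if is_flagged(prompt):
        return REFUSAL

    answer = generate(prompt)

    # Case 2: the generated answer is flagged and is stopped before delivery.
    if is_flagged(answer):
        return REFUSAL

    return answer
```

In this sketch the language model is passed in as the `generate` function, so the same wrapper could sit in front of any model. Someone who fine-tunes an openly available model and serves it without such a wrapper simply has neither check in place.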
One problem is that such models, like the ones behind the already mentioned Moderation API, never work at 100% efficiency. Given that even humans cannot identify similar content without error, this is perhaps the smaller problem. The bigger problem is that democratization may enable people who are unable or unwilling to take the necessary ethical principles into account to develop their own chatbots. One can easily imagine what would happen if a ChatGPT-level language model started flooding social media with hateful comments. All the malicious user would have to do is create a sufficient number of fake accounts to flood the profile of any company, public figure or even political party with comments of their choice.
There are also signs of this on a smaller scale, for example when the aforementioned ‘As an AI language model’ appears in the comments of several Twitter profiles. Presumably, some of these are already the misguided results of such automated generation. The text generation capabilities of today’s LLMs are now sufficient to convince the average user that the content was written by a real human. Given this, and the fact that more and more people are getting access to more and more powerful tools as open-source AI spreads, we should also expect to see an increasing incidence of misuse.
Moving away from language models, the spread of deepfakes is also worrying. A deepfake is video or audio content in which someone’s digital likeness is faked using artificial intelligence. Such content is created for explicitly manipulative purposes. Equally worrying is the phenomenon of hackers using generative AI to improve their offensive code, or using artificially generated voices to make phone calls with malicious intent. The latter could be, for instance, a case where a subordinate receives instructions to perform a certain action in the voice of their manager.
The democratization of AI is therefore a double-edged sword. Increasing transparency and holding accountable the large companies that currently own the technology is a necessary step. Without it, AI could easily become the preserve of a privileged few. This kind of inequality would erode trust in the technology in the long term and, if combined with an inadequate regulatory environment, could easily lead to abuses in areas such as the right to privacy. At the same time, full access to a technological solution necessarily means that it becomes easier to use for people who would not have had the resources to produce it themselves. Let us take the example of LLMs again. The pre-training of an LLM can cost millions of dollars. Most market actors or private individuals cannot afford this. However, fine-tuning an existing model, for example to run as a chatbot, requires a much smaller investment. If this fine-tuning does not follow the necessary ethical standards, whether through negligence or by deliberate choice, it is easy to see that the result could be a lot of chatbots ‘uncontrollably’ prowling the web.
It is likely that neither the vision of a fully monopolized AI nor the vision of hundreds of chatbots on the rampage will be fully realized. The question is, how will we be able to ensure a balance between transparency and security in the near future?
István ÜVEGES is a researcher in Computer Linguistics at MONTANA Knowledge Management Ltd. and a researcher at the Centre for Social Sciences, Political and Legal Text Mining and Artificial Intelligence Laboratory (poltextLAB). His main interests include practical applications of Automation, Artificial Intelligence (Machine Learning), Legal Language (legalese) studies and the Plain Language Movement.