Overtaken by reality: it happens often in the rapidly growing field of artificial intelligence. Things that seem impossible today can suddenly be reality tomorrow. It happened with this story, too. While I was talking to experts about whether and when a computer could independently write a scientific article, ChatGPT, OpenAI’s new text generator, appeared.
‘The computer is actually very good at generating bad hypotheses’
And of course, one can argue about whether the texts generated by ChatGPT can be called scientific, but in any case it is a serious step forward in the development of machine-generated text.
Like millions of other people around the world, I try out ChatGPT. The task I give it: write a scientific essay on the importance of DNA for career choice.
Within seconds, a piece of text appears that is understandable and coherent. It starts with an introductory section on the growing importance of DNA in various fields, including career choice. The text then discusses how this growing importance manifests itself, and the program comes up with a number of examples.
An important objection immediately surfaces: the program completely ignores the ethical caveats this topic demands. ‘DNA analysis has been used to identify the best candidates for certain positions, such as doctors or scientists. By analyzing potential candidates’ DNA, employers can determine which individuals have the best genetic makeup for the job’, ChatGPT cheerfully writes. It is a coherent text that sticks to the topic and is probably based on what is known online about DNA and career choice, but it doesn’t make the world any wiser or better. The ethical considerations are missing; the text collects known facts without morality, wisdom or insight.
No creative contribution
“AI is holding up a mirror to us,” says Haroon Sheikh, professor by special appointment of Strategic Governance of Global Technologies and senior researcher at the Scientific Council for Government Policy. “Because we are increasingly standardizing the work of scientists and making it measurable, it is easier to imitate on the computer. In this way, you can make something that looks like a standard scientific article, but it is not a real, substantive and creative contribution to science, although of course it is very difficult to define exactly what creativity is.”
“Creative thinking is fundamentally different from analyzing large amounts of data”
He makes the comparison with plant-based meat: “People also say it tastes almost the same as chicken nuggets, but they forget that those chicken nuggets have also been processed in such a way that they have little to do with real meat.”
The computer will not take over the actual work of science for the time being, Sheikh believes. “Science stands or falls with the formation of theories, and that requires creative thinking. It is fundamentally different from merely analyzing large amounts of data. We still don’t really know how people do it. Before computers can do that, a fundamental step has to be taken.”
Argument is pattern between data points
Other experts are more optimistic about what AI can do. Although there are still many practical limitations, there is in principle no objection to the idea that a computer could generate a hypothesis from data: “An argument is really just a pattern between different data points. It’s not that difficult,” says Lauren Waardenburg, until recently a VU researcher, now at the University of Lille, who specializes in how organizations handle computer systems.
“The computer is actually very good at generating hypotheses,” says Frank van Harmelen, VU professor of knowledge representation and reasoning, “only there are currently many bad hypotheses among them. For now, we still need people to separate the good hypotheses from the bad, but I don’t see why you couldn’t train computer systems to do that better in the future.”
‘There is not a single human gene that only appears under one name in the databases’
In fact, Van Harmelen is already working on such a program: together with social psychologists at VU, his group has developed a computer program that, based on a database of 2,500 experiments already carried out, comes up with proposals for new experiments on its own. “They’re by no means all usable yet, but we’re looking at how we can improve it,” he says.
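Waardenburg’s remark that an argument is “just a pattern between different data points” can be made concrete with a toy sketch. This is not the VU program; it is a minimal, invented illustration of mechanical hypothesis generation: scan a small dataset for strongly correlated variable pairs and phrase each one as a candidate hypothesis. The variable names and numbers are made up, and the output shows exactly why human filtering is still needed: correlation alone cannot tell a good hypothesis from a spurious one.

```python
# Toy hypothesis generator: propose "X is associated with Y" for every
# pair of variables whose correlation exceeds a threshold. All data and
# variable names are invented for this illustration.
from itertools import combinations
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def propose_hypotheses(data, threshold=0.8):
    """Turn every strongly correlated variable pair into a candidate hypothesis."""
    out = []
    for a, b in combinations(data, 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            out.append(f"Hypothesis: {a} is associated with {b} (r={r:.2f})")
    return out

data = {
    "hours_of_sleep": [6, 7, 8, 5, 9, 7],
    "reaction_time":  [310, 290, 260, 330, 250, 280],
    "shoe_size":      [41, 44, 39, 43, 40, 42],
}
for h in propose_hypotheses(data):
    print(h)
```

On this toy data only the sleep/reaction-time pair clears the threshold, but nothing in the code distinguishes a plausible mechanism from coincidence: with enough variables, such a generator cheerfully produces the “many bad hypotheses” Van Harmelen describes.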
Will we still need theories in the future, or will analysis of raw data be sufficient to predict developments in the world? In 2008, Chris Anderson published the article ‘The End of Theory’ in the technology magazine Wired, arguing precisely that scientific theories will become obsolete in the near future. After all, such theories are often flawed descriptions of reality that only hold under certain circumstances. Why not just look at the data and apply statistics? The whole world as a ‘single database’.
But it is precisely the assumption that such a single database exists that poses an important problem. A good database requires unambiguous data, and unfortunately people are much less consistent than computers when entering it. “It’s scary,” says Van Harmelen. “There isn’t a single human gene that appears under only one name in the big scientific databases such as PubMed, and vice versa, different genes sometimes share the same name.” What applies to gene names also applies to all sorts of other concepts, for which researchers in different countries often use different terms.
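The naming problem Van Harmelen describes is, in data-cleaning terms, an entity-resolution problem: before records from different sources can be merged, every reported name has to be mapped to one canonical identifier. A minimal sketch, with an alias table whose entries are illustrative rather than taken from any real database:

```python
# Toy entity resolution for gene names: map every reported spelling to
# one canonical symbol. The alias table is illustrative only, not drawn
# from PubMed or any curated nomenclature resource.
ALIASES = {
    "tp53": "TP53",
    "p53": "TP53",       # protein name commonly used for the gene
    "cdkn2a": "CDKN2A",
    "p16": "CDKN2A",     # alternative name for the same gene
}

def canonical(name):
    """Return the canonical symbol for a reported gene name, or None if unknown."""
    return ALIASES.get(name.strip().lower(), None)

records = ["TP53", "p53", "P16", "unknown-gene"]
resolved = [canonical(r) for r in records]
print(resolved)  # two spellings of TP53 collapse to one symbol; one name stays unresolved
```

The hard part in practice is not this lookup but building and maintaining the table itself, across every concept, discipline, and country: exactly the work that the ‘single database’ vision quietly assumes away.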
‘American scans are different from European ones. We don’t know why.’
You might think you don’t have that problem with image analysis. With scans of tumors, for example, you are less dependent on human capriciousness. But here, too, consistency turns out to be lacking. “Scans of the same tumors in the same body parts still show regional differences,” says Waardenburg. “American scans are different from European ones. We don’t know exactly why. They may, for example, be made at a different time of day or with a different brand of scanner.”
Waardenburg investigates how organizations deal in practice with the possibilities computers offer. She sees things go wrong every day because small details prevent the systems from being used optimally and keep data from being exchanged: “Using a computer system properly requires a lot of extra work, especially at the beginning,” she says. “In many organizations you see that this never really gets off the ground.”
In addition, there are all kinds of historical differences in the classification of data between organizations and disciplines. The same is true in science. Waardenburg: “Multidisciplinary data sharing is extremely difficult in practice because of these differences.” The result is that there is a world of difference between what is technically possible and what is possible in practice.
Hallucinatory AI systems
The most optimistic is Piek Vossen, professor of computational lexicology at VU. Vossen believes that in about ten years we could be well on the way to computer-generated scientific articles. “Last year I received an essay from a student in which the first four lines were computer-generated. If she hadn’t told me, I wouldn’t have known,” he says.
Vossen emphasizes that such a text has no conscious meaning: the computer completes text based on what it often sees in other texts. The cat walks over… The computer can easily complete such a sentence: the gutter, the window sill, the balcony. “But a computer has no idea what a cat is or what walking means,” Vossen explains. “So you should always have such an article checked by a person, but I certainly think the computer could help generate a first version of an article based on data and keywords.”
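Vossen’s point can be shown with a deliberately primitive sketch: completion by frequency. Modern systems like ChatGPT are vastly more sophisticated, but the principle is the same. The model continues a phrase with whatever tended to follow it in its training text, with no idea what a cat is. The tiny corpus below is invented for the example.

```python
# Minimal completion-by-frequency: count which word follows which in a
# tiny invented corpus, then continue a word with its most common follower.
from collections import Counter, defaultdict

corpus = (
    "the cat walks over the gutter . "
    "the cat walks over the window sill . "
    "the cat walks over the gutter . "
    "the cat sleeps on the balcony ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def complete(word):
    """Return the most frequent continuation observed after `word`."""
    return follows[word].most_common(1)[0][0]

print(complete("over"))  # "the": it followed "over" in every training sentence
print(complete("the"))   # "cat": the most common word after "the" in this corpus
```

The output is fluent-looking continuation without any understanding: the program has counted co-occurrences, nothing more. Scale the corpus up to the internet and the counts up to billions of learned parameters, and you get text that is much harder to tell apart from a student’s, for the same underlying reason.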
So far, there are still plenty of problems and shortcomings in auto-generated contributions. AI systems regularly ‘hallucinate’, for example: they make up information about subjects they know too little about. Computers also draw conclusions from existing data, which often contain biases: for example, that all scientists are male, white, American, and work at Stanford.
‘People see causation too quickly, think astrology’
A painful example of this type of bias was the algorithm Amazon used a few years ago to select the best applicants for programming jobs. The algorithm placed female candidates at the bottom of the rankings, not because they lacked the right qualifications, but because women were a small minority in the existing workforce. A similar example occurred at our own university, where a student of color was repeatedly not recognized as human by anti-cheating software.
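How historical imbalance turns into a biased ranking can be shown with a toy scorer. This is not Amazon’s actual system; it is an invented illustration of the mechanism: a model that scores applicants by how similar their CV features are to past hires will inevitably reproduce whatever imbalance those past hires contain.

```python
# Toy illustration of base-rate bias: score applicants by how often their
# CV features appeared among past hires. Data is invented; if past hires
# skewed one way, the scorer penalizes features rare in that history.
from collections import Counter

past_hires = ["chess club", "men's rowing team", "men's rowing team",
              "chess club", "chess club", "women's chess club"]
feature_counts = Counter(past_hires)
total = len(past_hires)

def score(cv_features):
    """Score = how frequently the applicant's features occurred among past hires."""
    return sum(feature_counts[f] / total for f in cv_features)

print(score(["chess club"]))          # high: common in the historical data
print(score(["women's chess club"]))  # low: rare in the historical data
```

Nothing in the code mentions qualifications at all; the lower score for the second applicant comes purely from the skew of the past, which is exactly the failure mode the Amazon case exposed.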
The computer draws conclusions from past situations, and those conclusions sometimes make no sense or are even immoral. But people aren’t always good at clear reasoning either, says Van Harmelen: “People see causal relationships too quickly; think of astrology or all kinds of other superstitions.” In his research into hybrid intelligence, he therefore tries to combine the strengths of humans and computers. “In the coming years we will try to build a scientific assistant,” he says. “It cannot replace the scientist, but it can be useful in every phase of the scientific process: gathering relevant literature, formulating a hypothesis, setting up and conducting experiments, collecting data, and finally perhaps writing a first draft of an article.”
So far, the AI presents us with an imitation, sometimes amazing and sometimes creepy
But how long will it take? The four experts all start talking about the self-driving car. “Ten years ago we thought it would be there in two years,” says Sheikh, “and now we still think it will be there in two years. A car like that works fine under controlled conditions, but if a duck suddenly crosses the road, it turns out to be more complex.”
Welcome to the real world, computer, where data is entered messily and ducks cross the road, where the data of the past is not necessarily predictive of, or desirable for, the future. And where people have been getting creative ideas for centuries without us knowing exactly where they come from. You can get a long way with logic and data, but for now what you present to us remains an imitation: sometimes amazing, sometimes terrifying.
The images in this article were generated using automatic image generators. Designer Rob Bömer tried about ten of them; most of the images come from NightCafe.