On a more serious note, there is this thing that bothers me and that proponents of A.I. seem to ignore. Whenever someone points out that A.I. art is kinda shit, they always argue that it is imperfect right now but will only get better from here on out, and I don’t believe it.
Yes, progress in the last few years is staggering, but it is a logical fallacy to assume that progress is constant and linear. Just take smartphones: yes, they technically get better every year, but there has not been real innovation in years. My first smartphone was my dad's old Galaxy S1 from 2010, and in terms of basic functions it is the same as my current one. The point is that A.I. might not keep getting better; it might just stagnate or even get worse.
A.I. advocates don’t seem to understand the basics of what they proselytize. What is commonly called A.I. is actually a family of large language models. They are not what we imagine when we say A.I. (Skynet, GLaDOS, HAL 9000), but a more advanced, more complicated version of the word suggestions that appear at the top of your smartphone keyboard while you type. They take in large amounts of data, analyse them for patterns, and spit out whatever they predict the end user wants based on the previous input. That kind of tool can be useful in limited circumstances, but blindly applying it to everything might not work because of the tool’s inherent limits.
Simply predicting the next word works well (usually) because words and letters are easy for a computer to handle, and the structure of language makes patterns easy to pick up. Things get more complicated when you take the same tool and ask it to do something harder.
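To make that concrete, here is a toy sketch of my own (not how any real model is built) of next-word prediction using nothing but bigram counts; actual language models are enormously more sophisticated, but the basic idea of "count patterns in text, then predict the likeliest continuation" is the same.

```python
# Toy next-word "prediction" from bigram counts -- a crude stand-in for
# what keyboard suggestions (and, at vastly greater scale, language models) do.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word tends to follow which.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def suggest(word):
    """Return the word most often seen after `word`, like a keyboard suggestion."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(suggest("the"))  # -> "cat": the most frequent follower in this tiny corpus
```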
Image generation is a lot harder because there are a lot more data points, and the human brain is much better at spotting when something is “off”. Take the example of hands: for a long time, image generators had difficulty reproducing hands, because they produce things that are close to the average without understanding what that average means. In the training data, human hands show, on average, slightly fewer than 5 fingers (thanks to occlusion, curled fingers, and the occasional injury), so the computer will spit out an image with something like 4.9 fingers on one hand, because it lacks any basic understanding of why the average is less than five. Take the same problem, multiply it by hands in every pose at every angle, and it becomes understandable why image generators struggled with hands for so long.
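As a back-of-the-envelope illustration (with numbers I made up entirely), here is how a dataset full of perfectly normal hands can still have a fractional "average" finger count. A model that only matches statistics has no reason to know that 4.88 fingers is not a thing.

```python
# Hypothetical counts of *visible* fingers per hand across training images:
# most photos show all 5, but some hands are partly occluded or curled.
visible_fingers = [5] * 90 + [4] * 8 + [3] * 2   # invented for illustration

average = sum(visible_fingers) / len(visible_fingers)
print(average)  # 4.88 -- a statistic that no single real hand actually has
```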
The way it was eventually fixed was with sheer quantity of training data. There are a lot of images on the internet that feature at least one human hand, so these models are now more competent at producing images of hands. The problem is that this does not fix the underlying issue: the tool still does not understand what it produces, so it can only reliably generate things for which it has sufficient training data.
There are a lot of things that are far more complicated than hands, with far less training data available, so I doubt generative models will ever reach the point where everything comes out right.
There is a common rule of thumb that the last 20% takes exponentially more work than the first 80%. A.I. companies have effectively used the entire internet, the collected sum of human knowledge and creation, to reach the 80% mark, which means they are running out of new training data to finish the last 20%. And that does not include the risk that the training data gets corrupted by shitty A.I. art and starts inbreeding with itself like the Habsburgs, or gets deliberately sabotaged by disgruntled artists and employees or by rival governments and companies.
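To put a rough number on the inbreeding worry, here is a deliberately crude toy of my own (made-up assumptions, not a description of any real training pipeline): each "generation" learns only the overall spread of the previous generation's output, and the most unusual samples get filtered out along the way because average-looking output dominates.

```python
# Toy sketch of "data inbreeding" (often called model collapse):
# each generation retrains on the previous generation's output, with the
# rarest 10% of samples dropped along the way.
import random, statistics

data = [random.gauss(0, 1) for _ in range(1000)]   # generation 0: "real" data
for generation in range(1, 6):
    data.sort()
    kept = data[50:-50]                            # the most unusual samples never make it back
    mean, spread = statistics.mean(kept), statistics.stdev(kept)
    data = [random.gauss(mean, spread) for _ in range(1000)]
    print(f"generation {generation}: spread ~ {spread:.2f}")
# The spread shrinks every generation: variety gets lost when models keep
# learning from filtered copies of other models' output instead of the original data.
```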
That is why I’m skeptical that “A.I.” will truly reach the potential its advocates insist it has.