
The ethics of Generative AI

How generative AI works, and some common misconceptions

Some people believe that generative AI models are simply mix-and-match Frankenstein machines that store copies of the images they were trained on; this is not the case.

Some LoRAs are trained on hundreds of images while being only 22 MB in size. If lossless compression at that ratio existed, datacenters would be very, very happy.

At its core, a model is just a bunch of numbers that perform some transformation on the input data. The "learning" process involves feeding the model input (e.g. text or images), which the model then uses to fine-tune its billions of numbers (weights and biases, aka parameters). Let's take a look at an example!
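To make the "bunch of numbers" idea a bit more concrete, here's a minimal sketch (plain Python with NumPy; the data and the two-parameter "model" are made up purely for illustration) of training as nothing more than nudging numbers:

```python
import numpy as np

# A "model": literally just two numbers (a weight and a bias).
w, b = 0.0, 0.0

# Toy training data: inputs x and the outputs y we'd like the model to produce.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])  # the hidden pattern is y = 2x + 1

learning_rate = 0.01
for step in range(2000):
    prediction = w * x + b   # the model "transforms" the input
    error = prediction - y   # how wrong it currently is
    # Nudge the two numbers a tiny bit in the direction that reduces the error.
    w -= learning_rate * (2 * error * x).mean()
    b -= learning_rate * (2 * error).mean()

print(w, b)  # ends up close to 2 and 1; no training example is stored anywhere
```

Scale that loop up to billions of parameters and you have roughly the picture the rest of this section relies on: what gets adjusted are the numbers, not a stash of saved examples.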


Assume we are given these 5 grid configurations to train on. If I asked you to create a new grid configuration based on those examples, you would likely draw me a grid that contains either 4 or 5 coloured blocks, and these blocks would likely be touching each other either horizontally or vertically.

A (good) AI model would come to the same conclusion. It would examine groups of blocks (e.g. 2x2 or 3x3 sub-grids) and adjust its weights such that a green block is extremely likely to be touching 1 or 2 other blocks, and extremely unlikely to be touching any diagonally.
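A real model encodes this in its weights rather than in explicit counts, but tallying neighbour relationships over a few made-up grids gives the flavour. The grids below are hypothetical stand-ins for the five in the image above; they just match the description (four or five connected blocks each):

```python
from collections import Counter

# Hypothetical stand-ins for the five training grids (1 = coloured block, 0 = empty).
grids = [
    [[0,0,0,0], [1,1,1,1], [0,0,0,0], [0,0,0,0]],  # 1: a horizontal line of 4
    [[0,0,0,0], [0,0,0,0], [1,1,1,1], [0,0,0,0]],  # 2: another horizontal line of 4
    [[1,0,0,0], [1,0,0,0], [1,1,1,0], [0,0,0,0]],  # 3: an L-shape of 5
    [[0,1,1,1], [0,1,0,0], [0,1,0,0], [0,0,0,0]],  # 4: another L-shape of 5
    [[0,0,1,0], [0,0,1,0], [0,0,1,0], [0,0,1,0]],  # 5: a vertical line of 4
]

def neighbour_counts(grid):
    """For every coloured block, count its straight and diagonal coloured neighbours."""
    stats = Counter()
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            for dr, dc in [(-1,0), (1,0), (0,-1), (0,1)]:      # up/down/left/right
                if 0 <= r+dr < rows and 0 <= c+dc < cols and grid[r+dr][c+dc]:
                    stats["straight"] += 1
            for dr, dc in [(-1,-1), (-1,1), (1,-1), (1,1)]:    # diagonals
                if 0 <= r+dr < rows and 0 <= c+dc < cols and grid[r+dr][c+dc]:
                    stats["diagonal"] += 1
    return stats

totals = Counter()
for g in grids:
    totals += neighbour_counts(g)
print(totals)  # straight contacts vastly outnumber diagonal ones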

Now I'm not gonna pretend that generative AI is actual intelligence; there is no "thinking" involved; this is purely a metaphor. But it does serve to illustrate another important point: overfitting.

What would the model have produced if we had only trained it on images 1, 2, and 5? It would learn that valid patterns are only ever 4, and exactly 4, blocks long, and always straight. And if we scrapped image 5 as well? Maybe we'd get horizontal lines only, but with such a limited training set, the model might instead assume these are the only two valid configurations!
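To see that collapse in code, run the same kind of bookkeeping over just those images. Again, these are hypothetical stand-ins that match the description above (images 1 and 2 as horizontal lines of 4, image 5 as a vertical line of 4):

```python
def blocks(grid):
    """Return the coloured cells, shifted so the pattern starts at (0, 0)."""
    cells = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v]
    r0 = min(r for r, _ in cells)
    c0 = min(c for _, c in cells)
    return frozenset((r - r0, c - c0) for r, c in cells)

image_1 = [[1, 1, 1, 1]]
image_2 = [[0, 0, 0, 0], [1, 1, 1, 1]]
image_5 = [[1], [1], [1], [1]]

patterns = {blocks(g) for g in [image_1, image_2, image_5]}
print(len(patterns))  # 2: a horizontal line of 4 and a vertical line of 4, nothing else
```

Drop image_5 as well and only the horizontal line survives; a model with no other evidence to draw on can only hand the training data straight back.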

This is why you sometimes hear about models spitting back text or images almost verbatim. It's not that the model secretly saved that content; it's that it was trained on it too narrowly. If you train a model on "mythological creatures", for example, and the only images of vampires you feed it are of Edward Cullen, well, you bet your sweet bippy that whenever you ask for a blood-sucking nightdweller he's gonna be sparklier than a disco ball.

Ok, but what about the artists whose work these models are trained on?

This is the part where it gets completely subjective, and where reasonable people can disagree.

Anyone who creates any kind of creative work is automatically granted a copyright on that work. And this is a good thing! It means someone else can't just take their work and either reproduce or publicly display it.

However, it does not protect the more abstract parts of a work: the style, the ideas, or the techniques used. And again, I would argue this is a good thing! We as humans constantly consume and absorb media, learn from it, and transform it to make something new out of it. In that sense, an AI model's use is very similar; it simply "learns" certain patterns (e.g. colours or body composition) from a collection of pieces, in aggregate. However, whether a company can train a model on public images and then make money off of it is still an open legal question, and only time will tell how courts around the world will rule on this. But that's the legal side of it; how about the ethical side?

On a personal level, as a creator who has art and (mostly) code out there in public, I am totally fine with AI and other people learning from and remixing my work. I think most people would find it rather silly if an artist posted their work but accompanied it with a message saying "nobody may learn from this, or study this material for their own gain." And as much as I understand that for some people there is a distinction between a human learning and an AI "learning", I personally don't see an ethical difference there.

One difference I do very clearly see is that there is a huge shift in the power dynamic between individuals learning from or remixing your work and profiting from it, and a multi-million-dollar VC-funded company doing the same. That will feel extremely skeezy to a lot of people, and I absolutely get that. But this leads away from individual ethics, and into my final point:

Fine, you can do it, and you can sleep at night... But what about the artists who now have sleepless nights due to loss of income?

This sucks. A lot. And I want to make it absolutely clear that I greatly empathize not just with artists losing commissions or their jobs, but also with those who have taken a hit to their motivation, or to their hopes of becoming a professional artist.

That said, automation of human labour has proliferated and will continue to do so at an increasing rate. John Deere is building farming equipment that can plant crops and apply pesticides not only autonomously, but more precisely and less wastefully than humans ever could. Amazon has been developing and utilizing robots to replace its workers, and fast food restaurants and other industries are eager to do the same.

And on one hand, this should be a good thing! If you're reading this, you are likely living in a place that is long past the days when everyone had to work 8-16 hours a day lest we all die from cold or hunger. In an ideal world, less need for human labour would result in more free time, which means more time to focus on personal health and development, friends and family, and creative endeavours. The reality, however, is that the value of all this replaced labour is being concentrated in the hands of a small handful of companies and individuals.

I'm not going to turn this into a whole anti-capitalist kind of musing. The reason I include this paragraph is that a lot of the anger at generative AI strongly reminds me of the anger people felt when cameras became a consumer product (putting a lot of professional photographers out of business), or when cars came to replace the horse and carriage (doing the same for carriage drivers). We cannot stop technological progress, nor should we want to. What we should do, however, is steer this progress and the policies surrounding it such that everyone can benefit from it.

Thanks for coming to my Ted Talk.