In the last article we thought about automation, flow states, and sexual climax.
Now let’s talk about AI-generated imagery, specifically to demonstrate that everything it promises is something we’ve already had the luxury of enjoying for quite a number of years…at least, when it comes to porn.
The Flushing Out of Augean Stable Diffusion
semantic: Adjective. Of or relating to semantics or the meanings of words. [from late 19th c.]
The “stable diffusion” approach to machine learning entails the reduction of massive image sets into a model of the recurring “latent” features which are semantically labelled in each image.
Play these models backward with novel textual prompts, and they will generate new images exhibiting the requested features. While all the training data—at least when it comes to photographic datasets—is probably realistic, the prompts themselves can easily request the generation of fantastic or surrealistic or impossible scenes.
For the textual prompts to evoke images, each image in the training set needs to be accompanied by a descriptive caption or label from which its unique features can be delineated. The images are what seem most salient, but each image requires a textual description to be useful.
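To make the dependence on captions concrete, here is a minimal sketch in Python. It is not how a diffusion model is trained or run; it only looks up existing images by the words in their captions, which is the hand-tagging half of the story in miniature. The file names, captions, and function names are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical hand-labelled "dataset": file names and captions are made up.
captioned_images = [
    ("img_001.jpg", "a red boat on a lake"),
    ("img_002.jpg", "a traffic light at night"),
    ("img_003.jpg", ""),  # an image nobody has described yet
    ("img_004.jpg", "a boat under a traffic light"),
]


def build_tag_index(pairs):
    """Map each caption word to the set of images whose caption contains it."""
    index = defaultdict(set)
    for filename, caption in pairs:
        for word in caption.lower().split():
            index[word].add(filename)
    return index


def images_matching(index, prompt):
    """Return the images whose captions contain every word of the prompt."""
    words = prompt.lower().split()
    if not words:
        return set()
    matches = [index.get(word, set()) for word in words]
    return set.intersection(*matches)


index = build_tag_index(captioned_images)
boat_images = sorted(images_matching(index, "boat"))
boat_and_traffic = sorted(images_matching(index, "boat traffic"))
reachable = set().union(*index.values())
```

In this toy, `img_003.jpg` never appears in `reachable`: without a caption, no textual prompt can find it. That is the sense in which the description, and not just the picture, does the work.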
With the past few decades of digitization of existing archives, and continued work on tagging images, enough images across adequately diverse sets have been manually labelled or described by humans for this job of “semantic tagging” to have itself become automated by AI.
You’ve helped train AIs every time you identified a boat, traffic light, or other blurry, indistinct object in a CAPTCHA to prove your humanity to a website.
So let’s break stable diffusion into its two constitutive halves. The first half is the semantic aspect of stable diffusion image generation, that is to say, the proper, meaningful labelling of images. The second half is the creation of the images themselves.
The first is about text describing images; the second is about creating images.
Here’s my proposition: each of these things being done by the AI constitutes a separate, already long-existing handicraft. They are skills which humans have long practiced by hand, in batches. What makes stable diffusion special is that it is an automation of these handicrafts, no different from all the automation preceding it, as I mentioned in the first piece. Recognizing this is essential: we are just making computers do what we’ve long done through manual work.
Not only that, the perfect User Inherface... uh, Interface for pornographic fantasy on tap (the merger of both halves into the whole) already exists as promised.
The conjunction of these two halves has long been operative, owing to the collective intelligence and cooperation of web communities dedicated to fantasy production and experience. In this case, porn is something of a subset, or a nighttime version, of fantasy. At large, fantasy just means the structured, controlled, feedback-regulated, story-like imaginary flow of sensation and dreams over time toward climax: narrative climax or, well, the more embodied sort. In other words, the multi-sensory involvement within narrative.
By the end of this series, I hope to demonstrate that we won’t have to try to guess what AI-generated porn will do. We can just examine what its functional equivalent has already done. The results are all around us.
In this part we’ll look at the history of the semantic half, and I’ll introduce the resulting merger of both halves in their pinnacle, the... fuck it. The User Inherface.
The next installment in this series will then consider the image-making half, because that’s where the effects of the whole can already be seen.