Edit: didn't finish typing before the window expired:
Your description of it isn't wrong, but it's very simplified, and it's a mistake to then assume real life is as simple as the description. There are layers and layers of further optimisation and tweaks applied to both the input text and the output image.
One example is the nice --tile function in Midjourney v3, which effectively wraps the image around on itself during generation so that opposite edges become neighbours. Since each pixel's probability distribution is affected by the colour of its neighbours, the result is an image that tiles coherently.
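A minimal sketch of that wrap-around trick (my assumption about the mechanism, not Midjourney's actual code) is circular padding, so each pixel's neighbourhood wraps across the image edges:
[code]
import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 64, 64)             # stand-in for an RGB image/latent being denoised
kernel = torch.ones(3, 1, 3, 3) / 9.0    # simple 3x3 averaging kernel, one per channel

# mode="circular" wraps the padding around, so the left edge "neighbours" the right edge
x_wrapped = F.pad(x, (1, 1, 1, 1), mode="circular")
blurred = F.conv2d(x_wrapped, kernel, groups=3)

print(blurred.shape)   # torch.Size([1, 3, 64, 64]) - edges now blend with the opposite side
[/code]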
Similar tricks are used for inpainting and outpainting. When I was having trouble generating a 'black punk rocker with dandelion hair' on an early version of Midjourney, it was because the dataset reflected the overwhelming number of photos of white American 2000s-era pop punk online. I took the image, masked out the white face and told it to inpaint a black face, which worked well enough. OpenAI's current dataset is built on a larger number of images with, I think, some adversarial training to try to reduce the biases of our society. I suspect inpainting and outpainting are being used under the hood by MJ to carve up images following traditional composition rules, to help produce pleasing public images. The latest version definitely has specialist sub-trained models for faces and hands, which humans are hardwired to notice mistakes in.
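The same masked-inpainting workflow exists in the open-source diffusers library, which is a reasonable sketch of the idea (not what Midjourney or OpenAI actually run; the image file names are placeholders):
[code]
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("punk_rocker.png").convert("RGB")   # placeholder file names
mask_image = Image.open("face_mask.png").convert("RGB")     # white = area to repaint

# only the masked region is regenerated; the rest of the image is kept
result = pipe(
    prompt="portrait of a Black punk rocker with dandelion hair",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("inpainted.png")
[/code]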
I've not even touched on the complexity of the natural language prompt processing, but I'll note there is an interaction there with your suggestion that "it can't draw what's not in the dataset". Given how huge the tagged image datasets are now, if you can describe it in English, chances are it exists in the dataset and has enough related concepts to pinpoint it in latent space. I suppose we should define that term.
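On the prompt side, open models like Stable Diffusion embed the text with something like CLIP, so a prompt lands near related image concepts in a shared embedding space (a sketch; I'm assuming the closed models do something broadly similar):
[code]
import torch
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["black punk rocker with dandelion hair",
           "white pop punk singer from 2000",
           "dandelion seed head macro photo"]
inputs = tokenizer(prompts, padding=True, return_tensors="pt")
with torch.no_grad():
    emb = model.get_text_features(**inputs)

emb = emb / emb.norm(dim=-1, keepdim=True)   # unit vectors
print(emb @ emb.T)                           # cosine similarities between the prompts
[/code]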
Latent space doesn't exist if you only train on a single picture, but if you train on two, then the latent space is all the possible images between the two, including ones where pixels are extrapolated from their neighbours. It's a huge, high-dimensional space. If we assume a 20x20 pixel image with only RGB channels, each pixel has an xy coordinate in the picture (2 dimensions) and an RGB coordinate in colour space (another three dimensions). For two images, the pixel at the same xy coordinate could be described as a vector from the first image's RGB point to the second's. The path you take through colour space to get from one to the other need not be a smooth straight line (and shouldn't be for RGB, as its coordinate space doesn't map to human colour-difference perception; the corrected CIELAB space is better for that). For a pixel's colour, you could talk about a simple probability density that is 50% one end image and 50% the other, or a more nuanced density that includes multiple places along that vector path (say, green between yellow and blue). But pixels are also diffused, so the space you're computing in includes all the other vectors of nearby pixels, and the intermediate iterations. So there are lots of pixel patterns it can reasonably draw that are not included in the original dataset (such as dandelion-leaf punk hair).
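A toy illustration of that path-through-colour-space point, interpolating two hypothetical 20x20 images along a straight line in RGB versus in CIELAB (this just shows the geometry, not how a diffusion model actually samples):
[code]
import numpy as np
from skimage import color

img_a = np.random.rand(20, 20, 3)   # two made-up 20x20 RGB images
img_b = np.random.rand(20, 20, 3)

def interpolate_rgb(a, b, t):
    # straight line in RGB coordinates
    return (1 - t) * a + t * b

def interpolate_lab(a, b, t):
    # straight line in CIELAB, then back to RGB - a different path through colour space
    lab = (1 - t) * color.rgb2lab(a) + t * color.rgb2lab(b)
    return np.clip(color.lab2rgb(lab), 0, 1)

halfway_rgb = interpolate_rgb(img_a, img_b, 0.5)
halfway_lab = interpolate_lab(img_a, img_b, 0.5)
print(np.abs(halfway_rgb - halfway_lab).max())  # the two paths give different midpoints
[/code]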
In terms of practical use, I know a lot of architects who are using it for rapid sketching and moodboard research. After all, the resulting image from a text prompt tells you a lot about society's associations with that prompt (as filtered by the dataset). I asked for 'half baroque church, half tank' once, and in the range of images generated I also got tank tops, oil tankers and industrial process tanks. It is a good way to inspire the creative process.
I know a few architects who went further and trained a local copy on images of brutalist buildings only, in order to explore the latent space between the images and see whether the human architect's pattern-spotting found new patterns in how, for example, a set of windows shifted into a set of columns. The talk is on YouTube:
https://www.youtube.com/watch?v=ZMlt-ht ... gQ94AaABAg They actually built one of the results for a client.
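Exploring "the latent space between the images" can be sketched with Stable Diffusion's open VAE: encode two images, interpolate the latents, decode the midpoints (the brutalist file names are placeholders, and this isn't necessarily the exact method in the talk):
[code]
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image
from torchvision import transforms

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
prep = transforms.Compose([transforms.Resize((512, 512)), transforms.ToTensor()])

def encode(path):
    img = prep(load_image(path)).unsqueeze(0) * 2 - 1      # scale pixels to [-1, 1]
    with torch.no_grad():
        return vae.encode(img).latent_dist.mean

lat_a = encode("brutalist_windows.jpg")    # placeholder file names
lat_b = encode("brutalist_columns.jpg")

for t in (0.25, 0.5, 0.75):
    mixed = (1 - t) * lat_a + t * lat_b                    # straight line between the two latents
    with torch.no_grad():
        decoded = vae.decode(mixed).sample                 # back to pixel space, values in [-1, 1]
    print(t, decoded.shape)
[/code]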
Personally, for sketching I've found stylised images helpful, as I interpret them into useful concepts for my work:
https://bakefoldprint.wordpress.com/202 ... d-dall-e2/
Another architect I know has married Stable Diffusion to a second ML-trained program that turns flat 2D images into depth maps to get 3D effects:
https://www.youtube.com/watch?v=Q_3QVsETjLc&t=1s
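The depth-map step on its own can be reproduced with an off-the-shelf monocular depth model (a sketch using a public DPT model via transformers, not necessarily the exact program in the video):
[code]
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")

image = Image.open("render.png")          # placeholder: any flat 2D image, e.g. an SD output
result = depth_estimator(image)

# result["depth"] is a PIL image of per-pixel depth, usable as a displacement/depth map
result["depth"].save("render_depth.png")
[/code]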
A third architect has been using Grasshopper/Rhino to generate rough building shapes and key views, exporting those as images and using overpainting to create a detailed image. This is analogous to various animation techniques, and helps provide something consistent. I am interested in combining the two effects to be able to reimport the image and generate a more detailed 3D mesh.
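That rough-geometry-in, detailed-image-out step maps fairly directly onto img2img in open tooling; a sketch with diffusers (the model id, exported view file name and strength value are just example assumptions):
[code]
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

rough = load_image("rhino_massing_view.png")   # placeholder: exported key view of the massing model

detailed = pipe(
    prompt="weathered brick facade, overcast light, detailed architectural photograph",
    image=rough,
    strength=0.6,          # how far the model may drift from the rough geometry
    guidance_scale=7.5,
).images[0]
detailed.save("detailed_view.png")
[/code]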
Beyond that, and more into the technical weeds, the latent space representation of an image has been found to be an incredibly effective lossy image compression approach -
https://towardsai.net/p/l/stable-diffus ... mpresssion
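The size arithmetic is easy to sanity-check: Stable Diffusion's VAE turns a 512x512 RGB image into a 4x64x64 latent, so storing the latent (plus the shared decoder) acts as a lossy codec (rough numbers only; actual quality depends on the image):
[code]
# Rough size arithmetic for the "latents as lossy compression" idea (Stable Diffusion v1 VAE)
pixels = 512 * 512 * 3          # RGB values in the original image
latent = 4 * 64 * 64            # values in the latent (4 channels, 8x downsampled)

print(pixels, latent, pixels / latent)                       # 786432 16384 48.0 -> ~48x fewer values
# Compare bytes: uint8 pixels vs float16 latents
print(pixels * 1, latent * 2, (pixels * 1) / (latent * 2))   # ~24x smaller, before any entropy coding
[/code]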
Likewise, what I've learned about embedded vectors and the cosine similarity between them in latent space applies to all sorts of ML techniques. I'm currently classifying historic bricks based on a few dozen measures (dimensions), using the same techniques to measure the bricks' relative difference/distance from each other, and therefore grouping them by probable commonalities.
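The same distance machinery works on that kind of tabular data; a sketch of the approach (the measurements below are made-up illustrative numbers, not the real dataset):
[code]
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import AgglomerativeClustering

# Hypothetical brick measurements: length, width, height, mass (a few dozen columns in practice)
bricks = np.array([
    [228.0, 108.0, 68.0, 3.1],
    [230.0, 110.0, 70.0, 3.2],
    [215.0, 102.5, 65.0, 2.9],
    [250.0, 120.0, 75.0, 3.8],
])

# Standardise so no single dimension dominates, then compare directions with cosine similarity
X = StandardScaler().fit_transform(bricks)
sim = cosine_similarity(X)
print(np.round(sim, 2))

# Group bricks by relative distance (1 - similarity) into probable common types
labels = AgglomerativeClustering(n_clusters=2, metric="cosine", linkage="average").fit_predict(X)
print(labels)
[/code]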