mixed_biscuits

_________________________
Microsoft have announced NUWA-Infinity

NUWA-Infinity is a multimodal generative model that is designed to generate high-quality images and videos from given text, image or video input

according to the abstract:

Compared to DALL-E, Imagen and Parti, NUWA-Infinity can generate high-resolution images with arbitrary sizes and support long-duration video generation additionally

there don't seem to be any links to a demo, or any indication of whether it will ever be available to the general public
Video!
 

william_kent

Well-known member
an interesting Google Colab notebook - CLIP interrogator

give it the path of an image and it will try to create a prompt based on artists, keywords and what the "AI sees in the image"

I tried it out on this image:

1660114697641.png

Gouffres amers - Ithell Colquhoun

and the "CLIP interrogator" came up with this prompt:

a painting of a woman laying on the ground, a surrealist painting by Francis Bacon, deviantart, pop surrealism, surrealist, henry moore, pre-raphaelite

which I then fed into MidJourney which produced this:

1660115493957.png

another attempt using this image:

1660115666037.png

The Judgement of Paris - Ithell Colquhoun

generated this prompt:

a painting of a group of naked people, an art deco painting by Marguerite Zorach, cg society, socialist realism, picasso, surrealist, fresco

which predictably produced this response when fed into Midjourney:

1660115842739.png


time to try with a censorship free AI....
 

mixed_biscuits

_________________________
Remind me not to have my portrait done by Ithell Colquhoun
 

william_kent

Well-known member
I had a closer look at the "CLIP interrogator" and it could prove useful for "learning" the language of the various AI models

it first generates a table of what the various models "see" in the image and then builds a prompt from that, but you could easily pick and mix from the table and build your own prompt..

the first Ithell Colquhoun image from my previous post breaks down like this:

index | Model | Medium | Artist | Trending | Movement | Flavors
0 | ViT-B/32 | a surrealist painting (24.2%) | by Francis Bacon (14.7%) | featured on deviantart (13.1%) | surrealism (13.4%) | grotesque (17.2%), surrealist (13.1%), ilya kuvshinov (9.8%)
1 | ViT-B/16 | a surrealist painting (95.5%) | by Salvador Dalí (9.8%) | deviantart (14.9%) | pop surrealism (47.3%) | surrealist (40.3%), henry moore (14.0%), pre-raphaelite (11.1%)
2 | ViT-L/14 | a surrealist painting (58.5%) | by Dorothea Tanning (35.3%) | pixiv (36.0%) | nuclear art (16.3%) | surrealist (29.4%), henry moore (10.6%), academic art (9.6%)
3 | RN50 | an art deco painting (72.7%) | by Stanisław Witkiewicz (7.4%) | deviantart (36.3%) | qajar art (19.8%) | pre-raphaelite (25.5%), egyptian art (14.4%), rococo (7.2%)
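
for anyone who wants to try the pick and mix idea, here is a rough Python sketch. It is not how the notebook itself builds its prompt, just one possible rule: pool the guesses from all four models and keep the highest-scoring label in each column. The scores are copied from the table above, the caption is the opening phrase of the prompt from my earlier post, and the names `TABLE`, `best_per_column` and `build_prompt` are made up for illustration. Percentages from different models are probably not strictly comparable, so treat the ranking as a heuristic.

```python
from collections import defaultdict

# Scores transcribed from the table above (percentages reported per model).
TABLE = {
    "ViT-B/32": {
        "medium": [("a surrealist painting", 24.2)],
        "artist": [("by Francis Bacon", 14.7)],
        "trending": [("featured on deviantart", 13.1)],
        "movement": [("surrealism", 13.4)],
        "flavors": [("grotesque", 17.2), ("surrealist", 13.1), ("ilya kuvshinov", 9.8)],
    },
    "ViT-B/16": {
        "medium": [("a surrealist painting", 95.5)],
        "artist": [("by Salvador Dalí", 9.8)],
        "trending": [("deviantart", 14.9)],
        "movement": [("pop surrealism", 47.3)],
        "flavors": [("surrealist", 40.3), ("henry moore", 14.0), ("pre-raphaelite", 11.1)],
    },
    "ViT-L/14": {
        "medium": [("a surrealist painting", 58.5)],
        "artist": [("by Dorothea Tanning", 35.3)],
        "trending": [("pixiv", 36.0)],
        "movement": [("nuclear art", 16.3)],
        "flavors": [("surrealist", 29.4), ("henry moore", 10.6), ("academic art", 9.6)],
    },
    "RN50": {
        "medium": [("an art deco painting", 72.7)],
        "artist": [("by Stanisław Witkiewicz", 7.4)],
        "trending": [("deviantart", 36.3)],
        "movement": [("qajar art", 19.8)],
        "flavors": [("pre-raphaelite", 25.5), ("egyptian art", 14.4), ("rococo", 7.2)],
    },
}


def best_per_column(table, flavor_count=3):
    """Pool every model's guesses and keep the top label(s) in each column."""
    pooled = defaultdict(dict)
    for columns in table.values():
        for column, guesses in columns.items():
            for label, score in guesses:
                # If several models propose the same label, keep its best score.
                pooled[column][label] = max(score, pooled[column].get(label, 0.0))
    picks = {}
    for column, scores in pooled.items():
        ranked = sorted(scores, key=scores.get, reverse=True)
        picks[column] = ranked[:flavor_count] if column == "flavors" else ranked[:1]
    return picks


def build_prompt(caption, picks):
    """Join the caption and the chosen labels into one comma-separated prompt."""
    medium_and_artist = " ".join(picks["medium"] + picks["artist"])
    tail = picks["trending"] + picks["movement"] + picks["flavors"]
    return ", ".join([caption, medium_and_artist] + tail)


caption = "a painting of a woman laying on the ground"
print(build_prompt(caption, best_per_column(TABLE)))
```

swapping the selection rule (say, only trusting ViT-L/14, or hand-picking the flavors) just means editing `best_per_column`, which is the whole point of picking from the table yourself.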
 

wektor

Well-known member
William, you're really up to date on this shit, wasn't this notebook published just today?
 

william_kent

Well-known member
DALLE-2 prompt engineering guide

a "fan-made" document exploring the art of DALLE-2 prompt engineering - has useful lists of terms that users have discovered which can improve the quality of images produced - for instance, "spray-painted onto a wall" is more effective than "graffiti" for depictions of graffiti....

I found this through this article which documents one woman's voyage of discovery in how to generate an image of a Llama playing basketball by modifying the prompts to tell the AI exactly what she wants to see..
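
the graffiti tip is the sort of thing you could automate with a small lookup table. A minimal sketch in Python, assuming you keep your own list of swaps: only the graffiti entry comes from the guide as described above, and `REPLACEMENTS` and `strengthen` are hypothetical names, not part of any DALLE-2 tooling.

```python
import re

# Hypothetical lookup table; only the graffiti entry comes from the post above.
REPLACEMENTS = {
    "graffiti": "spray-painted onto a wall",
}


def strengthen(prompt, replacements=REPLACEMENTS):
    """Swap each weak term for its more descriptive phrasing (whole words only)."""
    for weak, strong in replacements.items():
        pattern = re.compile(rf"\b{re.escape(weak)}\b", flags=re.IGNORECASE)
        # A function replacement avoids any backslash handling in `strong`.
        prompt = pattern.sub(lambda _match, s=strong: s, prompt)
    return prompt


print(strengthen("a llama playing basketball, graffiti"))
# prints: a llama playing basketball, spray-painted onto a wall
```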
 

william_kent

Well-known member
another thing I learnt from that article is that people are actually charging $1.99 to create a prompt for those too lazy to come up with their own

I have seen discussion elsewhere that in the future "prompt engineer" could become a viable profession, which must really piss off those guys who can actually build bridges and shit - it was bad enough when computer programmers started referring to themselves as "software engineers", but now the artistic types are in on the act...
 

william_kent

Well-known member
anyway, I should start studying that guide as my DALLE-2 invite arrived today and my first attempts have been less than stellar
 