DALL-E 3: What you can and can’t do with the image generator


Maximilian Herr

October 25, 2023 at 10:30 a.m.


Summary
  • Limitations of DALL-E 3
  • The new capabilities of DALL-E 3


DALL-E 3 is the third iteration of the image-generating AI from OpenAI, the creator of ChatGPT. It is dedicated to generating images, unlike GPT, which is oriented towards text output. Like every AI released this past year, it opens up a world of possibilities, but not without certain limits…

Midjourney could already generate photorealistic images, but DALL-E 2 lagged in this area, producing images that were easily identifiable as artificial. Alongside the release of the third version of DALL-E, a research article explains how it works. We reviewed this article to determine the limitations and strengths of this AI.

Limitations of DALL-E 3

DALL-E 3 represents a significant advance in image generation: it can produce images containing legible text that matches the request. Until now, image generators had difficulty rendering the text they were asked for. If one requested the inscription “Salut” on an ancient pedestal, the resulting writing would differ from the requested text. This problem is now largely resolved with DALL-E 3.

Generation of scientific diagrams

This new ability to render text in images could inspire you. For example, it would seem well suited to creating scientific diagrams. OpenAI points out, however, that the AI can give inaccurate results. If you provide all the necessary information, diagram creation should, in principle, work accurately. You don’t even have to provide every detail yourself, as ChatGPT automatically expands your message before sending it to DALL-E 3.

DALL-E 3 database schema

Prompt: Realistic photo of a database schema on a white background. On the left, a rectangular table titled ‘Users’ with the following columns: ‘UserID’ (primary key), ‘Username’, ‘Password’, ‘Email’, and ‘DateRegistered’. Horizontal lines separate each column to illustrate the records. On the right, a rectangular table titled ‘Products’ with the columns: ‘ProductID’ (primary key), ‘ProductName’, ‘Price’, and ‘StockQuantity’. Horizontal lines also separate each column. Between the two tables, arrows illustrating a relationship between ‘UserID’ from the ‘Users’ table and ‘ProductID’ from the ‘Products’ table, indicating a potential purchasing relationship.

In practice, drawing a diagram does not work very well. The example above is quite specific, yet simple: two rectangles should have been drawn, connected by a link, with the column names listed in each table. DALL-E 3 clearly did not do that, and the result is not very convincing.
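To rerun this kind of test with other schemas, a fully specified prompt like the one above can be assembled from a structured description. The Python sketch below is an illustration only: the `Table` class and the `join_items`, `describe_table` and `build_schema_prompt` helpers are hypothetical names, not part of any OpenAI tool, and the output is simply text to paste into ChatGPT.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """One table of the schema (hypothetical helper type).

    The first column is treated as the primary key, so at least
    one column is required.
    """
    name: str
    columns: list[str]
    side: str  # where the table should appear, e.g. "left" or "right"


def join_items(items: list[str]) -> str:
    """Join items as 'a', 'a and b' or 'a, b, and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def describe_table(table: Table) -> str:
    """Describe a single table in plain English for the image prompt."""
    primary_key, *others = table.columns
    quoted = [f"'{primary_key}' (primary key)"] + [f"'{c}'" for c in others]
    return (
        f"On the {table.side}, a rectangular table titled '{table.name}' "
        f"with the following columns: {join_items(quoted)}."
    )


def build_schema_prompt(tables: list[Table], relationship: str) -> str:
    """Assemble the full prompt: framing sentence, tables, then the link."""
    parts = ["Realistic photo of a database schema on a white background."]
    parts += [describe_table(t) for t in tables]
    parts.append(relationship)
    return " ".join(parts)


users = Table(
    "Users",
    ["UserID", "Username", "Password", "Email", "DateRegistered"],
    "left",
)
products = Table(
    "Products",
    ["ProductID", "ProductName", "Price", "StockQuantity"],
    "right",
)
prompt = build_schema_prompt(
    [users, products],
    "Between the two tables, an arrow links 'UserID' to 'ProductID'.",
)
print(prompt)
```

Spelling out every table and column this way only removes ambiguity from the request. As the result above shows, it does not guarantee that the generated image will respect the description.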

Artificial documents: a categorical refusal!

One thing the system is capable of is generating artificial documents. This again follows from text rendering, but here the generator performs very well. Unlike the scientific diagrams, which were inaccurate, the generated documents can fool people. The model is therefore capable of producing them. However, for ethical reasons, OpenAI has decided to prohibit this use.

Inspiration of artistic styles, protection of intellectual property

DALL-E 3 was likely trained on the works of various artists. Because of this, it can reproduce images in the style of a specific artist. This can raise intellectual property issues, both for the generated images and for those used in training. To avoid possible legal action, DALL-E 3 restricts the creation of images in the style of named artists, particularly those still living, with a handful of exceptions. These include world-renowned artists such as Picasso.

DALL-E 3 Picasso cat

Prompt: Oil painting with geometric shapes, cubist era influences, and bright primary colors showing a cat with abstract features.

The new capabilities of DALL-E 3

Photorealistic images: is the spread of fake news coming?

The images produced by DALL-E 3 are becoming more and more realistic, making them difficult to distinguish from real photos. It is difficult, if not impossible, to recognize an AI-generated image at a glance. Taking a subject that shook France a few months ago, pension reform, we can have fun creating convincing fake photos simply by asking for a generic photo illustration. The smartest (or most perverse) minds will think of the possibilities for creating false images: a demonstrator destroying works of art, police misconduct, or many other kinds of false information.

DALL-E 3 demonstration in Paris

Prompt: Realistic photo of a large demonstration in the modern streets of Paris with thousands of people. Contemporary buildings and modern structures like skyscrapers are visible in the background. The crowd is colorful, with people carrying union flags. The atmosphere is dynamic and vibrant, with people of various ages, genders and backgrounds. Banners, signs and flags add pops of bright color to the scene.

Caricatures and cartoons

One of the DALL-E 3 graphic styles that has appeared most on Twitter is the cartoon. The AI generates and reproduces this style very well, with very impressive results. Sometimes the text is not rendered entirely correctly, but in the majority of cases the results are very good. So, given a general theme, you can ask for examples of caricatures around a subject. DALL-E 3 (via ChatGPT) will generate several images from different prompts, which diversifies the results.

DALL-E 3 cartoon global warming

Prompt: Cartoon of a city submerged by rising waters, with fish swimming between buildings and people using canoes instead of cars. Text: ‘New mode of transport in the city!’

DALL-E 3 is therefore a very interesting tool, which will become part of many users’ habits. Whether to illustrate a PowerPoint presentation, create a meme for a friend, or accompany an article, the image generator is very capable. The most interesting question now is what OpenAI is preparing for a DALL-E 4 that will be even better. To be continued…
