ⓘ​  This page has been translated using artificial intelligence.

Generative image AIs and image generation models

A photorealistic image of a cow surfing in the sea and cheering: for a long time, this was impossible. Today, such images are part of everyday life. This is thanks to generative image AI and diffusion models, which build an image step by step from a prompt. On this page, we look at well-known models, discuss useful use cases and explain how to distinguish AI-generated images from real ones.

How do image AIs work?

Two methods have dominated AI image generation: generative adversarial networks (GANs) and diffusion models. But what do these terms actually mean?

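Generative adversarial networks (GANs) were the leading AI image generation technology for several years. In a GAN, an image generator trained using deep learning produces an image in a single step. During training, it competes against a second network that tries to tell generated images from real ones, hence the name ‘adversarial’.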

GANs were first proposed in 2014. This image generator, launched in 2019, later made the method widely known:
https://thispersondoesnotexist.com/

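A known weakness of GANs is that, even with different random starting points, the generator can produce the same or nearly the same image again and again. This happens because training can reward it for repeating a few outputs that work well (an effect known as mode collapse).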

Diffusion models take a different approach from GANs. In 2021, researchers at OpenAI showed in their paper ‘Diffusion Models Beat GANs on Image Synthesis’ that diffusion models can produce better images than GANs.

The key difference is that diffusion models work iteratively: they start from random noise and remove it step by step, guided by the prompt. This helps to avoid duplicates and enables greater detail. Diffusion technology is now established in most common image generation tools.
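For the technically curious, the step-by-step idea can be sketched in a few lines of Python. This is a toy illustration only, not a real diffusion model: the function name toy_denoise, the step size of 0.5 and the tiny three-value ‘image’ are assumptions made for this example. Where a real model uses a trained neural network to predict the noise from the prompt, the sketch simply compares against a known target.

```python
import random


def toy_denoise(target, steps=10, seed=0):
    """Toy illustration of iterative refinement: start from noise, remove a bit per step."""
    rng = random.Random(seed)
    # Start from pure random noise, as a diffusion sampler does.
    image = [rng.gauss(0.0, 1.0) for _ in target]
    errors = []
    for _ in range(steps):
        # ASSUMPTION: a real model predicts the noise with a trained network
        # guided by the prompt. Here we cheat and use the known target instead.
        predicted_noise = [x - t for x, t in zip(image, target)]
        # Remove only a fraction (here: half) of the predicted noise per step.
        image = [x - 0.5 * n for x, n in zip(image, predicted_noise)]
        # Track how far the current image still is from the target.
        errors.append(max(abs(x - t) for x, t in zip(image, target)))
    return image, errors


# A three-value stand-in for an image; real images have millions of values.
final_image, error_per_step = toy_denoise([0.2, 0.5, 0.9])
print(error_per_step)
```

Each pass removes half of the remaining noise, so the printed error shrinks with every step. That gradual approach, rather than a single jump, is the basic intuition behind the iterative method.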

AI-generated images and copyright: What you need to know

The legal situation for AI-generated images is still evolving: currently, they are not protected by copyright in the United Kingdom, which makes them flexible to use. Nevertheless, trademark and personality rights must be respected. Rapid technological development could lead to changes in copyright law, so it is worth staying informed.

What are the best-known AI image generators?

As with text AI, more and more models are becoming available for AI image generation. GPT-4o and Midjourney are currently at the top of the quality scale.

The ‘o’ in GPT-4o stands for ‘omni’ and describes OpenAI's model as multimodal. This means it can natively (i.e. on its own) process text, images and audio. GPT-4o can generate images, but it can also analyse them or talk about them. This is particularly useful for tasks that require both (such as creating a presentation). GPT-4o has been running as the standard image generator in ChatGPT since March 2025. Prior to that, the diffusion model DALL·E (also from OpenAI) had been used for image generation since 2023.

Age rating (GPT-4o)

Recommended for ages 13 and up

Access (GPT-4o)

Web, mobile app, API for developers

Strengths (GPT-4o)
  • Processes text, images and audio (and, with the integration of Sora, video) as a multimodal model.
  • Seamless transitions between media types are possible.
  • Can generate and analyse images and develop them further in dialogue.
Weaknesses (GPT-4o) 
  • Limited number of images per day (in the free version).
  • Little artistic freedom when creating images.
  • Can only edit existing images to a limited extent (within the chat).
Safety (GPT-4o)
  • Conversations are stored by default.
  • Data is used for training by default (can be opted out of).
  • Strict content guidelines designed to prevent the model from being used for malicious purposes.
Educational value (GPT-4o)
  • Ideal for creating worksheets or illustrations to accompany teaching materials.
  • Can visually support or explain concepts with graphics.
  • Low barrier to entry for teachers and pupils.
Classification (GPT-4o)
  • All-rounder for families and schools.
  • Not particularly suitable for professional art projects.
  • Suitable for creating text-image combinations.

Midjourney is a generative AI that specialises purely in image generation – and it does this really well: the AI is well known for its high-quality and often surreal images. The available parameters offer many possibilities for influencing and developing the image during creation. The community factor also plays a role at Midjourney.

Age rating (Midjourney)

From 13 years of age

Access (Midjourney)

via Discord or Midjourney Alpha

Strengths (Midjourney)
  • High image quality and artistic freedom.
  • Works well for portraits and complex compositions.
  • Inspiration from an active community.
Weaknesses (Midjourney)
  • Primarily usable via Discord, which can be cumbersome for beginners.
  • No free version available.
  • Learning the prompt parameters takes some effort at first.
Safety (Midjourney)
  • Generated images are displayed publicly (depending on subscription).
  • Possible exposure to inappropriate content (moderate content filters and community moderation are not always reliable).
  • Discord environment can be distracting.
Educational value (Midjourney)
  • Displays and integrates different art styles or eras.
  • Teaches composition through practice, which also promotes visual thinking.
  • Not particularly suitable for creating learning materials.
Classification (Midjourney)
  • Best choice for artistic projects.
  • Requires perseverance during the learning curve, but ultimately delivers creative, high-quality images.
  • Premium tool for ambitious artists.

Try Midjourney (Discord or Google account required): https://midjourney.com/home

For advanced learners: Midjourney parameters

Canva is a popular design platform that also offers AI image generation through its Magic Media feature. The focus is on ease of use and the ability to place generated content directly into design projects (flyers, social media stories, job applications, etc.).

Age rating (Canva AI)

Recommended for ages 13 and up

Access (Canva AI)

Web, mobile app

Strengths (Canva AI)
  • Easy to use.
  • Direct integration into design projects in Canva.
  • Also provides suggestions and inspiration for image generation.
Weaknesses (Canva AI)
  • Subsequent image editing is only possible to a limited extent.
  • Premium features (such as uploading reference images) are subject to a fee.
  • Results are not very experimental and sometimes repetitive.
Safety (Canva AI)
  • Child-friendly environment thanks to powerful content filters.
  • Users can be assigned different roles to manage, design or access content.
  • Activities, content and media uploads can be used for training purposes (users can opt out).
Educational value (Canva AI)
  • Canva Education offers numerous templates for teaching materials.
  • Canva Education is available to primary and secondary school pupils upon invitation from their teachers.
  • Suitable for school presentations or for learning the basics of design in a fun way.
Classification (Canva AI)
  • Ideal for beginners or children.
  • Less suitable for purely artistic projects.
  • Offers a good starting point for schools with Canva Education.

Adobe Firefly is Adobe's AI image generator and is integrated into the Adobe Creative Cloud applications. Adobe emphasises an ethical approach to training: according to the provider's own information, the first commercial Firefly model was trained on Adobe Stock images, openly licensed works and content whose copyright had expired.

Age rating (Adobe Firefly)

From 13 years of age (Adobe licence)

Access (Adobe Firefly)

Adobe Creative Cloud, Web

Strengths (Adobe Firefly)
  • Seamless integration with Adobe Creative Cloud.
  • Offers the comprehensive Adobe range for further processing.
  • Parts of an image can be replaced.
Weaknesses (Adobe Firefly) 
  • Complex user interface requires Adobe knowledge.
  • Generative credits vary depending on Creative Cloud subscription.
  • Sometimes a bit too corporate and not very artistic.
Safety (Adobe Firefly)  
  • Transparent data usage and licensing.
  • Generally no copyright issues, so images can be used commercially.
  • Strict guidelines.
Educational value (Adobe Firefly) 
  • Ideal for school projects of a commercial nature.
  • Suitable for public presentations.
  • More suitable for higher levels of education.
Classification (Adobe Firefly)  
  • The most professional approach among all the tools mentioned.
  • Professional quality for corporate communications.
  • Interesting and secure in terms of commercial use.

Stable Diffusion was released in August 2022 as an open-source image generation model. As a result, the AI is now often integrated into third-party programmes such as civitai.com or leonardo.ai. Stable Diffusion offers maximum control and customisability, but requires technical understanding and is therefore mainly used by design professionals.

Age rating (Stable Diffusion)

Depending on the platform used

Access (Stable Diffusion)

App, web, local installations

Strengths (Stable Diffusion)
  • Free and open source.
  • Installed locally, the AI runs even without an internet connection.
  • Maximum control over all parameters.
Weaknesses (Stable Diffusion) 
  • Requires technical skills.
  • When used locally: time-consuming setup and maintenance.
  • Requires powerful hardware.
Safety (Stable Diffusion)  
  • May also generate inappropriate content, as there is little to no censorship or content restrictions.
  • Local use offers maximum data protection.
  • Requirements may vary depending on the (third-party) platform used.
Educational value (Stable Diffusion) 
  • Promotes problem-solving and technical skills.
  • Demonstrates open source principles.
  • Less suitable for younger pupils.
Classification (Stable Diffusion)  
  • Suitable for technically savvy individuals.
  • Best solution for privacy-conscious institutions.
  • Not practical for average family use.

Try Stable Diffusion Online: stablediffusionweb.com

How do the most popular image generators differ in quality when executing the same prompt?

«cute comic style, wide angle, plush elephant shaking hand of a mouse, sunset, warm colors --ar 16:9»

Multimodal models: designing through dialogue

The next generation of AI image generators works slightly differently than its predecessors: instead of only understanding text, multimodal ‘Omni’ models such as GPT-4o can process text, images and audio equally well. That sounds like multitasking – and it is. But only for the AI; for you, it makes usage easier and more natural.

Multimodal AI goes beyond text and images.

What this means for your prompts:  

Conventional image models

You write a text prompt (e.g. “A red apple on a table”) and let the AI generate an image.   

Multimodal models

You can also upload an image of a red apple on a table and instruct the AI: ‘Make the apple blue and add a banana’ or ‘Create a similar scene, but in winter’.

With multimodal models, it has become easier to refine your desired image using an example and in dialogue with the AI. Unlike pure image generators such as DALL·E, multimodal models such as GPT-4o can remember the chat history and previous image versions, allowing them to edit the image iteratively and collaboratively with you. Think of the image AI as a personal designer over whose shoulder you can look and with whom you can exchange ideas. Use the dialogue to ask questions about image editing, to see alternatives or to give specific feedback on the results (‘I like this, but not that’).

One small drawback: multimodal models are still in their infancy. As a result, the AI may lose parts of the original image during editing, and not every image detail can be controlled in conversation.

How do I prompt better images?

A good prompt provides guidelines on visual style, specific content and aspect ratio (depending on the model). Here, we reveal what else you can pay attention to so that the AI generates the images you have in mind.

A few basic principles to start with: avoid filler words when prompting. Prompt length matters, too. Longer prompts give the AI more to work with, but if your specifications are too detailed, the AI may lose focus and visualise elements that matter less to you.

Also research technical terms from the visual arts so that you can give the AI very specific style specifications.

Every generative AI works slightly differently. But with all of them, it is worth paying attention to these basic things:

Not all image generators understand every language equally well. Find out which language your chosen image generator handles best and prompt it in that language. (You can also get help from a translation AI such as DeepL.)

What style should the image be rendered in? Would you prefer a stylised artistic style (like Van Gogh's paintings) or a photorealistic motif? Give the AI a precise task that it can carry out.

What exactly should be visible in the picture? What is in the foreground, what is in the background? Name all the necessary elements.  

What colour scheme should the image be generated in? Do you want a black-and-white image or a colourful scene? Where does the light in the image come from? What is the mood of the image? 

With some tools (such as Midjourney), you can determine the aspect ratio yourself, for example: portraits with a ratio of 3:4.
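The checklist above can be turned into a small helper that assembles a prompt from its parts. This is a minimal sketch in Python, not part of any tool: the function name build_prompt and its fields are hypothetical, and the --ar suffix follows Midjourney's aspect-ratio parameter, so leave it out for tools that set the ratio differently.

```python
def build_prompt(subject, style=None, background=None, lighting=None,
                 colours=None, mood=None, aspect_ratio=None):
    """Join the filled-in prompt parts into one comma-separated prompt."""
    # Order: style first, then content, then light, colour and mood.
    parts = [style, subject, background, lighting, colours, mood]
    # Skip anything left empty so the prompt contains no filler.
    prompt = ", ".join(p.strip() for p in parts if p and p.strip())
    if aspect_ratio:
        # ASSUMPTION: Midjourney-style parameter; other tools handle this differently.
        prompt += f" --ar {aspect_ratio}"
    return prompt


print(build_prompt(
    subject="plush elephant shaking hand of a mouse",
    style="cute comic style, wide angle",
    lighting="sunset",
    colours="warm colors",
    aspect_ratio="16:9",
))
```

Called with the parts of the comparison prompt used earlier on this page, the helper reproduces that prompt, with the aspect ratio written as --ar 16:9. Keeping the parts separate makes it easy to change one aspect, such as the lighting, and compare the results.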

Examples of everyday applications

AI image generation can do more than ‘just’ promote artistic self-expression. It can also help you in everyday family life or in a school context. From room design to history lessons – the possibilities are more diverse than you might think.

Create a Christmas card with AI.

For families

Need a new bedtime story for your child? With multimodal models, you can easily create your own picture book. AI helps you bounce ideas back and forth and formulates your story the way you want it. It can transform your quick sketches into high-quality drawings to illustrate your book. And it can give you helpful tips on printing and organisation.  

Example

Monster Princess by Swisscom

Would you like to freshen up your living room – perhaps with a new sofa? A different colour on the walls? If you don't want to or can't imagine it yourself, let AI do it for you. Simply photograph your living room and use AI to try out different furniture, colours or interior design styles – before you spend any money. 

Example

‘Show me the living room in the uploaded image with a sky blue sofa and bright white walls.’

Whether for a birthday, Christmas or wedding, with AI you can generate personalised cards instead of giving off-the-shelf standard cards.

Note: Please remember to protect your personal data and think carefully about whether and which photos of yourself or others you upload to AI (it is best to obtain their consent beforehand).

Example

Create a Christmas card (video above)

For the school context

How do you explain to your students what life was really like in the Middle Ages? Textbooks can sometimes be dry, and vivid images are not always available. Let AI reconstruct historical scenes and discuss them with your students in class:

Example

‘What did this city look like back then vs. today?’

Microbiological processes occur on a tiny scale and are usually invisible to the naked eye. AI can zoom in on a plant cell and make the invisible visible. Conversely, it can make something unimaginably large tangible, such as human evolution shown in time lapse.

Example

‘Show me what a plant cell looks like from the inside.’

When learning languages, illustrated learning aids can help visual learners more than simple word cards. AI illustrates vocabulary and creates matching scenes or mnemonics that are easier to remember.

Examples

‘A happy dog plays in the park.’ / ‘A French family having breakfast.’

Of course, AI can also help teach media literacy, for example by generating AI images and giving them to children to sort together with photographed images.

Example

‘How can real photos be distinguished from AI images?’ / ‘What are some typical AI errors?’ / ‘How can AI-generated content be properly labelled?’ / ‘What does this mean for journalism and the dissemination of news?’

For work

Abstract concepts are often difficult to visualise. AI can help here by quickly sketching out ideas (without requiring a large investment). It can also assist with mood boards, supplementing them with AI-generated images. Sometimes AI helps to overcome creative blocks by filling the blank page with an initial idea. This gives you more time to finalise the best idea.

Example

‘Create a mood board for a Scandinavian-style packaging design for organic coffee.’

Constantly generating new content for your company is time-consuming. Let AI help you. A multimodal model supports you in the design phase and helps with initial visualisations. Some companies in the fashion industry are already relying entirely on AI-generated content in large-scale campaigns.    

Example

‘Create a second image variant to perform A/B testing. Use brighter colours and dynamic perspectives for the second image variant.’

Boring PowerPoint slides with standard clip art no longer impress anyone these days, but professional graphics are sometimes simply too expensive. AI offers a practical middle ground and designs graphics and diagrams to your liking.

Examples

‘Generate a black icon that symbolises teamwork.’ / ‘Visualise our transformation process by incorporating the following aspects and linking them together: ...’

Notes for professional use

If you want to use AI-generated content for commercial purposes, find out in advance about the usage rights and data protection conditions of the models. For ethical and legal reasons, clearly label AI-generated content as such. Of course, you should also observe any corporate design guidelines. And consider AI as a supplement, but not a replacement, for human skills and creativity.

How can I recognise AI-generated images?

Being able to recognise AI-generated images is becoming an important media literacy skill. Here we show you what to look out for and what to do if you are unsure. With a little practice, you will develop a good sense for it. Nevertheless, always remain vigilant, as the technologies are improving every day.

What applies to detecting video deepfakes usually also helps to expose AI-generated images. However, it is still far from easy. Even experts sometimes get it wrong. So if you are ever unsure, that is completely normal. The important thing is to remain critical and investigate when in doubt.

Distinguishing features of AI images can include:

Some image AIs still struggle to render hands and fingers correctly. Pay particular attention to jewellery: finger rings often blend unnaturally with the hands.

Look at the details: How are the teeth arranged? Are they too perfect or unnaturally aligned? What does the skin look like – does it have any strange transitions? What about the pupils – do the eyes look alive or lifeless? Eyes in generated images often have a fixed gaze.

Even though things are getting better, some image AIs still have trouble displaying text correctly and legibly. This sometimes results in words that don't make sense or signs that contain made-up languages.

Also pay attention to reflections in windows or on surfaces: do they look right? Where is the light source coming from, and is there one at all? Do the shadows match the direction of the light source? 

Search for the image in question using Google's (reverse) image search to find out where else the image may be used. This can give you clues about the origin of the image.

Leading tech companies such as Adobe, Intel and Microsoft are increasingly committed to ensuring that the origin of media content can be certified with watermarks. Perhaps your image is certified?

To the image checker

As a general rule, do not rely on one characteristic alone; instead, examine several aspects. Remain sceptical, especially when it comes to perfect images.

Deepfakes and the dangers of generative AI

Deepfakes exist not only in the form of videos, but also in the form of images. For example, when image elements are replaced using generative AI, the message changes, but the image still looks deceptively real. In the case of images, copyright is also a controversial issue.

What are the dangers of generative AI?

What are the opportunities and limitations in education?

As a teacher, you are faced with the question: Should I use image AI for preparation or in class – and if so, how? As is so often the case, the answer is: Of course, take advantage of the opportunities offered by new technologies, but also be aware of their limitations and risks. This will enable you to make your own decisions and consciously shape media literacy in your class.

Opportunities

How do you explain to a child in Cycle I how a solar panel works? Or how a plant performs photosynthesis? Multimodal models in particular are good at visually depicting how things work and complex interrelationships, and explaining them in a way that is appropriate for a specific age group. While GPT-4o can use the vivid metaphor of a factory to explain solar panels, for example, the integrated image generator supplements the explanations with a suitable illustration.

With this support, you quickly have suitable image material at your fingertips when preparing for class, without paying high licence fees (or fraying your nerves).

A picture is worth a thousand words – especially when those words are not yet part of your vocabulary. For example, when teaching children who are not fluent in English. Or when the key concepts related to the teaching material are very abstract. In such cases, pictures, graphics and visual sequences can help to make the topic easy for everyone to understand.

Limitations

If you generate historical or scientific representations using image AI and integrate them into your lessons, make it clear that you have used AI. Also point out that these are not historically or scientifically accurate representations, but rather visual approximations of the topic that may not necessarily have existed in this form. You may be able to discuss directly in class why and where the generated images differ from real historical material.

Also be aware that AI representations can reinforce stereotypes, for example when depicting cultural groups, since generative AI reproduces common, learned patterns.

Of course, image AI can be very helpful when it comes to visually illustrating complex concepts. But in doing so, AI also takes over part of the students' own thinking – in the case of image AI, in particular, their creative imagination.

It's like watching a film before you've read the book: if you still want to read the book afterwards, you automatically have the actors from the film in your head instead of forming your own image of them. So be aware of the power of images and how they influence your students' imagination.

Webinar for teachers: Understanding and using AI image generators

In this course, teachers learn about AI image generators and what happens in the background once the prompts are sent. We discuss where and how image generators are suitable for teaching and how reality, manipulation and responsibility can be addressed in relation to image generation in the classroom.

The 90-minute webinar was developed in collaboration with LerNetz.

Information about the course

Ask Marcel

Marcel is a trainer at Swisscom. He is available to answer any questions you may have about AI.
