A photorealistic image of a cow surfing in the sea and cheering – for a long time, this was impossible. Today, such subjects are already part of everyday life. This is thanks to generative image AI and diffusion models, which can build an image pixel by pixel according to a prompt. On this page, we ask about well-known models, talk about useful use cases and how to distinguish AI-generated images from real ones.
How do image AIs work?
Artificial intelligence has two different methods at its disposal for generating images: generative adversarial networks (GANs) and diffusion models. But what do these terms actually mean?
Generative adversarial networks (GANs) were the leading AI image-generation technology for many years. A GAN pairs two neural networks trained with deep learning: a generator, which produces an image in a single step, and a discriminator, which judges how realistic that image looks and thereby pushes the generator to improve.
The GAN method was introduced in 2014; this image generator later made it widely known:
https://thispersondoesnotexist.com/
The weakness of GANs is their tendency towards so-called mode collapse: even with different random starting points, the generator can end up producing the same image again and again, because the training process favours such 'safe' outputs.
Diffusion models take a different approach from GANs: in 2021, researchers at OpenAI proposed diffusion models as a new, better technique for image generation in their paper ‘Diffusion Models Beat GANs on Image Synthesis’.
The relevant difference lies in the iterative steps of diffusion models, which avoid duplicates and enable greater detail: the model starts from pure random noise and removes a little of that noise in every step, guided by the prompt, until the finished image remains. Diffusion technology has now become established in all common image-generation tools.
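To make the iterative idea more concrete, here is a deliberately simplified Python sketch. It is not a real diffusion model: a dummy function stands in for the trained denoising network, so only the control flow is shown – start from pure noise and remove a little of it in each step.

```python
# Toy illustration of the reverse-diffusion loop used by diffusion models.
# The real "denoiser" is a large neural network conditioned on the prompt;
# here a dummy function stands in for it, so only the control flow is shown.
import numpy as np

def dummy_denoiser(noisy_image, step, prompt):
    # Placeholder for the trained network: it would predict the noise
    # contained in `noisy_image` at this step, given the prompt.
    return noisy_image / (step + 2)

def generate(prompt, steps=50, size=(64, 64, 3)):
    image = np.random.randn(*size)           # step 0: pure random noise
    for t in reversed(range(steps)):          # iterate from very noisy to (almost) clean
        predicted_noise = dummy_denoiser(image, t, prompt)
        image = image - predicted_noise       # remove a small amount of noise
    return image

result = generate("plush elephant shaking hands with a mouse")
print(result.shape)
```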
When images are created using AI, the legal situation is interesting: currently, they are not protected by copyright in the United Kingdom, which makes their use flexible. Nevertheless, trademark and personality rights must still be respected. Rapid technological developments could lead to changes in copyright law in the future, so it is worth staying informed.
As with text AI, there are also more and more models available for AI image generators. GPT-4o and Midjourney are currently at the top of the quality scale.
The ‘o’ in GPT-4o stands for ‘omni’ and describes OpenAI's model as multimodal. This means it can process text, images and audio natively, i.e. within one and the same model. GPT-4o can generate images, but it can also analyse them or talk about them. This is particularly useful for tasks that require both (such as creating a presentation). GPT-4o has been the standard image generator in ChatGPT since March 2025. Before that, the diffusion model DALL·E (also from OpenAI) had been used for image generation since 2023.
Recommended for ages 13 and up
Web, mobile app, API for developers
Try out GPT-4o: https://chat.openai.com/
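For developers, OpenAI's image generation is also available via the API. Below is a minimal sketch using the official OpenAI Python SDK; note that the model name ‘gpt-image-1’ and the base64 response format are assumptions that may differ depending on your account, plan and SDK version.

```python
# Minimal sketch: generate an image via the OpenAI Images API.
# Assumes the OPENAI_API_KEY environment variable is set and that your
# account has access to the "gpt-image-1" model (name may differ).
import base64
from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="gpt-image-1",
    prompt="cute comic style, plush elephant shaking hands with a mouse, sunset, warm colours",
    size="1024x1024",
)

# gpt-image-1 returns the image base64-encoded; other models (e.g. dall-e-3)
# may return a URL in result.data[0].url instead.
with open("elephant_and_mouse.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```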
Midjourney is a generative AI that specialises purely in image generation – and it does this really well: the AI is well known for its high-quality and often surreal images. The available parameters offer many possibilities for influencing and developing the image during creation. The community factor also plays a role at Midjourney.
From 13 years of age
via Discord or Midjourney Alpha
Try Midjourney (Discord or Google account required): https://midjourney.com/home
For advanced learners: Midjourney parameters
Canva is a popular design platform that also integrates intelligent image generation with Magic Media. The focus is on ease of use and the ability to integrate content directly into designed projects (flyers, social media stories, applications, etc.).
Recommended for ages 13 and up
Web, mobile app
Try Canva AI: https://www.canva.com/
Adobe Firefly is Adobe's AI image generator and is integrated into the Adobe Creative Cloud programmes. This image AI is based on ethical values: according to the provider's own information, the first commercial Firefly model was trained using Adobe Stock images and freely licensed works and content (or content whose copyright had expired).
From 13 years of age (Adobe licence)
Adobe Creative Cloud, Web
Try Adobe Firefly: https://firefly.adobe.com/
Stable Diffusion was released in August 2022 as an open-source image generation model. As a result, the AI is now often integrated into third-party programmes such as civitai.com or leonardo.ai. Stable Diffusion offers maximum control and customisability, but requires technical understanding and is therefore mainly used by design professionals.
Depending on the platform used
App, web, local installations
Try Stable Diffusion online: stablediffusionweb.com
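Because Stable Diffusion is open source, you can also run it locally, for example with the Hugging Face diffusers library. The sketch below is one possible setup, not an official recipe: it assumes a CUDA-capable GPU and uses the Stable Diffusion 2.1 checkpoint as an example; other checkpoints work the same way.

```python
# Minimal local Stable Diffusion example using Hugging Face diffusers.
# Assumes: pip install diffusers transformers torch, and a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",    # example checkpoint; others work the same way
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")                      # use "cpu" without a GPU (much slower)

image = pipe(
    "cute comic style, plush elephant shaking hands with a mouse, sunset, warm colours",
    num_inference_steps=30,                 # more steps = more detail, slower generation
    guidance_scale=7.5,                     # how strictly the prompt is followed
).images[0]

image.save("elephant_and_mouse.png")
```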
How do the most popular image generators differ in quality when executing the same prompt?
‘cute comic style, wide angle, plush elephant shaking hand of a mouse, sunset, warm colors --ar 16:9’
The next generation of AI image generators works slightly differently than its predecessors: instead of only understanding text, multimodal ‘Omni’ models such as GPT-4o can process text, images and audio equally well. That sounds like multitasking – and it is. But only for the AI; for you, it makes usage easier and more natural.
Multimodal AI goes beyond text and images.
What this means for your prompts:
You write a text prompt (e.g. “A red apple on a table”) and let the AI generate an image.
You can also upload an image of a red apple on a table and instruct the AI: ‘Make the apple blue and add a banana’ or ‘Create a similar scene, but in winter’.
With multimodal models, it has become easier to refine your desired image using an example and in dialogue with the AI. Unlike pure image generators such as DALL·E, multimodal models such as GPT-4o remember the chat history and earlier image versions, so they can edit the image iteratively and collaboratively with you. Think of image AI as a personal designer you can look over the shoulder of and exchange ideas with. Use the dialogue to ask questions about image editing, see alternatives or give specific feedback on the results (‘I like this, but not that’).
One small drawback: multimodal image generation is still young and not yet fully mature. The AI may forget parts of the original image, and not every image detail can be controlled through the conversation.
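If you work via the API rather than the chat interface, this kind of image-to-image editing is exposed as a separate ‘edit’ endpoint. The following is a minimal sketch with the OpenAI Python SDK; the model name and the exact parameters are assumptions that may vary by account and SDK version.

```python
# Minimal sketch: edit an existing image with a text instruction via the
# OpenAI Images API. Assumes OPENAI_API_KEY is set and "red_apple.png" exists;
# the model name "gpt-image-1" is an assumption and may differ for your account.
import base64
from openai import OpenAI

client = OpenAI()

result = client.images.edit(
    model="gpt-image-1",
    image=open("red_apple.png", "rb"),               # the starting image
    prompt="Make the apple blue and add a banana",   # the editing instruction
)

with open("blue_apple_with_banana.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```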
A good prompt provides guidelines on visual style, specific content and aspect ratio (depending on the model). Here, we reveal what else you can pay attention to so that the AI generates the images you have in mind.
A few basic principles to start with: avoid filler words when prompting. The right prompt length matters, too: longer prompts give the AI more to work with, but if your specifications are too detailed, the AI may get lost and visualise elements that are not actually important to you.
Also research technical terms from the visual arts so that you can give the AI very specific style instructions.
Every generative AI works slightly differently. But with all of them, it is worth paying attention to these basic things:
Not every image generator understands every language equally well. Find out which language the desired image generator works best in and prompt it in that language. (You can also get help from a translation AI such as DeepL.)
What style should the image be rendered in? Would you prefer a stylised artistic style (like Van Gogh's paintings) or a photorealistic motif? Give the AI a precise task that it can carry out.
What exactly should be visible in the picture? What is in the foreground, what is in the background? Name all the necessary elements.
What colour scheme should the image be generated in? Do you want a black-and-white image or a colourful scene? Where does the light in the image come from? What is the mood of the image?
With some tools (such as Midjourney), you can determine the aspect ratio yourself, for example: portraits with a ratio of 3:4.
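If you like, you can turn this checklist into a small, reusable helper. The Python sketch below simply assembles the individual decisions (style, content, colours, light, aspect ratio) into one prompt string; the parameter names are illustrative, and the ‘--ar’ flag at the end is Midjourney-specific syntax that other tools ignore or do not support.

```python
# Small helper that turns the checklist above into a single prompt string.
# Parameter names are illustrative; adapt them to your own workflow.
def build_prompt(style, subject, foreground="", background="",
                 colours="", light="", aspect_ratio=None):
    parts = [
        style,                                    # e.g. "photorealistic" or "in the style of Van Gogh"
        subject,                                  # the main content of the image
        f"foreground: {foreground}" if foreground else "",
        f"background: {background}" if background else "",
        f"colour scheme: {colours}" if colours else "",
        f"lighting: {light}" if light else "",
    ]
    prompt = ", ".join(p for p in parts if p)
    if aspect_ratio:                              # Midjourney-specific aspect-ratio flag
        prompt += f" --ar {aspect_ratio}"
    return prompt

print(build_prompt(
    style="cute comic style",
    subject="plush elephant shaking hands with a mouse",
    background="sunset over a meadow",
    colours="warm colours",
    light="soft evening light from the left",
    aspect_ratio="16:9",
))
```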
AI image generation can do more than ‘just’ promote artistic self-expression. It can also help you in everyday family life or in a school context. From room design to history lessons – the possibilities are more diverse than you might think.
Create a Christmas card with AI.
Need a new bedtime story for your child? With multimodal models, you can easily create your own picture book. AI helps you bounce ideas back and forth and formulates your story the way you want it. It can transform your quick sketches into high-quality drawings to illustrate your book. And it can give you helpful tips on printing and organisation.
Would you like to freshen up your living room – perhaps with a new sofa? A different colour on the walls? If you don't want to or can't imagine it yourself, let AI do it for you. Simply photograph your living room and use AI to try out different furniture, colours or interior design styles – before you spend any money.
‘Show me the living room in the uploaded image with a sky blue sofa and bright white walls.’
Whether for a birthday, Christmas or a wedding, with AI you can generate personalised cards instead of giving standard, off-the-shelf ones.
Note: Please remember to protect your personal data and think carefully about whether and which photos of yourself or others you upload to AI (it is best to obtain their consent beforehand).
Create a Christmas card (video above)
How do you explain to your students what life was really like in the Middle Ages? Textbooks can sometimes be dry, and vivid images are not always available. Let AI reconstruct historical scenes and discuss them with your students in class:
‘What did this city look like back then vs. today?’
Microbiological processes occur on a small scale and are usually invisible to the naked eye. However, AI can zoom in very close to a plant cell and make invisible things visible. Conversely, it can also make something unimaginably large tangible, such as what the evolution of humans would look like in fast motion.
Show me what a plant cell looks like from the inside.
When learning languages, visual learners in particular benefit more from images than from plain word cards. AI illustrates vocabulary and creates suitable scenes or mnemonics that are easier to remember.
‘A happy dog plays in the park.’ / ‘A French family having breakfast.’
Of course, AI can also help teach media literacy, for example by generating AI images and giving them to children to sort together with photographed images.
‘How can real photos be distinguished from AI images?’ / ‘What are some typical AI errors?’ / ‘How can AI-generated content be properly labelled?’ / ‘What does this mean for journalism and the dissemination of news?’
Abstract concepts are often difficult to visualise. AI can help here by quickly sketching out ideas (without requiring a large investment). It can also assist with mood boards, supplementing them with AI-generated images. Sometimes AI helps to overcome creative blocks by filling the blank page with an initial idea. This gives you more time to finalise the best idea.
‘Create a mood board for a Scandinavian-style packaging design for organic coffee.’
Constantly generating new content for your company is time-consuming. Let AI help you. A multimodal model supports you in the design phase and helps with initial visualisations. Some companies in the fashion industry are already relying entirely on AI-generated content in large-scale campaigns.
‘Create a second image variant to perform A/B testing. Use brighter colours and dynamic perspectives for the second image variant.’
Boring PowerPoint slides with standard clip art no longer impress anyone these days, but professional graphics are sometimes simply too expensive. AI offers a useful middle ground and designs graphics and diagrams to your liking.
‘Generate a black icon that symbolises teamwork.’ / ‘Visualise our transformation process by incorporating the following aspects and linking them together: ...’
If you want to use AI-generated content for commercial purposes, find out in advance about the usage rights and data protection conditions of the models. For ethical and legal reasons, clearly label AI-generated content as such. Of course, you should also observe any corporate design guidelines. And consider AI as a supplement, but not a replacement, for human skills and creativity.
Being able to recognise AI-generated images is becoming an important media literacy skill. Here we show you what to look out for and what to do if you are unsure. With a little practice, you will develop a good sense for it. Nevertheless, always remain vigilant, as the technologies are improving every day.
What applies to detecting video deepfakes usually also helps to expose AI-generated images. However, it is still far from easy. Even experts sometimes get it wrong. So if you are ever unsure, that is completely normal. The important thing is to remain critical and investigate when in doubt.
Yes, some image AIs still struggle to render hands and fingers correctly. Pay particular attention to jewellery: finger rings often blend unnaturally with the hands.
Look at the details: How are the teeth arranged? Are they too perfect or unnaturally aligned? What does the skin look like – does it have any strange transitions? What about the pupils – do the eyes look alive or lifeless? Eyes in generated images often have a fixed gaze.
Even though results are improving, some image AIs still have trouble rendering text correctly and legibly. This sometimes produces words that make no sense or signs written in invented scripts.
Also pay attention to reflections in windows or on surfaces: do they look right? Where is the light source coming from, and is there one at all? Do the shadows match the direction of the light source?
Search for the image in question using Google's (reverse) image search to find out where else the image may be used. This can give you clues about the origin of the image.
Leading tech companies such as Adobe, Intel and Microsoft are increasingly committed to ensuring that the origin of media content can be certified with watermarks. Perhaps your image is certified?
As a general rule, do not rely on one characteristic alone; instead, examine several aspects. Remain sceptical, especially when it comes to perfect images.
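As a small technical aid, you can also look into an image file's metadata: some tools, and the provenance initiatives mentioned above, write information about the image's origin into the file. The Python sketch below uses the Pillow library to list whatever metadata is present; bear in mind that metadata is routinely stripped when images are shared, so an empty result proves nothing either way.

```python
# List the metadata embedded in an image file as a (weak) provenance hint.
# Assumes: pip install pillow, and that "suspicious_image.png" exists locally.
from PIL import Image
from PIL.ExifTags import TAGS

def metadata_hints(path):
    img = Image.open(path)
    hints = dict(img.info)                       # PNG text chunks, XMP, etc.
    for tag_id, value in img.getexif().items():  # EXIF tags such as "Software" or "Artist"
        hints[TAGS.get(tag_id, tag_id)] = value
    return hints

for key, value in metadata_hints("suspicious_image.png").items():
    print(f"{key}: {str(value)[:80]}")
```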
Deepfakes exist not only in the form of videos, but also in the form of images. For example, when image elements are replaced using generative AI, the message changes, but the image still looks deceptively real. In the case of images, copyright is also a controversial issue.
As a teacher, you are faced with the question: Should I use image AI for preparation or in class – and if so, how? As is so often the case, the answer is: Of course, take advantage of the opportunities offered by new technologies, but also be aware of their limitations and risks. This will enable you to make your own decisions and consciously shape media literacy in your class.
How do you explain to a child in Cycle I how a solar panel works? Or how a plant performs photosynthesis? Multimodal models in particular are good at visually depicting how things work and complex interrelationships, and explaining them in a way that is appropriate for a specific age group. While GPT-4o can use the vivid metaphor of a factory to explain solar panels, for example, the integrated image generator supplements the explanations with a suitable illustration.
With this support, you quickly have suitable image material at hand when preparing lessons, without paying high licence fees (or losing your nerve).
A picture is worth a thousand words – especially when those words are not yet part of your vocabulary. For example, when teaching children who are not fluent in English. Or when the key concepts related to the teaching material are very abstract. In such cases, pictures, graphics and visual sequences can help to make the topic easy for everyone to understand.
If you generate historical or scientific representations using image AI and integrate them into your lessons, make it clear that you have used AI. Also point out that these are not historically or scientifically accurate representations, but rather visual approximations of the topic that may not necessarily have existed in this form. You may be able to discuss directly in class why and where the generated images differ from real historical material.
Also, be aware that AI representations can reinforce stereotypes (since generative AI always reproduces common and learned patterns) when depicting cultural groups, for example.
Of course, image AI can be very helpful when it comes to visually illustrating complex concepts. But in doing so, AI also takes over part of the students' own thinking – in the case of image AI, in particular, their creative imagination.
It's like watching a film before you've read the book: if you still want to read the book afterwards, you automatically have the actors from the film in your head instead of forming your own image of them. So be aware of the power of images and how they influence your students' imagination.
In this course, teachers learn about AI image generators and what happens in the background once the prompts are sent. We discuss where and how image generators are suitable for teaching and how reality, manipulation and responsibility can be addressed in relation to image generation in the classroom.
The 90-minute webinar was developed in collaboration with LerNetz.
We have compiled further information and content on the topic of ‘image AI and image generators’ here.
Marcel is a trainer at Swisscom. He is available to answer any questions you may have about AI.
Trainer at Swisscom