The Problem with AI Image Generation + How We Made it Better with TEV1

The TEV1 is ready to use and better than ever!

In this article, we look at the prompt engineering techniques and tools we used to make its image generation more creative.

Note: The TEV1 is an AI tool that analyses user-inputted themes to generate visual representations using advanced models such as DALL-E 3 and GPT-4o. Primarily used in visual design, TEV1 excels in creating detailed graphics for various mediums, ranging from digital art to marketing materials. Continuously refined through user feedback and model training, TEV1 evolves over time to deliver increasingly sophisticated and accurate visual outputs.

The Issue with Traditional Image Generators

One of the core challenges we faced with early iterations of image generation AI, including our initial TEV1, was its tendency to produce art that lacked creativity and uniqueness. These AIs often generated surface-level imagery, closely mimicking the initial input without delving into deeper, more nuanced interpretations of complex themes such as ‘the meaning of life’ or ‘a blank mind.’

This limitation was not just a characteristic of our system, but a common shortfall among most image generators available at the time.

For content creators, brands, and advertisers seeking distinctive and novel visual content, this lack of depth and originality was a significant drawback. It became clear that to offer something truly unique, we needed a solution that could simulate a real-life artistic process, exploring themes more creatively and originally.

Creating an Artificial Digital Artist using Multi-Agent LLMs

Instead of stuffing the prompt with all our requirements and instructions for artwork creation, we broke up the creation process to lean into the strengths of modern LLMs.

LLMs are good at the following:

  • Text summarisation

  • Following patterns in text and continuing them

  • Following instructions, as long as those instructions are not too broad, vague or ‘complicated’

The key word there is ‘complicated’. If we tell the LLM to first analyse a theme and then write artwork descriptions exploring it, without explicitly telling it how to explore the theme, it either struggles or explores the theme poorly. In addition, the part of the prompt that asks it to select the key features of the artwork does not get the attention it needs, because an LLM’s attention is spread thinly across a long context window.

To avoid this issue, we set up separate agents, each responsible for a different part of the artistic process.
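To make the split concrete, here is a minimal sketch of how three narrow steps could be chained. It is not the TEV1 source code: the function names, the prompt wording and the stub agents are all assumptions made for illustration, and the stubs stand in for real LLM calls so the example runs offline.

```python
from typing import Callable

# Hypothetical type: any function that sends a prompt to an LLM and returns text.
LLM = Callable[[str], str]


def run_pipeline(theme: str, explorer: LLM, curator: LLM,
                 structurer: Callable[[str, str], str]) -> str:
    """Chain three narrow steps instead of sending one long prompt."""
    # Step 1: turn the theme into a single subject line.
    subject_line = explorer(
        f"Theme: {theme}\nWrite one subject line for an artwork."
    )
    # Step 2: pick artwork features for that subject line.
    features = curator(
        f"Theme: {theme}\nSubject Line: {subject_line}\n"
        "List the artwork features."
    )
    # Step 3: plain code (no LLM) builds the final image prompt.
    return structurer(subject_line, features)


# Stub agents (assumptions) so the sketch runs without any API access.
def fake_explorer(prompt: str) -> str:
    return "Glowing figures dancing in a forest"


def fake_curator(prompt: str) -> str:
    return "Art Style: Impressionism"


def join_parts(subject_line: str, features: str) -> str:
    return f"{subject_line}. {features}"


final_prompt = run_pipeline("pure souls", fake_explorer, fake_curator, join_parts)
print(final_prompt)
```

Each agent sees only a short prompt with one job, which is the point of the decomposition: no single call has to juggle theme analysis, feature selection and formatting at once.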

The ‘Theme Explorer’

The Theme Explorer, named after the tool, is the most integral part of the artwork creation process. This agent takes the user’s theme as input and turns it into a representative subject line that forms the main content of the artwork.

For example, given an input of ‘pure souls’, we might get a subject line like ‘ethereal life-like figures dancing’ or ‘the soul of a person illuminating the body’. Here is an example of the model output for this agent based on the input ‘pure souls’.

Subject Line: Enchanted forest illuminated by fireflies with ethereal, glowing lights representing pure souls dancing through the night.
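As a rough illustration of this step, the sketch below builds a narrow prompt for the theme and extracts the subject line from a reply in the format shown above. The prompt wording and both function names are hypothetical; only the ‘Subject Line:’ output format comes from the example.

```python
import re


def build_explorer_prompt(theme: str) -> str:
    """Hypothetical prompt: one narrow task, one fixed output format."""
    return (
        "You are an artist exploring a theme.\n"
        f"Theme: {theme}\n"
        "Reply with exactly one line in the form:\n"
        "Subject Line: <a concrete scene that represents the theme>"
    )


def parse_subject_line(reply: str) -> str:
    """Pull the text after 'Subject Line:' out of the model reply."""
    match = re.search(
        r"^\s*Subject Line:\s*(.+)$",
        reply,
        flags=re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        raise ValueError("reply did not contain a 'Subject Line:' line")
    return match.group(1).strip()


# The reply below is the example output quoted in this article.
reply = (
    "Subject Line: Enchanted forest illuminated by fireflies with ethereal, "
    "glowing lights representing pure souls dancing through the night."
)
subject_line = parse_subject_line(reply)
print(subject_line)
```

Raising an error when the line is missing lets the caller retry the agent rather than pass a malformed subject line downstream.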

Now that we have our subject line, we can pass this onto the next agent, the ‘Curator’.

The ‘Curator’

This step takes the input theme and the subject line created by the previous agent, then selects the key features of the artwork.

We provide this agent with additional data such as art styles and artists to help it pick features of the artwork to accompany the subject line. We also provide access to a repository of previous artwork descriptions that we manually select as guidance for structure.

The historical prompt data helps the LLM establish a pattern between a subject line and its associated artwork features.

The output is the set of artwork features we need to generate the artwork.

Art Style: 
Subject Matter: 
Referenced Artists: 
Composition: 
Orientation: 
Color Palette: 
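One way the Curator’s prompt could be assembled from curated examples is as a few-shot pattern, sketched below. This is an assumption about the approach rather than the actual TEV1 prompt: the example entry is hand-written for illustration (it is not from our repository), and `build_curator_prompt` is a hypothetical helper. Only the six field names come from the list above.

```python
FIELDS = [
    "Art Style", "Subject Matter", "Referenced Artists",
    "Composition", "Orientation", "Color Palette",
]

# Hypothetical, hand-written example standing in for the curated repository.
EXAMPLES = [
    {
        "Subject Line": "A lighthouse standing against a storm",
        "Art Style": "Romanticism",
        "Subject Matter": "Lone lighthouse, crashing waves",
        "Referenced Artists": "J. M. W. Turner",
        "Composition": "Lighthouse off-centre, low horizon",
        "Orientation": "Landscape",
        "Color Palette": "Slate grey, deep blue, warm yellow",
    },
]


def build_curator_prompt(theme: str, subject_line: str, examples=EXAMPLES) -> str:
    """Lay out past examples in a fixed pattern, then leave the new one open."""
    blocks = []
    for example in examples:
        lines = [f"Subject Line: {example['Subject Line']}"]
        lines += [f"{field}: {example[field]}" for field in FIELDS]
        blocks.append("\n".join(lines))
    # End on the first empty field so the model continues the pattern.
    blocks.append(
        f"Theme: {theme}\nSubject Line: {subject_line}\n{FIELDS[0]}:"
    )
    return "\n\n".join(blocks)


prompt = build_curator_prompt("pure souls", "Glowing figures dancing in a forest")
print(prompt)
```

This leans on the pattern-continuation strength mentioned earlier: the model sees complete examples in a fixed layout and is asked only to fill in the same fields for the new subject line.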

The ‘Structurer’

This agent does not involve an AI response of any kind. It is a simple algorithm that takes the unstructured feature data from the Curator, corrects any errors, fills in any missing information and formats the result into the final image prompt, which is fed into a foundation image generation model like DALL-E or Stable Diffusion.
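A minimal sketch of what such a non-AI step might look like follows. The default values, the helper names and the final prompt layout are assumptions for illustration; the article does not specify how TEV1 repairs or formats the data.

```python
FIELDS = [
    "Art Style", "Subject Matter", "Referenced Artists",
    "Composition", "Orientation", "Color Palette",
]

# Hypothetical fallbacks used when the Curator leaves a field out or empty.
DEFAULTS = {"Orientation": "Landscape"}


def parse_features(text: str) -> dict:
    """Keep only 'Field: value' lines whose field is known and value non-empty."""
    features = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key in FIELDS and value:
            features[key] = value
    return features


def build_image_prompt(subject_line: str, curator_output: str) -> str:
    """Repair the feature data and format it into one image prompt string."""
    features = parse_features(curator_output)
    for field, default in DEFAULTS.items():
        features.setdefault(field, default)
    parts = [subject_line.rstrip(".") + "."]
    # Emit fields in a fixed order so the prompt layout is predictable.
    parts += [f"{field}: {features[field]}." for field in FIELDS if field in features]
    return " ".join(parts)


curator_output = "Art Style: Impressionism\nColor Palette: \nOrientation:\nsome stray text"
image_prompt = build_image_prompt("Fireflies in a forest", curator_output)
print(image_prompt)
```

Because this step is deterministic code, the same Curator output always yields the same image prompt, which makes formatting problems easy to reproduce and fix.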

TEV1 in Action — Some Examples

Let’s take a look at some examples of the tool to see how it may be used in the real world.

‘Digital Transformation’

Maybe you are writing an article about the potential impact of AI, or about a specific technology such as pattern recognition in biometric security. You might enter ‘digital transformation’ into the TEV1. Using a temperature of 40, we get the following cover art for the article.

‘Health & Wellness’

Health and wellness is a common topic of discussion in blogs and forums, so why not generate thematic artworks to help get your message across? Temperature = 40.

‘Political Discourse’

Perhaps you are drafting a post for Instagram or Twitter about a recent event in politics and you want to capture your audience’s attention at first glance. Simply enter ‘political discourse’ into the tool to bring the topic to life. Temperature = 70.


TRY THE TOOL NOW - FOR FREE

Expand your creative horizons with Asycd’s state-of-the-art image generator! By leveraging advanced AI technology, our tool transforms your text into breathtaking artworks, offering endless creative possibilities. As a special welcome, sign up now and enjoy 5 free generations to explore the full potential of our tool.

By accepting the terms and conditions, you are eligible for copyright ownership of the images generated using our tool. Rest assured, our platform respects copyright implications, providing guidance on legal aspects to protect your work.

Discover the comprehensive range of Asycd’s services and innovations designed to enhance your artistic journey.
