As a hobbyist in AI image generation, I’ve played a lot with Midjourney, and just this morning I had the opportunity to compare it to DALL-E 3. Below is a table with a numerical TL;DR of the comparison: DALL-E 3 emerges as the clear winner with a total of 7.5 points against 5 for Midjourney. However, for the way I’m using it (generating creative seamless patterns), I’m going to stick with Midjourney for now. Read my comments below for a deeper explanation of the differences.
| Criterion | Midjourney | DALL-E 3 |
| --- | --- | --- |
| Body parts (e.g. hands) | 0 | 1 |
| Text inclusion | 0 | 1 |
| Seamless patterns | 1 | 0 |
| API | 0.5 | 1 |
| Zoom-out and image variation | 1 | 0.5 |
| Refining and brainstorming prompts | 0.5 | 1 |
| Cost | 0 | 0 |
| Commercial rights | 1 | 1 |
| Privacy | 0 | 1 |
| Artist protection | 0 | 1 |
| Creativity | 1 | 0 |
| Total | 5 | 7.5 |
Here are my 9 takeaways:
1. DALL-E 3 does way better with hands (or paws 😺). These are notoriously difficult to render, even for human artists, and I usually have to spend a few iterations with Midjourney to get them right. DALL-E 3 did an excellent job on every image (number of fingers, position, and even number of hands!).
2. DALL-E 3 is also much better at integrating specific texts into images, which is an impressive step up compared to DALL-E 2 and Midjourney.
3. Midjourney is better for seamless patterns, which are a must-have if you are generating patterns for anything in home decor or clothing. ChatGPT was nice enough to help me come up with a good prompt for seamless patterns, and the result was decent, but not as flawless as what Midjourney produces.
4. I really like the Midjourney options of zooming out and being able to target parts of the image to vary. It’s not possible yet with the version of DALL-E 3 accessible through ChatGPT, but similar features are available for DALL-E 2 and the API, so I expect this not to be an issue in the future.
5. Having an API available is really key for software innovation; hopefully Midjourney is working on an official one soon 🤞 (there are several unofficial ones).
6. Midjourney has a bunch of commands and parameters to easily specify technical requests. As an engineer, this is my language, but I can see the power of having DALL-E 3 integrated with ChatGPT to allow for brainstorming and refining prompts.
7. Both Midjourney and DALL-E 3 have a cost (and I’m not talking about the environmental cost of these models, though hopefully that will enter the AI conversation sooner rather than later). Both give their users commercial rights. However, a big difference is that unless you’re on the most expensive plans, Midjourney is open to the community: your generated images and prompts are visible to others. I’m still not sure how I feel about that.
8. I find the results from DALL-E 3 less artistic than Midjourney’s, and I wonder if it’s because they might have imposed more constraints on their training set. The article introducing DALL-E 3 specifies that it’s designed to decline requests in the style of living artists and that creators can opt their images out of training. I guess it’s a step in the right (i.e. ethical) direction, although an opt-in system would be even better.
9. Overall, I’d say I still prefer Midjourney for the creativity to explore different styles and directions, as well as its seamless pattern capability. However, DALL-E 3 seems to be much better at following prompts and rendering difficult but important details such as body parts or text.
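Since “seamless” is doing a lot of work in takeaway 3, here is a minimal sketch of how a tile can be sanity-checked programmatically: a seamless tile should wrap around, so its opposite edges should be (nearly) identical. This is my own rough heuristic in Python with NumPy, not anything provided by Midjourney or DALL-E, and the function name `edge_mismatch` is just what I’m calling it here.

```python
import numpy as np

def edge_mismatch(tile: np.ndarray) -> float:
    """Mean absolute difference between opposite edges of an image tile.

    A perfectly tileable image wraps: its left column should match its
    right column and its top row its bottom row, giving a score near 0.
    """
    lr = np.abs(tile[:, 0].astype(float) - tile[:, -1].astype(float)).mean()
    tb = np.abs(tile[0, :].astype(float) - tile[-1, :].astype(float)).mean()
    return (lr + tb) / 2

# A flat-color tile repeats perfectly in every direction.
flat = np.full((8, 8, 3), 128, dtype=np.uint8)
print(edge_mismatch(flat))  # 0.0

# A horizontal gradient jumps from 255 back to 0 at the seam.
grad = np.tile(np.linspace(0, 255, 8, dtype=np.uint8)[None, :, None], (8, 1, 3))
print(edge_mismatch(grad))  # 127.5
```

For real generated tiles the score won’t be exactly zero; I’d just look for values that are small relative to the image’s overall contrast, and then tile the image 2×2 visually as the final check.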