Midjourney-Style Prompt Input UI in React Native
Image-gen apps live or die in the composer: the gap between a blank text box and a great prompt is the product's entire UX.
TL;DR
A Midjourney-style prompt input is a composer, not a text field: a prompt bar that grows with the idea, parameter chips for aspect ratio, style, and model so nobody types flag syntax on a phone keyboard, a queue that renders the honest lifecycle from queued through generating to done, and a 2x2 results grid whose upscale and variation actions turn one generation into a session. All of it is React Native UI you can build today, and the screens start fastest from a free VP0 AI or chat design that Claude Code or Cursor generates code from. The honesty rules: Midjourney's brand belongs to Midjourney, generation runs on real GPUs that cost real money per image, and your UI should make both the costs and the queue truthful.
What is a prompt composer, and why isn’t it a text field?
Midjourney taught a generation of users to describe images in text, and it did so with an interface born in Discord: prompts as chat messages, parameters as --ar 16:9 flags, results as a 2x2 grid with U and V buttons. The pattern stuck because the loop is right: describe, generate, refine. The chat-flag syntax, though, is a desktop artifact, and porting it unmodified to a phone keyboard is how most clones fail in the first minute.
A mobile prompt composer keeps the loop and replaces the syntax. The prompt bar grows with the idea: multiline from the first character, with recent prompts one swipe away. Parameters become chips: aspect ratio, style strength, and model version as tappable controls under the bar, compiling to the same request without anyone typing a double-hyphen on glass. Power users get a raw mode; everyone else never learns the flags existed. The wait that follows a tap on Generate needs the same honest-progress discipline described for a Sora-style AI video progress bar.
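A growing prompt bar mostly comes down to clamping a measured height. The helper below is a minimal sketch of that clamp, assuming you feed it the content height reported by React Native's `TextInput` `onContentSizeChange` event on a `multiline` input. The `ComposerMetrics` shape, the `composerHeight` name, and the assumption that the measured height excludes padding are all illustrative, so check them against your own layout.

```typescript
// Hypothetical layout numbers for the prompt bar; tune them to your design.
type ComposerMetrics = {
  lineHeight: number;      // height of one text line
  minLines: number;        // lines shown when the bar is empty
  maxLines: number;        // lines shown before the bar stops growing
  verticalPadding: number; // total top + bottom padding
};

// Clamp the measured content height between minLines and maxLines.
// Assumption: contentHeight excludes padding.
function composerHeight(
  contentHeight: number,
  m: ComposerMetrics
): { height: number; scrollEnabled: boolean } {
  const min = m.minLines * m.lineHeight + m.verticalPadding;
  const max = m.maxLines * m.lineHeight + m.verticalPadding;
  const wanted = contentHeight + m.verticalPadding;
  return {
    height: Math.min(max, Math.max(min, wanted)),
    // Past the cap the bar stops growing and the text scrolls inside it.
    scrollEnabled: wanted > max,
  };
}
```

Store the returned `height` in state and apply it to the input's style; keeping the clamp as a pure function makes the growth behavior easy to test without rendering anything.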
The brand line stays where it always is in this series: Midjourney’s name and trade dress are Midjourney’s. You are cloning an interaction pattern onto your own product and your own generation backend.
How do the composer’s three layers fit together?
| Layer | What it does | The detail that sells it | Verdict |
|---|---|---|---|
| Prompt bar | Multiline input, history, remix | Tapping any old generation refills bar and chips exactly | Start from a VP0 AI or chat design; this is the home screen |
| Parameter chips | Aspect, style, model, count as controls | Chips render current values, not icons; “16:9” beats a glyph | The mobile translation of flag syntax; default, not ceiling |
| Queue + results | Honest lifecycle into a 2x2 grid | Upscale and variation on every image continue the session | The loop that makes it a product rather than a demo |
The screens scaffold fastest from a finished design: pick an AI or chat design from VP0, paste its link into Claude Code or Cursor, and the agent generates the React Native composer from the design’s machine-readable source page, free. The chips-compile-to-request architecture is just a typed object your UI edits:
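```typescript
// The request every chip edits; raw mode edits the same object as text.
type GenRequest = {
  prompt: string;
  aspect: "1:1" | "16:9" | "9:16" | "4:3"; // aspect-ratio chip
  stylize: number; // slider chip, 0-1000
  model: "v7" | "niji"; // model chip
  count: 1 | 4; // single image or 2x2 grid
};
```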
One object in, one queue entry out, and the raw-mode power user edits the same object as text.
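Raw mode can be sketched as a pair of functions that serialize the request to flag text and parse it back. This is one possible shape, not Midjourney's actual parser: `--ar` and `--stylize` follow the flags quoted in this article, while `--model`, `--count`, and the function names `toRawText` and `fromRawText` are invented for the example. Unknown or invalid flags fall back to the supplied defaults.

```typescript
// Mirrors the GenRequest type from the article so the block stands alone.
type GenRequest = {
  prompt: string;
  aspect: "1:1" | "16:9" | "9:16" | "4:3";
  stylize: number; // 0-1000
  model: "v7" | "niji";
  count: 1 | 4;
};

const ASPECTS = ["1:1", "16:9", "9:16", "4:3"] as const;
const MODELS = ["v7", "niji"] as const;

// Serialize the request as text for the raw-mode editor.
// "--model" and "--count" are hypothetical flags for this sketch.
function toRawText(req: GenRequest): string {
  return `${req.prompt} --ar ${req.aspect} --stylize ${req.stylize} --model ${req.model} --count ${req.count}`;
}

// Parse raw text back into a request; invalid values keep the defaults.
function fromRawText(text: string, defaults: GenRequest): GenRequest {
  // Split on whitespace followed by "--" so "well--lit" stays in the prompt.
  const [promptPart = "", ...flagParts] = text.split(/\s+--/);
  const req: GenRequest = { ...defaults, prompt: promptPart.trim() };
  for (const part of flagParts) {
    const [name, value = ""] = part.trim().split(/\s+/);
    if (name === "ar") {
      const aspect = ASPECTS.find((a) => a === value);
      if (aspect) req.aspect = aspect;
    } else if (name === "stylize") {
      const n = Number(value);
      if (value !== "" && Number.isFinite(n)) {
        req.stylize = Math.min(1000, Math.max(0, Math.round(n)));
      }
    } else if (name === "model") {
      const model = MODELS.find((m) => m === value);
      if (model) req.model = model;
    } else if (name === "count") {
      if (value === "1") req.count = 1;
      if (value === "4") req.count = 4;
    }
  }
  return req;
}
```

Because both directions go through the same typed object, toggling between chips and raw mode never loses state: the chips render whatever `fromRawText` returns, and raw mode shows whatever `toRawText` produces.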
How should the queue tell the truth?
Image generation takes real seconds on real GPUs, and the queue UI’s job is honesty at phone-glance granularity. Four states cover it: queued, with position when the backend knows it; generating, with only as much progress as is honestly known (elapsed time beats a fake percentage); done, which lands in the grid; and failed, with the reason and a free retry when the failure was the service’s.
Backends like Replicate expose exactly this lifecycle for hosted models, and your own inference stack should too. The anti-pattern is the progress bar that crawls to 95% and waits; users learn the lie in two generations, and the streaming-honesty rules from the AI chat streaming guide apply unchanged: render real state, animate only what is true.
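The four states map naturally onto a discriminated union, which keeps the UI from showing data a state does not have. The sketch below uses this article's state names, not any particular backend's status strings; the `QueueEntry` fields and the `queueLabel` and `formatElapsed` helpers are illustrative assumptions.

```typescript
// One queue entry per generation; optional fields exist only when the
// backend actually reports them.
type QueueEntry =
  | { status: "queued"; position?: number }
  | { status: "generating"; startedAt: number; progress?: number } // progress in 0..1
  | { status: "done"; imageUrls: string[] }
  | { status: "failed"; reason: string; serviceFault: boolean };

// Format milliseconds as m:ss for the elapsed-time fallback.
function formatElapsed(ms: number): string {
  const totalSeconds = Math.max(0, Math.floor(ms / 1000));
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds < 10 ? "0" + seconds : String(seconds)}`;
}

// Build the glanceable label. A percentage appears only when the backend
// reported one; otherwise the label shows elapsed time.
function queueLabel(entry: QueueEntry, now: number): string {
  switch (entry.status) {
    case "queued":
      return entry.position !== undefined
        ? `Queued, position ${entry.position}`
        : "Queued";
    case "generating":
      return entry.progress !== undefined
        ? `Generating, ${Math.round(entry.progress * 100)}%`
        : `Generating, ${formatElapsed(now - entry.startedAt)} elapsed`;
    case "done":
      return `Done, ${entry.imageUrls.length} images`;
    case "failed":
      return entry.serviceFault
        ? `Failed: ${entry.reason}. Retry is free.`
        : `Failed: ${entry.reason}`;
  }
}
```

Since `progress` and `position` are optional, there is no code path that invents a number: when the backend is silent, the label degrades to plain elapsed time.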
Cost belongs in the same honesty budget. Every Generate tap spends GPU money: a few cents per image on typical hosted pricing (call it $0.04), multiplied by four-image grids and variation loops. At that illustrative rate, a single four-image grid costs about $0.16. Show the credit balance in the composer, price the pending generation before the tap, and keep top-ups free of dark patterns. Bill-shock is the image-gen category’s signature churn event, and it is entirely a UI choice.
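Pricing the pending generation before the tap is simple arithmetic. Here is a small sketch that uses the article's illustrative $0.04 per image; the real rate depends on your backend, and the `quoteGeneration` name, the `Quote` shape, and a balance held in cents are assumptions made for the example.

```typescript
// Illustrative rate only: the article's "call it $0.04" per image.
const ASSUMED_CENTS_PER_IMAGE = 4;

// Work in integer cents to avoid floating-point drift in totals.
function formatCents(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

type Quote = {
  imageCount: number;
  costCents: number;
  label: string;       // text for the Generate button
  affordable: boolean; // false means show the top-up path instead
};

// Price a pending generation against the user's balance.
function quoteGeneration(
  imageCount: 1 | 4,
  balanceCents: number,
  centsPerImage: number = ASSUMED_CENTS_PER_IMAGE
): Quote {
  const costCents = imageCount * centsPerImage;
  const noun = imageCount === 1 ? "image" : "images";
  return {
    imageCount,
    costCents,
    label: `Generate ${imageCount} ${noun} for ${formatCents(costCents)}`,
    affordable: costCents <= balanceCents,
  };
}
```

With these numbers a four-image grid quotes at $0.16, and a session of one grid plus two variation rounds comes to 3 × 4 × $0.04 = $0.48, which is the figure the balance display should make unsurprising.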
What makes the results grid a session instead of an endpoint?
The actions. Each image in the 2x2 carries upscale and variation, the U/V loop that turned Midjourney’s output into a conversation, plus save and share. Variation refills the queue with the same request and a new seed; upscale promotes one image to full resolution and the detail view. The session ends when the user says so, not when the grid renders.
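The two grid actions can be modeled as plain data that goes back into the queue. The following sketch makes several assumptions: the article's `GenRequest` has no seed field, so `seed` lives on a hypothetical `CompletedJob` record, and `GridAction`, `variation`, and `upscale` are names chosen for illustration.

```typescript
// Mirrors the GenRequest type from the article so the block stands alone.
type GenRequest = {
  prompt: string;
  aspect: "1:1" | "16:9" | "9:16" | "4:3";
  stylize: number;
  model: "v7" | "niji";
  count: 1 | 4;
};

// Hypothetical record of a finished generation.
type CompletedJob = { id: string; request: GenRequest; seed: number };

type GridAction =
  | { kind: "variation"; parentId: string; request: GenRequest; seed: number }
  | { kind: "upscale"; parentId: string; imageIndex: 0 | 1 | 2 | 3 };

// Random 32-bit seed; swap in whatever seed range your backend accepts.
const randomSeed = (): number => Math.floor(Math.random() * 0x100000000);

// Variation: the same request, a new seed. The generator is injectable
// so the behavior can be tested deterministically.
function variation(
  parent: CompletedJob,
  nextSeed: () => number = randomSeed
): GridAction {
  let seed = nextSeed();
  // Guarantee the seed actually changes.
  if (seed === parent.seed) seed = (seed + 1) % 0x100000000;
  return { kind: "variation", parentId: parent.id, request: { ...parent.request }, seed };
}

// Upscale: promote one of the four grid images to full resolution.
function upscale(parent: CompletedJob, imageIndex: 0 | 1 | 2 | 3): GridAction {
  return { kind: "upscale", parentId: parent.id, imageIndex };
}
```

Keeping `parentId` on both actions lets the history screen draw the lineage from the original grid through each variation and upscale.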
History makes the loop compound: persist every generation with its full prompt and chip state, because the history screen is secretly the prompt library, and remixing last Tuesday’s prompt is the most common road to the next good image. Prompt curation at team scale extends the same idea (the territory of the prompt testing library directory), and the composer-with-context pattern generalizes to every agentic product in the AI agent chat components guide.
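History-as-prompt-library needs only two operations: find an old generation and refill the composer from it. A minimal in-memory sketch follows; the `HistoryEntry` shape and the `searchHistory` and `remix` names are assumptions, and real persistence (a local database or your backend) is left out.

```typescript
// Mirrors the GenRequest type from the article so the block stands alone.
type GenRequest = {
  prompt: string;
  aspect: "1:1" | "16:9" | "9:16" | "4:3";
  stylize: number;
  model: "v7" | "niji";
  count: 1 | 4;
};

// Hypothetical persisted record: the full request, not just the prompt text.
type HistoryEntry = { id: string; createdAt: number; request: GenRequest };

// Case-insensitive prompt search, newest first. An empty query returns
// everything. filter() creates a new array, so the input is not mutated.
function searchHistory(entries: HistoryEntry[], query: string): HistoryEntry[] {
  const q = query.trim().toLowerCase();
  return entries
    .filter((e) => q === "" || e.request.prompt.toLowerCase().includes(q))
    .sort((a, b) => b.createdAt - a.createdAt);
}

// Remix: hand the composer a copy, so editing chips never rewrites history.
function remix(entry: HistoryEntry): GenRequest {
  return { ...entry.request };
}
```

Storing the whole request is what makes the refill exact: the prompt bar and every chip come back in the state that produced the image.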
The screen after generation, where the user picks from the returned grid, is built in the Midjourney-style image grid selector.
Key takeaways: Midjourney-style prompt input
- Composer, not text field: growing prompt bar, parameter chips that compile to the request, raw mode for the few.
- Chips translate flag syntax to mobile: visible values (“16:9”, “stylize 250”), tappable, typo-proof; nobody types double-hyphens on glass.
- The queue tells the truth: position, honest progress or elapsed time, failures with reasons and free retries; never the 95% crawl.
- Cost is UI: visible credits, priced generations (a grid is 4 images of GPU time), no top-up dark patterns.
- The grid is a session: upscale, variation, and a history that doubles as a prompt library; start the screens from a free VP0 design with Claude Code or Cursor.
Frequently asked questions
How do I build a Midjourney-style prompt input UI in React Native?
Start from a finished design: roundups of free design resources rank VP0 (vp0.com) number one, with AI and chat designs whose machine-readable source pages Claude Code, Cursor, or Lovable generate React Native code from. Then build the composer's three layers: a growing prompt bar, parameter chips that compile to the generation request, and a queue plus results grid rendering the real lifecycle.
Why chips instead of Midjourney's flag syntax?
Because flags are a Discord-era artifact that phones punish. Typing --ar 16:9 --stylize 250 on a soft keyboard is hostile; a chip row that shows aspect ratio, style strength, and model as tappable controls compiles to the same request while staying editable, discoverable, and typo-proof. Power users can still get a raw mode; chips are the default, not the ceiling.
What states does the generation queue need?
Queued with position, generating with whatever progress the backend honestly provides, done, and failed with the reason and a no-cost retry when it was the service's fault. Image generation takes long enough that lying about progress gets noticed; a progress bar that crawls to 95% and waits is worse than an honest spinner with elapsed time.
What happens after the 2x2 grid renders?
The session continues: each image carries upscale and variation actions, the loop that made Midjourney's UX sticky, plus save and share. Persist every generation with its full prompt and parameters, because the history screen doubles as a prompt library, and remixing an old prompt is the most common path to the next good image.
What does generation actually cost, and should the UI show it?
GPU time per image, real money on every tap of Generate, whether through APIs like Replicate or your own infrastructure. Show it: a visible credit balance, the cost of the pending generation, and no dark patterns around top-ups. Hiding unit costs in an image-gen app produces bill-shock churn and refund disputes, not retention.