The return to text interfaces will be temporary
Many computer users are too young to remember, but in the early days of computing, all software was text-based. Users typed commands into a terminal, and results were displayed as text. Fortunately, we moved away from this, building specialised graphical interfaces for virtually all applications, and until recently, text-based user interfaces were a relic of the past for most users.
Well, text-based user interfaces are back in vogue thanks to ChatGPT, and to many users and builders, this is disappointing. Why would we want to throw away our long history of graphical user interfaces for inferior, difficult-to-use, text-based interfaces?
ChatGPT has a text user interface because its inputs and outputs are too diverse to predict: it is a massively generalised tool (it can do a lot of unrelated tasks), and its inputs and outputs are probabilistic (the input incantation does not have to be specific, and the output will vary every time).
For example, the output of a request to an AI assistant could be a booked flight, hotel options to choose from, an edited essay, a movie that requires a little fine-tuning, a functioning website, suggested restaurants on a map, directions, video instructions for a recipe, or the meaning of life. All of these outputs (and their corresponding inputs) require very different user interfaces.
It’s impossible to deterministically design a user interface for a truly generalised, probabilistic tool. There are simply too many variations of input and output to do this in a deterministic (i.e., pre-planned) way¹. Asking a UX designer to do this is equivalent to asking them to open up Figma and, in a single file, design every single app that has ever existed and will ever exist in the future. They will be there for eternity.
So, are we doomed to text interfaces for the rest of our lives? Is chat the ultimate form factor? Absolutely not.
The solution to probabilistic inputs and outputs is probabilistic UIs. Future versions of ChatGPT (or whatever replaces it) will generate results and then generate a custom user interface to best display these results. Our phones (and future devices²) will no longer be defined by pre-installed apps and features. Instead, these will morph and evolve with each use³, based on what our device decides we need to see/do to get our work done. This will make the current paradigm seem prehistoric.
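To make the idea a little more concrete, here is a minimal sketch of how a "probabilistic UI" could be wired. Everything in it is hypothetical (the GeneratedUI spec, the component vocabulary, the generateUI and render functions are my own illustrative names, not any real ChatGPT or platform API): the model returns not just an answer but a declarative description of the interface, and the device renders it from a small set of native components it already knows how to draw.

```typescript
// Hypothetical sketch: the model emits a declarative UI spec alongside its answer,
// and the device renders that spec using native components it already knows about.

type UIComponent =
  | { kind: "text"; body: string }
  | { kind: "choiceList"; prompt: string; options: string[] }
  | { kind: "map"; markers: { label: string; lat: number; lng: number }[] }
  | { kind: "form"; fields: { name: string; label: string; type: "text" | "date" }[] };

interface GeneratedUI {
  title: string;
  components: UIComponent[];
}

// Stand-in for a call to whatever model powers the assistant. In reality this
// would be an LLM asked to respond with JSON conforming to GeneratedUI.
async function generateUI(userRequest: string): Promise<GeneratedUI> {
  // e.g. userRequest = "find me a hotel in Lisbon next weekend"
  return {
    title: "Hotels in Lisbon",
    components: [
      { kind: "text", body: "Here are three options for 14-16 June:" },
      { kind: "choiceList", prompt: "Pick a hotel", options: ["Hotel A", "Hotel B", "Hotel C"] },
      { kind: "map", markers: [{ label: "Hotel A", lat: 38.72, lng: -9.14 }] },
    ],
  };
}

// The client walks the spec and mounts real widgets (stubbed here as console output).
function render(ui: GeneratedUI): void {
  console.log(`== ${ui.title} ==`);
  for (const c of ui.components) {
    switch (c.kind) {
      case "text":
        console.log(c.body);
        break;
      case "choiceList":
        console.log(`${c.prompt}: ${c.options.join(" | ")}`);
        break;
      case "map":
        console.log(`[map with ${c.markers.length} marker(s)]`);
        break;
      case "form":
        console.log(`[form: ${c.fields.map(f => f.label).join(", ")}]`);
        break;
    }
  }
}

generateUI("find me a hotel in Lisbon next weekend").then(render);
```

The design choice that matters in this sketch is that the model only chooses and fills components; the components themselves stay native to the device, which is what would let a generated interface feel like an app rather than a wall of text.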
This paradigm shift makes me more bullish on native mobile apps, as I expect the LLMs that undergird this functionality to run on device. It also, obviously, makes me more bullish on on-device AI and the future ability for anyone to get just about any “real work” done from their phone. Lastly, it probably makes almost all SaaS apps moot, apart from ERPs/CRMs, as they will house the corporate cloud data and deterministic logic/rules that undergird the business versions of these apps. Eventually, they may not have UIs of their own at all.
The mobile era began shifting all workloads to tiny handheld devices, supported by centralised clouds. But not all problems could be solved on a small screen, limiting the full potential of this transition. I think LLMs will reaccelerate this shift and take it to its logical conclusion: a world where we probably don’t need bigger devices at all (and perhaps even the AR future originally promised by Google Glass).
Footnotes
1. Google has tried to do this with their search results in recent years, where certain results are represented in a much richer way than just some blue links. But they’ve barely scratched the surface when it comes to the total number of possible graphical representations of search results.
2. This concept is particularly compelling for augmented reality interfaces, which will present in front of our eyes the minimum effective UI to solve our problems.
3. One thing I’m excited for is the ability for users to customise these user interfaces on demand. Don’t like the way the navigation works in your writing app? Ask your device to rearrange the UI for you.