Want a deeper understanding of generative AI? You’re not alone. Since the release of ChatGPT, OpenAI’s generative language model, generative AI seems like the only thing the Twitterverse is talking about. By now, many people have generated new versions of Van Gogh’s Starry Night or rewritten Bohemian Rhapsody in the style of Shakespeare. But it can be challenging for a non-technical audience (👋) to conceptualize generative AI in terms of practical business applications.
We talked to 27 Tribe machine learning engineers from organizations like OpenAI and Google Brain to cut through the tidal wave of generative AI content and get a deeper understanding of:
- What non-technical people need to understand about the various generative AI models
- What the near-term practical business opportunities are (0-3 years)
- Where the risks of working with generative models lie & how workstreams are going to change (more on this in next week’s article)
What is generative AI?
In this post, we’ll talk about two types of models – text-to-image models and large language models (LLMs). The first type uses natural language text prompts to generate an image. The second type is used to generate text from a prompt. Here are some of the specific models you’ve probably heard a lot about in the last few weeks:
- GPT-3 – The third generation of LLM trained by OpenAI, this is one of the most powerful models currently available. It can be fine-tuned for a wide range of tasks – language translation, text summarization, and more. (Note: GPT-4 is expected in early 2023)
- ChatGPT – This variant of OpenAI’s GPT-3 model has been specifically designed for chatbot and conversational applications. Because it’s been trained on a large dataset of conversational text, it is able to generate context-specific responses in conversations. While less powerful than GPT-3, it is faster and more efficient (and many engineers also said it hallucinates less).
- Whisper – OpenAI’s Whisper is an automatic speech recognition (ASR) system that enables transcription in multiple languages as well as translation from those languages into English.
- Codex – Codex is an OpenAI model that translates natural language into code. You can use Codex for tasks like turning comments into code, rewriting code for efficiency, or completing your next line in context.
- DALL-E 2 – DALL-E 2 is an AI model that can create realistic images and art from a text description. It can combine concepts, attributes, and artistic styles in a single prompt.
- Dreambooth – Google’s Dreambooth is an algorithm used to fine-tune existing text-to-image models. You input images of a corgi, and the fine-tuned model can render your corgi in whatever context you prompt: in a bucket, sleeping, getting a haircut.
- Stable Diffusion – This text-to-image diffusion model generates photo-realistic images given any text input. You may have heard about the controversial AI-generated artwork created using Stable Diffusion that won an award last fall.
Why you should care
Before we dive in, it’s worth assessing why businesses should care in the first place. What makes this a turning point that everyone – tech and non-tech businesses alike – needs to pay attention to is a combination of technology and market readiness.
In the 2010s we had deep learning. In 2019, GPT-2. In 2022 we have GPT-3. “In all of this, the time to each new breakthrough has been shorter,” said Andrew Carr, a senior researcher with Gretel AI (formerly at OpenAI where he worked on data quality for Codex). “We’re in a phase of exponential growth.”
In other words: the hype is real, but so is the sudden breakthrough in technology. By 2025, Gartner predicts that generative AI will be used in 50% of drug discovery initiatives and be producing 10% of all data.
Meghan Keaney Anderson, Head of Marketing at Jasper, an AI content platform, agrees. “In technology, there are iterative ages and there are inventive ages,” she said. “We just came through a long iterative phase.”
Meaning that in the last decade, most of the focus has been on refining existing tools and on differentiation over invention. A better this. An Uber for X. But now we’re entering a new field of progress. “Solving things never attempted before,” said Anderson, “and yes, having our share of misses along the way.”
So…kind of like the hype around Web3? Yes and no. In every conversation with a researcher or ML engineer, I asked whether this is the platform shift we’ve all been waiting for (iPhone) or the revolution that hasn’t quite happened yet (Web3). Unanimously, they said that this moment is different.
“Yes, there is hype, but it’s very different,” said Abhishek Bhargava – Y Combinator alum & founder of a crypto hedge fund and BetterBrain, a human-in-the-loop generative AI startup for business intelligence. “Web3 was purely market risk. Few companies could explain the problems they were solving without using the words Web3 or blockchain. Generative AI is technical risk. The problems are clear, foundation models are getting better by the day, and we’re at an inflection point where the tech is finally good enough to deliver real business value.”
Practical near-term business applications for generative AI
At a recent Tribe forum, we discussed the biggest opportunities and inherent risks of generative AI with 27 ML engineers and AI researchers. From there, we dug into some of the emerging themes with technologists and founders working in the space. First, everyone agreed that there’s no way to predict what the world looks like in ten years. At the same time, some key themes emerged when it comes to the strongest practical business applications in the 0-3 year range. Let’s dive in:
A co-pilot for every industry
Imagine a world where every knowledge worker has a personal army of interns to do research, synthesize information, and take care of repetitive tasks like writing emails. This is the future envisioned by nearly every engineer, researcher, and founder we interviewed for this article.
“To me, co-pilots are the area that are going to create the most business value in the long run,” said Bhargava. Co-pilots, in this case, means a class of tools that augment an LLM with the knowledge base of a specific company or industry. “That connective tissue between generative AI and domain expertise is where the most interesting companies will be building over the next two years.”
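For readers who want to see what “augmenting an LLM with a knowledge base” can look like, here is a minimal sketch in Python. It is not how any company mentioned here builds its product. The knowledge base entries, the function names, and the word-overlap scoring are all assumptions made up for illustration, and real tools typically use more sophisticated search. The idea is simply to find the most relevant internal document and place it in the prompt before the question goes to a model.

```python
import re
from collections import Counter

# Hypothetical company knowledge base (placeholder text, not real policies).
KNOWLEDGE_BASE = [
    "Refund policy: customers can request a refund within 30 days of purchase.",
    "Shipping: standard orders ship within 2 business days.",
    "Support hours: the help desk is staffed 9am to 5pm on weekdays.",
]


def tokenize(text):
    """Lowercase the text and split it into words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question, passage):
    """Count how many words the question and the passage have in common."""
    shared = Counter(tokenize(question)) & Counter(tokenize(passage))
    return sum(shared.values())


def retrieve(question, passages, k=1):
    """Return the k passages that share the most words with the question."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    return ranked[:k]


def build_prompt(question, passages, k=1):
    """Assemble a prompt that pairs the retrieved context with the question."""
    context = "\n".join(retrieve(question, passages, k))
    return (
        "Answer using only this company context:\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # The resulting text is what would be sent to an LLM of your choice.
    print(build_prompt("How long do I have to request a refund?", KNOWLEDGE_BASE))
```

The model call itself is deliberately left out: the “connective tissue” Bhargava describes lives in the retrieval and prompt-building steps, which is where a company’s own data comes in.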
The use cases for this type of tool are nearly infinite and depend heavily on the specific needs of a vertical. For a recruiter, it might look like an AI assistant scouring 100 LinkedIn profiles and providing a synthesis of relevant information. For a biomedical researcher, it might be an AI lab assistant who drafts the experiment’s documentation. For a programmer, it has already happened with GitHub Copilot, which uses OpenAI Codex to suggest code right from the editor.
Nearly everyone aligned around three key value areas we can expect to see these co-pilots emerging in first:
- Tasks that involve copying, synthesizing, and transforming information
- Human-in-the-loop tools that rely on an expert’s validation
- Tools that focus on specific verticals vs general tools (In other words, we can expect more Harveys and fewer Zapiers for AI – more on this below)
At a high level, co-pilots have the power to 10x productivity for knowledge workers, especially teams with repetitive tasks. Read on for a breakdown of what this could look like in specific verticals.
Sales & Marketing
Marketing is a place where generative AI is already proving its value with companies like Jasper, an AI content platform that recently raised $125 million at a $1.5 billion valuation. The internet is abuzz with sales and marketing teams using ChatGPT to draft marketing plans, write SEO blog posts, and compose outbound sales emails. But this is just the beginning.
“Where it gets really exciting to me is when we have a shared action space between humans and the AI,” said Carr. “Imagine a Stable Diffusion-like plug-in for Photoshop where a designer can issue a command and see instantaneous results, but instead of creating pixels, the model creates layers and brush strokes.” Think: change the sky to sunset, or come up with 20 logos in an instant.
Right now, most of the focus in marketing tools is on natural language processing (NLP), but many of the engineers pointed to an opportunity for generative tools that integrate more fully into the funnel: generate 20 versions of an email, A/B test them, and then synthesize the results.
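To make the “A/B test them, and then synthesize the results” step concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the `VariantResult` class, the `summarize` function, and the send and reply counts are made up, and a real test would also check whether the differences are statistically meaningful before declaring a winner.

```python
from dataclasses import dataclass


@dataclass
class VariantResult:
    """Outcome of sending one email variant (all numbers here are placeholders)."""
    subject: str
    sent: int
    replies: int

    @property
    def reply_rate(self):
        # Guard against dividing by zero for a variant that was never sent.
        return self.replies / self.sent if self.sent else 0.0


def summarize(results):
    """Rank variants by reply rate and report the winner."""
    ranked = sorted(results, key=lambda r: r.reply_rate, reverse=True)
    best = ranked[0]
    return {
        "winner": best.subject,
        "winner_rate": best.reply_rate,
        "ranking": [r.subject for r in ranked],
    }


if __name__ == "__main__":
    # Made-up results for three AI-drafted subject lines.
    results = [
        VariantResult("Quick question about your Q3 goals", sent=200, replies=10),
        VariantResult("An idea for your onboarding flow", sent=200, replies=18),
        VariantResult("Following up", sent=100, replies=4),
    ]
    print(summarize(results))
```

In a fuller loop, the winning variant could be fed back to the model as an example for the next round of drafts.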
Andrew Maboussin – a former ML engineer at Twitter and now an engineering lead at Surge AI, a data-labeling platform that helps power LLMs for organizations like Google, Microsoft, and Stanford – sees a future where engineers no longer write code.
“Engineers can become co-creators who work with the model to build the code instead of writing it directly,” said Maboussin, describing a human-in-the-loop workflow where the engineer specifies requirements, brainstorms new ideas, and verifies the results. “It’s like the end of writer’s block,” he said, “but not just for writers – for everyone.”
Even now, ChatGPT offers a robust enough co-pilot for coding that it’s replacing existing tools. “I think it can replace Stack Overflow,” said Bhargava, referring to the popular knowledge-sharing platform for programmers. “I mean, it already has for me.”
While the potential exists for a co-pilot for every industry, the near-term applications are most compelling for industries like law, healthcare, and the hard sciences. “These models are doing the best job when the domain text is rigid and structured,” said Prem Viswanathan, founder of SwiftCX, a platform for contact center management that leverages natural language processing (NLP). Viswanathan’s work centers on using NLP to help companies understand existing content and unstructured data – like customer support tickets.
“So, in the near term, the strongest applications are going to be for industries like law, coding, and biomedical labs,” he said. “I could see it being used like a lab assistant to generate full length experimentation documents.”
This is already happening. Harvey, an OpenAI-backed startup that recently emerged from stealth with $5 million in funding, uses generative AI to answer legal questions (in other words, a co-pilot for lawyers). The founders Gabe Pereyra (disclosure: a Tribe alum and customer) – a former research scientist at DeepMind, Google Brain, and Meta AI – and Winston Weinberg – a former antitrust litigator – embody that combination of technical and industry knowledge that will make this new technology so powerful.
“We want Harvey to serve as an intermediary between tech and lawyer, as a natural language interface to the law,” Pereyra told TechCrunch. “Harvey will make lawyers more efficient, allowing them to produce higher quality work and spend more time on the high value parts of their job.”
Innovation in the world of biomedical research has already been vastly accelerated by access to computer vision and deep learning. Companies like Recursion (disclosure: a Tribe AI customer) are using machine learning to discover new therapeutic treatments for rare diseases faster and with a much higher success rate than in the past. And platforms like md.ai are building tools for data annotation and model development.
Recent research has also found that domain-agnostic models can learn relevant medical concepts, so there are clear opportunities to leverage this technology in a medical context, especially in image-based fields like radiology.
VR & Gaming
While image-generating models and apps have gotten a huge amount of public interest, the engineers we interviewed unanimously predicted that there will be more concrete business opportunities to leverage LLMs. The exception was in gaming.
“Purely visual mediums like VR and gaming are going to be the area where I see the biggest opportunity around generative images,” said Ashwath Rajan, an ML engineer and creator of dreambooth.market. Like the rest of us, game designers will be able to iterate 10x faster and create beautiful, immersive worlds at a fraction of the cost.
“I could see a marketplace experience where artists can opt-in to sell their styles,” he says. “But a lot of these content-based businesses built around fine-tuning [an open source model] feel like a race to the bottom. I think gaming is the space to build.”
Another engineer pointed out that even Lensa, which pulled in $8 million in revenue in December 2022 alone, is basically just “a wrapper around Stable Diffusion.” From a market perspective, there can only be so many Lensas (and that’s putting aside the ethical implications around artists’ IP and other abuse).
Media & E-commerce
Another recurring theme across our conversations was personalization taken to the extreme. In entertainment, this could look like a new streaming series with personalized characters and settings depending on a user’s preferences (imagine something like Netflix’s new randomized series Kaleidoscope but driven by generative AI). In e-commerce, generative AI will move the industry from personalized product recommendations to personalized everything. Right now, RunwayML gives brands the ability to customize video and images to appeal to entirely different audiences by changing the background, style, and more.
“I think this will go further,” said Carr. “I can see automatic transformation of a product page through a new web browser running a program synthesis engine. Hyper personalized web at your fingertips.”
The conclusion about the relationship between content and consumer is clear. “Content is going to move from a one to many to a one to one relationship,” said Carr. And this will have trickle down effects across industries.
At Tribe, we’ve already seen a surge in demand from companies who want to work with ML talent experienced in generative AI. It’s clear that not every company can hire its own generative AI experts to take advantage of this emerging technology – there simply aren’t enough.
“There’s going to be a big opportunity [in the market] to build the infrastructure layer that makes it easy for engineers who aren’t LLM experts to build apps within generative AI,” says Maboussin. “So everyone can use this technology.”
A number of engineers also drew parallels to the rise of ML ops tools that allowed teams to increase the pace and quality of model development and production with a focus on data infrastructure and the deployment cycle. In the same vein, there’s going to be a huge opportunity for companies out there to build the tools that will allow any company to take advantage of the power of generative AI.
The only thing everyone agreed on is that the future is impossible to predict. “20 years ago we said – we’ve cracked neural networks – self-driving cars are just on the horizon,” said Viswanathan. “It could be the same with LLMs.”
The point being that in the excitement of new technology there’s often a large gap between a working demo – 15 years ago for autonomous vehicles – and getting to a point where we accept something will work in any conditions. But, hey, that’s part of the excitement.
For more on Generative AI:
- Our next article will cover the risks for businesses of building on generative AI. Sign up for the Tribe AI newsletter to stay up to date.
- Schedule a call with the Tribe team to talk about the potential applications of generative AI for your organization