Generative AI Tutorial
Are you curious about how AI can enhance your work, but feel overwhelmed by the technical jargon? If so, you’re in the right place!
When you last interacted with AI, did it seem like a mysterious magic box, or did you feel empowered to mold it to your needs?
In this primer, we’re going to remove the magic, and help you understand it as a tool you can manipulate.
The hammer
Design history
A hammer is one of humanity’s earliest tools. What started out as a hand-held flat rock has been refined for thousands of years into the item you can find at the hardware store.
From the shape of the head all the way to the end of the handle, every part has been optimized to the best of our abilities. There are even different kinds of hammers for different kinds of jobs. They are a marvel of design.
The limitations
Do you know what a hammer doesn’t do? Hammer in nails. You can put a hammer next to a pile of nails and a stack of wood, and nothing will happen.
You have to hammer in the nails. A hammer only makes it easier.
A hammer is a tool. It isn’t always the right tool, but it’s fantastic when it is.
Using a hammer as an example
We’re using a hammer as an example because it’s a simple and familiar tool. It takes practice to learn how to use it. The more practice you get, the better you will be. Have you ever seen someone who does construction or remodeling use a hammer? They are absolutely amazing - true experts with thousands of hours of practice behind every swing.
Like a hammer, generative AI won’t do the work for you. It’s a tool that amplifies your input - but how you use it determines the result.
Asking “how can I use AI to solve my problems?” is like asking “how can I use this hammer to fix my car?” A better question is “I have this problem. Can I use AI to help solve it?”
Generative AI is a tool
Generative AI - which you may see called “LLMs” (large language models) or simply “AI” - is also a tool. Just like a hammer, it does nothing on its own. You have to use it. It also takes practice to learn how to use it. The more you practice, the better you will be.
In this primer, we’re going to use “AI” to refer to “Generative AI”. Just keep in mind that not all AI is the same.
“What comes next?”: the game
Like a hammer, an AI requires practice and skill to use effectively. Let’s explore this idea with a simple game called “What comes next?”
The rules for “What comes next?” are pretty simple:
- There must be an answer
- There are no wrong answers
- Answers can be as long or as short as you like
“ABC”
Let’s start with “ABC”. What comes next?
How about “DEF”? That meets our rules: it’s an answer, and it isn’t wrong.
What about “123”? Yup, that’s okay too.
How about “NBC CBS FOX”? That’s a little different, but it isn’t wrong.
Is “laser printer” an answer? It sure is! It isn’t wrong, either.
What’s the best answer?
If you were to ask 100 people “ABC. What comes next?”, what do you think the most common answer would be?
If you asked 100 random people, maybe most would say “DEF”.
If you asked 100 TV producers, you might get “NBC CBS FOX” the most.
Ask 100 grade-school teachers, and you might get “DEFGHIJKLMNOPQRSTUVWXYZ” as the most common answer.
It’s about probability
An AI is doing essentially the same thing. It is sifting through the possible responses and returning the one it calculates to be the most probable.
That isn’t deterministic, which is what most people are used to from computers. If you ask a calculator “what is 2 + 2?”, it will always answer “4”.
Generative AI is probabilistic. It returns what is probably a good response. It’s like asking 100 people “what rhymes with ‘orange’?” You’ll get a lot of answers, but no single right answer.
“Laser printer” may seem to come out of nowhere, but that’s what an AI is doing - connecting ideas based on patterns, not logic.
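To make that concrete, here is a minimal sketch in Python of the difference between a deterministic answer and a probabilistic one. The answer counts below are invented for illustration; they aren’t real survey data.

```python
import random

# Invented counts of how 100 imaginary people answered "ABC. What comes next?"
answer_counts = {
    "DEF": 62,
    "DEFGHIJKLMNOPQRSTUVWXYZ": 12,
    "123": 15,
    "NBC CBS FOX": 8,
    "laser printer": 3,
}

def deterministic_answer():
    # Calculator-style: the same input always produces the same output.
    return max(answer_counts, key=answer_counts.get)

def probabilistic_answer():
    # Generative-AI-style: pick an answer in proportion to how common it is,
    # so repeated calls can give different results.
    answers = list(answer_counts)
    weights = list(answer_counts.values())
    return random.choices(answers, weights=weights, k=1)[0]

print(deterministic_answer())                       # always "DEF"
print([probabilistic_answer() for _ in range(5)])   # usually "DEF", sometimes not
```

Run the probabilistic version a few times and you’ll usually see “DEF”, with the occasional oddball answer - the same behavior you see when you ask an AI to regenerate a response.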
The prompt
Back to the game
Let’s expand our “What comes next?” question a little bit, and consider our previous answers.
“Pretend you are a grade school teacher. ABC. What comes next?”
- “DEF” is a very likely response.
- “123” isn’t as likely, but isn’t completely out of bounds.
- “NBC CBS FOX” is really unlikely.
- “laser printer” would almost never come up.
The prompt is important
An AI doesn’t know what a “question” is, and it has no concept of an “answer”. Like a hammer, an AI doesn’t “know” the answer; it responds based on how skillfully you guide it.
The question we’ve been asking isn’t really a question. It’s a prompt - the beginning of a question-and-answer exchange that the AI completes.
By starting our prompt with “Pretend you are a grade school teacher”, we’ve reduced the number of possible answers a lot. We are more likely to get a response which better reflects our intent.
People who use AIs on a regular basis often have prompts that are several hundred lines long. They may even have libraries of prompts that they use. In more advanced scenarios, custom tools string together several prompts.
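As a rough sketch, a “prompt library” can be as simple as a function that assembles a reusable role instruction, a few examples, and the actual question into one block of text. The role text and examples below are made up for illustration; the point is that the AI only ever sees the final assembled string, and everything before the question just nudges the probabilities of what comes next.

```python
# A tiny, hypothetical prompt builder (the ROLE and EXAMPLES are invented).

ROLE = "Pretend you are a grade school teacher."

EXAMPLES = [
    "Q: 123. What comes next?\nA: 456",
    "Q: Monday, Tuesday. What comes next?\nA: Wednesday",
]

def build_prompt(question):
    # Stitch the role, the examples, and the real question into one prompt.
    parts = [ROLE, *EXAMPLES, f"Q: {question}\nA:"]
    return "\n\n".join(parts)

print(build_prompt("ABC. What comes next?"))
```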
Tokens
When you’re using AIs, you may see something about “tokens”. For example, “$.10/million tokens”. What’s a token?
A token is a numerical representation of a piece of data. That data may be a word, part of a word, or multiple words. It may also be part of a picture, part of a sound, or even part of a video.
Think of tokens like Lego bricks, where each brick has part of a word on it. One brick may say “cat”, while two others might split “running” into “run” on one brick and “ing” on another. Just as stacking bricks builds a house, the AI combines tokens to build answers, poems, or art. But the bricks themselves are just numbers the AI can crunch.
This is significant because ultimately everything in a computer ends up as ones and zeros. Computers have no way of storing and using words, sounds, or pictures directly - only numbers. When an AI calculates probabilities, it is running a statistical analysis over those numbers.
An AI doesn’t answer questions, draw pictures, or create videos. It performs math calculations on numbers to determine the probability of the next token.
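Here is a toy tokenizer in that spirit. The vocabulary is hand-written and tiny; real tokenizers learn tens of thousands of word pieces from data, but the idea - turning text into a list of numbers - is the same.

```python
# A hand-made vocabulary: every known word piece gets an arbitrary ID number.
vocab = {"the": 1, "cat": 2, "is": 3, "sleep": 4, "ing": 5, " ": 6}

def tokenize(text):
    """Greedily match the longest known piece at the front of the text.
    Assumes every character in the input is covered by the vocabulary."""
    pieces = sorted(vocab, key=len, reverse=True)
    tokens = []
    while text:
        match = next(p for p in pieces if text.startswith(p))
        tokens.append(vocab[match])
        text = text[len(match):]
    return tokens

print(tokenize("the cat is sleeping"))   # [1, 6, 2, 6, 3, 6, 4, 5]
```

The exact numbers are arbitrary. What matters is that “sleeping” becomes two tokens (“sleep” + “ing”), and that from here on the AI only works with the numbers.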
The context
An AI doesn’t have a memory, the way you and I may think of it. It has a context. The tokens of the current conversation are stored temporarily in the AI’s context - think of it like a notepad for active ideas. The context works like short-term memory, holding only the most recent part of the conversation: your prompt and the responses so far.
The context is limited. It can be pretty long - about 128,000 tokens is typical - but it does fill up.
Think of it as a post-it note. It isn’t an unlimited amount of space, and its usefulness is based on what you put there.
An ABC example
Let’s go back to the “ABC” game, and let’s say every letter is a token. In other words, A = 1, B = 2, C = 3, D = 4, etc., all the way to “Z”.
That means that “ABC” is three tokens, and “DEF” is three tokens, for a total of six tokens.
Let’s also say that the context - our post-it note - is limited to 10 tokens. “ABCDEF” is only six tokens, so it all fits into the context, the short-term memory.
What happens if our context is only two tokens?
Starting with “ABC”, only the last two tokens fit, so only “BC” makes it into the context. When calculating what comes next, the AI will only use “BC”. Ask a hundred people “BC, what comes next?” and you’ll get a wide variety of answers.
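Here is that arithmetic as a short sketch, using our letters-as-numbers scheme and a context that only keeps the most recent tokens.

```python
# Letters-as-tokens: A = 1, B = 2, ... Z = 26.
def tokenize(text):
    return [ord(ch) - ord("A") + 1 for ch in text]

def fit_to_context(tokens, context_size):
    # The context only holds the most recent tokens; older ones fall off.
    return tokens[-context_size:]

prompt_tokens = tokenize("ABC")           # [1, 2, 3]

print(fit_to_context(prompt_tokens, 10))  # [1, 2, 3] - everything fits
print(fit_to_context(prompt_tokens, 2))   # [2, 3]    - only "BC" survives
```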
This is where a lot of “hallucinations” start. It’s pretty hard to give a good answer when you can’t remember the whole question.
The model
If you were to ask 100 random people “ABC, what comes next?” you’d get a pretty wide variety of answers. Sure, you’d get a decent chunk of people who answer “DEF”, but it’s entirely possible you’d get at least one person who would give an almost random answer, like “ZYX”, or even “laser printer”.
If you were to ask 100 grade school teachers the same question, the likelihood of “DEF” being the most common answer goes up considerably. You would be less likely to get “123”, and even less likely to get “ZYX”.
What about those 100 TV producers? “NBC”, “CBS”, and “Fox” jump to the top of the probability list. Their “training data” (experience) shapes their answers, just like an AI’s training data shapes its answers.
You can think of those samples of 100 people as the model. The model is built from the initial training data, which sets the base probabilities of the responses. You influence those probabilities with your prompt.
Just as a carpenter is shaped by their training, an AI’s model is shaped by the data it learned from. And just as strangers and experts give different answers, an AI’s responses depend on who “taught” it.
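You can picture each group of 100 people as its own model: a table of base probabilities for the same prompt. The numbers below are invented for illustration, not taken from any real model or survey.

```python
# Invented base probabilities for "ABC. What comes next?" from three "models",
# each trained on answers from a different group of people.
models = {
    "random people": {"DEF": 0.55, "123": 0.20, "ZYX": 0.05, "laser printer": 0.01},
    "teachers":      {"DEFGHIJKLMNOPQRSTUVWXYZ": 0.60, "DEF": 0.35, "123": 0.05},
    "TV producers":  {"NBC CBS FOX": 0.70, "DEF": 0.25, "123": 0.05},
}

def most_probable(model_name):
    # Return the answer with the highest base probability in that model.
    table = models[model_name]
    return max(table, key=table.get)

for name in models:
    print(name, "->", most_probable(name))
```

Same prompt, three different “most probable” answers - the difference is entirely in the training data behind each table.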
Bringing it all together
An AI uses probabilities to generate its response. Remember, an AI doesn’t “know” answers; it guesses what comes next, and your prompt tells it what kind of guess to make. The short sketch after this list ties the pieces together.
- A generative AI selects the most probable next token
- A token is a number which represents a piece of data
- The user provides a prompt to influence those probabilities
- A model is the initial data which provides the base probabilities
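To tie those four points together, here is one last toy sketch - not a real AI, just the pieces from this primer wired together: a tokenizer, a small context, a made-up model of base probabilities, and a prompt.

```python
# Letters-as-tokens again: A = 1, B = 2, ... Z = 26 (and back).
def tokenize(text):
    return [ord(ch) - ord("A") + 1 for ch in text]

def detokenize(token):
    return chr(token + ord("A") - 1)

# The "model": invented base probabilities for the next token,
# keyed by the most recent token still visible in the context.
model = {
    3: {4: 0.8, 1: 0.2},   # after "C", "D" is most probable
    4: {5: 0.9, 1: 0.1},   # after "D", "E" is most probable
}

CONTEXT_SIZE = 2

def next_token(prompt):
    context = tokenize(prompt)[-CONTEXT_SIZE:]        # the context: only recent tokens survive
    probabilities = model[context[-1]]                # base probabilities come from the model
    return max(probabilities, key=probabilities.get)  # pick the most probable next token

print(detokenize(next_token("ABC")))  # prints "D"
```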
Just like using a hammer, you can improve your results with consistent practice. Yes, you’ll be bending a lot of nails at first. With patient practice, you’ll learn how to use an AI very effectively - and, just as importantly, when it’s the right tool for the job at all.