What Is Machine Intelligence?
What Is Machine Intelligence?
“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” — Edsger W. Dijkstra
Learning Objectives
By the end of this module, you will be able to:
- Trace the key milestones in AI history that led to modern LLMs
- Explain why we use “machine intelligence” instead of “artificial intelligence” in this program
- Distinguish between narrow AI, general AI, and the marketing hype
- Articulate what LLMs actually are — and what they’re not
- Approach AI tools with informed curiosity rather than fear or blind trust
Why “Machine Intelligence”?
Let’s start with a confession: “Artificial Intelligence” has become almost meaningless.
The term gets slapped on everything from spam filters to robot vacuums to existential threats to humanity. When a word means everything, it means nothing.
We use Machine Intelligence in this program for two reasons:
-
Precision: We’re talking about specific capabilities — pattern recognition, language processing, generation — not some vague “smartness”
-
Perspective: “Artificial” implies fake or lesser. But what these systems do is genuinely impressive, even if it’s different from human thinking. They’re machines that exhibit intelligent behavior in specific domains.
This isn’t just semantics. How you frame something shapes how you use it. Students who think AI is magic tend to trust it blindly. Students who think AI is hype tend to dismiss it. Neither serves you well.
The goal: informed partnership. Understanding what the machine does well, where it fails, and how to work with it effectively.
A Brief History (You Need This Context)
Understanding where we are requires knowing how we got here. This isn’t academic history — it directly explains why today’s tools work the way they do.
The Dream: 1950s-1960s
1950: Alan Turing publishes “Computing Machinery and Intelligence,” proposing the famous Turing Test. The question: Can machines think?
1956: The term “Artificial Intelligence” is coined at the Dartmouth Conference. Researchers predicted human-level AI within 20 years.
1966: ELIZA, a simple chatbot, convinces some users it understands them — by reflecting their own words back as questions. (“I’m feeling sad.” → “Why are you feeling sad?”) This demonstrated something important: humans want to believe machines understand.
The Winters: 1970s-1980s
Twice, AI research hit walls and funding dried up — periods called “AI winters.”
Why? The early approaches (hand-coded rules, symbolic logic) couldn’t handle real-world complexity. You can’t write rules for everything. The world is too messy.
Key lesson: Rigid, rule-based systems fail when facing novel situations.
The Neural Turn: 1980s-2010s
Researchers returned to an old idea: instead of programming rules, let machines learn patterns from data.
Neural networks — loosely inspired by brain structure — could find patterns humans couldn’t specify. Feed them thousands of cat photos and dog photos, and they learn to distinguish cats from dogs, without anyone defining “cat.”
Progress was slow. The computers weren’t powerful enough. The data wasn’t available. But the approach was sound.
The Deep Learning Revolution: 2012-2020
Three things converged:
- GPUs: Graphics cards, designed for video games, turned out to be perfect for neural network math
- Big Data: The internet generated massive datasets for training
- Algorithms: Better network architectures (especially “deep” networks with many layers)
2012: A neural network crushes the competition in ImageNet, an image recognition challenge. Deep learning arrives.
2016: AlphaGo defeats the world champion at Go — a game so complex that brute-force search can’t work. The system learned to play by studying human games, then playing itself millions of times.
This was narrow AI — superhuman at one specific task, useless at everything else. But it proved machines could master complex domains.
The Transformer: 2017
A paper titled “Attention Is All You Need” introduced a new architecture: the Transformer.
The key insight: when processing language, some words matter more than others for understanding meaning. “The cat sat on the mat because it was tired” — what does “it” refer to? The transformer architecture lets the model attend to relevant context.
This architecture could process text in parallel (fast!) and maintain coherence over long passages. It became the foundation for everything that followed.
The LLM Era: 2020-Present
Researchers discovered something remarkable: if you train a transformer on enough text with enough parameters, it doesn’t just learn grammar. It learns… a lot.
- Factual knowledge (though unreliably)
- Reasoning patterns (though imperfectly)
- Code structure (surprisingly well)
- Writing styles (very well)
- Task-following (very well)
GPT-3 (June 2020) showed that large language models could perform tasks they weren’t explicitly trained for, just from examples in the prompt.
ChatGPT (November 2022) added something crucial: the raw model was fine-tuned using human feedback (a technique called RLHF — Reinforcement Learning from Human Feedback). This made it not just capable, but helpful and conversational. Suddenly, millions of people were talking to an AI that actually tried to answer their questions usefully.
2023-2025: Rapid iteration. GPT-4, Claude, Gemini, Llama, and dozens of others. Multi-modal capabilities (text + images + audio). Longer context windows. Better reasoning. Tool use.
November 2025: Claude Opus 4.5 and GPT-5.2 crossed a threshold — not just better, but qualitatively different in their ability to handle complex, multi-step tasks. More on this in Module 02.
What LLMs Actually Are
Let’s be concrete about what you’re working with.
The Basic Mechanism
A Large Language Model is a text prediction system trained on enormous amounts of text.
At its core, it does one thing: given some text, predict what text should come next.
That’s it. The magic emerges from scale.
When you train a model to predict the next word across trillions of words of human writing — books, websites, code, conversations — it must learn:
- Grammar and syntax
- Facts and relationships
- Argument structures
- Coding patterns
- Stylistic conventions
- Task formats
Not because anyone told it to. Because predicting what comes next requires understanding these patterns.
A Useful Analogy (With Caveats)
Imagine someone who has read everything ever written but experienced nothing directly.
They can complete the phrase “the capital of France is…” because they’ve seen “Paris” follow that pattern thousands of times. They can structure a Python function because they’ve seen millions of them. They can construct an argument because they’ve seen countless essays.
But they’ve never been to Paris. They’ve never run a Python program. They’ve never believed anything — they just know what words tend to follow other words.
The caveat: This analogy is useful but imperfect. LLMs don’t actually “know” things the way humans do. They don’t have memories or beliefs. What they have is statistical patterns — weights in a neural network that make certain outputs more likely given certain inputs. When an LLM produces “Paris” after “the capital of France is,” it’s not recalling a fact; it’s completing a pattern.
This matters because it explains both the power and the failure modes. The patterns are incredibly rich — but they’re patterns, not understanding.
What They Do Well
- Text generation: Writing, summarizing, translating, explaining
- Code generation: Producing functional code from descriptions
- Pattern completion: Following examples and formats
- Information synthesis: Combining knowledge from training data
- Instruction following: Completing tasks as specified
What They Do Poorly
- Factual accuracy: They can confidently state falsehoods (“hallucinations”)
- Reasoning: Complex multi-step logic often fails
- Math: Arithmetic errors are common, especially with larger numbers
- Current events: Training data has a cutoff; they don’t “know” recent news
- Self-knowledge: They don’t reliably know what they know or don’t know
We’ll explore capabilities and limitations deeply in Module 03.
The Hype vs. Reality Spectrum
You’ve probably encountered extreme claims about AI:
Doomsayers: “AI will destroy humanity / take all jobs / end civilization”
Boosters: “AI will solve all problems / achieve consciousness / surpass humans at everything”
Both camps share a flaw: they treat AI as a single thing with a single trajectory, rather than a collection of specific technologies with specific capabilities.
A More Useful Frame
Think of current AI as powerful but narrow tools that:
- Excel at specific tasks when properly directed
- Fail unpredictably outside their strengths
- Amplify human capabilities rather than replace human judgment
- Require skill to use effectively
A chainsaw is a powerful tool. In skilled hands, it transforms what’s possible. In unskilled hands, it’s dangerous. In any hands, it’s useless for tasks it wasn’t designed for.
LLMs are similar: powerful, requiring skill, suited for specific purposes.
The Partnership Model
This program teaches human-AI partnership, not human replacement or AI worship.
The model:
| Human Brings | AI Brings |
|---|---|
| Judgment | Speed |
| Accountability | Scale |
| Goals and values | Pattern matching |
| Real-world grounding | Tireless iteration |
| Verification | Generation |
Neither is sufficient alone. The human without AI is slower, more limited. The AI without human direction is unreliable, purposeless.
Your job isn’t to be replaced by AI or to reject AI. It’s to become skilled at directing AI effectively while maintaining your own judgment and accountability.
This is what we’ll call vibe engineering (more in Module 04) — the professional approach to AI collaboration.
Practical Exercise: Meeting the Machine
Let’s move from concepts to direct experience.
Exercise 1: The Basic Interaction
- Open Claude (or ChatGPT)
- Ask: “What are you?”
- Read the response carefully
- Now ask: “What are you not?”
- Compare the two responses. What does the model claim about itself?
Exercise 2: The Limits of Knowledge
- Ask: “What happened in the news yesterday?”
- Ask: “What is the square root of 17 times 23?”
- Ask: “Write a haiku about a concept that doesn’t exist yet”
- Note: Which tasks worked? Which failed? How did failures manifest?
Exercise 3: Prediction in Action
- Type: “The quick brown fox jumps over the”
- Let the model complete it
- Now type: “In software engineering, the most important principle is”
- Let it complete
- Ask yourself: How is the model “deciding” what comes next? (Hint: it’s predicting what text typically follows.)
Exercise 4: Hallucination Detection
- Ask: “Tell me about the scientific research of Dr. Eleanor Whitmore at Stanford on neural network optimization in 2019.”
- The model will likely produce a confident, detailed response
- Now search the web for “Dr. Eleanor Whitmore Stanford neural networks”
- Key insight: The model generated plausible-sounding text, but Dr. Eleanor Whitmore doesn’t exist. This is a “hallucination” — the model completing a pattern (academic biography) without any connection to reality.
- Reflection: How might you detect hallucinations in more subtle cases?
Key Insights
| Concept | Implication for Your Work |
|---|---|
| LLMs predict text | They’re completing patterns, not “understanding” in the human sense |
| Scale creates emergence | Capabilities arise from training, not explicit programming |
| Knowledge comes from text | They know what they’ve “read,” not what they’ve experienced |
| Tools amplify humans | Your judgment directs their power |
| Hype obscures reality | Stay curious but skeptical |
Connection to What’s Next
You now have a foundation for understanding what you’re working with.
Module 01 goes deeper into how LLMs work — tokens, context windows, temperature — the concepts you’ll need to use them effectively.
Module 02 explains what changed in November 2025 — why the tools you’re using are qualitatively different from what existed a year ago.
The history matters because it explains the present: these are pattern-matching systems trained on text, not thinking machines or magic oracles. Understanding that shapes how you’ll work with them.
Reflection Questions
-
The Turing Test asks whether a machine can fool a human into thinking it’s human. Is that the right question to ask about intelligence? What might be a better question?
-
ELIZA (1966) convinced some users it understood them using simple tricks. How might your own tendency to see understanding where there isn’t any affect how you work with modern AI?
-
“A chainsaw is a powerful tool.” What’s a task where an LLM is like a chainsaw? What’s a task where using an LLM would be like using a chainsaw to butter toast?
Further Reading (Optional)
- Attention Is All You Need — The transformer paper (technical, but foundational)
- On the Dangers of Stochastic Parrots — Critical perspective on large language models
- What Is ChatGPT Doing… and Why Does It Work? — Stephen Wolfram’s accessible explanation
Next module: How LLMs Work (Conceptually) — tokens, prediction, and the mental models you need to use these tools effectively.