Abstract Vector Spaces · AI Engineer

16. Abstract Vector Spaces

Start here

This is the last chapter, and it answers a question that has been hiding the whole time: what is a vector, really? We have pictured arrows. We have used lists of numbers. Now we reveal the bigger truth. A vector is anything that plays by the rules of adding and scaling. Once you accept that, every tool in this series suddenly works on things that are not arrows at all, like functions.

Watch the original

Follow along with 3Blue1Brown: Abstract vector spaces

The question we kept dodging

Is a vector an arrow, or is it a list of numbers? We bounced between both for fifteen chapters. The honest answer is: neither is the real definition. Both are just examples.

The arrow and the list are two ways to picture something more general. What actually makes a vector a vector is not its shape; it is the fact that you can add two of them and scale one by a number, and those operations behave sensibly. That is the whole requirement.

Functions are vectors too

Here is the surprise that makes the point. Consider ordinary functions, like f(x) = x^2 or g(x) = sin(x). You can do two familiar things to them:

Add them: (f + g)(x) = f(x) + g(x). Adding two functions gives a new function.
Scale them: (2f)(x) = 2 * f(x). Scaling a function by a number gives a new function.

These operations obey the same rules as vector addition and scaling, so functions (say, real-valued functions of a real number) form a vector space. A function is a kind of vector, just one with infinitely many "components" (its value at every input).
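To make this concrete, here is a minimal Python sketch of adding and scaling functions pointwise. The names `Fn`, `add`, `scale`, and `square` are illustrative choices for this example, not something from the original video.

```python
import math
from typing import Callable

# A "vector" here is any function from float to float.
Fn = Callable[[float], float]

def add(f: Fn, g: Fn) -> Fn:
    """Return the function f + g, defined by (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(c: float, f: Fn) -> Fn:
    """Return the function c * f, defined by (c f)(x) = c * f(x)."""
    return lambda x: c * f(x)

def square(x: float) -> float:
    return x ** 2

h = add(square, math.sin)  # h(x) = x^2 + sin(x)
k = scale(2.0, square)     # k(x) = 2 * x^2

print(h(0.0))  # 0.0
print(k(3.0))  # 18.0
```

Note that `add` and `scale` each return a new function, which is exactly the "adding two vectors gives a vector" behavior described above.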

Why this is not just a word game

If functions follow the same add-and-scale rules as arrows, then every result that depends only on those rules carries over to functions. Span, basis, linear transformations, and even eigenvectors all make sense for functions, using the same definitions. That is the power of spotting the shared structure.

Linear transformations on functions

Remember the two promises of a linear transformation from Chapter 3: it respects addition and scaling. Plenty of operations on functions do exactly that. The cleanest example is the derivative.

derivative of (f + g) = (derivative of f) + (derivative of g)
derivative of (2f)    = 2 * (derivative of f)

The derivative respects addition and scaling, so it is a linear transformation, just one acting on (differentiable) functions instead of arrows. The ideas you learned about matrices and transformations apply to it too.
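The two properties can be spot-checked numerically. The sketch below uses a central-difference approximation as a stand-in for the true derivative, so it only checks the rules at one sample point and up to floating-point tolerance. The helper `derivative`, the step size, and the sample point `x0` are all assumptions made for this example.

```python
import math
from typing import Callable

Fn = Callable[[float], float]

def derivative(f: Fn, h: float = 1e-5) -> Fn:
    """Approximate f' with a central difference (numerical, not exact)."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)

def f(x: float) -> float:
    return x ** 2

g = math.sin

def f_plus_g(x: float) -> float:
    return f(x) + g(x)

def two_f(x: float) -> float:
    return 2 * f(x)

x0 = 1.3  # arbitrary sample point

# derivative of (f + g) versus (derivative of f) + (derivative of g)
lhs_add = derivative(f_plus_g)(x0)
rhs_add = derivative(f)(x0) + derivative(g)(x0)

# derivative of (2f) versus 2 * (derivative of f)
lhs_scale = derivative(two_f)(x0)
rhs_scale = 2 * derivative(f)(x0)

additivity_holds = math.isclose(lhs_add, rhs_add, rel_tol=1e-6)
scaling_holds = math.isclose(lhs_scale, rhs_scale, rel_tol=1e-6)
print(additivity_holds, scaling_holds)  # True True
```

A passing check at one point is evidence, not a proof; the actual proof is the sum rule and constant-multiple rule from calculus.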

You can even write the derivative as a matrix

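To drive it home: pick a space of polynomials and use the basis 1, x, x^2, x^3, and so on. Any polynomial is a finite combination of these, so it has coordinates, just like an arrow. For example, 5 + 3x + 2x^2 has coordinates (5, 3, 2, 0, ...).

The derivative, acting on these basis functions, can be written as a matrix. For example, the derivative of x^2 is 2x, the derivative of x^3 is 3x^2, and so on. Each column of the matrix holds the coordinates of the derivative of one basis function, and you apply it by the same column rule from Chapter 3. Stopping the basis at x^3 gives a 4-by-4 matrix; the full space of polynomials would need an infinitely large one.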

d/dx of:   1 -> 0,   x -> 1,   x^2 -> 2x,   x^3 -> 3x^2

as a matrix (on the basis 1, x, x^2, x^3):
| 0  1  0  0 |
| 0  0  2  0 |
| 0  0  0  3 |
| 0  0  0  0 |

Calculus and linear algebra, the same machinery. That is not a coincidence; it is the abstraction paying off.
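As a small worked example, the sketch below applies that 4-by-4 matrix to the coordinates of one polynomial. The helper `apply` and the sample polynomial `p` are illustrative choices; the matrix `D` is the one shown above.

```python
# Derivative on polynomials of degree <= 3, in the basis 1, x, x^2, x^3.
# Column j holds the coordinates of d/dx applied to the j-th basis function.
D = [
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]

def apply(matrix, coords):
    """Matrix-vector product: each output entry is a row dotted with coords."""
    return [sum(row[j] * coords[j] for j in range(len(coords))) for row in matrix]

# p(x) = 5 + 3x + 2x^2 + 4x^3, written as coordinates in the basis above.
p = [5, 3, 2, 4]

dp = apply(D, p)
print(dp)  # [3, 4, 12, 0], i.e. p'(x) = 3 + 4x + 12x^2
```

Applying `D` twice gives the second derivative, which matches the idea from earlier chapters that composing transformations corresponds to multiplying matrices.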

The rules (axioms), in plain words

Mathematicians make this precise with a short list of rules, called axioms, that any vector space must follow. You do not need to memorize them, but here is the gist:

Adding vectors gives a vector, and order or grouping of additions does not matter.
There is a zero vector that changes nothing when added.
Every vector has an opposite that cancels it to zero.
Scaling behaves sensibly: scaling a vector gives a vector, scaling by 1 does nothing, and scaling distributes over addition.

If a set of objects obeys these rules, it is a vector space, and the entire toolbox of linear algebra is yours to use on it. Whether the objects are arrows, lists, functions, or signals does not matter.
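To see what "obeys these rules" looks like in practice, here is a rough sketch that spot-checks several of the rules on polynomials stored as lists of integer coefficients. Everything named here (`add`, `scale`, `random_poly`, `axioms_hold`, the trial count, the seed) is a hypothetical helper for this example, and random testing only gathers evidence; it is not a proof.

```python
import random

def add(u, v):
    """Add two polynomials given as coefficient lists of equal length."""
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    """Scale a polynomial's coefficients by the number c."""
    return [c * a for a in u]

def random_poly(rng, n=4):
    """Random integer coefficients for a polynomial of degree < n."""
    return [rng.randint(-9, 9) for _ in range(n)]

def axioms_hold(trials=200, seed=0):
    rng = random.Random(seed)
    zero = [0, 0, 0, 0]
    for _ in range(trials):
        u, v, w = (random_poly(rng) for _ in range(3))
        c = rng.randint(-9, 9)
        checks = [
            add(u, v) == add(v, u),                                # order does not matter
            add(add(u, v), w) == add(u, add(v, w)),                # grouping does not matter
            add(u, zero) == u,                                     # zero changes nothing
            add(u, scale(-1, u)) == zero,                          # opposites cancel
            scale(1, u) == u,                                      # scaling by 1 does nothing
            scale(c, add(u, v)) == add(scale(c, u), scale(c, v)),  # scaling distributes
        ]
        if not all(checks):
            return False
    return True

print(axioms_hold())  # True
```

Integer coefficients are used so that every comparison is exact; with floats, the same checks would need a tolerance.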

Why this matters

This abstraction is the reason linear algebra is everywhere.

Signals and audio are treated as vectors; the Fourier transform is a change of basis on them.
Quantum mechanics describes states as vectors in an abstract space, with measurements as linear operators.
Machine learning represents words, images, and users as vectors in spaces with hundreds of dimensions, none of which you can draw, yet all of which obey these rules.

The arrows were never the point. They were training wheels for the rules.

The mental model to keep

A vector is anything you can add and scale by the standard rules. Arrows, number lists, and functions are all examples. Because they share the rules, they share every result that follows from those rules. That shared structure is what makes linear algebra one of the most reusable tools in all of math.

Quick gotchas

Do not get attached to arrows. They are one example, useful for intuition, not the definition.

The objects can be wildly different. Functions, polynomials, and matrices are all vectors in their own spaces.

The axioms are the real test. If addition and scaling behave properly, it is a vector space, full stop.

What you walked away with

A vector is anything that obeys the rules of adding and scaling, not just an arrow or a list.
Functions form a vector space, and operations like the derivative are linear transformations on them.
Pick a basis and even the derivative becomes a matrix.
Because so many different objects share these rules, linear algebra applies to all of them, which is why it shows up across science, engineering, and AI.

That is the series. You started with a single arrow from the origin, and you finished seeing that the same handful of ideas, add and scale, basis and span, transformations and eigenvectors, describe everything from 3D graphics to neural networks. Go back to Chapter 1 anytime; it reads differently once you can see the whole shape. Well done.