Start here
Chapter 2 of the Essence of Linear Algebra series, and the second of the three fundamentals. In Chapter 1 we learned the two moves: add and scale. This chapter asks the obvious follow-up question: if those are the only moves we have, how much of space can we actually reach?
Follow along with 3Blue1Brown: Linear combinations, span, and basis vectors
The hidden assumption in every coordinate
When you write the vector [3, 1], you are quietly trusting two reference arrows you never named.
- i-hat (written î): the arrow of length 1 pointing right, [1, 0].
- j-hat (written ĵ): the arrow of length 1 pointing up, [0, 1].
[3, 1] really means: take 3 copies of î, take 1 copy of ĵ, add them.
[3, 1] = 3 * [1, 0] + 1 * [0, 1] = 3 * î + 1 * ĵ
Those two arrows, î and ĵ, are the basis vectors of the standard xy-plane. Every coordinate you have ever written was secretly a recipe: "this much of the first basis vector, that much of the second".
Coordinates are not sacred. They are just the amounts you scale your chosen basis vectors by. Pick different basis vectors and the same point gets different coordinates. That single realization is what Chapter 13 (Change of Basis) cashes in later.
Linear combination: scale, then add
When you scale two vectors and add the results, that is a linear combination:
result = a * v + b * w
where a and b are any scalars you like. Turn the dials a and b and watch result move around.
The word "linear" is here because if you fix one dial and slide the other, the tip of result travels in a straight line.
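Here is a minimal Python sketch of the two dials, assuming vectors are plain [x, y] lists. The helper name linear_combination and the sample vectors are made up for illustration. With a held fixed and b sliding, each tip is the previous one shifted by w, so they all sit on one straight line.

```python
def linear_combination(a, v, b, w):
    """Return a*v + b*w for 2D vectors given as [x, y] lists."""
    return [a * v[0] + b * w[0], a * v[1] + b * w[1]]

v = [1, 2]   # sample vectors, chosen only for illustration
w = [3, -1]

# Fix the dial a = 2 and slide b through -1, 0, 1, 2.
tips = [linear_combination(2, v, b, w) for b in range(-1, 3)]
print(tips)  # [[-1, 5], [2, 4], [5, 3], [8, 2]]
```

Consecutive tips differ by exactly [3, -1], which is w: sliding one dial moves you along a line in the direction of the vector attached to that dial.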
Span: everything you can reach
The span of a set of vectors is the full collection of every linear combination of them. It is the answer to "where can these arrows take me?"
For two vectors in 2D, three cases cover everything:
- Two arrows that do not lie on the same line span the entire plane. By dialing a and b you can land on any point you want. This is the normal, healthy case.
- Two arrows on the same line (at least one of them nonzero) are redundant. No matter how you scale and add, you never leave that line. The span collapses to 1D.
- Two zero vectors can only ever give you the origin. The span is a single point.
The lesson: more vectors do not automatically mean more reach. What matters is whether each one points somewhere genuinely new.
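The three cases can be told apart with a small check. This is a sketch, not a general-purpose routine: the function name span_dimension_2d is hypothetical, and it assumes exact integer inputs (with floats you would compare against a small tolerance instead of zero). It relies on the fact that for v = [a, b] and w = [c, d], the quantity a*d - b*c is zero exactly when the two arrows lie on one line through the origin.

```python
def span_dimension_2d(v, w):
    """Dimension of the span of two 2D vectors: 0 (point), 1 (line), or 2 (plane)."""
    if not any(v) and not any(w):
        return 0  # both are the zero vector: only the origin is reachable
    # Zero exactly when v and w lie on the same line through the origin.
    det = v[0] * w[1] - v[1] * w[0]
    return 1 if det == 0 else 2

print(span_dimension_2d([1, 0], [0, 1]))  # 2
print(span_dimension_2d([1, 2], [2, 4]))  # 1
print(span_dimension_2d([0, 0], [0, 0]))  # 0
```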
Linear independence: does this vector pull its weight?
A set of vectors is linearly independent when every vector adds a new direction the others could not already reach.
A set is linearly dependent when at least one vector is redundant, meaning it already sits in the span of the others. You could throw it away and lose nothing.
The crisp test:
Vectors are linearly dependent if one of them can be written as a linear combination of the rest.
Examples in 2D:
[1, 0] and [0, 1] -> independent (one is right, one is up, different directions)
[1, 0] and [2, 0] -> dependent (both lie on the x-axis; the second is just 2x the first)
[1, 2] and [2, 4] -> dependent (second = 2 * first, same line)
"Linearly dependent" does not mean the numbers look similar. [1, 2] and [2, 4] are dependent because one is a scalar multiple of the other. [1, 2] and [2, 1] look just as similar but are independent. Always ask: can I build one of the vectors from the others by scaling and adding? If yes, dependent.
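One way to run that test mechanically, for any number of vectors, is to count how many independent directions the set contains and compare that with the number of vectors. The sketch below does the counting with Gaussian elimination, a standard technique this chapter does not cover; the names rank and is_independent are illustrative, and Fraction is used so the arithmetic stays exact.

```python
from fractions import Fraction

def rank(vectors):
    """Count the independent directions among the vectors (Gaussian elimination)."""
    rows = [[Fraction(x) for x in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Subtract multiples of the pivot row to clear the column below it.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def is_independent(vectors):
    """Independent exactly when no vector is redundant."""
    return rank(vectors) == len(vectors)

print(is_independent([[1, 0], [0, 1]]))  # True
print(is_independent([[1, 0], [2, 0]]))  # False
print(is_independent([[1, 2], [2, 4]]))  # False
print(is_independent([[1, 2], [2, 1]]))  # True
```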
Basis: the smallest set that spans the space
Put the two ideas together and you get the punchline of the chapter.
A basis of a space is a set of linearly independent vectors that spans that space.
Two conditions, both required:
- Spans the space: you can reach every point. Nothing is missing.
- Linearly independent: no vector is wasted. Nothing is redundant.
î = [1, 0] and ĵ = [0, 1] are the standard basis for 2D. But they are not the only basis. [1, 1] and [1, -1] do not lie on the same line, so they are independent and span the whole plane. They are a perfectly good, slightly tilted basis. The same point just gets different coordinates in that system.
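To see the same point get different coordinates, you can solve a * b1 + b * b2 = point for the two dials. The sketch below uses Cramer's rule for the 2-by-2 case, a technique not introduced in this chapter. It assumes integer inputs, and the function name coordinates_in_basis is hypothetical.

```python
from fractions import Fraction

def coordinates_in_basis(point, b1, b2):
    """Solve a*b1 + b*b2 = point for (a, b) using Cramer's rule in 2D."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 are dependent, so they are not a basis")
    a = Fraction(point[0] * b2[1] - point[1] * b2[0], det)
    b = Fraction(b1[0] * point[1] - b1[1] * point[0], det)
    return a, b

# The point [3, 1] in the standard basis and in the tilted basis.
print(*coordinates_in_basis([3, 1], [1, 0], [0, 1]))   # 3 1
print(*coordinates_in_basis([3, 1], [1, 1], [1, -1]))  # 2 1
```

In the tilted basis the recipe is 2 * [1, 1] + 1 * [1, -1] = [3, 1]: same point, different amounts.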
A basis for a 2D space always has exactly 2 vectors. For 3D, exactly 3. That count, the number of vectors in any basis, is the dimension of the space.
Why an interviewer cares
These three words show up constantly once you leave the textbook:
- Span is "what outputs can this system produce?" In ML it is the set of predictions a linear model can possibly make: every weighted sum of its feature columns.
- Linear independence is "are my features redundant?" Two columns of data that are scalar multiples carry the same information. That is perfect collinearity, and it wrecks regression because the coefficients no longer have a unique solution.
- Basis is "what is the smallest set of directions that describes my data?" Dimensionality reduction (PCA) is the art of picking a smarter, smaller basis.
The geometry you are building here is the same geometry under the hood of those tools.
Span = everywhere scale-and-add can take you. Independent = every vector earns its spot. Basis = the smallest independent set that still reaches everything. Dimension = how many vectors that takes.
Quick gotchas
Span is about directions, not number of vectors. Three vectors all on one line still only span a line.
The zero vector is never independent. Any set containing it is automatically dependent, because the zero vector equals 0 times any other vector in the set, so it already sits in the span of the rest.
A basis is not unique. Infinitely many bases exist for the same space. They all have the same size, which is why dimension is well defined.
What you walked away with
- A linear combination is scale-then-add: a*v + b*w.
- Span is every linear combination of a set, the region you can cover.
- Linearly independent means no vector is redundant; dependent means one sits in the span of the others.
- A basis is an independent set that spans the whole space, and its size is the dimension.
Next up, Chapter 3, the last of the three fundamentals: we stop moving single vectors around and start moving all of space at once. That is a linear transformation, and the grid it leaves behind is exactly what a matrix records. This is where the numbers in a matrix finally start to mean something.