The .epub has very clean math done in HTML (no images), which is a cool way to do things. I've never seen this before. I wonder what the author used to produce the .epub from the .tex?
Instead, it is replaced with a red error box saying: [ Unable to render expression. ]
I wonder if there is an artificial limit on the number of LaTeX expressions that can be rendered per page.
I agree that this is not an ideal start - at least without any further clarification - for beginners, but I think it works well for people who already know mathematical notation but not many specifics of linear algebra.
Also, I don't want to be the preacher bringing this into every argument, but this is one of the genuinely good uses for AI that I have found: bringing the beginning of a beginner-friendly work down to my level. I can have it explain this if I'm unsure about the specific syntax, and it will convey the relevant idea (which is explained with a bit of unnecessary complexity / generality, yes) in simple terms.
One of my pet peeves is using mathematical symbols beyond basic arithmetic without introducing them once by name. Trying to figure out what a symbol is and what branch of math it comes from is extremely frustrating.
But it is SOOO boring to learn the basic mechanics. There's almost no way to sugar coat it either; you have to learn the basics of vectors and scalars and dot products and matrices and Gaussian elimination, all the while bored out of your skull, until you have the tools to really start to approach the interesting areas.
Even the "why does matrix multiplication look that way" is incredibly deep but practically impossible to motivate from other considerations. You just start with "well that's the way it is" and grind away until one day when you're looking at a chain of linear transformations you realize that everything clicks.
This "little book" seems to take a fairly standard approach, defining all the boring stuff and leading to Gaussian elimination. The other approach I've seen is to try to lead into it by talking about multi-linear functions and then deriving the notion of bases and matrices at the end. Or trying to start from an application like rotation or Markov chains.
It's funny because it's just a pedagogical nightmare to get students to care about any of this until one day two years later it all just makes sense.
It's a disservice to anyone to tell them "Well, that's the way it is" instead of telling them from the start "Look, these represent linear functions. And look, this is how they compose".
In my experience it need not be like that at all.
One can start by defining and demonstrating linear transformations. Perhaps from graphics -- translation, rotation, reflection, etc. Show the students that these follow the definition of a linear transformation. That rotating a sum is the same as summing the rotated vectors.
[One may also mention that all differentiable functions (from vector to vector) are locally linear.]
Then you define adding two linear transformations using vector addition. Next you can define scaling a linear transformation. The point being that the combinations can be expressed as linear transformations themselves. No need to represent the vectors as elements of R^d; geometric arrows and the parallelogram rule suffice.
Finally, one demonstrates composition of linear transformations and the fact that the result itself is a linear transformation.
The beautiful reveal is that this addition and composition of linear transformations behave almost the same as addition and multiplication of real numbers.
The addition associates and commutes. The multiplication associates but doesn't necessarily commute. Most strikingly, multiplication distributes over addition. It's almost like the algebra of real numbers!
Now, when you impose a coordinate system or choose a basis, the students can discover the matrix multiplication rule for themselves over a couple of days of playing with it -- look, rather than maintaining this long list of linear transformations, I can store the whole chain as a single transformation, written as one matrix in the chosen basis.
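If it helps, here's a minimal NumPy sketch of that discovery (my own toy example, not from the book): applying a rotation and then a shear to a vector gives the same result as applying the single matrix you get by multiplying the two.

    import numpy as np

    theta = 0.7
    R = np.array([[np.cos(theta), -np.sin(theta)],   # rotation by theta
                  [np.sin(theta),  np.cos(theta)]])
    S = np.array([[1.0, 0.5],                        # shear along x
                  [0.0, 1.0]])

    v = np.array([2.0, -1.0])

    one_after_the_other = S @ (R @ v)   # apply R first, then S
    as_a_single_matrix  = (S @ R) @ v   # the whole chain stored as one matrix

    print(np.allclose(one_after_the_other, as_a_single_matrix))  # True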
But when I was learning linear algebra, all I could think was "who cares about linear functions? It's the simplest, dumbest kind of function. In fact, in one dimension it's just multiplication -- that's the only linear function and the class of scalar linear functions is completely specified by the factor that you multiply by". I stuck to it because that was what the course taught, and they wouldn't teach me multidimensional calculus without making me learn this stuff first, but it was months and years later when I suddenly found that linear functions were everywhere and I somehow magically had the tools and the knowledge to do stuff with them.
https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2x...
Which books or “non-standard” resources would you recommend then, that do a better job?
Once you get to eigenvalues (in my opinion) things start to pick up in terms of seeing that linear spaces are actually interesting.
This approach sort of betrays itself when the very first section about scalars has this line:
> Vectors are often written vertically in column form, which emphasizes their role in matrix multiplication:
This is a big "what?" moment because we don't know why we should care about anything in that sentence. Just call it a convention and later on we can see its utility.
I learned from Strang, for what it's worth, which is basically LU, spaces, QR, then spectral.
I am really bad at math, for what it's worth; this is just the one advanced math subject that intuitively clicked for me.
The simplicity(/beauty) of matrix multiplication still irks me though, in the sense of "wow, seriously? when you work it out, it really looks that simple?"
The "x = b / A" is a bit of a gut-punch on first look because my mind immediately tells me all the ways that that does not work. It makes a some sense once I take a second to think about it, and I can see why it would make you want to jump in a little deeper, but matrices being non-commutative makes me cringe at the idea of a division operator which does not very very clearly spell out where it appears in the chain.
Ax = b is all well and good, but AxA^-1 = bA^-1 is not meaningful; the application/composition order is very important.
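Here's a small NumPy illustration of that point (my own made-up system): multiplying by the inverse on the left solves Ax = b, multiplying on the right generally does not, and in practice you'd call a solver rather than form the inverse at all.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])
    b = np.array([1.0, 2.0])

    x_left  = np.linalg.inv(A) @ b    # A^{-1} b: this is the x with Ax = b
    x_right = b @ np.linalg.inv(A)    # b A^{-1}: a different vector for this A
    x_solve = np.linalg.solve(A, b)   # preferred in practice: no explicit inverse

    print(np.allclose(A @ x_left, b))    # True
    print(np.allclose(A @ x_right, b))   # False
    print(np.allclose(x_left, x_solve))  # True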
What also helped me as a visual learner was to program/setup tiny experiments in Processing[1] and GeoGebra Classic[2].
- [1] https://processing.org
- [2] https://www.geogebra.org/classic
He also created a course on using Linear Algebra for machine learning:
> Linear algebra concepts are key for understanding and creating machine learning algorithms, especially as applied to deep learning and neural networks. This course reviews linear algebra with applications to probability and statistics and optimization–and above all a full explanation of deep learning.
- MIT OCW Course: Matrix Methods in Data Analysis, Signal Processing, and Machine Learning (https://ocw.mit.edu/courses/18-065-matrix-methods-in-data-an...)
- The text book website: Linear Algebra and Learning from Data (2019) https://math.mit.edu/~gs/learningfromdata/
- The Classic Linear Algebra Course: https://ocw.mit.edu/courses/18-06-linear-algebra-spring-2010...
Maybe ... but the fact that you included translation in the list of linear operations seems like a big red flag. Translation feels very linear but it is emphatically not [1]. This is not intended to be a personal jab; just that the intuitions of linear algebra are not easy to internalize.
Adding linear transformations is similarly scary territory. You can multiply rotations to your heart's content but adding two rotations gives you a pretty funky object that does not have any obvious intuition in graphics.
[1] I wouldn't jump into projective or affine spaces until you have the linear algebra tools to deal with them in a sane way, so approaching it this way strikes me as a bit scary.
But the good news is that if you are only interested in, for example, geometry, game theory, systems of linear equations, polynomials, statistics, etc., then you can skip 80% of the content of linear algebra books. You don't have to read them, understand them, or memorize them. You'll interact with a tiny part of linear algebra anyway, and you don't have to do that upfront.
For a moment I was thinking in homogeneous coordinates - that's not the right thing to do in the introductory phase.
Thanks for catching the error and making an important point. I am letting my original comment stand unedited so that your point stands.
About rotations, though, one need not let the cat out of the bag and explain what the addition of two rotations is.*
One simply defines addition of two linear operators as the addition of the vectors that each would have individually produced. This can be demonstrated geometrically with arrows, without fixing coordinates.
* In 2D it's a scaled rotation.
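For what it's worth, the footnote is easy to check numerically. A throwaway NumPy sketch (mine), using the identity R(a) + R(b) = 2 cos((a - b)/2) R((a + b)/2):

    import numpy as np

    def rot(t):
        return np.array([[np.cos(t), -np.sin(t)],
                         [np.sin(t),  np.cos(t)]])

    a, b = 0.4, 1.1
    lhs = rot(a) + rot(b)
    rhs = 2 * np.cos((a - b) / 2) * rot((a + b) / 2)   # a scaled rotation

    print(np.allclose(lhs, rhs))  # True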
Anyway, I believe that it's perfectly possible to explain rotation matrices so that it "clicks" with a high probability, as long as you understand the basic fact that (cos a, sin a) is the point that you get when you rotate the point (1, 0) by angle a counter-clockwise about the origin (that's basically their definition!) Involving triangles at all is fully optional.
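A tiny sketch of that fact (my own, nothing book-specific): build the rotation matrix column by column from where (1, 0) and (0, 1) land, then apply it to a point.

    import numpy as np

    a = np.pi / 6                                 # rotate by 30 degrees
    col1 = np.array([np.cos(a), np.sin(a)])       # where (1, 0) lands
    col2 = np.array([-np.sin(a), np.cos(a)])      # where (0, 1) lands
    R = np.column_stack([col1, col2])

    p = np.array([3.0, 2.0])
    print(R @ p)   # (3 cos a - 2 sin a, 3 sin a + 2 cos a)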
(The term "algebra" can also refer to a particular type of algebraic structure in math, but that’s not what I meant.)
Of course I am not suggesting building synthetic graphics engines :) but the synthetic approach is sufficient to show that the operation is linear.
My formal linear algebra course was boring as hell, to me. The ~4 lectures my security prof dedicated to explaining just enough to do some RSA was absolutely incredible. I would pay lots of money for a hands-on what-linalg-is-useful-for course with practical examples like that.
(If you work through the prerequisites and use "understanding this post" as a sort of roadmap of what you actually need to know, this gets you about 2/3rds through undergraduate linear algebra, and you can skim through nullspaces --- all in the service of learning a generally useful tool for attacking cryptosystems).
It's only difficult if you are wedded to a description of matrices and vectors as seas of numbers that you grind your way through without trying to instill a fuller understanding of what those numbers actually mean. The definition makes a lot more sense when you see a matrix as a description of how to convert one set of basis vectors to another set of basis vectors, and for that, you first need to understand how vectors are described in terms of basis vectors.
Where vectors do come up it’s usually only Cartesian vectors for mechanics, and only basic addition, scalar multiplication and component decomposition are talked about - even dot products are likely ignored.
You can go very far without touching matrices, and actually find motivation on this abstract base before learning how it interops with matrices.
The natural motivation of matrices is as representing systems of equations.
To the best of my knowledge: Scalars are variables. Vectors are arrays. Matrices are two-dimensional arrays. Addition and multiplication are iteration with operators. Combinations are concatenation. The rest, like dot products or norms, are just specialized functions.
But it'd be nice to see it all coded up. It wouldn't be as concise, but it'd be readable.
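Something like this, perhaps; a plain-Python sketch of that dictionary (longer than the math notation, as predicted, but readable):

    from math import sqrt

    def vec_add(u, v):                 # vectors are lists; addition is elementwise iteration
        return [ui + vi for ui, vi in zip(u, v)]

    def scale(c, v):                   # scalar times vector
        return [c * vi for vi in v]

    def dot(u, v):                     # a "specialized function"
        return sum(ui * vi for ui, vi in zip(u, v))

    def norm(v):
        return sqrt(dot(v, v))

    def mat_vec(A, v):                 # a matrix is a list of rows; each output entry is a dot product
        return [dot(row, v) for row in A]

    def mat_mul(A, B):                 # entry (i, k) of AB is row i of A dotted with column k of B
        cols_of_B = list(zip(*B))
        return [[dot(row, col) for col in cols_of_B] for row in A]

    A = [[1, 2], [3, 4], [5, 6]]
    print(mat_vec(A, [1, 1]))                           # [3, 7, 11]
    print(mat_mul([[1, 0], [0, 1]], [[2, 3], [4, 5]]))  # [[2, 3], [4, 5]]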
(I use algorithmic calculus to describe the high-school subject, and distinguish it from what in American universities is usually called "analysis," where one finally has the chance to make the acquaintance of the conceptual and proof-based aspects squeezed out of algorithmic calculus.)
This seems to make it good motivation for an intellectually curious student—"linear functions are the simplest, dumbest kind of function, and yet they still teach us this new and exotic kind of multiplication." That's not how I learned it (I was the kind of obedient student who was interested in a mathematical definition because I was told that I should be), but I can't imagine that I wouldn't have been intrigued by such a presentation!
(Basic probability / combinatorics is actually pretty cool, but both tend to be glossed over.)
If those concepts cause difficulty, it probably makes sense to go back down the learning curve a bit before tackling linear algebra. Alternatively, just cut and paste the expression into any LLM and it'll explain what's what.
If I write a matrix, say, this:
[[1 2]
[3 4]
[5 6]]
What I am doing is describing a transformation of one vector space into another, by describing how the basis vectors of the first vector space are represented as linear combinations of the basis vectors of the second vector space. Of course, the transformed vectors may not necessarily be a basis of the latter vector space.

> The natural motivation of matrices is as representing systems of equations.
That is useful for only a very few things about matrices, primarily Gaussian elimination and related topics. Matrix multiplication -- which is what the original poster was talking about, after all -- is something that doesn't make sense if you're only looking at it as a system of equations; you have to understand a matrix as a linear transformation to have it make sense, and that generally means you have to start talking about vector spaces.
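To put the 3x2 example above in those terms, here's a quick NumPy sketch (my own): the matrix maps R^2 into R^3, and its columns are exactly where the two basis vectors of R^2 land.

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4],
                  [5, 6]])          # a map from R^2 to R^3

    e1 = np.array([1, 0])
    e2 = np.array([0, 1])

    print(A @ e1)                   # [1 3 5] -- the first column
    print(A @ e2)                   # [2 4 6] -- the second column
    print(A @ (2 * e1 + 3 * e2))    # 2*(first column) + 3*(second column) = [8 18 28]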
Be aware that Lang has another book, called just "Linear Algebra", which is more theoretical.
Pair it with Edgar Goodaire's Linear Algebra: Pure & Applied and you can transition nicely from intuitive geometric to pure mathematical approach. The author's writing style is quite accessible.
Add in Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares by Stephen Boyd et al. and you are golden. Free book available at https://web.stanford.edu/~boyd/vmls/
The real payoff though is after you do a deep dive and convince yourself there's plenty of theory and all of these interesting examples and then you learn about SVD or spectral theorems and that when you look at things correctly, you see they act independently in each dimension by... just multiplication by a single number. Unclear whether to be overwhelmed or underwhelmed by the revelation. Or perhaps a superposition.
Machine learning, LLMs, RSA, etc.
It's generally useful for multivariate statistics. Think of 3D flies (insects) in 3D space, clustering about a narrow slanting plane of light from a window slit: those are points that can be projected onto "the plane of best fit" - nominally the slanting plane of light.
That right there is a geometric picture of fitting a line, a plane, a lower-dimensional manifold to higher-dimensional data, the errors (distances from the plane), etc., and something of what Singular Value Decomposition is about (used for image enhancement, sharpening fuzzy data, etc.).
The real test of applications is what kind of work you see yourself doing. A quick back read suggests you're currently a CS student, so all unfocused potential for now (perhaps).
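The fly picture translates almost directly into code. A rough NumPy sketch with synthetic data (my own construction): scatter points near a slanted plane, center them, and the first two right singular vectors span the "plane of best fit", while the smallest singular value measures how thin the cloud is in the remaining direction.

    import numpy as np

    rng = np.random.default_rng(0)

    # points near a slanted plane through the origin, plus a little noise
    u = rng.normal(size=(200, 2))
    plane = np.array([[1.0, 0.0, 0.5],        # two directions spanning the plane
                      [0.0, 1.0, 0.3]])
    points = u @ plane + 0.05 * rng.normal(size=(200, 3))

    centered = points - points.mean(axis=0)
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)

    basis = Vt[:2]                            # spans the plane of best fit
    projected = centered @ basis.T @ basis    # each point dropped onto that plane

    print(s)                                  # last singular value is small: the cloud is nearly flat
    print(np.abs(centered - projected).max()) # residuals are on the order of the noise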
I remember in a differential geometry course, when we reached "curves on surfaces", I thought "what stupidity! what are the odds a curve lies exactly on a surface?"
Thanks to everyone recommending books too!
Linear transforms (such as rotations and displacements) in GPU graphics.
Fourier series in signal processing.
JPEG compression.
Obtaining the best fit element in a vector space of curves given data or other constraints.
Understanding autodiff in JAX.
The mathematical definition of a tensor helps develop intuition for manipulating arrays/tensors in array libraries.
Transition matrices of a Markov chain.
PageRank (a small power-iteration sketch follows this list).
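Here is the promised sketch for the last two items (a toy 4-page web I made up, not anyone's real data): PageRank is essentially the stationary vector of a Markov transition matrix, found by repeated matrix-vector multiplication.

    import numpy as np

    # column-stochastic transition matrix for a toy 4-page web:
    # entry [i, j] is the probability of following a link from page j to page i
    P = np.array([[0.0, 0.5, 0.5, 0.0],
                  [1/3, 0.0, 0.0, 0.5],
                  [1/3, 0.0, 0.0, 0.5],
                  [1/3, 0.5, 0.5, 0.0]])

    d = 0.85                                    # damping factor
    G = d * P + (1 - d) / 4 * np.ones((4, 4))   # the "Google matrix"

    rank = np.full(4, 0.25)
    for _ in range(100):                        # power iteration
        rank = G @ rank

    print(rank, rank.sum())                     # stationary distribution, sums to 1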
Start the path at calculus. Naturally, this will lead to differential equations. Trick the engineers into defining everything in terms of differential equations.
The engineers will get really annoyed, because solving differential equations is impossible.
Then, the mathematicians swoop in with the idea of discretizing everything and using linear algebra to step through it instead. Suddenly they can justify all the million-by-millions matrices they wanted and everybody thinks they are heroes. Engineers will build the giant vector processing machines that they want.
The actual presentation was terrible, I'll be lucky if I die before having to invert a matrix by hand again, but it was there.
Because there is so much to teach/learn, "Modern Mathematics" syllabi have devolved into giving students merely an exposure to all possible mathematical tools in an abstract manner, disjointedly, with no unifying framework and no motivating examples to explain the need for such mathematics. Most teachers are parrots and have no understanding/insight that they can convey to students, and so the system perpetuates itself in a downward spiral.
The way to properly teach/learn mathematics is to follow V.I.Arnold's advice i.e. On Teaching Mathematics - https://dsweb.siam.org/The-Magazine/All-Issues/vi-arnold-on-... Ground all teaching in actual physical phenomena (in the sense of existence with a purpose) and then show the invention/derivation of abstract mathematics to explain such phenomena. Everything is "Applied Mathematics", there is no "Pure Mathematics" which is just another name for "Abstract Mathematics" to generalize methods of application to different and larger classes of problems.
To give an example: a simple multiplication of two numbers is better seen as rotating one of them to be perpendicular to the other and then quantifying the area/volume spanned by them. This gives the vector dot product.
While geometry might better address "why", algebra gets into the work of "how to do it". Mathematics in old times, like other branches of science, did not encourage "why". Instead, most stuff would say "This is how to do it, Now just do it". Algebra probably evolved to answer "how to do it" - the need to equip the field workers with techniques of calculating numbers, instead of answering their "why" questions. In this sense, Geometry is more fundamental providing the roots of concepts and connecting all equations to the real world of spatial dimensions. Physics adds time to this, addressing the change, involving human memory of the past, perceiving the change.
Short, simple answer to that question by Michael Penn: https://www.youtube.com/watch?v=cc1ivDlZ71U
Another interesting treatment by Math the World: https://www.youtube.com/watch?v=1_2WXH4ar5Q&t=4s
There's no impenetrable mystery here. Probably just bad teaching you experienced.
https://www.youtube.com/watch?v=yAb12PWrhV0&list=PLBQcPIGljH...
It starts with the axioms of being able to draw one line parallel to another, and a line through a point, and builds up everything from there. No labeled Cartesian axes. Just primitive Euclidean objects in an affine space.
Starts from linear transformations and builds from there.
Some books for studying Mathematics using J are listed here - https://code.jsoftware.com/wiki/Books
This is only beautiful if you already understand monoids, magmas, and abelian semigroups, and how they form groups. Also, we do not talk of linear transformations; we talk of group homomorphisms.
I don't know about anyone else, but I was taught linear algebra this way in the first semester and it felt like stumbling in a dark room and then having the lights turned on in the last week as if that was going to be payback for all the toe stubbing.
edit to add: (I think your point relates only to the projection system, and not a pure, unprojected model; I just want to make sure I understand because it seems like an important point)
Let's take another approach.
Take a point p that's the sum of vectors a and b, that is
p = a + b.
Now, if translation were a linear transformation, then translating p (say along the x-axis by 1 unit) would be equivalent to applying the same translation to a and b separately and then summing them. But the latter ends up translating by twice the amount. In other words,
p + t ≠ (a + t) + (b + t) = p + 2t.
So translation is not a linear operator in this vector space.
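The same failure in a few lines of NumPy (my own quick check, same setup):

    import numpy as np

    a = np.array([1.0, 2.0])
    b = np.array([3.0, -1.0])
    t = np.array([1.0, 0.0])                 # translate by 1 unit along x

    T = lambda v: v + t                      # the translation map

    print(T(a + b))                          # p + t
    print(T(a) + T(b))                       # p + 2t -- not the same, so T is not linear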
All that needs to be demonstrated is that for real numbers + associates and commutes. That * associates and commutes. And most satisfyingly, that these two operations interact through the distributive property.
Of course, it's more revealing and interesting if one has some exposure to groups and fields.
Do people encounter linear algebra in their course work before that ?
For us it came after coordinate/analytical geometry where we had encountered parallelogram law. So while doing LA we had some vague awareness that there's a connection. This connection solidified later.
We also had an alternative curriculum where matrices were taught in 9th grade as a set of rules without any motivation whatsoever. "This is the rule for adding, this one's for multiplication, see you at the test"
If you have a system Ax=y and a system By=z, then there exists a system (BA)x=z
This system BA is naturally seen as the composition of both systems of equations
And the multiplication rule expresses the way to construct the new system's coefficients over x constrained by z.
The i-th equation of C has coefficients obtained by evaluating the i-th equation of B on the k-th column of A's coefficients:
C_ik = B_ij A_jk (summing over j)
concretely
A11 x1 + A12 x2 = y1
A21 x1 + A22 x2 = y2
and
B11 y1 + B12 y2 = z1
B21 y1 + B22 y2 = z2
then
B11 (A11 x1 + A12 x2) + B12 (A21 x1 + A22 x2) = z1
B21 (A11 x1 + A12 x2) + B22 (A21 x1 + A22 x2) = z2
rearrange and collect terms
(B11 A11 + B12 A21) x1 + (B11 A12 + B12 A22) x2 = z1
(B21 A11 + B22 A21) x1 + (B21 A12 + B22 A22) x2 = z2
the coefficients express the dot product rule directly.
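And a quick numeric check of that collection step, with made-up coefficients (my own): building C by the dot-product rule agrees with substituting one system into the other.

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])     # Ax = y
    B = np.array([[5.0, 6.0],
                  [7.0, 8.0]])     # By = z

    x = np.array([0.3, -1.2])
    y = A @ x
    z = B @ y                       # substitute, as in the expansion above

    C = B @ A                       # the dot-product rule: C_ik = sum_j B_ij A_jk
    print(np.allclose(C @ x, z))    # True: (BA)x = z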
In this book, I cover Functions, Derivatives, Integrals, Multivariable Calculus, and Infinite Processes. In addition, I've included appendices with sketch proofs and applications to Physics, Probability and Statistics, and Computer Science.

The most obvious case where it fails is that it doesn't map zero to itself, and you can see the contradiction there:
T(0 + 0) = T(0) = t
T(0) + T(0) = t + t = 2 * t
Can you expand on your experience with this? I do some graphics programming so I understand that applying matrix transformations works, and I've seen the 3blue1brown 'matrices are spreadsheets' explanation (luv me sum spreadsheets), but the intuition still isn't really there. The 'incredibly deep "why matrix multiplication looks that way"' is totally lost on me.