Quotients and Homomorphisms for Beginners: Part 6, Modern Ideas of Category Theory AKA Quotients Done Right

Some universities have adopted the convention of naming a first course in algebra “Modern Algebra,” as opposed to the traditional (at least for the past few decades) “Abstract Algebra,” or just “Algebra.” I am not a fan of this convention in most cases. The reason is simple: most first courses in algebra fail to mention category theory; students may not even see a single commutative diagram! I am not an algebraist per se, but my work involves a tremendous amount of algebra from representation theory, Galois theory, and much, much more. I cannot get through a lecture or a paper without seeing something like:


(pg. 38, Integral p-adic Hodge Theory by Scholze, Morrow, and Bhatt)

Category theory is the common language of modern algebra, and it has been for a few decades now; indeed, category theory is approaching a level of universal understanding that rivals naive set theory as a lingua franca of mathematics as a whole. (Of course, categories include sets – well, classes – so set theory will remain.) I am of the opinion that all undergraduate pure mathematics majors ought to see some category theory in their required introductory algebra course(s). The beginnings of the theory are not difficult to understand, and the framework is far too important, far too beautiful not to know. Some “Modern Algebra” courses do include a discussion of category theory, e.g., MIT’s Math 18.703, and I commend the departments in which that decision was made, for I believe it a wise one. It is fairly standard to introduce category theory by spreading the ideas throughout a first course in graduate algebra, where the ideas naturally arise. (Categories are of obvious use in undergraduate algebra as well, but such courses essentially serve to introduce students to the language of algebra in broad terms, so matters never get complex enough to find significant simplification via categorical techniques.) Some departments, including my own, I am sad to say, lack a regular offering of basic category theory in and of itself, however, and so the primary exposure a student, undergraduate or graduate, will have to category theory is through the fundamentals of algebra.

In this post, we hope to demonstrate how quotient groups and the first isomorphism theorem fit into the modern picture of algebra as well as why it might be useful to take this view. We begin by introducing the basic ideas of category theory, suitable for anyone with a bit of mathematical maturity as well as some exposure to fundamental mathematical objects, in particular sets, groups, vector spaces, and topological spaces; if you do not know what all of those objects are, you may still stand to gain something from reading this, so I encourage you to try to understand the examples involving objects you know, perhaps doing some searching on the web to understand the objects you do not. You will want to know all of these objects as a serious student of mathematics, anyway, as they are soundly in the class of things every math student ought to know. Without further ado, we define a category, thus embarking on our journey to understand the framework that has quickly become the preferred language of mathematicians the world over and which has been suggested as an alternative foundation for mathematics. We cover somewhat more category theory than is strictly needed.

Definition. A category \mathcal{C} consists of the following data:

  • A class of objects, denoted \mathbf{Ob}(\mathcal{C})
  • For each pair of objects X, Y \in \mathbf{Ob}(\mathcal{C}), a set of arrows from X to Y, denoted \mathrm{Hom}_{\mathcal{C}}(X,Y)
  • A sensible operation of composition on arrows, i.e., if X, Y, Z \in \mathbf{Ob}(\mathcal{C}), then there is a map \displaystyle \mathrm{Hom}(Y,Z) \times \mathrm{Hom}(X,Y) \longrightarrow \mathrm{Hom}(X,Z) such that:
    • The composition is associative in the sense that if f: X \to Y \, , \, g: Y \to Z \, , \, h: Z \to W are arrows, then h \circ (g \circ f) = (h \circ g) \circ f, where the notation \circ denotes the obvious composition of arrows as one is accustomed to in the special case of functions. (Note: just as in the case of groups and other such structures, this makes the expression f_1 \circ f_2 \circ \cdots \circ f_n unambiguous. Prove it!)
    • There exists an identity arrow for each object, i.e., if X \in \mathcal{C}, then there exists \mathrm{id}_X \in \mathrm{Hom}_{\mathcal{C}}(X,X) so that for every f: X \to Y in the category we have f \circ \mathrm{id}_X = f; similarly, for every g: Z \to X in \mathcal{C}, we have \mathrm{id}_X \circ g = g.

This definition takes far more words than it ought to. The idea is really quite simple. It is so simple that it may even be described as an evolution in notation. The eighteenth-century point of view of a function was more or less “I know it when I see it,” and then set theory formalized this a bit, and we began writing functions using the now omnipresent notation f: X \to Y (or X \overset{f}{\to} Y). This represented somewhat of a shift toward considering the relations (maps) between sets important, rather than the sets (objects) themselves. Category theory takes this to the extreme by essentially doing away with objects altogether.

Before diving into examples, I feel an obligation to mention some matters of notation as well as one technicality. Some authors write \mathrm{Hom}(X,Y) as \mathrm{Mor}(X,Y), where “Mor” is clearly an abbreviation for “morphism.” It is a common abuse of notation to write X \in \mathcal{C} in place of X \in \mathbf{Ob}(\mathcal{C}), and similarly, some might write f \in \mathcal{C} for f \in \mathrm{Hom}(X,Y) for two objects X, Y \in \mathcal{C}. Frequently, categories are denoted with fancy fonts, especially bold (\LaTeX: \mathbf{}, e.g., \mathbf{C}), calligraphic (\LaTeX: \mathcal{}, e.g., \mathcal{C}), and script (\LaTeX: \mathscr{}, e.g., \mathscr{C}). Usually, authors have enough sense to write composing g after f as g \circ f, but sometimes the order is reversed, or people get lazy and write gf. Finally, the reason we require the collection of objects to be a class rather than a set is a technical one, which can almost always be ignored; a class is a collection of objects that can be defined by a property that all its members share. Every set is a class, but not every class is a set (classes that are not sets are called “proper”). The idea of a class was rigorously formulated to avoid some paradoxes found in naive set theory around the turn of the twentieth century.

For the sake of concreteness, let us look at a stupid example of a very small category, one which is easily diagrammed. The category \mathbf{3} is represented by the following commutative diagram:


(pg. 8, Category Theory, Second Edition, Oxford Logic Guides, Steve Awodey)

Note that I am borrowing other people’s commutative diagrams principally because I am lazy; I am not particularly proficient at TeX’ing such diagrams, though simple ones like these are trivial to do.

The identity arrows are omitted, as per usual, because they are annoying to write. Note that the symbols labeling the objects are essentially arbitrary. It should be easy enough to verify the axioms are satisfied. I leave this as an exercise to the reader. Let us look at a less trivial example now.
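If you would like to check your answer to that exercise by machine, here is a minimal sketch in Python. It rests on assumptions of mine rather than on anything in Awodey's diagram: I label the three objects 0, 1, 2, and I encode the unique arrow from a to b (which exists exactly when a <= b) as the pair (a, b). The function names are hypothetical as well.

```python
from itertools import product

# Objects of the category 3; the labels 0, 1, 2 are an arbitrary choice.
objects = [0, 1, 2]

# An arrow is encoded as (source, target). There is exactly one arrow
# a -> b when a <= b: three identities plus three non-identity arrows.
arrows = [(a, b) for a in objects for b in objects if a <= b]

def compose(g, f):
    """Return g after f; defined only when the target of f is the source of g."""
    if f[1] != g[0]:
        raise ValueError("arrows are not composable")
    return (f[0], g[1])

def identity(x):
    """The identity arrow on the object x."""
    return (x, x)

def check_axioms():
    """Brute-force check of closure, associativity, and the identity laws."""
    for f, g in product(arrows, repeat=2):
        if f[1] == g[0] and compose(g, f) not in arrows:
            return False
    for f, g, h in product(arrows, repeat=3):
        if f[1] == g[0] and g[1] == h[0]:
            if compose(h, compose(g, f)) != compose(compose(h, g), f):
                return False
    for f in arrows:
        if compose(f, identity(f[0])) != f or compose(identity(f[1]), f) != f:
            return False
    return True
```

Calling check_axioms() should return True. Of course, a brute-force check over six arrows is no substitute for writing out the argument by hand.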

The category \mathbf{Sets} consists of sets as objects and functions as arrows. This is in some sense the most concrete category we have to work with. Try to play with this idea. Here is a question: can you see why \mathbf{Ob}(\mathbf{Sets}) must be a class? This can be answered by doing nothing more than citing a famous paradox. Sometimes, we write specific categories in bold, italics, print, or with an underline. One might let \mathbf{sets} denote the category of finite sets, which is a convenient notation, as it clearly generalizes to other categories.

The category \mathbf{Vec}_k consists of the vector spaces over a field k with arrows between them being linear maps (linear transformations). The category \mathbf{Top} consists of topological spaces as objects. If you know anything about topology, try to guess at what the maps are.

The category \mathbf{Grp} has as its objects groups and as its arrows group homomorphisms. Hence, \mathbf{Grp} consists of all groups and all homomorphisms between them. This might sound like a big category, and indeed, it is fairly large. Things are not always so bad, however. What I mean by this is that, for instance, \mathrm{Hom}(X,Y) might contain nothing but the trivial homomorphism. Indeed, the objects can be quite simple as well, e.g., \{e\} \in \mathbf{Grp}. What do you think the subcategory \mathbf{Ab} is?

A beautiful consequence of the formalism of category theory is that we can make concrete the idea that many theorems transfer from one algebraic object to another. For instance, the isomorphism theorems hold not just for vector spaces but for groups as well. This also shows us why abstraction can be useful in some cases. My favorite example here is from analysis, because few texts take this approach, even at the graduate level, but my school’s text (written by two of our own) does. The text Integration and Modern Analysis by Benedetto and Czaja introduces Lebesgue integration first in the special case of \mathbb{R}, then in the somewhat more general case of \mathbb{R}^n, then in full generality for general spaces X. This gets pretty technical, but the gist of it is that the authors start concrete (indeed, the first chapter is devoted to the classical study of real variables), and the second chapter begins with the aforementioned spaces but then generalizes massively, which takes some effort at the start but helps in the long term. In particular, theorems no longer need to be proven again for every change of category, so a lot of time is saved.

Definition. If \mathscr{C} is a category and f: X \to Y is an arrow in \mathscr{C}, then f is an isomorphism if and only if there exists an arrow g: Y \to X in \mathscr{C} such that g \circ f = \mathrm{id}_X and f \circ g = \mathrm{id}_Y.

Naturally, we might also call isomorphisms invertible arrows, and we might call the map g the inverse of f. It is easy to check that inverses are unique, and that if f, g are invertible and composable, then so is their composition. What else might you be able to say about invertibility and composition? If there exists an isomorphism between objects in a category, say X and Y, then we say X and Y are isomorphic, X \cong Y. Try to figure out what the isomorphisms for \mathbf{Sets} and \mathbf{Grp} are, and investigate how many isomorphisms there might be between objects.
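For small finite sets, you can test your guesses by brute force. The following is a rough sketch, assuming that a function between finite sets is encoded as a Python dict; the names all_functions, is_isomorphism, and count_isomorphisms are my own inventions. It applies the definition above literally: an arrow f is an isomorphism exactly when some g composes with it to the identity on both sides.

```python
from itertools import product

def all_functions(X, Y):
    """Every function X -> Y between finite sets, each encoded as a dict."""
    X, Y = list(X), list(Y)
    return [dict(zip(X, values)) for values in product(Y, repeat=len(X))]

def compose(g, f):
    """Return g after f, where f and g are dicts."""
    return {x: g[f[x]] for x in f}

def identity(X):
    """The identity function on X."""
    return {x: x for x in X}

def is_isomorphism(f, X, Y):
    """f: X -> Y is an isomorphism iff some g: Y -> X is a two-sided inverse."""
    return any(
        compose(g, f) == identity(X) and compose(f, g) == identity(Y)
        for g in all_functions(Y, X)
    )

def count_isomorphisms(X, Y):
    """Number of isomorphisms from X to Y in the category of (finite) sets."""
    return sum(is_isomorphism(f, X, Y) for f in all_functions(X, Y))
```

For instance, count_isomorphisms({1, 2, 3}, {'a', 'b', 'c'}) gives 6, while count_isomorphisms({1, 2}, {'a', 'b', 'c'}) gives 0. Try a few more sizes and see whether a pattern emerges.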

An important special case of isomorphisms is that of automorphisms, which are isomorphisms from an object X to itself. We denote the set (group) of automorphisms of X, as expected, by \mathrm{Aut}(X).

Definition. A groupoid is a category in which all arrows are invertible.

Here is an exercise from Ravi Vakil’s The Rising Sea: Foundations of Algebraic Geometry, which I cannot resist including here. I highly recommend Vakil’s notes, by the way, and I am amazed he has yet to formally publish them. The exercise is as follows: realize a group as a groupoid with a single object.

I should note that every example of a category considered here is a so-called concrete category, which is one whose objects can be realized in an obvious way as sets with additional structure and whose arrows are structure-preserving maps between said objects. Evan Chen’s An Infinitely Large Napkin (currently located in Chapter 22.3 on page 235) provides a good discussion of the important example of an abstract category, that of posets. If you are interested, you can look there.

It would be remiss of me not to mention some other arrows of sorts in categories. I will not give the topics of functors, natural transformations, dual categories, or isomorphism and equivalence of categories nearly enough attention, but I think I should at least mention them.

Definition. A (covariant) functor \mathcal{F}: \mathscr{C}_1 \to \mathscr{C}_2 is an assignment associating every object X \in \mathscr{C}_1 with an object \mathcal{F}(X) \in \mathscr{C}_2 and associating to every arrow f: X \to Y in \mathscr{C}_1 an arrow \mathcal{F}(f): \mathcal{F}(X) \to \mathcal{F}(Y) in \mathscr{C}_2 so that \mathcal{F}(g \circ f) = \mathcal{F}(g) \circ \mathcal{F}(f) whenever f: X \to Y and g: Y \to Z are morphisms in \mathscr{C}_1. Furthermore, \mathcal{F}(\mathrm{id}_X) = \mathrm{id}_{\mathcal{F}(X)} for all objects X.

Functors are usually assumed to be covariant, but they need not be. Contravariant functors simply reverse the directions of all arrows. Somewhat more precisely, a contravariant functor can be defined as a covariant functor on the dual (or opposite) category, which is the category derived from another by simply reversing all arrows. Stupid examples of functors are the identity functor and the inclusion functor of a subcategory. Functors where some structure is lost, such as the functor \mathbf{Vec}_k \hookrightarrow \mathbf{Sets}, are called forgetful. Functors that are injective on each set of arrows \mathrm{Hom}(X,Y) are called faithful, whereas functors that are surjective on each \mathrm{Hom}(X,Y) are called full; functors that are both are called fully faithful. Functors are most frequently written with capital letters in print or calligraphic font, or with capital Greek letters.
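To see the two functor laws in action, here is a small sketch built on an example that does not appear above and is purely my own choice: the power-set functor on finite sets, which sends a set X to its set of subsets and a function f: X \to Y to the direct-image map S \mapsto f(S). Functions are again encoded as dicts, and every name in the code (P, functor_laws_hold, and so on) is hypothetical.

```python
from itertools import combinations, product

def all_functions(X, Y):
    """Every function X -> Y between finite sets, each encoded as a dict."""
    X, Y = list(X), list(Y)
    return [dict(zip(X, values)) for values in product(Y, repeat=len(X))]

def compose(g, f):
    """Return g after f, where f and g are dicts."""
    return {x: g[f[x]] for x in f}

def identity(X):
    """The identity function on X."""
    return {x: x for x in X}

def powerset(X):
    """All subsets of X as frozensets; this is the functor on objects."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def P(f, X):
    """The functor on arrows: send f: X -> Y to the direct-image map on subsets."""
    return {S: frozenset(f[x] for x in S) for S in powerset(X)}

def functor_laws_hold(X, Y, Z):
    """Check P(id_X) = id_{P(X)} and P(g . f) = P(g) . P(f) for all f, g."""
    if P(identity(X), X) != identity(powerset(X)):
        return False
    for f in all_functions(X, Y):
        for g in all_functions(Y, Z):
            if P(compose(g, f), X) != compose(P(g, Y), P(f, X)):
                return False
    return True
```

Something like functor_laws_hold({0, 1}, {'a', 'b', 'c'}, {'p', 'q'}) should come back True. This only tests finitely many arrows, naturally, but it makes the two conditions in the definition feel less abstract.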

Definition. Let F, G be functors from \mathscr{C} to \mathscr{D}. A natural transformation \eta: F \to G is a collection of morphisms in \mathscr{D} such that every X \in \mathscr{C} has associated to it an arrow \eta_X: F(X) \to G(X) (the component of \eta at X), and such that for every f: X \to Y in \mathscr{C} the following diagram commutes (i.e., \eta_Y \circ F(f) = G(f) \circ \eta_X; sometimes people write a curly arrow in the middle of the diagram to denote the direction to read to see the commutative property).


(Wikimedia Foundation)
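The commuting square is easy to test on a toy case. As an illustration of my own (it is not taken from the diagram's source), take F to be the identity functor on finite sets, G to be the power-set functor sending f to its direct-image map, and let the component \eta_X send x to the singleton \{x\}. The sketch below, with hypothetical function names, checks \eta_Y \circ F(f) = G(f) \circ \eta_X for every function f between two small sets.

```python
from itertools import product

def all_functions(X, Y):
    """Every function X -> Y between finite sets, each encoded as a dict."""
    X, Y = list(X), list(Y)
    return [dict(zip(X, values)) for values in product(Y, repeat=len(X))]

def direct_image(f):
    """G(f): the power-set functor applied to f, acting on subsets."""
    return lambda S: frozenset(f[x] for x in S)

def eta(x):
    """The component of eta at any set: send an element to its singleton."""
    return frozenset([x])

def naturality_holds(X, Y):
    """Check eta_Y(f(x)) == G(f)(eta_X(x)) for all f: X -> Y and all x in X."""
    for f in all_functions(X, Y):
        G_f = direct_image(f)
        for x in X:
            # Left side: go along F(f) = f, then apply eta_Y.
            # Right side: apply eta_X, then go along G(f).
            if eta(f[x]) != G_f(eta(x)):
                return False
    return True
```

Here naturality_holds({0, 1, 2}, {'a', 'b'}) should return True: both paths around the square send x to the singleton containing f(x).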

A natural transformation in which every component is invertible is called a natural isomorphism. There is a natural composition law on functors and natural transformations. The category of functors is denoted \mathrm{Funct}. (Why not have a category \mathrm{Nat}?) This can be taken infinitely further by doing higher category theory with 2-categories and even \infty-categories.

If there is a functor between categories such that there exists another functor in the reverse direction so that their compositions are the identity functors, then the categories are called isomorphic. The inverse here is unique. If there is a functor between categories such that there exists another in the reverse direction so that their compositions are naturally isomorphic to the identity functors, then the functor is an equivalence of categories. Here the functor in the reverse direction need not be unique, and hence we do not usually use the term “inverse” at all.

In his notes, Vakil promises not to present excessively more than is needed (i.e., no topoi). In that vein, I will not present any more general category theory except the essential idea of universal properties, which we will look at through groups (i.e., no Yoneda lemma).

Definition. Let G be a group and H be a normal subgroup of G. The quotient G/H in the categorical sense is a group K alongside a homomorphism k: G \to K such that \mathrm{Ker}(k)=H, which is universal among all homomorphisms of that form in the sense that if \phi: G \to G' is any homomorphism such that H \subset \mathrm{Ker}(\phi), then a unique homomorphism \overset{\sim}{f}: K \to G' is induced, which makes the following diagram commute (i.e., \overset{\sim}{f} \circ k = \phi).


(Joseph R. Heavner. I already had this diagram in a document of mine, so I had no excuse not to reproduce it myself.)

It does require a proof to show that \mathbf{Grp} admits such quotients, but we will not bother here, because it is not too different from things we have already seen. Notice that this absolutely kills the proof of the first isomorphism theorem. Take H = \mathrm{Ker}(\phi), replace K with G/H, add in the usual projection map, and you are done! (OK, it does require a bit more of a detailed argument than this, but that is what exercises are for, right?)

This explains the alternative nomenclature for quotient groups as “factor groups.” Any group homomorphism f: G \to G' factors through the quotient of the domain by the kernel of the map. 
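To make the factoring concrete, here is a small numerical sketch. The example is my own and purely illustrative: I take G to be \mathbb{Z}/12\mathbb{Z} under addition and \phi to be multiplication by 4 modulo 12, and all of the Python names are hypothetical. The code builds the kernel, the cosets, the projection k, and the induced map, which is well defined precisely because \phi is constant on each coset of its kernel.

```python
n = 12
G = range(n)  # the group Z/12Z under addition mod 12

def phi(x):
    """An illustrative homomorphism Z/12Z -> Z/12Z: multiplication by 4."""
    return (4 * x) % n

# The kernel of phi: everything sent to the identity element 0.
kernel = frozenset(x for x in G if phi(x) == 0)

def coset(x):
    """The coset x + Ker(phi), encoded as a frozenset of elements of G."""
    return frozenset((x + h) % n for h in kernel)

# The quotient group G/Ker(phi), as a set of cosets.
quotient = {coset(x) for x in G}

def k(x):
    """The usual projection G -> G/Ker(phi)."""
    return coset(x)

def induced(c):
    """The induced map G/Ker(phi) -> G', defined by evaluating phi on a coset."""
    values = {phi(x) for x in c}
    if len(values) != 1:
        raise ValueError("phi is not constant on this coset")
    return next(iter(values))

image = {phi(x) for x in G}

# phi factors through the quotient: induced(k(x)) == phi(x) for every x.
factors = all(induced(k(x)) == phi(x) for x in G)

# The induced map is a bijection from the quotient onto the image of phi.
bijective = {induced(c) for c in quotient} == image and len(quotient) == len(image)
```

In this example the kernel is \{0, 3, 6, 9\}, there are three cosets, and both factors and bijective come out True, which is the first isomorphism theorem in miniature: the quotient by the kernel is matched up with the image.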

I have always found thinking of the first isomorphism theorem in this way to be most revealing. This fundamental fact complements the fact that the theorem allows us to prove an isomorphism between a quotient and a group by simply finding a surjective homomorphism whose kernel satisfies the necessary condition. This is almost always how it is done in practice. Universal properties are quite prevalent in mathematics; another example of something with a universal property is the direct sum of vector spaces.

And with that, we bid adieu to the series. I hope you have learned something. In particular, I hope the motivation was helpful, the basic theory is now clear, the exercises have given you some proficiency in solving problems involving this subject matter with relative ease, and you now have some appreciation for the modern perspective.

