Henry Crapo: A Brief Reminiscence

A guest post by James Oxley.

Henry in Wellington in 2007

Henry Crapo died on September 3, 2019 at the age of 86. He contributed much to matroid theory, making foundational contributions to the subject. Among his notable achievements were the first text in matroid theory, written jointly with Gian-Carlo Rota; the introduction of the Tutte polynomial and the beta invariant for matroids; the analysis of how to treat single-element extensions of matroids; the identification of the critical problem for matroids; the definition of an attractive non-commutative way to combine two matroids; and a catalogue of all matroids on at most eight elements. He was a very friendly and generous man who, in addition to his many influential publications, hosted intimate and stimulating conferences at his home in the south of France. He will be greatly missed.

This short remembrance will highlight some of Henry’s most important contributions to matroid theory. In 1964, Henry completed his Ph.D. dissertation at MIT. According to the online Mathematics Genealogy Project, Henry had two advisors: Gian-Carlo Rota and Kenneth Myron Hoffmann, although more will be said about this below. Rota’s name will be well known to matroid theorists (or “combinatorial geometers” as he would have preferred), but Hoffman is unlikely to be as he worked in functional analysis.

Henry was one of the attendees at the first conference in matroid theory, which was organized by Jack Edmonds and held at the National Bureau of Standards in Washington, D.C. in 1964. Bill Tutte was another notable attendee at that conference where he delivered his Lectures on Matroids. Of that year, Tutte wrote, `To me that was the year of the Coming of the Matroids. Then and there the theory of matroids was proclaimed to the mathematical world. And outside the halls of lecture there arose the repeated cry: “What the hell is a matroid?”.’

Henry’s first paper [3], which was on single-element extensions of matroids, appeared in the proceedings of that conference, which were published in 1965 in the Proceedings of the National Bureau of Standards. In that paper, Henry noted that, when extending a matroid $M$ by an element $e$ to produce a new matroid $N$, one needs only to consider the flats $F$ of $M$ such that $F\cup e$ is a flat of $N$ with $r(F \cup e) = r(F)$. The set ${\mathcal M}$ of such flats $F$ is called a modular cut. Thus, for example, if ${\mathcal M}$ is empty, then $e$ is added as a coloop, whereas if ${\mathcal M}$ consists of all of the flats of $M$, then $e$ is added as a loop. The extension $N$ is uniquely determined by the modular cut ${\mathcal M}$. Thus a study of the single-element extensions of $M$ is a study of the modular cuts of $M$. Henry proved that a subset ${\mathcal F}$ of the flats of $M$ is a modular cut if and only if every flat containing a member of ${\mathcal F}$ is also in ${\mathcal F}$, and, whenever $X$ and $Y$ are in ${\mathcal F}$ and $r(X) + r(Y) = r(X \cup Y) + r(X \cap Y)$, the flat $X \cap Y$ is also in ${\mathcal F}$.

The pictures in Figures 1 and 2 are from that first matroid conference and were kindly supplied to me by Bill Pulleyblank. In the first one, Henry is in the front row on the extreme left. Slightly to his left in the row behind him is Gian-Carlo Rota. Bill Tutte is in the same row as Rota but two along from him. Jack Edmonds is in the back row, to Tutte’s immediate right. That photo also includes Tutte’s students, Neil Robertson and Ron Mullin, as well as Ray Fulkerson and Dijen Ray-Chaudhuri. Since the reader may enjoy trying to identify those people, their locations will not be revealed until the end of this note.

Figure 1: Participants in the first matroid theory conference.

Figure 2: The matroid theory conference participants seated.

Henry’s second and third papers [4, 5] appeared in Volumes 1 and 2 of the newly founded Journal of Combinatorial Theory. His third paper A higher invariant for matroids introduced what Brylawski [2, p.252] called “Crapo’s Betsy invariant”. More formally, for a matroid $M$, the beta invariant is defined by $$ \beta (M) = (-1)^{r(M)} \sum_{A \subseteq E(M)} (-1)^{|A|} r(A).$$

Henry proved that, in a matroid $M$, for any element $e$ other than a loop or a coloop, $$ \beta (M) = \beta(M\backslash e) + \beta(M/e).$$ Since $\beta(U_{0,1}) = 0$ and $\beta(U_{1,1}) = 1$, a straightforward induction argument gives that $\beta (M) \ge 0$ for all matroids $M$. Henry proved that, when $M$ has at least two elements, $M$ is connected if and only if $\beta(M)$ is positive. He also showed, again when $M$ has at least two elements, that $\beta(M) = \beta(M^*)$. Then, using Tutte’s excluded-minor characterization of regular matroids, Henry proved the following result.

Theorem. A matroid $M$ is regular if and only if $\beta(N_4) \le 1$ for all $4$-element minors $N_4$ of $M$, and $\beta(N_7) \le 2$ for all $7$-element minors $N_7$ of $M$.

Henry’s 1969 paper [6] The Tutte polynomial introduced the Tutte polynomial for matroids by building on Tutte’s 1954 paper [14] A contribution to the theory of chromatic polynomials, which introduced what we now call the Tutte polynomial for graphs. In a footnote in Henry’s paper, he observes that Sections 3 and 4 of his paper “constitute a rewriting, with new proofs, of the main theorems in the author’s doctoral dissertation `On the Theory of Combinatorial Independence’, submitted to the Massachusetts Institute of Technology in June, 1964, under the supervision of Professor Gian-Carlo Rota.” Note that there is no mention here of Henry’s second advisor Kenneth Hoffman leading one to wonder what role he played.

Tutte himself studied the Tutte polynomial for representable matroids, which he called “nets”, in his 1948 Cambridge Ph.D. thesis [13]. Since representability was not important in Tutte’s definition, he had effectively treated the Tutte polynomial for all matroids although he never published this work. The Tutte polynomial for matroids has been extensively studied by many authors in the half century since Henry’s paper formally introducing it. It can be defined as follows:
$$T(M;x,y) = \sum_{A \subseteq E} (x-1)^{r(E) – r(A)}(y-1)^{|A| – r(A)}.$$ Thus $T(U_{0,1};x,y) = y$ and $T(U_{1,1};x,y) = x$. This polynomial obeys the following two basic recursions: $$T(M;x,y) = T(M|\{e\};x,y) T(M\backslash e;x,y) ~~~\text{when $e$ is a loop or a coloop,}$$ and $$T(M;x,y) = T(M\backslash e;x,y) + T(M/ e;x,y) ~~~\text{otherwise.}$$
It follows from these that we can write $$T(M;x,y) = \sum_{i,j\ge 0} t_{ij}x^iy^j$$ where $t_{ij} \ge 0$ for all $i$ and $j$. It turns out that the beta invariant shows up among these coefficients. Specifically, when $M$ has at least two elements, $$\beta(M) = t_{10} = t_{01}.$$

Beginning in 1964, Gian-Carlo Rota published a series of papers “On the foundations of combinatorial geometry.” The most widely cited paper in that series is Part I: Theory of Möbius functions [12]. Part II, joint with Henry, is Combinatorial geometries and was published in 1970. That 25-page paper [7] was followed in the same year by a monograph whose full title is On the Foundations of Combinatorial Theory: Combinatorial Geometries. Preliminary Edition. This was the first text in matroid theory and, as such, was very influential. A year later, Tutte published An Introduction to the Theory of Matroids, which was effectively a reprinting of his 1965 Lectures on Matroids, but that book [16] did not attract nearly the same attention as Crapo and Rota’s book. When Crapo and Rota decided to update their book, it began as a joint project with Neil White [18, p.xv]. It turned into a series of three books [18, 19, 20], which were edited by Neil White and which contained chapters by a large number of different authors including two in the first volume by Henry. The approach to matroid theory taken by Crapo and Rota was a very geometric one and much of the focus was on the lattice of flats. Studying such geometric lattices is, of course, equivalent to studying simple matroids. With hindsight, the major drawback of this approach is that the geometric lattice of the dual of a matroid $M$ is not easily obtained from the geometric lattice of $M$. The fundamental role of duality and the link that it provides between deletion and contraction would seem to have been a major factor in the subsequent shift in focus in the study of matroids away from the study of geometric lattices. Crapo and Rota’s book has many attractive features. One in particular is the introduction of the critical problem for matroids. The goal of that problem was to provide a common framework within which one could view a number of problems in extremal combinatorics including what was then the Four Colour Problem, the Five-Flow Conjecture [14], and Tutte’s Tangential $2$-Block Conjecture [15]. For a matroid $M$, the characteristic or (chromatic) polynomial is defined by $$p(M;\lambda) = \sum_{A \subseteq E} (-1)^{|A|}\lambda^{r(M) – r(A)}.$$ This is an evaluation of the Tutte polynomial: $$p(M;\lambda) = (-1)^{r(M)}T(M;1-\lambda,0).$$ If $G$ is a graph with $k(G)$ connected components and cycle matroid $M(G)$, the chromatic polynomial $P_G(\lambda)$ of the graph satisfies $$P_G(\lambda) = \lambda^{k(G)}p(M(G);\lambda).$$ Moreover, if $M$ is loopless and its underlying simple matroid is $M’$, then $p(M’;\lambda) = p(M;\lambda)$.

Now let $M$ be a rank-$r$ simple $GF(q)$-representable matroid and view $M$ as a restriction of the projective geometry $PG(r-1,q)$. When $q$ is at least five, the embedding of $M$ in $PG(r-1,q)$ need not be unique. The critical exponent (or critical number) $c(M;q)$ of $M$ is $r – k$ where $k$ is the rank of the largest projective subspace of $PG(r-1,q)$ that contains no element of $E(M)$. In particular, recalling that the affine geometry $AG(r-1,q)$ is obtained from $PG(r-1,q)$ by deleting the elements of a hyperplane of the latter, we see that $c(M;q) = 1$ if and only if $M$ is a restriction of $AG(r-1,q)$. It is natural to expect that $c(M;q)$ may depend on the embedding of $M$ in $PG(r-1,q)$. However, the following attractive result of Crapo and Rota [8] proves that it does not.

Theorem. Let $M$ be a rank-$r$ simple $GF(q)$-representable matroid. Then
$$c(M;q) = \min\{j \ge 1: p(M;q^j) > 0\}.$$
One consequence of this is that a simple graph $G$ is bipartite if and only if its critical exponent $c(M;2)$ is one. More generally, the chromatic number $\chi(G)$ of $G$ and the critical exponent of $M(G)$ obey the following: $$q^{c(M;q) – 1} < \chi(G) \le q^{c(M;q)}.$$ The study of the behaviour of the critical exponent of matroids has recently been enjoying a renaissance with it now being more commonly known as the critical number. Peter Nelson [11] has recently written a very approachable survey of some of this work.

In 1973, with John Blackburn and Denis Higgs, Henry published A Catalogue of Combinatorial Geometries [1]. The preprint of that had an extraordinary cover, which is reprinted in Figure 3. The catalogue was assembled by doing sequences of single-element extensions. Thanks to Rudi Pendavingh and Stefan van Zwam, we have a wonderful matroid package in SageMath. This paper of Henry and his collaborators seems to have been the first to do extensive computation involving small matroids.

Figure 3: Can you identify what this says? The answer appears below.

I met Henry when I was working with Tom Brylawski in North Carolina probably in 1980. By then, Henry’s research focus was on structural rigidity, a topic that occupied his mind for many years. Henry did return to work on matroids from time to time after that. I conclude this reminiscence with one such return, in 2005, when Henry and Bill Schmitt [9] introduced a beautiful and useful matroid construction. Let $M_1$ and $M_2$ be matroids on disjoint sets $E_1$ and $E_2$. One easy way to combine $M_1$ and $M_2$ is to take their direct sum, a matroid of rank $r(M_1) + r(M_2)$. Crapo and Schmitt defined a different matroid on $E_1 \cup E_2$ of rank $r(M_1) + r(M_2)$. The free product $M_1 \Box M_2$ of $M_1$ and $M_2$ has as its bases all subsets $B$ of $E_1 \cup E_2$ of cardinality $r(M_1) + r(M_2)$ such that $B\cap E_1$ is independent in $M_1$ while $B\cap E_2$ is spanning in $M_2$. This operation is non-commutative. Indeed, $M_1 \Box M_2 = M_2 \Box M_1$ if and only if both $M_1$ and $M_2$ have rank zero, or both $M_1^*$ and $M_2^*$ have rank zero. This operation has a number of attractive properties. For example, $$(M_1 \Box M_2)^* = M_2^* \Box M_1^*.$$ Moreover, given $|E_1|$, the matroid $M_1 \Box M_2$ uniquely determines $M_1$ and $M_2$ up to isomorphism.

Theorem. For matroids $M_1$, $M_2$, $N_1$, and $N_2$, if $M_1 \Box M_2 \cong N_1 \Box N_2$ and $|E(M_1)| = |E(N_1)|$, then $M_1 \cong N_1$ and $M_2 \cong N_2$.

In 1969, Dominic Welsh [17] made a natural and seemingly innocuous conjecture about the number $f(n)$ of non-isomorphic matroids on an $n$-element set. Thirty-five years later, the conjecture was settled essentially simultaneously by Manoel Lemos [10] and by Crapo and Schmitt [9] using quite different methods. Crapo and Schmitt’s proof of Dominic’s conjecture was derived by applying the last theorem.

Theorem. For all non-negative integers $m$ and $n$, $$f(m+n) \ge f(m)f(n).$$

Henry liked to host small meetings at his home in La Vacquerie in the south of France. These were very stimulating meetings. The mathematics was exciting and the food was superb, with Henry hiring a chef for the week. Henry had a wonderful cellar I am told, and he shared extensively from it during these meetings. In addition, he arranged for local cultural events in the evenings including music recitals and interactive puppet shows. The experience was unforgettable for all lucky enough to enjoy it. At the only such meeting I attended, Henry gave up his own bedroom for my use. He was a most generous man. Those of us fortunate enough to have known him will miss him greatly; and all of us have his extensive mathematical legacy to continue to enrich our lives.

Answers to the open questions

In the photograph in Figure 1, Ray Fulkerson is in the back row on the extreme right; next to him is Neil Robertson; on the row in front of that, Ron Mullin is on the exteme right and Dijen Ray-Chaudhuri is next to him. In the photograph in Figure 2, Jack Edmonds is at the right-hand end of the table; two along from him is Gian-Carlo Rota; three along from him are, in order, Bill Tutte, Ray Fulkerson, and Dijen Ray-Chaudhuri. In the second row, starting at the left-hand end, the first three people are Ron Mullin, Neil Robertson, and Henry Crapo. In Figure 3, the title of the preprint is The Henry Crapo Group Presents The Incredible Catalogue of 8 Point Geometries. See Single Element Extensions Grow Before Your Eyes.


[1] Blackburn, John E.; Crapo, Henry H.; Higgs, Denis A. A catalogue of combinatorial geometries. Math. Comp. 27 (1973), 155–166.

[2] Brylawski, Thomas H. A decomposition for combinatorial geometries. Trans. Amer. Math. Soc. 171 (1972), 235–282.

[3] Crapo, Henry H. Single-element extensions of matroids. J. Res. Nat. Bur. Standards Sect. B 69B (1965), 55–65.

[4] Crapo, Henry H. The Möbius function of a lattice. J. Combinatorial Theory 1 (1966), 126–131.

[5] Crapo, Henry H. A higher invariant for matroids. J. Combinatorial Theory 2 (1967), 406–417.

[6] Crapo, Henry H. The Tutte polynomial. Aequationes Math. 3 (1969), 211–229.

[7] Crapo, Henry H.; Rota, Gian-Carlo On the foundations of combinatorial theory. II. Combinatorial geometries. Studies in Appl. Math. 49 (1970), 109–133.

[8] Crapo, Henry H.; Rota, Gian-Carlo. On the Foundations of Combinatorial Theory: Combinatorial Geometries. Preliminary Edition. M.I.T. Press, Cambridge, Mass.-London, 1970.

[9] Crapo, Henry; Schmitt, William. The free product of matroids. European J. Combin. 26 (2005), 1060–1065.

[10] Lemos, Manoel. On the number of non-isomorphic matroids. Adv. in Appl. Math. 33 (2004), 733–746.

[11] Nelson, Peter Colouring without colours: graphs and matroids. Lond. Math. Soc. Newsl. No. 482 (2019), 25–29.

[12] Rota, Gian-Carlo. On the foundations of combinatorial theory. I. Theory of Möbius functions. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 2 (1964), 340–368.

[13] Tutte, W. T. An algebraic theory of graph colorings. Ph.D. thesis. University of Cambridge, 1948.

[14] Tutte, W. T. A contribution to the theory of chromatic polynomials. Canadian J. Math. 6 (1954), 80–91.

[15] Tutte, W. T. Lectures on matroids. J. Res. Nat. Bur. Standards Sect. B 69B (1965), 1–47.

[16] Tutte, W. T. Introduction to the Theory of Matroids. American Elsevier, New York, 1971.

[17] Welsh, D. J. A. Combinatorial problems in matroid theory. Combinatorial Mathematics and its Applications (Proc. Conf., Oxford, 1969) pp. 291–306, Academic Press, London, 1971.

[18] White, Neil (editor). Theory of Matroids. Cambridge University Press. Cambridge. 1986.

[19] White, Neil (editor). Combinatorial Geometries. Cambridge University Press. Cambridge. 1987.

[20] White, Neil (editor). Matroid Applications. Cambridge University Press. Cambridge. 1992.

Signed difference analysis

I’d like to discuss an application of oriented matroids to psychology that I’ve been working on with John Dunn. John is a mathematical psychologist who, along with Ralph James, started this ball rolling — I got an email from him out of the blue one day beginning “I hope you don’t think I’m a crackpot, but I think I have an application of oriented matroids”.

The type of psychological model we consider has three types of variables. Two of them are things one can measure in the lab: independent variables represent factors one can change to affect a subject’s responses, and dependent variables represent the responses. Between these, typically impossible to measure, are latent variables — theoretical constructs that can be viewed as depending on the independent variables and determining the dependent variable. For instance, a memory experiment might ask people to memorize words under different conditions (the conditions are independent variables), and then give them a test whose score is the dependent variable. The theoretical construct would describe mental processes involved in memory (the latent variables), which might be changed by changes in the independent variables, and would describe how values of the latent variables determine values of the dependent variables. We assume all latent variables and all dependent variables to be numerical.

If a theory were to propose that the functional relationship between latent and dependent variables has a particular algebraic form, then we could easily test the theory. For instance, if the relationship between the vector $\mathbf{x}$ of latent variables and $\mathbf{y}$ of dependent variables were theorized to be $\mathbf{y}=\mathbf{A}\mathbf{x}+\mathbf{b}$ for some specified matrix $\mathbf{A}$ and vector $\mathbf{b}$, then the theory predicts that each $\mathbf{y}$ will lie in the affine space ${\mathrm{col}}(\mathbf{A})+\mathbf{b}$. If we expanded the theory to leave $\mathbf{b}$ unspecified, then we could still test this theory: it predicts that a set $\{\mathbf{y}_1, \ldots, \mathbf{y}_N\}$ of $N$ measurements of the vector of dependent variables should satisfy $\mathbf{y}_i-\mathbf{y}_j\in{\mathrm{col}}(\mathbf{A})$ for all $i$ and $j$.

But, if you think it seems odd to propose an algebraic relationship between numbers directly describing mental processes and numbers arising as test scores, then you are not alone.

However, in some situations a reasonable theory proposes that the vector of dependent variables $\mathbf{y}$ is related to the vector of latent variables $\mathbf{x}$ by a function \[\mathbf{y}=f(\mathbf{A}\mathbf{x})\]
where $\mathbf{A}$ is a specified matrix and $f$ is a function that the model does not specify but proposes to be componentwise monotonic. That is, $f$ is assumed to have the form $f(z_1, \ldots,z_m)=(f_1(z_1),\ldots, f_m(z_m))$, where either each $f_i$ is a strictly monotonically increasing function or each $f_i$ is a strictly monotonically decreasing function. I’ll give some examples of such theories shortly, and I’ll also expand this idea to a broader range of theories. First I’ll discuss how this form of hypothesis leads us to oriented matroids.

Signed difference analysis

For a vector $\mathbf{x}\in\mathbb{R}^m$, let $\mathrm{sign}(\mathbf{x})\in\{0,+,-\}^m$ be the componentwise sign. For a vector space $V$, let $\mathcal{V}^*(V)=\{\mathrm{sign}(\mathbf{v}): \mathbf{v}\in V\}$. Thus ${\mathcal{V}}^*(V)$ is the signed covector set of an oriented matroid realized by $V$.

Proposition 1([3])
Let $g$ be a function with codomain ${\mathbb{R}^n}$, and let $f:{\mathbb{R}^n}\to{\mathbb{R}^n}$ be a componentwise monotonic function. Then for any $\mathbf{x}$ and $\tilde{\mathbf{x}}$ in the domain of $g$,

This is really more of an observation than a proposition — the proof is immediate. But it tells us that for models of the form I described earlier, oriented matroids provide an appropriate test for the model.

Corollary 2 Let $\mathbf{A}$ be an $m\times n$ matrix over $\mathbb{R}$, and $f:{\mathbb{R}^n}\to{\mathbb{R}^n}$ a componentwise monotonic function. Then for any $\mathbf{x},\tilde{\mathbf{x}}\in{\mathbb{R}^m}$,

As a very simple example, a theory might propose a single latent variable $x$ such that one’s score at each of several tests is a monotonically increasing function of $x$. Thus one proposes that the vector $\mathbf{y}$ of test scores satisfies
for some componentwise increasing $f$. Our theory and Corollary 2 says that
if $\mathbf{y}$ is the vector of test scores for Maria and $\tilde{\mathbf{y}}$ is the vector of test scores for Fred, then $\mathrm{sign}(\mathbf{y}-\tilde{\mathbf{y}})^T$ is in $\{(+,\ldots, +), (0,\ldots, 0),(-,\ldots,-)\}$.

This end result is obvious: our original theory was that the test-taker with the larger value of $x$ should do better on all of the tests. The point of bringing this example up is to note that this theory is more realistic than a theory that proposes some specific formula relating latent and dependent variables. By considering only the sign of $\mathbf{y}-\mathbf{y}’$, we discard numerical issues and consider only ordinal relationships between corresponding components of $\mathbf{y}$ and $\mathbf{y}’$.

To add some realism: Perhaps Maria has a larger value of $x$ than Fred, but not all of the tests are sensitive enough to detect her superior skill: on some tests they get the same score. Thus, perhaps $f$ is only componentwise weakly increasing. To encompass this we have the following observation. We order $\{0,+,-\}$ so that 0 is the unique minimum and $+,-$ are incomparable, and we order $\{0,+,-\}^n$ componentwise. For a subset $P$ of $\{0,+,-\}^n$, we will let $P_\downarrow$ be the order ideal of $P$ and $P^\uparrow$ be the filter of $P$. As another easy proposition we have the following.

Proposition 3 Let $g$ be a function with codomain ${\mathbb{R}^n}$, and let $f:{\mathbb{R}^n}\to{\mathbb{R}^n}$ be a componentwise weakly monotonic function. Then for any $\mathbf{x}$ and $\tilde{\mathbf{x}}$ in the domain of $g$,

Corollary 4 Let $\mathbf{A}$ be an $m\times n$ matrix over ${\mathbb{R}}$, and $f:{\mathbb{R}^n}\to{\mathbb{R}^n}$ a componentwise weakly monotonic function. Then for any $\mathbf{x},\tilde{\mathbf{x}}\in{\mathbb{R}^m}$,

Thus, if we have a theory that proposes that dependent variables $\mathbf{y}$ and latent variables $\mathbf{x}$ are related by $\mathbf{y}=f(\mathbf{A}(\mathbf{x}))$ for some matrix $\mathbf{A}$ and componentwise weakly increasing $f$, and we have data $\{\mathbf{y}_1, \ldots, \mathbf{y}_N\}$, where each $\mathbf{y}_i$ is the vector of dependent variables measured in one trial, then to test our theory we do the following:

  1. Find all signed difference vectors $\mathrm{sign}(\mathbf{y}_i-\mathbf{y}_j)$.
  2.  For each signed difference vector $S$, see if there is a $X\in{\mathcal{V}}^*({\mathrm{col}}(\mathbf{A}))$ such that $S$ and $X$ coincide on the support of $S$. Any signed difference vector for which there is no such $X$ is a violation of the model.

We can carry this out more efficiently (at least, when $|{\mathcal{C}}({\mathrm{col}}(\mathbf{A}))|<|{\mathcal{V}}^*({\mathrm{col}}(\mathbf{A}))|$) using the following result.
Proposition 5 ([2]) For any oriented matroid on elements $E$ with signed covector set ${\mathcal{V}}^*$ and signed circuit set ${\mathcal{C}}$, we have that $\{{\mathcal{V}}^*_\downarrow,{\mathcal{C}}^\uparrow\}$ is a partition of $\{0,+,-\}^E$.

Thus Step 2 above can be replaced with

2. For each signed difference vector $S$, see if there is a $Z\in{\mathcal{C}}({\mathrm{col}}(\mathbf{A}))$ such that $S$ and $Z$ coincide on the support of $Z$. Any signed difference vector for which such a $Z$ exists is a violation of the model.

Example: the Remember-Know problem

Think about the last time you saw someone and thought, “I know I’ve seen that person before, but I can’t remember where”. Some theories hold that remembering — recalling details such as “Oh yeah, I see that person working at the grocery store sometimes” — is an essentially different mental process from knowing — having a sense of familiarity without any context around it.

Another theory proposes that the only difference between remembering and knowing is the strength of the recall. More specifically, a signal detection model proposes that each of the above questions elicits in your mind a response (a signal) of some strength, and that the strength of this signal determines whether you “remember” the answer, merely “know” the answer, or don’t know the answer.
A testable instantiation of this ([2]) is as follows. On Day 1, a subject is given a list of words. On Day 2, the subject is given a new list and is asked, for each word on the new list, whether they “remember” seeing it on the old list, whether they “know” it was there despite not “remembering”, or whether they don’t recall that it was on the list.

A signal detection model theorizes that each word on the new list elicits a signal, so that

  •  if the word was on the old list, there is a probability distribution $p(s)$ for the strength of this signal, and
  • if the word was not on the old list, the probability distribution for the strength of this signal is $p(s+d)$ for some $d$.

The model then says that there are numbers $a<b$ such that the subject will “remember” the word from the list if the strength of the signal is greater than $b$, will merely “know” the word was on the list if the strength was in the interval $[a,b]$, and will not recall the word if the strength is less than $a$.

Often the function $p(x)$ is assumed to be a normal distribution. But there’s frequently little theoretical basis for this assumption, and the use of signed difference analysis allows us to avoid any such assumption. We have latent variables $a$, $b$, and $d$, and we have that

  •  the probability ${\mathrm{Prob}}(\mbox{rem:true})$ of “remembering” a word that was on the old list is $\int_b^\infty p(t) dt=1-P(b)$. Here $P$ is the cumulative probability distribution for $p$,
  •  the probability ${\mathrm{Prob}}(\mbox{know:true})$ of either “remembering” or “knowing” a word that was on the old list is $\int_a^\infty p(t) dt=1-P(a)$,
  •  the probability ${\mathrm{Prob}}(\mbox{rem:false})$ of “remembering” a word that was not on the old list is $\int_b^\infty p(t+d) dt=1-P(b+d)$, and
  •  the probability ${\mathrm{Prob}}(\mbox{know:false})$ of either “remembering” or “knowing” a word that was not on the old list is $\int_a^\infty p(t+d) dt=1-P(a+d)$.

Let $f(x)=1-P(x)$, a monotonically decreasing function, and let $F:{\mathbb{R}}^4\to R^4$ act by $f$ in each component. Thus our model proposes that
So, if we conduct this experiment under different conditions (i.e., with different values for the independent variables) and record our data in terms of the four variables on the left-hand side of this equation, the model predicts that the signed differences of data points will be in ${\mathcal{V}}^*({\mathrm{col}}(\mathbf{A}))_\downarrow$, where $A$ is the matrix on the right-hand side of the above equation.

This is a pretty low bar: the only two elements of $\{0,+,-\}^4$ which are not in ${\mathcal{V}}^*({\mathrm{col}}(\mathbf{A}))_\downarrow$ are $\pm (+,-,-,+)$. But that is part of the point. Previous work had dismissed signal-detection models for the Remember-Know problem based on results of experiments similar to type I’ve described, but their analyses assumed a particular form for the probability distribution $p(x)$, and hence assumed data points must lie on a curve of a certain form. Signed difference analysis suggests that without making such an assumption, a signal-detection model for this problem is very difficult to falsify.

Nonlinear structure in SDA

We can broaden signed difference analysis further by considering models of the form $\mathbf{y}=f(g(\mathbf{x}))$, where $\mathbf{y}$ is the vector of dependent variables, $\mathbf{x}$ is the vector of latent variables, $f$ is an unspecified componentwise monotonic function, and $g$ is close enough to some affine function $\mathbf{x}\to\mathbf{A}\mathbf{x}+\mathbf{b}$ that, for all $\mathbf{x}$ and $\tilde{\mathbf{x}}$,
Proposition 1 then promises us that for all $\mathbf{x}$ and $\tilde{\mathbf{x}}$,

For instance, if $g:{\mathbb{R}^m}\to{\mathbb{R}^n}$ is differentiable and the column space $V$ of the Jacobian $Dg$ is constant, then for every $\mathbf{x},\tilde{\mathbf{x}}\in{\mathbb{R}^m}$ and differentiable path $\lambda:[0,1]\to{\mathbb{R}^m}$ from $\mathbf{x}$ to $\tilde{\mathbf{x}}$ we have
\[ g(\tilde{\mathbf{x}})-g(\mathbf{x})=\int_0^1 Dg(\lambda(t))\lambda'(t) dt\in V.\]

For instance, in ([2]) we consider a signal-detection model for inductive and deductive reasoning that predicts data of the form
$\mathbf{y}=f(d,-d,(d-c)/s, -d+c)$, where $d$, $c$, and $s$ are latent variables and $f$ is componentwise monotonic. The Jacobian of the function $g(d,c,s)=(d,-d,(d-c)/s, -d+c)$ is
which has the same column space as
and thus for every $(d,c,s)$ and $(\tilde d,\tilde c,\tilde s)$ and every componentwise monotonic $f$, the model proposes that the signed difference $f(g(d,c,s))-f(g(\tilde d,\tilde c,\tilde s))$ is in

As another example, one model for reading ([1]) proposes that in reading single words one uses one or both of two skills. To oversimplify a bit, either one sounds out a word phonetically or one compares the word to known words. The model was proposed, in part, to account for cases of acquired dyslexia affecting either ability to read phonetically regular words but not phonetically irregular words or ability to read known words but not phonetically regular unknown words. Let $a$ denote the probability that a particular reader is will be successful at phonetically reading a phonetically regular word, and let $b$ be the probability that this reader will successfully read a word based on comparison to known words. These are the latent variables. Let $\mathbf{y}=(y_1, y_2, y_3)^T$ be the reader’s scores on three tests: $y_1$ is the score on a test of phonetically regular nonsense words, $y_2$ is the score on a test of phonetically irregular common words, and $y_3$ is the score on a test of words that are both phonetically regular and common. The model proposes that $\mathbf{y}=f(a,b,a+b-ab)$, where $f$ is componentwise weakly increasing. The Jacobian of $g(a,b)=(a,b,a+b-ab)$ is
which no longer has constant column space. However, we can still see that, for all $(a,b)$ and $(\tilde a,\tilde b)\in [0,1]^2$, the sign difference $\mathrm{sign}(g(a,b)-g(\tilde a, \tilde b))$ is in
by calculating
g(a,b)-g(\tilde a, \tilde b)&=\int_0^1 Dg(t(\tilde a,\tilde b)+(1-t(a,b))\begin{pmatrix}\tilde a-a\\\tilde b-b\end{pmatrix} dt\\
\int_0^1 (1-t)b+t\tilde b\ dt&\int_0^1 (1-t)a+t\tilde a\ dt
\end{pmatrix}\begin{pmatrix}\tilde a-a\\\tilde b-b\end{pmatrix}\\
1-\frac{1}{2}(b+\tilde b)&1-\frac{1}{2}(a+\tilde a)
\end{pmatrix}\begin{pmatrix}\tilde a-a\\\tilde b-b\end{pmatrix}
and noting that
1-\frac{1}{2}(b+\tilde b)&1-\frac{1}{2}(a+\tilde a)

As an aside: ${\mathcal{V}}^*({\mathrm{col}}(Dg))$ is constant on $(0,1)^2$, and one might hope that this already implies that all signed differences $\mathrm{sign}(f(g(a,b))-f(g(\tilde a,\tilde b)))$ are in ${\mathcal{V}}^*({\mathrm{col}}(Dg))$. This would render the above integration argument unnecessary. If $Dg$ were rank 1, such a result follows from
the Mean Value Theorem. However, in higher rank this fails — see \cite{DA18} for an example.


  1. Coltheart, M., “Cognitive neuropsychology and the study of reading”, in Attention and performance XI, M. I. Posner and O. S. M. Marin, eds. (1985)
  2. Dunn, J. C. and Anderson, L., Signed difference analysis: Testing for structure under monotonicity, Journal of Mathematical Psychology 85 (2018), 36-54
  3. Dunn, J. C. and James, R. N., Signed difference analysis: Theory and application, Journal of Mathematical Psychology 47 (2003), 389–416

Exceptional matroids in chain theorems

At the end of November 2017, the Tutte Centenary Retreat was held.  32 researchers gathered in Creswick Australia to work on problems in three areas where Tutte made seminal contributions. One of those three areas was Matroid Structure Theory: nine of us (Rutger Campbell, Deborah Chun, Tara Fife, Kevin Grace, Dillon Mayhew, James Oxley, Charles Semple, Geoff Whittle, and myself) split into two groups to work on some carefully curated problems in this area.  In this post, I’m going to talk about matroids where certain subsets of the ground set appear in circuits and cocircuits of certain sizes — mostly work that originated during this week in Creswick — as well as some related work and open problems in the area.

Rather than getting into any detail of the proofs, my aim with this post is to give an overview of the motivation (from a connectivity-centric point of view), the results, and give some open questions and conjectures on the topic.  Essentially, most of the results follow from repeated use of orthogonality: the fact that a circuit and cocircuit of a matroid cannot intersect in a single element.

To start with, let’s consider matroids where every $t$-element subset of the ground set appears in an $\ell$-element circuit and an $\ell$-element cocircuit; for brevity, call these $(t,\ell)$-matroids.

For example, wheels and whirls are (1,3)-matroids: every element in a wheel or whirl appears in a triangle (a 3-element circuit) and a triad (a 3-element cocircuit).  Excluding the rank-2 wheel, these matroids are 3-connected, and, due to the triangles and triads, deleting or contracting any single element results in a matroid that is no longer 3-connected.  Tutte’s Wheels-and-Whirls Theorem states that these are in fact the only 3-connected matroids with no single-element deletion or contraction preserving 3-connectedness.

More generally, one reason why someone might be interested in $(t,\ell)$ matroids is that they would appear as exceptional matroids in chain theorems (results like the Wheels-and-Whirls theorem). For example, any 4-connected (1,4)-matroid has no single-element deletion or contraction that is 4-connected (due to the 4-element circuits and cocircuits), and any 3-connected (2,4)-matroid has no pair of elements whose deletion or contraction remains 3-connected (here we are allowed only to delete both elements, or contract both elements). These may or may not be the only matroids with this property, but they provide a starting point.

(2,4)-matroids, a.k.a. spikes

So what can we say about (2,4)-matroids? Joel Miller [Miller2014] showed the following:

Let $M$ be a matroid with $|E(M)| \ge 13$.  Then $M$ is a (2,4)-matroid if and only if $M$ is a spike.

One way of defining a spike (useful for the purposes of this post) is as a matroid with a partition into pairs $(X_1,X_2,\ldots,X_t)$, for some $t \ge 3$, such that for all distinct $i,j \in [t]$, $X_i \cup X_j$ is both a circuit and a cocircuit.  Note that all “spikes” in this post are what are sometimes referred to as tipless spikes.

Miller also showed that the bound of 13 is tight, and described all matroids with the (2,4)-property when $|E(M)| \le 12$.

As I mentioned earlier, since spikes are (2,4)-matroids, they have no pair of elements whose deletion or contraction remains 3-connected.  In fact, Alan Williams [Williams2015] showed that the only 3-connected matroids having this connectivity property, with $|E(M)| \ge 13$, and no triangles or triads, are spikes.  So in this case, (2,4)-matroids are the only exceptional matroids appearing in a chain theorem for removing a pair of elements from a 3-connected matroid with no triangles or triads, and retaining 3-connectivity (the caveat being the “no triangles or triads” condition: I’ll touch more on this in the section after next).

$(t,2t)$-matroids, a.k.a. $t$-spikes

With Rutger Campbell, Deborah Chun, Kevin Grace, and Geoff Whittle [BCCGW2018], we generalised Miller’s result as follows.

Let $M$ be a matroid, and let $t$ be a positive integer. For each $t$, there exists an $n_t$ such that if $M$ is a matroid with the $(t,2t)$-property and $|E(M)| \ge n_t$, then $M$ has a partition into pairs such that the union of any $t$ pairs is both a circuit and a cocircuit.

We call a matroid a $t$-spike if it has a partition $\pi$ into pairs such that the union of any $t$ pairs is both a circuit and a cocircuit.

The infinite family of $t$-spikes is a natural class of $(t,\ell)$-matroids to consider: we also showed there are only finitely many $(t,\ell)$-matroids for $\ell < 2t$.  Note that spikes are 2-spikes, and it is not hard to show that 1-spikes are matroids obtained by taking direct sums of $U_{1,2}$.  $t$-spikes share some well-known properties of spikes: A $t$-spike $M$ with $r$ legs has rank $r$ (where a leg is one of the pairs in the partition $\pi$), and, when $r$ is sufficiently large, $M$ is $(2t-1)$-connected.  Moreover, the partition $\pi$ associated with a $t$-spike naturally gives rise to crossing $(2t-1)$-separations (for those familiar with flowers, an appropriate concatenation of $\pi$ is a $(2t-1)$-anemone, following the terminology of [AO2008]).

A $(t+1)$-spike $M_2$ can be obtained from a $t$-spike $M_1$ (with sufficiently many legs), by the following construction.  Recall that $M_1’$ is an elementary quotient of $M_1$ if there is some single-element extension $M_1^+$ of $M_1$ by an element $e$ such that $M_1^+/e = M_1’$.  First, take an elementary quotient of the $t$-spike $M_1$ such that none of the $2t$-element cocircuits (from the union of $t$ legs) are preserved. That is, extend $M_1$ by an element $e$ in such a way that the extension does not preserve any of the $2t$-element cocircuits, and then contract $e$. We then repeat this process in the dual: this corresponds to taking an elementary lift such that none of the $2t$-element circuits are preserved. The resulting matroid is a $(t+1)$-spike.  Note that one option for the quotient is to simply truncate (i.e. take a free extension by $e$, and then contract $e$) but there may be others.
For the purposes of this post, I’ll refer to this operation as an inflation of a $t$-spike.  We showed, in [BCCGW2018], that for $t \ge 1$, any $(t+1)$-spike with $r$ legs can be obtained from a $t$-spike with $r$ legs, by an inflation.

Spikes are ubiquitous in matroid theory; perhaps $t$-spikes may also be an interesting family of matroids.


Recall that spikes (i.e. (2,4)-matroids) are the only 3-connected triangle-and-triad-free matroids with no pair of elements whose deletion or contraction preserves 3-connectivity, when we restrict our attention to matroids on at least 13 elements.  What if we want to remove the “triangle-and-triad-free” condition; what additional structures arise? (*)
Certainly wheels and whirls (i.e. (1,3)-matroids) for one, but this is not all.  Another example is any matroid where every pair of elements is in a 4-element circuit, and every element is in a triad.

Say that $M$ is a $(t_1,\ell_1,t_2,\ell_2)$-matroid if every $t_1$-element set is in an $\ell_1$-element circuit, and every $t_2$-element set is in an $\ell_2$-element cocircuit (the $(t,\ell)$-matroids considered earlier have $t_1=t_2$ and $\ell_1=\ell_2$).  James Oxley, Simon Pfeil, Charles Semple and Geoff Whittle [OPSW2018] showed the following:

For $k=3$ or $k=4$, a $k$-connected matroid with $|E(M)| \ge k^2$ is a $(2,4,1,k)$-matroid if and only if $M \cong M(K_{k,n})$ for some $n \ge k$.

So $M(K_{3,n})$ and $M^*(K_{3,n})$ are answers to (*). But there are other structures that arise that don’t fit the $(t_1,\ell_1,t_2,\ell_2)$-matroid framework, that I won’t go into (for more details, see [BWW2018, Conjecture 7.5]; a conjectured answer to (*)).

Apart from the [OPSW2018] result and the case where $t_1 = t_2$ and $\ell_1 = \ell_2$, these $(t_1,\ell_1,t_2,\ell_2)$-matroids have had little attention, as far as I know.  We conjecture the following in [BCCGW2018]:

Any sufficiently large $(t_1,2t_1,t_2,2t_2)$-matroid has a partition into pairs such that the union of any $t_1$ pairs is a circuit, and the union of any $t_2$ pairs is a cocircuit.

$t$-cyclic matroids

If the removing-sets-of-size-$t$-style chain theorems are a bit far-fetched for your taste, I’ll now attempt to return to more traditional single-element deletion/contractions, in a slightly roundabout way.

It seems that obtaining a single-element chain theorem for 4-connectivity in the style of Tutte’s Wheels-and-Whirls Theorem has its difficulties (to put it lightly) — see, for example, [CMO2011] for internally 4-connected binary matroids.

Even if we just consider 4-connected (1,4)-matroids, which we know are matroids with no single-element deletion or contraction that preserves 4-connectedness, this seems like a potentially wild class: it includes $M(K_{4,n})$ and $M^*(K_{4,n})$ for any $n \ge 4$; cycle matroids of grids; or, more generally, take the cycle matroid of any 4-connected 4-regular graph with no 3-cycles, but every edge is in a 4-cycle.

Recall the inflation operation, which we used to obtain a $(t+1)$-spike from a $t$-spike. Using essentially the same operation, we see that (1,6)-matroids are at least as wild as (1,4)-matroids.  (I say “essentially the same” here because now we require that the elementary quotient/lift does not preserve the $2t$-element circuits/cocircuits corresponding to consecutive elements in the cyclic ordering.)  So any horrors from (1,4)-matroids extend to (1,2t)-matroids for integers $t > 2$.  I still reserve some small amount of hope for $(1,2t+1)$-matroids, for $t \ge 2$.  But, in general, characterising $(1,t)$-matroids seems difficult, so let’s first look at doing something easier.

Wheels and whirls (that is, (1,3)-matroids) also have the property that there is a cyclic ordering on the elements such that every pair of consecutive elements in the ordering is contained in a triangle, and contained in a triad.

We say that a matroid has the cyclic $(t-1,t)$-property if there is an ordering $\sigma$ of the ground set such that every set of $t-1$ consecutive elements in $\sigma$ is contained in a $t$-element circuit and a $t$-element cocircuit.

So wheels and whirls have the cyclic (2,3)-property.  Note also that swirls and spikes (i.e. 2-spikes) have the (3,4)-cyclic property.  In fact, $t$-spikes have the $(2t-1,2t)$-cyclic property.

Together with Deborah Chun, Tara Fife, and Charles Semple, we proved a characterisation of matroids with the $(t-1,t)$-cyclic property [BCFS2018].  Before I state this, let me give some intuition.  Essentially, the result shows that one can think of wheels and whirls as a canonical example of matroids with the $(t-1,t)$-property when t is odd; and swirls as a canonical example when $t$ is even — at least, with regards to how the 3- or 4-element circuits/cocircuits appear in either case.  These matroids have not only an ordering that certifies they are $(t-1,t)$-cyclic, but an ordering with a stronger property: for whirls and whirls, any set of $t$ consecutive elements in the ordering is either a (coindependent) circuit or (independent) cocircuit, and these alternate; or for swirls, each set of $t$ consecutive elements in the ordering alternates between being both a circuit and a cocircuit, and being independent and coindependent.

We say that a matroid $M$ is $t$-cyclic if there is an ordering $(e_1,e_2,\ldots,e_n)$ of $E(M)$ such that, when $t$ is odd, each set of $t$-consecutive elements $\{e_i,\ldots,e_{i+t-1}\}$ is a (coindependent) circuit when $i$ is odd, and a (independent) cocircuit when $i$ is even; and when $t$ is even, each set of $t$-consecutive elements $\{e_i,\ldots,e_{i+t-1}\}$ is a circuit and a cocircuit when $i$ is odd (and is independent and coindependent when $i$ is even).  (Indices are interpreted modulo n.)

Theorem [BCFS2018]:
Let $M$ be a matroid with the $(t-1,t)$-property, where $t \ge 3$ and $n \ge 6t-10$. Then $M$ is $t$-cyclic.

A $t$-cyclic matroid with rank $r$ has $2r$ elements, and $t$-cyclic matroids have crossing $t$- or $(t-1)$-separations (when $t$ is odd or even respectively) that can be described in terms of flowers. (For those familiar with flowers: when $t$ is odd, these are daisies; when $t$ is even it is possible, depending on the matroid, to have either daisies or anemones.)  One interesting thing to observe is the effect of the parity of $t$.

We can use the construction referred to as inflation to obtain $(t+2)$-cyclic matroids from $t$-cyclic matroids. Maybe we can get all $t$-cyclic matroids this way:

Conjecture [BCFS2018]:
Let M be a $t$-cyclic matroid for some $t \ge 2$.
If $t$ is even, then M can be obtained from a spike or a swirl by a sequence of inflations.
If $t$ is odd, then M can be obtained from a wheel or a whirl by a sequence of inflations.

I would be surprised if the odd $t$ case of this conjecture does not hold; I am a bit less confident about the case where $t$ is even.

If you’ve made it this far in the post, the reward is a potentially foolhardy conjecture or two.

As touched on earlier, I think perhaps there is some hope for a “nice” characterisation of $(1,t)$-matroids for odd $t \ge 5$.  Here is an optimistic conjecture:

Let $t$ be an odd integer, with $t \ge 3$.  There exists an $n_t$ such that whenever $|E(M)| \ge n_t$, $M$ is $t$-cyclic if and only if $M$ is a $(1,t)$-matroid.

In fact, I’m not even aware of sporadic examples.

For odd t, does there exist a matroid $M$ where every element is in a $t$-circuit and $t$-cocircuit, but $M$ is not $t$-cyclic?


[AO2008] J. Aikin, J. Oxley. The structure of crossing separations in matroids. Adv. in Appl. Math. 41 (2008), 10-26.
[BCCGW2018] N. Brettell, R. Campbell, D. Chun, K. Grace, G. Whittle. On a generalisation of spikes. arXiv:1804.06959.
[BCFS2018] N. Brettell, D. Chun, T. Fife, C. Semple. Matroids with a cyclic arrangement of circuits and cocircuits. arXiv:1806.03625.
[BWW2018] N. Brettell, G. Whittle, A. Williams.  N-detachable pairs in 3-connected matroids III: the theorem. arXiv:1804.06588.
[CMO2011] C. Chun, D. Mayhew, J. Oxley. A chain theorem for internally 4-connected binary matroids. J. Combin. Theory Ser. B 101 (2011), 141-189.
[Miller2014] J. Miller. Matroids in which every pair of elements belongs to a 4-circuit and a 4-cocircuit. M.Sc. thesis, Victoria University of Wellington, 2014.
[OPSW2018] J. Oxley, S. Pfeil, C. Semple, G. Whittle. Matroids with many small circuits and cocircuits. Submitted.
[Williams2015] A. Williams. Detachable pairs in 3-connected matroids. Ph.D. thesis, Victoria University of Wellington, 2015.

The infinite matroid intersection conjecture

Today we’ll return to our examination of infinite matroids. So far we saw why they are defined the way they are and what the known examples look like. Then we examined a very flexible way of building infinite matroids from trees of finite matroids and saw how to use that construction as a tool in topological infinite graph theory.

The aim today is to understand the deepest and most important unproved conjecture about infinite matroids, the infinite matroid intersection conjecture. We won’t be looking at progress towards the conjecture today, just approaching the statement from a number of angles and getting a sense of its connections to various very different-looking problems in infinite combinatorics. I hope that by the end of the post you will be convinced, as I am, that it is a deep and compelling problem. Here it is:

Conjecture (Nash-Williams): Let $M$ and $N$ be (possibly infinite) matroids on the same ground set $E$. Then there are a subset $I$ of $E$ and a partition of $E$ into sets $P$ and $Q$ such that $I$ is independent in both $M$ and $N$, $I \cap P$ spans $P$ in $M$ and $I \cap Q$ spans $Q$ in $N$.

Like a TARDIS, at a first glance this statement seems simple and perhaps a little odd, and its deeper significance is hidden. To get a sense of that significance, we must go on a long journey and see how it materialises within apparently widely separated worlds.

Our journey begins with the observation that finding good infinite versions of theorems about finite combinatorial objects is hard. All too often, the obvious generalisation is either straightforwardly false or else is a simple consequence of the finite version of the theorem, and as such has no new content.

An example of the latter phenomenon is Menger’s Theorem. If $G$ is a graph and $A$ and $B$ are sets then an $A$-$B$ separator in $G$ is defined to be a set $S$ of vertices of $G$ such that there is no path from $A$ to $B$ in $G – S$. Menger’s theorem states that if $G$ is finite then the minimal size of an $A$-$B$ separator in $G$ is the same as the maximal size of a set of disjoint paths from $A$ to $B$ in $G$.

The obvious way to generalise this statement to infinite graphs would be to simply replace the word ‘size’ with the word ‘cardinality’ in both places where it appears. However, the statement obtained in this way has no more content than the finite version of the theorem. We can see this by considering an $A$-$B$ separator $S$ of minimal cardinality.

If $S$ is infinite, then any set of fewer than $|S|$ paths from $A$ to $B$ uses fewer than $|S|$ vertices, and so cannot be maximal. So in that case the statement is clear, and we can suppose instead that $|S|$ is some natural number $n$. Now for each $m \leq n$ we can easily build a finite subgraph $G_m$ of $G$ in which any $A$-$B$ separator has size at least $m$: we may take $G_0$ to be empty, and build $G_{m+1}$ from $G_m$ by adding a path $P_X$ of $G$ from $A$ to $B$ avoiding each set $X$ of $m$ vertices of $G_m$. Then by Menger’s theorem $G_n$ already contains $n$ disjoint paths from $A$ to $B$.

It was Paul Erdős who saw how to get a much deeper infinite generalisation by first reformulating Menger’s theorem as a structural statement. Suppose that we consider an $A$-$B$ separator $S$ of minimal size and a set $\cal P$ of disjoint paths from $A$ to $B$ of maximal size. Then each path in $\cal P$ contains at least one vertex in $S$, and these vertices must all be distinct since the paths are disjoint. But by Menger’s theorem there can only be as may paths in $\cal P$ as there are vertices in $S$. So $S$ must consist of one vertex on each path in $\cal P$.

So it follows from Menger’s theorem that in a finite graph $G$ we can always find a set $\cal P$ of disjoint $A$-$B$ paths together with an $A$-$B$ separator $S$ consisting of one vertex from each path in $\cal P$. On the other hand, this structural statement also implies Menger’s theorem. After all, if ${\cal P}’$ is a set of disjoint paths from $A$ to $B$ of maximal size and $S’$ is an $A$-$B$ separator of minimal size then $|S’| \leq |S| = |{\cal P}| \leq |{\cal P}’|$. But also $|{\cal P}’| \leq |S’|$ since each path in ${\cal P}’$ must contain a different point of $S’$. So $|{\cal P}’| = |S’|$, as desired.

Erdős’ generalisation of Menger’s theorem is therefore the following structural statement:

Theorem (Aharoni and Berger): Let $G$ be a (possibly infinite) graph and let $A$ and $B$ be sets. Then there is a set ${\cal P}$ of disjoint $A$-$B$ paths together with an $A$-$B$ separator $S$ consisting of one vertex from each path in ${\cal P}$.

This statement contains some serious content about the structure of infinite graphs, and it remained open for almost half a century before finally being proved by Aharoni and Berger in 2009 [AB09]. Their proof remains one of the deepest ever given in infinite combinatorics.

Another example of the difficulties of generalisation from finite to infinite objects is given by the tree packing and covering theorems. The tree covering theorem states that a connected graph $G$ is a union of $k$ spanning trees if and only if for any set $X$ of vertices of $G$ the induced subgraph $G[X]$ has at most $k(|X| – 1)$ edges, and the tree packing theorem states that a connected graph $G$ includes $k$ edge-disjoint spanning trees if and only if for any partition $P$ of the vertex set of $G$, the quotient graph $G/P$ has at least $k(|P|-1)$ edges. Here $G/P$ is the graph whose vertices are the partition classes and whose edges are those of $G$ which go between partition classes, with endpoints the partition classes which they join.

Once more, the obvious generalisation of the tree covering theorem to infinite graphs has no more content than the finite version of the theorem; it can be proved from it by a straightforward compactness argument. On the other hand the obvious generalisation of the tree packing theorem to infinite graphs is false; there is a counterexample due to Aharoni and Thomassen [AT89]. And once more, to find the correct infinite version of the theorems we must begin by finding a structural version in the finite case. Indeed, it turns out that the tree packing and covering theorems have a unifying structural generalisation:

Theorem ([D17, Theorem 2.4.4]): Let $G$ be a connected finite graph and $k$ a natural number. Then there is a partition $P$ of the vertex set of $G$ such that $G/P$ is a union of $k$ spanning trees and $G[X]$ is connected and has $k$ edge-disjoint spanning trees for each partition class $X$ of $P$.

This tree packing/covering theorem implies both the tree packing theorem and the tree covering theorem. For tree packing, the necessity of the condition is clear, so it suffices to prove sufficiency. We can do this by applying the condition to the partition $P$ given by the tree packing/covering theorem. This gives that $G/P$ has at least $k(|P|-1)$ edges. Since it is a union of $k$ spanning trees, those trees must be edge disjoint. Combining these with the edge-disjoint spanning trees in each $G[X]$ gives $k$ edge-disjoint spanning trees in $G$. The derivation of the tree covering theorem from the packing/covering theorem is similar.

This gives us a nontrivial common generalisation of the tree packing and covering theorems to infinite graphs: we can simply omit the word ‘finite’ from the tree packing/covering theorem. The proof of this generalisation, though much simpler than that for the infinite version of Menger’s theorem, goes beyond the scope of this post.

We have now seen two examples where, to find the correct infinite generalisation of a theorem about finite graphs, it was necessary to first reformulate the finite theorem as a structural result. The same is true for theorems about finite matroids, but in this case something remarkable happens. The infinite structural statement you get is usually just the infinite matroid intersection conjecture!

This is not too surprising for the matroid intersection theorem, since Nash-Williams formulated the intersection conjecture to be an infinite structural generalisation of that statement. Recall that the matroid intersection theorem states that the largest size of a common independent set of two matroids $M$ and $N$ on the same ground set $E$ is the same as the minimum value over all partitions of $E$ into sets $P$ and $Q$ of $r_M(P) + r_N(Q)$. The inequality one way around is clear, since if $I$ is independent in both $M$ and $N$ and $\{P, Q\}$ is a partition of $E$ then $|I| = |I \cap P| + |I \cap Q| \leq r_M(P) + r_N(Q)$. For this inequality to be an equality, we must have that $I \cap P$ spans $P$ in $M$ and $I \cap Q$ spans $Q$ in $N$, just as in the conjecture.

There are some less expected cases. Let’s consider Tutte’s linking theorem, the closest matroidal analogue of Menger’s theorem. Let $M$ be a finite matroid with ground set $E$, and let $A$ and $B$ be disjoint subsets of $E$. Let $E’ := E \setminus (A \cup B)$. Then the connectivity $\lambda_M(A, B)$ from $A$ to $B$ in $M$ is defined to be the minimal value of $\kappa_M(A \cup P)$ over all bipartitions of $E’$ into sets $P$ and $Q$. Here $\kappa_M$ is the connectivity function of $M$, given by $\kappa_M(X) := r_M(X) + r_M(E \setminus X) – r(M)$. The linking theorem states that the maximum value of $\kappa_N(A)$ over all minors $N$ of $M$ with ground set $A \cup B$ is $\lambda_M(A,B)$.

It turns out that there is a structural analogue of this statement. Each such minor $N$ must have the form $M/I\backslash J$, where $I$ and $J$ form a partition of $E’$. By moving loops of $M/I$ into $J$ if necessary, we may suppose that $I$ is independent. We may now calculate as follows:

$\kappa_{M/I \setminus J}(A) = (r(A \cup I) – |I|) + (r(B \cup I) – |I|) – (r(M) – |I|) \\= (r(A \cup I) – |Q \cap I|) + (r(B \cup I) – |P \cap I|) – r(M) \\ \leq r(A \cup (I \cap P)) + r(B \cup (I \cap Q)) – r(M)  \\ \leq r(A \cup P) + r(B \cup Q) – r(M) \\ = \kappa_M(A \cup P)$

So equality of the left and right sides is equivalent to the statement that each inequality in the above calculation is an equality, giving the following four conditions:

  1. $I \cap P$ spans $P$ in $M/A$
  2. $I \cap Q$ spans $Q$ in $M/B$
  3. $I \cap P$ is independent in $M/(B \cup (I \cap Q))$
  4. $I \cap Q$ is independent in $M/(A \cup (I \cap P))$

The outlines of our TARDIS are beginning to materialise. Indeed, consider a minimal set $I$ satisfying these conditions. By minimality, $I \cap P$ will be independent in $M/A$ and $I \cap Q$ will be independent in $M/B$. Thus $I$ itself will be independent in both matroids. To put it another way, $I$, $P$ and $Q$ will witness that $M/A \backslash B$ and $M \backslash A/B$ satisfy the matroid intersection conjecture.

Thus the infinite generalisation of Tutte’s linking theorem is the statement that, for any (possibly infinite) matroid $M$ and any disjoint sets $A$ and $B$ of elements of $M$, the matroids $M/A\backslash B$ and $M \backslash A/B$ satisfy the infinite matroid intersection conjecture. Given this connection, it should not be too surprising that Aharoni and Berger’s infinite generalisation of Menger’s theorem follows from the infinite matroid intersection conjecture. Precise details of the derivation can be found in [ACF18].

What about the tree packing and covering theorems? Their matroidal analogues are the base packing and covering theorems, which in their full generality apply to a list $M_1, M_2, \ldots M_k$ of finite matroids on a common ground set $E$. A base packing for such a list is a collection of disjoint bases, one from each $M_i$. A base covering for such a list is a collection of bases, one from each $M_i$, whose union is the whole of $E$. The base packing theorem states that there is a base packing precisely when for any subset $Q$ of $E$ we have $\sum_{i = 1}^k r(M_i.Q) \leq |Q|$, and the base covering theorem states that there is a base covering precisely when for any subset $P$ of $E$ we have $\sum_{i = 1}^k r(M_i | P) \geq |P|$.

Once more we can combine these statements into a unified structural statement, the base packing/covering theorem, which states that given such a list of finite matroids on $E$ we can find a bipartition of $E$ into sets $P$ and $Q$ such that the matroids $M_1 | P, \ldots M_k | P$ have a packing and the matroids $M_1.Q, \ldots M_k.Q$ have a covering. The derivations of the base packing and covering theorems from this statement are analogous to the derivation of the tree packing theorem from the tree packing/covering theorem above. So the infinite version of the base packing and covering theorems is given by the same statement applied to a family of infinite matroids. We shall call this the base packing/covering conjecture.

Let’s consider the special case $k = 2$ in more detail. The existence of a packing for $M_1 | P$ and $M_2 | P$ is equivalent to the existence of a subset $I_P$ of $P$ such that $I_P$ spans $P$ in $M_1$ and $P \setminus I_P$ spans $P$ in $M_2$. Similarly the existence of a covering for $M_1.Q$ and $M_2.Q$ is equivalent to the existence of a subset $I_Q$ of $Q$ such that $I_Q$ is independent in $M_1/P$ and $Q \setminus I_Q$ is independent in $M_2/P$. Since a set is independent in a matroid precisely when its complement is spanning in the dual matroid, we can rephrase these conditions as follows:

  1. $I_P$ spans $P$ in $M_1$
  2. $I_Q$ spans $Q$ in $M_2^*$
  3. $I_P$ is independent in $M_2^*/Q$
  4. $I_Q$ is independent in $M_1/P$

Once again, as if from nowhere, the TARDIS appears. If we choose $I_P$ and $I_Q$ minimal subject to conditions (i) and (ii) then they will still satisfy conditions (iii) and (iv), which will guarantee that $I:=I_P \cup I_Q$ is independent in both $M_1$ and $M_2^*$, meaning that $I$, $P$ and $Q$ witness that $M_1$ and $M_2^*$ satisfy the infinite matroid intersection conjecture.

The TARDIS not only appears in unexpected places, it is also bigger on the inside than it seems. For example, the remarks in the last couple of paragraphs only apply to pairs of matroids, that is, to lists of length 2. But in fact it is possible to derive the full base packing/covering conjecture from the special case of pairs, and hence from the infinite matroid intersection conjecture. We will see the reasons for this when we look at the structure of the conjecture more closely in the next post in the series. For now we just note the consequence that the tree packing/covering theorem mentioned earlier also follows from the infinite matroid intersection conjecture.

We have seen how the infinite matroid intersection conjecture arises naturally as the infinite structural analogue of the matroid intersection theorem, the linking theorem, and the base packing and covering theorems. The same also holds for the matroid union theorem, which we do not have space to discuss here [BC15]. Thus the process of finding an infinite generalisation of all these statements reveals their unified structural heart. In the next post we will examine that structural heart more closely, looking at just what sort of structure the conjecture gives us, and we will survey the special cases for which the conjecture is already known.


[AB09] R. Aharoni and E. Berger, Menger’s Theorem for Infinite Graphs, Inventiones mathematicae 176(1):1–62 (2009).

[ACF18] E. Aigner-Horev, J. Carmesin and J.-O. Fröhlich, On the Intersection of Infinite Matroids, Discrete Mathematics 341(6):1582-1596 (2018).

[AT89] R. Aharoni and C. Thomassen, Infinite, highly connected digraphs with no two arc-disjoint spanning trees. J. Graph Theory, 13:71–74 (1989).

[BC15] N. Bowler and J. Carmesin, Matroid Intersection, Base Packing and Base Covering for Infinite Matroids, Combinatorica 35(2):153-180 (2015).

[D17] R. Diestel, Graph Theory, 5th edition, Springer-Verlag (2017).