Decomposition-width for matroids

In this post I want to discuss material I have been working on with Daryl Funk, Mike Newman, and Geoff Whittle. In particular, I’m going to discuss a parameter for matroids called decomposition-width. This terminology has been used by Dan Král [Kra12] and Yann Strozecki [Str10, Str11]. We didn’t discover their work until after we had developed our own notion of decomposition-width, so our definition looks quite different from theirs, although it is equivalent. We have chosen to adopt their terminology.

Decomposition-width has a very natural motivation if you are familiar with matroids representable over finite fields, and matroid branch-width. Consider the following geometric illustration of the binary matroid $AG(3,2)$. The ground set has been partitioned into the sets $U$ and $V$. Let $X$ stand for the set of points coloured purple, and let $X’$ stand for the set of orange points. In the left-hand diagram, $V$ can distinguish between $X$ and $X’$. By this I mean that there is a subset $Z\subseteq V$ (we colour the points in $Z$ green) such that $X\cup Z$ is a circuit while $X’\cup Z$ is independent. However, in the right-hand diagram, no subset of $V$ can distinguish $X$ and $X’$ in this way. Geometrically, this is because $X$ and $X’$ span exactly the same subset of the three-point line that lies in the spans of both $U$ and $V$ in the ambient binary space.

In general, let $M$ be a matroid on the ground set $E$, and let $(U,V)$ be a partition of $E$. We define the equivalence relation $\sim_{U}$ on subsets of $U$ by writing $X\sim_{U} X’$ to mean that for every subset $Z$ of $V$, either both $X\cup Z$ and $X’\cup Z$ are independent, or neither is. This is clearly an equivalence relation.
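To make this concrete, here is a small brute-force computation (an illustrative sketch of my own, not taken from any of the cited papers). It builds $AG(3,2)$ from its standard affine $GF(2)$ representation, takes $U$ and $V$ to be two parallel planes (which may or may not match the partition in the figure above), and counts the classes of $\sim_{U}$ by recording, for each $X\subseteq U$, which sets $X\cup Z$ are independent.

```python
from itertools import chain, combinations

# Points of AG(3,2): the eight vectors of GF(2)^3, given affine coordinates
# (1, x, y, z), so that matroid independence is GF(2)-linear independence.
points = [(1, x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]

def gf2_rank(vectors):
    """Rank of a list of GF(2) vectors, by Gaussian elimination on bitmasks."""
    pivots = {}                       # leading-bit position -> reduced row
    for v in vectors:
        m = sum(bit << i for i, bit in enumerate(v))
        while m:
            lead = m.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = m
                break
            m ^= pivots[lead]         # m reduced to 0: in the span of earlier rows
    return len(pivots)

def independent(X):
    return gf2_rank([points[i] for i in X]) == len(X)

def subsets(S):
    S = list(S)
    return list(chain.from_iterable(combinations(S, k) for k in range(len(S) + 1)))

U, V = [0, 1, 2, 3], [4, 5, 6, 7]     # two parallel affine planes

# X ~_U X' iff X∪Z and X'∪Z agree on independence for every Z ⊆ V,
# so this signature determines the equivalence class of X.
def signature(X):
    return tuple(independent(set(X) | set(Z)) for Z in subsets(V))

classes = {}
for X in subsets(U):
    classes.setdefault(signature(X), []).append(X)

lam = (gf2_rank([points[i] for i in U]) + gf2_rank([points[i] for i in V])
       - gf2_rank(points))
print(len(classes), "classes; lambda(U) =", lam)
```

With this partition there are six classes: the empty set and the four singletons fall into a single class, the six pairs split into three classes of two, and the triples and the full plane give one class each. Note that six is at least $\lambda(U)+1=3$, as the proof of Proposition 1 below predicts.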

Now we consider branch-width and decomposition-width. A decomposition of a matroid, $M=(E,\mathcal{I})$, consists of a pair $(T,\varphi)$, where $T$ is a binary tree (by this I mean that every vertex has degree one or three), and $\varphi$ is a bijection from $E$ to the set of leaves of $T$. If $e$ is an edge of $T$ joining vertices $u$ and $v$, then let $U_{e}$ be the set of elements $z\in E$ such that the path in $T$ from $\varphi(z)$ to $u$ does not contain $v$. Define $V_{e}$ symmetrically. We say that $U_{e}$ and $V_{e}$ are displayed by the decomposition. Define $\operatorname{bw}(M;T,\varphi)$ to be the maximum of $r(U_{e})+r(V_{e})-r(M)+1$, where the maximum is taken over all edges $e$. Now I will define $\operatorname{dw}(M;T,\varphi)$ to be the maximum number of equivalence classes under the relation $\sim_{U_{e}}$, where we again take the maximum over all displayed sets $U_{e}$. The branch-width of $M$ is the minimum of $\operatorname{bw}(M;T,\varphi)$, where the minimum is taken over all decompositions $(T,\varphi)$. We define the decomposition-width of $M$ in the same way: as the minimum value taken by $\operatorname{dw}(M;T,\varphi)$. We write $\operatorname{bw}(M)$ and $\operatorname{dw}(M)$ for the branch- and decomposition-widths of $M$.
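As a sanity check on these definitions, the following sketch (my own code; the uniform matroid $U_{2,4}$ is just a convenient small example) computes both widths of $U_{2,4}$ by brute force. With four leaves there is only one cubic tree shape, so a decomposition is determined by the pair of elements displayed by the middle edge.

```python
from itertools import combinations

E = frozenset(range(4))

def rank(X):                          # rank function of U_{2,4}
    return min(len(X), 2)

def indep(X):
    return rank(X) == len(X)

def subsets(S):
    S = sorted(S)
    return [frozenset(c) for k in range(len(S) + 1) for c in combinations(S, k)]

def num_classes(U):                   # number of classes of ~_U
    V = E - U
    sig = lambda X: tuple(indep(X | Z) for Z in subsets(V))
    return len({sig(X) for X in subsets(U)})

def width(U):                         # a displayed set's contribution to bw
    return rank(U) + rank(E - U) - rank(E) + 1

# Every decomposition displays the singletons, their complements, and the
# two sides of the middle edge; three middle partitions are possible.
middle = [frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 3})]

def displayed(P):
    singles = [frozenset({e}) for e in E]
    return singles + [E - s for s in singles] + [P, E - P]

bw = min(max(width(U) for U in displayed(P)) for P in middle)
dw = min(max(num_classes(U) for U in displayed(P)) for P in middle)
print("bw =", bw, "dw =", dw)
```

Both widths come out as $3$, in line with Proposition 1 below.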

The notion of decomposition-width is clearly motivated by matroids over finite fields, but I won’t discuss those applications now. Instead we will continue to look at more abstract properties of decomposition-width. Král proved this next result for matroids representable over finite fields.

Proposition 1. Let $M$ be a matroid. Then $\operatorname{dw}(M)\geq \operatorname{bw}(M)$.

Proof. Let $E$ be the ground set of $M$, and let $U$ be a subset of $E$. Recall that $\lambda(U)$ is $r(U)+r(E-U)-r(M)$. We will start by proving that $\sim_{U}$ has at least $\lambda(U)+1$ equivalence classes. Define $V$ to be $E-U$. Let $B_{V}$ be a basis of $M|V$, and let $B$ be a basis of $M$ that contains $B_{V}$. Then $B\cap U$ is independent in $M|U$, and
\begin{align*}
r(U)-|B\cap U| &=r(U)-(|B|-|B_{V}|)\\
&=r(U)-(r(M)-r(V))\\
&=r(U)-(r(U)-\lambda(U))\\
&=\lambda(U).
\end{align*}
Therefore we let $(B\cap U)\cup\{x_{1},\ldots, x_{\lambda(U)}\}$ be a basis of $M|U$, where $x_{1},\ldots, x_{\lambda(U)}$ are distinct elements of $U-B$. Next we construct a sequence of distinct elements, $y_{1},\ldots, y_{\lambda(U)}$ from $B_{V}$ such that $(B-\{y_{1},\ldots, y_{i}\})\cup\{x_{1},\ldots, x_{i}\}$ is a basis of $M$ for each $i\in\{0,\ldots, \lambda(U)\}$. We do this recursively. Suppose that $y_{1},\ldots, y_{i}$ have been chosen. Let $C$ be the unique circuit contained in\[(B-\{y_{1},\ldots, y_{i}\})\cup\{x_{1},\ldots, x_{i}\}\cup x_{i+1}\] and note that $x_{i+1}$ is in $C$. If $C$ contains no elements of $B_{V}$, then it is contained in $(B\cap U)\cup\{x_{1},\ldots, x_{\lambda(U)}\}$, which is impossible, as that set is independent. So we simply let $y_{i+1}$ be an arbitrary element in $C\cap B_{V}$.

We complete the proof of the claim by showing that $(B\cap U)\cup\{x_{1},\ldots,x_{i}\}$ and $(B\cap U)\cup\{x_{1},\ldots, x_{j}\}$ are inequivalent under $\sim_{U}$ whenever $i< j$. Indeed, if $Z=B_{V}-\{y_{1},\ldots, y_{i}\}$, then $(B\cap U)\cup\{x_{1},\ldots, x_{i}\}\cup Z$ is a basis of $M$, and is properly contained in $(B\cap U)\cup\{x_{1},\ldots, x_{j}\}\cup Z$, so the last set is dependent, and we are done. Now assume for a contradiction that $\operatorname{bw}(M)>\operatorname{dw}(M)$. Let $(T,\varphi)$ be a decomposition of $M$ such that if $U$ is any set displayed by an edge of $T$, then $\sim_{U}$ has at most $\operatorname{dw}(M)$ equivalence classes. There is some edge $e$ of $T$ displaying a set $U_{e}$ such that $\lambda(U_{e})+1>\operatorname{dw}(M)$, for otherwise this decomposition of $M$ certifies that $\operatorname{bw}(M)\leq \operatorname{dw}(M)$. But $\sim_{U_{e}}$ has at least $\lambda(U_{e})+1$ equivalence classes by the previous claim. As $\lambda(U_{e})+1>\operatorname{dw}(M)$, this contradicts our choice of $(T,\varphi)$. $\square$

Král states the next result without proof.

Proposition 2. Let $x$ be an element of the matroid $M$. Then $\operatorname{dw}(M\backslash x) \leq \operatorname{dw}(M)$ and
$\operatorname{dw}(M/x) \leq \operatorname{dw}(M)$.

Proof. Let $(T,\varphi)$ be a decomposition of $M$ and assume that whenever $U$ is a displayed set, then $\sim_{U}$ has no more than $\operatorname{dw}(M)$ equivalence classes. Let $T’$ be the tree obtained from $T$ by deleting the leaf $\varphi(x)$ and then contracting an edge so that every vertex in $T’$ has degree one or three. Let $U$ be any subset of $E(M)-x$ displayed by $T’$. Then either $U$ or $U\cup x$ is displayed by $T$. Let $M’$ be either $M\backslash x$ or $M/x$. We will show that in $M’$, the number of equivalence classes under $\sim_{U}$ is no greater than the number of classes under $\sim_{U}$ or $\sim_{U\cup x}$ in $M$. Let $X$ and $X’$ be representatives of distinct classes under $\sim_{U}$ in $M’$. We will be done if we can show that these representatives correspond to distinct classes in $M$. By relabelling $X$ and $X’$ if necessary, we can assume that $Z$ is a subset of $E(M)-(U\cup x)$ such that $X\cup Z$ is independent in $M’$, but $X’\cup Z$ is dependent. If $M’=M\backslash x$, then $X\cup Z$ is independent in $M$ and $X’\cup Z$ is dependent, and thus we are done. So we assume that $M’=M/x$. (If $x$ is a loop, then $M/x=M\backslash x$, and the previous case applies, so we can also assume that $x$ is not a loop.) If $U$ is displayed by $T$, then we observe that $X\cup (Z\cup x)$ is independent in $M$, while $X’\cup (Z\cup x)$ is dependent. On the other hand, if $U\cup x$ is displayed, then $(X\cup x)\cup Z$ is independent in $M$ and $(X’\cup x)\cup Z$ is dependent. $\square$

When we combine Propositions 1 and 2, we see that the class of matroids with decomposition-width at most $k$ is a minor-closed subclass of the matroids with branch-width at most $k$. The class of matroids with branch-width at most $k$ has finitely many excluded minors [GGRW03]. In contrast to this, Mike and I convinced ourselves that there are classes of the form $\{M\colon \operatorname{dw}(M) \leq k\}$ with infinitely many excluded minors. I guess we’d had a couple of beers by that point, but I think our argument holds up. I’ll eventually add that argument to this post. If anyone presents a proof in the comments before I do then I will buy them a drink at the next SIGMA meeting.

References

[GGRW03] J. F. Geelen, A. M. H. Gerards, N. Robertson, and G. P. Whittle. On the excluded minors for the matroids of branch-width $k$. J. Combin. Theory Ser. B 88 (2003), no. 2, 261–265.

[Kra12] D. Král. Decomposition width of matroids. Discrete Appl. Math. 160 (2012), no. 6, 913–923.

[Str10] Y. Strozecki. Enumeration complexity and matroid decomposition. Ph.D. thesis, Université Paris Diderot (2010).

[Str11] Y. Strozecki. Monadic second-order model-checking on decomposable matroids. Discrete Appl. Math. 159 (2011), no. 10, 1022–1039.

Clutters III

A long time ago I started a series of posts about clutters. The most recent post followed the textbook by Gérard Cornuéjols in defining several important classes, and introduced a Venn diagram showing the relationships between them.

[Figure: Venn diagram of the classes of clutters.]

In this post we will spend a little more time discussing this diagram.

We let $H=(S,\mathcal{A})$ be a clutter. This means that $S$ is a finite set, and the members of $\mathcal{A}$ are subsets of $S$, none of which is properly contained in another. We let $M$ stand for the incidence matrix of $H$. This means that the columns of $M$ are labelled by the elements of $S$, and the rows are labelled by members of $\mathcal{A}$, where an entry of $M$ is one if and only if the corresponding element of $S$ is contained in the corresponding member of $\mathcal{A}$. Any entry of $M$ that is not one is zero. Let $w$ be a vector in $\mathbb{R}^{S}$ with non-negative values. We have two fundamental linear programs:

(1) Find $x\in \mathbb{R}^{S}$ that minimises $w^{T}x$ subject to the constraints $x\geq \mathbf{0}$ and $Mx\geq \mathbf{1}$.

The vectors $\mathbf{0}$ and $\mathbf{1}$ have all entries equal to zero and one, respectively. When we write that real vectors $a$ and $b$ with the same number of entries satisfy $a\geq b$, we mean that each entry of $a$ is at least equal to the corresponding entry in $b$.

(2) Find $y\in\mathbb{R}^{\mathcal{A}}$ that maximises $y^{T}\mathbf{1}$ subject to the constraints $y\geq \mathbf{0}$ and $y^{T}M\leq w$.
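Before going on, it may help to see the two programs in action. The sketch below (my own code, using the triangle clutter as a hypothetical example) brute-forces the integer restrictions of (1) and (2) with $w=\mathbf{1}$; for $0$-$1$ data it is enough to search over $0$-$1$ vectors. Program (1) then asks for a minimum cover, and program (2) for a maximum packing.

```python
from itertools import product

# The triangle clutter: S = {0, 1, 2} and A = all 2-subsets of S.
S = range(3)
A = [{0, 1}, {1, 2}, {0, 2}]
w = (1, 1, 1)

def covers(x):
    """The constraint Mx >= 1: x meets every member of A."""
    return all(any(x[s] for s in a) for a in A)

def packing(y):
    """The constraint y^T M <= w: each element s is used at most w[s] times."""
    return all(sum(y[i] for i, a in enumerate(A) if s in a) <= w[s] for s in S)

# Integer restrictions of (1) and (2), searched over 0/1 vectors.
tau = min(sum(ws * xs for ws, xs in zip(w, x))
          for x in product((0, 1), repeat=len(S)) if covers(x))
nu = max(sum(y) for y in product((0, 1), repeat=len(A)) if packing(y))
print("min cover =", tau, " max packing =", nu)
```

Here the minimum cover has size $2$ while the maximum packing has size $1$, so the integer optima disagree; for clutters with the Max Flow Min Cut property this kind of gap cannot occur.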

Lemma 1. Any clutter with the Max Flow Min Cut property also has the packing property.

Proof. Let $H$ be a clutter with the Max Flow Min Cut property. This means that for any choice of vector $w$ with non-negative integer entries, the programs (1) and (2) both have optimal solutions with integer values.

We will show that $H$ has the packing property. According to the definition in the literature, this means that for any choice of vector $w$ with entries equal to $0$, $1$, or $+\infty$, there are optimal solutions to (1) and (2) with integer entries. I think there is a problem with this definition. Assume that $r$ is a row of $M$, and every member of the support of $r$ receives a weight of $+\infty$ in the vector $w$. Then (2) cannot have an optimal solution. If $y$ is a purported optimal solution, then we can improve it by adding $1$ to the entry of $y$ that corresponds to row $r$. We are instructed that if entry $i$ of $w$ is $+\infty$, then any solution $x$ to (1) must have entry $i$ equal to $0$. Again, we have a problem, for if $r$ is a row of $M$, and the entire support of $r$ is weighted $+\infty$, then (1) has no solution: if $x$ were a solution, then it would be zero in every entry in the support of $r$, meaning that $Mx$ has a zero entry.

The literature is unanimous in saying that $H$ has the packing property if (1) and (2) both have integral optimal solutions for any choice of vector $w$ with entries $0$, $1$, and $+\infty$. As far as I can see, this means that any clutter with $\mathcal{A}$ non-empty does not have the packing property: simply declare $w$ to be the vector with all entries equal to $+\infty$. Then neither (1) nor (2) has an optimal solution at all. I think the way to recover the definition is to say that whenever $w$ with entries $0$, $1$, or $+\infty$ is chosen in such a way that (1) and (2) have solutions, they both have optimal solutions that are integral. This is the definition that I will use here.

After this detour, we return to our task, and assume that $H$ has the Max Flow Min Cut property. Assume that $w$ is a vector with entries equal to $0$, $1$, or $+\infty$, and that (1) and (2) both have solutions. This means that any row in $M$ has a member of its support which is not weighted $+\infty$ by $w$. We obtain the vector $u$ by replacing each $+\infty$ entry in $w$ with an integer that is greater than both $|S|$ and $|\mathcal{A}|$. Now because $H$ has the Max Flow Min Cut property, it follows that there are optimal integral solutions, $x$ and $y$, to (1) and (2) (relative to the vector $u$). We will show that $x$ and $y$ are also optimal solutions to (1) and (2) relative to the vector $w$.

We partition $S$ into $S_{0}$, $S_{1}$, and $S_{+\infty}$ according to whether an element of $S$ receives a weight of $0$, $1$, or $+\infty$ in $w$. We have assumed that no member of $\mathcal{A}$ is contained in $S_{+\infty}$. We note that if $z\in \mathbb{Z}^{S}$ is a vector which is equal to zero for each element of $S_{+\infty}$, and one everywhere else, then $z$ is a solution to (1), by this assumption. Moreover, $w^{T}z=u^{T}z\leq |S|$. Now it follows that $x$ must be zero in every entry in $S_{+\infty}$, for otherwise $u^{T}x>|S|$, and therefore $x$ is not an optimal solution to (1) relative to $u$. Since $x$ is integral and optimal, it follows that we can assume every entry is either one or zero. If $x$ is not an optimal solution to (1) relative to $w$, then we let $z$ be an optimal solution with $w^{T}z < w^{T}x$. But by convention, $z$ must be zero in every entry of $S_{+\infty}$. Therefore $u^{T}z=w^{T}z < w^{T}x=u^{T}x$, and we have a contradiction to the optimality of $x$. Thus $x$ is an optimal solution to (1) relative to the $\{0,1,+\infty\}$-vector $w$.

Now for problem (2). Since $y$ is integral and non-negative, $y^{T}M\leq w$, and every member of $\mathcal{A}$ contains an element of $S_{0}$ or $S_{1}$, it follows that each entry of $y$ must be either one or zero. Let $z$ be any solution of (2) relative to $w$. Exactly the same argument shows that each entry of $z$ is between zero and one. It follows that $z$ is also a solution to (2) relative to $u$, as long as the integers replacing the $+\infty$ entries were chosen to be greater than $|\mathcal{A}|$, since then each entry of $z^{T}M$ is at most $|\mathcal{A}|$. Therefore $z^{T}\mathbf{1}\leq y^{T}\mathbf{1}$, so $y$ is an optimal solution to (2).

We have shown that relative to the vector $w$, both (1) and (2) have optimal solutions that are integral. Hence $H$ has the packing property. $\square$

Lemma 2. A clutter with the packing property packs.

Proof. This one is easy. In order to prove that $H$ packs, we merely need to show that (1) and (2) have optimal integral solutions when $w$ is the vector with all entries equal to one. But this follows immediately from our revised definition of clutters with the packing property. $\square$

The final containment we should show is that clutters with the packing property are ideal. Idealness means that (1) has an optimal integral solution for all vectors $w\in \mathbb{R}^{S}$. This proof is difficult, so I will leave it for a future post. Usually we prove it by using a theorem due to Lehman [Leh].

Theorem (Lehman). The clutter $H$ is ideal if and only if (1) has an optimal integral solution for all choices of vector $w\in\{0,1,+\infty\}^{S}$.

Question. Is there a short proof that clutters with the packing property are ideal? One that does not rely on Lehman’s (quite difficult) theorem?

We will conclude with some examples showing that various containments are proper.

Let $C_{3}^{2}$ and $C_{3}^{2+}$ be the clutters with incidence matrices
\[
\begin{bmatrix}
1&1&0\\
0&1&1\\
1&0&1
\end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix}
1&1&0&1\\
0&1&1&1\\
1&0&1&1
\end{bmatrix}.
\]
Let $Q_{6}$ and $Q_{6}^{+}$ be the clutters with incidence matrices
\[
\begin{bmatrix}
1&1&0&1&0&0\\
1&0&1&0&1&0\\
0&1&1&0&0&1\\
0&0&0&1&1&1
\end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix}
1&1&0&1&0&0&1\\
1&0&1&0&1&0&1\\
0&1&1&0&0&1&1\\
0&0&0&1&1&1&1
\end{bmatrix}.
\]

Exercise. Check that:

  1. $C_{3}^{2}$ is not ideal and does not pack,
  2. $C_{3}^{2+}$ packs, but is not ideal,
  3. $Q_{6}$ is ideal, but does not pack,
  4. $Q_{6}^{+}$ is ideal and packs, but does not have the packing property.
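The packing claims in the exercise (though not the idealness claims, which involve fractional solutions) can be checked by brute force. Here is a sketch of my own, with the ground sets $0$-indexed and the members read off the incidence matrices above; a clutter packs precisely when the minimum cover size $\tau$ equals the maximum packing size $\nu$.

```python
from itertools import product

def tau(n, A):
    """Minimum size of a subset of {0,...,n-1} meeting every member of A."""
    return min(sum(x) for x in product((0, 1), repeat=n)
               if all(any(x[s] for s in a) for a in A))

def nu(n, A):
    """Maximum number of pairwise disjoint members of A."""
    return max(sum(y) for y in product((0, 1), repeat=len(A))
               if all(sum(y[i] for i, a in enumerate(A) if s in a) <= 1
                      for s in range(n)))

# The four clutters, read off from the incidence matrices above (0-indexed).
C32  = (3, [{0, 1}, {1, 2}, {0, 2}])
C32p = (4, [{0, 1, 3}, {1, 2, 3}, {0, 2, 3}])
Q6   = (6, [{0, 1, 3}, {0, 2, 4}, {1, 2, 5}, {3, 4, 5}])
Q6p  = (7, [{0, 1, 3, 6}, {0, 2, 4, 6}, {1, 2, 5, 6}, {3, 4, 5, 6}])

packs = {name: tau(*H) == nu(*H)
         for name, H in [("C32", C32), ("C32+", C32p), ("Q6", Q6), ("Q6+", Q6p)]}
print(packs)
```

The output confirms that $C_{3}^{2}$ and $Q_{6}$ do not pack, while $C_{3}^{2+}$ and $Q_{6}^{+}$ do.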

This leaves one cell in the Venn diagram without a clutter: the clutters with the packing property that do not have the Max Flow Min Cut property. In fact, Conforti and Cornuéjols [CC] have speculated that no such clutter exists.

Conjecture (Conforti and Cornuéjols). A clutter has the packing property if and only if it has the Max Flow Min Cut property.

[CC] M. Conforti and G. Cornuéjols, Clutters that Pack and the Max Flow Min Cut Property: A Conjecture, The Fourth Bellairs Workshop on Combinatorial Optimization, W.R. Pulleyblank and F.B. Shepherd eds. (1993).

[Leh] A. Lehman. On the width-length inequality. Math. Programming 16 (1979), no. 1, 245–259.

Four proofs of a theorem by Vámos

Let $\chi(M)$ be the characteristic set of the matroid $M$; that is, \[\chi(M)=\{\operatorname{char}(\mathbb{F})\colon M\ \text{is representable over the field}\ \mathbb{F}\},\] where $\operatorname{char}(\mathbb{F})$ is the characteristic of $\mathbb{F}$. It is well known that $0\in\chi(M)$ if and only if $\chi(M)$ contains infinitely many primes. One direction is due to Rado [4]. The following theorem contains the other direction.

Theorem 1. If $\chi(M)$ contains infinitely many primes, then it contains zero.

One unusual aspect of Theorem 1 is that there are four proofs of it in the literature. All of them are attractive, and they call upon different (but related) sets of tools: Zorn’s lemma, compactness in first-order logic, ultraproducts, and Gröbner bases. In this post I will review all four proofs.

To start with, let $r$ be $r(M)$, and assume that $E(M)=\{1,\ldots, n\}$. Let $X$ be the set of variables $\{x_{i,j}\}_{1\leq i \leq r,\ 1\leq j \leq n}$, and let $R$ be the polynomial ring $\mathbb{Z}[X]$. We will let $A$ be the $r\times n$ matrix where the entry in row $i$ and column $j$ is $x_{i,j}$. Each $r$-element subset of $E(M)$ corresponds to an $r\times r$ submatrix of $A$, and thereby to a homogeneous polynomial in $R$, namely the determinant of that submatrix. Let $\mathcal{B}$ be the set of polynomials that correspond to bases of $M$, and let $\mathcal{N}$ contain the polynomials that correspond to dependent $r$-element subsets in $M$. Thus a representation of $M$ is a ring homomorphism from $R$ to a field that sends every polynomial in $\mathcal{N}$, but no polynomial in $\mathcal{B}$, to zero.
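As a toy instance of this setup (my own example; none of the four proofs needs it), take $M$ to be the cycle matroid $M(K_{4})$, whose $r\times r$ determinants can be evaluated at a candidate $GF(2)$ representation. The homomorphism “substitute the matrix entries and reduce mod $2$” should send every polynomial in $\mathcal{N}$ to zero and no polynomial in $\mathcal{B}$.

```python
from itertools import combinations

# Columns of a candidate GF(2) representation of M(K_4): six of the seven
# non-zero vectors of GF(2)^3 (a standard choice, used here as sample data).
cols = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
# The dependent 3-sets of M(K_4) (its triangles), as column index sets.
nonbases = [{0, 1, 3}, {0, 2, 4}, {1, 2, 5}, {3, 4, 5}]

def det3(a, b, c):
    """Integer determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

# The matrix represents the matroid iff reducing the determinant polynomials
# mod 2 kills exactly the polynomials in N and none of those in B.
ok = all((det3(*(cols[i] for i in T)) % 2 == 0) == (set(T) in nonbases)
         for T in combinations(range(6), 3))
print("determinants vanish mod 2 exactly on the non-bases:", ok)
```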

The first proof of Theorem 1 is due to Peter Vámos. He published two articles in the proceedings entitled Möbius Algebras, produced by the University of Waterloo in 1971. The proof of Theorem 1 has sometimes been cited as coming from [6], but in fact the correct attribution is to [5]. His proof hinges on the following lemma. Recall that a proper ideal, $P$, is prime if $ab\in P$ implies that $a\in P$ or $b\in P$.

Lemma 1. Let $I$ be an ideal in a commutative ring, $S$, and let $T$ be a multiplicatively closed subset of $S$ that is disjoint from $I$. There exists a prime ideal, $P$, such that $I\subseteq P$, and $P\cap T=\emptyset$.

Proof. We consider the partial order of ideals in $S$ that contain $I$ and that are disjoint from $T$. Assume that $J_{1}\subset J_{2}\subset J_{3}\subset \cdots$ is a chain in the order. It is easy to confirm that the union $J_{1}\cup J_{2}\cup J_{3}\cup \cdots$ is itself an ideal in the order, so every chain has an upper bound, and we can apply Zorn’s Lemma. Therefore the partial order has a maximal element, $P$. It remains to show that $P$ is a prime ideal. Assume otherwise, so that the product $ab$ belongs to $P$, even though $a\notin P$ and $b\notin P$. The maximality of $P$ means that the ideal generated by $P$ and $a$ is not disjoint from $T$, so there is some element, $ca+p$ in $T\cap \langle P, a\rangle$. (Here $p\in P$, and $c$ is an element of $S$.) Similarly, there are elements $q\in P$ and $d\in S$ such that $db+q$ belongs to $T$. Since $T$ is multiplicatively closed, we see that $(ca+p)(db+q)=(cd)(ab)+(ca)q+(db)p+pq$ is in $T$. But this element is also in $P$, since $ab\in P$. Now we have a contradiction to the fact that $P$ and $T$ are disjoint. $\square$

Proof 1. Let $p_{1},p_{2},p_{3},\ldots$ be distinct primes such that $M$ is representable over a field of characteristic $p_{i}$ for each $i$. Let $\phi_{i}$ be a homomorphism from $R$ to a field of characteristic $p_{i}$ that takes all the polynomials in $\mathcal{N}$, and no polynomial in $\mathcal{B}$, to zero. Then the kernel, $\ker(\phi_{i})$, of $\phi_{i}$ is a prime ideal of $R$ containing $\mathcal{N}$, but disjoint from $\mathcal{B}$. Moreover, the integer $p_{i}$ is contained in $\ker(\phi_{i})$. In fact, the only integers in $\ker(\phi_{i})$ are the multiples of $p_{i}$, since if $\ker(\phi_{i})$ contains a non-multiple of $p_{i}$, then it contains the greatest common divisor of that non-multiple and $p_{i}$ (namely $1$). This would imply that $\ker(\phi_{i})$ is equal to $R$, which is impossible.

Assume that there is an element in both the ideal $\langle \mathcal{N}\rangle$ and the multiplicatively closed set generated by $\mathcal{B}$ and the non-zero integers. Such an element is of the form $kf$, where $k$ is a non-zero integer, and $f$ is a product of polynomials from $\mathcal{B}$. Since $kf$ is in $\langle \mathcal{N}\rangle$, it is in each $\ker(\phi_{i})$, even though $f$ is not. Thus $k$ is in each ideal $\ker(\phi_{i})$. Thus $k$ is a non-zero multiple of each of the distinct primes $p_{1},p_{2},p_{3},\ldots$, and we have a contradiction. Now, by Lemma $1$, there is a prime ideal, $P$, of $R$ which contains $\langle \mathcal{N}\rangle$, and which avoids $\mathcal{B}$ and the non-zero integers. Because $P$ is prime, the quotient ring, $R/P$, is an integral domain. Let $F$ be the field of fractions of $R/P$. The natural homomorphism from $R$ to $F$ gives us a representation of $M$. As $P$ does not contain any non-zero integer, it follows that $R/P$ has characteristic zero, and we are done. $\square$

The next proof of Theorem 1 makes use of the Compactness Theorem in first-order logic. It is due to Samuel Wagstaff and appeared two years after Vámos’s proof, in 1973 [7]. Wagstaff does not supply many details — perhaps the account in his PhD thesis (upon which the article is based) is more explicit. But I believe that my version below captures his key ideas.

Proof 2. We construct a first-order language of vector spaces. We have a countable supply of variables, $x_{1},x_{2},x_{3},\ldots$, and unary predicates $\operatorname{Scal}$ and $\operatorname{Vec}$. In our interpretation, these predicates will indicate that a variable stands for a scalar or a vector. We also have constants, $0_{S}$, $1_{S}$, and $0_{V}$, which are interpreted as the identities of the vector space. The functions $+_{S}$, $\times_{S}$, $+_{V}$, and $\times_{V}$, are interpreted as the binary operations of the field of scalars, as addition of vectors, and as scalar multiplication. The only binary relation we need is equality. Now, in this language, we can express the axioms for fields and for vector spaces. To illustrate, the compatibility of scalar and vector multiplication would be expressed with the sentence \begin{multline*}\forall x_{1}\forall x_{2}\forall x_{3} (\operatorname{Scal}(x_{1})\land \operatorname{Scal}(x_{2})\land \operatorname{Vec}(x_{3}) \to\\ ((x_{1}\times_{S}x_{2})\times_{V}x_{3}=x_{1}\times_{V}(x_{2}\times_{V}x_{3}))).\end{multline*} We can assert the representability of $M$ by requiring that there exist $x_{1},x_{2},\ldots, x_{n}$ such that $\operatorname{Vec}(x_{i})$ holds for each $i$. For each subset $X\subseteq E(M)=\{1,\ldots, n\}$, we include a sentence asserting that the corresponding subset of $\{x_{1},x_{2},\ldots, x_{n}\}$ is linearly independent if $X$ is independent in $M$, and otherwise asserting that it is linearly dependent. Let us illustrate how this can be accomplished by writing the formula which asserts that $\{x_{1},x_{2},x_{3}\}$ is linearly independent. Let $y_{1}, y_{2}, y_{3}$ be variables that are not used elsewhere in our sentence. 
The formula we need is as follows: \begin{multline*}\forall y_{1}\forall y_{2}\forall y_{3} (\operatorname{Scal}(y_{1})\land \operatorname{Scal}(y_{2})\land \operatorname{Scal}(y_{3}) \land (y_{1}\ne 0_{S} \lor y_{2}\ne 0_{S} \lor y_{3}\ne 0_{S})\to\\ (y_{1}\times_{V}x_{1}) +_{V} (y_{2}\times_{V} x_{2}) +_{V} (y_{3}\times_{V}x_{3})\ne 0_{V}).\end{multline*} We simply perform this trick for each subset of $E(M)$.

Now we have a finite collection, $\mathcal{T}$, of sentences which can be satisfied if and only if $M$ is representable over a field. For each prime, $p$, let $S_{p}$ be the sentence asserting that the sum of $p$ copies of $1_{S}$ is equal to $0_{S}$. Thus $M$ is representable over a field of characteristic $p$ if and only if $\mathcal{T}\cup\{S_{p}\}$ can be satisfied. Assume that $M$ is not representable over a field with characteristic zero. Then the collection $\mathcal{T}\cup\{\neg S_{p}\colon p\ \text{is a prime}\}$ is inconsistent: it cannot be satisfied by any model. The Compactness Theorem tells us that there is a finite subcollection which is inconsistent. Therefore we can assume that $\mathcal{T}\cup\{\neg S_{p} \colon p\ \text{is a prime and}\ p\leq N\}$ has no model, for some integer $N$. This implies that there can be no representation of $M$ over a field with characteristic greater than $N$, so we have proved the contrapositive of Theorem 1. $\square$

The next proof appeared in 1989, and comes courtesy of Imre Leader [3]. I think it might be my favourite out of the four. It uses the notion of an ultraproduct.

Proof 3. Let $F_{1},F_{2},F_{3},\ldots$ be an infinite sequence of fields such that $M$ is representable over each $F_{i}$ and $\operatorname{char}(F_{i+1})>\operatorname{char}(F_{i})$. Let $\mathcal{U}$ be a non-principal ultrafilter of the positive integers. This means that $\mathcal{U}$ is a collection of subsets of positive integers, and $\mathcal{U}$ satisfies: (1) any superset of a member of $\mathcal{U}$ is also in $\mathcal{U}$; (2) if $U,V\in\mathcal{U}$, then $U\cap V\in \mathcal{U}$; and (3) if $U$ is any subset of $\mathbb{Z}^{+}$, then exactly one of $U$ or $\mathbb{Z}^{+}-U$ is in $\mathcal{U}$. Non-principal ultrafilters cannot be described explicitly, but we know that they exist by an application of Zorn’s Lemma. Being non-principal means that $\mathcal{U}$ contains no finite set; hence if $U$ is a finite subset of $\mathbb{Z}^{+}$, then $\mathbb{Z}^{+}-U$ is in $\mathcal{U}$. We consider sequences $(x_{1},x_{2},x_{3},\ldots)$, where each $x_{i}$ is an element of $F_{i}$. We declare $(x_{1},x_{2},x_{3},\ldots)$ and $(y_{1},y_{2},y_{3},\ldots)$ to be equivalent if $\{i\in\mathbb{Z}^{+}\colon x_{i}=y_{i}\}$ is a member of $\mathcal{U}$. We perform componentwise addition and multiplication on sequences. It is not too difficult to see that these operations respect our notion of equivalence, and that the equivalence classes of sequences form a field, $F$, under componentwise addition and multiplication. Moreover, the characteristic of $F$ is zero: for any prime $p$, the sum of $p$ copies of $1$ is non-zero in all but finitely many of the fields $F_{i}$, and so is non-zero in $F$. We consider a sequence of representations of $M$ over the fields $F_{1},F_{2},F_{3},\ldots$. This leads to a representation of $M$ over $F$ in the obvious way: we simply “stack” the infinite sequence of matrices, and produce a matrix whose entries are infinite sequences of numbers. $\square$

Finally, we come full circle! In 2003, Vámos (along with Rosemary Baines) published another proof of Theorem 1 [2]. The proof is actually an easy consequence of the algorithm that Stefan mentioned in a previous post, and the main ingredient is the use of Gröbner bases. I will tweak the proof slightly, for the sake of simplicity.

Proof 4. Let $f$ be the product of all polynomials in $\mathcal{B}$. We introduce the new, “dummy”, variable, $y$. From now on, we will operate in the ring $\mathbb{Z}[X\cup\{y\}]$. Note that an evaluation of the variables in $X$ can be extended to an evaluation making $yf-1$ zero if and only if it makes each polynomial in $\mathcal{B}$ non-zero. Let $I$ be the ideal of $\mathbb{Z}[X\cup\{y\}]$ that is generated by $\mathcal{N}\cup\{yf-1\}$. Thus a representation of $M$ is really a ring homomorphism from $\mathbb{Z}[X\cup\{y\}]$ to a field that takes every polynomial in $I$ to zero.

Let $G$ be a strong Gröbner basis of $I$ (see [1]). Thus $I$ is the ideal generated by $G$. In addition, we have an ordering on products of variables (the lexicographic order will suffice). The leading term of a polynomial is the highest non-zero term under this ordering. For $G$ to be a strong Gröbner basis, we require that if $h$ is in $I$, then there is a polynomial $g\in G$ such that the leading term of $g$ divides the leading term of $h$. Assume that $G$ contains an integer, $N$. If $\phi$ is a homomorphism from $\mathbb{Z}[X\cup\{y\}]$ to a field, $F$, such that $\ker(\phi)$ contains $I$, then $N$ is taken to zero by $\phi$. This means that the characteristic of $F$ is at most $N$, so $M$ is representable over fields of only finitely many prime characteristics. Therefore we will now assume that $G$ contains no integer.

Let $I_{\mathbb{Q}}$ be the ideal of $\mathbb{Q}[X\cup\{y\}]$ generated by $\mathcal{N}\cup\{yf-1\}$. Assume that $1\in I_{\mathbb{Q}}$. Then $1$ can be expressed as a polynomial combination of the polynomials in $\mathcal{N}\cup\{yf-1\}$. By multiplying this combination through by the denominators of the coefficients, we see that $I$ contains an integer, $N$. Thus $G$ contains a polynomial whose leading term divides the leading term of $N$, which is $N$ itself. But in the lexicographic ordering, a constant polynomial is below all non-constant polynomials, from which we deduce that $G$ contains a constant polynomial, contradicting our conclusion from the previous paragraph. Therefore $1\notin I_{\mathbb{Q}}$, which means that $I_{\mathbb{Q}}$ is contained in a maximal ideal, $P$. Note that $P$ contains no non-zero constant polynomial. Now $\mathbb{Q}[X\cup\{y\}]/P$ is a field, and the natural homomorphism from $\mathbb{Q}[X\cup\{y\}]$ to this field leads to a representation of $M$ over a field of characteristic zero. $\square$

References.

[1] W. W. Adams and P. Loustaunau. An introduction to Gröbner bases. Graduate Studies in Mathematics, 3. American Mathematical Society, Providence, RI, 1994.

[2] R. Baines and P. Vámos. An algorithm to compute the set of characteristics of a system of polynomial equations over the integers. J. Symbolic Comput. 35 (2003), no. 3, 269–279.

[3] I. Leader. A short proof of a theorem of Vámos on matroid representations. Discrete Math. 75 (1989), no. 1-3, 315–317.

[4] R. Rado. Note on independence functions. Proc. London Math. Soc. (3) 7 (1957), 300–320.

[5] P. Vámos. A necessary and sufficient condition for a matroid to be linear. Möbius algebras (Proc. Conf., Univ. Waterloo, Waterloo, Ont., 1971), pp. 162–169. Univ. Waterloo, Waterloo, Ont., 1971.

[6] P. Vámos. Linearity of matroids over division rings. Notes by G. Roulet. Möbius algebras (Proc. Conf., Univ. Waterloo, Waterloo, Ont., 1971), pp. 170–174. Univ. Waterloo, Waterloo, Ont., 1971.

[7] S. S. Wagstaff Jr. Infinite matroids. Trans. Amer. Math. Soc. 175 (1973), 141–153.

Clutters II

The last time I wrote on this subject, I made the case that clutters are of fairly natural interest to matroid theorists, since they are combinatorial objects with a duality involution, along with operations of deletion and contraction that are swapped by duality. They also arise fairly naturally in integer programming. In one formulation, integer programming is the problem of finding an integral vector $x$ that minimises the quantity $w^{T}x$ ($w$ is a fixed vector of weights) subject to the constraints $x\geq \mathbf{0}$ and $Mx\geq b$, where $M$ is a matrix of constraints. (Every entry of $\mathbf{0}$ is zero, and all comparisons of vectors are done entrywise – $x\geq \mathbf{0}$ means that every entry of $x$ is non-negative.)

Many optimisation problems in combinatorics can be expressed as integer programs. In particular, several problems are equivalent to integer programs where $b$ is equal to the vector of all ones, and $M$ contains only zeroes and ones. It’s an easy exercise to show that finding a minimum-size vertex cover in a graph is one of these problems. So is finding a minimum-size $s$-$t$ cut in a network.
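As a quick illustration, here is vertex cover written as one of these $0$–$1$ covering programs. This is a brute-force sketch on a made-up three-vertex path, not an efficient solver, and the helper name is invented for the example:

```python
from itertools import product

# Toy example: the path 0-1-2. Rows of M are edges, columns are
# vertices; an integral x with M x >= 1, x in {0,1}^n, is exactly the
# characteristic vector of a vertex cover.
edges = [(0, 1), (1, 2)]
n = 3
M = [[1 if v in e else 0 for v in range(n)] for e in edges]

def min_cover(M, n):
    """Brute-force a minimum-size 0/1 solution of M x >= 1."""
    best = None
    for x in product((0, 1), repeat=n):
        if all(sum(m * xi for m, xi in zip(row, x)) >= 1 for row in M):
            if best is None or sum(x) < sum(best):
                best = x
    return best

print(min_cover(M, n))  # the middle vertex alone covers both edges
```

Here the weight vector $w$ is all ones, so minimising $w^{T}x$ just counts the chosen vertices.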

In these cases, we may as well assume that no row of $M$ has support properly containing the support of another row: if $r$ and $r’$ are rows of $M$, and $r$ is equal to $1$ in every position where $r’$ is equal to $1$, then we might as well remove the row $r$. Since $x\geq \mathbf{0}$, the constraint given by $r’$ implies the constraint given by $r$, so removing $r$ does not change the solution space of the program at all. In other words, we may as well assume that the rows of $M$ are the characteristic vectors of the edges in a clutter.
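This pruning step is easy to carry out directly. The following sketch (function name invented for the example) keeps only the rows whose supports are minimal:

```python
# Drop every row whose support properly contains the support of another
# row; the survivors are the characteristic vectors of a clutter's edges.
def minimal_rows(rows):
    supports = [frozenset(i for i, v in enumerate(r) if v) for r in rows]
    keep = []
    for i, s in enumerate(supports):
        # keep row i unless some other row's support is strictly inside s
        if not any(j != i and supports[j] < s for j in range(len(supports))):
            keep.append(rows[i])
    return keep

rows = [[1, 1, 0], [1, 1, 1], [0, 0, 1]]
print(minimal_rows(rows))  # the all-ones row is dominated and removed
```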

So clutters arise from combinatorial optimisation problems, via integer programming, and this leads to the following Venn diagram.

[Venn diagram showing the four classes of clutters defined below.]

My aim today is modest: to define these classes of clutters. The illustration, along with essentially all of the content found here, comes from Combinatorial Optimization: Packing and Covering, by Gérard Cornuéjols.

First of all, let $H=(S,\mathcal{A})$ be a clutter, so that $S$ is a finite set, and the members of $\mathcal{A}$ are subsets of $S$, none of which is properly contained in another. Let $M(H)$ be the incidence matrix of $H$. That is, the columns of $M(H)$ are labelled by the elements of $S$, and the rows are labelled by members of $\mathcal{A}$. An entry of $M(H)$ is one if and only if the corresponding element of $S$ is contained in the corresponding member of $\mathcal{A}$, otherwise it is zero.
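For concreteness, here is $M(H)$ computed for a small clutter (the ground set and edges are made up for the example):

```python
# A clutter H = (S, A): S is the ground set, A the edges, none of which
# contains another. Element and edge order fix the column and row order.
S = ['a', 'b', 'c', 'd']
A = [{'a', 'b'}, {'b', 'c'}, {'c', 'd'}]

# Incidence matrix M(H): entry is 1 iff the element lies in the edge.
M = [[1 if s in edge else 0 for s in S] for edge in A]
for row in M:
    print(row)
```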

Now assume that $w$ and $x$ are $|S|\times 1$ vectors with rows labelled by the elements of $S$ (in the same order as the columns of $M(H)$), and that the entries of $w$ are non-negative. The vector $w$ can be seen as assigning a weight to each element of $S$. We will consider the following linear program.

(1) Minimise $w^{T}x$ subject to the constraints $x\geq \mathbf{0}$ and $M(H)x\geq \mathbf{1}$.

(Here $\mathbf{1}$ is the vector of all ones.)

If $x$ is an integer solution to this problem, then there is no harm in assuming that the entries of $x$ are all $0$ or $1$. (If any entry is greater than $1$, then replace it with $1$. Then $M(H)x\geq \mathbf{1}$ will still hold, and $w^{T}x$ will not have increased.) So an integer solution can be seen as a characteristic vector of a subset of $S$. This subset must intersect every member of $\mathcal{A}$ (by virtue of the constraint $M(H)x\geq \mathbf{1}$). Thus we are looking for a minimum-weight hitting set, and (1) can be thought of as the relaxation of the hitting set problem that allows non-integral solutions.
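The gap between (1) and its integer solutions is easy to see on a small example. The sketch below (function name invented) brute-forces the minimum-weight hitting set for the clutter of edges of a triangle; the fractional vector $x=(1/2,1/2,1/2)$ is feasible for (1) with value $3/2$, while every integral solution costs at least $2$:

```python
from itertools import product

# Example: the clutter of edges of a triangle, with unit weights.
S = [0, 1, 2]
A = [{0, 1}, {1, 2}, {0, 2}]
w = [1, 1, 1]

def min_hitting_set(S, A, w):
    """Brute-force a minimum-weight subset of S meeting every edge."""
    best, best_cost = None, float('inf')
    for x in product((0, 1), repeat=len(S)):
        chosen = {s for s, xi in zip(S, x) if xi}
        if all(edge & chosen for edge in A):
            cost = sum(wi * xi for wi, xi in zip(w, x))
            if cost < best_cost:
                best, best_cost = chosen, cost
    return best, best_cost

print(min_hitting_set(S, A, w))  # any two vertices hit all three edges
```

So on this clutter the linear relaxation strictly beats the integer program, which is exactly the kind of behaviour the classes below are designed to rule out.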

Next we consider the dual problem. Here $y$ is an $|\mathcal{A}|\times 1$ vector with entries labelled by the members of $\mathcal{A}$.

(2) Maximise $y^{T}\mathbf{1}$ subject to the constraints $y\geq \mathbf{0}$ and $y^{T}M(H)\leq w$.

In this problem we are thinking of $w$ as a capacity on each element of $S$. The vector $y$ assigns a non-negative flow value to each edge in $\mathcal{A}$, and the constraint $y^{T}M(H)\leq w$ says that the sum of flows over the edges that contain an element $s\in S$ cannot exceed the capacity that $w$ assigns to $s$. A solution to (2) maximises the sum of the flow values.
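Weak duality between (1) and (2) can be checked by hand on the triangle clutter from before (the particular vectors $x$ and $y$ are just illustrative choices): any feasible flow $y$ and feasible hitting vector $x$ satisfy $y^{T}\mathbf{1}\leq w^{T}x$.

```python
# M(H) for the triangle clutter, unit capacities.
M = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
w = [1, 1, 1]
x = [1, 1, 0]        # a hitting set: feasible for (1)
y = [0.5, 0.5, 0.5]  # feasible for (2): each element lies in two edges

# Check feasibility of x for (1): M x >= 1.
assert all(sum(m * xi for m, xi in zip(row, x)) >= 1 for row in M)
# Check feasibility of y for (2): column sums of y^T M are at most w.
col = [sum(y[i] * M[i][j] for i in range(3)) for j in range(3)]
assert all(c <= wi for c, wi in zip(col, w))

print(sum(y), sum(wi * xi for wi, xi in zip(w, x)))  # flow value <= cover weight
```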

Now we can define the four classes in the Venn diagram.

$H$ has the Max Flow Min Cut property if, for every non-negative integer vector $w$, the programs (1) and (2) both have optimal solutions, $x$ and $y$, that are integral.

$H$ has the packing property if, for every vector $w$ whose entries are $0$, $1$, or $+\infty$, the programs (1) and (2) both have optimal solutions that are integral. In program (1), a weight of $+\infty$ means that we try to never include that element of $S$ in our hitting set. A weight of $0$ means that we can include that element for free.

$H$ packs if programs (1) and (2) have optimal solutions that are integral vectors when $w$ is set to be $\mathbf{1}$.
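When $w=\mathbf{1}$, an integral solution to (2) simply selects a collection of pairwise disjoint edges, so $H$ packs exactly when a minimum hitting set is no bigger than a maximum disjoint collection of edges. The brute-force sketch below (function names invented) checks the clutter of triangles of $K_{4}$, where the minimum transversal has size $2$ but no two triangles are disjoint, so this clutter does not pack:

```python
from itertools import product, combinations

# The clutter of triangles of K4: all 3-subsets of {0,1,2,3}.
A = [frozenset(t) for t in combinations(range(4), 3)]

def min_transversal(A, n):
    """Size of a smallest vertex set meeting every edge (w = 1 in (1))."""
    return min(sum(x) for x in product((0, 1), repeat=n)
               if all(any(x[i] for i in e) for e in A))

def max_disjoint(A):
    """Size of a largest pairwise-disjoint collection of edges (w = 1 in (2))."""
    for k in range(len(A), 0, -1):
        for sub in combinations(A, k):
            if all(not (e & f) for e, f in combinations(sub, 2)):
                return k
    return 0

print(min_transversal(A, 4), max_disjoint(A))  # 2 > 1, so H does not pack
```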

$H$ is ideal if, for any choice of vector $w$ with non-negative entries, not necessarily integral, program (1) has an optimal solution that is an integral vector.

Next time I will say more about the relations between these properties. For now, I will just note that, rather unfortunately, there really are clutters that pack, but which do not have the packing property. On the other hand, the shaded area of the Venn diagram is believed to be empty. So it is conjectured that a clutter has the Max Flow Min Cut property if and only if it has the packing property. Cornuéjols offers a bounty of $5000 for a resolution of this conjecture (or one of 17 others) before 2020.