The Matroid Secretary Problem

In this post, I am going to discuss the matroid secretary problem, which is a very nice problem introduced by Babaioff, Immorlica, and Kleinberg [1].  I will try to give an up-to-date account, but am far from an expert in this area, so please feel free to comment if I miss or muddle anything.

Let’s warm up with the classical secretary problem, which has the following setup.  We want to hire a secretary.  There is a set $X$ of $n$ candidates, and each $x \in X$ has a competency $w(x)$.  We know that there are $n$ candidates, but we do not know the competency function.  The secretaries are then presented to us in a random order.  Once a secretary is presented to us, we interview him and discover his competence.  We then have to make an irrevocable decision to hire him or not.  If we hire him on the spot, the process ends.  If not, then due to a hyper-competitive labour market, he will be hired by another firm, and we must move on to the next candidate.

Let OPT be the maximum competency of all secretaries.  Our goal is to devise an algorithm that hires a secretary with competency close to OPT.  Say that an algorithm $\mathcal A$ is $\alpha$-competitive if OPT / $\mathbb{E} (\mathcal A) \leq \alpha$, where $\mathbb{E} (\mathcal A)$ is the expected competency outputted by $\mathcal A$ for a random permutation $\sigma$ of $X$.

Here is a $4$-competitive algorithm for the classical secretary problem.  We reject each of first $\frac{n}{2}$ candidates but we keep track of the highest competency, say $h$, of the first $\frac{n}{2}$ candidates.  We then hire the first candidate with competency at least $h$ (if none exists, we just hire the last secretary).  This algorithm is $4$-competitive because the probability that the most competent secretary is in the second half and the second most competent secretary is in the first half is more than $\frac{1}{4}$.  Interestingly, if we instead reject $\frac{1}{e}$ of the initial candidates, we obtain an $e$-competitive algorithm, which turns out to be best possible over all possible algorithms [2].

The matroid secretary problem has the same setup as above, except that there is a matroid $\mathcal M$ on the underlying set $X$ of secretaries.  We now want to hire a set of secretaries, subject to the condition that our hired set is independent in $\mathcal M$.  Again, there is an unknown weight function $w:X \to \mathbb R_+$, and the secretaries are presented to us in a random order.  After interviewing $x$, we must make an irrevocable decision to add or not add $x$ to our currently constructed independent set $I$.  Again, the goal is to devise an algorithm whose expected output is close to the maximum weight independent set.

Note that we recover the classical secretary problem when $\mathcal M$ is a rank-$1$ uniform matroid.  The generalization to matroids might seem like abstract nonsense, but the matroidal version is actually relevant in the theory of auctions, and has garnered a lot of interest.  The big (and still open) problem is the following conjecture.

Conjecture 1 (Babaioff, Immorlica, and Kleinberg [1]).  For every matroid $\mathcal M$, there is a $O(1)$-competitive algorithm.

Using a clever modification of the threshold price algorithm that we discussed for the classical secretary problem, Babaioff, Immorlica, and Kleinberg [1] proved that there is a $O(\log r)$-competitive algorithm, where $r$ is the rank of $\mathcal M$.  In a breakthrough paper, Chakraborty and Lachish [3] devised a $O(\sqrt{ \log r})$-competitive algorithm.  Their algorithm is a patchwork of a few different algorithms, and is quite complicated.  Finally, the state-of-the-art is a $O( \log \log r)$-competitive algorithm of Lachish [4].  Using different tools, Feldman, Svensson, and Zenklusen [5] have also obtained a $O( \log \log r)$-competitive algorithm for general matroids.

Another line of research is to weaken the problem slightly, in the hope of obtaining a constant-competitive algorithm.  Perhaps the most important modification is known as the random assignment model.  Note that in the original matroid secretary problem, we may as well assume that the weight function is chosen by an adversary.  In the random assignment model, we weaken the adversary, by only allowing her to choose the set of weights.  The weights are then randomly assigned to the set of secretaries.  Soto [6] proved that there is a constant-competitive algorithm for all matroids in the random assignment model.

Theorem 2 (Soto [6]). For each matroid $\mathcal M$, there is a $O(1)$-competitive algorithm in the random assignment model.

Soto’s algorithm uses the classic decomposition of a matroid into its principal minors.  See the recent paper of Fujishige [7] for a survey of this theory.  The principal minors of a matroid are uniformly dense ($\mathcal M$ is uniformly dense if $\max_{A \subseteq E} \frac{|A|}{r(A)}$ is obtained by $E$). Using the decomposition, one can reduce to the case of uniformly dense matroids, and Soto proved that there is indeed a $O(1)$-competitive algorithm for uniformly dense matroids. Note that his proof crucially assumes that we are in the random assignment model.

There are also many other interesting variants of the original matroid secretary problem. See Section 3 of Dinitz [8] for a nice overview.

I will finish by discussing the original matroid secretary problem for restricted classes of matroids.  In [1], it is proved that Conjecture 1 holds for graphic matroids, uniform matroids, partition matroids, and bounded-degree transversal matroids.  The bounded-degree condition for transversal matroids was later removed by Dmitrov and Plaxton [9].

Here is another class of matroids for which Conjecture 1 holds.  Let $\mathcal F$ be a laminar family and $c: \mathcal F \to \mathbb{N}$.  If we let $\mathcal I$ be the set of all $I$ such that $|I \cap F| \leq c(F)$ for all $F \in \mathcal F$, then $\mathcal I$ is the set of independent sets of a matroid.  Matroids arising in this way are called laminar matroids.  Note that the class of laminar matroids contains the partition matroids.  Im and Wang [10] showed that there is a constant-competitive matroid secretary algorithm for the class of laminar matroids.

Finally, Dinitz and Kortsarz gave a $O(1)$-competitive algorithm for the matroid secretary problem on regular matroids.  Their proof uses Seymour’s decomposition theorem for regular matroids [11].  Note that graphic matroids, cographic matroids, and $R_{10}$ all have $O(1)$-competitive matroid secretary algorithms.

Using the structure theorem for $\mathbb F$-representable matroids proved by Geelen, Gerards and Whittle [12], it may be possible to prove the following special case of Conjecture 1.

Conjecture 3.  For every finite field $\mathbb F$, there is a $O(1)$-competitive algorithm for the matroid secretary problem on the class of $\mathbb F$-representable matroids.

I suspect this might be tricky since one basic class of $\mathbb F$-representable matroids are those representable over a subfield $\mathbb F’$ of $\mathbb F$, and I don’t see how to get a $O(1)$-competitive matroid secretary algorithm for  $\mathbb F’$-representable matroids unless we are doing some cunning induction.

On the other hand, the structure theorem for binary matroids is slightly simpler (due to the absence of subfields), so the following special case of Conjecture 3 may be doable.

Conjecture 4. There is a $O(1)$-competitive algorithm for the matroid secretary problem on binary matroids.

References

[1] Moshe Babaioff, Nicole Immorlica, and Robert Kleinberg. Matroids, secretary problems, and online mechanisms. In Proc. SODA, pages 434–443, 2007.

[2] E. B. Dynkin. The optimum choice of the instant for stopping a Markov process. Soviet Math. Dokl, 4, 1963.

[3] Sourav Chakraborty and Oded Lachish. Improved competitive ratio for the matroid secretary problem. In Proc. SODA, pages 1702–1712, 2012.

[4] Lachish, Oded. O(log log rank)-competitive ratio for the matroid secretary problem. Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on. IEEE, 2014.

[5] Feldman, Moran, Ola Svensson, and Rico Zenklusen.  A simple O(log log (rank))-competitive algorithm for the matroid secretary problem.  Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2015.

[6] José A. Soto. Matroid secretary problem in the random assignment model. In Proc. SODA, pages 1275–1284, 2011.

[7] S. Fujishige. Theory of principal partitions revisited. In Research Trends in Combinatorial Optimization, pages 127–162, 2009.

[8] Dinitz, Michael. Recent advances on the matroid secretary problem. ACM SIGACT News 44 (2013), no. 2, 126–142.

[9] Nedialko B. Dimitrov and C. Greg Plaxton. Competitive weighted matching in transversal matroids. In Proc. ICALP, pages 397–408, 2008.

[10] Sungjin Im and Yajun Wang. Secretary problems: laminar matroid and interval scheduling. In Proc. SODA, pages 1265–1274, 2011.

[11] P. D. Seymour. Decomposition of regular matroids. J. Combin. Theory Ser. B, 28(3):305–359, 1980.

[12] J. Geelen, B. Gerards, G. Whittle. Structure in minor-closed-classes of matroids. Surveys in Combinatorics, London Mathematical Society Lecture Note Series 409, 327–362, 2013.

Hilbert Bases of Cuts and Cocircuits

Let $X$ be a set of vectors in $\mathbb{R}^d$.  Three important objects related to $X$ are the cone, lattice, and integer cone generated by $X$.  These are (respectively), the set of all real non-negative combinations, the set of all integer combinations, and the set of all non-negative integer combinations of vectors in $X$.  Clearly, the integer cone is contained in the intersection of the cone and the lattice.  If the integer cone is equal to the intersection of the cone and the lattice, we say that $X$ is a Hilbert basis.

Hilbert bases were introduced by Giles and Pulleyblank [1] as a tool to study total dual integrality. They are often nicer to work with than arbitrary sets of vectors.  For example, from the computational perspective, membership testing can be much easier for the cone and lattice than for the integer cone.

Given a graph $G$ (or matroid), we may regard subsets of $E(G)$ as characteristic $(0,1)$-vectors. In this way, we can ask whether a family $\mathcal{S}$ of subgraphs of $G$ is a Hilbert basis. In the case that $\mathcal{S}$ is the set of circuits of $G$, there is the following nice theorem of Alspach, Goddyn and Zhang [2].

Theorem 1 [Alspach, Goddyn, Zhang].  The set of circuits of a graph $G$ is a Hilbert basis if and only if $G$ does not contain the Peterson graph as a minor.

From the matroidal perspective, the difference between circuits and cuts is superficial, so it is very natural to ask the following question.

Question 2.  For which graphs $G$ is the set of cuts of $G$ a Hilbert basis?

Let us define a graph to be cut- (respectively circuit-) Hilbert if its set of cuts (respectively circuits) is a Hilbert basis.  Let $K_{5}^{\perp}$ be the unique 3-connected uncontraction of $K_5$.

$K_5^{\perp}$

The graph $K_5^{\perp}$ with a distinguished edge $e$.

Laurent [3] had previously shown that $K_{5}^{\perp}$ is cut-Hilbert.

In a recent paper of Deshpande, Goddyn and myself [4], we prove that all graphs without a $K_{5}^{\perp}$-minor are cut-Hilbert.

Theorem 3 [Deshpande, Goddyn, H].  Every graph without a $K_{5}^{\perp}$-minor is cut-Hilbert.

However, somewhat curiously, we also prove that the class of cut-Hilbert graphs is not as well-behaved as the class of circuit-Hilbert graphs.

Theorem 4 [Deshpande, Goddyn, H].  The class of cut-Hilbert graphs is not closed under edge-deletion, edge-subdivision, nor 2-sums.

Note that Theorem 4 is in stark contrast to Theorem 1, where there is an exact excluded-minor characterization for the class of circuit-Hilbert graphs.

I will now give the examples that establish Theorem 4.

It will be important to clarify what we mean by a 2-sum.  A 2-sum of two graphs is obtained by gluing them together along a common edge, and then deleting that edge. A 2-clique-sum of two graphs is obtained by gluing them together along a common edge, but keeping that edge. In contrast to 2-sums, Laurent [3] also proved that the family of cut-Hilbert graphs is closed under 2-clique-sums.

Three different weightings of a non-cut-Hilbert graph $H_1$. Black edges all have weight 1.

Consider the three edge-weightings of the above graph $H_1$.  For a simple graph $G$, it is easy to show that a vector $x \in \mathbb{Z}^{E(G)}$ is in the lattice of cuts of $G$ if and only if $x(C)$ is even for every circuit $C$ of $G$.  One easily checks that this condition holds for each of the above three edge-weightings.  Furthermore, one can also write (but for the benefit of the reader we will not) each of these vectors as real non-negative combinations of cuts of $H_1$. On the other hand, none of these vectors is in the integer cone of cuts of $H_1$ (this is just a (large) finite case check).  This last claim was verified by our silent robot partner Normaliz [5], but can also be checked by humans (as we do in the paper). Therefore, $H_1$ is not cut-Hilbert.

Three different weightings of a non-cut-Hilbert graph $H_2$. Black edges all have weight 1.

Similarly, each of the three edge-weightings of the above graph $H_2$ show that $H_2$ is not cut-Hilbert. Now let $H_3$ be the 2-clique-sum of two copies of $K_{5}^{\perp}$ along the distinguished edge $e$. Since $H_3$ is a 2-clique-sum of cut-Hilbert graphs, $H_3$ is also cut-Hilbert. However, note that $H_1$ is obtained from $H_3$ by deleting an edge and $H_2$ is obtained from $H_3$ by subdividing an edge. This establishes Theorem 4, as promised.

I will finish by discussing the matroid analogue of Question 2.

Question 5.  For which matroids $M$ is the set of cocircuits of $M$ a Hilbert basis?

Luis Goddyn and Marko Mitrovic (an NSERC undergraduate research student of Luis in 2013) made some interesting progress which I will briefly report here. Using Normaliz, the matroid Sage package, and the catalogue of all matroids up to 9 elements, they were able to determine all the non-cocircuit-Hilbert matroids up to 8 elements.  It turns out that there are lots of them.  In particular, of the 2198 matroids with up to 8 elements, 1305 are non-cocircuit-Hilbert.  Below is an image of the three smallest non-cocircuit-Hilbert matroids.

threeSmallestNonH

If one restricts to binary matroids, then the situation is more manageable. There are only two binary non-cocircuit-Hilbert matroids up to 8 elements, and interestingly, both of them are coextensions of the Fano matroid.  Therefore, the following problem may be doable.

Problem 6.  Characterize the binary matroids whose set of cocircuits is a Hilbert basis.

Note that Problem 6 is open even for graphs.  By Theorem 4, the set of cut-Hilbert graphs is not closed under taking minors, so it does not have an excluded-minor characterization. However, perhaps there is an alternate nice characterization.

References

[1] F. R. Giles and W. R. Pulleyblank.  Total dual integrality and integer polyhedra.  Linear Algebra Appl., 25:191-196, 1979.

[2] Brian Alspach, Luis Goddyn, and Cun Quan Zhang.  Graphs with the circuit cover property. Trans. Amer. Math. Soc., 344(1):131-154, 1994.

[3] Monique Laurent.  Hilbert bases of cuts.  Discrete Math., 150(1-3):257-279, 1996.

[4] Tanmay Deshpande, Luis Goddyn, and Tony Huynh.  On Hilbert bases of cuts. arXiv.

[5] Winfred Bruns and Bogdan Ichim.  Normaliz: algorithms for affine monoids and rational cones.  J. Algebra, 324(5):1098-1113, 2010.

Sudoku Matroids and Graph Colouring Verification

As this is my first post, I feel obliged to say the obligatory Hello World!. With that aside, the problem I am going to discuss comes from the popular game Sudoku. Suppose that we are given a filled Sudoku $S$ (which we cannot see), and we wish to verify its correctness. To do so, we are granted access to an oracle which can tell us if $S$ is consistent on any row, column or block.  Consistent simply means that each number from 1 to 9 appears exactly once.

Question 1.  How many checks are necessary to verify that $S$ is correct?

Certainly, 27 checks is sufficient, but can we do better? Here’s a short proof that we can in fact do better.

Proposition 2.  Every Sudoku can be verified with at most 21 checks.

Proof.  It will be convenient to fix some notation. Let

$$K:= \{r_1, \dots, r_9\} \cup \{c_1, \dots, c_9\} \cup \{b_1, \dots, b_9\}$$

be the set of rows, columns and blocks of a Sudoku. By convention, the blocks are labelled as you (an English reader) would read the words along a page. Now suppose that a Sudoku $S$ is consistent on $b_1, b_2, b_3, r_1$ and $r_2$. We claim that this implies $S$ is also consistent on $r_3$.

Suppose not. Then some number, say 1, appears at least twice in $r_3$. However, since $S$ is consistent on $r_1$ and $r_2$, 1 appears exactly once in each of $r_1$ and $r_2$. Thus, 1 appears at least 4 times in $b_1 \cup b_2 \cup b_3$.  By the Pigeonhole Principle, 1 appears at least twice in $b_1, b_2$ or $b_3$, which is a contradiction. Thus, to verify $S$ it suffices to check $K \setminus (\{r_3, r_6, r_9\} \cup \{c_3, c_6, c_9\})$. $\square$

Can we do better than 21 checks? At first this seems rather tricky.  One can use information theory considerations to prove lower bounds.  For example, clearly at least 9 checks are necessary, since otherwise there will be a cell $x$ for which the row, column and box containing $x$ are all unchecked.  However, it seems difficult to get to a lower bound of 21 in this way. See this MathOverflow question for someone trying to carry this out.

At this point though, our hero Matroid Theory comes to the rescue. That is, somewhat remarkably, there is a matroid structure underlying this problem, which I will now describe. Define a subset $V$ of $K$ to be a verifier if every Sudoku which is consistent on $V$ is consistent everywhere.

Theorem 3.  The set of minimal (under inclusion) verifiers $\mathcal{V}$ is the set of bases of a matroid on $K$.

There is a nice proof of this fact by Emil Jeřábek as an answer to the same MathOverflow question above.  It turns out that this matroid is actually representable (over $\mathbb{Q}$) and that the proof is also valid for the more general $n \times n$ versions of Sudoku. We will not say anything more about the proof due to space limitations.  However, using the fact that $M=(K, \mathcal{V})$ is a matroid, it is now easy to show that 21 is in fact the answer to Question 1.

Proposition 4.  The minimum number of checks needed to verify the correctness of a Sudoku is 21.

Proof.  Translated into matroid theory language, the original question is simply asking what the rank of $M$ is. We have already shown that $B:=K \setminus (\{r_3, r_6, r_9\} \cup \{c_3, c_6, c_9\})$ is a verifier, so it suffices to show that it is a minimal verifier. Certainly, we cannot remove a block from $B$ and remain verifying, since there would then be a completely unchecked cell. On the other hand, for any two rows (or columns) $a$ and $b$ in the same band, it is easy to construct a Sudoku which is consistent everywhere except $a$ and $b$. Thus, we also cannot remove a row or column from $B$ and remain verifying. $\square$

Of course, there is nothing special about the Sudoku graph, and we can attempt to play this game on an arbitrary graph. Let $G$ be a graph and suppose that we wish to verify the correctness of a (not necessarily proper) $\chi(G)$-colouring $C$ of $G$. Again we are given access to an oracle which can tell us whether $C$ is consistent on any maximal clique of $G$. As before, we can define the set family $(K(G), \mathcal{V}(G))$, where $K(G)$ is the set of maximal cliques of $G$ and $\mathcal{V}(G)$ is the family of minimal verifiers.

We now arrive at our main problem.

Problem 5.  Characterize the graphs $G$ for which $(K(G), \mathcal{V}(G))$ is a matroid.

We now show that there are many such graphs.

Proposition 6.  For all bipartite graphs $G$, $(K(G), \mathcal{V}(G))$ is a matroid.

Proof.  We may assume that $G$ is connected. Since $G$ is bipartite, its set of maximal cliques is simply its set of edges. Let $T \subseteq E(G)$ be a minimal verifier. Evidently, $T$ must span all the vertices of $G$. On the other hand, if a 2-colouring of $G$ is correct on a spanning tree, then it is correct everywhere. Thus, $T$ is a spanning tree, and so $(K(G), \mathcal{V}(G))$ is actually the graphic matroid of $G$. $\square$

It would be nice if there were a unified proof that worked for both the bipartite case and the Sudoku graphs. Such a proof would have to utilize some common structure of these two classes, since it is not true that for all graphs $G$, $(K(G), \mathcal{V}(G))$ is a matroid.

Proposition 7.  If $G=K_{2,2,2}$, then $(K(G), \mathcal{V}(G))$ is not a matroid.

Proof.  It will be convenient to regard $K_{2,2,2}$ as the octahedron $O$. Let $1234$ be the middle 4-cycle of $O$ and $t$ and $b$ be the top and bottom vertices.  Now, it is easy to see that $B_1:=\{t12, t34, b23\}$ and $B_2:=\{t23, t14, b34\}$ are minimal verifiers. On the other hand, one easily checks that basis exchange fails for $B_1$ and $B_2$. $\square$

There are also many interesting questions that one can ask about the framework we have set up for graph colouring verification.  For example, we can define the verification number of a graph $G$ to be the minimum size of a verifier (this is the the same thing as the size of a minimal verifier if $(K(G), \mathcal{V}(G))$ is a matroid).

Question 8.  Among all $n$-vertex graphs, which ones have the largest verification number?

I was not able to find much in the literature, so any comments or references would be appreciated. Thanks for reading and see you in $3 + \epsilon$ months!

Acknowledgements.  Parts of this post are based on contributions from François Brunault, Emil Jeřábek and Zack Wolske on MathOverflow and from Ross Kang, Rohan Kapadia, Peter Nelson and Irene Pivotto at Winberie’s in Princeton.