Gale-Shapley optimal matching
Tags | CS161Graphs |
---|
What is greedy?
A greedy algorithm makes a choice and never moves back. However, this is just the basic greedy approach. We can refine our greedy approach as we did in the Ford-Fulkerson algorithm. By reframing the problem, we were able to walk back on our previous assignments in order to improve flow.
In our problem today, we will be doing something similar. We want to revert any choices if there exists a better solution. This isn’t backtracking per se, because it doesn’t require enumerating all possible solutions. It’s just a smarter greedy.
The problem at hand
We used max flow to match people who had unranked preferences, with items that didn’t have preferences at all. But what if we had both people and items with ranked preferences?
The traditional setting for this problem is the hospital residency program, where doctors submit an ordered list of hospitals and hospitals submit an ordered list of doctors.
Stable matchings
Define a blocking pair
if a doctor and a hospital if they are mutually higher on their ranked list than their current assignment. In terms of marriage, we would say that this is a rogue couple
, as they are happier together than they are with their assigned spouses
We define a matching as a stable match
if there are no blocking pairs
Equivalently, we think about it this way: if we are a stable match, then hospital must be lower on the list then ’s current assignment, or/and doctor must be lower on the list than ’s current assignment. In other words, a swap would move the list in opposite directions, or uniformly in the worse. We would never have a swap that moved up the list for both parties.
Some non-working solutions
Hungarian approach
Create a bipartite graph with the weights on the edges being some function of preferences between the doctors and hospitals. For example, you might add together the preference rankings on doctor of hospital , and hospital of doctor . We can run something called the hungarian algorithm
, but does this work?
Why this doesn’t work
- What is the function that maps preferences to edges? This seems arbitrary and hacky
- Doctors are incentivized to misreport their preferences. If a doctor really wants a hospital, they would rank all other hospitals really low. Then, the hungarian algorithm must pick the one that the doctor ranked highest, or else the total cost will be high
- Your algorithm isn’t all-binding. If a doctor and hospital both prefer each other more outside of what the algorithm gives them, then they will pair up outside of the algorithm. So the algorithm is only good if no swaps lead to higher satisfaction. Unfortunately, we can’t guarentee this the hungarian approach
Straight greedy
A simple greedy algorithm will match all the doctor-hospital pairs that have each other as their top choice. Obviously, this step doesn’t rule out any solution, but what about the next step? Here we get stuck
The tiebreaker
Match every doctor to their favorite hospital, and if there are more than one request per hospital, then pick according to hospital preference
Why this method doesn’t work
The problem is if we have two good hospitals, but one that everyone ranks as their first choice. Then, any candidate (even one at the bottom of the hospital rankings) can pick the other good hospital. Because nobody selected this as the first choice, this suboptimal candidate will get the second good hospital, even though other people were ranked higher. And we can’t walk back on our decision, so it’s locked this way. The tl;dr is that if we make the choice upfront, we prevent second and third choices, etc, from happening accurately.
Gale Shapely
This last approach was onto something, but it wasn’t complete. What if we were allowed to walk back on our assignments?
The intuition
Try to match each doctor to their top choice. If you discover a blocking pair, switch the matching. In reality, this means the following:
- Have all doctors “propose” to the hospitals.
- Each hospital picks the candidate highest on its list
- Have all non-assigned doctors propose to the hospital next on their list
- If the hospital gets a candidate that is higher on its list, it will boot the previous candidate out and pick the new one
- keep on doing this until everyone is assigned
This can cause a lot of movement, but at the end, things do settle!
The pseudocode
Complexity
The input space is , because each doctor and hospital have a list of size .
We must look at doctors each “cycle” of proposals, and we iterate through cycles (lest we reach the end of the preferences list!)
Proof of correctness
There are three claims that help us prove it works
- At every iteration, the current match is stable WRT the assigned doctors and hospitals.
To prove this, we use contradiction. Suppose that we had a blocking pair . Currently, is assigned to , and is assigned to . By the algorithm, we know that picked before , as the doctors read from highest to lowest in their proposals. Therefore, we reach a contradiction, because a blocking pair necessarily requires to be higher on ’s list, and to be higher on ’s list.
- Once a hospital is matched, it remains matched
This is trivial from the algorithm’s definition
- At the end of algorithm, every doctor and hospital is matched
Let’s do a proof by contradiction. Suppose that at the end of the algorithm, there is a doctor and hospital still free. However, if we are at the end of the algorithm, has already tried to match . If was empty, then it would have accepted. By claim 2, it couldn’t have been de-assigned during the algorithm, so this is a contradiction.
Detecting multiple stable matchings
Here’s a quick trick
- try doing optimal hospital matching
- try doing optimal doctor matching
- if these differ, you have at least two different matchings
- If these are the same, you only have one. This is because both the doctors and hospitals are at the optimal, meaning that any swap will uniformly decrease both sides, and so it is not desired
More formalism
Look at the final stable matchings. Multiple stable matchings exist if you are able to swap two pairs and increase the rank in one of the two pairs and decrease the rank in the other one. This must be the case for both pairs you swap. In this case, the new matching is equally stable, because there is essentially a zero-sum game. We call this a rotation
, and it’s actually a little more complicated than this, but that’s a general idea.
However, swapping no longer guarantees that the algorithm is doctor optimal (see below). It just guarantees stability.
Cool things about Gale Shapely
The matching returned by this algorithm is doctor-optimal
. This makes sense, because the doctors actively seek out the hospitals, while the hospitals just review proposals.
As such, the order of trying out free doctors in the algorithm doesn’t matter.
Furthermore, doctors can’t gain from misreporting their preferences. This is because doctor-optimal
means that the doctor gets the best possible result given the other doctors. So even if a doctor ranks the other hospitals as very low, if they are unassigned after the first cycle, they must go propose to the other hospitals. Philosophically speaking, if you get what you want, there’s no sense in faking what to you want!
However, this matching is also hospital-worst
, meaning that every hospital is matched to the least-favorite doctor possible in any stable matching. This is not immediately obvious why it’s the case, but it’s because the hospital takes things as they come. Interestingly, hospitals can gain by falsely reporting their preferences! This is again because they aren’t forced to read down a list until they are assigned.