[Subfield] Selfish Routing
Selfish Routing
Non-atomic Routing Model
Routing Game and Flows
You are given a directed network \(G=(V,E)\) and \(k\) source-destination vertex pairs \(\{s_1,t_1\},\ldots,\{s_k,t_k\}\), also called commodities. We denote the set of (simple) \(s_i-t_i\) paths by \(\mathcal{P}_i\), and define \(\mathcal{P} = \cup_i \mathcal{P}_i\).
A flow is a function \(f:\mathcal{P}\rightarrow\mathbb{R}^+\); for a fixed flow \(f\) we define \(f_e = \sum_{P:e\in P} f_P\). We associate a finite and positive rate \(r_i\) with each pair \(\{s_i,t_i\}\), the amount of flow with source \(s_i\) and destination \(t_i\). A flow \(f\) is said to be feasible if for all \(i\), \(\sum_{P\in\mathcal{P}_i} f_P = r_i\).
To capture congestion, each edge \(e\in E\) is given a load-dependent latency function that we denote by \(\ell_e(\cdot)\). For each \(e\in E\), we assume that the latency function \(\ell_e\) is nonnegative, differentiable, and nondecreasing. The latency of a path \(P\) with respect to a flow \(f\) is defined as the sum of the latencies of the edges in the path, denoted by \(\ell_P(f) = \sum_{e\in P} \ell_e(f_e)\). We define the cost \(C(f)\) of a flow \(f\) in \(G\) as the total latency incurred by \(f\), that is, $$C(f)=\sum_{P\in\mathcal{P}}\ell_P(f)f_P=\sum_{e\in E}\ell_e(f_e)f_e.$$
Write \(c_e(f_e)=\ell_e(f_e)f_e\) for the cost incurred on edge \(e\), and denote the marginal cost function by \(\ell^*_e(f_e)=c_e'(f_e)=\ell_e(f_e)+\ell'_e(f_e)f_e.\)
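As a small illustration of the definitions above, the following sketch aggregates path flows into edge flows and evaluates the total latency both as a sum over paths and as a sum over edges. The five-edge network, its latency functions, and all function names are assumptions made up for this example; they do not come from the text.

```python
from collections import defaultdict

def edge_flows(path_flows):
    """Aggregate path flows f_P into edge flows f_e (sum of f_P over paths using e)."""
    load = defaultdict(float)
    for path, amount in path_flows.items():
        for edge in zip(path, path[1:]):
            load[edge] += amount
    return dict(load)

def path_latency(path, load, latency):
    """Latency of a path: sum of the edge latencies at the current edge flows."""
    return sum(latency[edge](load[edge]) for edge in zip(path, path[1:]))

def cost_over_paths(path_flows, latency):
    """C(f) written as the sum over paths of l_P(f) * f_P."""
    load = edge_flows(path_flows)
    return sum(amount * path_latency(path, load, latency)
               for path, amount in path_flows.items())

def cost_over_edges(path_flows, latency):
    """C(f) written as the sum over edges of l_e(f_e) * f_e."""
    return sum(latency[edge](x) * x for edge, x in edge_flows(path_flows).items())

# Hypothetical instance: paths are tuples of vertices, the rate is 1.
latency = {
    ("s", "v"): lambda x: x,
    ("v", "t"): lambda x: 1.0,
    ("s", "w"): lambda x: 1.0,
    ("w", "t"): lambda x: x,
    ("v", "w"): lambda x: 0.0,
}
path_flows = {("s", "v", "t"): 0.5, ("s", "w", "t"): 0.25, ("s", "v", "w", "t"): 0.25}

# Both expressions evaluate to 1.5625 on this instance.
print(cost_over_paths(path_flows, latency), cost_over_edges(path_flows, latency))
```

The two functions agree because every unit of flow on a path contributes to each edge of that path exactly once.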
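Without central supervision, selfish traffic settles into a Nash equilibrium: a feasible flow \(f\) is at Nash equilibrium if for every commodity \(i\) and every pair of paths \(P,\tilde{P}\in\mathcal{P}_i\) with \(f_P>0\), we have \(\ell_P(f)\le\ell_{\tilde{P}}(f)\). In other words, all paths used by a commodity have minimum latency.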
Next, suppose \(f\), \(\tilde{f}\) are flows in \(G\) at Nash equilibrium (and hence global optima for the convex program (NLP2), which minimizes \(\sum_{e} h_e(f_e)\) with \(h_e(x)=\int_0^x \ell_e(t)\,dt\) over feasible flows). By convexity of the objective function of (NLP2), whenever \(f_e \neq \tilde{f}_e\) the function \(h_e\) must be linear between these two values (otherwise some convex combination of \(f\) and \(\tilde{f}\) would yield a solution for (NLP2) with smaller objective function value), and hence \(\ell_e\) must be constant between these two values. This implies that \(\ell_e(f_e) = \ell_e(\tilde{f}_e)\) for all \(e \in E\), thus \(C(f) = C(\tilde{f})\).
In addition, Nash equilibria in single-commodity networks have some useful properties.
Price of anarchy and Pigou Bound
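The price of anarchy (PoA) quantifies the efficiency loss caused by selfish behavior. In the context of network routing games, it is defined as the worst-case ratio over all instances between the cost of a Nash flow and the cost of an optimal flow. Formally, for an instance \((G,r,\ell)\) with a Nash flow \(f\) and an optimal flow \(f^*\), $$\text{PoA}(G,r,\ell)=\frac{C(f)}{C(f^*)},$$ and the PoA of a class of instances is the supremum of this ratio over the class.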
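The Pigou bound characterizes the PoA. For a set \(\mathcal{C}\) of cost functions, it is defined as $$\alpha(\mathcal{C})=\sup_{c\in\mathcal{C}}\ \sup_{x,r\ge 0}\frac{r\cdot c(r)}{x\cdot c(x)+(r-x)\cdot c(r)},$$ with the convention \(0/0=1\).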
For example, if \(\mathcal{C}\) is the set of polynomials with nonnegative coefficients and degree at most \(p\), then $$\alpha(\mathcal{C})=\left[1-p\,(p+1)^{-(p+1)/p}\right]^{-1}.$$
As \(p\to \infty\), this bound tends to infinity at the rate \(\Theta(p/\ln p)\).
The Pigou bound is tight in the following sense (under mild technical conditions on \(\mathcal{C}\)):
- the PoA of instances with cost functions in \(\mathcal{C}\) is at most \(\alpha(\mathcal{C})\);
- the PoA of instances with cost functions in \(\mathcal{C}\) can be arbitrarily close to \(\alpha(\mathcal{C})\).
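To get a feel for how the Pigou bound grows with the degree, here is a minimal numerical sketch. It assumes the worst case for degree-\(p\) polynomials is attained by the monomial \(c(x)=x^p\) with \(r=1\), so that the bound reduces to maximizing \(1/(x^{p+1}+1-x)\) over \(x\in[0,1]\). The function names and the grid size are illustrative choices, not part of the text.

```python
def pigou_bound_closed_form(p):
    """Closed form 1 / (1 - p (p+1)^(-(p+1)/p)) for polynomials of degree at most p."""
    return 1.0 / (1.0 - p * (p + 1) ** (-(p + 1) / p))

def pigou_bound_numeric(p, grid=100_000):
    """Grid search over x in [0, 1] for c(x) = x^p and r = 1 (assumed worst case)."""
    best = 0.0
    for j in range(grid + 1):
        x = j / grid
        # r*c(r) = 1 and x*c(x) + (r - x)*c(r) = x^(p+1) + (1 - x)
        best = max(best, 1.0 / (x ** (p + 1) + (1.0 - x)))
    return best

for p in (1, 2, 4, 8):
    print(p, pigou_bound_closed_form(p), pigou_bound_numeric(p))
```

For \(p=1\) both functions return \(\frac{4}{3}\), the familiar bound for linear latency functions.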
The PoA can be very large when cost functions are highly nonlinear! How can we reduce it?
Edge Removal
Braess's paradox shows that removing edges from a network may improve its performance. This fact leads to the problem called NETWORK DESIGN: given a network, which edges should be removed to obtain the best flow at Nash equilibrium? Similarly, given a large network of candidate edges to be built, which subnetwork will show the best performance when used selfishly?
We study how much this mechanism can improve the efficiency of Nash equilibrium.
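The classic four-node Braess network makes the effect of edge removal concrete. The sketch below is an illustration under assumed data: the network, its latency functions (\(x\), \(1\), \(1\), \(x\), and a zero-latency shortcut), and the helper `check_nash` are hypothetical and not taken from the text. The helper only verifies that a given flow is at Nash equilibrium by comparing path latencies; it does not compute equilibria.

```python
from collections import defaultdict

def check_nash(paths, flow, latency):
    """Return (is_nash, common_latency) for the path flows `flow`.

    `paths` must list every s-t path of the network as a tuple of vertices.
    """
    load = defaultdict(float)
    for path, amount in zip(paths, flow):
        for edge in zip(path, path[1:]):
            load[edge] += amount
    lengths = [sum(latency[e](load[e]) for e in zip(p, p[1:])) for p in paths]
    used = [length for length, amount in zip(lengths, flow) if amount > 0]
    # Nash condition: every used path is a shortest path.
    return max(used) <= min(lengths) + 1e-9, max(used)

latency = {
    ("s", "v"): lambda x: x,
    ("v", "t"): lambda x: 1.0,
    ("s", "w"): lambda x: 1.0,
    ("w", "t"): lambda x: x,
    ("v", "w"): lambda x: 0.0,   # the shortcut edge
}
upper, lower, zigzag = ("s", "v", "t"), ("s", "w", "t"), ("s", "v", "w", "t")

# With the shortcut, sending the whole unit of traffic along the zigzag path is a Nash flow.
with_shortcut = check_nash([upper, lower, zigzag], [0.0, 0.0, 1.0], latency)
# Without the shortcut, splitting the traffic evenly is a Nash flow.
without_shortcut = check_nash([upper, lower], [0.5, 0.5], latency)

print(with_shortcut)     # (True, 2.0)
print(without_shortcut)  # (True, 1.5)
```

In this instance, removing the shortcut lowers the common latency from 2 to \(\frac{3}{2}\), an improvement by a factor of \(\frac{4}{3}\).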
Single-commodity Networks
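Define the Braess ratio \(\beta(G,r,\ell)\) of a single-commodity instance as the largest factor by which removing edges from \(G\) can decrease the common latency of traffic at Nash equilibrium. Clearly, \(\beta(G,r,\ell)\le \text{PoA}(G,r,\ell)\).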
First, we can construct a family of instances with arbitrarily large Braess ratio. For each integer \(k\ge 1\), let \(B^k=(V^k,E^k)\) be the graph with
- \(V^k=\{s, v_1,...,v_k,w_1,...,w_k,t\}\)
- \(E^k=\{(s,v_i),(v_i,w_i),(w_i,t): 1\le i\le k\}\cup \{(v_i,w_{i-1}):2\le i\le k\}\cup\{(v_1,t),(s,w_k)\}\)
Call edges of form \((v_i,w_i)\) the type A edges, edges of form \((v_i,w_{i-1}),(v_1,t),(s,w_k)\) the type B edges, and edges of form \((s,v_i)\), \((w_i,t)\) the type C edges. Let their latency functions be defined as follows:
- Type A edges: \(\ell_e^k(x)=0\);
- Type B edges: \(\ell_e^k(x)=1\);
- Type C edges: For each \(i\in\{1,2,...,k\}\) and each \(e\in \{(w_i,t),(s,v_{k-i+1})\}\), \(\ell^k_e(x)\) is an arbitrary continuous, nondecreasing latency function with \(\ell_e^k(x)=0\) for \(x\le k/(k+1)\) and \(\ell_e^k(x)=i\) for \(x\ge 1\).
For \(i=1,...,k\), let \(P_i\) denote the path \(s\to v_i\to w_i\to t\). For \(i=2,...,k\), let \(Q_i\) denote the path \(s\to v_i\to w_{i-1}\to t\). Define \(Q_1\) to be the path \(s\to v_1\to t\) and \(Q_{k+1}\) the path \(s\to w_k\to t\). Routing one unit of flow on each of \(P_1,...,P_k\) yields a Nash equilibrium for \((B^k,k,\ell^k)\), and routing \(\frac{k}{k+1}\) units of flow on each of \(Q_1,...,Q_{k+1}\) yields a Nash equilibrium for \((H,k,\ell^k)\), where \(H\) is the subgraph obtained from \(B^k\) by deleting the \(k\) type A edges. One can verify that the total latencies of these two flows are \(k(k+1)\) and \(k\), respectively. Thus $$\beta(B^k,k,\ell^k)\ge \frac{k(k+1)}{k}=k+1=\frac{n}{2},$$ where \(n=2k+2\) is the number of vertices of \(B^k\).
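The equilibrium claims for \(B^k\) can be checked mechanically for a small \(k\). The sketch below builds the graph for \(k=3\), enumerates all \(s\)-\(t\) paths, and verifies that both flows described above route traffic only on shortest paths. One assumption is needed: the text allows any continuous nondecreasing type C latency, and the code picks a linear ramp between \(k/(k+1)\) and \(1\). All function names are illustrative.

```python
from collections import defaultdict

def braess_latencies(k, keep_type_a=True):
    """Latency functions of B^k, or of H when the type A edges are dropped."""
    def type_c(i):
        lo = k / (k + 1)
        def ell(x):
            if x <= lo:
                return 0.0
            if x >= 1.0:
                return float(i)
            return i * (x - lo) / (1.0 - lo)   # assumed linear ramp in between
        return ell
    zero, one = (lambda x: 0.0), (lambda x: 1.0)
    lat = {}
    for i in range(1, k + 1):
        lat[(f"w{i}", "t")] = type_c(i)
        lat[("s", f"v{k - i + 1}")] = type_c(i)
        if keep_type_a:
            lat[(f"v{i}", f"w{i}")] = zero        # type A
    for i in range(2, k + 1):
        lat[(f"v{i}", f"w{i - 1}")] = one         # type B
    lat[("v1", "t")] = one
    lat[("s", f"w{k}")] = one
    return lat

def all_paths(edges, u="s", target="t"):
    """Enumerate s-t paths as vertex tuples (B^k is acyclic, so no visited set is needed)."""
    if u == target:
        yield (u,)
        return
    for a, b in edges:
        if a == u:
            for rest in all_paths(edges, b, target):
                yield (u,) + rest

def nash_check(lat, path_flows):
    """Return (is_nash, shortest_path_latency) for the given path flows."""
    load = defaultdict(float)
    for path, amount in path_flows.items():
        for e in zip(path, path[1:]):
            load[e] += amount
    length = lambda path: sum(lat[e](load[e]) for e in zip(path, path[1:]))
    shortest = min(length(p) for p in all_paths(list(lat)))
    used = [length(p) for p, amount in path_flows.items() if amount > 0]
    return all(abs(c - shortest) < 1e-9 for c in used), shortest

k = 3
P = {("s", f"v{i}", f"w{i}", "t"): 1.0 for i in range(1, k + 1)}
Q = {("s", f"v{i}", f"w{i - 1}", "t"): k / (k + 1) for i in range(2, k + 1)}
Q[("s", "v1", "t")] = k / (k + 1)
Q[("s", f"w{k}", "t")] = k / (k + 1)

ok_g, lat_g = nash_check(braess_latencies(k), P)                     # full graph B^k
ok_h, lat_h = nash_check(braess_latencies(k, keep_type_a=False), Q)  # subgraph H
print(ok_g, lat_g, ok_h, lat_h)
```

For \(k=3\) the common latency is \(k+1=4\) in \(B^k\) and \(1\) in \(H\), so the total latencies are \(k(k+1)=12\) and \(k=3\).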
Next, let's see an upper bound for the Braess ratio. It shows that the Braess ratio is maximized by the instances constructed above.
Let's begin with defining some combinatorial structure.
- An edge \(e\) of \(G\) is
  - \((f,\tilde{f})\)-light if \(f_{e} \leq \tilde{f}_{e}\) and \(\tilde{f}_{e} > 0\),
  - \((f,\tilde{f})\)-heavy if \(f_{e} > \tilde{f}_{e}\),
  - \((f,\tilde{f})\)-useless if \(f_{e} = \tilde{f}_{e} = 0\).
- An undirected path is \((f,\tilde{f})\)-alternating if it comprises only
  - forward \((f,\tilde{f})\)-light edges and
  - backward \((f,\tilde{f})\)-heavy edges.
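The definitions above translate directly into a breadth-first search: from a vertex one may move forward along a light edge or backward along a heavy edge. The sketch below is illustrative only. The function name, the dictionary-based flow encoding, and the example flows (two Nash flows on an assumed classic Braess network, with and without its shortcut edge) are not from the text.

```python
from collections import deque

def alternating_path(edges, f, f_tilde, s, t):
    """Return the vertices of an (f, f_tilde)-alternating s-t path, or None if none exists."""
    light = {e for e in edges if f[e] <= f_tilde[e] and f_tilde[e] > 0}
    heavy = {e for e in edges if f[e] > f_tilde[e]}
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            break
        for a, b in edges:
            if a == u and (a, b) in light and b not in parent:
                parent[b] = u            # traverse a light edge forward
                queue.append(b)
            elif b == u and (a, b) in heavy and a not in parent:
                parent[a] = u            # traverse a heavy edge backward
                queue.append(a)
    if t not in parent:
        return None
    path = [t]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

# Hypothetical example: f uses the shortcut (v, w); f_tilde lives on the network without it.
edges = [("s", "v"), ("v", "w"), ("w", "t"), ("v", "t"), ("s", "w")]
f = {("s", "v"): 1.0, ("v", "w"): 1.0, ("w", "t"): 1.0, ("v", "t"): 0.0, ("s", "w"): 0.0}
f_tilde = {("s", "v"): 0.5, ("v", "w"): 0.0, ("w", "t"): 0.5, ("v", "t"): 0.5, ("s", "w"): 0.5}

print(alternating_path(edges, f, f_tilde, "s", "t"))  # ['s', 'w', 'v', 't']
```

Here the path goes forward along the light edge \((s,w)\), backward along the heavy edge \((v,w)\), and forward along the light edge \((v,t)\).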
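Suppose for contradiction that no \((f, \tilde{f})\)-alternating \(s\)-\(t\) path exists, and let \(S\) be the set of vertices reachable from \(s\) via \((f, \tilde{f})\)-alternating paths, so that \(s\in S\) and \(t\notin S\). Since vertices in \(S\) can be reached from \(s\) via \((f, \tilde{f})\)-alternating paths and vertices outside \(S\) cannot, edges that exit \(S\) cannot be \((f, \tilde{f})\)-light, and edges that enter \(S\) cannot be \((f, \tilde{f})\)-heavy. Since the flow across \(S\) is positive (assuming \(r > 0\)), some non-useless (and thus heavy) edge exits \(S\). Taken together, these facts imply that the net amount of \(f\) exiting \(S\) is strictly greater than the net amount of \(\tilde{f}\) exiting \(S\). But both flows route \(r\) units from \(s\) to \(t\), so both net amounts equal \(r\), a contradiction.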
Moreover, if \(f\) is directed acyclic, then it sends no flow into \(s\) or out of \(t\). Thus, the first and last edges of every \((f, \tilde{f})\)-alternating \(s\)-\(t\) path must be \((f, \tilde{f})\)-light.
The following theorem provides a general upper bound for Braess ratio with respect to the size of sparse edge sets in the network.
By the lemma, there exists an \((f, \tilde{f})\)-alternating \(s\)-\(t\) path, denoted by \(P\). A segment of \(P\) is a maximal subpath of \(P\) that contains only \((f, \tilde{f})\)-light or only \((f, \tilde{f})\)-heavy edges. Edges that are in \(G\) but not in \(H\) are called absent; since they carry no flow in \(\tilde{f}\), they can only reside in \((f, \tilde{f})\)-heavy segments. Let \(d(v)\) and \(\tilde{d}(v)\) denote the length of a shortest \(s\)-\(v\) path with respect to the edge lengths \(\ell_e(f_e)\) in \(G\) and \(\ell_e(\tilde{f}_e)\) in \(H\), respectively. The key claim is that if \(v\) is a vertex at the end of a segment of \(P\), and \(i\) (heavy) segments of \(P\) between \(s\) and \(v\) contain an absent edge, then $$d(v)\le \tilde{d}(v)+i\cdot\tilde{d}(t).$$
Let's prove it by induction on the segments of \(P\). The inequality trivially holds when \(v = s\), so suppose it holds for a vertex \(v\) that is last on a segment of \(P\), or is the source \(s\). We wish to prove the claim for \(w\), defined as the last vertex on the next segment. Let \(i\) denote the number of earlier segments of \(P\) that contain at least one absent edge. By the inductive hypothesis, \(d(v) \leq \tilde{d}(v) + i \cdot \tilde{d}(t)\). The inductive step has two cases:
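- For the first case, suppose that the segment between \(v\) and \(w\) contains at least one absent edge. As absent edges can only be \((f, \tilde{f})\)-heavy, this segment comprises only \((f, \tilde{f})\)-heavy backward edges. These edges carry positive flow in \(f\) from \(w\) toward \(v\), so \(d(w)\le d(v)\). Moreover, \(\tilde{d}(v)\le\tilde{d}(t)\), because \(v\) is either \(s\) or the end of a light segment and hence lies on a flow path of \(\tilde{f}\). We can see that $$d(w)\le d(v)\le \tilde{d}(v)+i\cdot\tilde{d}(t)\le (i+1)\cdot\tilde{d}(t)\le \tilde{d}(w)+(i+1)\cdot\tilde{d}(t).$$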
- For the second case of the inductive step, suppose that the current segment \(Q \subseteq P\) contains no absent edges. We will prove, by induction on the vertices of \(Q\), that $$d(x)\le \tilde{d}(x)+i\cdot\tilde{d}(t)$$
for all vertices \(x\) of \(Q\). Suppose \(d(x) \leq \tilde{d}(x) + i \cdot \tilde{d}(t)\) for a vertex \(x\) of \(Q\) and let \(y\) denote the next vertex on the segment.
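- If the edge \(e = (x, y) \in P\) is \((f, \tilde{f})\)-light, then \(\ell_e(f_e) \leq \ell_e(\tilde{f}_e)\) and \(\tilde{f}_e > 0\). Since \(f\) and \(\tilde{f}\) are Wardrop equilibria, we have $$d(y)\le d(x)+\ell_e(f_e)\le \tilde{d}(x)+i\cdot\tilde{d}(t)+\ell_e(\tilde{f}_e)=\tilde{d}(y)+i\cdot\tilde{d}(t).$$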
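- If the edge \(e = (y, x) \in P\) is \((f, \tilde{f})\)-heavy, then \(f_e>0\) and \(\ell_e(f_e)\ge\ell_e(\tilde{f}_e)\). Since \(e\) is not absent, \(\tilde{d}(x)\le\tilde{d}(y)+\ell_e(\tilde{f}_e)\), and therefore $$d(y)=d(x)-\ell_e(f_e)\le \tilde{d}(x)+i\cdot\tilde{d}(t)-\ell_e(\tilde{f}_e)\le \tilde{d}(y)+i\cdot\tilde{d}(t).$$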
In either case, the inductive step holds.
Finally, let's see how this claim implies the theorem. Directly apply it to \(t\) to obtain $$d(t)\le \tilde{d}(t)+k\cdot\tilde{d}(t)=(k+1)\cdot\tilde{d}(t),$$
where \(k\) is the number of segments of \(P\) that include an absent edge. Again by the lemma, the \((f,\tilde{f})\)-heavy segments of \(P\) are disjoint from each other and from \(s\) and \(t\). Picking one absent edge from each of these \(k\) segments that include an absent edge, we obtain a sparse set of \(k\) edges in \(G\), so \(k\) is at most the maximum size of a sparse edge set. Thus, the theorem holds.
Multi-commodity Networks
For multi-commodity networks, the Braess ratio is defined as follows:
The lower bound is exponential in the network size, even for two-commodity networks.
On the other hand, it's at most exponential in the network size.
Detecting Braess Paradox Is Hard
So far, we have studied the possible benefits of Edge Removal. But how do we get this benefit? From an algorithmic perspective, we want to efficiently find the edges to remove so that the Nash equilibrium has the minimum cost. Here we only consider the single-commodity case. We will show the hardness of this problem.
LINEAR NETWORK DESIGN is the problem as follows: given a single-commodity instance \((G,r,\ell)\) with linear cost functions, find a subgraph \(H\) of \(G\) such that the cost of Nash equilibrium for \((H,r,\ell)\) is minimized.
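In fact, the trivial algorithm that simply returns the whole network \(G\) is essentially optimal: it is a \(\frac{4}{3}\)-approximation because the PoA with linear latency functions is at most \(\frac{4}{3}\), and the reduction below shows that no \((\frac{4}{3}-\epsilon)\)-approximation algorithm exists unless P = NP.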
Consider an instance \(\mathcal{I}=(G,s_1,s_2,t_1,t_2)\) of 2DDP, the problem of deciding whether a directed graph \(G=(V,E)\) contains vertex-disjoint \(s_1\)-\(t_1\) and \(s_2\)-\(t_2\) paths. Augment the vertex set by an additional source \(s\) and sink \(t\), and include the directed edges \((s,s_1),(s,s_2),(t_1,t)\) and \((t_2,t)\), with latency functions \(\ell(x)=x,\ell(x)=1,\ell(x)=1,\ell(x)=x\), respectively. All edges in \(E\) have the latency function \(\ell(x)=0\). This constructs an instance \((G',1,\ell)\) of LINEAR NETWORK DESIGN in polynomial time.
We want to show that
- if \(\mathcal{I}\) is a "YES" instance, then there is a subgraph \(H\) of \(G'\) such that the cost of Nash equilibrium for \((H,1,\ell)\) is at most \(\frac{3}{2}\);
- if \(\mathcal{I}\) is a "NO" instance, then for every subgraph \(H\) of \(G'\), the cost of Nash equilibrium for \((H,1,\ell)\) is at least \(2\).
For the first statement, suppose there are vertex-disjoint \(s_1\)-\(t_1\) and \(s_2\)-\(t_2\) paths \(P_1\) and \(P_2\) in \(G\), respectively. Let \(H\) be the subgraph of \(G'\) that consists of the edges of \(P_1\) and \(P_2\) together with the four new edges. Its two \(s\)-\(t\) paths have latencies \(x+1\) and \(1+x\), so routing \(\frac{1}{2}\) unit of flow along each of them yields a Nash equilibrium with cost \(\frac{3}{2}\).
For the second statement, suppose there is no such pair of vertex-disjoint paths, and let \(H\) be a subgraph of \(G'\) that has an \(s\)-\(t\) path. Every \(s\)-\(t\) path in \(H\) contains an \(s_i\)-\(t_j\) path for some \(i,j\in\{1,2\}\). If \(H\) contains an \(s_1\)-\(t_2\) path, then routing all of the flow on it gives every \(s\)-\(t\) path of \(H\) latency exactly \(2\), which is a Nash equilibrium with cost \(2\). Otherwise, \(H\) cannot contain both an \(s_1\)-\(t_1\) path and an \(s_2\)-\(t_2\) path, since two such paths share a vertex and would therefore yield an \(s_1\)-\(t_2\) path. Hence either all \(s\)-\(t\) paths of \(H\) use the edge \((t_1,t)\) or all of them use the edge \((s,s_2)\), and a short case check shows that the Nash equilibrium again has cost \(2\).
We have similar results for GENERAL NETWORK DESIGN, which is the version of LINEAR NETWORK DESIGN with general cost functions.
Taxes
Another approach is to use taxes. Formally, each edge \(e\in E\) is assigned a tax \(\tau_e\ge 0\), and the latency function of edge \(e\) becomes \(\ell_e(f_e)+\tau_e\). The new instance is denoted by \((G,r,\ell+\tau)\), where \(\tau=(\tau_e)_{e\in E}\). The cost of a flow \(f\) in this new instance is defined as $$C(f,\ell+\tau) = \sum_{P\in\mathcal{P}} \ell_P(f) f_P + \sum_{e\in E} \tau_e f_e = C(f,\ell) + \sum_{e\in E} \tau_e f_e.$$ It is a generalization of edge removal, since assigning an infinite tax to an edge is equivalent to removing it.
Marginal Cost Taxes
Intuitively, marginal cost taxes \(\tau_e^m:=f_e\ell_e'(f_e)\) (where \(f\) is an optimal flow) may help, since they induce the optimal flow as a Nash equilibrium.
However, because the taxes paid count toward the cost \(C(f,\ell+\tau)\), they do not always help, for example in the case of linear latency functions.
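A two-link example shows both effects at once. The sketch below assumes a Pigou-style network that is not part of the text: one unit of traffic chooses between a top link with latency \(\ell(x)=x\) and a bottom link with constant latency \(1\). The function names are illustrative, and the equilibrium is computed from the closed form for this specific network only.

```python
def nash_flow_on_top(tax):
    """Equilibrium flow on the top link when it carries tax `tax`.

    The top link costs x + tax, the bottom link costs 1, so traffic shifts
    to the top link until x + tax = 1 (clipped to the feasible range [0, 1]).
    """
    return min(1.0, max(0.0, 1.0 - tax))

def latency_cost(x):
    """Total latency C(f, l) when x units use the top link and 1 - x the bottom link."""
    return x * x + (1.0 - x) * 1.0

x_opt = 0.5                      # minimizer of x^2 + (1 - x)
marginal_tax = x_opt * 1.0       # tau = f_e * l_e'(f_e), with l_e'(x) = 1 on the top link

x_taxed = nash_flow_on_top(marginal_tax)
untaxed_cost = latency_cost(nash_flow_on_top(0.0))          # Nash cost without taxes
taxed_latency = latency_cost(x_taxed)                       # latency part only
taxed_total = taxed_latency + marginal_tax * x_taxed        # C(f, l + tau)

print(x_taxed, untaxed_cost, taxed_latency, taxed_total)    # 0.5 1.0 0.75 1.0
```

The tax does induce the optimal flow (latency cost \(\frac{3}{4}\)), but once the taxes paid are included the cost is \(1\), exactly the cost of the untaxed Nash flow.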
Taxes vs. Edge Removal
One may ask whether arbitrary tax is really more powerful than edge removal. For the case of linear latency functions, the answer is no.
This minimal counterexample has some properties. We can see that \(f_e^\tau>0\) for all edges \(e\in E\) since otherwise we can remove the edge \(e\) with \(f_e^\tau=0\) and obtain a smaller counterexample. Combined with the proposition, \(G\) is directed acyclic. Furthermore,
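However, this fails in general. Consider again the graph \(B^k\), which has \(n=2k+2\) vertices, now with traffic rate \(k+1\) and the following latency functions: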
- Type A edges: \(\ell_e(x)=0\);
- Type B edges: \(\ell_e(x)=1\) for \(x\le \frac{1}{k+1}\) and \(\ell_e(x)=\frac{n}{2}\) for \(x\ge \frac{1}{k+1}+\epsilon\);
- Type C edges: For each \(e\in \{(w_i,t),(s,v_{k-i+1})\}\), \(\ell_e(x)=0\) for \(x\le 1+\frac{1}{k+1}\), \(\ell_e(1+\frac{1}{k})=i\), and \(\ell_e(x)=\frac{n}{2}\) for \(x\ge 1+\frac{1}{k}+\epsilon\)
where \(\epsilon>0\) is a sufficiently small constant.
If we put a tax \(\tau\) of \(1\) on each type A edge and \(0\) otherwise, then the following flow is a Nash equilibrium for \((B^k,k+1, \ell+\tau)\): \(1\) unit flow on each of \(P_1,...,P_k\) and \(\frac{1}{k+1}\) units of flow on each of \(Q_1,...,Q_{k+1}\). This Nash equilibrium proves that \(c(B^k,k+1,\ell+\tau)=1\).
On the other hand, we must show that \(c(H,k+1,\ell)\ge\frac{n}{2}\) for every subgraph \(H\) of \(B^k\). If \(H\) is all of \(B^k\), then routing \(1+\frac{1}{k}\) units on each of \(P_1,...,P_k\) yields a Nash equilibrium so that \(c(H,k+1,\ell)=\frac{n}{2}\). This is still true if \(H\) only removes some type B edges.
If a type B edge carries at least \(\frac{1}{k+1}+\epsilon\) units of flow or a type C edge carries at least \(1+\frac{1}{k}+\epsilon\) units of flow, we will say that the edge is oversaturated. We will show that if \(H\) removes some type A or type C edges, then the corresponding Nash equilibrium must oversaturate some edges, which immediately implies that \(c(H,k+1,\ell)\ge\frac{n}{2}\).
Suppose \(H\) removes some type C edges. Without loss of generality, assume \((s,v_i)\) is not in \(H\). Without oversaturation, \(s\) can only send out at most \((\frac{1}{k+1}+\epsilon)+(1+\frac{1}{k}+\epsilon)(k-1)\le k(1+\epsilon)\) units of flow, which is less than the required total flow of \(k+1\).
Finally, suppose \(H\) removes some type A edges, say \((v_i,w_i)\). The vertex \(v_i\) then has at most one outgoing edge in \(H\), which must be a type B edge. If it is not oversaturated, then \((s,v_i)\) carries at most \(\frac{1}{k+1}+\epsilon\) units of flow. By a similar argument as before, some edge incident to \(s\) is oversaturated. In either case, \(c(H,k+1,\ell)\ge \frac{n}{2}\) and the proof is complete.
Hardness of Computing Optimal Tax
Regarding computational hardness, we have the following result: assuming P \(\neq\) NP, for every \(\epsilon>0\) there is no
- \((\frac{4}{3} - \epsilon)\)-approximation algorithm for the problem of computing the optimal tax in networks with affine latency functions;
- \(o(p/\log p)\)-approximation algorithm for the problem of computing the optimal tax in networks with latency functions that are polynomials with degree \(p\) and nonnegative coefficients;
- \(O(n^{1-\epsilon})\)-approximation algorithm for the problem of computing the optimal tax in \(n\)-node networks with arbitrary latency functions.
Stackelberg Routing
Risk Model