Optimizing the measurement of the trac on large Networks: An Experimental Design Approach

of 7
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Related Documents
Optimizing the measurement of the trac on large Networks: An Experimental Design Approach Guillaume Sagnol, Mustapha Bouhtou, Stéphane Gaubert INRIA Saclay / CMAP, Ecole Polytechnique Route de Saclay,
Optimizing the measurement of the trac on large Networks: An Experimental Design Approach Guillaume Sagnol, Mustapha Bouhtou, Stéphane Gaubert INRIA Saclay / CMAP, Ecole Polytechnique Route de Saclay, Palaiseau Cedex, France Orange Labs R&D, 38 rue du Général-Leclerc Issy-les-Moulineaux Cedex 9, France Abstract We study the classical problem of inferring the trac on each Origin-Destination (OD) pair of a large IP network. The most recent methods take advantage of network-monitoring tools such as Netow (Cisco Systems), which supplement the link measurements by direct information on the OD ows. The aim is to reduce the costs of deployment of Netow, and to optimize its use. We formulate a combinatorial optimization problem, whose objective is to nd the best set of interfaces on which Netow should be activated. Our approach relies on an experimental design model, in which we minimize the variance of an unbiased estimator of a linear combination of the ows. We show that this problem can be solved very eciently by Second Order Cone Programming, and we present a method called Successive Optimal gamma Designs to optimize the deployment of Netow. We give experimental results comparing our method with previously proposed ones. Keywords: Trac measurement, Experimental design, Combinatorial optimization, Second Order Cone Programming. 1 Introduction The problem of estimating Origin-Destination (OD) trac matrices for backbone networks has recently attracted much interest from both Internet providers and the network research community [5, 9, 8, 14], because these trac matrices serve as important inputs of a variety of network trac engineering tasks. This estimation problem is generally stated as follows. We are given the graph of the network, with its set of l edges (or links). Direct measurements are provided by the Simple Network Management Protocol (SNMP), which gives some statistics on the links (for instance, the number of bytes seen on each link in a 5 minutes window). We will denote these SNMP link counts by Y SNMP = (y 1,..., y l ) T. We are also given the set of routes among the network, that is to say a set of m OD pairs, and for each pair, the set of links that a byte need traverse to go from origin O to destination D. The information about the routing is classically gathered in the l m incidence matrix A 0 : this is a 0/1 matrix whose (e, r)-entry takes the value 1 if and only if the OD pair r traverses edge e. More generally, the Internet provider routing policies may lead us to consider matrices in which A 0(e,r) is a real number representing the fraction of the trac from OD pair r that traverses link e. The unknown in our problem is the vector of OD ows X = (x 1,..., x m ), where x r is the number of bytes which have been traveling through OD pair r during the observation period. The following relation is easily seen to hold: Y SNMP = A 0 X. 1 In typical networks, we have l m, and so, the estimation of the ow distribution X is a highly illposed problem. In particular, the previous system cannot have a unique solution X, and we need some additional constraints or information to ensure the identiability of the model. A way to introduce new constraints is to use a network-monitoring tool such as Netow (Cisco systems). This was considered by the authors of [8], who proposed a scheme for selecting dynamically the ows to be measured by Netow, in order to improve the accuracy of the trac estimation. Of course, activating Netow everywhere on the network yields an extensive knowledge of the OD ows. According to [6] however, activating Netow on an interface of a router causes its CPU load to increase by 10 to 50%. Moreover, the deployment of Netow on a network with heterogeneous routers may rise maintenance issues. It is thus of great interest to optimize the use of this tool. A similar problem arises in the eld of road trac ; in that case, the pneumatic cables counting the number of vehicles driving on a road replace the SNMP data, and one can make surveys instead of using Netow. The issue is thus to minimize the economic costs of the surveys needed to estimate the trac between each origin and destination with a prescribed accuracy. Related Work The placement of Netow has attracted much interest from the network research community [1, 2, 3, 4, 11, 13]. Most recent work includes Cantieni, Iannaccone, Barakat, Diot and Thiran [4], who interested themselves in the optimal rates at which Netow should be sampled on each router, and formulated this problem as a convex minimization problem, which they solved using a projected gradient algorithm. Song, Qiu and Zhang [11] used special criteria from the theory experimental design to choose a subset of measurements that Netow should perform, and developed an ecient greedy algorithm to nd a near optimal solution to this combinatorial problem. In a previous work [2], we formulated the placement of Netow as a combinatorial optimization problem, and showed that the Greedy algorithm always nd a solution within 1 1/e 62% of the optimal. The main contribution of this paper is to give an alternative to the greedy approach of [11]: the Successive Optimal gamma-designs approach (SOGD) introduced in Section 3 is not memory-consuming, it reduces to solving a sequence of moderate size second order cone programming problems which can be done rapidly by interior point methods, it may be tuned by the operator in order to give more weight to the most important ows. It yields results of a quality comparable to the greedy approach, whereas it is applicable to much larger instances. The rest of this paper is organized as follows: The problem and the experimental design background are presented in Section 2. Next, we discuss previous methods to solve this problem, in particular the Netquest greedy approach proposed in [11]. The Successive Optimal gamma-designs approach (SOGD), which is the main contribution of this manuscript, is presented in Section 3. Finally, we give some experimental results in Section 4. 2 Experimental design background and Problem statement When Netow is activated on an interface of the network, it will analyze the headers of the packets traversing this interface, and as a result we will have access to some statistics, such as the source and destination IP addresses, and the source and destination AS numbers of these packets. However, we are not directly interested in this information, because we are not trying to estimate the global run of the packets, but only the part of their run which is inside the network of interest, like the backbone of an autonomous system (AS). Practically, and without any loss of generality [2], we will assume throughout this paper that when Netow performs a measure on the k th interface, we get a multidimensional measure Y k which is a linear combination of the OD ows traversing interface k: Y k = A k X. (1) 2 We dene the design variable w as the 0/1 vector of size s, where w k equals 1 if and only if Netow is activated on the interface k. The measurement vector Y is now the concatenation of the SNMP data Y SNMP with all the Netow measurements (Y k ) {k wk =1}. The measurements are never exact in practice, and we have to deal with a noise ɛ, which is a result, among other things, of Netow sampling and lost packets. This can be modeled as follows: where Y = [Y T SNMP, Y T k1,..., Y T k n ] T and A(w) = [A T 0, A T k 1,..., A T k n ] T. Y = A(w) X + ɛ, (2) Under the classical assumptions that we have enough measurements, so that A(w) is of full rank, and that the noise has unit variance (E(ɛɛ T ) = I), the best linear unbiased estimator is given by a Moore-Penrose generalized inverse : ˆX = A(w) Y (Gauss Markov Theorem), and its variance is: Var( ˆX) = (A(w) T A(w)) 1. (3) If we further assume that the noise follows a normal distribution N (0, I), then this variance is nothing but the inverse of the Fisher information matrix of the OD ows. We denote it by M F (w): s M F (w) = A(w) T A(w) = A T 0 A 0 + w k A T k A k. The experimental design approach consists in choosing the design w which maximizes a scalar function of the information matrix M F (w), which is concave and nondecreasing with respect to the natural ordering of the symmetric cone of positive semidenite matrices, namely the Loewner ordering. For a more detailed description of the information functions, the reader is referred to the book of Pukelsheim [10], who proposes to use the matrix spectral functions Φ p, which are essentially the L p -norms of the vector of eigenvalues of the Fisher information matrix, but for p [, 1]. The function Φ p is given by k=1 0 for p 0 and M singular; λ min (M) for p = ; Φ p (M) = (det(m)) 1 m for p = 0 ; ( 1 m trace M p ) 1 p otherwise, where we have used the extended denition of powers of matrices M p for p R: trace M p = m j=1 λp j. We now give a mathematical formulation to the problem of optimizing the use of Netow. Assume that the cost of deployment/activation of Netow on interface k is c k. If an Internet provider has a limited budget B, the Netow Optimal Deployment problem is: ( ) max M F (w) (4) w {0,1} sφ p s.t. w k c k B k When the cost of deployment c k is the same everywhere, the constraint is equivalent to deploy Netow on no more than n interfaces. We call this special case the unit-cost case. For particular values of p, it is interesting to notice that we recover some classical optimal designs problems, known in the litterature as A optimality (p = 1), E optimality (p = ), D optimality (p = 0) and T optimality (p = 0). For small positive values of p, we also recover the maximal rank design [2]. 3 Resolution of the problem In this section, we review previous methods to solve the Netow optimal deployment problem, and we develop a new one. We limit our attention to methods that can be used on a large scale Network, say with more than 5000 OD pairs, and 100 interfaces. 3 Greedy Algorithm The hardness and approximability results given presented in [2] suggest to use a greedy algorithm. On a network with m = 5000 OD pairs however, the computation of the objective function Φ p (w) requires about 2 minutes on a PC at 4GHz, since it includes the diagonalization of a m m matrix. Consequently, selecting only one among one hundred interfaces already requires more than 3 hours. In order to overcome this problem, the authors of [11] proposed to use the special values p = 0 or p = 1, which make the increment of the criterion eciently computable thanks to Sherman-Morrisson like formulae. We adapted their idea to the case of block observations (1): ( ) M + A T 1 k A k = M 1 M 1 A T ( k I + Ak M 1 A T ) 1Ak k M 1, det(m + A T k A k ) = det(m) det ( I + Ak M 1 A T k ). One could naturally object that at the beginning of the algorithm, when no interface has been selected yet, the initial observation matrix M 0 = A T 0 A 0 is not invertible. The authors of [11] remedy to this problem by regularizing the initial observation matrix: they set M 0 = A T 0 A 0 + εi, with ε = Although this trick may look arbitrary, it leads to very good results. However, the inconvenient of these greedy updates is that one needs to store the m m matrix M 1 in memory, and to update it with the Sherman-Morrisson formula each time a new interface is selected by the greedy algorithm. The authors of [11] work on a similar experimental design problem, in which the observation matrix M has the nice property of being very sparse. Then, storing a sparse LU decomposition of the M is much more ecient than storing the full M 1 matrix, and it allows one to compute M 1 A T k. In our case though, M 0 is partially sparse only, and the LU decomposition is full. So this method may have reached its limit for our problem: it is unlikely that we may use it on a network with m = OD pairs without an additional eort to handle the storage of M 1. Convex Programming algorithms A natural idea to solve the problem (4) is to replace the integer constraint w {0, 1} s by 0 w k 1, k {1,..., s}. This relaxation lets problem (4) become a convex program. We can therefore apply dierent techniques to solve it, for instance projected gradient algorithms. Moreover, the solution of this relaxed program might be interpreted to some extent as a sampling of Netow. The projected gradient algorithm for Problem (4) is described in [2], as well as rounding techniques to approximate the integer solution. The bottleneck of this algorithm is the spectral decomposition required to compute the gradient of Φ p. For future work, we hope to implement second order techniques, by using appropriate approximations of the Hessian, whose direct computation is not tractable on large networks. Successive Optimal gamma-designs The hardness of the Optimal Netow Deployment is linked to the large dimension of the parameter that we want to estimate, which leads to large size covariance matrices. Rather than searching an estimator for the full parameter X, a natural idea is to estimate a linear combination z = γ T X of the ows, for which the best linear unbiased estimator and its variance are known [10] : ẑ = γ T A(w) y, var(ẑ) = γ T M F (w) γ, where A denotes the Moore-Penrose inverse of A. The variance of this estimator is a scalar, but it still depends (non-linearly) on a m m matrix. The following result shows that the minimization of var(ẑ) is equivalent to a Second Order Cone Program (SOCP) ; the proof is omitted due to the lack of space, and will be published elsewhere. As in [7] (which handles the case in which the observation matrices A i are vectors), the idea is to give a geometrical characterization of the γ optimal designs, based on the intersection of the vectorial straight line directed by γ and the boundary of a special convex set. 4 Theorem 3.1. Let w be the optimal (real) solution of the γ optimal design problem: min w 0 γ T M F (w) γ w i c i = B, M F (w) = i 1 s w i A T i A i. And let h, (µ, z ) be a pair of primal and dual solution of the SOCP: (P SOCP ) : max h i=0 γ T h (D SOCP ) : min µ 0,z A 0 h 1; A T 0 z 0 + i 1 B/ci A T i z i + γ = 0 i 1, A i h c i B i 0, z i µ i. Setting T = i 1 µ i, the following relations hold : w 0 = µ 0/T, i 1 w i = µ i B/(c i T ), T 2 = γ T M F (w ) γ. This theorem shows how to compute the optimal weights w for the γ combination of the ows by SOCP. This can be done very eciently with interior points codes such as Sedumi [12]. Moreover, this method takes advantage of the sparsity of the matrices A i, since SOCP codes are optimized to work with sparse matrices. Another advantage of this method is the exibility brought by the vector γ : The operator can choose this vector so that the weighted sum correlates the most important ows. In order to estimate all the ows, a possibility is to repeat the SOCP several times with vectors γ either chosen by the operator for their signicance or tossed randomly on the unit sphere of dimension m, and nally to combine (for instance by taking the mean) the resulting optimal designs : we call this method SOGD (Successive Optimal gamma-designs). Once the optimal fractional w i values are obtained, the deployment of Netow can be done using a simple rounding heuristic: one can sort the interfaces according to the ratio wi c i, and activate Netow sequentially on these sorted interfaces until the budget constraint is violated. We may ask ourselves whether it is a good idea to add the constraint µ i B c i k 1 µ k in Problem (D SOCP ), in order to ensure that w i 1. However, our proof does not seem to adapt to this case, and we can not guaranty that the w that one would obtain with this additional constraint is the solution of the γ optimal design problem in [0, 1]. Moreover, the optimal fractional w i values never exceed 1 in practice thanks to the budget constraint, and even if it would, it does not prevent one from applying the rounding heuristic proposed above (although we loose the sampling interpretation of the vector w ). i 0 µ i 4 Experimental Results We now present some comparisons between the methods presented above : experiments were carried out on real data from France Telecom Opentransit backbone (100 nodes, 267 links and 5591 OD-pairs). The inference was made on dynamic ows observed during 50 time steps, using the widespread methodology of entropic projections [9, 8], and the gravity model as initial estimate [9]. As our sample of Netow data was incomplete, we simulated the true value of the unmeasured ows, and we generated both SNMP and Netow measurement using relation (2). Table 1 compares the approaches presented in the previous section in terms of both CPU time and the size of the smallest full-rank design found, i.e. the design with the smallest number of nodes where one should activate Netow in order to infer exactly all the ows. The computation was made in the unit cost case, and we assumed that Netow measured each ow traversing the interfaces where it is activated. The performance of the SOGD method is outstanding in term of CPU time, although the greedy algorithm 5 Figure 1: WAE vs time with Netow on 10 nodes (left) and 17 nodes (right) selected by Greedy (red), SOGD (green), and P.Gradient (blue). Algorithm Smallest fullrank design CPU Greedy (p = 1) 35 3h Projected gradient 39 65h SOGD 36 4min SOGD + Rounding heuristic 35 16min Table 1: Comparison based on the smallest full-rank design found nds a smaller full-rank design. Nevertheless, a simple rounding heuristic [2] allows one to obtain the same full-rank design that the one provided by the Greedy algorithm. Table 1 also shows the limit of direct convex programming methods, which fail to be ecient on large scale networks. Figure 2 compares the deployment of Netow on 10 nodes of the network obtained by the greedy algorithm (in red) and by SOGD (circled in blue). One can notice that these deployments are the same except for 2 nodes. The metric used to compare the designs is the weighted average error(wae) : At time step t, it is dened as the mean of the vector of relative errors X t ˆX t /X t, where ˆX t is the vector found by entropic projection of ˆXt 1 over the space{x 0 A(w)X = Y t (w)}. Figure 1 shows the weighted average error on the trac estimation for the 1000 largest ows, which represent 97% of the trac due to their lognormal distribution [9]. If the estimation is better with the greedy deployment for Netow activated on 10 nodes, the SOGD method yields better results for 17 nodes. Another important comparison of the deployment of Netow is the number of ows correctly estimated by each method. Figure 3 shows the distribution of the errors for the 1000 largest ows (averaged over time), when Netow is activated on 17 nodes. It is very satisfactory to notice that more than 97% of the 1000 largest ows are estimated with an average error below 1% by the SOGD mthod. We hope to improve this result by tuning the vector γ in order to estimate the largest ows in priority. Finally, Figure 4 shows how the weighted temporal error falls when we add Netow measurements, with nodes selected by SOGD. This suggests that 17 interfaces are enough to infer correctly the trac matrix. The additional eort to obtain a design of full-rank is both huge and of limited interest, since no improvement on the estimation is to notice between 17 and 35 nodes. The reason why the ows are not inferred exactly with a full rank matrix is the error ɛ on the measurements (2). 5 Conclusions In this paper, we have proposed a new method to optimize the trac measurement, based on the estimation of a sequence of linear combinations of the ows, rather than on the estimation of the full vector of ows. This method remains tractable for very large instances, and it allows one to identify the trac accurately. Our computational results also suggest that it may be unnecessary to activate measurement tools such as Netow with a design of full-rank, since we achieve results of the same quality by activ
Similar documents
View more...
Related Search
We Need Your Support
Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks