Sunteți pe pagina 1din 14

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE/ACM TRANSACTIONS ON NETWORKING

1

A Difference Resolution Approach to Compressing Access Control Lists

James Daly, Alex X. Liu, and Eric Torng

Abstract— Access control lists (ACLs) are the core of many net- working and security devices. As new threats and vulnerabilities emerge, ACLs on routers and firewalls are getting larger. There- fore, compressing ACLs is an important problem. In this paper, we propose a new approach, called Diplomat, to ACL compres- sion. The key idea is to transform higher dimensional target pat- terns into lower dimensional patterns by dividing the original pat- tern into a series of hyperplanes and then resolving differences be- tween two adjacent hyperplanes by adding rules that specify the differences. This approach is fundamentally different from prior ACL compression algorithms and is shown to be very effective. We implemented Diplomat and conducted side-by-side compar- ison with the prior Firewall Compressor, TCAM Razor, and ACL Compressor algorithms on real life classifiers. Our experimental results show that Diplomat outperforms all of them on most of our real-life classifiers, often by a considerable margin, particu- larly as classifier size and complexity increases. In particular, on our largest ACLs, Diplomat has an average improvement ratio of 34.9% over Firewall Compressor on range-ACLs, of 14.1% over TCAM Razor on prefix-ACLs, and 8.9% over ACL Compressor on mixed-ACLs.

Index Terms— Packet classification, router design, ternary con- tent addressable memory (TCAM).

I. INTRODUCTION

A. Background and Motivation

A CCESS control lists (ACLs) are the core of many net- working and security devices, such as routers and fire-

walls, which perform services such as packet filtering, virtual private networks (VPNs), network address translation (NAT), quality of service (QoS), load balancing, traffic accounting and monitoring, and differentiated services (Diffserv). A packet can

be viewed as a

fields with finite discrete domains.

For IP packets,

is typically five and the relevant fields are

source IP address, destination IP address, source port number,

-tuple over

destination IP address, source port number, -tuple over destination port number, and protocol type, where the
destination IP address, source port number, -tuple over destination port number, and protocol type, where the
destination port number, and protocol type, where the domains of these fields are , ,
destination port number, and protocol type, where the domains
of these fields are
, ,
,

Manuscript received January 25, 2014; revised September 22, 2014; accepted November 24, 2014. This work was supported by the National Science Foun- dation under Grants CNS-0916044, CNS-1017588, CNS-1017598, and CCF- 1347953, and the National Natural Science Foundation of China under Grants 61472184, 61321491, and 61272546. A preliminary version of this paper ap- peared in the Proceedings of the 32rd Annual IEEE International Conference on Computer Communications (INFOCOM), Turin, Italy, April 2013. (Corre- sponding author: Alex X. Liu.) . The authors are with the Department of Computer Science and Engi- neering, Michigan State University, East Lansing, MI 48824 USA (e-mail:

dalyjame@cse.msu.edu; alexliu@cse.msu.edu; torng@cse.msu.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNET.2015.2397393

, and , respectively. An ACL is specified as a se- quence (i.e., ordered list)
, and
, respectively. An ACL is specified as a se-
quence (i.e., ordered list) of predefined rules. Each rule is spec-
ified in the form
over packet fields
. The
is typically specified as
. If each
is specified
as a range, we call the ACL a range-ACL. If each
is speci-
fied as a prefix, we call the ACL a prefix-ACL. If some
are

specified as prefixes and others as ranges, then we call the ACL

a mixed-ACL. The

carded , or a combination of these decisions with other options such as a logging option. An ACL is essentially a function from the space of legal packets to a set of possible decisions. When

a packet arrives at a router, the router extracts the relevant field values to form a search key and searches the ACL to find the first rule that the search key matches. The decision of this rule determines the appropriate action to take upon the packet. The rules in an ACL often conflict, which means that a packet may match multiple rules and these rules may have different deci- sions. The way that ACLs resolve conflicts is to follow the first match principle: for any packet, the decision for the packet is the decision of the first packet that the packet matches. In this paper, we focus on the ACL Compression Problem:

given a range-ACL, pre x-ACL or mixed-ACL equivalent range-ACL, pre x-ACL or mixed-ACL

the number of rules in

We consider three different variations of the ACL Compression Problem because of varying application constraints. For appli- cations where all fields must be prefixes, we use the prefix-ACL variant. For applications where all fields can be ranges, we use

the range-ACL variant which allows for more compression. For applications where only some fields are constrained to be prefixes, we use the mixed-ACL variant. This problem has many motivations. First, as new threats and vulnerabilities emerge, ACLs on routers and firewalls are getting larger. For example, because the ACLs used for quality

of services (QoS) on routers are often automatically generated, they tend to be huge. Second, in the Open Flow architecture, which is gaining more popularity and real deployment, net- working devices typically need to handle a large number of ACL-like rules; thus, compressing such ACL-like rules is very helpful for managing and optimizing such devices. A number of network system management work, such as Yu et al.'s DIFANE work [18] and Sung et al. 's work [15], have used our prior ACL compression tool called Firewall Compressor [11], [12] to reduce system complexity. Third, some networking and security devices have strict limits on the number of rules that can be stored. For example, NetScreen-100 only allows ACLs with 733 rules. Fourth, fewer ACL rules often leads to higher

733 rules. Fourth, fewer ACL rules often leads to higher of a rule can be accepted

of a rule can be accepted , dis-

, fi nd an so that
, fi nd an
so that
of a rule can be accepted , dis- , fi nd an so that , denoted

, denoted

can be accepted , dis- , fi nd an so that , denoted , is as

, is as small as possible .

1063-6692 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

2

IEEE/ACM TRANSACTIONS ON NETWORKING

system performance. As ACLs are used to examine every incoming and outgoing packet, such performance is critical for network throughput.

B. Summary and Limitations of Prior art

network throughput. B. Summary and Limitations of Prior art Optimal polynomial time algorithms have been developed

Optimal polynomial time algorithms have been developed

for 1-D range-ACL compression (Top Coder Challenge in 2003, [2], [11]) and prefix-ACL compression [4], [16]. Applegate et

al. proved that the 2-D range-ACL compression problem is

NP-hard [2]. The complexity of 2-D prefix-ACL compression is still unknown, but prefix-ACL compression with an arbitrary number of dimensions is NP-hard[5]. The only approxima- tion algorithm with a non-trivial approximation bound is one given by Applegate et al. which has an approximation ratio of

where word size in bits, and
where
word size in bits, and

for 2-D range-ACL compression and for 2-D prefix-ACL compression,

is the

is the number of rules in the optimal

is the number of rules in the input rule list,

the optimal is the number of rules in the input rule list, Fig. 1. Overview of
Fig. 1. Overview of Diplomat in two dimensions. Fig. 2. Two-dimensional pattern divided into three
Fig. 1.
Overview of Diplomat in two dimensions.
Fig. 2.
Two-dimensional pattern divided into three rows.

rule list [2]. Their bounds are tight. Their method requires dividing the input into several subproblems to deal with the special forbidden rectangle pattern described in Section V.

The number of subproblems required may be

, each of

which may require

Very few prior algorithms work for more than two dimen- sions. Liu et al. 's Firewall Compressor algorithm [11] handles

-dimensional range-ACLs by converting the rule list into a canonical firewall decision diagram and applying an optimal weighted algorithm to each of the 1-D patterns formed between two layers of the diagram. In their algorithm, the “decision” used is actually the rule list that is produced by lower levels, which is replicated for each of the corresponding rules in the higher dimension. However, Firewall Compressor only considers whether or not the two lower rule lists are the same. If two lists are very similar but not the same, both lists are

replicated in their entirety. Their later work, TCAM Razor, pression approach that is totally different from all prior ACL works by the same principle but uses an optimal prefix-ACL compression algorithms. This novel approach is also very ef- algorithm rather than a range-ACL algorithm [14]. Finally, they fective. We implemented Diplomat and conducted side-by-side

combined both works into ACL Compressor, which can use comparison with the prior Firewall Compressor, TCAM Razor

either prefix or range compressors on different fields .

The key contribution of this paper is our new ACL com-

consider the result of each resolution to be one homogeneous re- gion. We illustrate this process for a 2—D range-ACL in Fig. 1. We call this algorithm Diplomat because it compresses a rule list by repeatedly “resolving differences” between adjacent planes. Fig. 2 shows that, for an example input ACL, Firewall Com- pressor produces more rules than Diplomat. Note that Diplomat is designed to be run offline so that net- work managers do not need to interact with the compressed ACL. Rather, the manager can interact with the rule list in a comfortable and understandable form. This can then be input into Diplomat, which will create the compressed rule list actu- ally used by the network device.

D. Key Contributions

actu- ally used by the network device. D. Key Contributions rules. and ACL Compressor algorithms on
actu- ally used by the network device. D. Key Contributions rules. and ACL Compressor algorithms on

rules.

ally used by the network device. D. Key Contributions rules. and ACL Compressor algorithms on real

and ACL Compressor algorithms on real life classifiers. The experimental results show that Diplomat outperforms Firewall Compressor, TCAM Razor and ACL Compressor most of the

We present a new approach, called Diplomat, to ACL com- time, often by a considerable margin. In particular, on our largest

pression. The key idea is to transform higher dimensional target ACLs, Diplomat has an average improvement ratio of 34.9%

C. Proposed Approach

over Firewall Compressor on range-ACLs, of 14.1% over Razor

patterns into lower dimensional patterns by dividing the original

pattern into a series of hyperplanes. It then selects two adjacent on prefix-ACLs, and 8.9% over ACL Compressor on mixed- planes and resolves their differences by adding rules to specify ACLs. Furthermore, on 2-D classifiers we demonstrate approx-

where the two planes differ. After resolution, the two planes are imation ratios of compatible and can be merged into a single plane. For example,

in Fig. 2, a single white rule in the middle row would allow it the prior best result of [2]. We do note that Diplomat achieves to be merged with the top row. Diplomat repeats this process these performance gains at the cost of additional running time until all of the planes have been merged together into a single compared to both Firewall Compressor and TCAM Razor.

plane. Diplomat then repeats the process on the reduced pat-

The remainder of this paper is organized as follows. First, we

tern. After recursive applications, the pattern is reduced to a 1-D review related work in Section II. Second, we present key def- pattern for which optimal algorithms already exist. In terms of initions and some fundamental results in Section III. We then construction, the rules generated by each resolution step and the describe how Diplomat is applied to range and prefix-ACLs in final compression step are joined together into one rule list with Section IV. We cover the theoretical approximation bounds in the rules from earlier steps appearing before the rules from later Section V and the practical experimental results in Section VI. steps. Thus, any section specified by an earlier resolution cannot We conclude the paper by summarizing our results and dis- be undone by a later resolution. This is what allows Diplomat to cussing some open problems in Section VII.

is what allows Diplomat to cussing some open problems in Section VII. for range-ACLs and for

for range-ACLs and

for prefix-ACLs, improving on

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DALY et al.: DIFFERENCE RESOLUTION APPROACH TO COMPRESSING ACCESS CONTROL LISTS

3

II. RELATED WORK

we cannot leverage any existing RPC algorithms to help with range-ACL compression.

D EFINITIONS AND N OTATIONS

Let

,
,

be the set of decisions. It can be any finite set, such as

accept, deny

, or

0, 1, 2, 3, 4

as the Cartesian product of where . A box
as the Cartesian product of
where
. A box

. For the

is the number of bits used to

of where . A box . For the is the number of bits used to yes,

yes, no

is any. A box . For the is the number of bits used to yes, no one

one of those decisions. We define a canvas

finite

discrete dimensions, prefix version,

represent dimension

by a Cartesian product of intervals. For prefix classifiers, the

for some

is any region represented

. The actual items

are irrelevant so long as they are distinct. A color

intervals are represented by the prefix

with
with

. A pattern or coloring is a function

the prefix with . A pattern or coloring is a function to colors in . An

to colors in

with . A pattern or coloring is a function to colors in . An incomplete pattern
with . A pattern or coloring is a function to colors in . An incomplete pattern

. An incomplete

pattern is a pattern that does not define a color for some points

relating points in

in the canvas. Given two incomplete patterns and that do not both assign colors to
in the canvas. Given two incomplete patterns
and
that do
not both assign colors to the same point in the canvas,
is
the pattern formed by joining the two patterns
and
.
A rule
assigns exactly one color
to each
point in box
of rules and
. A rule list
is an ordered list
is the number of rules in
. A rule list
assigns colors to points as follows: for any point
,
is the value of
for the minimum
such that
is
defined. A rule list is complete if
is defined for all points
in the canvas and is incomplete otherwise. We use
to denote

an incomplete rule list. We use the terms rule list and classifier

and the pattern

, we call the color of

defined by

interchangeably to refer to both a rule list

of defined by interchangeably to refer to both a rule list . For a complete rule
of defined by interchangeably to refer to both a rule list . For a complete rule

. For a complete rule list

to refer to both a rule list . For a complete rule list the background color

the background color of

rule list . For a complete rule list the background color of where , and we
rule list . For a complete rule list the background color of where , and we

where

. For a complete rule list the background color of where , and we can assume

, and we can assume without loss

since

color of where , and we can assume without loss since applies to all of and

applies to all of

and

, .
,
.

to the end of

of generality that

not covered by prior rules. For two rule lists

denotes the rule list formed by appending

We will also write

is a pattern; in this case, the

pattern implied by incomplete rule list where both are defined. overrides for points Two rule
pattern implied by incomplete rule list
where both are defined.
overrides
for points
Two rule lists are equivalent,
, if
. We denote the set of all rule lists equivalent to
as
.
is an optimal rule list for a pattern
if
and
.
Our objective is to find an optimal rule
list for an input pattern
. This was shown in [2] to be NP-hard

for patterns of two or more dimensions in the range case (and may be for the prefix case as well), so we try to find one as close to optimal as possible.

Given a pattern
Given a pattern

, box

is
is

-monochromatic if

. We can break

Given a pattern , box is -monochromatic if . We can break The 2-D range-ACL compression
Given a pattern , box is -monochromatic if . We can break The 2-D range-ACL compression

The 2-D range-ACL compression problem was proven to be NP-hard in [2]. This implies that mixed-ACL compression with at least two range fields is also NP-Hard. Whether the prefix-ACL problem with a fixed number of dimensions is NP-hard is currently open. The 1-D case for both range and prefix-ACL compression is in P. Applegate et al. [2] presented an algorithm for compressing 2-D range-ACLs. For the special case of strip rules where each rectangle spans the entire canvas in one of the two dimensions and only two colors, their algorithm is optimal. They then describe a way to divide the ACL into regions which can each be solved with their strip rule solver. When applying their solution to the general 2-D range-ACL problem, they gave a

-approximation algorithm. They also

adapt their methods to prefix-ACL compression which adds

is the

a factor of

number of bits in the prefix word size. This is the only approxi- mation result we are aware of for either range or prefix-ACL compression for two or more dimensions. We adapt their methods in Section V to demonstrate that some versions of Diplomat can achieve the same bounds. Finally, they observe that the 1-D range-ACL version was given as StripePainter in

the 2003 Google TopCoder challenge and that a

ning time can be achieved, where

decisions [1]. For the 1-D prefix-ACL compression problem, Draves et al. gave an optimal algorithm based on tries [4], and Suri et al. gave an optimal algorithm based on dynamic programming [16]. Suri et al. also gave an optimal dynamic programming solution for the special case where two prefix rectangles in the ACL can intersect only if one is fully contained within the other. In this paper, we use Suri's algorithm to optimally compress the 1-D prefix-ACLs which the prefix version of Diplomat generates as subproblems. Liu et al. presented their own optimal 1-D range-ACL com- pression solution based on dynamic programming methods with

is the number of

rules and

an improved running time of

with is the number of rules and an improved running time of to their approximation bounds
with is the number of rules and an improved running time of to their approximation bounds

to their approximation bounds where

improved running time of to their approximation bounds where run- is the number of possible ,
improved running time of to their approximation bounds where run- is the number of possible ,

run-

is the number of possiblerunning time of to their approximation bounds where run- , where is the maximum number of

, whereapproximation bounds where run- is the number of possible is the maximum number of rules with

bounds where run- is the number of possible , where is the maximum number of rules
bounds where run- is the number of possible , where is the maximum number of rules

is the maximum number of rules with the same deci-

sion [11]. They then applied this algorithm to each of the layers of a firewall decision diagram [7] to create a solution for any number of dimensions. In their later paper [10], they apply a sim- ilar idea using the optimal 1-D prefix-ACL solver from [16] to expand their solution to operate on prefix-ACLs. In , Liu et al. de- fine ACL Compressor which considers ACLs where some fields are ranges and some fields are prefixes. ACL Compressor oper- ates on the same principles, using an optimal 1-D range-ACL solver on range fields and an optimal 1-D prefix-ACL solver on

prefix fields. In this paper, we use Firewall Compressor to opti- any rectilinear pattern

mally compress the 1-D range-ACLs which the range and mixed ment that separates and bounds two differently-colored regions

versions of Diplomat generate as subproblems.

a boundary line . Extend each boundary line in both directions

, call a horizontal or vertical line seg-

using Applegate et al.'s effective grid concept. Quoting [2], “For

into monochromatic boxes

Finally, the Rectilinear Polygon Cover (RPC) Problem, so that it crosses the full canvas. We call the resulting grid the which has been well explored in prior work [3], [6], [13], can “effective grid” for the pattern, and note that all its grid cells be thought of as a special case of the range-ACL compression will be monochromatic.” Without loss of generality, we assume

is converted to the effective grid space, so

color of all prior rules must be black. However, RPC is very

problem where the color of the last rule must be white and the the input rule list

of the last rule must be white and the the input rule list is the effective
of the last rule must be white and the the input rule list is the effective

is

the effective grid and

is one more than the . We also order the

. Applegate et al. also showed

different than range-ACL because the ability to use rectangles number of boundary planes in dimension

with different colors leads to much shorter rule lists. Thus, dimensions so

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

4

is final as presented, with the exception of pagination. 4 Fig. 3. Effective grid denoted by
Fig. 3. Effective grid denoted by the dashed lines. Fig. 4. Merging planes and .
Fig. 3.
Effective grid denoted by the dashed lines.
Fig. 4.
Merging planes
and
. (a) Before. (b) After.

that

in Fig. 3.

[2]. An example of an effective grid can be seenplanes and . (a) Before. (b) After. that in Fig. 3. III. M ETHODOLOGY We now

III. METHODOLOGY

We now formally describe the Diplomat classifier compres-

IEEE/ACM TRANSACTIONS ON NETWORKING

, we have reached our base case and use the optimalclassifier compres- IEEE/ACM TRANSACTIONS ON NETWORKING 1-D solvers in [11] or [16] to generate a solution

1-D solvers in [11] or [16] to generate a solution to

.
.

Algorithm 1 Diplomat Solver

Require: Pattern Ensure: Rule list 1: 2: is the subpattern with Sub-planes 3: Intervals 4:while
Require: Pattern
Ensure: Rule list
1:
2:
is the subpattern with
Sub-planes
3:
Intervals
4:while
do
5:
6:
7: Replace
and
with
8: Replace
and
with
9: Append
to
10:end while 11: 12:Append to 13:return
10:end while
11:
12:Append
to
13:return
sion algorithm. We first describe how we reduce our -dimen- To merge two adjacent patterns
sion algorithm. We first describe how we reduce our
-dimen- To merge two adjacent patterns
and
, we need a
sional pattern into a collection of
-dimensional patterns.
resolver which we formally define as follows.
Given a
-dimensional pattern
with a
-dimensional canvas
De fi nition 1: A resolver of
and
returns a partial
, define the
-dimen-
rule list
defined on
and pattern
defined on
subject
sional canvas
. We divide
to the following constraints:
,
equals either
into
copies of canvas
with
being the
-dimen-
or
and
.
sional plane with
for
. We then create
is the intersection
We observe that if
will match
is the empty list. While a resolver is formally
then
patterns
for
where
both lists and
of plane
with the original pattern
We define pattern
as follows: for any point
.
the function that computes the partial rule list
on the
-dimensional canvas
pattern
and say that
, we also often refer to partial rule list
and the merged
as a resolver
,
. In Fig. 4(a), we see a 3-D pattern
resolves
and
.
reduced into four 2-D patterns
We create the resulting rule list
through
.
for
from our sequence
of
patterns
by “merging” together
adjacent patterns until we are left with a single pattern and then
Next, we review the Firewall Compressor and Suri algorithms
which are used as sub-programs by our resolvers. We then for-
mally present several different resolvers to merge adjacent pat-
terns and schedulers to determine the merging order for patterns.
recursively solving that single pattern. More formally, let denote some merging of patterns to inclusive.
recursively solving that single pattern. More formally, let
denote some merging of patterns
to
inclusive. In Fig. 4(b),
we see
representing the merger of patterns
and
from
Fig. 4(a). For any point
,
for some
. For any
, we define pattern
on
by
setting
if
; otherwise,
is undefined.
Initially,
. Initialize
to be the

empty rule list. While we have at least two patterns remaining in

our sequence of patterns, we select two adjacent patterns

and

single pattern by providing extra information in a partial rule

list

points where

terns to be merged. Let th merger of patterns where

single pattern nation of partial rule lists

to be merged. We merge these two patterns into apatterns where single pattern nation of partial rule lists defined on which is appended to .

rule lists to be merged. We merge these two patterns into a defined on which is
rule lists to be merged. We merge these two patterns into a defined on which is

defined on

which is appended to . Intuitively, and remaining, the rule list for -dimensional pattern for
which is appended to
. Intuitively,
and
remaining, the rule list
for
-dimensional pattern
for our final solution. If
list for -dimensional pattern for our final solution. If sively solve the then appended to A.

sively solve the then appended to

A. Firewall Compressor

We review here the 1-D version of Firewall Compressor from

, compute the effective grid . The first key insight

covers the leftmost cell 0 in some op-from , compute the effective grid . The first key insight timal solution. Liu et al.

timal solution. Liu et al. use dynamic programming to deter-

. If no other

is that the last rule

of

[11]. Given some input classifier

is that the last rule of [11]. Given some input classifier and label the cells from

and label the cells from 0 to

Given some input classifier and label the cells from 0 to mine which other cells are

mine which other cells are also covered by rule

cells are covered, then the optimal solution consists of rule

which only covers cell 0 plus an optimal solution that covers re-

be the nearest cell that is also

The optimal solution is then one that covers re-

where the

covered by

gion

is then one that covers re- where the covered by gion . . Otherwise, let colors
is then one that covers re- where the covered by gion . . Otherwise, let colors
. .
.
.

Otherwise, let

covers re- where the covered by gion . . Otherwise, let colors gion and one that

colors gion

and one that covers region

. . Otherwise, let colors gion and one that covers region differ; this allows the two

differ; this allows the two pat- be the partial rule list used in the

. When we have a

is the concate-

. We then recur- which is

which means

last rule covers cell (and cell 0).

which is which means last rule covers cell (and cell 0). B. Suri's Pre fi x

B. Suri's Pre x ACL Solver

We review the 1-D version of the prefix ACL solver from [16]. Their method works by dividing the range into regions des- ignated by prefixes. Each region has a designated background color. They then proceed to merge regions together, sharing

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DALY et al.: DIFFERENCE RESOLUTION APPROACH TO COMPRESSING ACCESS CONTROL LISTS

5

background colors between neighboring regions at reduced cost when possible.

into the largestbetween neighboring regions at reduced cost when possible. prefix segments that are monochromatic. For each

prefix segments that are monochromatic. For each prefix-color

pair, compute

applied to the given prefix. For the base case of one of the

is the background color

using , we build the solution . We add the rules from the appropriate solution
using
, we build the solution
. We add the rules from the
appropriate solution
or
to
to form
, and we add the other pattern for columns
to
to
to form
.
b) Optimal In-Plane Range-Resolver: This version as-

sumes that all of the rules for

have to come fromThis version as- sumes that all of the rules for or . . More formally, they

or . .
or
.
.

More formally, they begin by dividing

, whereto come from or . . More formally, they begin by dividing monochromatic prefixes, then if

from or . . More formally, they begin by dividing , where monochromatic prefixes, then if
monochromatic prefixes, then if the segment and 1 otherwise. For other prefixes, . The final
monochromatic prefixes, then
if
the segment and 1 otherwise. For other prefixes,
.
The final cost,
(to fill in the

is the color of Without loss of generality, assume the rules come from

is defined byof Without loss of generality, assume the rules come from . Since we know no rules

of generality, assume the rules come from is defined by . Since we know no rules

. Since we know no rules for

rules come from is defined by . Since we know no rules for Then . We

rules come from is defined by . Since we know no rules for Then . We

Then

. We use a dynamic programming

technique similar to the general resolver to create our rule list

can come from , we only

background color). We utilize Suri's algorithm for constructing need to compute the partial solutions in our prefix resolvers.

.
.
Algorithm 2 Optimal In-Row Range Resolver Require: Rows and and interval Ensure: A rule list,
Algorithm 2 Optimal In-Row Range Resolver
Require: Rows
and
and interval
Ensure: A rule list,
marking how
is different from
.
1:
2:
3:for
do
4: if
then
5:
6:
7: for
do
8:
9: If
then
10:
11:
12: end if 13: end for 14: 15:
12: end if
13: end for
14:
15:
16: else 17:
16: else
17:

18: end if

C. Resolvers

We first define two different types of resolvers that constrain the type of scheduler Diplomat can use. De nition 2: An in-plane resolver is a resolver where all

of the rules in

come from the

canvas

merged. A general resolver has no constraints placed upon the rule source. With an in-plane resolver, the resulting merged pattern will be identical to the pattern that is not the source of the rules in With a general resolver, the merged pattern may be a new pat- tern that is different from either of the two input patterns. The

general resolver has the advantage that

corresponding to only one of the two patterns being

merged. Stated another way, all the rules in

correspond to one of the two patterns beingpatterns being merged. Stated another way, all the rules in . can be smaller since we

all the rules in correspond to one of the two patterns being . can be smaller
all the rules in correspond to one of the two patterns being . can be smaller
.
.
the rules in correspond to one of the two patterns being . can be smaller since

can be smaller since we

have more flexibility in resolving differences between the two patterns. The in-plane resolver has the advantage that because the resulting pattern will be one of the input patterns, the sched-

uler can use dynamic programming. We now describe our ef-

forts to develop fast and effective resolvers. We begin by giving optimal general and in-plane resolvers for rows (1-D planes) and then give heuristic higher dimensional resolvers.

1) Optimal One-Dimensional Resolvers: In this section,

and

1) Optimal One-Dimensional Resolvers: In this section, and 19: end for is a row. We use

19: end for end for

is a row. We use and
is a row. We use
and

to correspond to the two patterns

and , respec-
and
, respec-
20: 21:While do 22: If then 23: 24: 25: else 26:
20:
21:While
do
22: If
then
23:
24:
25: else
26:

27: end if 28:end while

that are being merged. We use

and26: 27: end if 28: end while that are being merged. We use to refer to

to refer to the sub-pattern in row

and

merged. We use and to refer to the sub-pattern in row and tively, between columns .

tively, between columns

. All the resolvers we present

use the optimal 1-D versions of Firewall Compressor (FC) in [11] or Suri's algorithm in [16] as a subroutine. Proofs of opti- mality for the resolvers can be found in Section V. a) Optimal General Range-Resolver: Here, we present an

optimal way to merge and Between Columns 0 and . Let 29:return denote the partial
optimal way to merge
and
Between Columns 0 and
. Let
29:return
denote the partial rule list returned by this optimal solution,
let
, and let
denote the resulting pattern that is
returned. If
, then
, , and
c) Optimal General Prefi x-Resolver: Here, we present
as there are no differences in column
. an optimal way to merge
and
over the range of
Otherwise, there is a difference between
and
in position
some prefix
using only prefix rules; the end result is re-
. This means
or
must include at least one rule that covers either
. We prove in Section V-A that we can restrict our
solver
and pattern
. We begin by observing that
and
are both resolvers. Let
denote
attention to resolvers that cover only
or
from column
the smallest prefix (and thus specifies the largest number
back to some column
where
. This is then combined of points) used in an optimal resolver. The restrictions
with an optimal solution up to column
. We consider all
on prefix rules means that either
, , or
possible choices of this split point
in the following description
. If
and set
,
, and
for the case where
or
, then the optimal resolver is either
. Otherwise, it is some combination
. Then
,
of the optimal resolvers for
and
. That is,
and
is the corresponding value of
. Now,

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

6

IEEE/ACM TRANSACTIONS ON NETWORKING

base case for this recurrence relation occurs when is monochromatic meaning create
base case for this recurrence relation occurs when
is monochromatic meaning
create
relation occurs when is monochromatic meaning create have to come from . Then or Our in

have to come fromrelation occurs when is monochromatic meaning create . Then or Our in [2]. This difference in

when is monochromatic meaning create have to come from . Then or Our in [2]. This
when is monochromatic meaning create have to come from . Then or Our in [2]. This

. Then

or

Our

in [2]. This difference in complexity carries over to merging

or patterns.

is one. We

We propose two different in-plane methods for merging

We We propose two different in-plane methods for merging . We use in-plane in the same

. We use in-plane

in the same way as for the general range-resolver.

in-plane in the same way as for the general range-resolver. -dimensional hyperplanes when methods so that

-dimensional hyperplanes whenin-plane in the same way as for the general range-resolver. methods so that we can use

methods so that we can use a dynamic programming scheduler. We can easily extend both methods to general resolvers if desired. Without loss of generality, we assume that we wish

d) Optimal In-Plane Pre x-Resolver: This version as-

sumes that all of the rules in

and that the ranges are restricted to being prefixes. Without loss

of generality, assume that the rules come from

is defined by

. The algorithm is otherwise identical to the

to mergeis defined by . The algorithm is otherwise identical to the and . leaving pattern .

by . The algorithm is otherwise identical to the to merge and . leaving pattern .

and

.
.

leaving pattern

. Thus, all rules in theis otherwise identical to the to merge and . leaving pattern resolver will be within general

resolver will be within

general prefix-resolver except that we cannot use part of the solution. as The first method
general prefix-resolver except that we cannot use
part of the solution.
as
The first method is to enumerate all of the differences, that is,
we use the FDD comparison algorithm in [9] to find the differ-
ences between the two patterns. We then define
by gener-
Algorithm 3 Optimal In-Row Prefix Resolver
ating rules in
to cover all the places where
differs from
Require: Rows
and
of width
and interval
. We may cover additional locations if this reduces the total
Ensure: A rule list
marking how
is different from
.
number of rules required to cover all the differences. This re-
solver implicitly uses the correct type of rule (range, prefix or
mixed) given the characteristics of the fields.
In the second method, we find
, the minimum bounding
return
Algorithm 4 Prefix Resolve Cost
box surrounding all differences. If any of the fields are prefix
fields, we expand the bounding box in that dimension so that
the bounding box can be defined by prefixes. We then solve
using the appropriate version of Diplomat (range, Require: Rows and and prefix prefix or mixed)
using the appropriate version of Diplomat (range,
Require: Rows
and
and prefix
prefix or mixed) and call the resulting solution
. Alter-
Ensure: Cost
nately, we can solve
and then clip the results against the
if
is monochromatic then
if
then
return 0
else
return 1
end if
else
bounding box, removing any rules that fall entirely outside
the box. This solution is faster because the subsolution can
be computed once and shared between merges; however, this
may produce less compression because we do not optimize the
compression for the bounding box. We use this variant for our
experimental results in Section VI
To achieve greater compression, we run both methods and
use the smaller of the two resolvers. That is, our final resolver
.
if
then
return
D. Schedulers
else
return
end if
end if
To complete the Diplomat algorithm, we need a scheduler to
determine an order for merging patterns. We present two such
schedulers. The first uses a greedy scheme and works with any
resolver while the second uses dynamic programming to get
better overall results but requires in-plane resolvers. We can
Algorithm 5 Prefix Resolve Builder
Require: Row
Require: Interval
Require: Cost function
Require: Prefix
prove theoretical bounds for the greedy scheduler, but the dy-
namic programming scheduler outperforms the greedy sched-
uler in our experiments.
1) Greedy Range Scheduler: For the greedy scheduler, we
compute the cost of the original
possible merges, pick
Ensure: Rule list
the minimum cost merge, and repeat until only one pattern re-
if
then
return
mains. Note that after performing a given merge, we must re-
compute the cost of the one or two merges that involve the
else if
then
new merged pattern. Overall, we perform at most
calls
return
to
. For a
-dimensional classifier, we can consider
end if
all possible permutations of the
fields when performing the
2) Higher Order Resolvers: If
, we need to merge two
greedy scheduling.
Thinking more generally, there is no reason to perform
patterns
and
that are
-dimensional hyperplanes. merges in only one dimension at a time. Instead, we can

This is more complicated than merging two rows. In particular, we have an optimal algorithm for compressing 1-dimensional patterns, but compressing 2-D patterns is NP-hard as shown

greedily choose the best merge in any of the

at any point in time. In particular, for

cheapest pair of rows or columns to merge. In Section V, we

, we find the

dimensions

point in time. In particular, for cheapest pair of rows or columns to merge. In Section

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DALY et al.: DIFFERENCE RESOLUTION APPROACH TO COMPRESSING ACCESS CONTROL LISTS

7

show how to use this rotational greedy schedule to create an solution to these

subproblems and the correspondingrotational greedy schedule to create an solution to these -approximation for 2-D patterns. optimal schedule of

an solution to these subproblems and the corresponding -approximation for 2-D patterns. optimal schedule of merges.

-approximation for 2-D patterns.

optimal schedule of merges.

Algorithm 6 Scheduler
Algorithm 6 Scheduler
Require: Integers and such that Require: matrix Ensure: Stack of triples , Ensure: The last
Require: Integers
and
such that
Require:
matrix
Ensure: Stack of triples
,
Ensure: The last row,
1:if
then
2:

3:end if

4: 5: 6: 7:for all do 8: for all do 9: 10: if then 11:
4:
5:
6:
7:for all
do
8: for all
do
9:
10: if
then
11:
12:
13:

14: end if

15: end for 16:end for 17:for all do 18: for all do 19: 20: if
15: end for
16:end for
17:for all
do
18: for all
do
19:
20: if
then
21:
22:
23:
Algorithm 7 Schedule
Algorithm 7 Schedule
Require: Integers , and such that Require: matrices Split and Mate Ensure: a triple is
Require: Integers
, and
such that
Require:
matrices Split and Mate
Ensure: a triple
is added to the stack Schedule .
1:
2:
3:
if
then
4: if
then
5: push
6: else
7: push
8: end if
9:end if
onto Schedule
onto Schedule

3) Greedy Pre x Scheduler: For a greedy prefix scheduler, we take the greedy range scheduler described above and make two changes to it. First, when considering the cost of rules in

, the rules are weighted by the number of prefixes it takes to describe the
, the rules are weighted by the number of prefixes it takes to
describe the range
. This acts as a conversion factor between

the effective grid space and the prefix space. Second, we use a weighted prefix resolver instead of a range resolver. 4) Dynamic Programming Pre x Sch eduler: We also present

a dynamic programming method for scheduling that works with prefixes. We begin by precomputing the merge costs as for the range scheduler above. We then define the recursive sub-

, we find the minimum

problem. Given a prefix

be

cost for merging the two halves,

the remaining pattern. We use standard dynamic programming techniques to compute the minimum cost solution to each of

subproblems and the corresponding schedule of

to each of subproblems and the corresponding schedule of and row and while having the merges.

and row

and
and

while having

and the corresponding schedule of and row and while having the merges. 24: end if 25:
the merges.
the
merges.
24: end if 25: end for 26:end for 27: 28: 29: 30: . 31:return
24: end if
25: end for
26:end for
27:
28:
29:
30:
.
31:return

Algorithm 8 Prefix-Scheduler

Require: matrix MergeCost Ensure: Stack of triples , Schedule Ensure: The last row, 1: ,
Require:
matrix MergeCost
Ensure: Stack of triples
, Schedule
Ensure: The last row,
1:
,
2:for all
do
3:

4:end for

5: 6: 7:
5:
6:
7:

2) Dynamic Programming Range Scheduler: We also present a dynamic programming method that re- quires more initial work and, as a result, outperforms the greedy scheduler in our experiments. Using an in-plane range resolver such as the one in Section IV-D2b, let

where the returned pattern is . This requires
where the returned pattern is
. This requires

5) Mixed Schedulers: We can generalize our greedy and dy-

namic programming range and prefix schedulers into a greedy calls to and dynamic programming mixed scheduler as follows. The key

is a range field, then it acts like a

is a prefix field, then it acts like a prefixas follows. The key is a range field, then it acts like a We then define

We then define the following subproblem. Given interval scheduler. To merge the resulting hyperplanes defined by the

, we find the minimum cost remaining fields, it calls an appropriate resolver. If all of the re-

be maining fields are range fields, then it calls a range resolver. If

Resolve . However, because the merged pattern is always one of is the type of field

the original patterns, we do not need to do any more merges.

the original patterns, we do not need to do any more merges. for together while having

for

original patterns, we do not need to do any more merges. for together while having range

together while having

do not need to do any more merges. for together while having range scheduler; if .

range scheduler; if

. Ifmore merges. for together while having range scheduler; if for merging patterns the remaining pattern where

merges. for together while having range scheduler; if . If for merging patterns the remaining pattern

for merging patterns the remaining pattern where

namic programming techniques to compute the minimum cost erwise, it calls a mixed resolver.

. We use standard dy- all of them are prefix fields, then it calls a prefix resolver. Oth-

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

8

Algorithm 9 Prefix-Scheduler-Cost

Require: Prefixes and : Require: matrix MergeCost Ensure: 1:If then 2: 3: 4: else if
Require: Prefixes
and
:
Require:
matrix MergeCost
Ensure:
1:If
then
2:
3:
4: else if
then
5:
6:
7:
8: for all
do
9:
10: end for 11: 12: Costs 13:
10: end for
11:
12: Costs
13:

14:else

15: Right side is analogous 16: end if

A LGORITHMIC A NALYSIS

We now prove some results about our algorithms. We first prove that our 1-D resolvers are optimal. We then prove some approximation results for some 2-D implementations of Diplomat. Finally, we prove that Diplomat with a dynamic programming scheduler should outperform the current state of the art algorithms, Firewall Compressor for range-ACLs and TCAM Razor for prefix-ACLs.

IEEE/ACM TRANSACTIONS ON NETWORKING Proof: We prove this result by contradiction. Let be an optimal
IEEE/ACM TRANSACTIONS ON NETWORKING
Proof: We prove this result by contradiction. Let
be an
optimal resolver with the minimum possible number of rules
that span both rows. If
then
and we are done.
Otherwise, let
be the last rule in
spanning both rows.
We first consider the case where there is no rule
with
and
. In this case, we create a new rule list
from
by replacing
with
. Any point in
covered by
is also covered by
, so
still resolves all of the columns.
For any point
, either some rule
with
defines
for both
and
or no rule in
does. Thus, there can be no
color clash and
resolver with only
the definition of
is still a resolver. This means
is an optimal
rules that span both rows contradicting
. Thus, this case is not possible.
The case where all
with
for
are in row
can be shown to be impossible using the same analysis. The
reverse case where are all intersections occur in row
can be
handled similarly; the only modification is that we replace
with
to form
.

We now consider the last case where some later rules that in-

tersect

range and prefix cases separately, starting with the range case.

so that both its left and right

end points are not covered by any prior rules. Since we merely

covered by other rules, the resulting rule

list

the above cases in which case we are done. If not, let

set of rules

. We handle thecases in which case we are done. If not, let set of rules are in row

case we are done. If not, let set of rules . We handle the are in

are in row

done. If not, let set of rules . We handle the are in row and others

and others are in row

withof rules . We handle the are in row and others are in row We first

rules . We handle the are in row and others are in row with We first

We first trim all rules

removed the parts of

in row with We first trim all rules removed the parts of is equivalent to .

is equivalent toin row with We first trim all rules removed the parts of . This may reduce

. This may reduce the problem to one of

be the .
be the
.
,
,
in with that intersect and
in
with
that intersect
and

and let

be the subset in row

be the subset in row

intersects

Assume without loss of generality that

is the first rule

in

its right end point must be to the right of

the trimming operation we performed earlier. We then apply a

by replacing all rulesthe trimming operation we performed earlier. We then apply a second trimming operation and update 's

second trimming operation and update

's end point given

second trimming operation and update 's end point given to appear scanning from left to right.

to appear scanning from left to right. Since

Algorithm 10 Prefix-Schedule

Require:

matrix Mates Mates

Require: Prefixes and : Ensure: is added to the stack Schedule . 1:if then 2:
Require: Prefixes
and
:
Ensure:
is added to the stack Schedule .
1:if
then
2:
3:
4: if
then
5:
6:

7: else

8: 9:
8:
9:

10: end if 11:end if

E.

Optimal One-Dimensional Resolvers

10: end if 11: end if E. Optimal One-Dimensional Resolvers with the portion of that is

with the portion of

if E. Optimal One-Dimensional Resolvers with the portion of that is strictly to the right of

that is strictly to the right of

and

.
.

This classifier still resolves

column that was formerly covered by

this case to the prior case where all rules after

are from one row, and the proof is complete for ranges.covered by this case to the prior case where all rules after since still covers any

are from one row, and the proof is complete for ranges. since still covers any .

since

still covers anyfrom one row, and the proof is complete for ranges. since . We have thus reduced

. We have thus reduced

that intersectfor ranges. since still covers any . We have thus reduced and , For prefix rule

since still covers any . We have thus reduced that intersect and , For prefix rule

and

,
,

For prefix rule lists, we observe that for any

either

), or else such rule from

and
and

(a contradiction since

either ), or else such rule from and (a contradiction since in the horizontal field. Let

in the horizontal field. Let

is totally obscured byand (a contradiction since in the horizontal field. Let be the widest . Such rules must

be the widestsince in the horizontal field. Let is totally obscured by . Such rules must also be

. Such rules must alsohorizontal field. Let is totally obscured by be the widest be the last such rules since

be the last such rules since any narrower rules must lie entirely

the same from

within their ranges and thus be obscured. Likewise, either

or that
or
that

. If we remove

thus be obscured. Likewise, either or that . If we remove , we find that of

, we find that

Likewise, either or that . If we remove , we find that of the columns within
Likewise, either or that . If we remove , we find that of the columns within
Likewise, either or that . If we remove , we find that of the columns within

of the columns within this range. Since

last rule in this range within

is revealed by the removal of

is still a resolver, a contradiction to the proof is complete for prefixes as well.last rule in this range within is revealed by the removal of in this field. Assume

to the proof is complete for prefixes as well. in this field. Assume without loss of

in this field. Assume without loss of generality

still resolves all

must have been the

, there can be no other rule that and thus no color clash. Thus

being optimal. Thus

rule that and thus no color clash. Thus being optimal. Thus We next prove that there

We next prove that there are optimal 1-D resolvers in which In this section, we prove that our 1-D resolvers are optimal. no column is covered by rules in both rows.

We first prove that there are optimal 1-D resolvers in which no rule spans both rows.

, Then

such that

Theorem 1: Let there exists a rule list

spans both rows. In particular,

there is an optimal resolver with this property.

be a rule list that resolves

resolver with this property. be a rule list that resolves that also resolves and and and

that also resolves

and and
and
and
be a rule list that resolves that also resolves and and and that no rule Theorem

and that no rule

Theorem 2: Let there exists a rule list and that for any undefined.

Let there exists a rule list and that for any undefined. that resolves be a rule
that resolves
that resolves

be a rule list that resolves

and . Then such that or is
and
. Then
such that
or
is

and

, either

Proof: First, by Theorem 1, we can assume that

and , either Proof: First, by Theorem 1, we can assume that has no be an
and , either Proof: First, by Theorem 1, we can assume that has no be an

has no

be an optimal resolver with

rule that occupies both rows. Let

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DALY et al.: DIFFERENCE RESOLUTION APPROACH TO COMPRESSING ACCESS CONTROL LISTS

9

in one row that overlaps the hor-RESOLUTION APPROACH TO COMPRESSING ACCESS CONTROL LISTS 9 izontal range of any rule in the other

izontal range of any rule in the other row. If

, we are done.

is the

first such rule when scanned left to right. We create a modified

rule list

, removing any that are entirely contained within the

range of

prefix classifiers, any overlapping rule must by entirely within this range (and thus there cannot be any overlapping rules). The

specifies values for

. However, because the modified

valueThe specifies values for . However, because the modified rule contradicting the definition of We now

rule

contradicting the definition of

We now prove the optimality of our general 1-D resolver. Theorem 3: The Optimal Range-Resolver algorithm de- scribed in Section IV-D2a is an optimal 1-D range-resolver.

, the length of the

rows. The base case

. We now con-

is trivial. Now assume this method

rule list still resolves

is not optimal). Foris trivial. Now assume this method rule list still resolves range of that overlap the horizontal

range of

that overlap the horizontal

Otherwise, assume without loss of generality that

the minimum number of rules

without loss of generality that the minimum number of rules by trimming all rules in (which
without loss of generality that the minimum number of rules by trimming all rules in (which
without loss of generality that the minimum number of rules by trimming all rules in (which

without loss of generality that the minimum number of rules by trimming all rules in (which

by trimming all rules in

that the minimum number of rules by trimming all rules in (which cannot happen or else

(which cannot happen or else

and since ,
and
since
,

Fig. 5. Forbidden rectangle.

Theorem 5: The Optimal General Prefix-Resolver algorithm

described in Section IV-D2c is an optimal 1-D general prefix resolver.

is the

is trivial. Now assume

that this method produces an optimal in-plane resolver for all

smaller word sizes. If

is an optimal in-plane resolver, then this method finds an op-

cannot be

included in the optimal resolver for

tions on prefix rules, all

, which is the other solution con-

sidered by our method. By the inductive hypothesis, both of the

subsolutions are optimal and thus so is the overall solution. Theorem 6: The Optimal In-Plane Prefix-Resolver algo- rithm described in Section IV-D2d is an optimal 1-D in-plane prefix-resolver. Proof: This proof is essentially the same as for Theorem 5.

all columns affected in row

the same as for Theorem 5. all columns affected in row does not overlap any rules

does not overlap any rules in

has a smaller

Proof: We induct on

not overlap any rules in has a smaller Proof: We induct on , the word size

, the word size of the rows (

a smaller Proof: We induct on , the word size of the rows ( , and
a smaller Proof: We induct on , the word size of the rows ( , and

, and the result follows.

on , the word size of the rows ( , and the result follows. length of

length of the row). The base case

or
or
and the result follows. length of the row). The base case or . Because of the

. Because of the limita-

are contained in either

or .
or
.
or . Because of the limita- are contained in either or . Proof: We prove this

Proof: We prove this by induction on

in either or . Proof: We prove this by induction on for timal solution. Otherwise, the

for

in either or . Proof: We prove this by induction on for timal solution. Otherwise, the

timal solution. Otherwise, the rule delimited by

on for timal solution. Otherwise, the rule delimited by produces an optimal resolver sider , in

produces an optimal resolver

sider
sider
, in only or is only in that . Thus
,
in only
or
is only in
that
. Thus

to optimal resolvers

rules in both rows. Column

since

by a rule us assume

some

rules in both rows. Column since by a rule us assume some . We first observe
rules in both rows. Column since by a rule us assume some . We first observe

. We first observe that the optimal resolver for

some . We first observe that the optimal resolver for . Next, if Thus, has no

. Next, if

. We first observe that the optimal resolver for . Next, if Thus, has no fewer

Thus,. We first observe that the optimal resolver for . Next, if has no fewer rules

has no fewer rules than the optimal resolver for

If

, then

has no fewer rules than the optimal resolver for If , then and is clearly an

and

is clearly an optimal resolver asno fewer rules than the optimal resolver for If , then and it resolves all differences

it resolves all differences between the two rows in the first

columns. Thus, by our induction hypothesis, our 1-D resolver is an optimal 1-D resolver for this case.

by Theorem 2 we restrict our attention in which no column is covered by

must be covered by some rule

will be covered

. Without loss of generality, let

. By Theorem 2, it follows that for

loss of generality, let . By Theorem 2, it follows that for F. Two-Dimensional Approximation Results

F. Two-Dimensional Approximation Results

In [2], the authors present an iterated strip-rule algorithm

and prove that it is a -approximation al- gorithm on range-ACLs and a -ap- proximation
and prove that it is a
-approximation al-
gorithm on range-ACLs and a
-ap-
proximation algorithm on prefix-ACLs, where
is the number
of rules in the input list,
is the word size in bits, and

is the size of the optimal solution. We simplify their algorithm and generalize it to create a class of approximation algorithms with the same approximation ratio. We then show that some Diplomat variations belong to this class. We first define strip rule patterns [2]. De nition 3: A strip-rule pattern is a 2-D pattern which can be created by a rule list where each rule spans an entire row or column. Applegate et al. prove several properties about strip-rule pat-

3

terns, the most important of which is that any 2

sub-array must have a monochromatic row or column. De nition 4: A forbidden rectangle is a minimal subpattern

containing either a 2

3 subarray with no monochro-

matic rows or columns. An example forbidden rectanble can be seen in Fig. 5. Any pattern that includes a forbidden rectangle is not a strip- rule pattern. We now define the class of strip solver compression

algorithms. De nition 5: A strip solver is a compressor that on an

strip-rule pattern always returns a rule list with at most

rules for some constant

patterns, a strip solver may either return

such a rule list or it may fail. Any compressor that completes on all values and satisfies the bounds on strip rule patterns can

2 or 3values and satisfies the bounds on strip rule patterns can 2 or 3 . On other

and satisfies the bounds on strip rule patterns can 2 or 3 2 or 3 .
and satisfies the bounds on strip rule patterns can 2 or 3 2 or 3 .

2 or 3

the bounds on strip rule patterns can 2 or 3 2 or 3 . On other
the bounds on strip rule patterns can 2 or 3 2 or 3 . On other
.
.

On other

bounds on strip rule patterns can 2 or 3 2 or 3 . On other .

. This means column

must color

can 2 or 3 2 or 3 . On other . This means column must color
does
does

not color

. By our induction

hypothesis, the fact that Firewall Compressor will compute an

optimal coloring of

values of

that finds such a solution.

We also show that our 1-D in-plane resolver is an optimal 1-D in-plane resolver.

Theorem 4: The Optimal In-Plane Range-Resolver algorithm described in Section IV-D2b is an optimal 1-D in-plane range- resolver.

, the length of the rows. The

base case

. We now consider

is clearly an optimal

. Ifrows. The base case . We now consider is clearly an optimal in-plane resolver as it

in-plane resolver as it resolves all differences between the two

will be an optimal resolver

all differences between the two will be an optimal resolver combined with an optimal coloring of

combined with an optimal coloring of

be an optimal resolver combined with an optimal coloring of , and the fact we try

, and the fact we try all possible

an optimal coloring of , and the fact we try all possible , our general resolver

, our general resolver algorithm is an optimal resolver

, our general resolver algorithm is an optimal resolver Proof: We again induct on for ,

Proof: We again induct on

for , then
for
, then

is trivial. Now assume this method produces an

optimal in-plane resolver

rows in the first columns, and . Thus, by our induction hypothesis, our in-plane resolver
rows in the first
columns, and
. Thus, by
our induction hypothesis, our in-plane resolver is an optimal
in-plane resolver for this case.
If
,
then
must color
that
must color
for some
. It follows
. Thus
will be an optimal in-plane resolver
combined with an op-
timal coloring of
. By our induction hypothesis, the fact

that Firewall Compressor will compute an optimal coloring of

, our in-plane

resolver algorithm is an optimal in-plane resolver that finds such

a solution.

algorithm is an optimal in-plane resolver that finds such a solution. , and the fact we

, and the fact we try all possible values of

algorithm is an optimal in-plane resolver that finds such a solution. , and the fact we
algorithm is an optimal in-plane resolver that finds such a solution. , and the fact we

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

10

IEEE/ACM TRANSACTIONS ON NETWORKING

be turned into a strip solver by checking to see if the limit was met
be turned into a strip solver by checking to see if the limit was
met and failing if it does not.
Theorem 7: Range Diplomat with the rotational greedy
We now prove an approximation bound for any iterated strip
solver algorithm.
Theorem 8: For any 2-D pattern
, any iterated strip solver
scheduler is a strip solver compressor with
with constant
is a
-approxima-

Proof: Let this by induction on

columns in

which Diplomat solves optimally with 0 rules. If

then we have a pattern with 1 row and 1 column which means

it is a single cell which Diplomat again solves optimally with 1

rule. If

, then we have a pattern with either 1 row

and 2 columns or 2 rows and 1 column which Diplomat solves optimally using 2 rules.

, then we have the empty pattern

, the sum of the number of rows and

. If
. If

be an arbitrary strip-rule pattern. We prove

,
,

tion algorithm. Proof: For a given value of

blocks where

each vertical section has at least one block. Additional blocks are added each time the strip solver cannot add another row in the current section. For any pair of vertically adjacent blocks, there must be at least one forbidden rectangle or else the two blocks would together be a strip rule pattern. While some of the forbidden rectangles may overlap, if two pairs are disjoint,

then their forbidden rectangles must also be disjoint. Thus, there

are at least

lower bounds on OPT:

, there is a total of

are at least lower bounds on OPT: , there is a total of . The first
are at least lower bounds on OPT: , there is a total of . The first

. The firstare at least lower bounds on OPT: , there is a total of blocks must exist

blocks must exist since

, there is a total of . The first blocks must exist since and assume the

and assume the inductive hypoth-

esis holds for all patterns where the sum of the rows and columns

is strictly smaller than

. Without

there must exist a monochromatic row or column in

loss of generality, assume there exists a monochromatic row

This row can be merged with either of its neighbors at cost 1 by

Now consider

with either of its neighbors at cost 1 by Now consider disjoint forbidden rectangles. This leads
with either of its neighbors at cost 1 by Now consider disjoint forbidden rectangles. This leads

disjoint forbidden rectangles. This leads to threewith either of its neighbors at cost 1 by Now consider , and . , an

,
,

and

.
.

, an upper bound on the cost of the

. Since

to three , and . , an upper bound on the cost of the . Since

is a strip-rule pattern,

bound on the cost of the . Since is a strip-rule pattern, For a given value

For a given value of

iterated strip solver solution is given as follows:of the . Since is a strip-rule pattern, For a given value of coloring with a

coloring with a single rule. Thus, Diplomat with the rota- tional greedy scheduler must incur
coloring
with a single rule. Thus, Diplomat with the rota-
tional greedy scheduler must incur a cost of exactly one when
merging the first two rows or columns. Without loss of gener-
ality, we assume Diplomat merges rows
and
.
We now argue that
, the resulting pattern of merging
and
, is still a strip-rule pattern. Assume otherwise for contradic-
tion. This implies that the row created by the merge is part of a
forbidden rectangle. However, since only a single rule is placed
in one of the rows, the merged row must be identical to the other
row. This implies that the forbidden rectangle already exists in
We choose the value of
that minimizes the above ex-
the original pattern
, a contradiction. Thus,
is a strip rule pat-
pression. If we choose
so that
we get
tern where the sum of the rows and columns is at most
.
leading to
Thus, by the inductive hypothesis, Diplomat with the rotational
an
-approximation. While we cannot know
a
greedy scheduler solves
with at most
rules. Com-
, we see that
priori , by selecting multiple different
we can guarantee that
bining this with the one rule used to merge
and
for some
.
Diplomat with the rotational greedy scheduler colors
with at
if it is better than the classifier gener-
most
rules and the inductive case is complete.
We can choose to use
ated above. This means
which means
We now define the class of iterated strip solver algorithms
using ideas from Applegate et al . Iterated strip solver algorithms
which leads to
.
and the
can be applied to arbitrary 2-D patterns
. The first step is to par-
Further, the two bounds are equal if
result follows.
tition
as follows. Let
be a
pattern for integers
and
Corollary 1: For any 2-D pattern
, range-Diplomat with the
and assume without loss of generality that
; we pad
rotational greedy scheduler is a
-ap-
with empty rows and columns as needed if they are not powers
of 2. For each integer
proximation algorithm.
Proof: The result follows immediately from Theorem 7
into vertical sections of width
, do the following. Partition
. Partition each section into a
and Theorem 8.
sequence of disjoint blocks
, where each block
is

a maximal height subpattern using the full width of the section

in which the strip solver returns a value. Apply a strip solver

, form a solution by taking

the union of solutions for each block defined by

blocks are disjoint, the solutions for each block are completely

independent. The final solution is the best of the lutions for each width Applegate et al. prove the following result.

disjoint forbidden rectangles

in a 2-D pattern

different so-

. Because thedisjoint forbidden rectangles in a 2-D pattern different so- to each block. For each value of

to each block. For each value of

different so- . Because the to each block. For each value of . Lemma 1: Suppose
different so- . Because the to each block. For each value of . Lemma 1: Suppose
.
.

Lemma 1: Suppose there are

each block. For each value of . Lemma 1: Suppose there are . Then . Proof:

. Then

.
.

Proof: Applegate e t al. give a proof for this in [2]. The in- tuition is that the first rule to include any corner of a forbidden rectangle must itself have a corner within that forbidden rec-

tangle.

must itself have a corner within that forbidden rec- tangle. For prefix-ACLs, we present these modifications

For prefix-ACLs, we present these modifications to the it- erated strip solver algorithm. We divide the effective grid into

as we did for the range version. We

then divide these columns into vertical sections so that each sec- tion can be represented by a prefix. We then apply a prefix strip solver to each section. If the strip solver does not return on a sec-

tion, we divide it into blocks, each half the height of the pattern and repeat until it returns on each block. The solution is again

the union of the result on each block.

, any iterated prefix strip

-approximation

solver is a algorithm.

columns created by the initial parti-

prefix-columns to represent.

Proof: The tioning may require

parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any

columns of width

parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any

Theorem 9: For any 2-D pattern

parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any
parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any
parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any
parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any
parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any
parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any
parti- prefix-columns to represent. Proof: The tioning may require columns of width Theorem 9: For any

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DALY et al.: DIFFERENCE RESOLUTION APPROACH TO COMPRESSING ACCESS CONTROL LISTS

11

We also assume that another

some blocks are not solvable by the strip-solver. This gives

blocks. As before, we note a relation betweensome blocks are not solvable by the strip-solver. This gives the number of added blocks and

the number of added blocks and the number of forbidden rect- angles. This time, a single forbidden rectangle can force up to

additional blocks to be added. This is because we divide theThis time, a single forbidden rectangle can force up to block in half and the forbidden

block in half and the forbidden rectangle can remain entirely within one sub-block, but no block can be divided more than

times. Thus, there are at least as follows:

forbidden rectangles, given

there are at least as follows: forbidden rectangles, given blocks are added because G. Comparison With
there are at least as follows: forbidden rectangles, given blocks are added because G. Comparison With

blocks are added because G. Comparison With Firewall Compressor, TCAM Razor and

G. Comparison With Firewall Compressor, TCAM Razor and ACL Compressor We now show that on 2-D

ACL Compressor

We now show that on 2-D patterns, Diplomat with a dynamic programming scheduler produces a smaller rule list than Fire- wall Compressor, TCAM Razor and ACL Compressor.

be the rule

list produced by range-Diplomat with a dynamic programming

scheduler and

pressor. Then for all

Proof: We prove this by induction on the number of rows

pattern takes

be the rule list produced by Firewall Com-

Theorem 11: Given some 2-D pattern

by Firewall Com- Theorem 11: Given some 2-D pattern , we have . As a base
by Firewall Com- Theorem 11: Given some 2-D pattern , we have . As a base

, we have

. As a base case where

,
,

let

some 2-D pattern , we have . As a base case where , let . in
.
.
in
in
,a
,a

the same number of rules for both Diplomat and Firewall Com- pressor as both use an optimal 1-D solver. We now consider

patterns

row pattern.

be the row removed by Firewall Compressor with cost

row pattern. The key obser-

vation is that Diplomat considers removing all rows including

with each

. When

for doing so

is at most

oring all of

. By our in-with each . When for doing so is at most oring all of . Furthermore, because

. Furthermore, because Diplomat uses an in-plane

because one option that Diplomat considers is col-

Diplomat considers removing row

of its neighboring rows using rules that only color

row of its neighboring rows using rules that only color with . Both Diplomat and Firewall

with

row of its neighboring rows using rules that only color with . Both Diplomat and Firewall

. Both Diplomat and Firewall Com-

rules that only color with . Both Diplomat and Firewall Com- to create an pressor eliminate

to create an

color with . Both Diplomat and Firewall Com- to create an pressor eliminate one row from

pressor eliminate one row from

Let and
Let
and
Com- to create an pressor eliminate one row from Let and be the resulting ; more

be the resulting

an pressor eliminate one row from Let and be the resulting ; more precisely, Diplomat considers

; more precisely, Diplomat considers merging

the resulting ; more precisely, Diplomat considers merging , its cost row pattern is also resolver,
the resulting ; more precisely, Diplomat considers merging , its cost row pattern is also resolver,
the resulting ; more precisely, Diplomat considers merging , its cost row pattern is also resolver,

, its cost

row pattern is also

Diplomat considers merging , its cost row pattern is also resolver, the resulting with smaller total

resolver, the resulting

with smaller total . Thus, one choice, its cost row pattern is also resolver, the resulting ductive hypothesis, Diplomat will color cost

ductive hypothesis, Diplomat will color cost than Firewall Compressor will color

for Diplomat with dynamic programming has cost no more than

Firewall Compressor's cost. Since Diplomat chooses the lowest

cost solution, the result follows. Theorem 12: Given some 2-D pattern

list produced by prefix-Diplomat with a dynamic programming

, let be the rule
,
let
be the rule
scheduler and be the rule list produced by TCAM Razor. Then for all , we
scheduler and
be the rule list produced by TCAM Razor. Then
for all
, we have
.

Proof: This proof uses the same basic ideas as the proof

of Theorem 11. Diplomat considers removing rows in the same order that TCAM Razor does, and a merge never costs more

rules than solving a row. This places an upper bound on

be

rules than solving a row. This places an upper bound on be and the result follows.

and the result follows.

Theorem 13: Given some 2-D pattern

,
,

let

result follows. Theorem 13: Given some 2-D pattern , let to list produced by mixed-Diplomat with
to
to

list produced by mixed-Diplomat with a dynamic programming

sheduler and be the rule list produced by ACL Compressor. Then for all , we
sheduler and
be the rule list produced by ACL Compressor.
Then for all
, we have
.
Again, we let and we find that . The result is a -approx- imation. If
Again, we let
and we find that
. The result is a
-approx-
imation. If
then we have
. Again, these are equal in the event that
. This gives us the final approximation bounds of . We now place a bound
. This gives us the final approximation bounds of
.
We now place a bound on
approximation bounds.
for prefix Diplomat to get our
Theorem 10: Given a
fective grid of size
strip rule pattern with an ef-
with
,
and
grid of size strip rule pattern with an ef- with , and rules. . rules. As

rules.grid of size strip rule pattern with an ef- with , and . rules. As a

.
.

rules.

As a base

or
or

, then the single row or column rules per cell. This costs either

the single row or column rules per cell. This costs either Now assume that this is
the single row or column rules per cell. This costs either Now assume that this is

Now assume that this is true for all smaller

costs either Now assume that this is true for all smaller . Since is rules to

. Since

iseither Now assume that this is true for all smaller . Since rules to describe this

Now assume that this is true for all smaller . Since is rules to describe this

rules to describe this

, prefix Diplomat with a rotational greedy schedule andis true for all smaller . Since is rules to describe this optimal in-row resolver requires

optimal in-row resolver requires at most Proof: We prove this by induction on

case, if either

can be filled with at most

or

be the rule

a strip-rule pattern, there must exist a monochromatic row or column in the effective grid. Assume for generality that it is a

row. Diplomat could use at most

row. If it chooses otherwise, it must pick another row which costs less to remove. Since we are using an in-plane resolver, the resulting pattern must only contain rows in the original pattern.

Thus the subgrids of

there can be no forbidden rectangles and pattern. The total cost then is at most

Theorems 11, 12, and 13 do not necessarily hold when we in- clude redundancy removal and other post-processing steps. For example, while Theorem 11 guarantees that Diplomat with a dy- namic programming scheduler will produce a range-ACL with no more rules than Firewall Compressor before performing re-

This is a little smaller than the original bounds of dundancy removal, no such guarantee can be made after per-

presented in [2]. The same ap- forming redundancy removal. In our experiments, we observe proximation bounds can be achieved by using their maximal that Firewall Compressor, TCAM Razor and ACL Compressor

pick-up sticks algorithm as the strip solver for our revised rarely but occasionally do outperform Diplomat with a dynamic

iterated strip solver algorithm.

costs more than solving a row. Thus, this choice for Diplomat is no worse than ACL Compressor and since Diplomat takes the

Proof: This proof is essentially the same as for Theorems 11 and 12. As before, Diplomat considers the same order to remove rows as ACL Compressor does, and merging rows never

remove rows as ACL Compressor does, and merging rows never are a subset of those in

are a subset of those in

. As such, is also a strip rule
. As such,
is also a strip rule

best possible option the result follows.

also a strip rule best possible option the result follows. . Corollary 2: Prefix-Diplomat with the
.
.

Corollary 2: Prefix-Diplomat with the rotational greedy scheduler and optimal in-plane resolver is a

greedy scheduler and optimal in-plane resolver is a -approximation algorithm. Proof: This follows directly from

-approximation algorithm.

Proof: This follows directly from Theorems 9 and 10.

algorithm. Proof: This follows directly from Theorems 9 and 10. programming scheduler due to this phenomenon.
algorithm. Proof: This follows directly from Theorems 9 and 10. programming scheduler due to this phenomenon.

programming scheduler due to this phenomenon.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

12

IEEE/ACM TRANSACTIONS ON NETWORKING

H. Runtime Effi ciency For the dynamic prefix scheduler, we have subprob- lems of the
H. Runtime Effi ciency
For the dynamic prefix scheduler, we have
subprob-
lems of the form
where we merge prefixes
and
and
We compute the worst-case runtime of the Diplomat algo-
rithm with a dynamic programming scheduler and in-plane re-
row
remains. Unlike the range version, there is only one pos-
sible partition point. Again, one half will contain row
.
There
solvers given a
-dimensional classifier with
. We assume
are then
options for the original row remaining in the other
each field has size
.
We analyze the algorithm in three steps:
computing the child merges, finding an optimal schedule, and
solving the final plane.
half. Altogether, this leads to
timal schedule.
time to compute the op-
For
, the overall cost is
which is ap-
There are
possible planes. We compute a solution for
proximately
when
. By induction, the runtime of
each plane as part of the merge process which only needs to be
done once per plane, not per merge. The total runtime will then
prefix Diplomat with
is
in the worst case. This
be at least
times the runtime for a lower-dimensional so-
is slightly worse than TCAM Razor which requires
time in the worst case.
lution. Range and prefix Diplomat will have different solutions,
covered later.

pairs of planes to try merging together.

, the planes are rows and are a special case for both

When

, this requires clipping, the planes are rows and are a special case for both When one of the

one of the sub-solutions against a bounding box. Finding a minimum bounding box requires finding all of the differences

time [9].bounding box requires finding all of the differences Afterwards, it takes time to clip the lower

Afterwards, it takes

time to clip the lower dimensional solution against theall of the differences time [9]. Afterwards, it takes time, so all merges take The runtime

time, so all mergestime to clip the lower dimensional solution against the take The runtime of the scheduler is

take

The runtime of the scheduler is not dependent on the number

of dimensions. Thus, the total runtime is

times the runtime of a lower-dimensional solution. 1) Range Diplomat: We now find the merge cost for range

. For each row, we must make a call1) Range Diplomat: We now find the merge cost for range Diplomat when to Firewall Compressor,

Diplomat when

to Firewall Compressor, which takes

time total) [11]. This will be used as part of the computation for each merge. Each merge requires determining which other subsolution works best for each right end point. Since there are

merges, the totalworks best for each right end point. Since there are box. Altogether, each merge takes time

box. Altogether, each merge takes

time to calculate the boundary andare merges, the total box. Altogether, each merge takes between two planes. This can be done

between two planes. This can be done in

range and prefix Diplomat. When

There are

can be done in range and prefix Diplomat. When There are IV. E XPERIMENTAL R E

IV. EXPERIMENTAL RESULTS

In this section we evaluate the effectiveness of Diplomat on real-life classifiers. Specifically, we assess how much Diplomat improves over Firewall Compressor [11], TCAM Razor [14], and ACL compressor , the current state of the art al- gorithms for range-, prefix-, and mixed-ACL compression. For Diplomat, we use the dynamic programming scheduler (using the solve-then-clip variant) and in-plane resolver appropriate to the problem. We label the different applications RDip for range-ACL compression, PDip for prefix-ACL compression, and MDip for mixed-ACL compression. We do not report results using a greedy scheduler (or consequently the general resolver) since this was determined experimentally to always be worse than the dynamic programming scheduler.

to always be worse than the dynamic programming scheduler. time. plus time (for A. Methodology All

time.

plusbe worse than the dynamic programming scheduler. time. time (for A. Methodology All Diplomat variants, Firewall

be worse than the dynamic programming scheduler. time. plus time (for A. Methodology All Diplomat variants,

time (forbe worse than the dynamic programming scheduler. time. plus A. Methodology All Diplomat variants, Firewall Compressor

A. Methodologythan the dynamic programming scheduler. time. plus time (for All Diplomat variants, Firewall Compressor (FC), Razor,

All Diplomat variants, Firewall Compressor (FC), Razor, and ACL Compressor (AC) are sensitive to the ordering of the five packet fields (source IP address, destination IP address, source port, destination port, and protocol). We run these algorithms

using each of the

different permutations across all of

the fields and use the best of the 120 results for each classifier. We also use redundancy removal [8] as a post-processing step. We now define the metrics for measuring the effectiveness of

choices, merge cost is .
choices,
merge cost is
.

end points, and

effectiveness of choices, merge cost is . end points, and We now derive the time required

We now derive the time required to compute the optimal

schedule. We have subproblems of the form where we merge rows from to and is
schedule. We have
subproblems of the form
where
we merge rows from
to
and
is the original row that

remains. There must have been a final merge that produced

choices for the partition point thatremains. There must have been a final merge that produced . For the choices for the

. For the

choices for the original row that time to compute the optimal

time to

compute the optimal schedule for our original problem. This

schedule for

remains. Thus, it takes

original problem. This schedule for remains. Thus, it takes . There are defines this last merge.

. There areoriginal problem. This schedule for remains. Thus, it takes defines this last merge. One half will

defines this last merge. One half will include row

are defines this last merge. One half will include row other half, we have . Altogether,

other half, we have

.
.

Altogether, this leads to

include row other half, we have . Altogether, this leads to sets the overall runtime to

sets the overall runtime to be

. Altogether, this leads to sets the overall runtime to be when . is For range

when

. is
.
is

For range Diplomat, the total runtime when

. By induction, the runtime for range Diplomat with

. This is slightly worse than Firewall Com-

pressor which requires

, we

make a single call to Suri's algorithm for each row. Each call

is
is
call to Suri's algorithm for each row. Each call is time in the worst case. 2)

time in the worst case.

2) Pre x Diplomat: For prefix Diplomat with

case. 2) Pre fi x Diplomat: For prefix Diplomat with is the number takes subproblems: for

is the number

takes

subproblems: for

prefix

the solutions for subproblems

problem takes a constant amount of time. Thus each merge takes

time ortakes a constant amount of time. Thus each merge takes ? Deciding each sub- or to

? Deciding each sub-

or to join

of colors [17]. Each merge contains

each sub- or to join of colors [17]. Each merge contains time (or total) where and

time (or

total) where and
total) where
and
colors [17]. Each merge contains time (or total) where and , is it better to use

, is it better to use Suri's partial solution for

and , is it better to use Suri's partial solution for time to compute the cost

time to compute the cost of all possible

merges.

our Diplomat variants. Let

denote a classifier. Let

tained by applying compressor

of rules in

the improvement ratio of

and denote compressors and denote the resulting rule list ob- to , and is the
and
denote compressors and
denote the resulting rule list ob-
to
, and
is the number
on
is
, and
over
on
is
.
In our

and mean im-

number on is , and over on is . In our and mean im- . It

. It is desirable to

over on is . In our and mean im- . It is desirable to . The

. The compression ratio of

results, we focus on the mean compression ratio

provement ratio given a set of classifiers

have a low compression ratio and a high improvement ratio. We assess the performance of our algorithms using a set of 40 real-life classifiers which we first preprocess by removing redundant rules. Most of them have five fields (source and des- tination address, source and destination port, and protocol) al- though some only use a subset of these. We divide this set into three smaller sets based on the number of rules after running re- dundancy removal. The small set contains the 17 smallest clas- sifiers, the middle set contains the next 12 larger classifiers, and the large set contains the 11 largest classifiers. The classifiers in the large set all have at least 600 rules after redundancy re- moval, with the largest having more than 7600 rules. For range and prefix tests, we treat all of the fields as ranges or prefixes,

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

DALY et al.: DIFFERENCE RESOLUTION APPROACH TO COMPRESSING ACCESS CONTROL LISTS

13

TABLE I

SIZE OF LARGEST DIMENSION

LISTS 13 TABLE I S I Z E OF L ARGEST D IMENSION Fig. 6. Percentages
LISTS 13 TABLE I S I Z E OF L ARGEST D IMENSION Fig. 6. Percentages

Fig. 6. Percentages of ACLs that Diplomat outperforms previous methods on.

respectfully. For mixed tests, we treat the port fields as ranges and the other fields as prefixes. In general, larger classifiers also have larger effective grids. Not all of the fields strictly increase as the list size goes up. In particular, the protocol field only uses a few distinct values across all classifiers, usually 6 (TCP) and 17 (UDP). As such, the smallest dimension size is at most nine for all classifiers. The range of the largest dimension size for each set can be seen in Table I.

largest dimension size for each set can be seen in Table I. Fig. 7. Mean improvement
Fig. 7. Mean improvement ratios. Fig. 8. Compression ratios on range classifiers. Fig. 9. Compression
Fig. 7.
Mean improvement ratios.
Fig. 8.
Compression ratios on range classifiers.
Fig. 9.
Compression ratios on prefix classifiers.
Fig. 9. Compression ratios on prefix classifiers. Fig. 10. Compression ratios on mixed classifiers. C. Ef

Fig. 10. Compression ratios on mixed classifiers.

C. Ef ciency

B. Compression Results

Our results show that: 1) RDip, PDip, and MDip outperform Firewall Compressor, Razor, and ACL Compressor, respec- tively, and that 2) the performance gap between Diplomat and its competitors increases as the classifiers grow in size and complexity. Diplomat's superior performance manifests itself in two ways. First, as classifiers grow in size and complexity from small to medium to large, the percentage of classifiers where Diplomat outperforms its competitors increases (Fig. 6). For range classifiers it increases from 5.9% for the small classifiers to 58.3% for the medium classifiers to 100% for the large classifiers. Second, again as classifiers grow in size and complexity, Diplomat's average improvement ratio grows (Fig. 7). For range Diplomat, it increases from 0.6% for the small classifiers to 14.4% for the medium classifiers to 34.9% for the large classifiers over Firewall Compressor. It is especially encouraging that Diplomat performs best on the large classifier set as the large classifiers are the ones that most accurately represent the complex classifiers we are likely to encounter in the future and which most need new compres- sion techniques. More specifically, we hypothesize that for the small classifiers, there is relatively little room for optimization and all the methods are finding optimal or near optimal classi- fiers. However, as the classifiers increase in size and complexity, Diplomat is able to find new compression opportunities missed by the earlier methods. More detailed compression ratio results

for RDip, PDip and MDip are shown in Figs. 8–10, respectively. and VB.Net. Tests were performed on a Xeon E5620 processor

We implemented our algorithms using a combination of C#

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

14

is final as presented, with the exception of pagination. 14 Fig. 11. Runtime of each algorithm

Fig. 11. Runtime of each algorithm on the eight largest classifiers.

at 2.4 GHz running Red Hat Enterprise Linux 6.1 and Mono 2.10.8. We report the time for the best permutation for each method. Diplomat does require significantly more time to compress a classifier than both Firewall Compressor and Razor as can be seen in Fig. 11. This is because Diplomat does explore more possibilities when compressing classifiers. RDip is slower than PDip because there are more possible cases to consider when any range can be used as opposed to fixed prefixes. In future work, we hope to find ways to maintain Diplomat's compression advantage while reducing its running time.

V. C ONCLUSION

In this paper, we presented Diplomat, an algorithm for compressing ACLs. We first presented the general framework and then presented concrete algorithms for implementing this framework. In particular, we presented a greedy scheduler that has guaranteed approximation bounds and a dynamic programming scheduler that performs well in practice. We implemented Diplomat and conducted side-by-side comparison with the prior Firewall Compressor, Razor, and ACL Com- pressor algorithms on real life classifiers. The experimental results show that Diplomat outperforms them most of the time, often by a considerable margin. In particular, on our largest ACLs, Diplomat has an average improvement ratio of 34.9% over Firewall Compressor on range-ACLs, 14.1% over Razor on prefix-ACLs, and 8.9% over ACL Compressor on mixed-ACLs. Furthermore, we generalized the Iterated Strip Rule algorithm in [2] and used it to create a larger class of algorithms that achieve the same approximation bounds. We showed that Diplomat with a greedy scheduler is a member of this class.

ACKNOWLEDGMENT

The authors would like to thank the anonymous referees for their constructive comments.

REFERENCES

[1] Topcoder statistics [Online]. Available: http://community.top- coder.com/tc?module=Static&d1=match editorials&d2=srm150

[2] D. A. Applegate, G. Calinescu, D. S. Johnson, H. Karloff, K. Ligett, and J. Wang, “Compressing rectilinear pictures and minimizing access control lists,” in Proc. ACM-SIAM Symp. Discrete Algorithms , Jan.

2007.

[3] P. Berman and B. DasGupta, “Complexities of efficient solutions of rectilinear polygon cover problems,” Algorithmica , vol. 17, pp. 331–356, 1997. [4] R. Draves, C. King, S. Venkatachary, and B. Zill, “Constructing op- timal IP routing tables,” in Proc. IEEE INFOCOM , 1999, pp. 88–97.

IEEE/ACM TRANSACTIONS ON NETWORKING

[5] K Kogan, S. Nikolenko, W. Culhane, P. Eugster, and E. Ruan, “To- wards efficient implementation of packet classifiers in sdn/openflow,” in Proc. 2nd ACM SIGCOMM Workshop on Hot Topics in Software De ned Networking , New York, NY, USA, 2013, pp. 153–154. [6] V. S. Anil Kumar and H. Ramesh, “Covering rectilinear polygons with axis-parallel rectangles,” in Proc. ACM Symp. Theory of Computing , 1999, pp. 445–454. [7] A. X. Liu and M. G. Gouda, “Diverse firewall design,” in Proc. Conf. Dependable Syst. and Networks, Jun. 2004, pp. 595–604. [8] A. X. Liu and M. G. Gouda, “Complete redundancy detection in fire- walls,” in Proc. 19th Annu. IFIP Conf. Data and Applicant. Security, LNCS 3654, Aug. 2005, pp. 196–209. [9] A. X. Liu and M. G. Gouda, “Diverse firewall design,” IEEE Trans. Parallel Distrib. Syst., vol. 19, pp. 1237–1251, 2008. [10] A. X. Liu, C. R. Meiners, and E. Torng, “TCAM Razor: A system- atic approach towards minimizing packet classifiers in TCAMs,” IEEE/ACM Trans. Netw. , vol. 18, no. 2, pp. 490–500, Feb. 2010. [11] A. X. Liu, E. Torng, and C. Meiners, “Firewall compressor: An algo- rithm for minimizing firewall policies,” in Proc. 27th Annu. IEEE IN- FOCOM , Phoenix, AZ, USA, Apr. 2008. [12] A. X. Liu, E. Torng, and C. R. Meiners, “Compressing network access control lists,” IEEE Trans. Parallel Distrib. Syst. , vol. 22, no. 12, pp. 1969–1977, Dec. 2011. [13] W. J. Masek, Some NP-complete set covering problems 1978. [14] Feb. 7, 2005 [Online]. Available: http://razor.sourceforge.net/ [15] Y.-W. Eric Sung, X. Sun, S. G. Rao, G. G. Xie, and D. A. Maltz, “To- wards systematic design of enterprise networks,” IEEE/ACM Trans. Netw., vol. 19, no. 3, pp. 695–708, Jun. 2011. [16] S. Suri, T. Sandholm, and P. Warkhede, “Compressing two-dimen- sional routing tables,” Algorithmica , vol. 35, pp. 287–300, 2003. [17] S. Suri, T. Sandholm, and P. R. Warkhede, “Compressing two-dimen- sional routing tables,” Algorithmica, vol. 35, no. 4, pp. 287–300, 2003. [18] M. Yu, J. Rexford, M. J. Freedman, and J. Wang, “Scalable flow-based networking with DIFANE,” in Proc. SIGCOMM , 2010.

networking with DIFANE,” in Proc. SIGCOMM , 2010. James Daly received the B.S. degree in engineering

James Daly received the B.S. degree in engineering and computer science from Hope College in 2008. He is currently working toward the Ph.D. degree in com- puter science at Michigan State University, Lansing, MI, USA. His research interests are algorithms, networking, and security.

Alex X. Liu received the Ph.D. degree in computer science from the University of Texas at Austin, Austin, TX, USA, in 2006. His research interests focus on networking and security. Dr. Liu was the recipient of the IEEE and IFIP William C. Carter Award in 2004 and a National Science Foundation CAREER Award in 2009. He received the Withrow Distinguished Scholar Award in 2011 at Michigan State University. He is an as- sociate editor of the IEEE/ACM TRANSACTIONS ON NETWORKING. He received Best Paper Awards from ICNP-2012, SRDS-2012, and LISA-2010. He is the TPC Co-Chair of ICNP 2014.

and LISA-2010. He is the TPC Co-Chair of ICNP 2014. Eric Torng received the Ph.D. degree
and LISA-2010. He is the TPC Co-Chair of ICNP 2014. Eric Torng received the Ph.D. degree

Eric Torng received the Ph.D. degree in computer science from Stanford University, Stanford, CA, USA, in 1994. He is currently an Associate Professor and Grad- uate Director with the Department of Computer Science and Engineering, Michigan State University, Lansing, MI, USA. His research interests include algorithms, scheduling, and networking. Prof. was the recipient of a National Science Foun- dation CAREER Award in 1997.