Changeset 2749

Show
Ignore:
Timestamp:
10/30/09 00:23:38 (4 weeks ago)
Author:
prabal
Message:

Checkpoint

Files:
1 modified

Legend:

Unmodified
Added
Removed
  • docs/Lowthane/ipsn10/related.tex

    r2707 r2749  
    1 \section{Background}\label{sec:related} 
    2  
    3 In this section, we examine existing solutions in the space, highlighting 
    4 elements which serve as building blocks for our 
    5 design and their shortcomings.  Creating a distinct control plane 
    6 for routing is not a novel idea, and so we examine related works in  
    7 internetworking.  Such work served as inspiration for our design, yet we also 
    8 discuss barriers that prevent such works from being naturally portable to L2Ns. 
    9  
    10 \subsection{Existing Low-Power Protocols} 
    11  
    12 Routing has been an integral part of L2Ns since their emergence, as 
    13 data and commands had to be relayed between nodes.  Initially, as many of the 
    14 deployments focused on various types of monitoring, the 
    15 predominant communication paradigm was {\emph{collection-Oriented}}, or 
    16 {\emph{many-to-one}} routing.  By association 
    17 {\emph{dissemination-Oriented}}, or {\emph{one-to-many}}, routing also 
    18 emerged because of the need to send commands to the nodes ({\it e.g.} for 
    19 configuration). 
    20 %   We examine the progression and state-of-the art for these 
    21 % constrained L2N routing protocols. 
    22  
    23 % In addition, as the set of potential application domains expanded, the need 
    24 % for richer and fuller routing protocols became clear, {\it i.e.} those that 
    25 % support unicast and multicast communication paradigms;  we 
    26 % examine these in this section as well. 
    27  
    28 \subsubsection{Collection-Oriented Protocols} 
    29  
    30 Most collection protocols are tree- or DAG-based. 
    31 MintRoute~\cite{mintroute} was one of the initial routing protocols for L2Ns. 
    32 Starting with the gateway, or base station node, beacon messages  
    33  announce the cost to reach the gateway in terms of hops, as 
    34 well as ETX, or expected transmissions. %%  ETX is calculated using the inverse 
    35 %% of packet reception ratio (PRR) over a link, which was roughly approximated 
    36 %% using an exponentially weighted moving average of received signal strength 
    37 %% indicators (RSSI) for each packet.  
    38 %% In subsequent versions for a popularal 
    39 %% platform, link cost estimation was done by using a chip link quality indicator 
    40 %% (LQI), which is provided for every received packet. 
    41 While MintRoute was 
    42 successful, it's main shortcomings stemmed from the limited sophistication of 
    43 its link estimation and loss response techniques, which reduced efficiency in 
    44 difficult RF environments. 
    45  
    46 The successor to MintRoute was CTP~\cite{ctp}.  CTP developed a more accurate 
    47 link estimator, in which control and data traffic were used to inform link 
    48 estimates, although two different link estimators were kept.  %% Furthermore, 
    49 %% CTP allowed multiple destinations to serve as data sinks, in which case 
    50 %% multiple trees were formed with each destination serving as a tree root. 
    51 Multiple potential next-hop parents are maintained, although only 
    52 one was used at any given time.  In one test, CTP displayed 97\% 
    53 overall reliability in a difficult RF environment, as opposed to the 70\% 
    54 reliability obtained by MintRoute.   
    55  
    56 CentRoute~\cite{centroute} provides centralized tree routing, allowing for 
    57 dissemination of tasklets and collection of information in the 
    58 Tenet~\cite{tenet} architecture.  A single node serves as the sink, and other 
    59 nodes in the network send join request messages which are forwarded to the 
    60 sink. 
    61 %%   If the node is a direct 
    62 %% neighbor of the sink, the sink replies with an accept message, and the node 
    63 %% joins the tree by selecting the sink as its next hop.  Otherwise, if a node 
    64 %% that is part of the tree overhears a join message, it forwards this message to 
    65 %% the sink.   
    66 %% When the sink receives a join message (originating from a 
    67 %% non-neighbor node), it waits for the same join message to arrive along 
    68 %% multiple paths and selects the best path.  The sink then source routes the 
    69 %% accept message to the originating node along this path, and the node 
    70 %% appropriately sets its next hop.  
    71 Once created, the tree is frozen unless 
    72 some threshold of failures occur, after which a node disengages and begins the 
    73 join process anew.  This approach is limited to single destination routing, 
    74 and also affords no flexibility for dynamic topology conditions unless links 
    75 completely break. 
    76  
    77 Koala~\cite{koala} is designed to provide mechanisms to enable efficient data 
    78 retrieval from in-network nodes.  It presents centralized mechanisms similar 
    79 to those used by {\lowthane}, in which nodes explore their neighborhood, and 
    80 provide this information to a central controller, which subsequently installs 
    81 an appropriate route in the network.  However, similar to CentRoute, these 
    82 mechanisms are only targeted towards collecting data to a root.  Further, it 
    83 does not position these mechanisms within a broader architecture or 
    84 investigate their performance under different conditions; the focus of the 
    85 work is a MAC-layer technique called Low Power Probing. %% , which enables more 
    86 %% effective duty-cycling. 
    87  
    88 TSMP~\cite{tsmp}, incorporated into the WirelessHART standard\cite{whart}, is a 
    89 well-established industry approach that includes centralize scheduling of 
    90 channel resources to achieve predictable latencies.  It operates at the link-layer, using channel 
    91 hopping, and does not provide any routing capabilities specified in publicly 
    92 available documentation.  Its tight scheduling of media accesses differs 
    93 significantly from the functionality of our {\controller}s, which contain only 
    94 soft state and are relatively asynchronous. 
    95  
    96  
    97 %Koala \cite{koala} presents a design which contains mechanisms 
    98 %similar to those used by {\lowthane} such as collecting topology and installing 
    99 %routes within the network.  However, it does not position these mechanisms 
    100 %within a broader architecture or investigate their performance under different 
    101 %conditions; the focus of the work is a MAC-layer technique called Low Power 
    102 %Probing. 
    103  
    104 %% One of the shifts in ROLL has been toward an IPv6 based architecture, centered 
    105 %% around 6LoWPAN~\cite{6lowpan}, the low-powered adaptation designed for L2Ns. 
    106 Hui presented a complete IPv6 network architecture for L2Ns, which includes a 
    107 low power link along with network and transport layers adapted to the unique 
    108 characteristics of these networks \cite{hui}.  This work adapted collection to 
    109 the IP architecture by conceptualizing it as choosing an IP default route.  It 
    110 also provided unicast capabilities by allowing a controller to communicate 
    111 with individual nodes by source routing the packet, forming the basis for 
    112 triangle routing of intra-network traffic.   
    113 % However, such an approach also incurs unnecessarily high stretch, and 
    114 % naturally forms bottleneck links around the controller. 
    115  
    116 % However, the implementation presented relies on routing intra-network 
    117 % traffic through a collection point to implement unicast traffic, leading to 
    118 % unnecessarily high stretch. 
    119 %%   Of particular interest to our work was 
    120 %% Hui's contribution to reliable multi-parent tree-based routing, and confidence 
    121 %% driven link estimation.   
    122 % This basic design represents the leading edge of collection research, and formed 
    123 % {\lowthane}'s basic template; we extend it with optimizations for 
    124 % point-to-point traffic. 
    125  
    126 From these myriad examples of collection protocols, we drew the elements of a 
    127 reliable control plane.  Dynamic and accurate link estimation is important to 
    128 deal with the lossy nature of the medium, and data-path validation is helpful 
    129 to prevent harmful routing loops.  Finally, nodes should exploit spatial 
    130 redundancy by keeping a set of potential next hops rather than a single choice. 
    131  
    132 %% the  foundation and inspiration 
    133 %% for many of our algorithms and has been invaluable in helping us to frame our 
    134 %% architecture. 
    135  
    136 %% A final approach worth considering are receiver-oriented protocols such as 
    137 %% ExOR.  This family of protocol forwards data by broadcasting packets and 
    138 %% relying on receivers closer to a sink forwarding it.  They must contain a 
    139 %% resolution protcol to suppress duplicates.  The inherent difficult is  
    140  
    141 \subsubsection{Other Routing Protocols} 
    142 \label{ssec:existing-other} 
    143  
    144 BVR~\cite{ref:bvr} is a geographical location-based routing protocol for 
    145 L2Ns.  A set of landmark beacons are selected whose identity is known by all 
    146 nodes in the network.  Every node is given a virtual coordinate, which is the 
    147 vector of distances to all beacons in the network.  When a node wants 
    148 to send a packet to a destination node, it greedily forwards the packet along 
    149 routes that move it closer to the destination.   
    150 %% If it is unable to do this (i.e. gets stuck at some point), it forwards the packet to the beacon closest to the destination.  In the case that this beacon is not able to use greedy routing to forward to the destination, scoped flooding is used to find the destination.  As demonstrated in other papers~\cite{s4-nsdi},  
    151 BVR suffers from poor transmission and routing stretch, particularly as the 
    152 network grows in size \cite{s4-nsdi}.  %% Variations to improve performance trade 
    153 %% off resources by maintaining the entire 2-hop neighbor table in memory. 
    154 Another difficulty is that in such a network, the identity of the beacons must 
    155 be known {\it a priori}, and can not be modified dynamically without a location 
    156 service \cite{bvr-location}. 
    157 %% .  A BVR location 
    158 %% service~\cite{bvr-location} was designed, but struggled because it still 
    159 %% necessitated significant network overhead and state. 
    160  
    161 S4~\cite{s4-nsdi}, is designed to provide small stretch and state in order to 
    162 enable scalable routing.  It utilizes a modified compact routing algorithm 
    163 tailored for L2Ns. %%  A subset of nodes in the network are designated as beacons, 
    164 %% with $\sqrt{N}$ being the optimal number.  A node $n$'s local neighborhood 
    165 %% cluster is defined as the set of nodes whose distance to $n$ is less than or 
    166 %% equal to the distance to their closest beacon.  Each node maintains routes to 
    167 %% all other nodes in their cluster, and all beacons in the network.  A node can 
    168 %% route to nodes outside its cluster by routing to the destination node's 
    169 %% closest beacon.   
    170 %% However, this feature requires a directory service that the 
    171 %% node can contact to obtain this information, and furthermore a beacon 
    172 %% selection mechanism is needed, both for initial beacon selection, as well as 
    173 %% under failure conditions, and the number of beacons can not be easily modified 
    174 %% during runtime.   
    175 S4 has a theoretical worst case routing stretch of $3$, 
    176 although published results indicate an average routing stretch of $1.2$ while maintaining 
    177 $\mathit{O}(\sqrt{N})$ state.  Like BVR, deploying S4 would require also 
    178 designing a distributed directory service.  Neither of these protocols 
    179 optimize for the common workload where most traffic is destined towards an 
    180 egress point; unlike S4 and BVR's homogeneous network, {\lowthane} starts with 
    181 optimizing collection and emphasizing heterogeneity. 
    182  
    183   %% In section~\ref{sec:tossim}, we compare {\lowthane} to S4, 
    184 %% demonstrating that we achieve greater reliability, with smaller stretch, 
    185 %% state, and control traffic. 
    186  
    187 %Finally Hui's work explores extending the 
    188 %Internet architecture to L2Ns, and develops the algorithms necessary for 
    189 %default route formation in \cite{hui}.  This work has been invaluable in 
    190 %helping us to frame our architecture. 
    191  
    192 \subsection{Ad-Hoc Networking Protocols} 
    193  
    194 Finally, there is also an extensive literature on fully distributed routing 
    195 protocols for ad-hoc wireless networks. % and L2Ns in particular. %% The 
    196 %MANET~\cite{MANET} working group focuses 
    197 %% on such protocols and issues. 
    198 AODV~\cite{aodv} uses flooded RREQ messages to discover paths to destinations 
    199 on demand, with intermediate nodes creating hop-by-hop entries for 
    200 bidirectional flows.  DSR~\cite{dsr} is also an on-demand routing protocol, 
    201 but uses source routing to route packets.  OLSR~\cite{olsr} is a link state 
    202 algorithm that uses multipoint relays to reduce the flood of link-state 
    203 advertisements.  802.11s, a draft mesh routing standard uses a variant of AODV 
    204 called AODV-RM for intra-subnet point-to-point communication, but optimizes 
    205 for an access network's workload by proactively building a routing tree towards 
    206 egress points \cite{80211s}. 
    207  
    208 These and other protocols provide point-to-point routing capabilities in 
    209 wireless networks.  Both on-demand and link-state based solutions exist, but 
    210 the focus is on providing reliable any-to-any delivery in networks with mobile 
    211 nodes that are resource rich.  Consequently, most of these protocols have 
    212 large control traffic and/or state requirements.  A protocol 
    213 survey~\cite{ietf-protocol-survey} recently evaluated the specifications of 
    214 the most widely used ad-hoc protocols against five criteria, and 
    215 noted that in their current form no ad-hoc protocol meets more than 
    216 three of the five criteria.%%   L2N-specific implementations of some of these 
    217 %% protocols exists, such as TinyAODV and TYMO, but have received limited 
    218 %% exposure and acceptance due to their poor (or non-existent) evaluations and 
    219 %% limited feature-sets. 
    220  
    221  
    222 \subsection{Internet Solutions} 
     1\section{Related Work} 
     2\label{sec:related} 
     3 
     4In this section, we review the prior work in three distinct areas: 
     5resource-constrained routing (e.g. low-power, memory-limited), ad hoc 
     6routing, and Internet routing.  This work synthesizes contributions 
     7from these distinct areas to create a new design point that no prior 
     8work occupies. 
     9 
     10\subsection{Resource-Constrained Routing} 
     11 
     12A number of collection protocols have been proposed over the years for 
     13data collection in sensor networks.  Most collection protocols create 
     14a collection tree, or DAG, along which data are sent to a root node. 
     15 
     16MintRoute~\cite{mintroute} was one of the first such routing protocols 
     17and it explored several techniques for selecting neighbors from a set 
     18of candidates that were larger than a node could fit into memory.  The 
     19Collection Tree Protocol (CTP)~\cite{ctp} succeeded MintRoute and used 
     20a more accurate, data-driven link estimator and incorporated next-hop 
     21diversity, achieving better data reliability than MintRoute.  In one 
     22experiment, for example, CTP delivered 97\% of data traffic in a 
     23difficult RF environment while MintRoute delivered only 70\%. 
     24 
     25MintRoute and CTP both use {\em distributed} topology formation but 
     26several protocols have explored {\em centralized} topology formation 
     27as well.  CentRoute~\cite{centroute}, a part of Tenet~\cite{tenet}, 
     28uses a root node to control topology centrally (and disseminate 
     29tasklets to individual nodes).  Once created, the routing topology is 
     30frozen unless dramatic topology changes dictate rebuilding routes. 
     31CentRoute is limited to single-destination routing and does not adapt 
     32to limited topology dynamics. 
     33 
     34Koala~\cite{koala}, another centralized routing protocol, enables data 
     35retrieval from in-network nodes using mechanisms similar to those used 
     36by {\lowthane}.  Both protocols employ a technique in which nodes 
     37explore their neighborhood and forward this information to a central 
     38controller.  The controller installs routes directly onto network 
     39nodes.  However, similar to CentRoute, these mechanisms target 
     40collection routing and do not support general point-to-point routing. 
     41 
     42TSMP~\cite{tsmp} is an established industry approach that includes 
     43centralized scheduling of channel resources to achieve predictable 
     44latencies but it operates at the link-layer, using channel hopping, 
     45and does not provide routing capabilities. 
     46 
     47Hui and Culler presented an IPv6 network architecture that adapted IP 
     48routing to the unique characteristics of sensornet.  This work 
     49conceptualized collection within an IP framework as selecting a 
     50default route~\cite{hui}.  It also provided unicast capabilities by 
     51source routing along the collection reverse path, forming the basis 
     52for triangle routing of intra-network traffic. 
     53 
     54%% From these myriad examples of collection protocols, we drew the 
     55%% elements of a reliable control plane.  Dynamic and accurate link 
     56%% estimation is important to deal with the lossy nature of the medium, 
     57%% and data-path validation is helpful to prevent harmful routing loops. 
     58%% Finally, nodes should exploit spatial redundancy by keeping a set of 
     59%% potential next hops rather than a single choice. 
     60 
     61BVR~\cite{ref:bvr} is a geographical, location-based routing protocol. 
     62A set of landmarks are selected whose identities are known to all 
     63nodes in the network.  Every node is given a virtual coordinate, which 
     64is the vector of distances to the landmarks.  Packets are greedily 
     65forwarded along a gradient.  BVR suffers from poor transmission and 
     66routing stretch, particularly with network size size~\cite{s4-nsdi}, 
     67and it requires a location service to map between node names and their 
     68dynamic addresses~\cite{bvr-location}. 
     69% 
     70S4~\cite{s4-nsdi} is designed to provide small stretch and state in 
     71order to enable scalable routing.  It utilizes a modified compact 
     72routing algorithm tailored for resource-constrained networks.  S4 has 
     73a theoretical worst case routing stretch of $3$, although published 
     74results indicate an average routing stretch of $1.2$ while maintaining 
     75$\mathit{O}(\sqrt{N})$ state.  Like BVR, S4 requires a location 
     76service.  More importantly, neither of these protocols optimize for 
     77the common collection workload where most traffic is destined towards 
     78an egress point.  In constrast, {\lowthane} starts with optimizing 
     79collection and build point-to-point on top of collection. 
     80 
     81\subsection{Ad-hoc Routing} 
     82\label{sec:related-manet} 
     83 
     84Mobile ad hoc networking (MANET) has explored many distributed routing 
     85protocols.  AODV~\cite{aodv} floods route-request messages to discover 
     86paths to destinations on demand, with intermediate nodes creating 
     87hop-by-hop entries for bidirectional flows.  DSR~\cite{dsr} offers 
     88on-demand routes, but employs source routing.  OLSR~\cite{olsr} is a 
     89link state algorithm that uses multipoint relays to reduce the flood 
     90of link-state advertisements.  802.11s, a draft mesh routing standard 
     91uses a variant of AODV called AODV-RM for intra-subnet point-to-point 
     92communication, but optimizes for an access network workload by 
     93proactively building a routing tree towards egress 
     94points~\cite{80211s}.  These and other protocols provide 
     95point-to-point routing capabilities in wireless networks.  Both 
     96on-demand and link-state based solutions exist, but the focus is on 
     97providing reliable any-to-any routing between mobile, resource-rich 
     98nodes.  As a result, these protocols exhibit greater control traffic 
     99or require greater state than is appropriate for our constrained 
     100setting~\cite{ietf-protocol-survey}. 
     101 
     102\subsection{Internet Routing} 
    223103\label{sec:related-centralize} 
    224104 
    225 The concept of centralized routing, or separating the control and data planes, 
    226 is not a novel one; we explore several existing protocols later in this 
    227 section that embrace a centralized paradigm.  However, these solutions 
    228 were designed primarily for the Internet, characterized by 
    229 low churn and high bandwidth.  This is in stark contrast to the 
    230 high-churn, low-bandwidth nature of typical L2N environments. %% Consequently, 
    231 %% we discuss three factors that are critical to the success of centralized 
    232 %% solutions, and while trivial to achieve in general networking environments, 
    233 %% present the main obstacles to a centralized solution. 
    234  
    235 % \subsection{Internet Routing Protocols} 
    236  
    237 %% We discuss two separate bodies of work in this section: Centralized Internet 
    238 %% Routing Architectures / Protocols, and IP-Based Ad-Hoc Networking 
    239 %% Protocols(i.e. MANET~\cite{MANET}). 
    240  
    241 % \subsubsection{Centralized Routing Architectures} 
    242  
    243 The most conceptually similar work is Ethane~\cite{ethane}, %% , the successor to SANE~\cite{sane} and the predecessor to 
    244 %% NOX~\cite{nox},  
    245 a centralized architecture for implementing high-level 
    246 security policies.  Designed for large enterprise networks, the network is 
    247 divided into four tiers: controllers, %  (or multiple ones for fault tolerance), 
    248 switches, end hosts or servers, and users.  %% Whenever a switch is connected to 
    249 %% the network, it registers with the controller, and establishes a secure 
    250 %% backchannel to it.  When an end host joins the network, the switch also 
    251 %% reports this registration, reporting the class of device and attachment port. 
    252 %% Finally, users are able to associate with one or more end host machines. 
    253 %% Consequently, the controller maintains a complete and accurate view of the 
    254 %% topology.  
    255 Network administrators specify high-level access policies, such as 
    256 restricting which servers a certain user may communicate with.  Each switch 
    257 maintains a flow table against which it can classify incoming packets using 
    258 a packet header's 10-tuple.  If no applicable entry exists, the packet is 
    259 forwarded to the controller.  The controller consults its policy 
    260 specification, and installs flow entries along the path. 
    261 %% {\lowthane} utilizes this concept of installing flow entries and forwarding 
    262 %% packets belonging to unclassifiable flows to the controller. 
    263 %Ethane~\cite{ethane} is the work that is the most similar to ours.  Designed 
    264 %for large enterprise networks, the network is divided into three tiers: a 
    265 %controller (or multiple ones for fault tolerance), switches, and end hosts or 
    266 %servers.   
    267 % Whenever a switch is connected to the network, it registers with the 
    268 % controller, and establishes a secure channel to it.  When a server or 
    269 % middlebox joins the network, it also registers with the controller, reporting 
    270 % the connecting switch and port.  Consequently, the controller maintains a 
    271 % complete and accurate view of the topology.  Each switch maintains a flow 
    272 % table in which it can match packet 5-tuples.  If no applicable entry exists, 
    273 % the packet is forwarded to the controller.  The controller consults its policy 
    274 % specification, and installs hop-by-hop 
    275 % flow entries along the path. 
    276  
    277 Key elements visible in this design include forming a channel and ``default 
    278 routing'' to a controller, and maintaining global topology.  At a high level, 
    279 {\lowthane} can be considered an adaptation or instantiation of the Ethane 
    280 architecture.  Bringing Ethane together with low-power wireless 
    281 brings out many of the assumptions underlying Ethane's operation, which must 
    282 be addressed for the system to be practical; we explictly note three of the 
    283 most important assumptions in the next section. 
    284  
    285 % {\lowthane} also 
    286 % addresses these issues, but the details differ significantly when no wired 
    287 % infrastructure is available. 
    288  
    289 %%  are present in the {\lowthane} design.  However, 
    290 %% elements which Ethane takes for granted like {\controller} reachability and 
    291 %% topology inference must be carefully addressed in the more dynamic L2N setting. 
    292  
    293 %% Ethane assumes the availability of a secure and reliable channel to the 
    294 %% controller, and also lends itself to trivial construction of a global topology 
    295 %% view.  Network switches all have ample bandwidth and memory, allowing Ethane 
    296 %% to support 70,000 flows / second at line speeds.  Finally, Ethane operates on 
    297 %% top of layer 2 switching, which performs automatic failure rerouting to direct 
    298 %% a packet to the addressed next hop.  While we discuss Ethane's successor, 
    299 %% NOX~\cite{nox}, further in chapter~\ref{ch:player-centralized}, we note that 
    300 %% architecturally it is the same as Ethane and hence suffers from the same 
    301 %% limitations in the context of ROLL. 
    302  
    303 A number of other centralized internet routing designs exist, which we mention 
    304 briefly for completeness.  Feamster's Routing Control Platform, implemented by 
    305 Caesar, made the case for separating the control aspect of routing from 
    306 routers \cite{feamster-separating, caesar-rcp}.  Greenberg {\it et al.} 
    307 present 4D~\cite{4d,tesseract}, a general framework for separating routing from routers. 
    308 The framework consists of four components: a decision plane, a dissemination 
    309 plane, a discovery plane, and a data plane. 
    310  
    311  
    312   %% Designed for both intra-AS (autonomous 
    313 %% system) and inter-AS routing,  
    314 %% The motivation was that by distributing all 
    315 %% aspects of routing, interactions became increasingly complex, heavy bandwidth 
    316 %% usage resulted from topology changes, and forwarding loops and routing 
    317 %% failures that were incredibly difficult to diagnose continually presented 
    318 %% themselves.  To alleviate such difficulties, RCP serves as the primary 
    319 %% interaction point for all routers within an AS.  %% By participating in the 
    320 %% network, RCP was able to obtain a complete view of the topology.  
    321 %% A network 
    322 %% operator could specify the routing policy at a single place (which typically 
    323 %% was to simply mimic that of a fully-meshed iBGP AS), and then the RCP node 
    324 %% would advertise the correct route to each router in the network. 
    325 %% RCP would 
    326 %% then serve the role of an egress router in communicating routing information 
    327 %% with other AS's using eBGP (allowing for a smooth incremental deployment).  
    328 %%  As 
    329 %% an ideal end-state, all AS's would have an RCP, and these RCP's would 
    330 %% communicate with each other to exchange routing information. 
    331 %% Caesar {\it et al.} presented an actual design and implementation of an 
    332 %% RCP~\cite{caesar-rcp}.  %% Leveraging the existing OSPF and BGP protocols in 
    333 %% place in most existing networks, the implementation had three distinct 
    334 %% components: an IGP Viewer, a BGP Engine, and a Route Control Server (RCS). 
    335 %% The IGP Viewer obtains IGP topology information by building adjacencies with 
    336 %% routers in the network.  The BGP Engine learns BGP routes from the routers, 
    337 %% and also communicates routing assignments to each router.  Finally, the RCS 
    338 %% uses the information obtained by the other two components to compute routing 
    339 %% assignments according to some specified policy, which currently defaults to 
    340 %% full-mesh iBGP configurations.  
    341 %% A key decision was distributing RCS nodes to avoid creating a single point of 
    342 %% failure, but they are not precise replicas and there are only loose 
    343 %% consistency requirements. %, which are detailed in the paper. 
    344 %% While the creation of a distinct control hierarchy certainly permeates our 
    345 %% design, the practicality of RCP for L2Ns is limited.   
    346 %% However, the design focuses on only 
    347 %% the intra-AS aspect, leveraging full underlying IP and TCP connectivity, as 
    348 %% well as OSPF and BGP for information dissemination, which neither exist nor 
    349 %% are practical in L2Ns.  A key part of our design involves designing 
    350 %% mechanisms that provide this underlying connectivity and accurately 
    351 %% constructing and maintaining global topology views. %%  Furthermore, full 
    352 %% consistent connectivity is maintained by all nodes in the network at all times 
    353 %% (often through flooding of Link State Advertisements), resulting in 
    354 %% significant memory and network overhead, as opposed to on-demand schemes that 
    355 %% are often a better fit for ROLL traffic patterns.  The lossy and dynamic 
    356 %% nature of low-power wireless would only exacerbate network congestion in such 
    357 %% an architecture.  Finally, their attempts at reducing overhead relies on 
    358 %% prefix-based IP addressing, conflating node address and geographical location, 
    359 %% while ROLL networks primarily embrace flat-label based addressing schemes. 
    360  
    361 %% decision plane, composed of a single logical decision element (which is 
    362 %% replicated for fault tolerance), serves as the {\emph{network brain}}, serving 
    363 %% as the interface for network-wide policy specification, which is subsequently 
    364 %% communicated to individual routers and switches.  The dissemination plane is 
    365 %% tasked with creating the logical channels between the decision element and 
    366 %% each router and switch, providing a medium for control information flow, 
    367 %% independent of the state of the data plane.  The discovery plane refers to the 
    368 %% process in which each router/switch discovers its own capabilities, its 
    369 %% neighboring routers/switches (including those in other ASes), and communicates 
    370 %% this information across the dissemination plane to the decision element, 
    371 %% allowing for the creation of a global view of the network.  Finally, the data 
    372 %% plane is greatly simplified, responsible only for forwarding packets according 
    373 %% to the forwarding table entries installed by the decision element. 
    374 %% As a framework paper, 4D does not provide any specific instantiations of any 
    375 %% of the components.   
    376 %% Although targeted towards the general Internet, 
    377 %% the same modularization applies in the context of ROLL as well; our routing 
    378 %% architecture for L2Ns can roughly be 
    379 %% decomposed into the same four components.  This componentization is 
    380 %% particularly effective in highlighting the characteristics that differentiate 
    381 %% L2Ns from the general Internet.  The dissemination plane must be built over 
    382 %% a series of lossy unreliable links, in which the {\emph{optimal}} paths are 
    383 %% potentially shifting.  %% Low power wireless also complicates the 
    384 %% discovery plane, as there is a degree of dynamicity, even if the nodes 
    385 %% are physically stationary, and also power duty cycles that reach 0.1\% and 
    386 %% below. 
    387 %% Furthermore, energy constraints prohibit excessive control traffic, which 
    388 %% complicates communication of this discovery information to the decision plane. 
    389 %% Finally, the data plane must provide certain reliability and delivery 
    390 %% guarantees in the face of lossy and moving paths.  Yan {\it et al.} present 
    391 %% Tesseract~\cite{tesseract}, a faithful instantiation of the 4D framework and 
    392 %% concept.  %% Switches use periodic beaconing to determine their 
    393 %% neighboring nodes, as well as their own capabilities, and then transmit this 
    394 %% information to a {\emph{decision element}}, which forms the basis of the 
    395 %% decision plane.  Path explorer messages are flooded through the network to 
    396 %% enable the building of the dissemination plane.  Subsequently, the decision 
    397 %% element sends all control messages to switches using source routes, and in 
    398 %% turn, these switches reverse these source routes for sending information to 
    399 %% the decision element.  The decision plane is implemented as an abstract model, 
    400 %% in which specific algorithms are separated from the underlying link 
    401 %% technologies, to provide an element of portability.  To provide robustness at 
    402 %% the decision plane level, multiple decision elements are maintained at any 
    403 %% given time, with all information being fully replicated across all of them. 
    404 %% However, only one of them is a master (i.e. active) at any given time, while 
    405 %% the rest are only used in the case of a hot standby.  The master is chosen 
    406 %% through a leader-election process and arbitrarily assigned priorities. 
    407 %% Security also plays a large role in Tesseract, with numerous mechanisms to 
    408 %% enable encryption, certification, and authentication. 
    409 %% Tesseract also provides the fully connected best path routing, which 
    410 %% necessitates large amounts of control traffic, and state at the nodes that 
    411 %% scales with the size of the network. %% The 
    412 %% dissemination plane relies on source routing in both directions, stripping the 
    413 %% switches of any local flexibility to deal with unreliable links, and dynamic 
    414 %% topology shifts, as is common in L2N networks.  This results in additional 
    415 %% control traffic for every instance of packet delivery failure, as does the 
    416 %% need for full information at decision elements. 
    417  
    418  
    419 %\begin{itemize} 
    420 %\item{Tempest} 
    421 %\item{FIRE} 
    422 %\end{itemize} 
    423  
    424 % \subsubsection{Stable Backchannel To Centralized Controller} 
    425  
    426 \subsection{Towards a Hybrid Solution} 
    427  
    428 From this disparate related Internet work, we distill a three requirements for 
    429 successful centralized control.  
    430 \begin {description} 
    431 \item[Reliable Path to Centralized controller]  
    432  
    433 A logical controller serves as the brain of the network, gathering 
    434 information from all the routers in the network, and disseminating 
    435 control commands.  One of the key implications of this role hierarchy is the 
    436 need for a stable channel between the controller and all routers in the 
    437 network at all times.  %% In many cases, this channel uses logically different 
    438 %% links than the data plane, as its reliability and stability must be ensured. 
    439 In large wired networks, creation of this channel is simplified by underlying 
    440 data link layers: in Ethernets, the 
    441 underlying layer-2 spanning tree algorithm provides reachability between all 
    442 switches and end-hosts in a subnet.%% , and lookups can be accomplished using ARP. 
    443 %% This is feasible because of the local broadcast nature of communications on a 
    444 %% subnet.  Extending this reachability across multiple subnets can be done by 
    445 %% introducing layer-3 routing elements as well.  Furthermore, once this channel 
    446 %% has been established, it remains relatively stable, except in cases of severe 
    447 %% congestion or physical partitioning, which are outside of the scope of our 
    448 %% problem. 
    449 % In contrast, no such natural mechanisms exist in L2N environments, and so the 
    450 % network layer must build a robust path to the controller.   
    451 % Much 
    452 Collection protocols may be used to provide just this functionality. 
    453 %% The 
    454 %% link-layer provides only single-radio-hop communications, with no inherent 
    455 %% mechanisms for building neighbor tables.   
    456 %% Networks can be relatively deep, 
    457 %% with diameters of 20 or more hops possible~\cite{ietf-industrial}. 
    458 %% Furthermore, the quality of links vary widely, both spatially and temporally, 
    459 %% creating a lossy path that is highly dynamic even without mobility. 
    460 %% Finally, this channel must be maintained with minimal control traffic and 
    461 %% network overhead, per the requirements outlined previously. 
    462 \item[Consistent Global View of Topology]  
    463 One benefit of centralized routing is the ability to make routing decisions at 
    464 a central location that has complete information about the state of the entire 
    465 network. %%  In to order to create such a global view, all the elements of the 
    466 %% network must register upon joining, and then provide updates of new neighbors 
    467 %% and changes in the state. 
    468 %%   In many existing solutions, there are periodic 
    469 %% heartbeats that are sent in the form of link state advertisements, which 
    470 %% allows the controller to maintain this global view. 
    471 Three main factors complicate this task in L2Ns: control traffic restrictions, 
    472 memory constraints, and dynamic topology.   
    473 Only a trickle of bandwidth is generally available to propagate network state towards 
    474 a control element. 
    475 %% Control traffic must not exceed 
    476 %% the data rate, and so consistent periodic link state advertisements with 
    477 %% updates sent to the controller are not practical.  %% Furthermore, the 
    478 %% loss-response ROLL criteria dictates that control traffic in response to a 
    479 %% loss must be limited to active paths or the single-hop broadcast domain, 
    480 %% preventing triggered global updates.  
    481 Links exhibit temporal 
    482 and spatial variability, which complicates %: links are not binary, so  
    483 maintaining a consistent 
    484 view of network links. % qualities is difficult. %% would require significant amounts of control 
    485 %% traffic.  
    486 Finally, memory constraints prevent individual nodes from 
    487 maintaining state for all nodes in the network, or even the single-hop 
    488 broadcast neighborhood.  Therefore, forming the complete topology may 
    489 not be possible.%; we investigate the extent to which this is necessary. 
    490  
    491 %% , leading to the unavailability of needed information, 
    492 %% even if control traffic was not problematic. 
    493  
    494 \item [Providing Reliability Over Lossy Links] 
    495 The target environment for the majority of existing routing protocols is often 
    496 highly-reliable, in the form of either wired links, or single-hop wireless 
    497 links that provide relatively high reliability, through both link-layer and 
    498 typically transport layer retransmissions. 
    499 In L2Ns, the low-power nature of the ratio leads to a small 
    500 Signal-to-Interference-and-Noise Ratio (SINR) on many links.  As demonstrated 
    501 by Srinivasan {\it et al.} \cite{underlying}, the SINR of these links is often on 
    502 the cusp of the necessary threshold to receive a packet, which results in 
    503 bimodal links due to variations in signal strength and interference levels. 
    504 Consequently, accurate link estimation and local repair techniques are critical.%% , both in accurately 
    505 %% calculating a links quality/cost, and also in identifying and accounting for 
    506 %% temporal shifts in these link qualities in order to minimize transmission 
    507 %% stretch while maintaining reliability. 
    508 \end{description} 
    509  
    510 Although it is not obvious that any element of centralized control is a 
    511 natural fit for L2N routing since existing centralized solutions rely on 
    512 assumptions which do not automatically hold in this setting, research in the 
    513 sensor network literature can be re-cast to help achieve the necessary 
    514 prerequisites. 
    515  
    516 %% policy 
    517 % Futhermore, none of the related wireless work we have considered has addressed 
    518 % the need for policy routing in wireless networks.  The ability to optimize for 
    519 % metrics other then ETX or hop-count, and to do so ``on the fly'' becomes 
    520 % increasingly important as networks begin to be composed of devices with 
    521 % different power levels, processing rates, and security capabilities, it 
    522 % becomes desirable to have a framework with which to tie together all these 
    523 % metrics and optimize over the joint metrics and policy space. 
    524  
    525  
    526 %% Finally, one way of providing effective routing is to utilize meaningful, or 
    527 %% routable, addressing schemes.  One of the reasons that IP prefix-based routing 
    528 %% is effective is that IP-addresses often tie together a name and geographical 
    529 %% location.  %% In other words, except in very rare cases (such as Mobile IP 
    530 %% %% solutions), all IP addresses with a shared prefix (of varying length depending 
    531 %% %% on the size of subnet) are geographically co-located, allowing remote routers 
    532 %% %% to maintain a single routing entry for that block of addresses. 
    533 %% This reduces 
    534 %% both memory state at routers, and the size of control traffic. 
    535 %% This conflation of address and location is difficult in L2Ns.%%   We discussed 
    536 %% %% examples of this in section~\ref{ssec:existing-other} in the form of 
    537 %% %% geographic routing and compact routing schemes.  
    538 %% The challenge in implementing 
    539 %% such a design is the dynamic nature of the network \cite{shenkergeo}.  Varying link qualities 
    540 %% cause optimal routing paths to change, which would imply that the router's 
    541 %% address would have to adjust accordingly.%%   This would require a lookup service 
    542 %% %% for nodes to report updates to, and request lookups from.  This is in fact one 
    543 %% %% of the main shortcomings of proposed geographic and compact routing schemes in 
    544 %% %% the L2N space.  Despite the shortcomings of existing solutions, this may be a 
    545 %% %% potential avenue that warrants additional research. 
    546  
    547  
    548  
    549  
    550  
    551 %Multiprotocol Label Switching~\cite{mpls}, or MPLS, is a commonly used example of centralized routing in large networks.  Network operators setup explicit paths, often either to provide QoS or VPN services, by installing label entries at Label Switch Routers in the network.  Packets then carry a label, or even potentially a stack, which are attached at Label Edge Routers, allowing for fast lookups at each Label Switch Router.  This setup assumes a relatively static network, and paths are manually set-up, with no clear correlation with data traffic. 
    552  
     105Centralized routing is hardly a novel concept and we explore several 
     106existing protocols in this section that embrace a centralized design. 
     107The critical distinction is that these earlier solutions were designed 
     108for wired networks characterized by low churn, high bandwidth, and 
     109significant memory.  These assumptions do not hold in sensornets which 
     110exhibit high-churn, low-bandwidth, and high resource-constraints. 
     111 
     112Ethane~\cite{ethane} is the work that is architecturally most similar 
     113to ours.  Designed for large enterprise networks, Ethane divides the 
     114network into four tiers: controllers, switches, hosts, and users. 
     115Network administrators specify high-level access policies.  Each 
     116switch maintains a flow table against which it can classify incoming 
     117packets based on header fields.  If no applicable entry exists, the 
     118packet is forwarded to the controller where the controller consults 
     119its policy specification, and installs flow entries along the path. 
     120 
     121We adopt several aspects of Ethane's design including ``default 
     122routes'' to a controller and maintaining global network topology 
     123centrally.  Although architecturally similary, low-power wireless 
     124links expose many assumptions underlying Ethane's operation, which 
     125must be addressed in our context.  Among the (invalid) assumptions are 
     126(i) a reliable path exists to the central controller; (ii) a 
     127consistent global view of the topology is available; and (iii) links 
     128are generally stable.  None of these assumptions hold in a wireless 
     129sensornet. 
     130 
     131Other centralized Internet routing designs exist, which we review 
     132briefly for completeness.  The Routing Control Platform architecture 
     133makes the case for separating the control aspect of routing from 
     134routers~\cite{feamster-separating, caesar-rcp}.  Greenberg {\it et 
     135  al.}  present 4D~\cite{4d,tesseract}, a general framework for 
     136separating routing from routers.  The framework consists of four 
     137components: a decision plane, a dissemination plane, a discovery 
     138plane, and a data plane. 
     139 
     140 
     141 
     142%% {\bf OLD STUFF BELOW} 
     143 
     144 
     145%% highlighting aspects that 
     146%% influenced our thinking and highlight aspects of the prior work that 
     147%% we adopt, and draw a dis directly, and contrast our 
     148 
     149%% examine some of the existing solutions in 
     150%% the space, highlighting elements which serve as building blocks for 
     151%% our design and their shortcomings.  Creating a distinct control plane 
     152%% for routing is not a novel idea, and so we examine related works in 
     153%% internetworking.  Such work served as inspiration for our design, yet 
     154%% we also discuss barriers that prevent such works from being naturally 
     155%% portable to L2Ns. 
     156 
     157 
     158 
     159%% \subsection{Existing Low-Power Protocols} 
     160 
     161%% Routing has been an integral part of L2Ns since their emergence, as 
     162%% data and commands had to be relayed between nodes.  Initially, as many of the 
     163%% deployments focused on various types of monitoring, the 
     164%% predominant communication paradigm was {\emph{collection-Oriented}}, or 
     165%% {\emph{many-to-one}} routing.  By association 
     166%% {\emph{dissemination-Oriented}}, or {\emph{one-to-many}}, routing also 
     167%% emerged because of the need to send commands to the nodes ({\it e.g.} for 
     168%% configuration). 
     169%% %   We examine the progression and state-of-the art for these 
     170%% % constrained L2N routing protocols. 
     171 
     172%% % In addition, as the set of potential application domains expanded, the need 
     173%% % for richer and fuller routing protocols became clear, {\it i.e.} those that 
     174%% % support unicast and multicast communication paradigms;  we 
     175%% % examine these in this section as well. 
     176 
     177%% \subsubsection{Collection-Oriented Protocols} 
     178 
     179%% Most collection protocols are tree- or DAG-based. 
     180%% MintRoute~\cite{mintroute} was one of the initial routing protocols for L2Ns. 
     181%% Starting with the gateway, or base station node, beacon messages  
     182%%  announce the cost to reach the gateway in terms of hops, as 
     183%% well as ETX, or expected transmissions. %%  ETX is calculated using the inverse 
     184%% %% of packet reception ratio (PRR) over a link, which was roughly approximated 
     185%% %% using an exponentially weighted moving average of received signal strength 
     186%% %% indicators (RSSI) for each packet.  
     187%% %% In subsequent versions for a popularal 
     188%% %% platform, link cost estimation was done by using a chip link quality indicator 
     189%% %% (LQI), which is provided for every received packet. 
     190%% While MintRoute was 
     191%% successful, it's main shortcomings stemmed from the limited sophistication of 
     192%% its link estimation and loss response techniques, which reduced efficiency in 
     193%% difficult RF environments. 
     194 
     195%% The successor to MintRoute was CTP~\cite{ctp}.  CTP developed a more accurate 
     196%% link estimator, in which control and data traffic were used to inform link 
     197%% estimates, although two different link estimators were kept.  %% Furthermore, 
     198%% %% CTP allowed multiple destinations to serve as data sinks, in which case 
     199%% %% multiple trees were formed with each destination serving as a tree root. 
     200%% Multiple potential next-hop parents are maintained, although only 
     201%% one was used at any given time.  In one test, CTP displayed 97\% 
     202%% overall reliability in a difficult RF environment, as opposed to the 70\% 
     203%% reliability obtained by MintRoute.   
     204 
     205%% CentRoute~\cite{centroute} provides centralized tree routing, allowing for 
     206%% dissemination of tasklets and collection of information in the 
     207%% Tenet~\cite{tenet} architecture.  A single node serves as the sink, and other 
     208%% nodes in the network send join request messages which are forwarded to the 
     209%% sink. 
     210%% %%   If the node is a direct 
     211%% %% neighbor of the sink, the sink replies with an accept message, and the node 
     212%% %% joins the tree by selecting the sink as its next hop.  Otherwise, if a node 
     213%% %% that is part of the tree overhears a join message, it forwards this message to 
     214%% %% the sink.   
     215%% %% When the sink receives a join message (originating from a 
     216%% %% non-neighbor node), it waits for the same join message to arrive along 
     217%% %% multiple paths and selects the best path.  The sink then source routes the 
     218%% %% accept message to the originating node along this path, and the node 
     219%% %% appropriately sets its next hop.  
     220%% Once created, the tree is frozen unless 
     221%% some threshold of failures occur, after which a node disengages and begins the 
     222%% join process anew.  This approach is limited to single destination routing, 
     223%% and also affords no flexibility for dynamic topology conditions unless links 
     224%% completely break. 
     225 
     226%% Koala~\cite{koala} is designed to provide mechanisms to enable efficient data 
     227%% retrieval from in-network nodes.  It presents centralized mechanisms similar 
     228%% to those used by {\lowthane}, in which nodes explore their neighborhood, and 
     229%% provide this information to a central controller, which subsequently installs 
     230%% an appropriate route in the network.  However, similar to CentRoute, these 
     231%% mechanisms are only targeted towards collecting data to a root.  Further, it 
     232%% does not position these mechanisms within a broader architecture or 
     233%% investigate their performance under different conditions; the focus of the 
     234%% work is a MAC-layer technique called Low Power Probing. %% , which enables more 
     235%% %% effective duty-cycling. 
     236 
     237%% TSMP~\cite{tsmp}, incorporated into the WirelessHART standard\cite{whart}, is a 
     238%% well-established industry approach that includes centralize scheduling of 
     239%% channel resources to achieve predictable latencies.  It operates at the link-layer, using channel 
     240%% hopping, and does not provide any routing capabilities specified in publicly 
     241%% available documentation.  Its tight scheduling of media accesses differs 
     242%% significantly from the functionality of our {\controller}s, which contain only 
     243%% soft state and are relatively asynchronous. 
     244 
     245 
     246%% %Koala \cite{koala} presents a design which contains mechanisms 
     247%% %similar to those used by {\lowthane} such as collecting topology and installing 
     248%% %routes within the network.  However, it does not position these mechanisms 
     249%% %within a broader architecture or investigate their performance under different 
     250%% %conditions; the focus of the work is a MAC-layer technique called Low Power 
     251%% %Probing. 
     252 
     253%% %% One of the shifts in ROLL has been toward an IPv6 based architecture, centered 
     254%% %% around 6LoWPAN~\cite{6lowpan}, the low-powered adaptation designed for L2Ns. 
     255%% Hui presented a complete IPv6 network architecture for L2Ns, which includes a 
     256%% low power link along with network and transport layers adapted to the unique 
     257%% characteristics of these networks \cite{hui}.  This work adapted collection to 
     258%% the IP architecture by conceptualizing it as choosing an IP default route.  It 
     259%% also provided unicast capabilities by allowing a controller to communicate 
     260%% with individual nodes by source routing the packet, forming the basis for 
     261%% triangle routing of intra-network traffic.   
     262%% % However, such an approach also incurs unnecessarily high stretch, and 
     263%% % naturally forms bottleneck links around the controller. 
     264 
     265%% % However, the implementation presented relies on routing intra-network 
     266%% % traffic through a collection point to implement unicast traffic, leading to 
     267%% % unnecessarily high stretch. 
     268%% %%   Of particular interest to our work was 
     269%% %% Hui's contribution to reliable multi-parent tree-based routing, and confidence 
     270%% %% driven link estimation.   
     271%% % This basic design represents the leading edge of collection research, and formed 
     272%% % {\lowthane}'s basic template; we extend it with optimizations for 
     273%% % point-to-point traffic. 
     274 
     275%% From these myriad examples of collection protocols, we drew the elements of a 
     276%% reliable control plane.  Dynamic and accurate link estimation is important to 
     277%% deal with the lossy nature of the medium, and data-path validation is helpful 
     278%% to prevent harmful routing loops.  Finally, nodes should exploit spatial 
     279%% redundancy by keeping a set of potential next hops rather than a single choice. 
     280 
     281%% %% the  foundation and inspiration 
     282%% %% for many of our algorithms and has been invaluable in helping us to frame our 
     283%% %% architecture. 
     284 
     285%% %% A final approach worth considering are receiver-oriented protocols such as 
     286%% %% ExOR.  This family of protocol forwards data by broadcasting packets and 
     287%% %% relying on receivers closer to a sink forwarding it.  They must contain a 
     288%% %% resolution protcol to suppress duplicates.  The inherent difficult is  
     289 
     290%% \subsubsection{Other Routing Protocols} 
     291%% \label{ssec:existing-other} 
     292 
     293%% BVR~\cite{ref:bvr} is a geographical location-based routing protocol for 
     294%% L2Ns.  A set of landmark beacons are selected whose identity is known by all 
     295%% nodes in the network.  Every node is given a virtual coordinate, which is the 
     296%% vector of distances to all beacons in the network.  When a node wants 
     297%% to send a packet to a destination node, it greedily forwards the packet along 
     298%% routes that move it closer to the destination.   
     299%% %% If it is unable to do this (i.e. gets stuck at some point), it forwards the packet to the beacon closest to the destination.  In the case that this beacon is not able to use greedy routing to forward to the destination, scoped flooding is used to find the destination.  As demonstrated in other papers~\cite{s4-nsdi},  
     300%% BVR suffers from poor transmission and routing stretch, particularly as the 
     301%% network grows in size \cite{s4-nsdi}.  %% Variations to improve performance trade 
     302%% %% off resources by maintaining the entire 2-hop neighbor table in memory. 
     303%% Another difficulty is that in such a network, the identity of the beacons must 
     304%% be known {\it a priori}, and can not be modified dynamically without a location 
     305%% service \cite{bvr-location}. 
     306%% %% .  A BVR location 
     307%% %% service~\cite{bvr-location} was designed, but struggled because it still 
     308%% %% necessitated significant network overhead and state. 
     309 
     310%% S4~\cite{s4-nsdi}, is designed to provide small stretch and state in order to 
     311%% enable scalable routing.  It utilizes a modified compact routing algorithm 
     312%% tailored for L2Ns. %%  A subset of nodes in the network are designated as beacons, 
     313%% %% with $\sqrt{N}$ being the optimal number.  A node $n$'s local neighborhood 
     314%% %% cluster is defined as the set of nodes whose distance to $n$ is less than or 
     315%% %% equal to the distance to their closest beacon.  Each node maintains routes to 
     316%% %% all other nodes in their cluster, and all beacons in the network.  A node can 
     317%% %% route to nodes outside its cluster by routing to the destination node's 
     318%% %% closest beacon.   
     319%% %% However, this feature requires a directory service that the 
     320%% %% node can contact to obtain this information, and furthermore a beacon 
     321%% %% selection mechanism is needed, both for initial beacon selection, as well as 
     322%% %% under failure conditions, and the number of beacons can not be easily modified 
     323%% %% during runtime.   
     324%% S4 has a theoretical worst case routing stretch of $3$, 
     325%% although published results indicate an average routing stretch of $1.2$ while maintaining 
     326%% $\mathit{O}(\sqrt{N})$ state.  Like BVR, deploying S4 would require also 
     327%% designing a distributed directory service.  Neither of these protocols 
     328%% optimize for the common workload where most traffic is destined towards an 
     329%% egress point; unlike S4 and BVR's homogeneous network, {\lowthane} starts with 
     330%% optimizing collection and emphasizing heterogeneity. 
     331 
     332%%   %% In section~\ref{sec:tossim}, we compare {\lowthane} to S4, 
     333%% %% demonstrating that we achieve greater reliability, with smaller stretch, 
     334%% %% state, and control traffic. 
     335 
     336%% %Finally Hui's work explores extending the 
     337%% %Internet architecture to L2Ns, and develops the algorithms necessary for 
     338%% %default route formation in \cite{hui}.  This work has been invaluable in 
     339%% %helping us to frame our architecture. 
     340 
     341%% \subsection{Ad-Hoc Networking Protocols} 
     342 
     343%% Finally, there is also an extensive literature on fully distributed routing 
     344%% protocols for ad-hoc wireless networks. % and L2Ns in particular. %% The 
     345%% %MANET~\cite{MANET} working group focuses 
     346%% %% on such protocols and issues. 
     347%% AODV~\cite{aodv} uses flooded RREQ messages to discover paths to destinations 
     348%% on demand, with intermediate nodes creating hop-by-hop entries for 
     349%% bidirectional flows.  DSR~\cite{dsr} is also an on-demand routing protocol, 
     350%% but uses source routing to route packets.  OLSR~\cite{olsr} is a link state 
     351%% algorithm that uses multipoint relays to reduce the flood of link-state 
     352%% advertisements.  802.11s, a draft mesh routing standard uses a variant of AODV 
     353%% called AODV-RM for intra-subnet point-to-point communication, but optimizes 
     354%% for an access network's workload by proactively building a routing tree towards 
     355%% egress points \cite{80211s}. 
     356 
     357%% These and other protocols provide point-to-point routing capabilities in 
     358%% wireless networks.  Both on-demand and link-state based solutions exist, but 
     359%% the focus is on providing reliable any-to-any delivery in networks with mobile 
     360%% nodes that are resource rich.  Consequently, most of these protocols have 
     361%% large control traffic and/or state requirements.  A protocol 
     362%% survey~\cite{ietf-protocol-survey} recently evaluated the specifications of 
     363%% the most widely used ad-hoc protocols against five criteria, and 
     364%% noted that in their current form no ad-hoc protocol meets more than 
     365%% three of the five criteria.%%   L2N-specific implementations of some of these 
     366%% %% protocols exists, such as TinyAODV and TYMO, but have received limited 
     367%% %% exposure and acceptance due to their poor (or non-existent) evaluations and 
     368%% %% limited feature-sets. 
     369 
     370 
     371%% \subsection{Internet Solutions} 
     372%% \label{sec:related-centralize} 
     373 
     374%% The concept of centralized routing, or separating the control and data planes, 
     375%% is not a novel one; we explore several existing protocols later in this 
     376%% section that embrace a centralized paradigm.  However, these solutions 
     377%% were designed primarily for the Internet, characterized by 
     378%% low churn and high bandwidth.  This is in stark contrast to the 
     379%% high-churn, low-bandwidth nature of typical L2N environments. %% Consequently, 
     380%% %% we discuss three factors that are critical to the success of centralized 
     381%% %% solutions, and while trivial to achieve in general networking environments, 
     382%% %% present the main obstacles to a centralized solution. 
     383 
     384%% % \subsection{Internet Routing Protocols} 
     385 
     386%% %% We discuss two separate bodies of work in this section: Centralized Internet 
     387%% %% Routing Architectures / Protocols, and IP-Based Ad-Hoc Networking 
     388%% %% Protocols(i.e. MANET~\cite{MANET}). 
     389 
     390%% % \subsubsection{Centralized Routing Architectures} 
     391 
     392%% The most conceptually similar work is Ethane~\cite{ethane}, %% , the successor to SANE~\cite{sane} and the predecessor to 
     393%% %% NOX~\cite{nox},  
     394%% a centralized architecture for implementing high-level 
     395%% security policies.  Designed for large enterprise networks, the network is 
     396%% divided into four tiers: controllers, %  (or multiple ones for fault tolerance), 
     397%% switches, end hosts or servers, and users.  %% Whenever a switch is connected to 
     398%% %% the network, it registers with the controller, and establishes a secure 
     399%% %% backchannel to it.  When an end host joins the network, the switch also 
     400%% %% reports this registration, reporting the class of device and attachment port. 
     401%% %% Finally, users are able to associate with one or more end host machines. 
     402%% %% Consequently, the controller maintains a complete and accurate view of the 
     403%% %% topology.  
     404%% Network administrators specify high-level access policies, such as 
     405%% restricting which servers a certain user may communicate with.  Each switch 
     406%% maintains a flow table against which it can classify incoming packets using 
     407%% a packet header's 10-tuple.  If no applicable entry exists, the packet is 
     408%% forwarded to the controller.  The controller consults its policy 
     409%% specification, and installs flow entries along the path. 
     410%% %% {\lowthane} utilizes this concept of installing flow entries and forwarding 
     411%% %% packets belonging to unclassifiable flows to the controller. 
     412%% %Ethane~\cite{ethane} is the work that is the most similar to ours.  Designed 
     413%% %for large enterprise networks, the network is divided into three tiers: a 
     414%% %controller (or multiple ones for fault tolerance), switches, and end hosts or 
     415%% %servers.   
     416%% % Whenever a switch is connected to the network, it registers with the 
     417%% % controller, and establishes a secure channel to it.  When a server or 
     418%% % middlebox joins the network, it also registers with the controller, reporting 
     419%% % the connecting switch and port.  Consequently, the controller maintains a 
     420%% % complete and accurate view of the topology.  Each switch maintains a flow 
     421%% % table in which it can match packet 5-tuples.  If no applicable entry exists, 
     422%% % the packet is forwarded to the controller.  The controller consults its policy 
     423%% % specification, and installs hop-by-hop 
     424%% % flow entries along the path. 
     425 
     426%% Key elements visible in this design include forming a channel and ``default 
     427%% routing'' to a controller, and maintaining global topology.  At a high level, 
     428%% {\lowthane} can be considered an adaptation or instantiation of the Ethane 
     429%% architecture.  Bringing Ethane together with low-power wireless 
     430%% brings out many of the assumptions underlying Ethane's operation, which must 
     431%% be addressed for the system to be practical; we explictly note three of the 
     432%% most important assumptions in the next section. 
     433 
     434%% % {\lowthane} also 
     435%% % addresses these issues, but the details differ significantly when no wired 
     436%% % infrastructure is available. 
     437 
     438%% %%  are present in the {\lowthane} design.  However, 
     439%% %% elements which Ethane takes for granted like {\controller} reachability and 
     440%% %% topology inference must be carefully addressed in the more dynamic L2N setting. 
     441 
     442%% %% Ethane assumes the availability of a secure and reliable channel to the 
     443%% %% controller, and also lends itself to trivial construction of a global topology 
     444%% %% view.  Network switches all have ample bandwidth and memory, allowing Ethane 
     445%% %% to support 70,000 flows / second at line speeds.  Finally, Ethane operates on 
     446%% %% top of layer 2 switching, which performs automatic failure rerouting to direct 
     447%% %% a packet to the addressed next hop.  While we discuss Ethane's successor, 
     448%% %% NOX~\cite{nox}, further in chapter~\ref{ch:player-centralized}, we note that 
     449%% %% architecturally it is the same as Ethane and hence suffers from the same 
     450%% %% limitations in the context of ROLL. 
     451 
     452%% A number of other centralized internet routing designs exist, which we mention 
     453%% briefly for completeness.  Feamster's Routing Control Platform, implemented by 
     454%% Caesar, made the case for separating the control aspect of routing from 
     455%% routers \cite{feamster-separating, caesar-rcp}.  Greenberg {\it et al.} 
     456%% present 4D~\cite{4d,tesseract}, a general framework for separating routing from routers. 
     457%% The framework consists of four components: a decision plane, a dissemination 
     458%% plane, a discovery plane, and a data plane. 
     459 
     460 
     461%%   %% Designed for both intra-AS (autonomous 
     462%% %% system) and inter-AS routing,  
     463%% %% The motivation was that by distributing all 
     464%% %% aspects of routing, interactions became increasingly complex, heavy bandwidth 
     465%% %% usage resulted from topology changes, and forwarding loops and routing 
     466%% %% failures that were incredibly difficult to diagnose continually presented 
     467%% %% themselves.  To alleviate such difficulties, RCP serves as the primary 
     468%% %% interaction point for all routers within an AS.  %% By participating in the 
     469%% %% network, RCP was able to obtain a complete view of the topology.  
     470%% %% A network 
     471%% %% operator could specify the routing policy at a single place (which typically 
     472%% %% was to simply mimic that of a fully-meshed iBGP AS), and then the RCP node 
     473%% %% would advertise the correct route to each router in the network. 
     474%% %% RCP would 
     475%% %% then serve the role of an egress router in communicating routing information 
     476%% %% with other AS's using eBGP (allowing for a smooth incremental deployment).  
     477%% %%  As 
     478%% %% an ideal end-state, all AS's would have an RCP, and these RCP's would 
     479%% %% communicate with each other to exchange routing information. 
     480%% %% Caesar {\it et al.} presented an actual design and implementation of an 
     481%% %% RCP~\cite{caesar-rcp}.  %% Leveraging the existing OSPF and BGP protocols in 
     482%% %% place in most existing networks, the implementation had three distinct 
     483%% %% components: an IGP Viewer, a BGP Engine, and a Route Control Server (RCS). 
     484%% %% The IGP Viewer obtains IGP topology information by building adjacencies with 
     485%% %% routers in the network.  The BGP Engine learns BGP routes from the routers, 
     486%% %% and also communicates routing assignments to each router.  Finally, the RCS 
     487%% %% uses the information obtained by the other two components to compute routing 
     488%% %% assignments according to some specified policy, which currently defaults to 
     489%% %% full-mesh iBGP configurations.  
     490%% %% A key decision was distributing RCS nodes to avoid creating a single point of 
     491%% %% failure, but they are not precise replicas and there are only loose 
     492%% %% consistency requirements. %, which are detailed in the paper. 
     493%% %% While the creation of a distinct control hierarchy certainly permeates our 
     494%% %% design, the practicality of RCP for L2Ns is limited.   
     495%% %% However, the design focuses on only 
     496%% %% the intra-AS aspect, leveraging full underlying IP and TCP connectivity, as 
     497%% %% well as OSPF and BGP for information dissemination, which neither exist nor 
     498%% %% are practical in L2Ns.  A key part of our design involves designing 
     499%% %% mechanisms that provide this underlying connectivity and accurately 
     500%% %% constructing and maintaining global topology views. %%  Furthermore, full 
     501%% %% consistent connectivity is maintained by all nodes in the network at all times 
     502%% %% (often through flooding of Link State Advertisements), resulting in 
     503%% %% significant memory and network overhead, as opposed to on-demand schemes that 
     504%% %% are often a better fit for ROLL traffic patterns.  The lossy and dynamic 
     505%% %% nature of low-power wireless would only exacerbate network congestion in such 
     506%% %% an architecture.  Finally, their attempts at reducing overhead relies on 
     507%% %% prefix-based IP addressing, conflating node address and geographical location, 
     508%% %% while ROLL networks primarily embrace flat-label based addressing schemes. 
     509 
     510%% %% decision plane, composed of a single logical decision element (which is 
     511%% %% replicated for fault tolerance), serves as the {\emph{network brain}}, serving 
     512%% %% as the interface for network-wide policy specification, which is subsequently 
     513%% %% communicated to individual routers and switches.  The dissemination plane is 
     514%% %% tasked with creating the logical channels between the decision element and 
     515%% %% each router and switch, providing a medium for control information flow, 
     516%% %% independent of the state of the data plane.  The discovery plane refers to the 
     517%% %% process in which each router/switch discovers its own capabilities, its 
     518%% %% neighboring routers/switches (including those in other ASes), and communicates 
     519%% %% this information across the dissemination plane to the decision element, 
     520%% %% allowing for the creation of a global view of the network.  Finally, the data 
     521%% %% plane is greatly simplified, responsible only for forwarding packets according 
     522%% %% to the forwarding table entries installed by the decision element. 
     523%% %% As a framework paper, 4D does not provide any specific instantiations of any 
     524%% %% of the components.   
     525%% %% Although targeted towards the general Internet, 
     526%% %% the same modularization applies in the context of ROLL as well; our routing 
     527%% %% architecture for L2Ns can roughly be 
     528%% %% decomposed into the same four components.  This componentization is 
     529%% %% particularly effective in highlighting the characteristics that differentiate 
     530%% %% L2Ns from the general Internet.  The dissemination plane must be built over 
     531%% %% a series of lossy unreliable links, in which the {\emph{optimal}} paths are 
     532%% %% potentially shifting.  %% Low power wireless also complicates the 
     533%% %% discovery plane, as there is a degree of dynamicity, even if the nodes 
     534%% %% are physically stationary, and also power duty cycles that reach 0.1\% and 
     535%% %% below. 
     536%% %% Furthermore, energy constraints prohibit excessive control traffic, which 
     537%% %% complicates communication of this discovery information to the decision plane. 
     538%% %% Finally, the data plane must provide certain reliability and delivery 
     539%% %% guarantees in the face of lossy and moving paths.  Yan {\it et al.} present 
     540%% %% Tesseract~\cite{tesseract}, a faithful instantiation of the 4D framework and 
     541%% %% concept.  %% Switches use periodic beaconing to determine their 
     542%% %% neighboring nodes, as well as their own capabilities, and then transmit this 
     543%% %% information to a {\emph{decision element}}, which forms the basis of the 
     544%% %% decision plane.  Path explorer messages are flooded through the network to 
     545%% %% enable the building of the dissemination plane.  Subsequently, the decision 
     546%% %% element sends all control messages to switches using source routes, and in 
     547%% %% turn, these switches reverse these source routes for sending information to 
     548%% %% the decision element.  The decision plane is implemented as an abstract model, 
     549%% %% in which specific algorithms are separated from the underlying link 
     550%% %% technologies, to provide an element of portability.  To provide robustness at 
     551%% %% the decision plane level, multiple decision elements are maintained at any 
     552%% %% given time, with all information being fully replicated across all of them. 
     553%% %% However, only one of them is a master (i.e. active) at any given time, while 
     554%% %% the rest are only used in the case of a hot standby.  The master is chosen 
     555%% %% through a leader-election process and arbitrarily assigned priorities. 
     556%% %% Security also plays a large role in Tesseract, with numerous mechanisms to 
     557%% %% enable encryption, certification, and authentication. 
     558%% %% Tesseract also provides the fully connected best path routing, which 
     559%% %% necessitates large amounts of control traffic, and state at the nodes that 
     560%% %% scales with the size of the network. %% The 
     561%% %% dissemination plane relies on source routing in both directions, stripping the 
     562%% %% switches of any local flexibility to deal with unreliable links, and dynamic 
     563%% %% topology shifts, as is common in L2N networks.  This results in additional 
     564%% %% control traffic for every instance of packet delivery failure, as does the 
     565%% %% need for full information at decision elements. 
     566 
     567 
     568%% %\begin{itemize} 
     569%% %\item{Tempest} 
     570%% %\item{FIRE} 
     571%% %\end{itemize} 
     572 
     573%% % \subsubsection{Stable Backchannel To Centralized Controller} 
     574 
     575%% \subsection{Towards a Hybrid Solution} 
     576 
     577%% From this disparate related Internet work, we distill a three requirements for 
     578%% successful centralized control.  
     579%% \begin {description} 
     580%% \item[Reliable Path to Centralized controller]  
     581 
     582%% A logical controller serves as the brain of the network, gathering 
     583%% information from all the routers in the network, and disseminating 
     584%% control commands.  One of the key implications of this role hierarchy is the 
     585%% need for a stable channel between the controller and all routers in the 
     586%% network at all times.  %% In many cases, this channel uses logically different 
     587%% %% links than the data plane, as its reliability and stability must be ensured. 
     588%% In large wired networks, creation of this channel is simplified by underlying 
     589%% data link layers: in Ethernets, the 
     590%% underlying layer-2 spanning tree algorithm provides reachability between all 
     591%% switches and end-hosts in a subnet.%% , and lookups can be accomplished using ARP. 
     592%% %% This is feasible because of the local broadcast nature of communications on a 
     593%% %% subnet.  Extending this reachability across multiple subnets can be done by 
     594%% %% introducing layer-3 routing elements as well.  Furthermore, once this channel 
     595%% %% has been established, it remains relatively stable, except in cases of severe 
     596%% %% congestion or physical partitioning, which are outside of the scope of our 
     597%% %% problem. 
     598%% % In contrast, no such natural mechanisms exist in L2N environments, and so the 
     599%% % network layer must build a robust path to the controller.   
     600%% % Much 
     601%% Collection protocols may be used to provide just this functionality. 
     602%% %% The 
     603%% %% link-layer provides only single-radio-hop communications, with no inherent 
     604%% %% mechanisms for building neighbor tables.   
     605%% %% Networks can be relatively deep, 
     606%% %% with diameters of 20 or more hops possible~\cite{ietf-industrial}. 
     607%% %% Furthermore, the quality of links vary widely, both spatially and temporally, 
     608%% %% creating a lossy path that is highly dynamic even without mobility. 
     609%% %% Finally, this channel must be maintained with minimal control traffic and 
     610%% %% network overhead, per the requirements outlined previously. 
     611%% \item[Consistent Global View of Topology]  
     612%% One benefit of centralized routing is the ability to make routing decisions at 
     613%% a central location that has complete information about the state of the entire 
     614%% network. %%  In to order to create such a global view, all the elements of the 
     615%% %% network must register upon joining, and then provide updates of new neighbors 
     616%% %% and changes in the state. 
     617%% %%   In many existing solutions, there are periodic 
     618%% %% heartbeats that are sent in the form of link state advertisements, which 
     619%% %% allows the controller to maintain this global view. 
     620%% Three main factors complicate this task in L2Ns: control traffic restrictions, 
     621%% memory constraints, and dynamic topology.   
     622%% Only a trickle of bandwidth is generally available to propagate network state towards 
     623%% a control element. 
     624%% %% Control traffic must not exceed 
     625%% %% the data rate, and so consistent periodic link state advertisements with 
     626%% %% updates sent to the controller are not practical.  %% Furthermore, the 
     627%% %% loss-response ROLL criteria dictates that control traffic in response to a 
     628%% %% loss must be limited to active paths or the single-hop broadcast domain, 
     629%% %% preventing triggered global updates.  
     630%% Links exhibit temporal 
     631%% and spatial variability, which complicates %: links are not binary, so  
     632%% maintaining a consistent 
     633%% view of network links. % qualities is difficult. %% would require significant amounts of control 
     634%% %% traffic.  
     635%% Finally, memory constraints prevent individual nodes from 
     636%% maintaining state for all nodes in the network, or even the single-hop 
     637%% broadcast neighborhood.  Therefore, forming the complete topology may 
     638%% not be possible.%; we investigate the extent to which this is necessary. 
     639 
     640%% %% , leading to the unavailability of needed information, 
     641%% %% even if control traffic was not problematic. 
     642 
     643%% \item [Providing Reliability Over Lossy Links] 
     644%% The target environment for the majority of existing routing protocols is often 
     645%% highly-reliable, in the form of either wired links, or single-hop wireless 
     646%% links that provide relatively high reliability, through both link-layer and 
     647%% typically transport layer retransmissions. 
     648%% In L2Ns, the low-power nature of the ratio leads to a small 
     649%% Signal-to-Interference-and-Noise Ratio (SINR) on many links.  As demonstrated 
     650%% by Srinivasan {\it et al.} \cite{underlying}, the SINR of these links is often on 
     651%% the cusp of the necessary threshold to receive a packet, which results in 
     652%% bimodal links due to variations in signal strength and interference levels. 
     653%% Consequently, accurate link estimation and local repair techniques are critical.%% , both in accurately 
     654%% %% calculating a links quality/cost, and also in identifying and accounting for 
     655%% %% temporal shifts in these link qualities in order to minimize transmission 
     656%% %% stretch while maintaining reliability. 
     657%% \end{description} 
     658 
     659%% Although it is not obvious that any element of centralized control is a 
     660%% natural fit for L2N routing since existing centralized solutions rely on 
     661%% assumptions which do not automatically hold in this setting, research in the 
     662%% sensor network literature can be re-cast to help achieve the necessary 
     663%% prerequisites. 
     664 
     665%% %% policy 
     666%% % Futhermore, none of the related wireless work we have considered has addressed 
     667%% % the need for policy routing in wireless networks.  The ability to optimize for 
     668%% % metrics other then ETX or hop-count, and to do so ``on the fly'' becomes 
     669%% % increasingly important as networks begin to be composed of devices with 
     670%% % different power levels, processing rates, and security capabilities, it 
     671%% % becomes desirable to have a framework with which to tie together all these 
     672%% % metrics and optimize over the joint metrics and policy space. 
     673 
     674 
     675%% %% Finally, one way of providing effective routing is to utilize meaningful, or 
     676%% %% routable, addressing schemes.  One of the reasons that IP prefix-based routing 
     677%% %% is effective is that IP-addresses often tie together a name and geographical 
     678%% %% location.  %% In other words, except in very rare cases (such as Mobile IP 
     679%% %% %% solutions), all IP addresses with a shared prefix (of varying length depending 
     680%% %% %% on the size of subnet) are geographically co-located, allowing remote routers 
     681%% %% %% to maintain a single routing entry for that block of addresses. 
     682%% %% This reduces 
     683%% %% both memory state at routers, and the size of control traffic. 
     684%% %% This conflation of address and location is difficult in L2Ns.%%   We discussed 
     685%% %% %% examples of this in section~\ref{ssec:existing-other} in the form of 
     686%% %% %% geographic routing and compact routing schemes.  
     687%% %% The challenge in implementing 
     688%% %% such a design is the dynamic nature of the network \cite{shenkergeo}.  Varying link qualities 
     689%% %% cause optimal routing paths to change, which would imply that the router's 
     690%% %% address would have to adjust accordingly.%%   This would require a lookup service 
     691%% %% %% for nodes to report updates to, and request lookups from.  This is in fact one 
     692%% %% %% of the main shortcomings of proposed geographic and compact routing schemes in 
     693%% %% %% the L2N space.  Despite the shortcomings of existing solutions, this may be a 
     694%% %% %% potential avenue that warrants additional research. 
     695 
     696 
     697 
     698 
     699 
     700%% %Multiprotocol Label Switching~\cite{mpls}, or MPLS, is a commonly used example of centralized routing in large networks.  Network operators setup explicit paths, often either to provide QoS or VPN services, by installing label entries at Label Switch Routers in the network.  Packets then carry a label, or even potentially a stack, which are attached at Label Edge Routers, allowing for fast lookups at each Label Switch Router.  This setup assumes a relatively static network, and paths are manually set-up, with no clear correlation with data traffic. 
     701