Changeset 2772

Show
Ignore:
Timestamp:
10/30/09 11:23:14 (4 weeks ago)
Author:
stevedh
Message:

checkpoint

Location:
docs/Lowthane/ipsn10
Files:
3 modified

Legend:

Unmodified
Added
Removed
  • docs/Lowthane/ipsn10/design.tex

    r2766 r2772  
    9696%% packets is volatile and potentially inaccurate.  Instead, w 
    9797 
    98 When presented with a new link, {\lowthane} initially use a {\emph{spot}} link quality metric, {\it e.g.} a hardware LQI estimate, to evaluate the quality of the link.  For each link, {\lowthane} maintains a Confidence value, currently the number of transmissions that have used that link.  Once this value crosses a certain threshold, the link is considered {\emph{mature}}, and is then evaluated using the overall link cost metric, ETX or expected transmissions.We use link-layer acknowledgement frames on all unicast traffic, and base our link quality estimate on the ack reception rate: consequently, our link estimator strongly prefers bidirectional links.  To account for the temporal variability of links, {\lowthane} maintains both long-term and short-term link estimates.  Both estimators use the same metric (ETX) but with different time horizons so sa to manage the agility/stability tradeoff. 
     98When presented with a new link, {\lowthane} initially use a {\emph{spot}} link quality metric, {\it e.g.} a hardware LQI estimate, to evaluate the quality of the link.  For each link, {\lowthane} maintains a Confidence value, currently the number of transmissions that have used that link.  Once this value crosses a certain threshold, the link is considered {\emph{mature}}, and is then evaluated using the overall link cost metric, ETX or expected transmissions.We use link-layer acknowledgement frames on all unicast traffic, and base our link quality estimate on the ack reception rate: consequently, our link estimator strongly prefers bidirectional links.  To account for the temporal variability of links, {\lowthane} maintains both long-term and short-term link estimates.  Both estimators use the same metric (ETX) but with different time horizons so as to manage the agility/stability tradeoff. 
    9999%When presented with 
    100100%a new link, {\lowthane} initially uses a {\emph{spot}} link quality metric, 
     
    396396Our implementation of {\lowthane} is built on top of the {\tt blip} 
    397397IPv6 stack using {\tt 6Lowpan}~\cite{6lowpan} and the TinyOS-2.1 operating 
    398 sytem~\cite{tinyos}.  % Unless otherwise stated, our implementation uses the 
     398system~\cite{tinyos}.  % Unless otherwise stated, our implementation uses the 
    399399% default settings highlighted in Section~\ref{sec:design}.   
    400400It is portable to 
    401 most Tinyos 2 platforms, including {\tt telosb}, {\tt micaz}, {\tt iris}, and 
     401most TinyOS 2 platforms, including {\tt telosb}, {\tt micaz}, {\tt iris}, and 
    402402{\tt epic}.  All experiments were performed using the {\tt epic} platform 
    403 \cite{epic}. 
    404  
     403\cite{epic}.  Part of our contribution is making available a 
     404well-tested, open implementation of {\lowthane}, which is currently the 
     405default routing protocol used in the {\tt blip} IPv6 stack. 
     406 
  • docs/Lowthane/ipsn10/evaluation.tex

    r2741 r2772  
    88range of workloads and failure conditions. 
    99 
    10 \subsection{Evalutation} 
     10\subsection{Questions} 
    1111 
    1212Centralized routing over low-power and lossy wireless networks raises 
    1313many concerns that must be addressed in a successful design.  These 
    14 concerns, and {\lowthane}'s response to them, include the following: 
    15  
    16 {\bf Reliability.}  The lossy nature of wireless mesh networks means 
    17 that implementing reliability at end hosts alone is insufficient and 
    18 that support for both hop-by-hop retransmissions and end-to-end route 
    19 adaptations are needed. 
     14concerns include the following: 
     15 
     16\begin{description}\addtolength{\itemsep}{-0.5\baselineskip} 
     17 
     18\item[Reliability]  Does the protocol successfully deliver packets? 
     19 
     20% The lossy nature of wireless mesh networks means 
     21% that implementing reliability at end hosts alone is insufficient and 
     22% that support for both hop-by-hop retransmissions and end-to-end route 
     23% adaptations are needed. 
    2024 
    2125%Figures~\ref{fig:collection-acme}, 
     
    2327%dynamics on end-to-end reliability. 
    2428 
    25 {\bf Convergence.}  Centralized routing protocols must gather topology data 
    26 at one (or more) ``central'' locations before routing can take place, and 
    27 so link dynamics, coupled with constraints on control traffic, can 
    28 make converging on a consistent view of the network a challenge. 
     29\item[Convergence]  How long does the protocol take to find routes?   
     30 
     31% Centralized routing protocols must gather topology data 
     32% at one (or more) ``central'' locations before routing can take place, and 
     33% so link dynamics, coupled with constraints on control traffic, can 
     34% make converging on a consistent view of the network a challenge. 
    2935 
    3036%Figures~\ref{fig:top-report},~\ref{fig:stretch30-cdf}, 
     
    3238%stable topology from a cold start. 
    3339 
    34 {\bf Stretch.}  Near optimal routes, with respect to metrics like 
    35 transmission stretch, are key to conserving energy 
    36 and lowering congestion in L2Ns.  Since stretch can only be defined with 
    37 respect to a topology, and it is essentially impossible to measure the 
    38 ``true'' topology, we compare to a particular instance of the sampled 
    39 topology. 
     40\item[Stretch]  How ``good'' are the routes the protocol uses? 
     41 
     42%  Near optimal routes, with respect to metrics like 
     43% transmission stretch, are key to conserving energy 
     44% and lowering congestion in L2Ns.  Since stretch can only be defined with 
     45% respect to a topology, and it is essentially impossible to measure the 
     46% ``true'' topology, we compare to a particular instance of the sampled 
     47% topology. 
    4048%Figures~\ref{fig:concurrency-smote-trans}, 
    4149%~\ref{fig:sfailure-smote-trans}, and~\ref{fig:mfailure-smote-trans} 
    4250%evaluate the stretch incurred using {\lowthane}. 
    4351 
    44 {\bf Agility/Stability.}  Centralized routing protocols incur delay 
    45 when responding to local transients like link and node churn. 
     52\item[Agility/Stability]  How quickly can {\lowthane} respond to changes in 
     53topology? 
     54 
     55% Centralized routing protocols incur delay 
     56% when responding to local transients like link and node churn. 
    4657 
    4758%Figures~\ref{fig:sfailure-smote} and~\ref{fig:mfailure-smote} evaluate 
     
    4960%attempts to maintain stable paths under these transients. 
    5061 
    51 {\bf Scalability.}  Centralizing the points of control can make 
    52 scaling a network difficult, and routing in large networks with 
    53 limited state and constrained control traffic further exacerbate this 
    54 scaling challenge.   
     62\item[Scalability]  How large a network does {\lowthane} support, and how well 
     63does it support concurrent flows? 
     64 
     65\end{description} 
     66 
     67% Centralizing the points of control can make 
     68% scaling a network difficult, and routing in large networks with 
     69% limited state and constrained control traffic further exacerbate this 
     70% scaling challenge.   
    5571 
    5672%Figures~\ref{fig:concurrency-motelab} 
     
    7591or on the ceiling.  An additional eight nodes are installed in a remote 
    7692residential environment, resulting in a total deployment size of 57 nodes. 
     93Table \ref{tab:deployments} summarizes the locations used to evaluate {\lowthane}. 
    7794 
    7895For the baseline workload in these experiments, all nodes report data 
     
    8198and in our experiments, we use ICMP ping messages, separated by 2 
    8299seconds for multi-packet flows.  In addition, all experiments are 
    83 given time to bootstrap the network topology formation, except as 
    84 noted. 
     100given time to bootstrap the network topology formation.% , except as 
     101% noted. 
    85102 
    86103\begin{table*}[th] 
     
    100117  Testbeds A and B, and Deployment 1; the other deployments were  
    101118  ``production'' and were not suitable for diagnostics.} 
     119\label{tab:deployments} 
    102120\end{center} 
    103121\end{table*} 
     
    115133design. 
    116134 
    117 \SDHnote{too whiny?  nobody likes a whiner.} 
    118 In general, this space suffers from a plethora of routing protocol 
    119 specifications and a lack of robust, open implementations.  DYMO in TinyOS is 
    120 still under heavy development, while tests with S4 found significant problems 
    121 with its forwarding engine which limit its performance when multiple flows are 
    122 present (a result reflect in their reported measurements) \cite{s4}. 
    123 Futhermore, it does not contain an implementation of the location service 
    124 necessary for practial use.  Part of our contribution is making available a 
    125 well-tested, open implementation of {\lowthane}, which is currently the 
    126 default routing protocol used in the {\tt blip} IPv6 stack. 
     135% \SDHnote{too whiny?  nobody likes a whiner.} 
     136% In general, this space suffers from a plethora of routing protocol 
     137% specifications and a lack of robust, open implementations.  DYMO in TinyOS is 
     138% still under heavy development, while tests with S4 found significant problems 
     139% with its forwarding engine which limit its performance when multiple flows are 
     140% present (a result reflect in their reported measurements) \cite{s4}. 
     141% Futhermore, it does not contain an implementation of the location service 
     142% necessary for practial use.   
    127143 
    128144% Policy routing is an important component our architecture; we feel that moving 
     
    146162\centering 
    147163\subfigure[Real-world deployment of ACme nodes]{ 
    148   \includegraphics[width=0.48\columnwidth]{figs/acme-yield.pdf} 
     164  \includegraphics[width=0.45\columnwidth]{figs/acme-yield.pdf} 
    149165  \label{fig:collection-acme} 
    150166} 
    151167\subfigure[Testbed comparison of CTP and {\lowthane} yields, over a workday] { 
    152   \includegraphics[width=0.48\columnwidth]{figs/CTP_BLIP.pdf} 
    153   \label{fig:collection-acme} 
     168  \includegraphics[width=0.45\columnwidth]{figs/CTP_BLIP.pdf} 
     169  \label{fig:collection-ctp-blip} 
    154170} 
    155171\caption{CDF of collection packet delivery ratio observed over three days in a 
    156   real-world deployment.} 
     172  real-world deployment.  {\lowthane}'s collection performance suffers 
     173  slightly when faced with interference on a week day 
     174  (\ref{fig:collection-acme}), but compares favorably with state-of-the-art 
     175  CTP in testbed experiments.} 
    157176\label{fig:collection} 
    158177\end{figure} 
     
    188207%       \label{fig:stretch5-cdf} 
    189208%       \includegraphics[width=.55\columnwidth]{figs/StretchCDF-5min.pdf}} 
    190 \caption{Average degree of a node in the {\controller}'s global topology view as a function of time, topology report rate, and rate of exploration, and the corresponding stretch.} 
     209      \caption{Average degree of a node in the {\controller}'s global topology 
     210        view as a function of time, topology report rate, and rate of 
     211        exploration, and the corresponding stretch.  Basic connectivity is 
     212        acquired quickly, after which local exploration continues to evaluate 
     213        the topology continuously.} 
    191214\label{fig:top-and-stretch} 
    192215\end{figure} 
     
    210233%In {\lowthane}, the global topology view maintained by the {\controller} enables it to route packets to destinations within the subnet, and install efficient routes in the network for active flows.  However, the optimality of the routes is determined by the accuracy and completeness of its global view, which is a function of the topology reports received from the nodes.  Given the constrained nature of L2Ns, quantifying the tradeoff between optimality and control overhead becomes critical, particularly when the network is first being initialized and the global topology view is empty. 
    211234We begin by examining the average degree of nodes in the global topology view 
    212 at the {\controller}.  The intuition is that a larger average degree indicates 
    213 that an increasing portion of the real topology has been captured, and so that 
    214 links which reduce stretch are more likely to have been discovered.  Two 
     235at the {\controller}.  T% he intuition is that a larger average degree indicates 
     236% that an increasing portion of the real topology has been captured, and so that 
     237% links which reduce stretch are more likely to have been discovered.   
     238Two 
    215239factors determine node degree: the Topology Report Interval, which dictates 
    216240how often nodes propagate information to the {\controller}, and the 
     
    222246The key implication is that {\emph{basic connectivity is established very 
    223247    rapidly by the first few topology reports}}, while converging on a stable 
    224 topology is a lengthy (and potentialy endless) process.%   Focusing on the 
     248topology is a continuous process.%   Focusing on the 
    225249% topology report interval, we see that the 30-second interval achieves a higher 
    226250% average node degree than the 5-minute interval, which was to be expected since 
     
    259283%To calculate a rough approximation, we ran a broadcast test on the testbed.  In turn, each node broadcasts 100 packets, separated by 100ms, and each node that receives the packet logs the event.  Using these logs, we can calculate the packet delivery ratio of a particular link, and calculate its inverse, ETX.  To calculate the average transmission stretch, for each source and destination pair, we calculate the route cost of the optimal path using the the topology report data set, and the broadcast test data set, and divide the two.  We do note that this is only an approximation; the broadcast test is only a snapshot at a given instance in time, and can not account for environmental interference sources that can alter the state of the network at the time that topology reports are collected.  However, it does provide a relative point of reference. 
    260284 
    261 Next, we look at the optimiality of 
     285Next, we look at the optimality of 
    262286routes formed based on the reported topology. 
    263287Figure~\ref{fig:stretch30-cdf} compares the 
     
    321345%% L2Ns.   
    322346To see {\lowthane}'s basic mechanisms in action, we first focus on a trace of 
    323 50 packets from a single flow running on testbed A.  The first measure of 
     34750 packets from a single flow running on testbed A.  The measure of 
    324348interest is ``transmissions per success,'' which counts the total number of link 
    325 layer transmissions necessary to deliver a packet from the source to the destination. 
     349layer transmissions necessary to deliver a packet from a source to a destination. 
    326350%  This flow was taken from an experiement on 
    327351% Testbed A, with 4 other concurrent flows, and with the Flow Table being empty 
     
    332356\centering 
    333357\includegraphics[width=0.7\columnwidth]{figs/BlowByBlow.pdf} 
    334 \caption{Number of transmissions for each packet of a single flow}\label{fig:blowbyblow} 
     358\caption{Number of transmissions for each packet of a single flow.  The first 
     359  packet triggers a Route Install message, which reduces the path length from 
     360  6 to 2.} 
     361\label{fig:blowbyblow} 
    335362\end{figure} 
    336363 
     
    367394to support a relatively large number of concurrent low bandwidth flows.  Concurrency stresses 
    368395many aspects of a routing protocol, especially {\it reliability} and {\it 
    369   scalability}.  While elements of this test stress the 
    370 forwarding engine rather than the routing protocol, routing can 
    371 harm these results by causing unnecessary congestion, especially surrounding 
    372 a central element like the \controller. 
    373  
    374 To evalute the impact of the centralized optimization of installing 
     396  scalability}.%   While elements of this test stress the 
     397% forwarding engine rather than the routing protocol, routing can 
     398% harm these results by causing unnecessary congestion, especially surrounding 
     399% a central element like the \controller. 
     400To evaluate the impact of the centralized optimization of installing 
    375401routes, we compare our protocol to itself with route installation disabled, so 
    376402that all packets are forced to use a default route through the \controller. 
     
    417443engine is ``reliable,'' and that it supports a reasonably large amount of 
    418444traffic.  The lower two graphs in Figure~\ref{fig:concurrent-smote-pdr} contain the total number of centralized control 
    419 transmissions {\it per flow}, and the total DAG maintainence traffic {\it per 
     445transmissions {\it per flow}, and the total DAG maintenance traffic {\it per 
    420446  node per minute}.  These generally show a low level of traffic.  The kink in 
    421447the center graph was caused by the protocol reacting to a link failure. 
     
    438464%the conclusion of the experiment.  
    439465Furthermore, there is {\it significantly greater 
    440   variablity without the centralized optimization}.  We also note that {\it 
     466  variability without the centralized optimization}.  We also note that {\it 
    441467  installing routes reduces 
    442468  stretch}, as Figure \ref{fig:concurrency-smote-trans} illustrates.  Testbed A 
     
    502528  load (left); Single Flow with multiple node failures (Center), Multiple 
    503529  Flows with significant node failures.  ``Transmissions per success'' the counts total 
    504    number of link-layer tranmissions necessary to deliver a packet from the 
     530   number of link-layer transmissions necessary to deliver a packet from the 
    505531   source to destination.  ``Transmissions per link'' is a measure of the 
    506532   quality of links selected.} 
     
    530556\caption{Packet delivery ratio and control traffic statistics for a single 
    531557  flow amidst node failures: each vertical line indicates that four nodes 
    532   failed, until finally only four nodes remain and the network is partitioned.} 
     558  failed, until finally only four nodes remain and the network is 
     559  partitioned.  Cached state is used to repair routes until nodes no longer 
     560  have viable parents.  The DAG protocol then works to find new links and 
     561  route around the failed nodes.} 
    533562\label{fig:sfailure-smote-pdr} 
    534563\end{figure} 
     
    537566ratio remains near 100\% until near the end of the experiment.  Added route 
    538567install traffic is clearly visible after a 
    539 route has been broken.  At the same time, the DAG maintainence traffic works in the background 
    540 to maintain reachablity to the controller.  Nodes use cached 
     568route has been broken.  At the same time, the DAG maintenance traffic works in the background 
     569to maintain reachability to the controller.  Nodes use cached 
    541570default routes until approximately 1500 seconds, when some nodes must begin 
    542571seeking out new default routes and therefore triggering additional DAG control traffic. 
     
    572601\includegraphics[width=0.85\columnwidth]{figs/Failure_Control-6.pdf} 
    573602\caption{Packet delivery ratio and control traffic statistics for multiple 
    574   flows amidst node failures.  Approximatly half the nodes are involved in a 
     603  flows amidst node failures.  Approximately half the nodes are involved in a 
    575604  flow, and the rest progressively fail until only the active nodes remain. 
    576605  Four nodes are removed at each vertical line.}\label{fig:mfailure-smote-pdr} 
     
    782811which is only 5K of code, and 524 Bytes of RAM including all routing state for 
    783812six individual flows as well as 210 Bytes of other static data.  Removing the 
    784 route installation functionalty and providing only triangle routing reduces 
     813route installation functionality and providing only triangle routing reduces 
    785814the code size by approximately 2kB. 
    786815 
  • docs/Lowthane/ipsn10/lowthane.tex

    r2766 r2772  
    5454\end{abstract} 
    5555 
    56 \category{I.5.3}{Computing Methodologies}{Pattern Recognition}{Clustering}[similarity measures] 
    57 \terms{Algorithms, Experimentation} 
    58 \keywords{keyword1, keyword2, keyword3} 
     56\category{C.2.2}{Computer Systems Organization}{Computer- Communication 
     57  Networks}[Network Protocols]  
     58% } {Pattern Recognition}{Clustering}[similarity measures] 
     59\terms{Design, Experimentation} %Algorithms, Experimentation} 
     60\keywords{networking, routing, point-to-point} % keyword1, keyword2, keyword3} 
    5961 
    6062