Changeset 2785
- Timestamp:
- 10/30/09 15:33:38 (4 weeks ago)
- Location:
- docs/Lowthane/ipsn10
- Files:
-
- 3 modified
-
design.tex (modified) (8 diffs)
-
discussion.tex (modified) (2 diffs)
-
evaluation.tex (modified) (12 diffs)
Legend:
- Unmodified
- Added
- Removed
-
docs/Lowthane/ipsn10/design.tex
r2784 r2785 96 96 %% packets is volatile and potentially inaccurate. Instead, w 97 97 98 When presented with a new link, {\lowthane} initially use a {\emph{spot}} link quality metric, {\it e.g.} a hardware LQI estimate, to evaluate the quality of the link. For each link, {\lowthane} maintains a Confidence value, currently the number of transmissions that have used that link. Once this value crosses a certain threshold, the link is considered {\emph{mature}}, and is then evaluated using the overall link cost metric, ETX or expected transmissions.We use link-layer acknowledgement frames on all unicast traffic, and base our link quality estimate on the ack reception rate: consequently, our link estimator strongly prefers bidirectional links. To account for the temporal variability of links, {\lowthane} maintains both long-term and short-term link estimates. Both estimators use the same metric (ETX) but with different time horizons so as to manage the agility/stability tradeoff.98 When presented with a new link, {\lowthane} initially use a {\emph{spot}} link quality metric, {\it e.g.} a hardware LQI estimate, to evaluate the quality of the link. For each link, {\lowthane} maintains a Confidence value, currently the number of transmissions that have used that link. Once this value crosses a certain threshold, the link is considered {\emph{mature}}, and is then evaluated using the overall link cost metric, ETX or expected transmissions.We use link-layer acknowledgement frames on all unicast traffic, and base our link quality estimate on the ack reception rate: consequently, our link estimator strongly prefers bidirectional links. To account for the temporal variability of links, {\lowthane} maintains both long-term and short-term link estimates. % Both estimators use the same metric (ETX) but with different time horizons to manage the agility/stability tradeoff. 99 99 %When presented with 100 100 %a new link, {\lowthane} initially uses a {\emph{spot}} link quality metric, … … 127 127 {\emph{Willingness}} value, which indicates the degree to which the 128 128 advertising node is willing to forward traffic for other nodes. Such a metric 129 accounts for a heterogeneous network, in whichbattery-powered and mobile130 nodes would rathershift forwarding responsibilities to mains-powered or stationary nodes.129 accounts for a heterogeneous network, and allow battery-powered and mobile 130 nodes to shift forwarding responsibilities to mains-powered or stationary nodes. 131 131 132 132 \subsubsection{Default Route Formation}\label{sssec:default-route} … … 173 173 174 174 Topology reports are sent to the {\controller} periodically using the default 175 route; they are opportunistically piggybacked on data traffic whenever an176 application generates sufficiently frequent traffic. The unpacked datagram in175 route; they are opportunistically piggybacked on data traffic whenever 176 application traffic is sufficiently frequent. The unpacked datagram in 177 177 Figure \ref{fig:arch} shows how topology information is added to data traffic 178 178 using an optional extension header; using this mechanism, the overhead is only … … 201 201 allows state to be installed in the network to optimize active flows. A 202 202 {\controller} uses Route Install messages to update a node's Flow Table. When a node 203 receives a Route Install, it inserts or updates an entry in the Flow Table. Each Route203 receives a Route Install, it creates or updates an entry in the Flow Table. Each Route 204 204 Install has two parts: 205 205 206 \begin{itemize} 206 \begin{itemize}\addtolength{\itemsep}{-0.5\baselineskip} 207 207 \item{{\bf{Flow Match:}} The criteria used to determine whether a 208 208 given packet matches a flow table entry. By default {\lowthane} uses … … 306 306 data, combined with their policy configuration. Policy rules must currently 307 307 be hard-coded into the route computation engine which runs Dijkstra's 308 shortest-path over the link state. We envision in the future adding a policy309 specification language similar to the packet classification languages used by310 {\tt packetfilter} or {\tt iptables}.308 shortest-path over the link state. % We envision in the future adding a policy 309 % specification language similar to the packet classification languages used by 310 % {\tt packetfilter} or {\tt iptables}. 311 311 312 312 \subsection{Multiple {\Controller}s} … … 363 363 their contents have been discussed in previous sections. First, The Default 364 364 Route Table is used to maintain a stable back-channel to a {\controller}; its 365 consistency is maintained by using a standard tree-formation protocol.% When365 consistency is maintained by using the default route formation algorithm.% When 366 366 % inconsistencies are detected, broadcasts are started to resolve them. 367 367 … … 371 371 by listening to updates for same amount of time as the slowest topology 372 372 update. Due to link and node failures, the LSDB may contain false information. 373 If allowed to persist, this can be extremely damaging since nodes may be 374 instructed to continuously attempt to send messages across a non-existent 375 link, wasting energy and increasing latency. If a link fails on the default 373 % If allowed to persist, this can be extremely damaging since nodes may be 374 % instructed to continuously attempt to send messages across a non-existent 375 % link, wasting energy and increasing latency. 376 If a link fails on the default 376 377 route, local repair mechanisms reroute the packet around the failure, and 377 378 update the Default Route Table. If a link fails otherwise, the packet must … … 379 380 route. In this case, the packet will be re-routed to a {\controller} using the 380 381 default route DAG. The {\controller} notes the broken link in the source 381 header, and after a small number of failures, itdeletes the link from the382 LSDB so it will route around the failure. The link may reappear in383 subsequent topology reports, if the failure was temporary.382 header, and after a small number of failures, deletes the link from the 383 LSDB so it will route around the failure.% The link may reappear in 384 % subsequent topology reports, if the failure was temporary. 384 385 385 386 The final table is the Route Install Table: it contains routes installed 386 387 based on the Link State Database of a {\controller} at some point in time. It 387 is also not proactively maintained: when a link which is part of388 an installed route fails,the packet is forwarded to a \controller. In388 is also not proactively maintained: when an installed route fails, 389 the packet is forwarded to a \controller. In 389 390 addition to removing the link from the LSDB after a small number of failures, 390 391 the {\controller} will generate a ``Route Uninstall'' message so that the node 391 router does not continue to use the same erroneousroute.392 router does not continue to use the broken route. 392 393 393 394 -
docs/Lowthane/ipsn10/discussion.tex
r2716 r2785 36 36 corresponding to the multicast group. It would then unicast the message to a 37 37 small number of nodes within each connected component (for reliability); these 38 nodes would then run a dissemination algorithm like Trickle , where only group39 members participate.This is essentially an implementation of of Levis's38 nodes would then run a dissemination algorithm like Trickle. 39 This is essentially an implementation of of Levis's 40 40 Firecracker protocol \cite{firecracker}. 41 41 … … 66 66 \subsection{Limitations}\label{sec:limitations} 67 67 68 \lowthane~has some important limitations which impact its suitability for some69 applications.68 % \lowthane~has some important limitations which impact its suitability for some 69 % applications. 70 70 71 71 \subsubsection{Stretch} -
docs/Lowthane/ipsn10/evaluation.tex
r2784 r2785 13 13 months in a real deployment. On two experimental testbeds, we 14 14 evaluate {\lowthane}'s scalability, performance, and resilience across a 15 range of workloads and failure conditions.15 range of workloads and failures. 16 16 17 17 \subsection{Questions} … … 366 366 %% L2Ns. 367 367 368 \subsubsection{Basic Operation}369 370 To see {\lowthane}'s basic mechanisms in action, we first focus on a trace of371 50 packets from a single flow running on testbed A. The measure of372 interest is ``transmissions per success,'' which counts the total number of link373 layer transmissions necessary to deliver a packet from a source to a destination.374 % This flow was taken from an experiement on375 % Testbed A, with 4 other concurrent flows, and with the Flow Table being empty376 % at the beginning of the experiment. %% This allows us to verify the basic377 % % functionality of {\lowthane}'s centralized route install.378 379 \begin{figure} [htb]380 \centering381 \includegraphics[width=0.7\columnwidth]{figs/BlowByBlow.pdf}382 \caption{Number of transmissions for each packet of a single flow. The first383 packet triggers a Route Install message, which reduces the path length from384 6 to 2.}385 \label{fig:blowbyblow}386 \end{figure}387 388 Figure~\ref{fig:blowbyblow} shows {\lowthane}'s installation mechanism389 in action. The first packet is forced to traverse the Default Route390 through a \controller, a four-hop path (plus two retransmissions, for391 a total of six transmissions). However, this first packet prompts the392 installation of a shorter and more direct two-hop path which is used393 for all subsequent packets, reducing the stretch by a factor of two.394 We also see the effect of link failures: delivering the first packet395 to the {\controller} required two retransmissions for a total of six396 transmissions, and subsequent link failures occasionally require a397 retransmission.368 % \subsubsection{Basic Operation} 369 370 % To see {\lowthane}'s basic mechanisms in action, we first focus on a trace of 371 % 50 packets from a single flow running on testbed A. The measure of 372 % interest is ``transmissions per success,'' which counts the total number of link 373 % layer transmissions necessary to deliver a packet from a source to a destination. 374 % % This flow was taken from an experiement on 375 % % Testbed A, with 4 other concurrent flows, and with the Flow Table being empty 376 % % at the beginning of the experiment. %% This allows us to verify the basic 377 % %% functionality of {\lowthane}'s centralized route install. 378 379 % \begin{figure} [htb] 380 % \centering 381 % \includegraphics[width=0.65\columnwidth]{figs/BlowByBlow.pdf} 382 % \caption{Number of transmissions for each packet of a single flow. The first 383 % packet triggers a Route Install message, which reduces the path length from 384 % 6 to 2.} 385 % \label{fig:blowbyblow} 386 % \end{figure} 387 388 % Figure~\ref{fig:blowbyblow} shows {\lowthane}'s installation mechanism 389 % in action. The first packet is forced to traverse the Default Route 390 % through a \controller, a four-hop path (plus two retransmissions, for 391 % a total of six transmissions). However, this first packet prompts the 392 % installation of a shorter and more direct two-hop path which is used 393 % for all subsequent packets, reducing the stretch by a factor of two. 394 % We also see the effect of link failures: delivering the first packet 395 % to the {\controller} required two retransmissions for a total of six 396 % transmissions, and subsequent link failures occasionally require a 397 % retransmission. 398 398 399 399 %% shows the number of transmissions needed to … … 418 418 \subsubsection{Concurrent Flows} 419 419 420 Expanding our test from a single flow, we test {\lowthane}'s ability420 First, we test {\lowthane}'s ability 421 421 to support a relatively large number of concurrent low bandwidth flows. Concurrency stresses 422 422 many aspects of a routing protocol, especially {\it reliability} and {\it … … 440 440 \subfigure[\lowthane, with (centralized) and without (triangle) the 441 441 route-installation optimization] { 442 \includegraphics[width=0. 7\columnwidth]{figs/Control-double.pdf}442 \includegraphics[width=0.65\columnwidth]{figs/Control-double.pdf} 443 443 \label{fig:concurrent-smote-pdr} 444 444 } 445 445 \subfigure[AODV performance] { 446 \includegraphics[width=0. 7\columnwidth]{figs/TinyAODV.pdf}446 \includegraphics[width=0.65\columnwidth]{figs/TinyAODV.pdf} 447 447 \label{fig:concurrent-aodv} 448 448 } … … 494 494 variability without the centralized optimization}. We also note that {\it 495 495 installing routes reduces 496 stretch}, as Figure \ref{fig:concurrency-smote-trans} illustrates. Testbed A497 has a diameter of only 3 or 4, and so the consistent reduction in stretch by498 one hop is quite significant in relation to the network diameter.496 stretch}, as Figure \ref{fig:concurrency-smote-trans} illustrates. % Testbed A 497 % has a diameter of only 3 or 4, and so the consistent reduction in stretch by 498 % one hop is quite significant in relation to the network diameter. 499 499 500 500 \subsubsection{AODV Comparison} … … 549 549 \begin{figure*}[htbp] 550 550 \centering 551 \setlength\SUBSIZE{0. 55\columnwidth}551 \setlength\SUBSIZE{0.45\columnwidth} 552 552 % good noise 553 553 \subfigure[Concurrent load]{ … … 584 584 585 585 \subsubsection{Failure Reliability} 586 Centralized systems are often criticized for being too slow to detect and 587 react to remote events. It is important that the optimization we introduced 586 % Centralized systems are often criticized for being too slow to detect and 587 % react to remote events. 588 It is important that the optimization we introduced 588 589 does not break in the case of node and link failures. Therefore, we next 589 590 turn our attention to {\lowthane}'s robustness to … … 602 603 \begin{figure} [h] 603 604 \centering 604 \includegraphics[width=0. 85\columnwidth]{figs/Single_Failure_Control-6.pdf}605 \includegraphics[width=0.65\columnwidth]{figs/Single_Failure_Control-6.pdf} 605 606 \caption{Packet delivery ratio and control traffic statistics for a single 606 607 flow amidst node failures: each vertical line indicates that four nodes … … 649 650 \begin{figure} [h] 650 651 \centering 651 \includegraphics[width=0. 85\columnwidth]{figs/Failure_Control-6.pdf}652 \includegraphics[width=0.65\columnwidth]{figs/Failure_Control-6.pdf} 652 653 \caption{Packet delivery ratio and control traffic statistics for multiple 653 654 flows amidst node failures. Approximately half the nodes are involved in a … … 664 665 \ref{fig:mfailure-smote-pdr} shows the same view of the data: at a high 665 666 level, it appears that {\lowthane} still responds well to the network 666 failures. We see an interesting spike in control traffic around 1000 seconds 667 into the experiment: at this point we have removed approximately $1/3$ of the 668 network, and the DAG forms a brief loop, which persists for several minutes 669 before the DAG re-forms and success rates return to near 670 100\%. We can also note that the transmission stretch necessary to deliver a 667 failures. 668 At around 1000 seconds into the experiment and we have removed $1/3$ of the 669 network, traffic spikes as a loop is formed and then resolved by the DAG 670 maintenance protocol. 671 % We see an interesting spike in control traffic around 1000 seconds 672 % into the experiment: at this point we have removed approximately $1/3$ of the 673 % network, and the DAG forms a brief loop, which persists for several minutes 674 % before the DAG re-forms and success rates return to near 675 % 100\%. 676 The transmission stretch necessary to deliver a 671 677 packet oscillates throughout the experiment (Figure 672 678 \ref{fig:mfailure-smote-trans}) as packets must be re-routed across longer paths. … … 674 680 is in the process of recovering when the final nodes are killed, partitioning 675 681 the network and making recovery impossible. 676 677 682 Furthermore, Figures \ref{fig:sfailure-smote-pdr} and 678 683 \ref{fig:mfailure-smote-pdr} demonstrate that the DAG control traffic remains … … 686 691 in the face of substantial failure. One design feature that we did 687 692 not originally anticipate but which proved critical was {\it 688 explicitly uninstalling Flow Table state}. While many protocols use 689 timeouts to expire soft state, there is no ``natural period'' at which 690 one can say this table state is stale; instead, it makes sense to 691 allow it to persist until it is needed. If we discover it is 692 inaccurate, the packet will still generally be delivered using the 693 explicitly uninstalling Flow Table state}. % While many protocols use 694 % timeouts to expire soft state, there is no ``natural period'' at which 695 % one can say this table state is stale; instead, it makes sense to 696 % allow it to persist until it is needed. 697 If we discover (through a re-routed packet) that an installed path is broken, 698 the packet will still generally be delivered using the 693 699 default route, but failing to uninstall or replace the route leads to high 694 700 transmission counts as the source repeatedly tries to use a broken
