From: Tony Przygienda (prz_at_XEBEO.COM)
Date: Thu Nov 07 2002 - 10:17:03 CST
Ash, Gerald R (Jerry), ALASO wrote:
>Dave,
>
>>>Such failures are not the fault of the service provider
>>>operation or the vendor/equipment implementation. They are
>>>due to shortcomings in the link-state protocols themselves --
>>>thus the need for the enhancements proposed in the draft.
>>>
>
>>I strongly disagree with this statement. While the design of the
>>protocols can make it challenging, there is ample room in
>>implementation to provide stable and scalable networks.
>>
>>When a network collapses, the fault lies at the feet of the
>>implementers. In every case I've seen (too many), the collapse was
>>inevitable sooner or later, due to naive design choices in software,
>>but at the same time was quite nonlinear in its onset (making any
>>predictive or self-monitoring approach pretty hopeless.)
>>
>>There are some things that would make the job easier, at the cost
>>of additional complexity, but pointing at network collapses
>>and blaming the protocols is disingenuous.
>>
>
>I think you should review the ample evidence presented in http://www.ietf.org/internet-drafts/draft-ash-manral-ospf-congestion-control-00.txt that the protocols need to be enhanced to better respond to congestion collapse:
>
>- Section 2: documented failures and their root-cause analysis, across multiple service provider networks (also review the cited references)
>- Appendix B: vendor analysis of a realistic failure scenario similar to one experienced as discussed in Section 2 (perhaps you would like to provide your own analysis of this scenario based on your OSPF implementation)
>- Appendix C: simulation analysis of protocol performance (other I-D's being discussed provide analysis of proposed protocol extensions)
>
>To say that network collapse in *every* case is due to *naive design choices* ignores the evidence/analysis presented. Based on the evidence/analysis, there is clearly room for the protocols to be improved to the point where networks *never* go down for hours or days at a time (drawing unwanted headlines & business impact).
>
>Jerry
>
Jerry, most of the things you say in your document (which is actually
pretty good) have been known to people like Dave and other old-time
implementors for years, and avoiding exactly those things by smart
implementation techniques was what differentiated the haves from the
have-nots. I remember learning some of those things myself by hard
experience and some by looking at old hands' code ;-) [Albeit I also
remember picking up a lot of smart control protocol ideas from your
RTNR work]. I do not think that Dave is putting down what you say,
rather (and I commit the stupidity of interpreting his words through my
own beliefs) that what your document describes are mostly
_implementation_ issues, not _standardization_ issues, and therefore it
is not a very wise idea to add them to the charter of a _standards_
group. Good protocol specs are _not_ implementation cookbooks; they are
documents governing bits on the wire in such a way that two people
implementing things in vastly different ways can still talk to each
other. Recommendations of implementation techniques prove inherently
dangerous in the long term (as Joel pointed out, at a certain point
adding more code to an implementation introduces more bugs than the
performance gain is worth) or utterly ridiculous (look at the ISIS 0-63
metric, meant to make SPF real fast; it led to quite bad contortions).
thanks
-- tony