Re: Problem with OSPF and point-to-point connections

From: Claudio Jeker (cjekerdiehard.n-r-g.com)
Date: Thu Aug 16 2007 - 13:59:50 CDT


On Tue, Aug 14, 2007 at 01:55:36PM -0400, Stefan Schmieta wrote:
> Hi tech,
>
> I think there is a problem with how the shortest path algorithm in ospfd
> handles point-to-point connections. I can distill this problem down to
> two routers connected via two GIF interfaces. One (gif171 below)
> represents an IPsec tunnel via the internet, the other (gif21) a
> connection via a private T1. We would like traffic to go through the T1
> when possible, so the metrics are set to prefer that interface. The
> excerpt of ospfd.conf for the two interfaces on router1 looks like this:
>
> area 0.0.0.5 {
> 	interface gif21 {
> 		auth-md 1 $cl1_password
> 		metric 20
> 	}
> 	interface gif171 {
> 		auth-md 1 $cl1_password
> 		metric 45
> 	}
> 	...more interfaces towards other routers here...
> }
>
> What happens is that if both interfaces are up, ospfd uses the right
> metric (20), but the wrong next-hop and thus all traffic is routed via
> gif171 instead of gif21. After some debugging I think the problem is in
> case 1 of calc_next_hop() in rde_spf.c:
>
> void
> calc_next_hop(struct vertex *dst, struct vertex *parent)
> {
> 	struct lsa_rtr_link *rtr_link = NULL;
> 	int i;
>
> 	/* case 1 */
> 	if (parent == spf_root) {
> 		switch (dst->type) {
> 		case LSA_TYPE_ROUTER:
> 			for (i = 0; i < lsa_num_links(dst); i++) {
> 				rtr_link = get_rtr_link(dst, i);
> 				if (rtr_link->type == LINK_TYPE_POINTTOPOINT &&
> 				    ntohl(rtr_link->id) == parent->ls_id) {
> 					dst->nexthop.s_addr = rtr_link->data;
> 					break;
> 				}
> 			}
>
> It picks the first point-to-point connection that links the two vertices
> in question. In my case, there are two such links with different costs
> and the first one happens to be the more expensive one. I have "fixed"
> my configuration by changing the order so that the cheap one comes first
> in calc_next_hop() but that is obviously only a hack.
>
> Is my ospfd.conf incorrect or is this really a bug in ospfd?
>

This is a bug in ospfd. Having two point-to-point links to the same router
is not expected and ospfd chooses the first match it finds. Now IMO all of
case 1 (parent == spf_root) is just plain wrong. We should actually not
look at the LSDB at all but instead use the interface config we already
have. Now changing that is nasty as we do not have the necessary data to
do this at the right place. Will take a while to create a proper fix.
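
To make the first-match behaviour concrete, here is a minimal, self-contained
sketch. It is not ospfd code and not the fix described above: struct link,
first_match() and cheapest_match() are hypothetical names invented for
illustration. It only contrasts "take the first matching point-to-point link"
with "take the cheapest matching one". Keep in mind that in the quoted excerpt
the links come from the neighbor's router LSA, so the metric seen there is the
neighbor's cost, not the local interface cost, which is one reason picking by
metric at that spot would still be a workaround.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a router-LSA link entry. */
enum link_type { LINK_P2P = 1, LINK_TRANSIT = 2 };

struct link {
	enum link_type	type;
	uint32_t	neighbor_id;	/* router ID at the other end */
	uint32_t	data;		/* address usable as next hop */
	uint16_t	metric;		/* cost advertised for this link */
};

/* Mirrors the quoted loop: stop at the first matching p2p link. */
const struct link *
first_match(const struct link *links, size_t n, uint32_t neighbor_id)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (links[i].type == LINK_P2P &&
		    links[i].neighbor_id == neighbor_id)
			return &links[i];
	return NULL;
}

/* Alternative: scan all matching p2p links and keep the cheapest. */
const struct link *
cheapest_match(const struct link *links, size_t n, uint32_t neighbor_id)
{
	const struct link *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (links[i].type != LINK_P2P ||
		    links[i].neighbor_id != neighbor_id)
			continue;
		if (best == NULL || links[i].metric < best->metric)
			best = &links[i];
	}
	return best;
}
```

With two parallel links to the same neighbor listed as (metric 45, metric 20),
first_match() returns the metric 45 entry while cheapest_match() returns the
metric 20 one, which matches the symptom reported: the result depends on
ordering unless every candidate is examined.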

--
:wq Claudio