NETVIRT-1068: Upstreaming fixes 2
Issue-1: When a VM is configured with an extra route, the refcount in
l3nexthop is incremented. It is then incremented further for the
following reasons:
(a) After the initial extra-route configuration using the command
neutron router-update RouterA destination=IP-A,nexthop=prefix-A, if
another update is done using the command
neutron router-update RouterA destination=IP-A,nexthop=prefix-B, the
neutron router listener calls update on prefix-A as well as prefix-B.
On prefix-A the secondary adj (IP-A) is removed, whereas on prefix-B it
is added. This back-to-back update creates a race condition in the Vrf
Engine, leading to inconsistencies in the l3nexthop, VpnExtraRoute and
VpnInterfaceOp DS.
(b) After the initial extra-route configuration using the command
neutron router-update RouterA destination=IP-A,nexthop=prefix-A, if a
cluster reboot is performed, the TEP-ADD event triggers an update of
the FIB entry for IP-A. The update call in FIB increases the refcount
of l3nexthop for prefix-A.
Once the refcount has reached a high value due to issues (a) and (b),
if the VM is migrated, the group is not programmed on the destination
DPN, leaving the VM unreachable.
Fix: For issue (a), a temporary fix of a 2-second delay is introduced
in neutron. A better fix/design needs to be worked out to avoid the
race condition.
For issue (b), the secondary-adj FIB update after cluster reboot, which
was caused by a wrong check in the updateVpnInterfaceOnTepAdd method,
is now avoided. The refcount is removed from the l3nexthop yang
because, after cluster reboot, multiple Add/Update replays for a given
prefix push the refcount higher than it should be. Instead, the
prefixes using a group are themselves stored in l3nexthop, so that even
if spurious Add/Update events are triggered for the same prefix after a
cluster reboot, the entry is not updated again.
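The difference between counting references and storing the prefixes
can be illustrated with a minimal sketch. All class and method names
below are hypothetical and are not the actual netvirt code or yang
model; the sketch only assumes that a replayed Add/Update arrives with
the same prefix as the original one.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, for illustration only.
final class NexthopSketch {

    // Counter-based tracking: every Add/Update bumps the count,
    // so a replayed event is counted twice.
    static final class RefCounted {
        private int refCount;

        void addOrUpdate(String prefix) {
            refCount++;
        }

        int users() {
            return refCount;
        }
    }

    // Prefix-based tracking: the users of the group are stored as a set,
    // so replaying Add/Update for the same prefix changes nothing.
    static final class PrefixTracked {
        private final Set<String> prefixes = new HashSet<>();

        // Returns true only if the prefix was not already recorded.
        boolean addOrUpdate(String prefix) {
            return prefixes.add(prefix);
        }

        // Returns true when no prefix uses the group any more.
        boolean remove(String prefix) {
            prefixes.remove(prefix);
            return prefixes.isEmpty();
        }

        int users() {
            return prefixes.size();
        }
    }

    public static void main(String[] args) {
        RefCounted counted = new RefCounted();
        PrefixTracked tracked = new PrefixTracked();
        // One real Add followed by two spurious replays after a reboot.
        for (int i = 0; i < 3; i++) {
            counted.addOrUpdate("prefix-A");
            tracked.addOrUpdate("prefix-A");
        }
        System.out.println("refcount users: " + counted.users());
        System.out.println("prefix-set users: " + tracked.users());
    }
}
```

With the counter, a single remove after the replays would leave the
count above zero; with the stored prefixes, one remove of prefix-A
empties the set.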
Change-Id: I4618303345db1241b1826b018424b3df0f8bd9ec
Signed-off-by: HANAMANTAGOUD V Kandagal <hanamantagoud.v.kandagal@ericsson.com>
Signed-off-by: Sam Hague <shague@redhat.com>