isisd: fix edge condition in max_lsp_count computation
Fix an edge condition in the arithmetic in the max_lsp_count
API.
Signed-off-by: Mark Stapp <mjs@cisco.com>
tests: add topotest for BGP I/O thread CPU spin on full input queue
Add a stress test that replicates the I/O thread spin bug fixed in commit
ed405bf22 ("bgpd: fix I/O thread spinning when peer input queue is
full").
A raw BGP speaker (bgp_sender.py) blasts 10000 UPDATE messages via
non-blocking I/O, each carrying a 15-ASN AS_PATH to increase per-route
processing cost. The total data (~740 KB) exceeds the ibuf_work ring
buffer (~96 KB), creating sustained TCP ...
tests: Slow down test_config.py to allow for processing time to happen
The code has this pattern:
a) Input some cli
b) Look for success
The test is not graceful: under heavy load, a) might
not have finished. Give the test system more time to get to an answer.
Please note, I am actually still seeing an honest-to-goodness bug in mgmtd
that this test is exposing, but the messages about the `cli is locked` and
test failing for not being given enough ti...
bgpd: Fix missing present_tlvs bit for Link ID in link NLRI
When originating/withdrawing a Link NLRI, link_remote_id is filled in
the bgp_ls_nlri structure but BGP_LS_LINK_DESC_LINK_ID_BIT is not set
in link_desc.present_tlvs.
Fix by setting BGP_LS_LINK_DESC_LINK_ID_BIT in both bgp_ls_originate_link()
and bgp_ls_withdraw_link().
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
bgpd: Require valid TED objects in BGP-LS originate/withdraw APIs
BGP-LS node/link/prefix originate and withdraw handlers are expected to
receive valid TED objects.
Add explicit checks at the beginning of each function and return early on
invalid inputs, before any further processing.
This makes the API contract clear, avoids NULL dereferences, and keeps the
originate/withdraw paths consistent.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
bgpd: Include local-node ASN in link withdraw NLRI
A Link NLRI contains two nodes: local node and remote node.
Per RFC 9552, each node is identified by ASN, OSPF Area ID, and
IGP Router ID.
For the remote node, `bgp_ls_withdraw_link` sets ASN, OSPF Area ID, and
IGP Router ID when generating the NLRI.
For the local node, `bgp_ls_withdraw_link` sets only OSPF Area ID and
IGP Router ID when generating the NLRI.
Add ASN for the local node as wel...
bgpd: Fix use-after-free in BGP-LS node origination
`bgp_ls_originate_node()` could free `ls_attr` after `bgp_ls_populate_node_attr()`
failure, then continue and pass the freed pointer to `bgp_ls_update()`.
Fix by returning immediately after `bgp_ls_attr_free(ls_attr)` on
populate failure.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
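The free-then-return shape described above can be sketched in isolation. Every name below (`struct attr_sketch`, `populate_sketch()`, `update_sketch()`, `originate_sketch()`) is a hypothetical stand-in rather than the bgpd code, and the assumption that the update step takes ownership of the attribute is made only for this illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the BGP-LS attribute object. */
struct attr_sketch {
	int value;
};

/* Counts how often the update step runs, so the effect is observable. */
static int update_calls;

/* Hypothetical populate step: fails on request. */
static int populate_sketch(struct attr_sketch *attr, bool fail)
{
	if (fail)
		return -1;
	attr->value = 42;
	return 0;
}

/* Hypothetical update step; in this sketch it owns and releases attr. */
static int update_sketch(struct attr_sketch *attr)
{
	update_calls++;
	free(attr);
	return 0;
}

/* Returns 0 on success, -1 on failure. */
int originate_sketch(bool fail_populate)
{
	struct attr_sketch *attr = calloc(1, sizeof(*attr));

	if (!attr)
		return -1;

	if (populate_sketch(attr, fail_populate) != 0) {
		free(attr);
		/* Return here: attr now dangles and must not reach update. */
		return -1;
	}

	return update_sketch(attr);
}
```

Without the early return, the freed pointer would fall through to the update step, which is the failure mode the commit message describes.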
lib: fix crash in thread_process_io_inner_loop on stale epoll event
When do_event_cancel() processes a pending cancellation at the top of
event_fetch_inner_loop(), it removes the fd from the epoll_event_hash
and calls EPOLL_CTL_DEL. However, epoll_wait() can still deliver events
that were already queued in the kernel's ready list before the
EPOLL_CTL_DEL took effect.
When thread_process_io_inner_loop() processes such a stale event, the
hash lookup returns NULL...
bgpd: Clear `registered_ls_db` before calling `ls_unregister()`
`bgp_ls_unregister()` only cleared `registered_ls_db` after a
successful `ls_unregister()` call. When `ls_unregister()` failed, the
flag was left as true, making `bgp_ls_is_registered()` report "still
registered". Any subsequent call to `bgp_ls_register()` would then
return early thinking registration was already in place, leaving BGP
permanently unable to receive link-state updates from zebr...
bgpd: Unintern old NLRI reference in `bgp_ls_update()`
`bgp_afi_node_get()` returns an existing RIB node when the prefix is
already present. In that case `dest->ls_nlri` already holds an
interned pointer. Unconditionally overwriting it with the new value
dropped the old reference without calling `bgp_ls_nlri_unintern()`,
leaking the previous allocation.
Unintern the existing pointer before installing the new one so the
refcount is kept accurate ...
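A minimal refcount sketch of the pattern described above, assuming a simplified intern model in which every intern call allocates a fresh object. All names (`struct nlri_sketch`, `struct dest_sketch`, `nlri_intern_sketch()`, `nlri_unintern_sketch()`, `dest_install_sketch()`) are hypothetical and do not mirror the bgpd API.

```c
#include <stdlib.h>

/* Hypothetical refcounted object held by a table entry. */
struct nlri_sketch {
	int refcnt;
};

/* Hypothetical table entry holding one reference. */
struct dest_sketch {
	struct nlri_sketch *nlri;
};

/* Number of objects currently allocated; a leak leaves this too high. */
static int live_objects;

/* Allocate an object carrying one reference (NULL on allocation failure). */
struct nlri_sketch *nlri_intern_sketch(void)
{
	struct nlri_sketch *n = calloc(1, sizeof(*n));

	if (n) {
		n->refcnt = 1;
		live_objects++;
	}
	return n;
}

/* Drop one reference, free at zero, and clear the holder's pointer. */
void nlri_unintern_sketch(struct nlri_sketch **p)
{
	if (!*p)
		return;
	if (--(*p)->refcnt == 0) {
		free(*p);
		live_objects--;
	}
	*p = NULL;
}

/* Install a new reference, releasing whatever the entry held before. */
void dest_install_sketch(struct dest_sketch *dest, struct nlri_sketch *n)
{
	if (dest->nlri)
		nlri_unintern_sketch(&dest->nlri);
	dest->nlri = n;
}
```

Dropping the `nlri_unintern_sketch()` call in `dest_install_sketch()` would leave `live_objects` one too high after a second install on the same entry.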
bgpd: Use `bgp_node_lookup()` in `bgp_ls_withdraw()`
`bgp_ls_withdraw()` was calling `bgp_afi_node_get()` to locate the RIB
destination before marking the route as withdrawn. `bgp_afi_node_get()`
creates a new RIB node when none exists, so a withdraw for an NLRI that
was never installed silently created a phantom RIB entry with no path
info attached, wasting memory and polluting the table.
Replace the call with `bgp_node_lookup()`, which is a p...
bgpd: Fix `edge->destination` null deref in `bgp_ls_withdraw_link()`
`bgp_ls_withdraw_link()` dereferenced `edge->destination->node` to
read the remote AS number, but the guard only checked
`edge->attributes` — it did not verify that `edge->destination` or
`edge->destination->node` were non-NULL, leading to a potential null
dereference.
Add the missing null checks on `edge->destination` and
`edge->destination->node`, consistent with the guard used for the
same ...
bgpd: Demote unknown opaque message warning to debug in opaque handler
BGP registers only for `LINK_STATE_SYNC` and `LINK_STATE_UPDATE`
opaque messages, but other daemons may send opaque messages on the
same channel for their own purposes. The `default` case in
`bgp_zebra_opaque_msg_handler()` emitted a `zlog_warn()` for every
such message, which is noisy and gives the operator nothing to
act on.
Demote to a debug log guarded by `BGP_DEBUG(zebra, ZEBRA)`.
...
bgpd: Remove duplicate log calls in `bgp_ls_withdraw_node()`
`bgp_ls_withdraw_node()` called `zlog_err()` immediately followed by
`flog_err()` with identical messages on two error paths. `flog_err()`
already logs to the standard output, making the `zlog_err()` calls
redundant. Remove the duplicates.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
bgpd: Align `bgp_ls_populate_prefix_attr()` with node/link pattern
`bgp_ls_populate_prefix_attr()` used a double-pointer `**attr` and
allocated `ls_attr` internally, unlike `bgp_ls_populate_node_attr()`
and `bgp_ls_populate_link_attr()` which receive a pre-allocated
`*attr` from the caller.
The internal `!encoded` path also had a bug: `attr = NULL` set the
local variable instead of `*attr`, leaving the caller with a dangling
pointer to freed memory.
Refactor...
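The double-pointer pitfall described above, `attr = NULL` versus `*attr = NULL`, can be shown with a small sketch. `struct attr_sketch`, `release_buggy()`, and `release_fixed()` are hypothetical names for illustration and are not taken from bgpd.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the attribute object. */
struct attr_sketch {
	int encoded;
};

/*
 * Buggy shape: assigning to the parameter only changes the local copy,
 * so the caller's pointer still refers to freed memory afterwards.
 */
void release_buggy(struct attr_sketch **attr)
{
	free(*attr);
	attr = NULL; /* no effect on the caller */
	(void)attr;
}

/* Fixed shape: write through the double pointer to clear the caller's copy. */
void release_fixed(struct attr_sketch **attr)
{
	free(*attr);
	*attr = NULL;
}
```

After `release_fixed(&p)` the caller can test `p` against NULL safely; after `release_buggy(&p)` any use of `p` is a use of freed memory.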
bgpd: Fix memory leak in `bgp_ls_originate_node()`
In `bgp_ls_originate_node()`, `ls_attr` is allocated with
`bgp_ls_attr_alloc()` and populated before being passed to
`bgp_ls_update()`. If `bgp_ls_update()` fails, the function
returns -1 without freeing `ls_attr`, causing a memory leak.
Free `ls_attr` before returning on the `bgp_ls_update()` failure
path.
Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
bgpd: Fix null dereference when `bgp->ls_info` is NULL
`bgp_ls_register()` calls `bgp_ls_is_registered()` as an early-return
guard, but that function returns false when `bgp->ls_info` is NULL.
The code then proceeds to dereference `bgp->ls_info->registered_ls_db`
unconditionally, causing a crash.
Add explicit null guards at the top of `bgp_ls_register()`,
`bgp_ls_unregister()`, and `bgp_ls_cleanup()` to return early when
`bgp->ls_info` is not init...
pimd: Reject pim packets with a malformed header length
Ensure that the header length passed in is correct.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pimd: Fix out of bounds read in AutoRP code
pim_autorp.c has an out-of-bounds read in the announcement/discovery
parsing when unsupported RP entries are skipped. Fix this.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pimd: igmpv3 never checks packet length and trusts the num-sources field
Modify the code to ensure that the packet length is large enough to allow
us to continue reading the packet instead of just trusting the number
of sources field.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
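As an illustration of checking a count field against the bytes actually received, here is a minimal sketch. It assumes the IGMPv3 membership query layout from RFC 3376 (12-byte fixed part, 16-bit Number of Sources at offset 10, 4 bytes per source); the function name `sources_fit_sketch()` and the constants are hypothetical, and the code path touched by the commit may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FIXED_LEN 12u /* fixed part of an IGMPv3 query (RFC 3376) */
#define SKETCH_SRC_LEN 4u    /* one IPv4 source address */

/*
 * Return true only when the buffer really holds as many source
 * addresses as the Number of Sources field claims.
 */
bool sources_fit_sketch(const uint8_t *buf, size_t len)
{
	uint16_t num_sources;

	/* The count field itself must be inside the buffer. */
	if (len < SKETCH_FIXED_LEN)
		return false;

	/* Number of Sources: 16-bit big-endian value at offset 10. */
	num_sources = (uint16_t)((buf[10] << 8) | buf[11]);

	/* Compare against the room left after the fixed part. */
	return (len - SKETCH_FIXED_LEN) / SKETCH_SRC_LEN >= num_sources;
}
```

Dividing the remaining length instead of multiplying the count keeps the comparison free of overflow concerns.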
pimd: Do not allow a register-stop message if not received from the RP
Currently anyone can send a register-stop message to pim and cause it
to stop forwarding state. Ensure that the received register-stop message
actually comes from the RP.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pimd: Prevent received msg length from being larger than buffer
Currently, when pim receives a Graft-Ack echo-back packet, it does
not check that the received size is one we are capable of sending back.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pimd: Remove unnecessary asserts
Everywhere these asserts appear, we have already checked
that the values are usable at that point in time. There
is no need for the asserts.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pimd: When receiving a register stop ensure we have enough data to read
In pim_register_stop_recv:
When pim_parse_addr_group fails it returns -1. In that case we need
to stop processing instead of continuing
and moving pointers around with the -1 value.
Also when calling pim_parse_addr_ucast the same problem exists.
Modify the code to bail out when there is a problem instead
of continuing.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
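The bail-out pattern described above can be sketched with a toy parser. `parse_field_sketch()` and its length-prefixed field format are hypothetical and unrelated to the real `pim_parse_addr_group` and `pim_parse_addr_ucast` encodings; only the handling of the -1 return value is the point.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical parser: a field is a 1-byte length followed by that
 * many payload bytes. Returns bytes consumed, or -1 on a short buffer.
 */
static int parse_field_sketch(const uint8_t *buf, size_t len)
{
	if (len < 1 || (size_t)buf[0] + 1 > len)
		return -1;
	return buf[0] + 1;
}

/* Parse two consecutive fields; returns total bytes consumed, or -1. */
int parse_two_fields_sketch(const uint8_t *buf, size_t len)
{
	int used;
	int total = 0;

	used = parse_field_sketch(buf, len);
	if (used < 0)
		return -1; /* never advance buf by a negative count */
	buf += used;
	len -= (size_t)used;
	total += used;

	used = parse_field_sketch(buf, len);
	if (used < 0)
		return -1;
	total += used;

	return total;
}
```

Checking the result before touching `buf` and `len` is what keeps a failed parse from moving the read pointer backwards or inflating the remaining length.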
pimd: Ensure a register packet has enough space to read S,G data
The S,G values being read from the header were not being checked
to ensure that the message received has enough data to actually
read those values. Make sure it does.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
pimd: Ensure that header has space on packet
The pim_staterefresh_recv function is reading values beyond
what has been ensured to exist. Modify the code to keep
track of the curr_size and ensure we still have room to
read the end of the packet.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgpd: Fix destination linkage for delayed reverse-edge updates
An edge A->B can arrive before reverse edge B->A is present in TED.
When that happens, A->B may not have a destination yet and link
origination for that direction is deferred.
Today, when the reverse edge arrives, the direct edge is not updated
in TED and can remain without a destination.
As a result, when bgp_ls_process_edge tries to generate BGP-LS
NLRI, generation fails because the edge st...
ldpd: improve tlv validation in several places
Improve validation in address-list tlv, PWID subtlv, and
notification tlv processing.
Signed-off-by: Mark Stapp <mjs@cisco.com>
bgpd: close dynamic peer socket in ttl error path
Ensure we close the fd for a new connection if the ttl set
function fails.
Signed-off-by: Mark Stapp <mjs@cisco.com>
bgpd: fix logic handling EVPN_FLAG_DEFAULT_GW
Re-order the clearing and setting of the DEFAULT_GW flag so
the end state is correct. Also correct some stale (?) comments.
Signed-off-by: Mark Stapp <mjs@cisco.com>
tests: Ensure upstream IIF is in correct state after interface events
Add a bit of code to the test_multicast_pim_uplink_topo1.py script
to ensure that the upstream interfaces are in correct state before
proceeding with the remainder of the test in places where the
interface state has been changed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
tests: add topotest for BGP input-queue-limit behavior
Add a topotest that verifies BGP correctly handles the
input-queue-limit setting. When r2 has `bgp input-queue-limit 100`
and r1 sends 1000 routes, the test confirms:
1. The eBGP session establishes successfully
2. All 1000 routes converge on r2 despite the low queue limit,
proving reads are properly re-armed after the queue drains
3. The input queue depth settles back to 0 after convergenc...
nhrpd: Correct addrlen check in os_recvmsg()
Previously the check compared addrlen to the stack address of
lladdr.sll_addr cast to size_t, which is virtually always true.
The corrected check should also remain always true, as sll_addr is an
unsigned char array of size 8 and addr is an array of size 64, but this
fixes the check and ensures enough space in addr for the memcpy.
Signed-off-by: Corey Siltala <csiltala@atcorp.com>
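A minimal sketch of a size check before copying a link-layer address, assuming a simplified structure. `struct lladdr_sketch` and `copy_lladdr_sketch()` are hypothetical and do not reproduce the nhrpd `os_recvmsg()` code; the comment notes what the mistaken comparison would look like.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the link-layer address fields. */
struct lladdr_sketch {
	uint8_t halen;   /* number of valid bytes in addr */
	uint8_t addr[8]; /* hardware address storage */
};

/*
 * Copy the hardware address into out when it fits.
 * On entry *outlen is the capacity of out; on success it is set to
 * the number of bytes copied.
 */
bool copy_lladdr_sketch(const struct lladdr_sketch *ll, uint8_t *out,
			size_t *outlen)
{
	/*
	 * A mistaken variant would compare *outlen with (size_t)ll->addr,
	 * that is, with the array's address rather than a length.
	 */
	if (ll->halen > sizeof(ll->addr) || *outlen < ll->halen)
		return false;

	memcpy(out, ll->addr, ll->halen);
	*outlen = ll->halen;
	return true;
}
```

Comparing lengths with lengths makes the guard meaningful even when, as the commit message notes, the buffers involved happen to be large enough in practice.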
bgpd: avoid early return in MPLSVPN NLRI processing
Avoid an early error return that could leak allocated memory.
Signed-off-by: Mark Stapp <mjs@cisco.com>
bgpd: remove unneeded asserts in packet reads
Remove a couple of unneeded asserts in the bgp packet-reading
code. Remove one, replace another with an error return.
Signed-off-by: Mark Stapp <mjs@cisco.com>