
resource manager trait and impl #4409

Draft
elnosh wants to merge 8 commits into lightningdevkit:main from elnosh:resource-mgr

Conversation

@elnosh
Contributor

@elnosh elnosh commented Feb 10, 2026

Part of #4384

This PR introduces a ResourceManager trait and DefaultResourceManager implementation of that trait which is based on the proposed mitigation in lightning/bolts#1280.

It only covers the standalone implementation of the mitigation. I have done some testing with integrating it into the ChannelManager but that can be done separately. As mentioned in the issue, the resource manager trait defines these 4 methods to be called from the channel manager:

  • add_channel
  • remove_channel
  • add_htlc
  • resolve_htlc

Integrating into the ChannelManager

  • The ResourceManager is intended to be internal to the ChannelManager rather than users instantiating their own and passing it to a ChannelManager constructor.

  • add/remove_channel should be called when channels are opened/closed.

  • add_htlc: When processing HTLCs, the channel manager would call add_htlc, which returns a ForwardingOutcome telling it whether to forward or fail the HTLC, along with the accountable signal to use if it should be forwarded. For the initial "read-only" mode, the channel manager would log the results but not actually fail the HTLC when told to do so. To be more specific about where it would be called: I think it will be when processing the forward_htlcs, before we queue the add_htlc to the outgoing channel:

    if let Err((reason, msg)) = optimal_channel.queue_add_htlc(

  • resolve_htlc: Reports the resolution of an HTLC back to the ResourceManager. It will be used to release bucket resources and update reputation/revenue values internally.
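As a rough sketch of the shape this could take: the trait name and the four methods match the PR description above, but the ForwardingOutcome variants, the HtlcResolution type, and all signatures here are assumptions for illustration, not the PR's actual API.

```rust
// Hypothetical sketch of the trait described above. Names and signatures
// are assumptions based on the PR description, not the PR's actual code.
pub enum ForwardingOutcome {
    /// Forward the HTLC, using the given accountable signal on the outgoing link.
    Forward { accountable: bool },
    /// Fail the HTLC back (in "read-only" mode this would only be logged).
    Fail,
}

pub enum HtlcResolution {
    Settled,
    Failed,
}

pub trait ResourceManager {
    fn add_channel(&mut self, scid: u64);
    fn remove_channel(&mut self, scid: u64);
    /// Called before queueing the HTLC on the outgoing channel.
    fn add_htlc(
        &mut self, incoming_scid: u64, outgoing_scid: u64, amount_msat: u64, fee_msat: u64,
    ) -> ForwardingOutcome;
    /// Reports the final resolution so bucket resources can be released
    /// and reputation/revenue values updated.
    fn resolve_htlc(&mut self, incoming_scid: u64, outgoing_scid: u64, res: HtlcResolution);
}

// Minimal no-op implementation to show the call flow.
struct NoopManager;
impl ResourceManager for NoopManager {
    fn add_channel(&mut self, _scid: u64) {}
    fn remove_channel(&mut self, _scid: u64) {}
    fn add_htlc(&mut self, _i: u64, _o: u64, _amt: u64, _fee: u64) -> ForwardingOutcome {
        ForwardingOutcome::Forward { accountable: false }
    }
    fn resolve_htlc(&mut self, _i: u64, _o: u64, _res: HtlcResolution) {}
}

fn main() {
    let mut mgr = NoopManager;
    mgr.add_channel(42);
    let outcome = mgr.add_htlc(42, 43, 100_000, 1_000);
    assert!(matches!(outcome, ForwardingOutcome::Forward { accountable: false }));
    mgr.resolve_htlc(42, 43, HtlcResolution::Settled);
    mgr.remove_channel(42);
}
```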

This could use more tests, but I'm opening it early to get thoughts on the design.

cc @carlaKC

Implements a decaying average over a rolling window. It will be used in upcoming commits by the resource manager to track reputation and revenue of channels.

The RevenueAverage implemented here will be used in upcoming commits to track the incoming revenue that channels have generated through HTLC forwards.
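As a rough sketch of the decaying average described above, assuming field names and an API invented here for illustration (the actual struct in the PR may differ):

```rust
// Sketch of a decaying average that approximates a rolling window.
// Field names and API are assumptions, not the PR's code.
struct DecayingAverage {
    value: f64,
    last_updated_secs: u64, // unix seconds
    decay_rate: f64,
}

impl DecayingAverage {
    fn new(start_secs: u64, window_secs: f64) -> Self {
        DecayingAverage {
            value: 0.0,
            last_updated_secs: start_secs,
            // Chosen so the value halves every window/2 seconds.
            decay_rate: 0.5_f64.powf(2.0 / window_secs),
        }
    }

    /// Decays the stored value for the elapsed time and returns it.
    fn value_at(&mut self, now_secs: u64) -> f64 {
        let elapsed = now_secs.saturating_sub(self.last_updated_secs);
        self.value *= self.decay_rate.powf(elapsed as f64);
        self.last_updated_secs = now_secs;
        self.value
    }

    /// Decays to `now_secs`, then adds `amount` to the average.
    fn add_value(&mut self, amount: f64, now_secs: u64) {
        self.value_at(now_secs);
        self.value += amount;
    }
}

fn main() {
    let window = 1_000.0; // seconds
    let mut avg = DecayingAverage::new(0, window);
    avg.add_value(1_000.0, 0);
    // After a full window, the value has halved twice: 1000 -> 250.
    let v = avg.value_at(1_000);
    assert!((v - 250.0).abs() < 1e-6);
}
```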
Resources available in the channel will be divided into general, congestion and protected resources. Here we implement the general bucket with basic denial of service protections.

Resources available in the channel will be divided into general, congestion and protected resources. Here we implement the bucket resources that will be used for congestion and protected.

The Channel struct introduced here has the core information that will be used by the resource manager to make forwarding decisions on HTLCs:

- Reputation that this channel has accrued as an outgoing link in HTLC forwards.

- Revenue (forwarding fees) that the channel has earned us as an incoming link.

- Pending HTLCs this channel is currently holding as an outgoing link.

- Bucket resources that are currently in use in general, congestion and protected.

Trait that will be used by the `ChannelManager` to mitigate slow jamming. Its core responsibility will be to track resource usage to evaluate HTLC forwarding decisions.

Introduces the DefaultResourceManager struct. The core methods that will be used to inform HTLC forwarding decisions are add/resolve_htlc.

- add_htlc: Based on resource availability and reputation, it evaluates whether to forward or fail the HTLC.

- resolve_htlc: Releases the bucket resources used by an HTLC previously added and updates the channel's reputation based on HTLC fees and resolution times.

Adds write and read implementations to persist the DefaultResourceManager.
@ldk-reviews-bot

ldk-reviews-bot commented Feb 10, 2026

👋 Thanks for assigning @carlaKC as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@codecov

codecov bot commented Feb 11, 2026

Codecov Report

❌ Patch coverage is 95.13837% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.16%. Comparing base (94d1e5e) to head (53a93ed).
⚠️ Report is 52 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/resource_manager.rs 95.13% 33 Missing and 32 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4409      +/-   ##
==========================================
+ Coverage   86.03%   86.16%   +0.13%     
==========================================
  Files         156      157       +1     
  Lines      103091   104721    +1630     
  Branches   103091   104721    +1630     
==========================================
+ Hits        88690    90235    +1545     
- Misses      11891    11940      +49     
- Partials     2510     2546      +36     
Flag Coverage Δ
tests 86.16% <95.13%> (+0.13%) ⬆️

@carlaKC carlaKC self-requested a review February 11, 2026 07:04
Contributor

@carlaKC carlaKC left a comment


Really great job on this! I've done an overly-specific first review round for something that's still in draft, because I've looked at previous versions of this code before when we wrote simulations. I also haven't looked at the tests in detail yet, but coverage is looking ✨ great ✨.

I think tracking slot usage in GeneralBucket with a single source of truth is worth a look; it seems like it could clean up a few places where we need two hashmap lookups one after the other.

In the interest of one day fuzzing this, I think it could also use some validation that enforces our protocol assumptions (eg, number of slots <= 483).
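For instance, such a check could look like the following hypothetical helper (validate_slots is invented here; the 483 bound is the protocol's max_accepted_htlcs limit from BOLT 2):

```rust
/// BOLT 2 caps max_accepted_htlcs on a channel at 483.
const MAX_ACCEPTED_HTLCS: u16 = 483;

/// Hypothetical validation for a fuzz-friendly constructor: reject slot
/// counts that violate our protocol assumptions.
fn validate_slots(slots: u16) -> Result<(), String> {
    if slots > MAX_ACCEPTED_HTLCS {
        return Err(format!("slots {} exceeds protocol max {}", slots, MAX_ACCEPTED_HTLCS));
    }
    Ok(())
}

fn main() {
    assert!(validate_slots(483).is_ok());
    assert!(validate_slots(484).is_err());
}
```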


struct DecayingAverage {
value: i64,
last_updated: u64,
Contributor


nit: here and a few places - let's add a suffix that indicates what unit of time this is (unix seconds/ns?) since we can't use strong types here

DecayingAverage {
value: 0,
last_updated: start_timestamp,
decay_rate: 0.5_f64.powf(2.0 / window.as_secs_f64()),
Contributor


Let's add a comment about this value either here or on the struct itself. We're using a decaying average to approximate a rolling window, and want a constant decay rate that's related to the window of time we're tracking.

This rate was chosen so that the "half life" of the decayed value is half of the window provided (so if we provide a window of 2 weeks, the value decays to half of its value in a week)
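A quick standalone check of that property (not code from the PR): with decay_rate = 0.5^(2 / window_secs), decaying for half the window multiplies the value by rate^(window/2) = 0.5^((2/W)·(W/2)) = 0.5 exactly.

```rust
fn main() {
    let window_secs = 14.0 * 24.0 * 3600.0; // e.g. a two-week window
    let decay_rate = 0.5_f64.powf(2.0 / window_secs);
    // Half the window elapses: the value decays to exactly half.
    let half_window = decay_rate.powf(window_secs / 2.0);
    assert!((half_window - 0.5).abs() < 1e-9);
    // The full window elapses: the value decays to a quarter.
    let full_window = decay_rate.powf(window_secs);
    assert!((full_window - 0.25).abs() < 1e-9);
}
```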

@@ -0,0 +1,101 @@
use std::time::Duration;
Contributor


nit: needs licensing header for new file

@@ -0,0 +1,101 @@
use std::time::Duration;
Contributor


use core::time::Duration


// Check decay after full window
let ts_3 = ts_2 + WINDOW.as_secs();
assert_eq!(avg.value_at_timestamp(ts_3).unwrap(), 250);
Contributor


Let's add a case where we go into a negative value and decay it? Doesn't add coverage but negative numbers are scary, nice to have a test demonstrating that everything works the same.
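For illustration, a negative value decays toward zero the same way a positive one does, since the decay is a plain multiplication (standalone sketch, not the PR's test):

```rust
fn main() {
    // 100-second window; value halves every 50 seconds.
    let rate = 0.5_f64.powf(2.0 / 100.0);
    // A negative value decays toward zero symmetrically: -400 -> -200
    // after half the window elapses.
    let v = -400.0 * rate.powf(50.0);
    assert!((v + 200.0).abs() < 1e-6);
}
```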

Comment on lines +1419 to +1420
let outgoing_in_flight_risk: u64 =
outgoing_channel.pending_htlcs.iter().map(|htlc| htlc.1.in_flight_risk).sum();
Contributor


Should only count accountable htlcs towards risk - methinks a helper function!

Comment on lines +641 to +680
impl Writeable for Channel {
fn write<W: Writer>(&self, writer: &mut W) -> Result<(), io::Error> {
write_tlv_fields!(writer, {
(1, self.outgoing_reputation, required),
(3, self.incoming_revenue, required),
(5, self.pending_htlcs, required),
(7, self.general_bucket, required),
(9, self.congestion_bucket, required),
(11, self.last_congestion_misuse, required),
(13, self.protected_bucket, required),
});

Ok(())
}
}

impl Readable for Channel {
fn read<R: Read>(reader: &mut R) -> Result<Channel, DecodeError> {
_init_and_read_len_prefixed_tlv_fields!(reader, {
(1, outgoing_reputation, required),
(3, incoming_revenue, required),
(5, pending_htlcs, required),
(7, general_bucket, required),
(9, congestion_bucket, required),
(11, last_congestion_misuse, required),
(13, protected_bucket, required),
});

Ok(Channel {
outgoing_reputation: outgoing_reputation.0.unwrap(),
incoming_revenue: incoming_revenue.0.unwrap(),
pending_htlcs: pending_htlcs.0.unwrap(),
general_bucket: general_bucket.0.unwrap(),
congestion_bucket: congestion_bucket.0.unwrap(),
last_congestion_misuse: last_congestion_misuse.0.unwrap(),
protected_bucket: protected_bucket.0.unwrap(),
})
}
}

Contributor


Can use the top level macro when the fields can be directly read/written:

impl_writeable_tlv_based!(Channel, {
	(1, outgoing_reputation, required),
	(3, incoming_revenue, required),
	(5, pending_htlcs, required),
	(7, general_bucket, required),
	(9, congestion_bucket, required),
	(11, last_congestion_misuse, required),
	(13, protected_bucket, required),
});

}
}

impl Writeable for Channel {
Contributor


Likewise can use the top level macro for persistence where we just straight read/write.

});
Ok(BucketResources {
slots_allocated: slots_allocated.0.unwrap(),
slots_used: 0,
Contributor


TIL (from claude) that we can use static_value + impl_writeable_tlv_based for these zero values.

}

// Replay pending HTLCs to restore bucket usage.
for (incoming_channel, htlcs) in pending_htlcs.iter() {
Contributor


Nice 👌

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

