mne_rt.protocols.RLProtocol
- class mne_rt.protocols.RLProtocol(direction: str = 'up', initial_threshold: float = 0.0, target_hit_rate: float = 0.7, lr: float = 0.05, epsilon: float = 0.05, smoothing: float = 0.0, history_len: int = 50, warmup_windows: int = 20, rng_seed: int | None = None)
Bases: object
Adaptive NF protocol with reinforcement-learning threshold updates.
Adjusts the decision threshold after every evaluation to maintain a target hit rate using the update rule:
threshold += lr * (hit_rate - target_hit_rate) * running_std
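As a worked illustration of the rule above, the following minimal sketch applies a single update step in plain Python. The helper name `rl_threshold_step` and the sample numbers are hypothetical; the sign flip for direction="down" follows the Notes for this class, and the library's actual implementation may differ.

```python
def rl_threshold_step(threshold, hit_rate, target_hit_rate, lr, running_std,
                      direction="up"):
    """One step of the documented rule (hypothetical helper, not part of mne_rt)."""
    delta = lr * (hit_rate - target_hit_rate) * running_std
    # For "down" protocols a lower threshold is harder, so the sign is flipped.
    if direction == "down":
        delta = -delta
    return threshold + delta


# Hit rate above target: the threshold rises for "up" and falls for "down".
new_up = rl_threshold_step(0.5, hit_rate=0.9, target_hit_rate=0.7,
                           lr=0.05, running_std=2.0)
new_down = rl_threshold_step(0.5, hit_rate=0.9, target_hit_rate=0.7,
                             lr=0.05, running_std=2.0, direction="down")
print(new_up, new_down)
```

Here the step size is lr * (0.9 - 0.7) * 2.0, roughly 0.02, so both protocols become slightly harder when the participant is succeeding more often than the target.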
Unlike ThresholdProtocol (which also has an adaptive mode), this protocol tracks a rolling hit rate in a fixed-length window, scales updates by the running standard deviation of recent values, and optionally applies epsilon-greedy exploration: with probability epsilon a reward is given regardless of the threshold. Exploration trials do not count toward the hit rate used for threshold updates.
During the first warmup_windows calls to evaluate() the threshold is frozen and crossed is always False.
- Parameters:
- direction
{“up”, “down”} “up” -> reward when value > threshold (e.g., enhance alpha power). “down” -> reward when value < threshold (e.g., suppress beta power). Default is “up”.
- initial_threshold
float Starting decision threshold. Default is 0.0.
- target_hit_rate
float Desired proportion of non-exploration windows that cross the threshold. Must be strictly in (0, 1). Default is 0.70.
- lr
float Learning rate for threshold updates. Must be > 0. Default is 0.05.
- epsilon
float Exploration probability. On each call to evaluate(), epsilon is the chance of giving a reward regardless of threshold. Must be in [0, 1). Default is 0.05.
- smoothing
float EMA smoothing coefficient applied to the raw input before thresholding. Must be in [0, 1). 0.0 disables smoothing. Applied as: smoothed = (1 - smoothing) * new + smoothing * prev. Default is 0.0.
- history_len
int Rolling-window length for hit-rate and running-std estimation. Must be >= 10. Default is 50.
- warmup_windows
int Number of initial evaluations used solely to seed the rolling statistics before any reward can be issued or any threshold update is applied. Must be >= 1. Default is 20.
- rng_seed
int | None Seed for the NumPy random generator used for epsilon draws. Default is None (non-deterministic).
- Raises:
ValueError If any parameter is outside its valid range.
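The documented ranges can be summarised as a set of checks. The sketch below is only an illustration of those ranges: `validate_rl_params` is a hypothetical helper, the error messages are invented, and the real constructor may validate in a different order or with different wording.

```python
def validate_rl_params(direction="up", target_hit_rate=0.7, lr=0.05, epsilon=0.05,
                       smoothing=0.0, history_len=50, warmup_windows=20):
    """Check the documented ranges (hypothetical helper, not part of mne_rt)."""
    if direction not in ("up", "down"):
        raise ValueError(f"direction must be 'up' or 'down', got {direction!r}")
    if not 0.0 < target_hit_rate < 1.0:
        raise ValueError("target_hit_rate must be strictly in (0, 1)")
    if not lr > 0.0:
        raise ValueError("lr must be > 0")
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must be in [0, 1)")
    if not 0.0 <= smoothing < 1.0:
        raise ValueError("smoothing must be in [0, 1)")
    if history_len < 10:
        raise ValueError("history_len must be >= 10")
    if warmup_windows < 1:
        raise ValueError("warmup_windows must be >= 1")


validate_rl_params()  # the documented defaults pass silently
try:
    validate_rl_params(epsilon=1.0)  # the upper bound is exclusive
except ValueError as err:
    print(f"rejected: {err}")
```

Note that epsilon and smoothing accept 0.0 (which disables the feature) but not 1.0, while target_hit_rate excludes both endpoints.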
Notes
The update rule is direction-aware: when direction="up" a higher threshold raises difficulty; when direction="down" a lower threshold raises difficulty. The sign of the update is therefore flipped for “down” protocols.
Examples
RL-adaptive alpha-up protocol targeting a 70 % hit rate:
    from mne_rt.protocols import RLProtocol

    proto = RLProtocol(
        direction="up",
        initial_threshold=0.5,
        target_hit_rate=0.70,
        lr=0.05,
        epsilon=0.05,
    )
    # nf_stream and send_reward are supplied by the caller.
    for value in nf_stream:
        crossed, magnitude = proto.evaluate(value)
        if crossed:
            send_reward(magnitude)
Added in version 1.0.0.
- __init__(direction: str = 'up', initial_threshold: float = 0.0, target_hit_rate: float = 0.7, lr: float = 0.05, epsilon: float = 0.05, smoothing: float = 0.0, history_len: int = 50, warmup_windows: int = 20, rng_seed: int | None = None) -> None
Methods
- __init__([direction, initial_threshold, ...])
- evaluate(value): Evaluate one NF value and return (crossed, magnitude).
- reset(): Reset all adaptive state to initial conditions.
Attributes
- hit_rate: Rolling hit rate over non-exploration evaluations (0–1).
- Total number of evaluations since init or last reset().
- n_explored: Number of exploration trials (epsilon draws) since init or reset.
- Current decision threshold.
- evaluate(value: float) -> tuple[bool, float]
Evaluate one NF value and return (crossed, magnitude).
Applies optional EMA smoothing, checks the current threshold, draws for epsilon-greedy exploration, updates the rolling hit history (exploration draws excluded), and then applies the RL threshold update. During the warmup period, all rewards and threshold updates are suppressed.
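The smoothing step uses the formula documented for the smoothing parameter, smoothed = (1 - smoothing) * new + smoothing * prev. The sketch below applies it to a short sequence. `ema_smooth` is a hypothetical helper, and seeding the state with the first sample is an assumption, since the source does not say how the EMA is initialised.

```python
def ema_smooth(values, smoothing):
    """Apply smoothed = (1 - smoothing) * new + smoothing * prev to a sequence."""
    out = []
    prev = None
    for new in values:
        # Assumption: the first sample initialises the EMA state unchanged.
        prev = new if prev is None else (1.0 - smoothing) * new + smoothing * prev
        out.append(prev)
    return out


raw = [0.0, 1.0, 1.0, 1.0]
print(ema_smooth(raw, 0.0))  # smoothing disabled: output equals input
print(ema_smooth(raw, 0.5))  # each step moves halfway toward the new value
```

A larger coefficient gives more weight to the previous smoothed value, so the output reacts more slowly to changes in the input.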
- Parameters:
- value
float Current NF feature value.
- Returns:
tuple[bool, float] The pair (crossed, magnitude).
Notes
Exploration trials (where the reward is given due to epsilon-greedy exploration) are counted in n_explored but are not recorded in the hit history used for the threshold-update rule.
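The bookkeeping described above can be sketched with a toy class. This is not the mne_rt implementation: `ToyRLProtocol` and the counter name `n_evaluated` are hypothetical, only direction="up" is handled, smoothing is omitted, the standard-library `random.Random` stands in for the NumPy generator, magnitude is assumed to be the distance from the threshold, and skipping the threshold update on exploration trials is likewise an assumption.

```python
import random
from collections import deque
from statistics import pstdev


class ToyRLProtocol:
    """Illustrative stand-in for the documented flow; not the mne_rt code."""

    def __init__(self, threshold=0.0, target_hit_rate=0.7, lr=0.05, epsilon=0.05,
                 history_len=50, warmup_windows=20, rng_seed=None):
        self.threshold = threshold
        self.target_hit_rate = target_hit_rate
        self.lr = lr
        self.epsilon = epsilon
        self.warmup_windows = warmup_windows
        self.values = deque(maxlen=history_len)  # recent values for the running std
        self.hits = deque(maxlen=history_len)    # non-exploration outcomes only
        self.n_evaluated = 0                     # hypothetical counter name
        self.n_explored = 0
        self.rng = random.Random(rng_seed)       # stdlib stand-in for NumPy's generator

    def evaluate(self, value):
        self.n_evaluated += 1
        self.values.append(value)
        if self.n_evaluated <= self.warmup_windows:
            return False, 0.0                    # warmup: no reward, threshold frozen
        hit = value > self.threshold             # direction="up" only, for brevity
        magnitude = abs(value - self.threshold)  # assumption: distance from threshold
        if self.rng.random() < self.epsilon:
            self.n_explored += 1                 # counted, but kept out of self.hits
            return True, magnitude
        self.hits.append(hit)
        hit_rate = sum(self.hits) / len(self.hits)
        self.threshold += (self.lr * (hit_rate - self.target_hit_rate)
                           * pstdev(self.values))
        return hit, magnitude


proto = ToyRLProtocol(threshold=0.5, epsilon=0.2, warmup_windows=5, rng_seed=0)
results = [proto.evaluate(v)[0] for v in [0.0, 1.0] * 15]
print(results[:5], proto.n_explored, len(proto.hits))
```

After the 5 warmup calls, each of the remaining 25 evaluations is either an exploration trial or an entry in the hit history, never both, which is the separation the note describes.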
- property hit_rate: float
Rolling hit rate over non-exploration evaluations (0–1).
Returns 0.0 before any non-exploration evaluations are recorded.
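A rolling hit rate with this empty-history behaviour can be sketched with a fixed-length deque. `RollingHitRate` and its `record` method are hypothetical names, and the window of 4 is used only to keep the example short (the documented minimum for history_len is 10).

```python
from collections import deque


class RollingHitRate:
    """Fixed-length hit history (hypothetical helper, not part of mne_rt)."""

    def __init__(self, history_len=50):
        self._hits = deque(maxlen=history_len)  # oldest entries drop off automatically

    def record(self, hit):
        self._hits.append(bool(hit))

    @property
    def hit_rate(self):
        # Mirrors the documented behaviour: 0.0 until something is recorded.
        if not self._hits:
            return 0.0
        return sum(self._hits) / len(self._hits)


tracker = RollingHitRate(history_len=4)
empty_rate = tracker.hit_rate
for outcome in [True, True, False, True, False]:
    tracker.record(outcome)
print(empty_rate, tracker.hit_rate)
```

Only the four most recent outcomes contribute to the rate, so the first recorded hit has already been dropped by the time the final value is read.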