From e2f7d91fc3397d1114aa515fb59faad98b2a4a4a Mon Sep 17 00:00:00 2001
From: Alice Cecile <alice.i.cecile@gmail.com>
Date: Sun, 19 Sep 2021 00:27:59 -0400
Subject: [PATCH 01/11] RFC outline

---
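Reviewer note: the first unsatisfiability case listed in this patch (a cyclic dependency among strict ordering constraints) can be checked with an ordinary topological sort. The sketch below is only an illustration under assumptions: systems are identified by plain indices, and `is_satisfiable` is a hypothetical helper name, not part of Bevy or of this RFC.

```rust
use std::collections::VecDeque;

/// Returns true if all `(before, after)` pairs can be honored at once,
/// i.e. the constraint graph over `0..system_count` contains no cycle.
/// Hypothetical helper: the name and signature are illustrative only.
pub fn is_satisfiable(system_count: usize, before_after: &[(usize, usize)]) -> bool {
    let mut dependants: Vec<Vec<usize>> = vec![Vec::new(); system_count];
    let mut unmet = vec![0usize; system_count];
    for &(before, after) in before_after {
        dependants[before].push(after);
        unmet[after] += 1;
    }

    // Kahn's algorithm: repeatedly place systems with no unmet dependencies.
    let mut ready: VecDeque<usize> = (0..system_count).filter(|&s| unmet[s] == 0).collect();
    let mut placed = 0;
    while let Some(system) = ready.pop_front() {
        placed += 1;
        for &next in &dependants[system] {
            unmet[next] -= 1;
            if unmet[next] == 0 {
                ready.push_back(next);
            }
        }
    }

    // Any system left unplaced is part of (or downstream of) a cycle.
    placed == system_count
}
```

A schedule initializer could call such a check once and panic when it returns `false`; the at-least-once separation case would need a separate check.
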
 rfcs/34-system-graph-flavors.md | 144 ++++++++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)
 create mode 100644 rfcs/34-system-graph-flavors.md

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
new file mode 100644
index 00000000..00a308c5
--- /dev/null
+++ b/rfcs/34-system-graph-flavors.md
@@ -0,0 +1,144 @@
+# Feature Name: `system-graph-flavors`
+
+## Summary
+
+Several new, higher-level system ordering constraints are introduced to allow users to describe more complex rules that the scheduler must obey.
+These constraints limit the set of possible paths through the schedule and serve as initialization-time assertions on the schedule that protect against accidental misconfiguration.
+
+## Motivation
+
+While the existing strict ordering constraints implemented as `.before` and `.after` are an excellent foundation,
+more complex constraints are needed to elegantly express and enforce rules about how systems relate to each other.
+
+In order to manage the complexity of code bases with hundreds of systems, these constraints need to be both minimal and local.
+Each constraint should have a good, clear reason to exist.
+
+This is particularly critical in the context of [plugin configurability](https://github.com/bevyengine/rfcs/pull/33), where plugin authors need a powerful toolset of constraints in order to ensure that their logic is not broken when reconfigured by the end user.
+
+## User-facing explanation
+
+### Scheduling 101
+
+Bevy schedules are composed of stages, which have two phases: parallel and sequential.
+The parallel phase of a stage always runs first, then is followed by a sequential phase which applies any commands generated in the parallel phase.
+
+During **sequential phases**, each system can access the entire world freely but only one can run at a time.
+Exclusive systems can only be added to the sequential phase while parallel systems can be added to either phase.
+
+During **parallel phases**, systems are allowed to operate in parallel, carefully dividing up access to the `World` according to the data accesses requested by their system parameters to avoid undefined behavior.
+
+Without any user-provided ordering constraints, systems within the same parallel phase can be executed in any order so long as Rust's ownership semantics are obeyed.
+That means that a **waiting** (scheduled to run during this stage) system cannot be started while any **active** (currently running) system is **incompatible** with it, that is, while the two have conflicting data access and so cannot run at the same time.
+This is helpful for performance reasons, but, as the precise order in which systems start and complete is nondeterministic, can result in logic bugs or inconsistencies due to **system order ambiguities**.
+
+### System ordering constraints
+
+To help you eliminate ambiguities, eliminate delays and enforce **logical invariants** about how your systems work together, Bevy provides a rich toolbox of **system ordering constraints**.
+
+There are several **flavors** of system ordering constraints, each with their own high-level behavior:
+
+- **Strict ordering:** Systems from set `A` cannot be started while any system from set `B` is waiting or active.
+  - Simple and explicit.
+  - Use the `before(label: impl SystemLabel)` or `after(label: impl SystemLabel)` methods to set this behavior.
+- **If-needed ordering:** A given system in set `A` cannot be started while any incompatible system from set `B` is waiting or active.
+  - Usually, if-needed ordering is the correct tool for ordering groups of systems as it avoids unnecessary blocking.
+  - If systems in `A` use interior mutability, if-needed ordering may result in non-deterministic outcomes.
+  - Use `before_if_needed(label: impl SystemLabel)` or `after_if_needed(label: impl SystemLabel)`.
+- **At-least-once separation:** Systems in set `A` cannot be started if a system in set `B` has been started until at least one system with the label `S` has completed. Systems from `A` and `B` are incompatible with each other.
+  - This is most commonly used when commands created by systems in `A` must be processed before systems in `B` can run correctly.
+  - Use the `between(before: impl SystemLabel, after: impl SystemLabel)`, `before_and_seperated_by(before: impl SystemLabel, seperated_by: impl SystemLabel)` method or its `after` analogue.
+  - `hard_before` and `hard_after` are syntactic sugar for the common case where the separating system label is `CoreSystem::ApplyCommands`, the label used for the exclusive system that applies queued commands that is typically added to the beginning of each sequential stage.
+  - The order of arguments in `between` matters; the labels must only be separated by a cleanup system in one direction, rather than both.
+  - These methods do not automatically insert systems to enforce this separation: instead, the schedule will panic upon initialization as no valid system execution strategy exists.
+  - At-least-once separation also applies a special form of run criteria overriding (see below): if a pair of `before` and `after` systems are enabled during a schedule pass, and all intervening `between` systems are disabled, the last intervening `between` system is forcibly enabled to ensure that cleanup occurs.
+
+System ordering constraints operate across stages, but can only see systems within the same schedule.
+
+A schedule will panic on initialization if it is **unsatisfiable**, that is, if its system ordering constraints cannot all be met at once.
+This may occur if:
+
+1. A cyclic dependency exists. For example, `A` must be before `B` and `B` must be before `A` (or some more complex transitive issue exists).
+2. Two systems are at-least-once separated and no separating system exists that could be scheduled to satisfy the constraint.
+
+System ordering constraints cannot change which stage systems run in or add systems: you must fix these problems manually in such a way that the schedule becomes satisfiable.
+
+### Run criteria overrides
+
+Bevy provides tools to ensure that critical clean-up work is not missed when systems are enabled and disabled using run criteria.
+
+Suppose we have a system `A` that does some work, but leaves the `World` in a messy state that will cause bugs or undesirable side effects if not handled properly.
+System `B` is responsible for cleaning up this work, and should run during every pass of the schedule loop that `A` does.
+
+We can ensure that this occurs by adding a run criteria to `B` that takes `A`'s run status into account: if `A` should run, `B` must also run, regardless of what its other run criteria say.
+Otherwise, `B`'s original run criteria behave as usual.
+Often, `B` will have a blanket run criteria that states that it should never run, and is only run when the override requires it.
+
+As a user, you can add this rule to your systems and system sets by using the `.overrides(overridden_system: impl SystemLabel)` or `overridden_by(overriding_system: impl SystemLabel)` methods.
+Bevy also provides a standard run criteria for the "this system should never run unless overridden" use case: add `.with_run_criteria(NeverRun)` to these systems.
+
+## Implementation strategy
+
+Simple changes:
+
+1. Commands should be processed in a standard exclusive system with the `CoreSystem::ApplyCommands` label.
+
+### Strict ordering
+
+This algorithm is a restatement of the existing approach, as added in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144).
+It is provided here to establish a foundation on which we can build other ordering constraints.
+
+TODO: describe algorithm.
+
+### If-needed ordering
+
+TODO: describe the algorithm.
+
+### At-least-once separation
+
+TODO: describe the algorithm.
+
+### Schedules own systems
+
+Currently, each `Stage` owns the systems in it.
+This creates painful divisions that make it challenging or impossible to do things like check dependencies between systems in different stages.
+
+Instead, the `Schedule` should own the systems in it, and constraint-satisfaction logic should be handled at the schedule-level.
+
+### Cross-stage system ordering constraint semantics
+
+### Run criteria overrides
+
+TODO: write this code.
+
+## Drawbacks
+
+1. This adds significant opt-in end-user complexity (especially for plugin authors) over the simple and explicit `.before` and `.after` that already exist or use of stages (or direct linear ordering).
+2. Additional constraints will increase the complexity (and performance cost) of the scheduler.
+
+## Rationale and alternatives
+
+TODO: complete me.
+
+## Prior art
+
+The existing scheduler and basic strict ordering constraints are discussed at length in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144).
+
+Additional helpful technical foundations and analysis can be found in the [linked blog post](https://ratysz.github.io/article/scheduling-1/) by @Ratysz.
+
+## Unresolved questions
+
+1. How *precisely* should schedules store the systems in them?
+
+## Future possibilities
+
+These additional constraints lay the groundwork for:
+
+1. More elegant, efficient versions of higher-level APIs such as a [system graph specification].
+2. Techniques for automatic schedule construction, where the scheduler is permitted to infer hard sync points and insert cleanup systems as needed to meet the constraints posed.
+3. [Robust schedule-configurable plugins](https://github.com/bevyengine/rfcs/pull/33).
+
+In the future, pre-computed schedules that are loaded from disk could help mitigate some of the startup performance costs of this analysis.
+Such an approach would only be worthwhile for production apps.
+
+The maintainability of system ordering constraints can be enhanced in the future with the addition of tools that serve to verify that they are still useful.
+For example, if-needed ordering constraints that can never produce any effect on system ordering could be flagged at schedule initialization time and reported as a warning.

From cfc4b24440206c9a0bb9a5876d2034bb106e5bc6 Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Mon, 20 Sep 2021 14:20:56 -0400
Subject: [PATCH 02/11] Introduction to causal ties

---
 rfcs/34-system-graph-flavors.md | 87 +++++++++++++++++++++++++++------
 1 file changed, 71 insertions(+), 16 deletions(-)

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index 00a308c5..a192fbf1 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -50,7 +50,7 @@ There are several **flavors** of system ordering constraints, each with their ow
   - `hard_before` and `hard_after` are syntactic sugar for the common case where the separating system label is `CoreSystem::ApplyCommands`, the label used for the exclusive system that applies queued commands that is typically added to the beginning of each sequential stage.
   - The order of arguments in `between` matters; the labels must only be separated by a cleanup system in one direction, rather than both.
   - These methods do not automatically insert systems to enforce this separation: instead, the schedule will panic upon initialization as no valid system execution strategy exists.
-  - At-least-once separation also applies a special form of run criteria overriding (see below): if a pair of `before` and `after` systems are enabled during a schedule pass, and all intervening `between` systems are disabled, the last intervening `between` system is forcibly enabled to ensure that cleanup occurs.
+  - At-least-once separation also applies a special causal tie (see below): if a pair of `before` and `after` systems are enabled during a schedule pass, and all intervening `between` systems are disabled, the last intervening `between` system is forcibly enabled to ensure that cleanup occurs.
 
 System ordering constraints operate across stages, but can only see systems within the same schedule.
 
@@ -62,19 +62,65 @@ This may occur if:
 
 System ordering constraints cannot change which stage systems run in or add systems: you must fix these problems manually in such a way that the schedule becomes satisfiable.
 
-### Run criteria overrides
-
-Bevy provides tools to ensure that critical clean-up work is not missed when systems are enabled and disabled using run criteria.
-
-Suppose we have a system `A` that does some work, but leaves the `World` in a messy state that will cause bugs or undesirable side effects if not handled properly.
-System `B` is responsible for cleaning up this work, and should run during every pass of the schedule loop that `A` does.
-
-We can ensure that this occurs by adding a run criteria to `B` that takes `A`'s run status into account: if `A` should run, `B` must also run, regardless of what its other run criteria say.
-Otherwise, `B`'s original run criteria behave as usual.
-Often, `B` will have a blanket run criteria that states that it should never run, and is only run when the override requires it.
-
-As a user, you can add this rule to your systems and system sets by using the `.overrides(overridden_system: impl SystemLabel)` or `overridden_by(overriding_system: impl SystemLabel)` methods.
-Bevy also provides a standard run criteria for the "this system should never run unless overridden" use case: add `.with_run_criteria(NeverRun)` to these systems.
+### Causal ties: systems that must run together
+
+Sometimes, the working of a system is tied in an essential way to the operation of another.
+Suppose you've created a simple set of physics systems: updating velocity based on acceleration without then applying the velocity to update the position will result in subtle and frustrating bugs.
+Rather than relying on fastidious documentation and careful end-users, we can enforce these constraints through the schedule.
+
+These **causal ties** have fairly simple rules:
+
+- causal ties are directional: the **upstream** system(s) determine whether the **downstream** systems must run, but the reverse is not true.
+- if an upstream system runs during a pass through the schedule, the downstream system must also run during the same pass through the schedule
+  - this overrides any existing run criteria, although any ordering constraints are obeyed as usual
+  - causal ties propagate downstream: you could end up enabling a complex chain of systems because a single upstream system needed to run during this schedule pass
+- if no suitable downstream system exists in the schedule, the schedule will panic upon initialization
+- causal ties are independent of system execution order: upstream systems may be before, after or ambiguous with their downstream systems
+
+Let's take a look at how we would declare these rules for our physics example using the `if_runs_then` system configuration method on our upstream system:
+
+```rust
+fn main() {
+  App::new()
+    .add_system(update_velocity
+      .label("update_velocity")
+      // If this system is enabled,
+      // all systems with the "apply_velocity" label must run this loop.
+      // If no systems with that label exist,
+      // the schedule will panic.
+      .if_runs_then("apply_velocity")
+    )
+    .add_system(
+      apply_velocity
+      .label("apply_velocity")
+    )
+    .run();
+}
+```
+
+If we only have control over our downstream system, we can use the `run_if` system configuration method:
+
+```rust
+fn main() {
+  App::new()
+    .add_system(update_velocity
+      .label("update_velocity")
+    )
+    .add_system(
+      apply_velocity
+      .label("apply_velocity")
+      // If any system with the given label is enabled,
+      // this system must run this loop.
+      .run_if("update_velocity")
+    )
+    .run();
+}
+```
+
+`only_run_if` has the same semantics, except that it also adds a run criteria that always returns `ShouldRun::No`, allowing the system to be skipped except where overridden by a causal tie (or other run criteria).
+
+If we want a strict bidirectional dependency, we can use `always_run_together(label: impl SystemLabel)` as shorthand for a pair of `if_runs_then` and `run_if` constraints.
+Bidirectional links of this sort result in an OR-style logic: if either system in the pair is scheduled to run in this schedule pass, both systems must run.
 
 ## Implementation strategy
 
@@ -106,7 +152,7 @@ Instead, the `Schedule` should own the systems in it, and constraint-satisfactio
 
 ### Cross-stage system ordering constraint semantics
 
-### Run criteria overrides
+### Causal ties
 
 TODO: write this code.
 
@@ -117,7 +163,14 @@ TODO: write this code.
 
 ## Rationale and alternatives
 
-TODO: complete me.
+### Why do we need to include causal ties as part of this RFC?
+
+Without causal-tie functionality, at-least-once separation can silently break when the separating system is disabled.
+
+### Why don't we support more complex causal tie operations?
+
+Compelling use cases (other than the one created by at-least-once separation) have not been demonstrated.
+If and when those exist, we can add the functionality with APIs that map well to user intent.
 
 ## Prior art
 
@@ -142,3 +195,5 @@ Such an approach would only be worthwhile for production apps.
 
 The maintainability of system ordering constraints can be enhanced in the future with the addition of tools that serve to verify that they are still useful.
 For example, if-needed ordering constraints that can never produce any effect on system ordering could be flagged at schedule initialization time and reported as a warning.
+
+Additional graph-style APIs for specifying causal ties may be helpful, as would the addition of a `.always_run_together` method that can be added to labels to imply mutual causal ties between all systems that share that label.

From 5f38471614ef4562461e2f3d6b593fef217247e0 Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Mon, 20 Sep 2021 16:01:25 -0400
Subject: [PATCH 03/11] Causal ties implementation algorithm

---
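Reviewer note: the propagation pass this patch describes (enable everything immediately downstream of an enabled system, then repeat until nothing changes) can be sketched as a small fixed-point loop. This is a minimal illustration under assumptions: systems are plain indices, `Vec<bool>` stands in for the bit sets the RFC has in mind, the fields and the `propagate` method are hypothetical, and the panic for downstream systems in already-finished stages is omitted.

```rust
/// Hypothetical stand-in for the `CausalTies` graph described in the patch:
/// `downstream[u]` lists the systems that must run whenever system `u` runs.
pub struct CausalTies {
    pub downstream: Vec<Vec<usize>>,
}

impl CausalTies {
    /// Repeatedly enables the immediate downstream systems of every enabled
    /// system, stopping once a full pass makes no change.
    /// `should_run[s]` plays the role of `ShouldRun::Yes` for system `s`.
    pub fn propagate(&self, should_run: &mut [bool]) {
        loop {
            let mut changed = false;
            for upstream in 0..self.downstream.len() {
                if !should_run[upstream] {
                    continue;
                }
                for &down in &self.downstream[upstream] {
                    if !should_run[down] {
                        should_run[down] = true;
                        changed = true;
                    }
                }
            }
            if !changed {
                break;
            }
        }
    }
}
```

The loop deliberately rechecks every system on each pass instead of tracking which ones changed, mirroring the trade-off discussed under "Why not cache the new set of systems whose run criteria changed?".
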
 rfcs/34-system-graph-flavors.md | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index a192fbf1..da1e4085 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -154,7 +154,22 @@ Instead, the `Schedule` should own the systems in it, and constraint-satisfactio
 
 ### Causal ties
 
-TODO: write this code.
+After all `SystemDescriptor`s are added to the schedule, any causal ties described within are converted to a single `CausalTies` graph data structure.
+
+This stores directed edges: from one upstream system to the corresponding downstream system.
+If a label is applied to more than one system, an edge is created for every matching upstream-downstream pair of systems.
+If no system carries the upstream label, the tie should be silently skipped, but if no system carries the downstream label, the schedule should panic.
+This is because our high-level functionality will continue to behave correctly if a system that creates work to be handled is absent, but will break if a system that cleans up work is missing.
+
+Whenever run criteria are checked, the set of all systems which have `ShouldRun::Yes` during this schedule pass is compared to the set of upstream systems (those which have at least one outgoing edge) in the `CausalTies` struct.
+All systems that are immediately downstream of those systems have their `ShouldRun` state set to `Yes` as well.
+If the downstream system is not waiting to be scheduled in the *current or future* stages, the executor should panic (likely due to the system being added to an earlier stage).
+This process is repeated until no new changes occur.
+
+If this is implemented under the current `RunCriteria` design that includes looping information, that looping information should be propagated to downstream dependencies as well.
+In particular, if an upstream system is `YesAndLoop`, the downstream system becomes `YesAndLoop`.
+If the upstream system is `NoAndLoop`, the downstream system becomes either `NoAndLoop` or `YesAndLoop`, based on whether or not it was already set to run.
+This branching is much more complex to reason about and likely to be substantially slower, and is another reason to simplify the `RunCriteria` enum.
 
 ## Drawbacks
 
@@ -172,6 +187,13 @@ Without causal-tie functionality, at-least-once separation can silently break wh
 Compelling use cases (other than the one created by at-least-once separation) have not been demonstrated.
 If and when those exist, we can add the functionality with APIs that map well to user intent.
 
+### Why not cache the new set of systems whose run criteria changed?
+
+Vector allocations are slow, but bit set comparisons and branchless mass assignments are fast.
+It's faster to repeat work here than to verify whether it needs to be done.
+
+At extremely high causal-tie graph depths and numbers of systems, the asymptotic performance gains may be worth it, but that seems unlikely in realistic scenarios.
+
 ## Prior art
 
 The existing scheduler and basic strict ordering constraints are discussed at length in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144).

From fb8c193905bb2dbeee5242c521efd813d3ecabef Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Mon, 20 Sep 2021 16:34:43 -0400
Subject: [PATCH 04/11] Cross-stage system ordering

---
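Reviewer note: the three cross-stage cases this patch lists for "`A` is before `B`", plus the exemption for if-needed constraints, amount to a comparison of stage positions. The sketch below assumes stages are numbered in execution order; the `CrossStageCheck` enum and `check_before` function are hypothetical names used only for illustration.

```rust
use std::cmp::Ordering;

/// Outcome of checking "`A` is before `B`" against the stages the two
/// systems live in. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum CrossStageCheck {
    /// `A` is in an earlier stage: nothing to do.
    TriviallySatisfied,
    /// Same stage: `B` must wait until `A` completes.
    OrderWithinStage,
    /// `A` is in a later stage and the constraint is strict: panic.
    Unsatisfiable,
    /// `A` is in a later stage but the constraint is if-needed: ignore it.
    Skipped,
}

/// Classifies a constraint given the stage index of each system,
/// where a smaller index means the stage runs earlier.
pub fn check_before(stage_of_a: usize, stage_of_b: usize, if_needed: bool) -> CrossStageCheck {
    match stage_of_a.cmp(&stage_of_b) {
        Ordering::Less => CrossStageCheck::TriviallySatisfied,
        Ordering::Equal => CrossStageCheck::OrderWithinStage,
        Ordering::Greater if if_needed => CrossStageCheck::Skipped,
        Ordering::Greater => CrossStageCheck::Unsatisfiable,
    }
}
```

A schedule initializer would panic on `Unsatisfiable` and only add a dependency edge for `OrderWithinStage`.
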
 rfcs/34-system-graph-flavors.md | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index da1e4085..fe88c7b2 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -19,13 +19,15 @@ This is particularly critical in the context of [plugin configurability](https:/
 
 ### Scheduling 101
 
-Bevy schedules are composed of stages, which have two phases: parallel and sequential.
-The parallel phase of a stage always runs first, then is followed by a sequential phase which applies any commands generated in the parallel phase.
+Bevy schedules are composed of stages, of which there are two varieties: parallel and sequential.
+Typically, these will alternate: parallel stages will perform easily isolated work, and generate commands.
+Then a sequential stage will run, beginning with the standard command-processing exclusive system.
+Any adjacent parallel stages are collapsed into one parallel stage to maximize parallelism, and any adjacent sequential stages are concatenated.
 
-During **sequential phases**, each system can access the entire world freely but only one can run at a time.
+During **sequential stages**, each system can access the entire world freely but only one can run at a time.
 Exclusive systems can only be added to the sequential phase while parallel systems can be added to either phase.
 
-During **parallel phases**, systems are allowed to operate in parallel, carefully dividing up access to the `World` according to the data accesses requested by their system parameters to avoid undefined behavior.
+During **parallel stages**, systems are allowed to operate in parallel, carefully dividing up access to the `World` according to the data accesses requested by their system parameters to avoid undefined behavior.
 
 Without any user-provided ordering constraints, systems within the same parallel phase can be executed in any order so long as Rust's ownership semantics are obeyed.
 That means that a **waiting** (scheduled to run during this stage) system cannot be started while any **active** (currently running) system is **incompatible** with it, that is, while the two have conflicting data access and so cannot run at the same time.
@@ -150,7 +152,17 @@ This creates painful divisions that make it challenging or impossible to do thin
 
 Instead, the `Schedule` should own the systems in it, and constraint-satisfaction logic should be handled at the schedule-level.
 
-### Cross-stage system ordering constraint semantics
+### Cross-stage system ordering constraints
+
+When two systems are in separate stages, the exact meaning of ordering constraints gets a bit trickier.
+
+When system `A` is before system `B`:
+
+- if system `A` is in an earlier stage, nothing happens: this is trivially satisfied
+- if system `A` is in the same stage, system `B` is held in queue until system `A` completes
+- if system `A` is in a later stage, the schedule panics: this is trivially impossible
+
+This check (and panic) should be skipped for if-needed ordering constraints: they are intended only to resolve ambiguities, not to introduce logically necessary constraints, so they should restrict the schedule as little as possible.
 
 ### Causal ties
 

From 8163c78b48e22db6b6dbd9fdb278dbdeef771dd2 Mon Sep 17 00:00:00 2001
From: Alice Cecile <alice.i.cecile@gmail.com>
Date: Tue, 21 Sep 2021 21:59:26 -0400
Subject: [PATCH 05/11] Implementation strategy for strict and if-needed
 ordering

---
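Reviewer note: the bookkeeping this patch describes for strict ordering (each system stores its dependants and a count of dependencies, and may start once that count reaches zero) can be sketched as follows. The struct name comes from the patch text, but the field names, `build_metadata`, and `startable` are assumptions made for illustration. The data-access compatibility condition is left out.

```rust
/// Per-system scheduling data; the field names here are assumptions.
pub struct SystemSchedulingMetadata {
    /// Systems that must come after this one.
    pub dependants: Vec<usize>,
    /// Number of systems that must come before this one.
    pub dependencies_total: usize,
}

/// Builds metadata for `system_count` systems from `(before, after)` pairs.
pub fn build_metadata(
    system_count: usize,
    before_after: &[(usize, usize)],
) -> Vec<SystemSchedulingMetadata> {
    let mut metadata: Vec<SystemSchedulingMetadata> = (0..system_count)
        .map(|_| SystemSchedulingMetadata { dependants: Vec::new(), dependencies_total: 0 })
        .collect();
    for &(before, after) in before_after {
        metadata[before].dependants.push(after);
        metadata[after].dependencies_total += 1;
    }
    metadata
}

/// Returns the systems that may start now: not yet completed, and with every
/// dependency already completed. Data-access conflicts are ignored here.
pub fn startable(metadata: &[SystemSchedulingMetadata], completed: &[usize]) -> Vec<usize> {
    let mut remaining: Vec<usize> = metadata.iter().map(|m| m.dependencies_total).collect();
    for &done in completed {
        // Each completed system releases one dependency of each dependant.
        for &dependant in &metadata[done].dependants {
            remaining[dependant] -= 1;
        }
    }
    (0..metadata.len())
        .filter(|s| remaining[*s] == 0 && !completed.contains(s))
        .collect()
}
```

A greedy executor in the style described here would start every system `startable` returns that is also compatible with the currently running systems.
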
 rfcs/34-system-graph-flavors.md | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index fe88c7b2..4290d36a 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -132,10 +132,18 @@ Simple changes:
 
 ### Strict ordering
 
-This algorithm is a restatement of the existing approach, as added in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144).
+This algorithm is a restatement of the existing approach, as added in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144) and found in the [bevy_ecs/schedule](https://github.com/bevyengine/bevy/tree/v0.5.0/crates/bevy_ecs/src/schedule) module.
 It is provided here to establish a foundation on which we can build other ordering constraints.
 
-TODO: describe algorithm.
+Based on the `.before` and `.after` constraints provided by the `SystemDescriptor` passed in by the end user, each system records the list of systems that must come after it in its `SystemSchedulingMetadata`: these are its **dependants**.
+Additionally, each system records the total number of systems that must come before it: these are its **dependencies**.
+
+Systems are allowed to run if:
+
+1. They are not incompatible with any currently running system due to the data that they must access.
+2. Their current count of dependencies that have not yet completed is 0.
+
+The parallel executor released in Bevy 0.5 runs systems in a straightforward and greedy fashion: if both of those conditions are true, the currently-checked system is immediately scheduled.
 
 ### If-needed ordering
 

From 76b210b1fbfd15bec41cd66ab6e060e546fb1db7 Mon Sep 17 00:00:00 2001
From: Alice Cecile <alice.i.cecile@gmail.com>
Date: Wed, 22 Sep 2021 20:19:50 -0400
Subject: [PATCH 06/11] At-least once separation implementation design

---
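Reviewer note: the if-needed strategy in this patch (start from the cached strict-ordering graph and add a temporary strict edge only where the two systems are incompatible) can be sketched as below. All names are hypothetical: `Access` models data access as sets of component names, whereas a real implementation would compare archetype-component access sets.

```rust
use std::collections::HashSet;

/// Simplified model of a system's data access, keyed by component name.
pub struct Access {
    pub reads: HashSet<&'static str>,
    pub writes: HashSet<&'static str>,
}

/// Convenience constructor for the sketch.
pub fn access(reads: &[&'static str], writes: &[&'static str]) -> Access {
    Access {
        reads: reads.iter().copied().collect(),
        writes: writes.iter().copied().collect(),
    }
}

/// Two systems are incompatible if either writes data the other reads or writes.
pub fn incompatible(a: &Access, b: &Access) -> bool {
    !a.writes.is_disjoint(&b.writes)
        || !a.writes.is_disjoint(&b.reads)
        || !a.reads.is_disjoint(&b.writes)
}

/// Rebuilds the working dependency graph for one parallel stage: every cached
/// strict edge, plus each if-needed edge whose endpoints are incompatible.
pub fn working_graph(
    strict: &[(usize, usize)],
    if_needed: &[(usize, usize)],
    accesses: &[Access],
) -> Vec<(usize, usize)> {
    let mut edges = strict.to_vec();
    for &(before, after) in if_needed {
        let conflict = incompatible(&accesses[before], &accesses[after]);
        if conflict && !edges.contains(&(before, after)) {
            edges.push((before, after));
        }
    }
    edges
}
```

Because `working_graph` never mutates `strict`, calling it at the start of each parallel stage gives the "reset to the cached graph" behavior for free.
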
 rfcs/34-system-graph-flavors.md | 34 +++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index 4290d36a..ec7d80d5 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -147,11 +147,28 @@ The parallel executor released in Bevy 0.5 runs systems in a straightforward and
 
 ### If-needed ordering
 
-TODO: describe the algorithm.
+The dependency graph for strict-ordering constraints can be built once, upon schedule initialization.
+However, if-needed dependencies are more complex: they care about the actual data accesses involved, and so must be recomputed at the beginning of each parallel stage.
+Because the set of archetypes in the world cannot change during a parallel stage, it is safe to compute this ahead of time, rather than dynamically.
+
+The strategy for doing this is fairly straightforward:
+
+1. Convert all `SystemDescriptor` data into `SystemSchedulingMetadata`, which stores the constraints on a one-to-one basis.
+2. Compute and cache the dependency graph of the strict ordering constraints.
+3. Beginning from the strict-ordering dependency graph, check each if-needed dependency.
+4. If and only if the systems connected by an if-needed dependency are incompatible, add a temporary strict ordering edge between them.
+5. At the beginning of each parallel stage, reset the working dependency graph to the cached strict-ordering dependency graph.
 
 ### At-least-once separation
 
-TODO: describe the algorithm.
+At-least-once separation constraints affect both system scheduling and whether or not a system is enabled.
+These constraints capture an interaction between three kinds of systems: before-separation, after-separation and separating systems.
+
+If any before-separation and after-separation systems exist during the same parallel stage, strict ordering dependencies are added from all before-separation systems within that stage to all separating systems within that stage, and from those separating systems to all after-separation systems within that stage.
+This is technically stricter than is needed; see **Rationale and Alternatives** for more discussion.
+
+If they exist within the same sequential stage, the constraint is verified through an in-order scan of the systems.
+If they exist across stage boundaries, the same in-order scan is applied, treating all parallel stages as if they were a single opaque unit.
 
 ### Schedules own systems
 
@@ -214,6 +231,19 @@ It's faster to repeat work here than verify if it needs to be done.
 
 At extremely high causal-tie graph depths and numbers of systems, the asymptotic performance gains may be worth it, but that seems unlikely in realistic scenarios.
 
+### How could we reduce the strictness of at-least-once-separation scheduling?
+
+The simple algorithm proposed above is, technically speaking, too strict.
+Let's mark our before-separation systems with `A`, our after-separation systems with `B`, and our separating systems with `S` so that `A` -> `S` -> `B`.
+
+Within a stage, `A` -> `S` -> `B` -> `A` -> `S` -> `B` is a valid order, but is ruled out by our simple ordering constraints.
+
+
+1. Separating systems are not scheduled in the ordinary fashion. Instead, they are held in reserve, and each separator type is stored in its own pool.
+2. After a before-separation system completes, one separating system of the appropriate flavor is added to the set of waiting systems. A strict-ordering dependency between the separating system and all after-separation systems is added to the working dependency graph.
+3. This ensures that the separating system runs 
+
+
 ## Prior art
 
 The existing scheduler and basic strict ordering constraints are discussed at length in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144).

From a9ac6ae9ff222fb14e6fcb9e2aef821499c5f721 Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Wed, 20 Oct 2021 12:14:41 -0400
Subject: [PATCH 07/11] Swap at-least-once separation to use side effects
 design

---
 rfcs/34-system-graph-flavors.md | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index fe88c7b2..67597759 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -20,7 +20,8 @@ This is particularly critical in the context of [plugin configurability](https:/
 ### Scheduling 101
 
 Bevy schedules are composed of stages, of which there are two varieties: parallel and sequential.
-Typically, these will alternate: parallel stages will perform easily isolated work, and generate commands.
+Typically, these will alternate, as adjacent stages of the same type will be collapsed.
+Parallel stages will perform easily isolated work, and generate commands.
 Then a sequential stage will run, beginning with the standard command-processing exclusive system.
 Any adjacent parallel stages are collapsed into one parallel stage to maximize parallelism, and any adjacent sequential stages are concatenated.
 
@@ -39,20 +40,18 @@ To help you eliminate ambiguities, eliminate delays and enforce **logical invari
 
 There are several **flavors** of system ordering constraints, each with their own high-level behavior:
 
-- **Strict ordering:** Systems from set `A` cannot be started while any system from set `B` are waiting or active.
+- **Strict ordering:** While any systems from set `A` are waiting or active, systems from set `B` cannot be started.
   - Simple and explicit.
   - Use the `before(label: impl SystemLabel)` or `after(label: impl SystemLabel)` methods to set this behavior.
-- **If-needed ordering:** A given system in set `A` cannot be started while any incompatible system from set `B` is waiting or active.
+  - If `A` is empty (or if no systems from `A` are enabled), this rule has no effect.
+- **If-needed ordering:** While any systems from set `A` are waiting or active, systems from set `B` cannot be started if they are incompatible with any remaining systems in `A`.
   - Usually, if-needed ordering is the correct tool for ordering groups of systems as it avoids unnecessary blocking.
   - If systems in `A` use interior mutability, if-needed ordering may result in non-deterministic outcomes.
   - Use `before_if_needed(label: impl SystemLabel)` or `after_if_needed(label: impl SystemLabel)`.
-- **At-least-once separation:** Systems in set `A` cannot be started if a system in set `B` has been started until at least one system with the label `S` has completed. Systems from `A` and `B` are incompatible with each other.
-  - This is most commonly used when commands created by systems in `A` must be processed before systems in `B` can run correctly.
-  - Use the `between(before: impl SystemLabel, after: impl SystemLabel)`, `before_and_seperated_by(before: impl SystemLabel, seperated_by: impl SystemLabel)` method or its `after` analogue.
-  - `hard_before` and `hard_after` are syntactic sugar for the common case where the separating system label is `CoreSystem::ApplyCommands`, the label used for the exclusive system that applies queued commands that is typically added to the beginning of each sequential stage.
-  - The order of arguments in `between` matters; the labels must only be separated by a cleanup system in one direction, rather than both.
-  - This methods do not automatically insert systems to enforce this separation: instead, the schedule will panic upon initialization as no valid system execution strategy exists.
-  - At-least-once separation also applies a special causal tie (see below): if a pair of `before` and `after` systems are enabled during a schedule pass, and all intervening `between` systems are disabled, the last intervening `between` system is forcibly enabled to ensure that cleanup occurs
+- **At-least-once separation:** If a type of side effect has been "produced" by any system, at least one "consuming" system must run before any system that "uses" that side effect can run.
+  - `Commands` are the most important side effect here, but others (such as indexing) may be added to the engine itself later.
+  - This design expands on the API in [RFC 36: Encoding Side Effect Lifecycles as Subgraphs](https://github.com/bevyengine/rfcs/pull/36).
+  - At-least-once separation also applies a special "causal tie" (see below): if a pair of `before` and `after` systems are enabled during a schedule pass, and all intervening `between` systems are disabled, the last intervening `between` system is forcibly enabled to ensure that cleanup occurs
 
 System ordering constraints operate across stages, but can only see systems within the same schedule.
 

From 4217724a9c49d0a54699114ba37df9bb6139e0d4 Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Wed, 20 Oct 2021 13:14:41 -0400
Subject: [PATCH 08/11] Basic constraint scheduling

---
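Note: the "Solving system ordering" steps added below can be sketched roughly as follows.
This is a minimal illustration under stated assumptions, not the scheduler's actual code: `Access`, `active_access` and `next_runnable` are hypothetical names, a `u64` stands in for the real bitset, and the split into read and write masks is an assumption that the RFC text does not spell out.

```rust
/// Hypothetical access set: one bit per resource or archetype-component.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Access {
    pub reads: u64,
    pub writes: u64,
}

impl Access {
    /// Two access sets are compatible unless one writes data the other reads or writes.
    pub fn is_compatible(&self, other: &Access) -> bool {
        (self.writes & (other.reads | other.writes)) == 0 && (other.writes & self.reads) == 0
    }

    /// Bitwise union of two access sets.
    pub fn union(&self, other: &Access) -> Access {
        Access {
            reads: self.reads | other.reads,
            writes: self.writes | other.writes,
        }
    }
}

/// Steps 5.1 and 5.2: rebuild the combined access of all currently running systems.
pub fn active_access(running: &[Access]) -> Access {
    running
        .iter()
        .fold(Access::default(), |acc, access| acc.union(access))
}

/// Steps 3 and 4.1: greedily pick the first waiting system whose access is
/// compatible with everything that is currently running.
pub fn next_runnable(waiting: &[Access], running: &[Access]) -> Option<usize> {
    let active = active_access(running);
    waiting
        .iter()
        .position(|candidate| candidate.is_compatible(&active))
}
```

Other ordering constraints (step 4.2) would add a second check next to `is_compatible`; they are left out of this sketch.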
 rfcs/34-system-graph-flavors.md | 48 ++++++++++++++++++++++++++++-----
 1 file changed, 41 insertions(+), 7 deletions(-)
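Note: the dependency counting described under "Strict ordering" and "If-needed ordering" in this patch might look something like the sketch below.
It assumes systems are identified by a plain index within one stage; `DependencyGraph` and its methods are hypothetical names, and the `incompatible` closure stands in for the data-access comparison.

```rust
/// Per-stage dependency bookkeeping, indexed by a hypothetical system index.
pub struct DependencyGraph {
    /// Number of unfinished systems that must complete before each system may start.
    pub dependency_count: Vec<usize>,
    /// Systems whose counters are decremented when this system completes.
    pub dependents: Vec<Vec<usize>>,
}

impl DependencyGraph {
    pub fn new(system_count: usize) -> Self {
        Self {
            dependency_count: vec![0; system_count],
            dependents: vec![Vec::new(); system_count],
        }
    }

    /// Strict ordering: every `before` system blocks every `after` system.
    pub fn add_strict(&mut self, before: &[usize], after: &[usize]) {
        self.add_edges(before, after, |_, _| true);
    }

    /// If-needed ordering: the same edges, but only between incompatible pairs.
    pub fn add_if_needed(
        &mut self,
        before: &[usize],
        after: &[usize],
        incompatible: impl Fn(usize, usize) -> bool,
    ) {
        self.add_edges(before, after, incompatible);
    }

    fn add_edges(
        &mut self,
        before: &[usize],
        after: &[usize],
        keep: impl Fn(usize, usize) -> bool,
    ) {
        for &b in before {
            for &a in after {
                // Skip self-edges and duplicates so the counters match `dependents`.
                if b != a && keep(b, a) && !self.dependents[b].contains(&a) {
                    self.dependents[b].push(a);
                    self.dependency_count[a] += 1;
                }
            }
        }
    }

    /// A system may start once all of its dependencies have completed.
    pub fn can_start(&self, system: usize) -> bool {
        self.dependency_count[system] == 0
    }

    /// Call exactly once when `system` completes, to release its dependents.
    pub fn complete(&mut self, system: usize) {
        for &dependent in &self.dependents[system] {
            self.dependency_count[dependent] -= 1;
        }
    }
}
```

Both flavors share `add_edges`, which mirrors the observation in the patch that the only difference between them is which edges are added.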

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index 67597759..751bfa4c 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -129,18 +129,52 @@ Simple changes:
 
 1. Commands should be processed in a standard exclusive system with the `CoreSystem::ApplyCommands` label.
 
-### Strict ordering
+At-least-once separation needs to be tackled in concert with [RFC 36: Encoding Side Effect Lifecycles as Subgraphs](https://github.com/bevyengine/rfcs/pull/36), as it relies on the side effect API to provide an ergonomic interface to these concepts.
 
-This algorithm is a restatement of the existing approach, as added in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144).
-It is provided here to establish a foundation on which we can build other ordering constraints.
+### Causal ties
 
-TODO: describe algorithm.
+### Solving system ordering
 
-### If-needed ordering
+The basics of the system scheduler are universal.
+This core structure is a restatement of the existing approach, as added in [Bevy #1144](https://github.com/bevyengine/bevy/pull/1144).
+It is provided here for clarity when explaining other approaches.
 
-TODO: describe the algorithm.
+1. Collect a list of systems that need to be run during the current stage.
+2. Describe the data access of each system, in terms of which resources and archetype-components it needs access to.
+3. Select one of the waiting systems.
+   1. Currently, this is simply done greedily: checking the next available system.
+4. Check if the system is allowed to run.
+   1. Systems cannot run if an incompatible (in terms of data access) system is currently running.
+      1. This can be checked efficiently by comparing the bitset of the system's data accesses to the bitset of the currently running data accesses.
+   2. Other system ordering constraints can also block a system from running, as described below.
+5. If it is allowed to run, run the system.
+   1. Rebuild the data accesses by taking a bitwise union of the data accesses of all currently running systems.
+   2. Once the system is complete, unlock its data by rebuilding the bitset again.
+6. Loop from 3 until all systems in the current stage are completed.
+7. Advance to the next stage and loop from 1.
+
+#### Strict ordering
+
+Systems cannot run if any of the systems that are "strictly before" them are still waiting or running.
+
+Naively, you could just check if any of the waiting or running systems have an appropriate label.
+However, this results in repeated, inefficient checks.
+Instead, we can precompute a representation of the dependency graph once, when the schedule for a given stage is built.
+
+Each system stores the number of **dependencies** it has: the dependency count must be equal to 0 in order for the system to run.
+Each system also stores a list of **dependent systems**: once this system completes, the number of dependencies in each of its dependent systems is decreased by one.
+
+We can count these dependencies by reviewing each strict ordering constraint.
+For each set of systems in the "before" set, add a dependency to each system in the "after" set.
+
+#### If-needed ordering
+
+Systems cannot run while any system that is "before if needed" relative to them is still waiting or running, but only if the two systems are incompatible.
+
+This can be implemented using the same dependency graph algorithm above: the only difference is which edges are added.
+For each if-needed dependency constraint added, an edge between the two systems is added if and only if they have incompatible data accesses.
 
-### At-least-once separation
+#### At-least-once separation
 
 TODO: describe the algorithm.
 

From 09a88f00550cfcc38d653920ca28728083eeb5f1 Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Wed, 20 Oct 2021 13:25:37 -0400
Subject: [PATCH 09/11] Causal tie implementation

---
 rfcs/34-system-graph-flavors.md | 11 +++++++++++
 1 file changed, 11 insertions(+)
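Note: the two causal tie checks added in this patch could be sketched as below.
This is only an illustration: `CausalTie`, `SystemEntry`, `validate_ties` and `propagate_ties` are hypothetical names, labels are plain strings, and the build-time check returns an `Err` that the caller could turn into the panic described in the text.
For simplicity it also re-checks every enabled upstream system on each pass rather than only the freshly enabled ones.

```rust
/// A causal tie between two hypothetical string labels.
pub struct CausalTie {
    pub upstream: &'static str,
    pub downstream: &'static str,
}

/// A system reduced to the fields the causal tie checks need.
pub struct SystemEntry {
    pub labels: Vec<&'static str>,
    pub enabled: bool,
}

fn has_label(system: &SystemEntry, label: &str) -> bool {
    system.labels.iter().any(|l| *l == label)
}

/// Build-time check: every tie with an upstream system present must also
/// have at least one downstream system in the schedule.
pub fn validate_ties(systems: &[SystemEntry], ties: &[CausalTie]) -> Result<(), String> {
    for tie in ties {
        let upstream_exists = systems.iter().any(|s| has_label(s, tie.upstream));
        let downstream_exists = systems.iter().any(|s| has_label(s, tie.downstream));
        if upstream_exists && !downstream_exists {
            return Err(format!(
                "no `{}` system exists to clean up after `{}`",
                tie.downstream, tie.upstream
            ));
        }
    }
    Ok(())
}

/// After run criteria are evaluated: enable downstream systems of enabled
/// upstream systems, repeating until no new systems are enabled.
pub fn propagate_ties(systems: &mut [SystemEntry], ties: &[CausalTie]) {
    loop {
        let mut changed = false;
        for tie in ties {
            let upstream_enabled = systems
                .iter()
                .any(|s| s.enabled && has_label(s, tie.upstream));
            if !upstream_enabled {
                continue;
            }
            for system in systems.iter_mut() {
                if !system.enabled && has_label(system, tie.downstream) {
                    system.enabled = true;
                    changed = true;
                }
            }
        }
        if !changed {
            break;
        }
    }
}
```

The outer loop is what lets chained ties resolve: enabling a downstream system may make it the upstream trigger of another tie on the next pass.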

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index 751bfa4c..1227c3f2 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -133,6 +133,17 @@ At-least-once separation needs to be tackled in concert with [RFC 36: Encoding S
 
 ### Causal ties
 
+Causal ties add two small complications to the scheduler.
+
+First, when the schedule is built, we must check that the downstream systems required to perform cleanup exist.
+For each causal tie, check if any systems with the upstream label are in the schedule.
+If there are, check if any systems with the downstream label are in the schedule.
+If none exist, panic.
+
+Second, after run criteria are evaluated, each causal tie should be checked.
+If any upstream systems for that causal tie are freshly enabled, also enable any of their downstream systems.
+Check all other causal ties and repeat until no new systems are enabled.
+
 ### Solving system ordering
 
 The basics of the system scheduler are universal.

From 85b21f918cd627d03c2210d16ad2af100014211d Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Wed, 20 Oct 2021 14:48:56 -0400
Subject: [PATCH 10/11] At-least-once separation implementation details

---
 rfcs/34-system-graph-flavors.md | 43 +++++++++++++++++++++++++++------
 1 file changed, 36 insertions(+), 7 deletions(-)
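Note: the side effect tracking described under "At-least-once separation" in this patch amounts to a small state machine per side effect.
The sketch below assumes a single side effect and a hypothetical `Role` enum describing how a system relates to it; only the `SideEffect::Clean` and `SideEffect::Dirty` states come from the RFC text.

```rust
/// State of one side effect, as named in the RFC text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SideEffect {
    Clean,
    Dirty,
}

/// How a system relates to the tracked side effect (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Role {
    Produces,
    Consumes,
    Uses,
}

/// Tracks a single side effect for the duration of a schedule pass.
pub struct SideEffectTracker {
    state: SideEffect,
}

impl SideEffectTracker {
    pub fn new() -> Self {
        Self { state: SideEffect::Clean }
    }

    pub fn state(&self) -> SideEffect {
        self.state
    }

    /// Systems that use the side effect are blocked while it is dirty.
    pub fn can_start(&self, role: Role) -> bool {
        !(role == Role::Uses && self.state == SideEffect::Dirty)
    }

    /// Starting a producer dirties the side effect.
    pub fn on_start(&mut self, role: Role) {
        if role == Role::Produces {
            self.state = SideEffect::Dirty;
        }
    }

    /// Completing a consumer cleans the side effect.
    pub fn on_complete(&mut self, role: Role) {
        if role == Role::Consumes {
            self.state = SideEffect::Clean;
        }
    }
}
```

Because the state flips back to clean after any one consumer completes, a chain such as producer, consumer, user, producer, consumer, user remains schedulable, which `A.before(B)` plus `B.before(C)` would rule out.
As the patch notes, this alone does not detect an unsatisfiable schedule: if no consumer ever runs, users simply never start.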

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/34-system-graph-flavors.md
index 1227c3f2..ce7e1401 100644
--- a/rfcs/34-system-graph-flavors.md
+++ b/rfcs/34-system-graph-flavors.md
@@ -51,7 +51,7 @@ There are several **flavors** of system ordering constraints, each with their ow
 - **At-least-once separation:** If a type of side effect has been "produced" by any system, at least one "consuming" system must run before any system that "uses" that side effect can run.
   - `Commands` are the most important side effect here, but others (such as indexing) may be added to the engine itself later.
   - This design expands on the API in [RFC 36: Encoding Side Effect Lifecycles as Subgraphs](https://github.com/bevyengine/rfcs/pull/36).
-  - At-least-once separation also applies a special "causal tie" (see below): if a pair of `before` and `after` systems are enabled during a schedule pass, and all intervening `between` systems are disabled, the last intervening `between` system is forcibly enabled to ensure that cleanup occurs
+  - At-least-once separation also applies a special "causal tie" (see below): if a pair of `before` and `after` systems are enabled during a schedule pass, and all intervening `between` systems are disabled, the last intervening `between` system is forcibly enabled to ensure that cleanup occurs.
 
 System ordering constraints operate across stages, but can only see systems within the same schedule.
 
@@ -176,7 +176,7 @@ Each system stores the number of **dependencies** it has: the dependency count m
 Each system also stores a list of **dependent systems**: once this system completes, the number of dependencies in each of its dependent systems is decreased by one.
 
 We can count these dependencies by reviewing each strict ordering constraint.
-For each set of systems in the "before" set, add a dependency to each system in the "after" set.
+For each system in the "before" set, add a dependency to each system in the "after" set that is in the same stage.
 
 #### If-needed ordering
 
@@ -187,7 +187,31 @@ For each if-needed dependency constraint added, an edge between the two systems
 
 #### At-least-once separation
 
-TODO: describe the algorithm.
+Under an at-least-once-separated constraint, systems from the "after" set (`C`) cannot run if at least one system from the "before" set (`A`) has run, until at least one system from the "between" set (`B`) has run.
+
+Even ignoring the causal ties induced, this is by far the most complex constraint.
+Unfortunately, this pattern arises naturally whenever side effect cleanup is involved, and so must be handled properly by the scheduler itself.
+
+We cannot simply restate this as `A.before(B)` and `B.before(C)`: schedules produced by this will meet the requirements, but be overly strict, breaking our ability to express logic in important ways.
+This will force *all* of the separating systems to run before any of the "after" systems can start, making non-trivial chains impossible.
+
+Instead, we must track side effects.
+Each side effect is either in the `SideEffect::Clean` or `SideEffect::Dirty` state.
+When a system that produces a side effect is started, that side effect moves to the `SideEffect::Dirty` state.
+When a system that consumes a side effect completes, that side effect moves to the `SideEffect::Clean` state.
+Systems which use a side effect cannot be started if that side-effect is in the `SideEffect::Dirty` state.
+
+This solves the immediate scheduling problem, but does not tell us whether or not our schedule is unsatisfiable, and does not tell us when we must run our cleanup systems.
+The scheduler simply silently and nondeterministically hanging is not a great outcome!
+
+There are two options here:
+
+- users manually add cleanup systems at the appropriate points, and we write and then run a complex special-cased verifier
+  - critically, we need to make sure that the logic is never broken under *any* possible valid strategy
+- the scheduler automatically inserts cleanup systems where needed
+  - this avoids the need to add causal ties automatically
+  - this significantly improves the ordinary user experience by reducing errors and complexity
+  - this *might* worsen performance over a hand-optimized schedule, as more cleanup systems may be added than is strictly needed (or they may be added at suboptimal times)
 
 ### Schedules own systems
 
@@ -258,15 +282,20 @@ Additional helpful technical foundations and analysis can be found in the [linke
 
 ## Unresolved questions
 
-1. How *precisely* should schedules store the systems in them?
+1. How should dependencies between systems that are in different stages be handled?
+   1. For linear schedules, this is simple: just panic if the relative order is wrong.
+   2. Nonlinear schedules (as discussed at length in [RFC #35: (Hierarchical) Stage Labels, Conditions, and Transitions](https://github.com/bevyengine/rfcs/pull/35)) are much more complex.
+   3. We could just eliminate the distinction between parallel and exclusive stages completely, merging this all into one big parallel stage. Then, any constraints outside of the stage panic, as is the case on 0.5 (but you care less, because linear structures will have only one stage).
+2. Should we use manual or automatic insertion of cleanup systems?
+   1. If we use manual scheduling, what algorithm do we want to use to verify satisfiability?
+   2. If we use automatic scheduling, what strategy should be followed?
 
 ## Future possibilities
 
 These additional constraints lay the groundwork for:
 
-1. More elegant, efficient versions of higher-level APIs such as a [system graph specification].
-2. Techniques for automatic schedule construction, where the scheduler is permitted to infer hard sync points and insert cleanup systems as needed to meet the constraints posed.
-3. [Robust schedule-configurable plugins](https://github.com/bevyengine/rfcs/pull/33).
+1. More elegant, efficient versions of higher-level APIs such as a [system graph specification](https://github.com/bevyengine/bevy/pull/2381) or a [side effects](https://github.com/bevyengine/rfcs/pull/36) abstraction.
+2. [Robust schedule-configurable plugins](https://github.com/bevyengine/rfcs/pull/33).
 
 In the future, pre-computed schedules that are loaded from disk could help mitigate some of the startup performance costs of this analysis.
 Such an approach would only be worthwhile for production apps.

From 1439f18b6bd78aa01d6b15ca71bc41da678c5ad9 Mon Sep 17 00:00:00 2001
From: Alice <alice.i.cecile@gmail.com>
Date: Thu, 21 Oct 2021 01:31:33 -0400
Subject: [PATCH 11/11] Fix RFC number

---
 rfcs/{34-system-graph-flavors.md => 38-system-graph-flavors.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename rfcs/{34-system-graph-flavors.md => 38-system-graph-flavors.md} (100%)

diff --git a/rfcs/34-system-graph-flavors.md b/rfcs/38-system-graph-flavors.md
similarity index 100%
rename from rfcs/34-system-graph-flavors.md
rename to rfcs/38-system-graph-flavors.md