glfmn.io ecs-1(1)
Name

ecs-1

Understanding bevy's ECS implementation.

Parts

  1. Part 0

    The why and how of making an entity-component-system for personal use.

  2. Part 1*

Description

A tour of the bevy_ecs v0.12.1 entity and component storage.

Bevy

Bevy is a modular game engine based around an ECS, or Entity, Component, System architecture; to quote the README (emphasis mine):

ECS is a software pattern that involves breaking your program up into Entities, Components, and Systems. Entities are unique things that are assigned groups of Components, which are then processed using Systems.

Additionally, bevy_ecs sports some fancy features like robust change detection, dense and sparse entity storage, archetypes, and more cool stuff that we will surely be getting into.

This design is principally useful because of the flexibility it provides in designing and implementing a game, and secondarily because it can also have great performance characteristics when implemented correctly.

I've implemented a couple of simple projects using Bevy, and while it was still quite immature at the time, it was a great experience and I highly recommend trying it out if you are interested.

Entity

I started this because I was specifically curious about how bevy accesses an Entity's data, so it seems a natural place to start is with the Entity type itself:

src/entity/mod.rs
#[repr(C, align(8))]
pub struct Entity {
    // Do not reorder the fields here. The ordering is explicitly used by repr(C)
    // to make this struct equivalent to a u64.
    #[cfg(target_endian = "little")]
    index: u32,
    generation: NonZeroU32,
    #[cfg(target_endian = "big")]
    index: u32,
}
  1. index

    A number which indicates the identity of the Entity; if no entities are ever freed, this number just increments.

  2. generation

    Increments every time an entity index is re-used, preventing accesses of the wrong data through a stale Entity. Additionally, generation is NonZeroU32 as a niche optimization, which allows Option<Entity> and Entity to have the same size.

Note that the memory layout is tightly controlled and optimized using the #[cfg] and #[repr] attributes. This allows codegen to use single integer instructions for many operations.
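
To see both the 8-byte layout and the Option niche in action, here is a minimal sketch using a stand-in struct that mirrors Entity's fields; MyEntity is illustrative, not bevy's type, and on current rustc these assertions hold:

use std::mem::size_of;
use std::num::NonZeroU32;

// Stand-in with the same field layout as Entity (illustrative only).
#[repr(C, align(8))]
struct MyEntity {
    index: u32,
    generation: NonZeroU32,
}

fn main() {
    assert_eq!(size_of::<MyEntity>(), 8);
    // NonZeroU32 can never be zero, so Option can use that spare value as its
    // None representation instead of adding a discriminant:
    assert_eq!(size_of::<Option<MyEntity>>(), 8);
    // Without a niche, Option pays for a discriminant plus padding:
    assert_eq!(size_of::<Option<u64>>(), 16);
}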

If you have a simple ECS, then you have 1 array for every type of component, where the entity is an index directly into those arrays; but if not every entity has every component, then the ECS needs to store extra data to indicate the presence or absence of each type of component for each entity. The simplest approach would be to use Option<Component>, but this adds both runtime and memory cost.
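
As a hedged illustration of that simple approach and its cost (Position, Velocity, and NaiveWorld are hypothetical, not bevy types):

// A naive ECS sketch: one array per component type, the entity is a plain index.
struct Position { x: f32, y: f32 }
struct Velocity { x: f32, y: f32 }

struct NaiveWorld {
    // Every entity pays for a slot in every array, present or not, and every
    // access pays for the Option check.
    positions: Vec<Option<Position>>,
    velocities: Vec<Option<Velocity>>,
}

fn movement(world: &mut NaiveWorld) {
    for (pos, vel) in world.positions.iter_mut().zip(&world.velocities) {
        if let (Some(pos), Some(vel)) = (pos, vel) {
            pos.x += vel.x;
            pos.y += vel.y;
        }
    }
}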

To combat this fragmentation, bevy uses Archetypes; an archetype is essentially a pool of entity data for entities which share the same components. This means that by finding the right archetype, you know the set of components the entity has, negating the need to explicitly store any data indicating the presence or absence of a component. The information is now stored out of band.

But then that introduces a problem: how do you efficiently look up a particular component from its entity ID?

Here Bevy introduces a new type Entities which handles entity lookup and allocation:

src/entity/mod.rs
pub struct Entities {
    meta: Vec<EntityMeta>,
    pending: Vec<u32>,
    free_cursor: AtomicIdCursor,
    /// Stores the number of free entities for [`len`](Entities::len)
    len: u32,
}
  • meta

    Contains the information necessary to look up the entity's component data.

  • pending

    Combines the free list and the list of allocated entities that haven't yet been flushed into meta.

  • free_cursor

    Separates the entity indices in pending which have been allocated from the ones that are still free and unreserved. It is atomic to allow handing out entities across threads. The free_cursor counts down from the end of the pending list to make flushing more efficient.

If we examine the EntityMeta in more detail, we see that it contains an EntityLocation and a generation; the generation must match the one indicated on the Entity for the lookup to succeed, preventing use-after-free. The EntityLocation is more interesting:

src/entity/mod.rs
pub struct EntityLocation {
    pub archetype_id: ArchetypeId,
    pub archetype_row: ArchetypeRow,
    pub table_id: TableId,
    pub table_row: TableRow,
}

Here we see the indices necessary to look up the Archetype where the component data lives, and then find the entity's data within the Archetype. But I was also surprised to see a TableId and TableRow; we will revisit those later and explore the Archetype first.
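
Before moving on, here is a simplified sketch of the generation-checked lookup described above; the types are pared-down stand-ins, not bevy's definitions:

use std::num::NonZeroU32;

// Pared-down stand-ins for bevy's types (illustrative only).
#[derive(Clone, Copy)]
struct Entity { index: u32, generation: NonZeroU32 }

#[derive(Clone, Copy)]
struct EntityLocation { archetype_id: u32, archetype_row: u32, table_id: u32, table_row: u32 }

struct EntityMeta { generation: NonZeroU32, location: EntityLocation }

// The lookup only succeeds when the handle's generation matches the slot's
// current generation; a stale Entity from a freed slot fails the check.
fn locate(meta: &[EntityMeta], entity: Entity) -> Option<EntityLocation> {
    let slot = meta.get(entity.index as usize)?;
    (slot.generation == entity.generation).then_some(slot.location)
}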

Archetype

To quote the bevy_ecs documentation:

An archetype uniquely describes a group of entities that share the same components: a world only has one archetype for each unique combination of components, and all entities that have those components and only those components belong to that archetype.

A lot of details in the Archetype implementation exist specifically to support sparse component storage, so let's dive into the what and why of sparse storage.

Sparse Entity Storage

An Archetypal ECS optimizes for fast iteration over entity data by placing Entities into buckets (Archetypes) where the component data can be densely packed into arrays, also known as Columns.

One consequence of this, however, is that when a System adds or removes components from an Entity, the Entity moves to a new Archetype, and the old Archetype must remove all of the Entity's components from each of its Columns. The Entity can be in any row of the Column, which means ordering cannot be preserved without fragmenting the data within the Column. Here we can see how the cost quickly snowballs:

  • Touching every Column for every component in the new and old Archetypes
  • Updating the EntityMeta within the global Entities struct to point to the new ArchetypeId and ArchetypeRow
  • Updating the EntityMeta for any entity in the old Archetype that moved to a different ArchetypeRow to preserve density

bevy_ecs provides sparse storage to offer an alternate, more efficient code path for frequently added and removed components. It provides an extra level of indirection: the Table. As we will see shortly, the Archetype keeps an extra lookup array which allows one to go from the virtual ArchetypeRow to the TableRow where the component data is actually stored. The result is that when only sparse components change, most of the component data can stay in place without moving.

Let's dive into how bevy_ecs defines an Archetype:

src/archetype.rs
pub struct Archetype {
    id: ArchetypeId,
    table_id: TableId,
    edges: Edges,
    entities: Vec<ArchetypeEntity>,
    components: ImmutableSparseSet<ComponentId, ArchetypeComponentInfo>,
}
  1. id

    A type-safe index that represents this Archetype, used for runtime lookup.

  2. table_id

    Look-up index of the table which holds the data for the Archetype's components.

  3. edges

    When the user adds a component to or removes a component from an entity, it must move to a different Archetype. To avoid doing type reflection to find the new Archetype on every component change, Edges caches the results of these moves.

  4. entities

    A lookup table which stores the entity and its row within the table; essentially a virtual mapping.

  5. components

    Metadata about each component stored within the Archetype; it is unique to each Archetype and is used to support multithreading.

Let's take a deeper look at the entities lookup array:

src/archetype.rs
pub struct ArchetypeEntity {
    entity: Entity,
    table_row: TableRow,
}

As we can see here, the Archetype has a mapping to the backing TableRow which allows access to the Component data, so to go from the Entity to its data requires a double-indirection when the access goes through the Archetype.
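
A simplified sketch of that double hop, using stand-in types rather than bevy's, looks like this:

// Stand-in types; the real ones live in bevy_ecs's archetype and table modules.
struct ArchetypeEntity { entity: u64, table_row: usize }
struct Archetype { entities: Vec<ArchetypeEntity> }

fn component_at<'a, T>(
    archetype: &Archetype,
    column: &'a [T],       // the Column holding one component type's data
    archetype_row: usize,  // from EntityLocation::archetype_row
) -> &'a T {
    // First hop: ArchetypeRow -> TableRow.
    let table_row = archetype.entities[archetype_row].table_row;
    // Second hop: TableRow -> the component data itself.
    &column[table_row]
}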

When iterating through the Archetype, we are also able to access the ID and Generation of every Entity contained within, which is necessary and useful for some of bevy_ecs's features; we will see many places which cache a copy of the Entity for various reasons, from safety to performance.

Table

The table is a struct-of-arrays style data structure. It contains a set of columns, one per component type used by the archetype. Each column is backed by a custom BlobVec type which stores the actual component data.

src/storage/table.rs
pub struct Table {
    columns: ImmutableSparseSet<ComponentId, Column>,
    entities: Vec<Entity>,
}

The ImmutableSparseSet maps from the ComponentId to the Column, where the ComponentId comes from the underlying component type's TypeId.

src/storage/table.rs
pub struct Column {
    data: BlobVec,
    added_ticks: Vec<UnsafeCell<Tick>>,
    changed_ticks: Vec<UnsafeCell<Tick>>,
}

In addition to storage, the Column is responsible for change detection using Ticks, which store when component data was added or changed. By comparing the current Tick to the added_ticks or changed_ticks, a Query can filter components based on these attributes.
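
As a rough sketch of that comparison (bevy's real Tick handles counter wrap-around; this simplification uses plain integers):

#[derive(Clone, Copy)]
struct Tick(u32);

// A component counts as changed for a system if it was written after the
// system's previous run and no later than the current run.
fn is_changed(changed: Tick, last_run: Tick, this_run: Tick) -> bool {
    changed.0 > last_run.0 && changed.0 <= this_run.0
}

fn main() {
    let (last_run, this_run) = (Tick(10), Tick(20));
    assert!(is_changed(Tick(15), last_run, this_run));  // written since last run
    assert!(!is_changed(Tick(5), last_run, this_run));  // already seen
}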

BlobVec

An Archetypal ECS deals with many types that are only known at runtime; many data structures which require a compile-time known size, alignment, Drop function, etc., are no longer usable because they can't easily deal with type-erased data.

This is where the BlobVec comes in: it uses runtime type information to store the data as type-erased blobs of u8. Otherwise, it is essentially just a dynamically allocated array.

src/storage/blob_vec.rs
pub(super) struct BlobVec {
    item_layout: Layout,
    capacity: usize,
    len: usize,
    data: NonNull<u8>,
    drop: Option<unsafe fn(OwningPtr<'_>)>,
}
  1. item_layout

    Stores the alignment and size of the data in the BlobVec in order to properly store and access the data. Note that the Rust std defines the Layout type. Special care is taken to ensure that Layout's size and alignment properly reflect the offset needed between elements in an array.

  2. len

    Number of elements, not bytes, used for bounds checking accesses.

  3. capacity

    Number of elements that will fit into the currently allocated memory.

  4. data

    Pointer to the array data.

  5. drop

    Pointer to the drop function for the stored type, present only if the type actually needs to be dropped.
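
To make the role of item_layout concrete, here is a small sketch (not bevy's code) of the pointer arithmetic a type-erased array needs, where the padded layout size gives the stride between elements:

use std::alloc::Layout;

// Compute the address of element `index` in a type-erased array, given the
// item layout; padding the layout to its alignment yields the array stride.
unsafe fn element_ptr(data: *mut u8, item_layout: Layout, index: usize) -> *mut u8 {
    let stride = item_layout.pad_to_align().size();
    data.add(index * stride)
}

fn main() {
    let layout = Layout::new::<u64>();
    let mut storage = [0u8; 32]; // room for four u64-sized elements
    let base = storage.as_mut_ptr();
    let third = unsafe { element_ptr(base, layout, 2) };
    assert_eq!(third as usize - base as usize, 16);
}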

Runtime Type Metadata

bevy_ecs does a lot of work with types only known at runtime; data usually known at compile time must be represented explicitly. We've seen a few places where the ECS uses runtime ComponentIds to look up component metadata.

An important observation to make is that, when using the bevy_ecs API to query for components, the types are statically known when the query type is defined, or when components are inserted for a particular entity. The types are dynamic mostly on the archetype side. This provides good entry-points for registering types with the ECS.

So, let's dig a bit deeper into bevy's runtime type handling starting with what bevy_ecs needs to know about component types:

src/component.rs
pub struct ComponentDescriptor {
    name: Cow<'static, str>,
    storage_type: StorageType,
    is_send_and_sync: bool,
    type_id: Option<TypeId>,
    layout: Layout,
    drop: Option<for<'a> unsafe fn(OwningPtr<'a>)>,
}
  1. name

    Name of the component type.

  2. storage_type

    Users of bevy_ecs decide whether a particular component is sparse or dense at compile time; when the type is erased, that information is stored here in the runtime metadata for that component type.

  3. is_send_and_sync

    Necessary for scheduling multithreaded access.

  4. type_id

    If known, the actual rust-compiler generated TypeId for the component type.

  5. layout

    Layout information needed for storing the component type that we see used in the BlobVec.

  6. drop

    The drop function for this type, if it has one, also used in the BlobVec.

We can see that the ComponentDescriptor's new function computes all of this data directly from the type:

src/component.rs
pub fn new<T: Component>() -> Self {
    Self {
        name: Cow::Borrowed(std::any::type_name::<T>()),
        storage_type: T::Storage::STORAGE_TYPE,
        is_send_and_sync: true,
        type_id: Some(TypeId::of::<T>()),
        layout: Layout::new::<T>(),
        drop: needs_drop::<T>().then_some(Self::drop_ptr::<T> as _),
    }
}
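
To see how an erased drop function like the drop_ptr above can be produced and later invoked, here is a standalone sketch in the same spirit; drop_erased and erased_metadata are illustrative names, not bevy's:

use std::alloc::Layout;
use std::mem::{needs_drop, ManuallyDrop};

// Re-create the concrete type behind the erased pointer and drop it in place.
unsafe fn drop_erased<T>(ptr: *mut u8) {
    std::ptr::drop_in_place(ptr.cast::<T>());
}

// Capture the same kind of metadata a ComponentDescriptor holds.
fn erased_metadata<T>() -> (Layout, Option<unsafe fn(*mut u8)>) {
    (
        Layout::new::<T>(),
        needs_drop::<T>().then_some(drop_erased::<T> as unsafe fn(*mut u8)),
    )
}

fn main() {
    let (layout, drop_fn) = erased_metadata::<String>();
    assert_eq!(layout.size(), std::mem::size_of::<String>());
    assert!(drop_fn.is_some()); // String owns heap memory, so it needs Drop

    // Invoke the erased drop function on a real value without knowing its type.
    let mut value = ManuallyDrop::new(String::from("hello"));
    if let Some(drop_fn) = drop_fn {
        unsafe { drop_fn(&mut *value as *mut String as *mut u8) };
    }
}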

In order to have access to this information at runtime, bevy_ecs keeps a global Components struct which functions a little bit like the Entities struct we saw earlier:

src/component.rs
pub struct Components {
    components: Vec<ComponentInfo>,
    indices: TypeIdMap<ComponentId>,
    resource_indices: TypeIdMap<ComponentId>,
}
  1. components

    Stores the ComponentInfo (which wraps the ComponentDescriptor) for every known component type in the ECS world.

  2. indices

    Mapping from the type to an index in the components array, wrapped in the ComponentId type for type safety.

  3. resource_indices

    bevy_ecs allows systems to request access to a single global instance of a piece of data (called Resources) in the same thread-safe way as components. This is an alternative to global variables or singleton objects. This map allows the components array to also store resource runtime type metadata.

We can confirm this by looking into the init_component and init_component_inner functions which are the entry-points for component metadata into the system.

First, the function checks the indices map for the TypeId to get an existing ComponentId; otherwise it inserts a new component descriptor:

src/component.rs
pub fn init_component<T: Component>(&mut self, storages: &mut Storages) -> ComponentId {
    let type_id = TypeId::of::<T>();

    let Components {
        indices,
        components,
        ..
    } = self;
    *indices.entry(type_id).or_insert_with(|| {
        Components::init_component_inner(
            components, storages, ComponentDescriptor::new::<T>()
        )
    })
}

The function inserts the descriptor at the end of the components array, with the new ComponentId containing the index of the new descriptor for easy retrieval later.

src/component.rs
#[inline]
fn init_component_inner(
    components: &mut Vec<ComponentInfo>,
    storages: &mut Storages,
    descriptor: ComponentDescriptor,
) -> ComponentId {
    let component_id = ComponentId(components.len());
    let info = ComponentInfo::new(component_id, descriptor);
    if info.descriptor.storage_type == StorageType::SparseSet {
        storages.sparse_sets.get_or_insert(&info);
    }
    components.push(info);
    component_id
}
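
For completeness, here is a small usage sketch of this registration path through the public World API, assuming bevy_ecs 0.12 as a dependency and a hypothetical Position component:

use bevy_ecs::prelude::*;

#[derive(Component)]
struct Position { x: f32, y: f32 }

fn main() {
    let mut world = World::new();
    // Registering the same component type twice yields the same ComponentId,
    // thanks to the or_insert_with on the TypeId map above.
    let first = world.init_component::<Position>();
    let second = world.init_component::<Position>();
    assert_eq!(first, second);
}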

Fragmentation Redux

Recall the Archetype struct:

src/archetype.rs
pub struct Archetype {
    id: ArchetypeId,
    table_id: TableId,
    edges: Edges,
    entities: Vec<ArchetypeEntity>,
    components: ImmutableSparseSet<ComponentId, ArchetypeComponentInfo>,
}

If we wanted to iterate through all the Entities in this Archetype—which is the access pattern bevy_ecs is generally designed to support—we might iterate through the entities array and use the TableRow to retrieve the component data for the components of interest. However, the TableRows that correspond to each ArchetypeEntity aren't necessarily linear in memory. Thus, it is possible to have some internal "fragmentation" within a Table as a result of having sparse components.

Having sparse components can therefore increase the iteration time of non-sparse component storage too. I suspect bevy has ways of working around this, and I have a feeling this is why the EntityMeta stores the TableId and TableRow:

  • This allows for direct lookup of the component data when the request is for a single entity, bypassing the archetypes completely.
  • If only dense components are requested, it should be possible to grab the matching Tables directly and skip some of the associated Archetypes.

To get more insight there, we need to start digging into the ways bevy_ecs allows a System to access its components.

Queries

In bevy_ecs, systems are normal Rust functions whose parameters, via generics, request access to various bits of data such as Components and Resources.

The way to request components is through a Query, which is a generic type that encodes which types it needs to access through its type parameters. From the official bevy examples:

bevy/examples/ecs/ecs_guide.rs
// This system updates the score for each entity with the "Player" and "Score" component.
fn score_system(mut query: Query<(&Player, &mut Score)>) {
    for (player, mut score) in &mut query {
        let scored_a_point = random::<bool>();
        if scored_a_point {
            score.value += 1;
            println!(
                "{} scored a point! Their score is: {}",
                player.name, score.value
            );
        } else {
            println!(
                "{} did not score a point! Their score is: {}",
                player.name, score.value
            );
        }
    }
}

The Query<(&Player, &mut Score)> is a struct that uses its generic parameters to capture, at compile time, the data necessary to execute the query at runtime. The Query provides several methods for iterating through components, but also implements IntoIterator to work seamlessly with Rust's for loops, as in the example above.

If we dig into the Query, we see that it mostly functions as a wrapper to the QueryState, bundling the extra data necessary for things like change detection and accessing data from the ECS world:

src/system/query.rs
pub struct Query<'world, 'state, D: QueryData, F: QueryFilter = ()> {
    world: UnsafeWorldCell<'world>,
    state: &'state QueryState<D, F>,
    last_run: Tick,
    this_run: Tick,
    force_read_only_component_access: bool,
}
  1. world

    A pointer to the world, used to access the data being queried. Bevy checks accesses to ensure that Rust's XOR mutability rules are respected when systems are run concurrently.

  2. state

    The query forwards its generic parameters to the QueryState which caches state necessary for iterating through Archetypes and Tables to access components.

  3. last_run, this_run

    bevy_ecs supports change detection by comparing Ticks between when the query is executed and when component data is updated. (See also the Ticks on the Column.)

  4. force_read_only_component_access

    From the comments, this appears to be a safety hack.

Looking at the code example above, the entry point for iterating through the components requested by the Query is iter_mut:

src/system/query.rs
#[inline]
pub fn iter_mut(&mut self) -> QueryIter<'_, 's, D, F> {
    // SAFETY: `self.world` has permission to access the required components.
    unsafe {
        self.state
            .iter_unchecked_manual(self.world, self.last_run, self.this_run)
    }
}

We can see that the implementation lives a couple of levels deep within the returned QueryIter's QueryIterationCursor::next function, which is created from the matched tables for the QueryState. The function is quite long, so we will break things down.

src/query/iter.rs
unsafe fn next(
    &mut self,
    tables: &'w Tables,
    archetypes: &'w Archetypes,
    query_state: &'s QueryState<D, F>,
) -> Option<D::Item<'w>>

Because component storage types are statically known, the QueryData knows ahead of time whether it only touches dense storage; the iteration cursor extracts this information for its own use:

src/query/iter.rs
const IS_DENSE: bool = D::IS_DENSE && F::IS_DENSE;

This enables a top-level if Self::IS_DENSE check that iterates through each matched Table directly instead of each matched Archetype.

The Table-based/dense branch (with some comments stripped):

src/query/iter.rs
loop {
    // we are on the beginning of the query, or finished processing a table, so
    // skip to the next
    if self.current_row == self.current_len {
        let table_id = self.table_id_iter.next()?;
        let table = tables.get(*table_id).debug_checked_unwrap();
        D::set_table(&mut self.fetch, &query_state.fetch_state, table);
        F::set_table(&mut self.filter, &query_state.filter_state, table);
        self.table_entities = table.entities();
        self.current_len = table.entity_count();
        self.current_row = 0;
        continue;
    }

    let entity = self.table_entities.get_unchecked(self.current_row);
    let row = TableRow::from_usize(self.current_row);
    if !F::filter_fetch(&mut self.filter, *entity, row) {
        self.current_row += 1;
        continue;
    }

    let item = D::fetch(&mut self.fetch, *entity, row);

    self.current_row += 1;
    return Some(item);
}

We can see that the iterator has the following structure:

  1. Iterate through the matched tables until there aren't any
    (notice the self.table_id_iter.next()?)
  2. Use the filter to skip some entities
    (used for things like change detection)
  3. Fetch the component data for the entity.

Compare this to the archetype-based/sparse implementation (again with some comments stripped):

src/query/iter.rs
loop {
    if self.current_row == self.current_len {
        let archetype_id = self.archetype_id_iter.next()?;
        let archetype = archetypes.get(*archetype_id).debug_checked_unwrap();
        let table = tables.get(archetype.table_id()).debug_checked_unwrap();
        D::set_archetype(&mut self.fetch, &query_state.fetch_state, archetype, table);
        F::set_archetype(
            &mut self.filter,
            &query_state.filter_state,
            archetype,
            table,
        );
        self.archetype_entities = archetype.entities();
        self.current_len = archetype.len();
        self.current_row = 0;
        continue;
    }

    let archetype_entity = self.archetype_entities.get_unchecked(self.current_row);
    if !F::filter_fetch(
        &mut self.filter,
        archetype_entity.id(),
        archetype_entity.table_row(),
    ) {
        self.current_row += 1;
        continue;
    }

    let item = D::fetch(
        &mut self.fetch,
        archetype_entity.id(),
        archetype_entity.table_row(),
    );
    self.current_row += 1;
    return Some(item);
}

The structure is very similar, except with the extra level of indirection grabbing the archetype_entity from the archetype, and using that to fetch from the Table.

From here we can start to look at the QueryData trait. QueryData is a subtrait of WorldQuery; in practice most types implement WorldQuery together with either QueryData or QueryFilter, and most of the required methods actually live on WorldQuery.

The most interesting implementation right now is the one for &T where T: Component, which shows how bevy_ecs lets a system access components.

Let us first peer into the associated types on WorldQuery:

src/query/fetch.rs
type Item<'w> = &'w T;
type Fetch<'w> = ReadFetch<'w, T>;
type State = ComponentId;

Of particular interest is the ReadFetch, which stores the data necessary to fetch the component:

src/query/fetch.rs
#[doc(hidden)]
pub struct ReadFetch<'w, T> {
    table_components: Option<ThinSlicePtr<'w, UnsafeCell<T>>>,
    sparse_set: Option<&'w ComponentSparseSet>,
}
  1. table_components

    A custom slice pointer that does no bounds checking in release mode and which points directly to the slice of all component data for this component type in the current table.

  2. sparse_set

    Reference to a sparse set which contains the data for the set of sparse components of this type in the current archetype.

As you can see, it contains data for accessing both sparse and dense components. Before calls to WorldQuery::fetch, the QueryIterationCursor initializes the data inside the read fetch via the WorldQuery::set_archetype or WorldQuery::set_table generic methods, depending on whether this particular component is sparse or dense.

src/query/fetch.rs
unsafe fn fetch<'w>(
    fetch: &mut ReadFetch<'w, T>,
    entity: Entity,
    table_row: TableRow,
) -> Self::Item<'w> {
    match T::Storage::STORAGE_TYPE {
        StorageType::Table => fetch
            .table_components
            .debug_checked_unwrap()
            .get(table_row.as_usize())
            .deref(),
        StorageType::SparseSet => fetch
            .sparse_set
            .debug_checked_unwrap()
            .get(entity)
            .debug_checked_unwrap()
            .deref(),
    }
}

Again, because the STORAGE_TYPE is statically known, the branch doesn't exist at runtime when the fetch function gets inlined. This allows dense components to always go directly through the table for accesses, but allows any sparse components to go through the sparse component set.
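
Here is a tiny, self-contained sketch of that pattern; Stored, DENSE, and access_path are hypothetical stand-ins, not bevy names. Because the constant is known for each concrete type, the compiler can discard the untaken branch during monomorphization:

// Hypothetical stand-ins for the storage-type constant on a component.
trait Stored { const DENSE: bool; }

struct Health;
impl Stored for Health { const DENSE: bool = true; }

struct Poisoned;
impl Stored for Poisoned { const DENSE: bool = false; }

// After monomorphization, only one arm of this branch survives per type.
fn access_path<T: Stored>() -> &'static str {
    if T::DENSE { "table" } else { "sparse set" }
}

fn main() {
    assert_eq!(access_path::<Health>(), "table");
    assert_eq!(access_path::<Poisoned>(), "sparse set");
}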

The trick for allowing queries with multiple components is to implement WorldQuery for tuples, and having each item in the tuple store any necessary state.
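
A minimal, self-contained sketch of the tuple pattern, using a hypothetical Fetchable trait in place of the real WorldQuery, might look like this:

// Hypothetical stand-in trait, not the real bevy_ecs WorldQuery.
trait Fetchable {
    type Item<'w> where Self: 'w;
    fn fetch<'w>(&'w self, row: usize) -> Self::Item<'w>;
}

// A dense column of one component type.
struct Column<T>(Vec<T>);

impl<T> Fetchable for Column<T> {
    type Item<'w> = &'w T where Self: 'w;
    fn fetch<'w>(&'w self, row: usize) -> Self::Item<'w> {
        &self.0[row]
    }
}

// The tuple impl simply delegates to each element, which is how queries
// like Query<(&Player, &mut Score)> compose in spirit.
impl<A: Fetchable, B: Fetchable> Fetchable for (A, B) {
    type Item<'w> = (A::Item<'w>, B::Item<'w>) where Self: 'w;
    fn fetch<'w>(&'w self, row: usize) -> Self::Item<'w> {
        (self.0.fetch(row), self.1.fetch(row))
    }
}

fn main() {
    let names = Column(vec!["alice", "bob"]);
    let scores = Column(vec![10u32, 20]);
    let query = (names, scores);
    let (name, score) = query.fetch(1);
    println!("{name}: {score}"); // prints: bob: 20
}

In the real crate, these tuple impls are generated by a macro for tuples up to a fixed arity, with each element carrying its own fetch state.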

The WorldQuery trait is quite large and feels a bit arbitrary, but it makes sense as it appears to be designed specifically to hook into different parts of the QueryIterationCursor::next function to support accessing every kind of data one can access using the Query API.

An illustrative example of this is to compare the &T where T: Component implementation of WorldQuery to the Entity implementation.

The QueryFilter trait is similar enough to QueryData that we don't have to cover it separately; just know that it allows systems to ignore entities in certain archetypes based on components the system does not need to access. For example, the With filter:

// Rotate all entities with the `Spin` component
fn rotate(time: Res<Time>, mut transforms: Query<&mut Transform, With<Spin>>) {
    for mut transform in &mut transforms {
        transform.rotate_z(PI * time.delta_seconds());
    }
}

Matching Archetypes

One last area of interest I'd like to focus on is how the Query iterates through and matches Archetypes. A lot of this deals with the cached data in the QueryState contained within a Query. But first, let's look at how Archetypes and Tables are stored:

Similar to the top level Entities struct which stores all the entity data for the ECS, bevy_ecs has a top-level Archetypes struct:

src/archetype.rs
pub struct Archetypes {
    pub(crate) archetypes: Vec<Archetype>,
    pub(crate) archetype_component_count: usize,
    by_components: bevy_utils::HashMap<ArchetypeComponents, ArchetypeId>,
}

It follows the same pattern of using the index of the archetype in a global array as its identifier, and provides ways to look up Archetypes.

And nearly identical for Tables:

src/storage/table.rs
pub struct Tables {
    tables: Vec<Table>,
    table_ids: HashMap<Vec<ComponentId>, TableId>,
}

Tables and Archetypes define TableId and ArchetypeId as type-safe indices for direct access and lookup. The QueryState uses these to avoid re-matching components on every tick.
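
The type-safe index idea itself is just the newtype pattern; a rough sketch with stand-in types (not bevy's definitions) looks like this, and the same idea underlies ComponentId and TableRow as well:

// A plain index wrapped in its own type, so a TableId can't be confused with
// an ArchetypeId or a bare usize.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct TableId(usize);

struct Tables {
    tables: Vec<&'static str>, // stand-in for Vec<Table>
}

impl Tables {
    fn get(&self, id: TableId) -> Option<&&'static str> {
        self.tables.get(id.0)
    }
}

fn main() {
    let tables = Tables { tables: vec!["table 0", "table 1"] };
    assert_eq!(tables.get(TableId(1)), Some(&"table 1"));
}

With that in mind, here is the QueryState that caches these indices: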

src/query/state.rs
pub struct QueryState<D: QueryData, F: QueryFilter = ()> {
    world_id: WorldId,
    pub(crate) archetype_generation: ArchetypeGeneration,
    pub(crate) matched_tables: FixedBitSet,
    pub(crate) matched_archetypes: FixedBitSet,
    pub(crate) archetype_component_access: Access<ArchetypeComponentId>,
    pub(crate) component_access: FilteredAccess<ComponentId>,
    // NOTE: we maintain both a TableId bitset and a vec because iterating the vec is faster
    pub(crate) matched_table_ids: Vec<TableId>,
    // NOTE: we maintain both a ArchetypeId bitset and a vec because iterating the vec is faster
    pub(crate) matched_archetype_ids: Vec<ArchetypeId>,
    pub(crate) fetch_state: D::State,
    pub(crate) filter_state: F::State,
    #[cfg(feature = "trace")]
    par_iter_span: Span,
}
  1. component_access

    Bitsets indicating which components require read access and which require write access. Additionally, it keeps a bitset of components used only for filtering archetypes (for example, With<C> passed via a QueryFilter type argument). Components in the filter bitset are never accessed, but the system scheduler can still use this information for multi-threading.

  2. archetype_component_access

    Archetypes have globally unique IDs for each component because each Archetype has a unique access path to the component data for its entities when performing sparse iteration. This allows the system scheduler to run multiple Queries in parallel when they access the same underlying components but in different archetypes.

  3. archetype_generation

    Archetypes are only added, never removed. As the Archetypes::generation documentation puts it, the "generation" is a handle to the current highest archetype ID, which can be used to iterate over Archetypes introduced since the last time the function was called. Since the QueryState relies on caching the matched Archetypes and Tables to know which ones to iterate, it can use the last generation it checked to catch any new Archetypes.

  4. matched_tables, matched_archetypes, matched_table_ids, matched_archetype_ids

    Aforementioned cached data indicating which tables and archetypes the QueryState must access. Updated primarily by the new_archetype function. Since all Archetypes contain a Table, the new_archetype function also checks the Archetype's Table for component matches.

By keeping an up-to-date cache, the query can skip iterating through any irrelevant tables or archetypes. However, to ensure the cache is up-to-date, any methods directly using the query state must also ensure that they perform the necessary bookkeeping for any new archetypes encountered. For example, the QueryState::get method:

src/query/state.rs
/// Gets the query result for the given `World` and `Entity`.
///
/// This can only be called for read-only queries, see `Self::get_mut` for
/// write-queries.
#[inline]
pub fn get<'w>(
    &mut self,
    world: &'w World,
    entity: Entity,
) -> Result<ROQueryItem<'w, D>, QueryEntityError> {
    self.update_archetypes(world);
    // SAFETY: query is read only
    unsafe {
        self.as_readonly().get_unchecked_manual(
            world.as_unsafe_world_cell_readonly(),
            entity,
            world.last_change_tick(),
            world.read_change_tick(),
        )
    }
}

The call to self.update_archetypes(world) iterates over any Archetypes introduced since the last generation the query saw, caching the ones that match for query purposes.

Conclusion

We've covered a lot here, and we haven't even explored the topics of how bevy_ecs represents, schedules, and multi-threads its Systems! Admittedly that falls quite outside my comfort zone at the moment, but is still on the table for a future post.

In conclusion, bevy_ecs is a very interesting crate, and both more and less straightforward to understand than I thought it would be.