Name

ecs-1

Understanding bevy's ECS implementation.

Parts

Part 0: The why and how of making an entity-component-system for personal use.
Part 1: Current page

Description

A tour of the bevy_ecs v0.12.1 entity and component storage.

Bevy

Bevy is a modular game engine based around an ECS, or Entity, Component, System architecture; to quote the README (emphasis mine):

ECS is a software pattern that involves breaking your program up into Entities, Components, and Systems. Entities are unique things that are assigned groups of Components, which are then processed using Systems.

Additionally, bevy_ecs sports some fancy features like robust change detection, dense and sparse entity storage, archetypes, and more cool stuff that we will surely be getting into.

This design is principally useful because of the flexibility it provides in designing and implementing a game, and secondarily because it can also have great performance characteristics when implemented correctly.

I've implemented a couple of simple projects using Bevy, and while it was still very immature at the time, it was a very great experience and I highly recommend trying it out if you are interested.

Entity

I started this because I was specifically curious about how bevy accesses an Entity's data, so it seems a natural place to start is with the Entity type itself:

src/entity/mod.rs

128 #[repr(C, align(8))]
129 pub struct Entity {
130     // Do not reorder the fields here. The ordering is explicitly used by repr(C)
131     // to make this struct equivalent to a u64.
132     #[cfg(target_endian = "little")]
133     index: u32,
134     generation: NonZeroU32,
135     #[cfg(target_endian = "big")]
136     index: u32,
137 }

index: A number which indicates the identity of the Entity; if no entities are ever freed, this number just increments.
generation: Increments every time an entity index is re-used to prevent accesses of the wrong data using stale Entity. Additionally, generation is NonZeroU32 as a niche optimization, in order to allow for Option<Entity> and Entity to have the same size.

Note that the memory layout is tightly controlled and optimized using the #[cfg] and #[repr] attributes. This allows codegen to use single integer instructions for many operations.

If you have a simple ECS, then you have 1 array for every type of component, where the entity is an index directly into those arrays; but if not every entity has every component, then the ECS needs to store extra data to indicate the presence or absence of each type of component for each entity. The simplest approach would be to use Option<Component>, but this adds both runtime and memory cost.

To combat this fragmentation, bevy uses Archetypes; an archetype is essentially a pool of entity data for entities which share the same components. This means that by finding the right archetype, you know the set of components the entity has, negating the need to explicitly store any data indicating the presence or absence of a component. The information is now stored out of band.

But then that introduces a problem: how do you efficiently look up a particular component from its entity ID?

Here Bevy introduces a new type Entities which handles entity lookup and allocation:

src/entity/mod.rs

425 pub struct Entities {
426     meta: Vec<EntityMeta>,
427     pending: Vec<u32>,
428     free_cursor: AtomicIdCursor,
429     /// Stores the number of free entities for [`len`](Entities::len)
430     len: u32,
431 }

meta: Contains the information necessary to lookup the entity's component data.
pending: Combines the free list and the list of allocated entities that haven't been properly inserted into meta
free_cursor: Separates the entity indices in pending which have been allocated from the ones that are still free and unreserved. It is atomic to allow handing out entities across threads. The free_cursor counts down from the end of the pending list to make flushing more efficient.

If we examine the EntityMeta in more detail, we see that it contains an EntityLocation and a generation; the generation must match the one indicated on the Entity for the lookup to succeed, preventing use-after-free. The EntityLocation is more interesting:

src/entity/mod.rs

902 pub struct EntityLocation {
903     pub archetype_id: ArchetypeId,
904     pub archetype_row: ArchetypeRow,
905     pub table_id: TableId,
906     pub table_row: TableRow,
907 }

Here we see the indices necessary to look up the Archetype where the component data lives, and then find the entity's data within the Archetype. But I was also surprised to see a TableID and TableRow; we will revisit those later and explore the Archetype first.

Archetype

To quote the bevy_ecs documentation:

An archetype uniquely describes a group of entities that share the same components: a world only has one archetype for each unique combination of components, and all entities that have those components and only those components belong to that archetype.

A lot of details in the Archetype implementation exist specifically to support sparse component storage, so lets dive into the what and why of sparse storage.

Sparse Entity Storage

An Archetypal ECS optimizes for fast iteration over entity data by placing Entities into buckets (Archetypes) where the component data can be densely packed into arrays, also known as Columns.

One consequence of this, however, is that when a System adds or removes components from an Entity, it moves to a new Archetype, requiring that Archetype must remove all the Entity's components from each of its Columns. It can be in any row of the Column, which means ordering cannot be preserved without fragmenting the data within the Column. Here we can see how the cost quickly snowballs:

Touching every Column for every component in the new and old Archetypes
Updating the EntityMeta within the global Entities struct to point to the new ArchetypeId and ArchetypeRow
Updating the EntityMeta for any entity in the old Archetype that moved to a different ArchetypeRow to preserve density

bevy_ecs provides sparse storage to offer an alternate, more efficient code path for frequently added and removed components. It provides an extra level of indirection: the Table. As we will see shortly, the Archetype keeps an extra lookup array which allows one to go from the virtual ArchetypeRow to the TableRow where the component data is actually stored. The result is that when only sparse components change, most of the component data can stay in place without moving.

Let's dive into how bevy_ecs defines an Archetype:

src/archetype.rs

307 pub struct Archetype {
308     id: ArchetypeId,
309     table_id: TableId,
310     edges: Edges,
311     entities: Vec<ArchetypeEntity>,
312     components: ImmutableSparseSet<ComponentId, ArchetypeComponentInfo>,
313 }

id: A type-safe index that represents this Archetype, used for runtime lookup.
table_id: Look-up index of the table which holds the data for the Archetype's components.
edges: When the user adds or removes a component to an entity, it must move to a different Archetype. To avoid doing type reflection to find the new Archetype on every component change, Edges caches the results of these moves.
entities: A lookup table which stores the entity and its row within the table; essentially a virtual mapping.
components: Metadata about each component stored within the Archetype, used to support multithreading, unique to each Archetype.

Let's take a deeper look at the entities lookup array:

src/archetype.rs

267 pub struct ArchetypeEntity {
268     entity: Entity,
269     table_row: TableRow,
270 }

As we can see here, the Archetype has a mapping to the backing TableRow which allows access to the Component data, so to go from the Entity to its data requires a double-indirection when the access goes through the Archetype.

When iterating through the Archetype, we are also able to access the ID and Generation of every Entity contained within, which is necessary and useful for some of bevy_ecs's features; we will see many places which cache a copy of the Entity for various reasons, from safety to performance.

Table

The table is a struct-of-arrays style data structure. It contains a set of columns, where each column is one of the components used by the archetype. Each column is backed by a custom BlobVec type which stores the actual component data.

src/storage/table.rs

560 pub struct Table {
561     columns: ImmutableSparseSet<ComponentId, Column>,
562     entities: Vec<Entity>,
563 }

The ImmutableSparseSet maps from the ComponentId to the Column, where the ComponentId comes from the underlying component type's TypeId.

src/storage/table.rs

151 pub struct Column {
152     data: BlobVec,
153     added_ticks: Vec<UnsafeCell<Tick>>,
154     changed_ticks: Vec<UnsafeCell<Tick>>,
155 }

In addition to storage, the Column is responsible for change detection using Ticks, which store when component data was added or changed. By comparing the current Tick to the added_ticks or changed_ticks, a Query can filter components based on these attributes.

BlobVec

An Archetypal ECS deals with many types that are only known at runtime; Many data structures which require a compile-time known size, alignment, Drop function, etc, are no longer usable as they can't easily deal with type-erased data.

This is where the BlobVec comes in: it uses runtime type information to store the data as type-erased blobs of u8. Otherwise, it is essentially just a dynamically allocated array.

src/storage/blob_vec.rs

16 pub(super) struct BlobVec {
17     item_layout: Layout,
18     capacity: usize,
19     len: usize,
20     data: NonNull<u8>,
21     drop: Option<unsafe fn(OwningPtr<'_>)>,
22 }

item_layout: Stores the alignment and size of the data in the BlobVec in order to properly store and access the data. Note that the Rust std defines the Layout type. Special care is taken to ensure that Layout's size and alignment properly reflect the offset needed between elements in an array.
len: Number of elements, not bytes, used for bounds checking accesses.
capacity: Number of elements that will fit into the currently allocated memory.
data: Pointer to the array data.
drop: Pointer to the drop function used for data that has a Drop impl, if the type needs to use its drop function.

Runtime Type Metadata

bevy_ecs does a lot of computations with types only known at runtime; data usually known at compile type must be represented explicitly. We've seen a few places where the ECS uses runtime ComponentIds to lookup component metadata.

An important observation to make is that, when using the bevy_ecs API to query for components, the types are statically known when the query type is defined, or when components are inserted for a particular entity. The types are dynamic mostly on the archetype side. This provides good entry-points for registering types with the ECS.

So, let's dig a bit deeper into bevy's runtime type handling starting with what bevy_ecs needs to know about component types:

src/component.rs

323 pub struct ComponentDescriptor {
324     name: Cow<'static, str>,
325     storage_type: StorageType,
326     is_send_and_sync: bool,
327     type_id: Option<TypeId>,
328     layout: Layout,
329     drop: Option<for<'a> unsafe fn(OwningPtr<'a>)>,
330 }

name: Name of the component type.
storage_type: Users of bevy_ecs decide whether a particular component is sparse or dense at compile time; when the type is erased, that information is stored here in the runtime metadata for that component type.
is_send_and_sync: Necessary for scheduling multithreaded access.
type_id: If known, the actual rust-compiler generated TypeId for the component type.
layout: Layout information needed for storing the component type that we see used in the BlobVec.
drop: The drop function for this type, if it has one, also used in the BlobVec.

We can see that the ComponentDescriptor's new function computes all of this data directly from the type:

src/component.rs

359 pub fn new<T: Component>() -> Self {
360     Self {
361         name: Cow::Borrowed(std::any::type_name::<T>()),
362         storage_type: T::Storage::STORAGE_TYPE,
363         is_send_and_sync: true,
364         type_id: Some(TypeId::of::<T>()),
365         layout: Layout::new::<T>(),
366         drop: needs_drop::<T>().then_some(Self::drop_ptr::<T> as _),
367     }
368 }

In order to have access to this information at runtime, bevy_ecs keeps a global Components struct which functions a little bit like the Entities struct we saw earlier:

src/components.rs

440 pub struct Components {
441     components: Vec<ComponentInfo>,
442     indices: TypeIdMap<ComponentId>,
443     resource_indices: TypeIdMap<ComponentId>,
444 }

components: Stores the ComponentDescriptor for every known component type in the ECS world.
indices: Mapping from the type to an index in the components array, wrapped in the ComponentId type for type safety.
resource_indices: bevy_ecs allows systems to request access to a single global instance of a piece of data (called Resources) in the same thread-safe way as components. This is an alternative to global variables or singleton objects. This map allows the components array to also store resource runtime type metadata.

We can confirm this by looking into the init_component and init_component_inner functions which are the entry-points for component metadata into the system.

First, the function checks for the TypeId in the indices map to get the component, otherwise it inserts a new component descriptor:

src/components.rs

456 pub fn init_component<T: Component>(&mut self, storages: &mut Storages) -> ComponentId {
457     let type_id = TypeId::of::<T>();
458 
459     let Components {
460         indices,
461         components,
462         ..
463     } = self;
464     *indices.entry(type_id).or_insert_with(|| {
465         Components::init_component_inner(
466             components, storages, ComponentDescriptor::new::<T>()
467         )
468     })
469 }

The function inserts the descriptor at the end of the components array, with the new ComponentId containing the index of the new descriptor for easy retrieval later.

src/components.rs

488 #[inline]
489 fn init_component_inner(
490     components: &mut Vec<ComponentInfo>,
491     storages: &mut Storages,
492     descriptor: ComponentDescriptor,
493 ) -> ComponentId {
494     let component_id = ComponentId(components.len());
495     let info = ComponentInfo::new(component_id, descriptor);
496     if info.descriptor.storage_type == StorageType::SparseSet {
497         storages.sparse_sets.get_or_insert(&info);
498     }
499     components.push(info);
500     component_id
501 }

Fragmentation Redux

Recall the Archetype struct:

src/archetype.rs

307 pub struct Archetype {
308     id: ArchetypeId,
309     table_id: TableId,
310     edges: Edges,
311     entities: Vec<ArchetypeEntity>,
312     components: ImmutableSparseSet<ComponentId, ArchetypeComponentInfo>,
313 }

If we wanted to iterate through all the Entities in this Archetype—which is the access pattern bevy_ecs is generally designed to support—we might iterate through the entities array and use the TableRow to retrieve the component data for the components of interest. However, the TableRows that correspond to each ArchetypeEntity aren't necessarily linear in memory. Thus, it is possible to have some internal "fragmentation" within a Table as a result of having sparse components.

Thus, having sparse components can possibly increase the iteration time of non-sparse component storage too. I have a suspicion that bevy has ways of working around this, and I have a feeling this is why the EntityMeta stores the TableID and TableRow:

This allows for direct lookup of the component data when the request is for a single entity, bypassing the archetypes completely.
If only dense components are requested, it should be possible to grab the matching Tables directly and skip some of the associated Archetypes.

To get more insight there, we need to start digging into the ways bevy_ecs allows a System to access its components.

Queries

In bevy_ecs, systems are normal rust functions which, using generics, use their function parameters to request access to various bits of data such as Components and Resources.

The way to request components is through a Query, which is a generic type that encodes which types it needs to access through its type parameters. From the official bevy examples:

bevy/examples/ecs/ecs_guide.rs

88 // This system updates the score for each entity with the "Player" and "Score" component.
89 fn score_system(mut query: Query<(&Player, &mut Score)>) {
90     for (player, mut score) in &mut query {
91         let scored_a_point = random::<bool>();
92         if scored_a_point {
93             score.value += 1;
94             println!(
95                 "{} scored a point! Their score is: {}",
96                 player.name, score.value
97             );
98         } else {
99             println!(
100                 "{} did not score a point! Their score is: {}",
101                 player.name, score.value
102             );
103         }
104     }
105 }

The Query<(&Player, &mut Score)> is a struct that uses the generic parameters to extract the compile time data necessary to execute the query at runtime. The Query provides several methods for iterating through components, but also implements IntoIter to work seamlessly with rust's for loops as in the example above.

If we dig into the Query, we see that it mostly functions as a wrapper to the QueryState, bundling the extra data necessary for things like change detection and accessing data from the ECS world:

src/system/query.rs

328 pub struct Query<'world, 'state, D: QueryData, F: QueryFilter = ()> {
329     world: UnsafeWorldCell<'world>,
330     state: &'state QueryState<D, F>,
331     last_run: Tick,
332     this_run: Tick,
333     force_read_only_component_access: bool,
334 }

world: A pointer to the world, used to access the data being queried. Bevy checks accesses to ensure that Rust's XOR mutability rules are respected when systems are run concurrently.
state: The query forwards its generic parameters to the QueryState which caches state necessary for iterating through Archetypes and Tables to access components.
last_run, this_run: bevy_ecs supports change detection by comparing Ticks between when the query is executed and when component data is updated. (See also the Ticks on the Column.)
force_read_only_component_access: From the comments, this appears to be a safety hack.

Looking at the code example above, the entry point for iterating through the components requested by the Query is iter_mut:

src/system/query.rs

469 #[inline]
470 pub fn iter_mut(&mut self) -> QueryIter<'_, 's, D, F> {
471     // SAFETY: `self.world` has permission to access the required components.
472     unsafe {
473         self.state
474             .iter_unchecked_manual(self.world, self.last_run, self.this_run)
475     }
476 }

We can see that the implementation lives a couple of levels deep within the returned QueryIter's QueryIterationCursor::next function, which is created from the matched tables for the QueryState. The function is quite long, so we will break things down.

src/query/iter.rs

753 unsafe fn next(
754     &mut self,
755     tables: &'w Tables,
756     archetypes: &'w Archetypes,
757     query_state: &'s QueryState<D, F>,
758 ) -> Option<D::Item<'w>>

Because component storage types are statically known, the QueryData knows ahead of time if it uses dense storage; the QueryData extracts this information for its own use:

src/query/iter.rs

668 const IS_DENSE: bool = D::IS_DENSE && F::IS_DENSE;

This enables it to have a top-level if Self::IS_DENSE check to indeed iterate through each matched Table instead of each matched Archetype.

The Table-based/dense branch (with some comments stripped):

src/query/iter.rs

760 loop {
761     // we are on the beginning of the query, or finished processing a table, so
762     // skip to the next
763     if self.current_row == self.current_len {
764         let table_id = self.table_id_iter.next()?;
765         let table = tables.get(*table_id).debug_checked_unwrap();
766         D::set_table(&mut self.fetch, &query_state.fetch_state, table);
767         F::set_table(&mut self.filter, &query_state.filter_state, table);
768         self.table_entities = table.entities();
769         self.current_len = table.entity_count();
770         self.current_row = 0;
771         continue;
772     }
773 
774     let entity = self.table_entities.get_unchecked(self.current_row);
775     let row = TableRow::from_usize(self.current_row);
776     if !F::filter_fetch(&mut self.filter, *entity, row) {
777         self.current_row += 1;
778         continue;
779     }
780 
781     let item = D::fetch(&mut self.fetch, *entity, row);
782 
783     self.current_row += 1;
784     return Some(item);
785 }

We can see that the iterator follows the following structure:

Iterate through the matched tables until there aren't any
(notice the self.table_id_iter.next()?)
Use the filter to skip some entities
(used for things like change detection)
Fetch the component data for the entity.

Comparing to the archetype-based/sparse implementation: (again with some comments stripped):

src/query/iter.rs

795 loop {
796     if self.current_row == self.current_len {
797         let archetype_id = self.archetype_id_iter.next()?;
798         let archetype = archetypes.get(*archetype_id).debug_checked_unwrap();
799         let table = tables.get(archetype.table_id()).debug_checked_unwrap();
800         D::set_archetype(&mut self.fetch, &query_state.fetch_state, archetype, table);
801         F::set_archetype(
802             &mut self.filter,
803             &query_state.filter_state,
804             archetype,
805             table,
806         );
807         self.archetype_entities = archetype.entities();
808         self.current_len = archetype.len();
809         self.current_row = 0;
810         continue;
811     }
812 
813     let archetype_entity = self.archetype_entities.get_unchecked(self.current_row);
814     if !F::filter_fetch(
815         &mut self.filter,
816         archetype_entity.id(),
817         archetype_entity.table_row(),
818     ) {
819         self.current_row += 1;
820         continue;
821     }
822 
823     let item = D::fetch(
824         &mut self.fetch,
825         archetype_entity.id(),
826         archetype_entity.table_row(),
827     );
828     self.current_row += 1;
829     return Some(item);
830 }

The structure is very similar, except with the extra level of indirection grabbing the archetype_entity from the archetype, and using that to fetch from the Table.

From here we can start to look at the QueryData trait. QueryData is super trait of WorldQuery, but in practice most types implement both WorldQuery and either QueryData or QueryFilter. Most of the required methods are actually implemented on WorldQuery.

The most interesting implementation right now is for &T where T: Component to see how bevy_ecs allows for a system to access components.

Let us first peer into the associated types on WorldQuery:

src/query/fetch.rs

733 type Item<'w> = &'w T;
734 type Fetch<'w> = ReadFetch<'w, T>;
735 type State = ComponentId;

Of particular interest is the ReadFetch, which stores the data necessary to fetch the component:

src/query/fetch/rs

712 #[doc(hidden)]
713 pub struct ReadFetch<'w, T> {
714     table_components: Option<ThinSlicePtr<'w, UnsafeCell<T>>>,
715     sparse_set: Option<&'w ComponentSparseSet>,
716 }

table_components: A custom slice pointer that does no bounds checking in release mode and which points directly to the slice of all component data for this component type in the current table.
sparse_set: Reference to a sparse set which contains the data for the set of sparse components of this type in the current archetype.

As you can see, it contains data for accessing both sparse and dense components. Before calls to WorldQuery::fetch, the QueryIterationCursor initializes the data inside the read fetch via the WorldQuery::set_archetype or WorldQuery::set_table generic methods, depending on whether this particular component is sparse or dense.

src/query/fetch.rs

669 unsafe fn fetch<'w>(
670     fetch: &mut ReadFetch<'w>,
671     entity: Entity,
672     table_row: TableRow,
673 ) -> Self::Item<'w> {
674     match T::Storage::STORAGE_TYPE {
675         StorageType::Table => fetch
676             .table_components
677             .debug_checked_unwrap()
678             .get(table_row.as_usize())
679             .deref(),
680         StorageType::SparseSet => fetch
681             .sparse_set
682             .debug_checked_unwrap()
683             .get(entity)
684             .debug_checked_unwrap()
685             .deref(),
686     }
687 }

Again, because the STORAGE_TYPE is statically known, the branch doesn't exist at runtime when the fetch function gets inlined. This allows dense components to always go directly through the table for accesses, but allows any sparse components to go through the sparse component set.

The trick for allowing queries with multiple components is to implement WorldQuery for tuples, and having each item in the tuple store any necessary state.

The WorldQuery trait is quite large and feels a bit arbitrary, but it makes sense as it appears to be designed specifically to hook into different parts of the QueryIterationCursor::next function to support accessing every kind of data one can access using the Query API.

An illustrative example of this is to compare the &T where T: Component implementation of WorldQuery to the Entity implementation.

The QueryFilter trait is similar enough to the QueryData that we don't have to cover it separately, but just know it allows for systems to choose to ignore entities from certain archetypes based on components that the system does not need to access. For example, the With filter:

// Rotate all entities with the `Spin` component
fn rotate(time: Res<Time>, mut transforms: Query<&mut Transform, With<Spin>>) {
    for transform in transforms {
        transform.rotate_z(PI * time.delta_seconds());
    }
}

Matching Archetypes

One last area of interest I'd like to focus on is how the Query iterates through and matches Archetypes. A lot of this deals with the cached data in the QueryState contained within a Query. But first, let's look at how Archetypes and Tables are stored:

Similar to the top level Entities struct which stores all the entity data for the ECS, bevy_ecs has a top-level Archetypes struct:

src/archetype.rs

615 pub struct Archetypes {
616     pub(crate) archetypes: Vec<Archetype>,
617     pub(crate) archetype_component_count: usize,
618     by_components: bevy_utils::HashMap<ArchetypeComponents, ArchetypeId>,
619 }

It follows the same pattern of using the index of the archetype in a global array as its identifier, and providing ways to look-up Archetypes.

And nearly identical for Tables:

src/table.rs

797 pub struct Tables {
798     tables: Vec<Table>,
799     table_ids: HashMap<Vec<ComponentId>, TableId>,
800 }

Tables and Archetype define TableId and ArchetypeId as type-safe indices for direct access and look up. The QueryState uses this to avoid needing to match components on every tick.

src/query/state.rs

29 pub struct QueryState<D: QueryData, F: QueryFilter = ()> {
30     world_id: WorldId,
31     pub(crate) archetype_generation: ArchetypeGeneration,
32     pub(crate) matched_tables: FixedBitSet,
33     pub(crate) matched_archetypes: FixedBitSet,
34     pub(crate) archetype_component_access: Access<ArchetypeComponentId>,
35     pub(crate) component_access: FilteredAccess<ComponentId>,
36     // NOTE: we maintain both a TableId bitset and a vec because iterating the vec is faster
37     pub(crate) matched_table_ids: Vec<TableId>,
38     // NOTE: we maintain both a ArchetypeId bitset and a vec because iterating the vec is faster
39     pub(crate) matched_archetype_ids: Vec<ArchetypeId>,
40     pub(crate) fetch_state: D::State,
41     pub(crate) filter_state: F::State,
42     #[cfg(feature = "trace")]
43     par_iter_span: Span,
44 }

component_access: A bitset indicating which components require read access, write access. Additionally, it keeps a bitset of components for filtering archetypes (For example, With<C> passed via a QueryFilter type argument). Components in the filter bitset do not get accessed, but the system scheduler can use this information for multi-threading.
archetype_component_access: Archetypes have globally unique IDs for each component because each Archetype has a unique access path to the component data for its entities when performing sparse iteration. This allows for system scheduling to treat multiple Queries accessing the same underlying components in different archetypes to run in parallel.
archetype_generation: Archetypes are only added, never removed. The Archetypes::generation function indicates that: the "generation" is a handle to the current highest archetype ID. This can be used to iterate over newly introduced Archetypes since the last time this function was called. Since the QueryState relies on caching the matched Archetypes and Tables to know which ones to iterate, it can use the last generation it checked to catch any new Archetypes.
matched_tables, matched_archetypes, matched_table_ids, matched_archetype_ids: Aforementioned cached data indicating which tables and archetypes the QueryState must access. Updated primarily by the new_archetype function. Since all Archetypes contain a Table, the new_archetype function also checks the Archetype's Table for component matches.

By keeping an up-to-date cache, the query can skip iterating through any irrelevant tables or archetypes. However, to ensure the cache is up-to-date, any methods directly using the query state must also ensure that they perform the necessary bookkeeping for any new archetypes encountered. For example, the QueryState::get method:

src/query/state.rs

384 /// Gets the query result for the given `World` and `Entity`.
385 ///
386 /// This can only be called for read-only queries, see `Self::get_mut` for
387 /// write-queries.
388 #[inline]
389 pub fn get<'w>(
390     &mut self,
391     world: &'w World,
392     entity: Entity,
393 ) -> Result<ROQueryItem<'w, D>, QueryEntityError> {
394     self.update_archetypes(world);
395     // SAFETY: query is read only
396     unsafe {
397         self.as_readonly().get_unchecked_manual(
398             world.as_unsafe_world_cell_readonly(),
399             entity,
400             world.last_change_tick(),
401             world.read_change_tick(),
402         )
403     }
404 }

The call to self.update_archetypes(world) iterates all Archetypes of a new generation to cache the matched Archetypes for query purposes.

Conclusion

We've covered a lot here, and we haven't even explored the topics of how bevy_ecs represents, schedules, and multi-threads its Systems! Admittedly that falls quite outside my comfort zone at the moment, but is still on the table for a future post.

In conclusion, bevy_ecs is a very interesting crate, and both more and less straightforward to understand than I thought it would be.

128	#[repr(C, align(8))]
129	pub struct Entity {
130	// Do not reorder the fields here. The ordering is explicitly used by repr(C)
131	// to make this struct equivalent to a u64.
132	#[cfg(target_endian = "little")]
133	index: u32,
134	generation: NonZeroU32,
135	#[cfg(target_endian = "big")]
136	index: u32,
137	}

425	pub struct Entities {
426	meta: Vec<EntityMeta>,
427	pending: Vec<u32>,
428	free_cursor: AtomicIdCursor,
429	/// Stores the number of free entities for [`len`](Entities::len)
430	len: u32,
431	}

902	pub struct EntityLocation {
903	pub archetype_id: ArchetypeId,
904	pub archetype_row: ArchetypeRow,
905	pub table_id: TableId,
906	pub table_row: TableRow,
907	}

307	pub struct Archetype {
308	id: ArchetypeId,
309	table_id: TableId,
310	edges: Edges,
311	entities: Vec<ArchetypeEntity>,
312	components: ImmutableSparseSet<ComponentId, ArchetypeComponentInfo>,
313	}

267	pub struct ArchetypeEntity {
268	entity: Entity,
269	table_row: TableRow,
270	}

560	pub struct Table {
561	columns: ImmutableSparseSet<ComponentId, Column>,
562	entities: Vec<Entity>,
563	}

151	pub struct Column {
152	data: BlobVec,
153	added_ticks: Vec<UnsafeCell<Tick>>,
154	changed_ticks: Vec<UnsafeCell<Tick>>,
155	}

16	pub(super) struct BlobVec {
17	item_layout: Layout,
18	capacity: usize,
19	len: usize,
20	data: NonNull<u8>,
21	drop: Option<unsafe fn(OwningPtr<'_>)>,
22	}

323	pub struct ComponentDescriptor {
324	name: Cow<'static, str>,
325	storage_type: StorageType,
326	is_send_and_sync: bool,
327	type_id: Option<TypeId>,
328	layout: Layout,
329	drop: Option<for<'a> unsafe fn(OwningPtr<'a>)>,
330	}

359	pub fn new<T: Component>() -> Self {
360	Self {
361	name: Cow::Borrowed(std::any::type_name::<T>()),
362	storage_type: T::Storage::STORAGE_TYPE,
363	is_send_and_sync: true,
364	type_id: Some(TypeId::of::<T>()),
365	layout: Layout::new::<T>(),
366	drop: needs_drop::<T>().then_some(Self::drop_ptr::<T> as _),
367	}
368	}

440	pub struct Components {
441	components: Vec<ComponentInfo>,
442	indices: TypeIdMap<ComponentId>,
443	resource_indices: TypeIdMap<ComponentId>,
444	}

456	pub fn init_component<T: Component>(&mut self, storages: &mut Storages) -> ComponentId {
457	let type_id = TypeId::of::<T>();
458
459	let Components {
460	indices,
461	components,
462	..
463	} = self;
464	*indices.entry(type_id).or_insert_with(\|\| {
465	Components::init_component_inner(
466	components, storages, ComponentDescriptor::new::<T>()
467	)
468	})
469	}

488	#[inline]
489	fn init_component_inner(
490	components: &mut Vec<ComponentInfo>,
491	storages: &mut Storages,
492	descriptor: ComponentDescriptor,
493	) -> ComponentId {
494	let component_id = ComponentId(components.len());
495	let info = ComponentInfo::new(component_id, descriptor);
496	if info.descriptor.storage_type == StorageType::SparseSet {
497	storages.sparse_sets.get_or_insert(&info);
498	}
499	components.push(info);
500	component_id
501	}

88	// This system updates the score for each entity with the "Player" and "Score" component.
89	fn score_system(mut query: Query<(&Player, &mut Score)>) {
90	for (player, mut score) in &mut query {
91	let scored_a_point = random::<bool>();
92	if scored_a_point {
93	score.value += 1;
94	println!(
95	"{} scored a point! Their score is: {}",
96	player.name, score.value
97	);
98	} else {
99	println!(
100	"{} did not score a point! Their score is: {}",
101	player.name, score.value
102	);
103	}
104	}
105	}

328	pub struct Query<'world, 'state, D: QueryData, F: QueryFilter = ()> {
329	world: UnsafeWorldCell<'world>,
330	state: &'state QueryState<D, F>,
331	last_run: Tick,
332	this_run: Tick,
333	force_read_only_component_access: bool,
334	}

469	#[inline]
470	pub fn iter_mut(&mut self) -> QueryIter<'_, 's, D, F> {
471	// SAFETY: `self.world` has permission to access the required components.
472	unsafe {
473	self.state
474	.iter_unchecked_manual(self.world, self.last_run, self.this_run)
475	}
476	}

753	unsafe fn next(
754	&mut self,
755	tables: &'w Tables,
756	archetypes: &'w Archetypes,
757	query_state: &'s QueryState<D, F>,
758	) -> Option<D::Item<'w>>

760	loop {
761	// we are on the beginning of the query, or finished processing a table, so
762	// skip to the next
763	if self.current_row == self.current_len {
764	let table_id = self.table_id_iter.next()?;
765	let table = tables.get(*table_id).debug_checked_unwrap();
766	D::set_table(&mut self.fetch, &query_state.fetch_state, table);
767	F::set_table(&mut self.filter, &query_state.filter_state, table);
768	self.table_entities = table.entities();
769	self.current_len = table.entity_count();
770	self.current_row = 0;
771	continue;
772	}
773
774	let entity = self.table_entities.get_unchecked(self.current_row);
775	let row = TableRow::from_usize(self.current_row);
776	if !F::filter_fetch(&mut self.filter, *entity, row) {
777	self.current_row += 1;
778	continue;
779	}
780
781	let item = D::fetch(&mut self.fetch, *entity, row);
782
783	self.current_row += 1;
784	return Some(item);
785	}

795	loop {
796	if self.current_row == self.current_len {
797	let archetype_id = self.archetype_id_iter.next()?;
798	let archetype = archetypes.get(*archetype_id).debug_checked_unwrap();
799	let table = tables.get(archetype.table_id()).debug_checked_unwrap();
800	D::set_archetype(&mut self.fetch, &query_state.fetch_state, archetype, table);
801	F::set_archetype(
802	&mut self.filter,
803	&query_state.filter_state,
804	archetype,
805	table,
806	);
807	self.archetype_entities = archetype.entities();
808	self.current_len = archetype.len();
809	self.current_row = 0;
810	continue;
811	}
812
813	let archetype_entity = self.archetype_entities.get_unchecked(self.current_row);
814	if !F::filter_fetch(
815	&mut self.filter,
816	archetype_entity.id(),
817	archetype_entity.table_row(),
818	) {
819	self.current_row += 1;
820	continue;
821	}
822
823	let item = D::fetch(
824	&mut self.fetch,
825	archetype_entity.id(),
826	archetype_entity.table_row(),
827	);
828	self.current_row += 1;
829	return Some(item);
830	}

733	type Item<'w> = &'w T;
734	type Fetch<'w> = ReadFetch<'w, T>;
735	type State = ComponentId;

712	#[doc(hidden)]
713	pub struct ReadFetch<'w, T> {
714	table_components: Option<ThinSlicePtr<'w, UnsafeCell<T>>>,
715	sparse_set: Option<&'w ComponentSparseSet>,
716	}

669	unsafe fn fetch<'w>(
670	fetch: &mut ReadFetch<'w>,
671	entity: Entity,
672	table_row: TableRow,
673	) -> Self::Item<'w> {
674	match T::Storage::STORAGE_TYPE {
675	StorageType::Table => fetch
676	.table_components
677	.debug_checked_unwrap()
678	.get(table_row.as_usize())
679	.deref(),
680	StorageType::SparseSet => fetch
681	.sparse_set
682	.debug_checked_unwrap()
683	.get(entity)
684	.debug_checked_unwrap()
685	.deref(),
686	}
687	}

615	pub struct Archetypes {
616	pub(crate) archetypes: Vec<Archetype>,
617	pub(crate) archetype_component_count: usize,
618	by_components: bevy_utils::HashMap<ArchetypeComponents, ArchetypeId>,
619	}

797	pub struct Tables {
798	tables: Vec<Table>,
799	table_ids: HashMap<Vec<ComponentId>, TableId>,
800	}

29	pub struct QueryState<D: QueryData, F: QueryFilter = ()> {
30	world_id: WorldId,
31	pub(crate) archetype_generation: ArchetypeGeneration,
32	pub(crate) matched_tables: FixedBitSet,
33	pub(crate) matched_archetypes: FixedBitSet,
34	pub(crate) archetype_component_access: Access<ArchetypeComponentId>,
35	pub(crate) component_access: FilteredAccess<ComponentId>,
36	// NOTE: we maintain both a TableId bitset and a vec because iterating the vec is faster
37	pub(crate) matched_table_ids: Vec<TableId>,
38	// NOTE: we maintain both a ArchetypeId bitset and a vec because iterating the vec is faster
39	pub(crate) matched_archetype_ids: Vec<ArchetypeId>,
40	pub(crate) fetch_state: D::State,
41	pub(crate) filter_state: F::State,
42	#[cfg(feature = "trace")]
43	par_iter_span: Span,
44	}

384	/// Gets the query result for the given `World` and `Entity`.
385	///
386	/// This can only be called for read-only queries, see `Self::get_mut` for
387	/// write-queries.
388	#[inline]
389	pub fn get<'w>(
390	&mut self,
391	world: &'w World,
392	entity: Entity,
393	) -> Result<ROQueryItem<'w, D>, QueryEntityError> {
394	self.update_archetypes(world);
395	// SAFETY: query is read only
396	unsafe {
397	self.as_readonly().get_unchecked_manual(
398	world.as_unsafe_world_cell_readonly(),
399	entity,
400	world.last_change_tick(),
401	world.read_change_tick(),
402	)
403	}
404	}