Skip to content

Introduce SpaceInspector and RegionInspector #1322

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

qinsoon
Copy link
Member

@qinsoon qinsoon commented May 15, 2025

No description provided.

@@ -42,6 +43,9 @@ use downcast_rs::Downcast;
pub trait Space<VM: VMBinding>: 'static + SFT + Sync + Downcast {
fn as_space(&self) -> &dyn Space<VM>;
fn as_sft(&self) -> &(dyn SFT + Sync + 'static);
fn as_inspector(&self) -> Option<&dyn SpaceInspector> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just return &dyn SpaceInspector. It is temporary to use an Option so I don't have to implement for all the spaces.

@qinsoon
Copy link
Member Author

qinsoon commented May 15, 2025

@wks Can you take a look at this draft PR quickly? If this seems to be the right way for #1302, I will get it done.

Copy link
Collaborator

@wks wks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think from the high level you are definitely on the right track. There are some details that can be simplified to make the code clearer and more usable.

Comment on lines 1193 to 1205
fn list_regions(&self, parent_region: Option<&dyn RegionInspector>) -> Vec<Box<dyn RegionInspector>> {
if let Some(parent_region) = parent_region {
list_child_regions::<Chunk, Block>(parent_region).or_else(|| {
if !crate::policy::immix::BLOCK_ONLY {
list_child_regions::<Block, Line>(parent_region)
} else {
None
}
}).unwrap_or_else(|| vec![])
} else {
self.chunk_map.all_chunks().map(|r: Chunk| Box::new(r) as Box<dyn RegionInspector>).collect()
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function looks complicated.

I think what is happening here is that although we do have the concept of Chunk, here we are letting ImmixSpace do the work of splitting it into Immix blocks (instead of for example MiMalloc blocks) because crate::util::heap::chunk_map::Chunk alone is space-agnostic. Similarly, we are letting ImmixSpace to split a block into lines using Immix-specific knowledge. And this is why we do things like

let chunk_inspector = immix_space_inspector.list_regions(None);
let block_inspector = immix_space_inspector.list_regions(Some(&*chunk_inspector[0]));
let line_inspector = immix_space_inspector.list_regions(Some(&*block_inspector[0]));

And we are multiplexing the same function list_regions to support three different kinds of regions, and we use dyn RegionInspector as the generic parameter.

But I think this approach is too complicated. I think we could achieve something like:

let chunk_inspector = immix_space_inspector.list_chunks();
let block_inspector = chunk_inspector[0].list_blocks();
let line_inspector = block_inspector[0].list_lines();

This means we will need to define Immix-specific ChunkInspector, BlockInspector and LineInspector types. This may conflict with the idea of letting all <R: Region> implement RegionInspector, but I think it is clearer. And the concrete inspector types may just be some simple wrappers over the concrete Region type. For example,

struct ImmixChunkInspector(Chunk);

// struct ImmixBlockInspector(Block); // We may just let Block and Line inspect themselves
// struct ImmixLineInspector(Line); // because they are Immix-specific types.

impl RegionInspector for ImmixChunkInspector {
    type RegionType = Chunk;
    fn get_region(&self) -> Self::RegionType {
        self.0
    }
}

impl ImmixChunkInspector {
    /// The purpose of this method is to ensure the results are
    /// `crate::policy::immix::block::Block` instead of MarkSweep's block.
    fn blocks() -> Iterator<Item = Block> { ... }
}

impl RegionInspector for Block {
    type RegionType = Self;
    fn get_region(&self) -> Self::RegionType {
        self
    }
}

And crate::policy::immix::block::Block already provides the lines() method.
It knows how to iterate lines in itself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function can be refactored to be more readable.

fn list_regions(&self, parent_region: Option<&dyn RegionInspector>) -> Vec<Box<dyn RegionInspector>> {
  if parent_region.is_none() {
    // List top-level regions - chunks
    return self.chunk_map.all_chunks().map(|r: Chunk| Box::new(r) as Box<dyn RegionInspector>).collect();
  }
  let parent_region = parent_region.unwrap();
  // List blocks within a chunk
  if let Some(ret) = list_child_regions::<Chunk, Block>(parent_region) {
    return ret;
  }
  // List lines within a block
  if !crate::policy::immix::BLOCK_ONLY {
    if let Some(ret) = list_child_regions::<Block, Line>(parent_region) {
      return ret;
    }
  }
  // Unknown cases
  return vec![];
}

It conveys a good point that only the policy knows how about its heap structure (unless we create policy-specific types like you suggested).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One benefit of this current API is that we can use the same code to iterate through all kinds of spaces, through all levels of regions, because we are not relying on the policy specific types.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be more readable if we split list_regions into list_top_regions and list_sub_regions.

But for ImmixSpace alone, I think it is clearer to provide list_chunks, list_blocks_in_chunk and list_lines_in_block, respectively. If some users are inspecting ImmixSpace, they know what they are doing.

One benefit of this current API is that we can use the same code to iterate through all kinds of spaces, through all levels of regions, because we are not relying on the policy specific types.

Yes. A generic list_sub_regions may still be useful if the user just goes through all spaces and hierarchically dump all regions and sub-regions in a space-agnostic way.


pub trait SpaceInspector {
fn name(&self) -> &str;
fn list_regions(&self, parent_region: Option<&dyn RegionInspector>) -> Vec<Box<dyn RegionInspector>>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just let this generic list_regions method return the first level of regions (chunks for most spaces), and let the returned iterator go into deeper nested regions by themselves. In this way, we don't complicate the space-level list_regions.

And it is better to let the return value be Iterator<Item = Xxx> instead of Vec<Xxx>. We will use iterator underneath anyway, and the user can do space.list_regions().collect::<Vec<_>>() if they need to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just let this generic list_regions method return the first level of regions (chunks for most spaces), and let the returned iterator go into deeper nested regions by themselves. In this way, we don't complicate the space-level list_regions.

We could separate the function into two if you think it is better. One to list the top-level regions, and the other to list sub-regions of one region.

And it is better to let the return value be Iterator<Item = Xxx> instead of Vec. We will use iterator underneath anyway, and the user can do space.list_regions().collect::<Vec<_>>() if they need to.

This sounds reasonable.

Comment on lines 24 to 25
pub trait RegionInspector {
fn region_type(&self) -> &str;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If something other than <R: Region> can implement RegionInspector (such as the ImmixChunkInspector I wrote in another comment), we can add two members:

pub trait RegionInspector {
    type RegionType: Region;
    fn get_region(): Self::RegionType;
}

And we can probably remove start and size because we can get the region itself.

src/mmtk.rs Outdated
@@ -596,4 +597,14 @@ impl<VM: VMBinding> MMTK<VM> {
.vm_space
.initialize_object_metadata(object, false)
}

pub fn inspect_spaces<'a>(&'a self) -> Vec<&'a dyn SpaceInspector> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Having a Vec does simplify the API a bit because it can be iterated, and the performance doesn't matter because (1) we are inspecting, and (2) there are probably not may spaces in a plan.

}

pub(crate) fn list_child_regions<PARENT: Region, CHILD: Region + 'static>(region: &dyn RegionInspector) -> Option<Vec<Box<dyn RegionInspector>>> {
if region.region_type() == std::any::type_name::<PARENT>() {
Copy link
Collaborator

@wks wks May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be no need to compare types by strings because both PARENT and CHILD are statically known.

The problem is that generic trait specialization is an unstable feature. There is another way to avoid comparing by string.

We can let each space define a trait:

trait ImmixRegionInspector: RegionIterator {
    type ChildType: Region;
    const HAS_CHILD_REGION: bool;

    fn get_child_inspectors(&self) -> Option<dyn ImmixRegionInspector> {
        if Self::HAS_CHILF_REGION { Some(...) } else { None }
    }
}

impl ImmixRegionInspector for Chunk {
    type ChildType = crate::policy::immix::block::Block;
    const HAS_CHILD_REGION = true;
}

impl ImmixRegionInspector for crate::policy::immix::block::Block {
    type ChildType = crate::policy::immix::line::Line;
    const HAS_CHILD_REGION = true;
}

impl ImmixRegionInspector for crate::policy::immix::line::Line {
    type ChildType = Self; // We can't say None
    const HAS_CHILD_REGION = false;
}

Then we will let ImmixSpace::list_regions call the get_child_inspectors method. ImmixSpace will use downcast to ensure the inspector passed in is an ImmixRegionInspector.

The main difference is that each space decides which trait to downcast into. So for Chunk, ImmixSpace will downcast it to ImmixRegionInspector, while MarkSweepSpace will downcast it to MarkSweepRegionInspector. This will allow the same type to follow different hierarchies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do hope to avoid those policy-specific types.

  • I find having a lot of 'small types' is hard to read when you browse through the code.
  • They make it impossible to iterate through different spaces/regions in a unified way.

We should try get a simple solution. If a string comparison works, we don't need to introduce a new trait and several impls for the trait for every policy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants