
Explain Storage Spaces file placement optimization details #3181

Open

Description

@i3v

It is important to understand the optimization strategy and limitations of Storage Spaces in order to use it efficiently.
However, there is effectively no official information on how it works.
Some aspects to cover are:

  1. NTFS metadata

    1. Is Storage Spaces smart enough to automatically keep all NTFS metadata (for files on both the Capacity and Performance tiers) on the Performance tier?
    2. Or does it treat that metadata as regular files, moving it around during optimization based on usage?
    3. Or does it always keep NTFS metadata next to the file data (perhaps for disaster recovery)?
    4. Can a user "pin" all metadata to the Performance tier, the way a regular file can be pinned (see the pinning sketch after this list)?
  2. Heat map

    1. What is the "heat map" resolution? Is it per file, per 256 MB slab, or per NTFS allocation unit?
    2. What does the "heat map" actually count? Is it "file open" operations or "block read/write" operations?
    3. Let's assume I have two files, 100 GB each. I open the first file 1,000 times a day and read a random 16 KB each time. I open the second file twice a day and read the whole file each time. My Performance tier is, say, 100 GB. What would be moved to the Performance tier?
    4. A few other examples explaining how the heat map works would be nice as well.
    5. Is it possible to explicitly view or export the heat map? Where is it stored? Is it possible to set up notifications, e.g. when some file becomes very hot (to trigger optimization, or perhaps just to pin those particular files)?
  3. In which cases is data moved across tiers?

    1. Is it true that Storage Spaces never moves any NTFS file to another tier without Optimize-Volume or defrag.exe /C /H /K /G being run (either automatically or manually; see the invocations after this list)? For example, is it correct that even if some file on the Capacity tier suddenly becomes very hot, it won't be moved to the Performance tier until the next optimization run?
    2. Is ReFS any different?
  4. File placement and "slab consolidation".

    1. Let's assume a user writes many (~100,000) files to a single folder, 200 KB each. The files are generated in parallel on 128 CPU threads, and the CPU is definitely not the bottleneck. The storage space is Simple, with 2 columns, 128 KB interleave, 10 physical HDDs, and an NTFS file system with a 64 KB allocation unit (see the provisioning sketch after this list). Assume there is either no tiering (just HDDs) or that the Performance tier is already filled with something. Is Storage Spaces smart enough to evenly distribute the outstanding write operations across the 10 physical drives? Is it smart enough to keep NTFS folder-metadata updates from becoming the bottleneck (e.g. by aggressively moving that metadata to the Performance tier)? Or would it try to optimize "slab consolidation" and thus write the incoming small files sequentially to just two physical HDDs until the first 256 MB slab is filled with data?
    2. Assuming all those files are already written, would Optimize-Volume (with or without -SlabConsolidate) attempt to collect them together to minimize the total number of 256 MB slabs? Does it take into account that all those files are in a single folder, or would it just scatter those small files into other pre-existing 256 MB slabs that are not 100% filled with data?
    3. Wouldn't this scenario actually work better with just one column? Since each file is relatively small, the overall performance looks limited by the number of disk seek operations, and with more columns there are more of them (two HDDs must each seek for their part of every file), wouldn't it?
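
For reference on the pinning mentioned in question 1.4, here is a minimal sketch of how a regular file can be pinned to the performance tier today; the drive letter, file path, and tier friendly name below are assumptions for illustration:

```powershell
# Pin a file to the SSD tier of a tiered volume. The tier friendly name
# "Performance" is assumed; check Get-StorageTier for the actual names in your pool.
Set-FileStorageTier -FilePath "D:\Data\hot.vhdx" -DesiredStorageTierFriendlyName "Performance"

# The pin only takes effect at the next tier optimization pass.
Optimize-Volume -DriveLetter D -TierOptimize

# Review the placement status of pinned files on the volume.
Get-FileStorageTier -VolumeDriveLetter D
```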
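
For concreteness, the two manual optimization entry points referred to in question 3.1 are (the drive letter is just an example):

```powershell
# PowerShell: run tier optimization, then slab consolidation, on volume D:
Optimize-Volume -DriveLetter D -TierOptimize
Optimize-Volume -DriveLetter D -SlabConsolidate

# defrag.exe equivalent mentioned in the question:
#   /C = all volumes, /H = normal priority, /K = slab consolidation, /G = tier optimization
defrag.exe /C /H /K /G
```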
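
And for question 4.1, the configuration described there could be provisioned roughly like this (pool and virtual disk names are illustrative assumptions):

```powershell
# Simple (non-resilient) space with 2 columns and a 128 KB interleave,
# carved out of an existing pool of 10 HDDs named "Pool1" (name assumed).
New-VirtualDisk -StoragePoolFriendlyName "Pool1" -FriendlyName "SimpleSpace" `
    -ResiliencySettingName Simple -NumberOfColumns 2 -Interleave 128KB -UseMaximumSize

# Initialize, partition, and format with a 64 KB NTFS allocation unit.
Get-VirtualDisk -FriendlyName "SimpleSpace" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem NTFS -AllocationUnitSize 65536
```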

