@mcimerman mcimerman commented May 8, 2025

HelenRAID

Part of the bachelor thesis "Software RAID for HelenOS".

For implementation details and architecture, please refer to Chapter 3 of the thesis text (page 44).

This PR includes:

  • lib/device/hr client IPC methods
  • uspace/app/hrctl - CLI tool for RAID volume management
  • uspace/srv/bd/hr - the server
  • uspace/app/bdwrite - tool for writing cyclic blocks (filled with 'A' - 'Z')
    • was used for testing volume LBA mappings to underlying devices
    • will be deleted from this PR
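
To illustrate how such a pattern helps verify LBA mappings, here is a minimal sketch. It assumes that block number n is filled entirely with the letter at position n mod 26; that reading of "cyclic" and the names block_letter and fill_block are assumptions made for illustration, not code taken from bdwrite.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: letter used to fill the block at address ba. */
static char block_letter(uint64_t ba)
{
	return (char)('A' + (int)(ba % 26));
}

/* Hypothetical helper: fill one block-sized buffer with its letter. */
static void fill_block(uint8_t *buf, size_t bsize, uint64_t ba)
{
	memset(buf, block_letter(ba), bsize);
}
```

With a pattern like this, reading a block from an underlying device shows which volume block was mapped there, up to the 26-block period.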

This PR does not yet include:

  • Unit tests
    • already written
    • 11 scenarios, 38 total unit tests
    • will be added after an automation problem is solved or discussed

Todo list

This is just a list of things I would like to do before HelenRAID gets merged.

  • Include unit tests
  • Rename hr (server) and hrctl (utility) to raid (server) and raidctl (utility)
  • Delete uspace/app/bdwrite
  • Discuss if Linux's MD RAID support shall be kept (GPL)

Feel free to suggest additional changes or modifications.

Try it

The easiest way to try HelenRAID is to check out this branch, which was used to "ship" HelenRAID for the thesis.

I have written extensive instructions on how to try and test HelenRAID. For these, please refer to Appendix D of the thesis text, specifically:

  • Section D.2 for launching unit tests
  • Section D.3 for manual test proposals
  • Section D.4 for testing foreign (to HelenRAID) RAID metadata volumes

For documentation of the control utility (hrctl), please refer to Appendix C of the thesis text.

Read the block on the first iteration directly into
xorbuf. This way one memset and one xor
are saved.
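
The idea can be sketched as follows. This is only an illustration under stated assumptions: the blocks are already in memory, a memcpy stands in for the direct read into xorbuf, and the names xor_into and xor_blocks are hypothetical rather than taken from the server code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR len bytes of src into dst. */
static void xor_into(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] ^= src[i];
}

/*
 * XOR together count blocks of len bytes stored back to back in blocks.
 * The first block is placed straight into xorbuf, so the buffer needs
 * no zeroing and the first XOR pass is skipped.
 */
static void xor_blocks(uint8_t *xorbuf, const uint8_t *blocks, size_t count,
    size_t len)
{
	if (count == 0) {
		memset(xorbuf, 0, len);
		return;
	}

	/* Stands in for reading the first block directly into xorbuf. */
	memcpy(xorbuf, blocks, len);

	for (size_t i = 1; i < count; i++)
		xor_into(xorbuf, blocks + i * len, len);
}
```

Compared with zeroing xorbuf and XOR-ing every block into it, this version does one memset and one XOR pass less per stripe.
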
Also prepare the metadata for metadata versioning
and a sync counter.
Remove raid4.c entirely, as RAID4 is implemented
via RLQs in RAID5.
This fibril pool allows execution of grouped work units,
providing pre-allocated storage for the workers.

Based on an idea from Vojtech Horky.
Also test whether all workers have finished against the submitted
rather than the started worker count, and thus remove hr_fgroup_t.wus_started.
mcimerman added 30 commits June 29, 2025 18:24
Even when a rebuild operation is not allowed to happen,
we must still take rebuilt extents into account,
because writes below the resync_offset must be
done as in the ONLINE state.
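
A minimal sketch of the rule described here, decided per block. The state names, the function extent_accepts_write, and the choice to skip a rebuilding extent for blocks at or above the offset are assumptions for illustration, not the server's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extent states, for illustration only. */
typedef enum {
	EXT_ONLINE,
	EXT_REBUILD,
	EXT_FAILED
} ext_state_t;

/*
 * Decide whether a write to block address ba should also go to an extent.
 * Blocks below resync_offset on a rebuilding extent are already in sync,
 * so they are written as if the extent were ONLINE.
 */
static bool extent_accepts_write(ext_state_t state, uint64_t ba,
    uint64_t resync_offset)
{
	switch (state) {
	case EXT_ONLINE:
		return true;
	case EXT_REBUILD:
		return ba < resync_offset;
	default:
		return false;
	}
}
```
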
First - take first usable extent.
Closest - take extent with last seek position.
Round-robin - always switch extents.
Split - split I/O to multiple extents.
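
The first three strategies can be sketched as a single selection function. This is an illustration under assumptions: extent_t, read_strategy_t and pick_extent are hypothetical names, and "closest" is read here as the usable extent whose last seek position is nearest to the requested block. The split strategy is left out because it divides one request across extents instead of picking a single one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, for illustration only. */
typedef enum {
	STRAT_FIRST,
	STRAT_CLOSEST,
	STRAT_ROUND_ROBIN
} read_strategy_t;

typedef struct {
	bool usable;       /* extent can serve reads */
	uint64_t last_pos; /* last seek position on this extent */
} extent_t;

/* Return the index of the extent to read from, or -1 if none is usable. */
static int pick_extent(const extent_t *exts, size_t n, read_strategy_t strat,
    uint64_t ba, size_t *rr_cursor)
{
	int best = -1;
	uint64_t best_dist = 0;

	switch (strat) {
	case STRAT_FIRST:
		for (size_t i = 0; i < n; i++) {
			if (exts[i].usable)
				return (int)i;
		}
		return -1;
	case STRAT_CLOSEST:
		for (size_t i = 0; i < n; i++) {
			if (!exts[i].usable)
				continue;
			uint64_t d = exts[i].last_pos > ba ?
			    exts[i].last_pos - ba : ba - exts[i].last_pos;
			if (best < 0 || d < best_dist) {
				best = (int)i;
				best_dist = d;
			}
		}
		return best;
	case STRAT_ROUND_ROBIN:
		/* Start at the cursor and advance it past the chosen extent. */
		for (size_t k = 0; k < n; k++) {
			size_t i = (*rr_cursor + k) % n;
			if (exts[i].usable) {
				*rr_cursor = (i + 1) % n;
				return (int)i;
			}
		}
		return -1;
	}

	return -1;
}
```

The caller keeps rr_cursor between requests so that round-robin alternates among usable extents and skips unusable ones.
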