Add embedded on-disk transactional graph disk storage #1

gitbuda · 2025-06-13T18:44:39Z

TODOs 🛠️

Integrate with Memgraph and make the storage fully transparent, i.e. entirely behind Memgraph's query engine. An example / previous attempt is under Experiment alternative storage gitbuda/memgraph#12.
Parquet seems to be the dominant strategy. Can we make better use of Arrow so that its disk overhead pays off?

Testing and Benchmarking 📈👀

2025-06-08 non-transactional, no-indexing, storage on local disk, Mac M1

NOTE: Arrow was used as the input format for Parquet.
NOTE: Performance is similar, while Parquet files are ~4x smaller (for bigger batch sizes).

2025-06-15 transactional (RU and RC are the same), no-indexing, storage on local disk, Mac M1
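For context on the "RU and RC are the same" remark, here is a minimal sketch of where the two levels would be expected to differ once they are separated. It is a toy in-memory store, not the code from this PR: `ToyStore`, `IsolationLevel`, and every method name below are hypothetical, and the scenario only follows the spirit of the G1a (aborted read) case.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical toy types for illustration; not the classes from this PR.
enum class IsolationLevel { kReadUncommitted, kReadCommitted };

class ToyStore {
 public:
  // Seed a committed value (initial setup).
  void Put(const std::string& key, int value) { committed_[key] = value; }

  // Stage an uncommitted write owned by transaction `txn`.
  // Simplification: at most one pending writer per key.
  void Write(std::uint64_t txn, const std::string& key, int value) {
    pending_[key] = Pending{txn, value};
  }

  void Commit(std::uint64_t txn) { Finish(txn, /*publish=*/true); }
  void Abort(std::uint64_t txn) { Finish(txn, /*publish=*/false); }

  // RU may return another transaction's pending write; RC only returns
  // committed data or the reader's own pending write.
  std::optional<int> Read(std::uint64_t txn, const std::string& key,
                          IsolationLevel level) const {
    auto p = pending_.find(key);
    if (p != pending_.end()) {
      const bool own_write = p->second.txn == txn;
      if (own_write || level == IsolationLevel::kReadUncommitted) {
        return p->second.value;
      }
    }
    auto c = committed_.find(key);
    if (c == committed_.end()) return std::nullopt;
    return c->second;
  }

 private:
  struct Pending {
    std::uint64_t txn;
    int value;
  };

  // Drop the transaction's pending writes, publishing them first on commit.
  void Finish(std::uint64_t txn, bool publish) {
    for (auto it = pending_.begin(); it != pending_.end();) {
      if (it->second.txn != txn) {
        ++it;
        continue;
      }
      if (publish) committed_[it->first] = it->second.value;
      it = pending_.erase(it);
    }
  }

  std::map<std::string, int> committed_;
  std::map<std::string, Pending> pending_;
};

// G1a-style scenario: T1 writes and then aborts while T2 reads.
inline void AbortedReadDemo() {
  ToyStore store;
  store.Put("id=1", 10);
  store.Write(/*txn=*/1, "id=1", 101);
  // RU exposes T1's dirty value; RC keeps returning the committed one.
  assert(store.Read(2, "id=1", IsolationLevel::kReadUncommitted) == 101);
  assert(store.Read(2, "id=1", IsolationLevel::kReadCommitted) == 10);
  store.Abort(1);
  // After the abort, both levels see the original value again.
  assert(store.Read(2, "id=1", IsolationLevel::kReadUncommitted) == 10);
}
```

If both levels return the same value for T2's first read, the two levels are effectively the same for this scenario, which is one quick way to check whether RU and RC have actually diverged.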

Ideas 🤔

Use Arrow to serialize data (the same Node serialization could be used for both the WAL and the primary storage files) and Parquet to store data on disk. NOTE: WAL is not required for on-disk systems.
It's probably possible to detect serialization errors by storing additional metadata about updates at WAL creation time (that metadata could later be deleted by GC).
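The first idea, one Node serialization shared by the WAL and the primary storage files, could look roughly like the sketch below. It uses a plain length-prefixed byte layout as a stand-in for Arrow so that it compiles with the standard library alone. `Node`, `AppendNode`, and `ReadNode` are hypothetical names, and the two byte vectors stand in for the WAL and the primary storage file.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical minimal node; the real Node in this PR may carry more fields.
struct Node {
  std::uint64_t id;
  std::string label;
};

// Append a 64-bit value in little-endian byte order.
inline void AppendU64(std::vector<std::uint8_t>& out, std::uint64_t v) {
  for (int i = 0; i < 8; ++i) {
    out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
  }
}

// Read a little-endian 64-bit value; returns false if the buffer is too short.
inline bool ReadU64(const std::vector<std::uint8_t>& buf, std::size_t& offset,
                    std::uint64_t& v) {
  if (offset > buf.size() || buf.size() - offset < 8) return false;
  v = 0;
  for (int i = 0; i < 8; ++i) {
    v |= static_cast<std::uint64_t>(buf[offset + i]) << (8 * i);
  }
  offset += 8;
  return true;
}

// Single serialization routine: any sink (WAL or primary storage) gets the
// same bytes for the same node. Layout: id, label length, label bytes.
inline void AppendNode(std::vector<std::uint8_t>& sink, const Node& n) {
  AppendU64(sink, n.id);
  AppendU64(sink, n.label.size());
  sink.insert(sink.end(), n.label.begin(), n.label.end());
}

// Decode one node starting at `offset`; on success `offset` moves past it.
inline bool ReadNode(const std::vector<std::uint8_t>& buf, std::size_t& offset,
                     Node& n) {
  std::size_t pos = offset;
  std::uint64_t id = 0;
  std::uint64_t len = 0;
  if (!ReadU64(buf, pos, id) || !ReadU64(buf, pos, len)) return false;
  if (len > buf.size() - pos) return false;  // truncated record
  n.id = id;
  n.label.assign(reinterpret_cast<const char*>(buf.data()) + pos,
                 static_cast<std::size_t>(len));
  offset = pos + static_cast<std::size_t>(len);
  return true;
}
```

Because both sinks share `AppendNode`, a record replayed from the WAL is byte-identical to the one in primary storage. With Arrow the same property would presumably come from reusing one schema and one record-batch builder for both paths.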


gitbuda added 15 commits June 13, 2025 20:43

Add embedded on-disk transactional graph disk storage

73d2a2a

Update diagram.dot

cf80922

Refactor TransactionalGraph to hold one set of Transaction objects

ed257a9

Add G1a hermitage test (impl of RU is still wrong)

cb7d357

Add params to the Hermitage gtest

024d43c

Add multiple isolation levels to the benchmarking

9ddc02a

Add G1b test case

a085e71

Fix initial setup to follow the original script

c87525d

Add G1c test case

7763994

Add OTV test case

11e3e07

Improve the hermitage test outcome formatting

a3e732c

Merge diverged plot scripts

160bd92

Add basic/slow index implementation under graph + correctness testing

6fb84d6

Move indexing code to a new file

f010068

Replace std::unordered_map with folly concurrent skiplist as the indexing data structure

7abbfdf
