[POC] feat: storage for optimizer core objects #9

Closed
wants to merge 12 commits into from
2 changes: 1 addition & 1 deletion .github/workflows/check.yml
Original file line number Diff line number Diff line change
@@ -91,7 +91,7 @@ jobs:
uses: dtolnay/install@cargo-docs-rs
- name: cargo docs-rs
# TODO: Once we figure out the crates, rename this.
- run: cargo docs-rs -p optd-tmp
+ run: cargo docs-rs -p optd-storage
hack:
# cargo-hack checks combinations of feature flags to ensure that features are all additive
# which is required for feature unification
9 changes: 8 additions & 1 deletion .gitignore
@@ -18,4 +18,11 @@ Cargo.lock
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
#.idea/

### Project Specific ###

# The memo table database for testing purposes.
test_memo.db
# Storing environment variables.
.env
13 changes: 12 additions & 1 deletion Cargo.toml
@@ -1,3 +1,14 @@
[workspace]
- members = ["optd-tmp"]
+ members = ["optd-storage"]
resolver = "2"

[workspace.dependencies]
anyhow = "1"
chrono = "0.4.39"
diesel = { version = "2.2", features = [
"sqlite",
"returning_clauses_for_sqlite_3_35",
"chrono",
] }
# Using a bundled version of sqlite3-sys to avoid build issues.
libsqlite3-sys = { version = "0.30", features = ["bundled"] }
15 changes: 15 additions & 0 deletions diesel.toml
@@ -0,0 +1,15 @@
# For documentation on how to configure this file,
# see https://diesel.rs/guides/configuring-diesel-cli

[print_schema]
# The file diesel will write the generated schema to.
file = "optd-storage/src/storage/schema.rs"


# A column of type `INTEGER PRIMARY KEY` becomes an alias for the 64-bit signed integer `ROWID`.
# See https://sqlite.org/autoinc.html for more details.
sqlite_integer_primary_key_is_bigint = true

[migrations_directory]
# The directory where the migration files are located.
dir = "optd-storage/migrations"
1 change: 1 addition & 0 deletions docs/src/SUMMARY.md
@@ -9,6 +9,7 @@
# Contributor Guide

- [Installation]()
- [Working with diesel-rs](./contributor_guide/diesel.md)

# RFCs

55 changes: 55 additions & 0 deletions docs/src/contributor_guide/diesel.md
@@ -0,0 +1,55 @@
# Working with diesel-rs

[Diesel](https://diesel.rs/) is the ORM framework we use to persist the core objects in the optd query optimizer. We chose Diesel over the alternatives mainly for its compile-time safety guarantees, which pair well with our table-per-operator-kind model.

This guide assumes that you already have the `sqlite3` binary installed.

## Setup

When working with Diesel for the first time, you can use the setup script located at `scripts/setup.sh`. The script installs the Diesel CLI tool, generates a testing memo table database at the project root, and runs the Diesel setup script.

For more details, follow the [Getting Started with Diesel](https://diesel.rs/guides/getting-started.html) guide.

## Making changes

To generate a new migration, use the following command:

```shell
diesel migration generate <migration_name>
```

Diesel CLI will create two empty files in the `optd-storage/migrations` folder. You will see output that looks something like this:

```shell
Creating optd-storage/migrations/2025-01-20-153830_<migration_name>/up.sql
Creating optd-storage/migrations/2025-01-20-153830_<migration_name>/down.sql
```

The `up.sql` file should contain the changes you want to apply, and `down.sql` should contain the commands that revert them.

Until optd becomes stable, it is fine to modify existing migrations directly.

To apply the new migration, run:

```shell
diesel migration run
```

You can also check that `down.sql` properly reverts the change:

```shell
diesel migration redo [-n <REDO_NUMBER>]
```

You can also use the following command to revert changes:

```shell
diesel migration revert [-n <REVERT_NUMBER>]
```

## Adding a new operator

(TODO)

## Adding a new property

(TODO)
10 changes: 10 additions & 0 deletions optd-storage/Cargo.toml
@@ -0,0 +1,10 @@
[package]
name = "optd-storage"
version = "0.1.0"
edition = "2021"

[dependencies]
diesel.workspace = true
chrono.workspace = true
anyhow.workspace = true
libsqlite3-sys.workspace = true
Empty file added optd-storage/migrations/.keep
Empty file.
@@ -0,0 +1 @@
DROP TABLE rel_groups;
14 changes: 14 additions & 0 deletions optd-storage/migrations/2025-01-19-032646_create_rel_groups/up.sql
@@ -0,0 +1,14 @@
CREATE TABLE rel_groups (
-- The group identifier
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The optimization status of the group.
-- It could be:
-- Unexplored, Exploring, Explored, Optimizing, Optimized.
-- `0` indicates `Unexplored`.
status INTEGER NOT NULL,
-- Time at which the group is created.
created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL,
-- The group identifier of the representative.
rep_id BIGINT,
FOREIGN KEY (rep_id) REFERENCES rel_groups(id) ON DELETE CASCADE ON UPDATE CASCADE
);
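
The `status` column stores the group's optimization state as an integer. Below is a minimal sketch of how that encoding could be mirrored on the Rust side. The `GroupStatus` type is hypothetical and not part of this PR. The migration only states that `0` means `Unexplored`, so the remaining discriminants are an assumption that follows the order listed in the SQL comment.

```rust
use std::convert::TryFrom;

/// Optimization status of a group, as stored in the `status` INTEGER column.
/// Assumption: discriminants follow the order of the SQL comment; only
/// `0 => Unexplored` is confirmed by the migration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum GroupStatus {
    Unexplored = 0,
    Exploring = 1,
    Explored = 2,
    Optimizing = 3,
    Optimized = 4,
}

impl TryFrom<i32> for GroupStatus {
    type Error = i32;

    /// Decodes a raw column value, handing back the raw value on failure.
    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Self::Unexplored),
            1 => Ok(Self::Exploring),
            2 => Ok(Self::Explored),
            3 => Ok(Self::Optimizing),
            4 => Ok(Self::Optimized),
            other => Err(other),
        }
    }
}

impl From<GroupStatus> for i32 {
    /// Encodes the status back into the integer written to the column.
    fn from(status: GroupStatus) -> i32 {
        status as i32
    }
}
```

Rejecting unknown integers at the boundary keeps a corrupted or newer database from silently turning into a valid-looking status.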
@@ -0,0 +1 @@
DROP TABLE logical_props;
@@ -0,0 +1,9 @@
CREATE TABLE logical_props (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The relational group that shares this logical property entry.
group_id BIGINT NOT NULL,
-- The number of rows produced by this relation.
card_est BIGINT NOT NULL,

FOREIGN KEY(group_id) REFERENCES rel_groups(id) ON DELETE CASCADE ON UPDATE CASCADE
);
@@ -0,0 +1 @@
DROP TABLE logical_typ_descs;
@@ -0,0 +1,4 @@
CREATE TABLE logical_typ_descs (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL
);
@@ -0,0 +1 @@
DROP TABLE logical_exprs;
@@ -0,0 +1,13 @@
-- The relational logical expressions table specifies which group a logical expression belongs to.
CREATE TABLE logical_exprs (
-- The logical expression id.
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The type descriptor of the logical expression.
typ_desc BIGINT NOT NULL,
-- The group identifier of the logical expression.
group_id BIGINT NOT NULL, -- rel_groups.id
-- Time at which the logical expression is created.
created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL,
FOREIGN KEY (typ_desc) REFERENCES logical_typ_descs(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (group_id) REFERENCES rel_groups(id) ON DELETE CASCADE ON UPDATE CASCADE
);
@@ -0,0 +1 @@
DROP TABLE physical_typ_descs;
@@ -0,0 +1,5 @@
CREATE TABLE physical_typ_descs (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- Name of the physical operator.
name TEXT NOT NULL
);
@@ -0,0 +1 @@
DROP TABLE physical_props;
@@ -0,0 +1,5 @@
CREATE TABLE physical_props (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- Payload type.
payload BLOB NOT NULL
);
@@ -0,0 +1 @@
DROP TABLE physical_exprs;
@@ -0,0 +1,19 @@
-- The relational physical expressions table specifies which group a physical expression belongs to.
-- It also specifies the derived physical property and the cost associated with the physical expression.
CREATE TABLE physical_exprs (
-- The physical expression id.
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The type descriptor of the physical expression.
typ_desc BIGINT NOT NULL,
-- The group this physical expression belongs to.
group_id BIGINT NOT NULL,
-- The physical property derived from the properties of the child nodes.
derived_phys_prop_id BIGINT NOT NULL,
-- The cost associated with the physical expression.
cost DOUBLE NOT NULL,
-- Time at which the physical expression is created.
created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL,
FOREIGN KEY (typ_desc) REFERENCES physical_typ_descs(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (group_id) REFERENCES rel_groups(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (derived_phys_prop_id) REFERENCES physical_props(id) ON DELETE CASCADE ON UPDATE CASCADE
);
@@ -0,0 +1 @@
DROP TABLE rel_subgroup_winners;
@@ -0,0 +1,9 @@
-- The winners table records the winner of a group with some required physical property.
CREATE TABLE rel_subgroup_winners (
-- The subgroup id of the winner, i.e. the winner of the group with `group_id` and some required physical property.
subgroup_id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The physical expression id of the winner.
physical_expr_id BIGINT NOT NULL,
FOREIGN KEY (subgroup_id) REFERENCES rel_subgroups(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (physical_expr_id) REFERENCES physical_exprs(id) ON DELETE CASCADE ON UPDATE CASCADE
);
@@ -0,0 +1 @@
DROP TABLE scalar_groups;
@@ -0,0 +1,14 @@
CREATE TABLE scalar_groups (
-- The group identifier
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The optimization status of the group.
-- It could be:
-- Unexplored, Exploring, Explored, Optimizing, Optimized.
-- `0` indicates `Unexplored`.
status INTEGER NOT NULL,
-- Time at which the group is created.
created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL,
-- The group identifier of the representative.
rep_id BIGINT,
FOREIGN KEY (rep_id) REFERENCES scalar_groups(id) ON DELETE CASCADE ON UPDATE CASCADE
);
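
The `rep_id` column points a group at its representative. Below is a minimal sketch of how a caller might resolve that chain in memory. The `find_representative` helper and the `HashMap` layout are hypothetical, and the sketch assumes that a NULL `rep_id` marks a group that is its own representative, which the migration does not state explicitly.

```rust
use std::collections::HashMap;

/// Follows `rep_id` links until reaching a group whose `rep_id` is NULL.
/// `rep_of` maps a group id to its `rep_id` column (`None` stands for NULL).
/// Returns `None` if `id` is unknown, a link dangles, or the chain loops.
pub fn find_representative(rep_of: &HashMap<i64, Option<i64>>, id: i64) -> Option<i64> {
    let mut current = id;
    // A valid chain visits each group at most once, so cap the number of lookups.
    for _ in 0..=rep_of.len() {
        match rep_of.get(&current)? {
            // Assumption: NULL `rep_id` means the group represents itself.
            None => return Some(current),
            Some(next) => current = *next,
        }
    }
    // More lookups than groups means the links form a cycle.
    None
}
```

The lookup cap guards against cycles, which the foreign key constraint alone does not rule out.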
@@ -0,0 +1 @@
DROP TABLE scalar_typ_descs;
@@ -0,0 +1,5 @@
CREATE TABLE scalar_typ_descs (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- Name of the scalar operator.
name TEXT NOT NULL
);
@@ -0,0 +1 @@
DROP TABLE scalar_props;
@@ -0,0 +1,7 @@
-- The scalar properties table contains the scalar property associated with
-- some scalar expression.
-- TODO(yuchen): add scalar properties.
CREATE TABLE scalar_props (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
payload BLOB NOT NULL
);
@@ -0,0 +1 @@
DROP TABLE scalar_exprs;
@@ -0,0 +1,16 @@
-- The scalar expressions table specifies which group a scalar expression belongs to.
-- It also specifies the cost associated with the scalar expression.
CREATE TABLE scalar_exprs (
-- The scalar expression id.
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The type descriptor of the scalar expression.
typ_desc BIGINT NOT NULL,
-- The group identifier of the scalar expression.
group_id BIGINT NOT NULL,
-- Time at which the scalar expression is created.
created_at TIMESTAMP DEFAULT (CURRENT_TIMESTAMP) NOT NULL,
-- The cost associated with computing the scalar expression.
cost DOUBLE, -- TODO: This can be NULL, do we want a separate table?
FOREIGN KEY (typ_desc) REFERENCES scalar_typ_descs(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (group_id) REFERENCES scalar_groups(id) ON DELETE CASCADE ON UPDATE CASCADE
);
@@ -0,0 +1 @@
DROP TABLE scalar_group_winners;
@@ -0,0 +1,15 @@
-- The scalar group winners table records the winner of a scalar group.
CREATE TABLE scalar_group_winners (
-- The scalar group we are interested in.
group_id BIGINT NOT NULL,
-- The winner of the group with `group_id`.
scalar_expr_id BIGINT NOT NULL,
PRIMARY KEY (group_id),
FOREIGN KEY (group_id) REFERENCES scalar_groups(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (scalar_expr_id) REFERENCES scalar_exprs(id) ON DELETE CASCADE ON UPDATE CASCADE
);

-- Could also do a query to compute the winner:
-- SELECT MIN(cost), [all other fields]
-- FROM scalar_exprs
-- WHERE group_id = <input_group>;
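
The commented query above computes the winner on demand instead of storing it. Below is a minimal in-memory sketch of the same selection. `ScalarExpr` and `winner_of` are hypothetical names, and the struct mirrors only the `scalar_exprs` columns needed here. It assumes that expressions with a NULL cost cannot win.

```rust
/// Hypothetical in-memory mirror of a `scalar_exprs` row (subset of columns).
#[derive(Debug, Clone, PartialEq)]
pub struct ScalarExpr {
    pub id: i64,
    pub group_id: i64,
    /// `None` mirrors a NULL `cost`.
    pub cost: Option<f64>,
}

/// Returns the cheapest costed expression in `group_id`, if there is one.
/// Rows with a NULL cost are skipped, much like SQL's `MIN` ignores NULLs.
pub fn winner_of(exprs: &[ScalarExpr], group_id: i64) -> Option<&ScalarExpr> {
    exprs
        .iter()
        .filter(|e| e.group_id == group_id)
        // Keep only rows that have a cost, pairing each row with its cost.
        .filter_map(|e| e.cost.map(|c| (e, c)))
        // `f64` is not `Ord`, so compare with the total ordering on floats.
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(e, _)| e)
}
```

Computing the winner this way avoids keeping `scalar_group_winners` in sync, at the price of scanning the group's expressions on every lookup.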
@@ -0,0 +1 @@
DROP TABLE rel_subgroups;
@@ -0,0 +1,10 @@
-- The relational subgroups table specifies the subgroups of a group with some required physical property.
CREATE TABLE rel_subgroups (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-- The group the subgroup belongs to.
group_id BIGINT NOT NULL,
-- The required physical property of the subgroup.
required_phys_prop_id BIGINT NOT NULL,
FOREIGN KEY (group_id) REFERENCES rel_groups(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (required_phys_prop_id) REFERENCES physical_props(id) ON DELETE CASCADE ON UPDATE CASCADE
);
@@ -0,0 +1 @@
DROP TABLE rel_subgroup_physical_exprs;
@@ -0,0 +1,12 @@
-- The relational subgroup expressions table specifies the physical expressions of a subgroup.
-- It is an m:n junction table since a subgroup can have multiple physical expressions,
-- and a physical expression can belong to multiple subgroups.
CREATE TABLE rel_subgroup_physical_exprs (
-- The subgroup the physical expression belongs to.
subgroup_id BIGINT NOT NULL,
-- The physical expression id.
physical_expr_id BIGINT NOT NULL,
PRIMARY KEY (subgroup_id, physical_expr_id),
FOREIGN KEY (subgroup_id) REFERENCES rel_subgroups(id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (physical_expr_id) REFERENCES physical_exprs(id) ON DELETE CASCADE ON UPDATE CASCADE
);
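
The composite primary key `(subgroup_id, physical_expr_id)` makes each membership pair unique while allowing lookups in both directions. Below is a minimal in-memory sketch of that m:n relationship. `SubgroupExprs` and its methods are hypothetical stand-ins for the table and are not part of this PR.

```rust
use std::collections::BTreeSet;

/// In-memory stand-in for `rel_subgroup_physical_exprs`: a set of
/// `(subgroup_id, physical_expr_id)` pairs, unique like the composite key.
#[derive(Debug, Default)]
pub struct SubgroupExprs {
    pairs: BTreeSet<(i64, i64)>,
}

impl SubgroupExprs {
    /// Links an expression to a subgroup; returns `false` if the pair exists.
    pub fn link(&mut self, subgroup_id: i64, physical_expr_id: i64) -> bool {
        self.pairs.insert((subgroup_id, physical_expr_id))
    }

    /// All physical expressions in a subgroup, in ascending id order.
    pub fn exprs_of(&self, subgroup_id: i64) -> Vec<i64> {
        self.pairs
            .range((subgroup_id, i64::MIN)..=(subgroup_id, i64::MAX))
            .map(|&(_, expr)| expr)
            .collect()
    }

    /// All subgroups that contain a physical expression, in ascending id order.
    pub fn subgroups_of(&self, physical_expr_id: i64) -> Vec<i64> {
        self.pairs
            .iter()
            .filter(|&&(_, expr)| expr == physical_expr_id)
            .map(|&(subgroup, _)| subgroup)
            .collect()
    }
}
```

As with the table, looking up by `subgroup_id` uses the key's leading column, while looking up by `physical_expr_id` scans every pair unless a second index is added.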
2 changes: 2 additions & 0 deletions optd-tmp/src/lib.rs → optd-storage/src/lib.rs
@@ -1,3 +1,5 @@
pub mod storage;

pub fn add(left: u64, right: u64) -> u64 {
left + right
}
2 changes: 2 additions & 0 deletions optd-storage/src/storage.rs
@@ -0,0 +1,2 @@
pub mod models;
pub mod schema;