feat: DSL parser + AST implementation #23

Merged 66 commits on Feb 11, 2025
Changes from 65 commits
e15907e
add logical side of schema
yliang412 Jan 30, 2025
a15bff4
add and impl RelationChildren and ScalarChildren traits
yliang412 Jan 30, 2025
e30b2c9
create sequence
yliang412 Jan 31, 2025
c5ec6cb
Merge branch 'main' into yuchen/initial-storage
yliang412 Feb 1, 2025
5aaeefc
Add doc of rule engine
AlSchlo Feb 2, 2025
e49b7a6
Be more specific about rule application WITH
AlSchlo Feb 2, 2025
92a3546
Fix indentation error
AlSchlo Feb 2, 2025
639c6f3
Fix again
AlSchlo Feb 2, 2025
134dd04
Add new IR and update doc
AlSchlo Feb 2, 2025
893ae2d
Add more details to doc and refine IR
AlSchlo Feb 2, 2025
66b43d7
Refine IR
AlSchlo Feb 2, 2025
69ca1a0
Update docs
AlSchlo Feb 2, 2025
5d22cde
Remove mention of tree
AlSchlo Feb 2, 2025
df9e9db
Make doc consistent again
AlSchlo Feb 2, 2025
54b47fb
Reorg dir
AlSchlo Feb 2, 2025
b3156eb
Add missing fules
AlSchlo Feb 2, 2025
2f00009
add basic apis
yliang412 Feb 3, 2025
3eff602
Reorg more code
AlSchlo Feb 3, 2025
4d6e33d
Add missing files
AlSchlo Feb 3, 2025
5c7a5cd
Add transformers
AlSchlo Feb 3, 2025
0caffcc
Add actions doc
AlSchlo Feb 3, 2025
00be567
use trigger to merge groups
yliang412 Feb 3, 2025
e999f62
Add analyzer doc
AlSchlo Feb 3, 2025
d49cb8b
Add more docs
AlSchlo Feb 3, 2025
9a88281
Rename cascades into alexis_stuff to allow for cascades subdir
AlSchlo Feb 3, 2025
eaf06c9
Reorg repo structure
AlSchlo Feb 3, 2025
abc0397
Add missing files
AlSchlo Feb 3, 2025
b9578a1
Refactor types
AlSchlo Feb 3, 2025
fc9b050
add scalar stuff
yliang412 Feb 3, 2025
8bad99c
fix projects operator
yliang412 Feb 3, 2025
d8e228e
Partial merge
AlSchlo Feb 3, 2025
b069d2e
The mother of merges
AlSchlo Feb 4, 2025
0ca88dd
Fix horrible Json bug
AlSchlo Feb 4, 2025
f160dce
Update migration
AlSchlo Feb 4, 2025
7e2663d
Start implementation of engine interpreter
AlSchlo Feb 4, 2025
d9a4cd1
add test utilities
yliang412 Feb 4, 2025
178fb97
fix scalar_adds foreign key
yliang412 Feb 4, 2025
64c852b
Latest demo test
AlSchlo Feb 4, 2025
fc36f2f
Refactor and comment operators dir
AlSchlo Feb 5, 2025
92a3c65
Refactor and comment plans dir
AlSchlo Feb 5, 2025
e75fc9b
Refactor and comment values dir
AlSchlo Feb 5, 2025
270d61a
Fix engine/actions & engine/patterns dirs
AlSchlo Feb 5, 2025
f24c089
Add missing files
AlSchlo Feb 5, 2025
59f7a6b
Cleanup ingestion function
AlSchlo Feb 5, 2025
c10166e
First grammar versions
AlSchlo Feb 8, 2025
dca2537
Fix grammar a bit
AlSchlo Feb 8, 2025
f8e777a
Fix more bugs
AlSchlo Feb 8, 2025
0d370ad
First version of grammar
AlSchlo Feb 8, 2025
e76e573
Finish grammar
AlSchlo Feb 8, 2025
027de75
Remove old code
AlSchlo Feb 8, 2025
26227d7
Make braces optional in def
AlSchlo Feb 8, 2025
c75e6bb
Substantially improve the grammar for operators
AlSchlo Feb 8, 2025
db078c6
Fix missing newline
AlSchlo Feb 8, 2025
06a7075
Fix matching and constructor grammar
AlSchlo Feb 8, 2025
1caa03e
Add missing files
AlSchlo Feb 8, 2025
b899699
Finish parser
AlSchlo Feb 9, 2025
3173aed
Heavy refactor & add tests
AlSchlo Feb 10, 2025
3d87b42
Merge with main
AlSchlo Feb 10, 2025
22e7197
Remove dangling print statement
AlSchlo Feb 10, 2025
591f4c3
Fix clippy
AlSchlo Feb 10, 2025
c5bcfa5
Remove duplicate literal
AlSchlo Feb 10, 2025
52d4157
Add comment
AlSchlo Feb 10, 2025
14348e4
Fix flaky cascades test to run in memory
AlSchlo Feb 10, 2025
9b66b85
Fix some grammar bugs and start semantic analysis
AlSchlo Feb 11, 2025
b45654d
Fix clippy
AlSchlo Feb 11, 2025
cfb02bd
Remove analyzer
AlSchlo Feb 11, 2025
53 changes: 53 additions & 0 deletions Cargo.lock

(Generated file; diff not rendered.)

2 changes: 2 additions & 0 deletions optd-core/Cargo.toml
@@ -15,3 +15,5 @@ serde = { version = "1.0", features = ["derive"] }
 serde_json = { version = "1", features = ["raw_value"] }
 dotenvy = "0.15"
 async-recursion = "1.1.1"
+pest = "2.7.15"
+pest_derive = "2.7.15"
2 changes: 1 addition & 1 deletion optd-core/src/cascades/mod.rs
@@ -147,7 +147,7 @@ mod tests {

     #[tokio::test]
     async fn test_ingest_partial_logical_plan() -> anyhow::Result<()> {
-        let memo = SqliteMemo::new("sqlite://memo.db").await?;
+        let memo = SqliteMemo::new_in_memory().await?;
         // select * from t1, t2 where t1.id = t2.id and t2.name = 'Memo' and t2.v1 = 1 + 1
         let partial_logical_plan = filter(
             join(
1 change: 1 addition & 0 deletions optd-core/src/dsl/analyzer/mod.rs
@@ -0,0 +1 @@
pub mod semantic;
258 changes: 258 additions & 0 deletions optd-core/src/dsl/analyzer/semantic.rs
@@ -0,0 +1,258 @@
/*use std::collections::HashSet;
Member

rm unused files?

Collaborator Author

ah damn i pushed name analyzer in here.


use crate::dsl::parser::ast::{Expr, File, Function, Operator, Pattern, Properties, Type};

#[derive(Debug)]
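pub struct SemanticAnalyzer {
logical_properties: HashSet<String>,
operators: HashSet<String>,
// Function names seen so far; validate_function uses this to reject duplicates.
function_names: HashSet<String>,
identifiers: Vec<HashSet<String>>,
}

impl SemanticAnalyzer {
pub fn new() -> Self {
SemanticAnalyzer {
logical_properties: HashSet::new(),
operators: HashSet::new(),
function_names: HashSet::new(),
identifiers: Vec::new(),
}
}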

fn enter_scope(&mut self) {
self.identifiers.push(HashSet::new());
}

fn exit_scope(&mut self) {
self.identifiers.pop();
}

fn add_identifier(&mut self, name: String) -> Result<(), String> {
if let Some(scope) = self.identifiers.last_mut() {
if scope.contains(&name) {
return Err(format!("Duplicate identifier name: {}", name));
}
scope.insert(name);
}
Ok(())
}

fn lookup_identifier(&self, name: &str) -> bool {
self.identifiers
.iter()
.rev()
.any(|scope| scope.contains(name))
}

fn is_valid_scalar_type(&self, ty: &Type) -> bool {
match ty {
Type::Array(inner) => self.is_valid_scalar_type(inner),
Type::Int64 | Type::String | Type::Bool | Type::Float64 => true,
_ => false,
}
}

fn is_valid_logical_type(&self, ty: &Type) -> bool {
match ty {
Type::Array(inner) => self.is_valid_logical_type(inner),
Type::Int64 | Type::String | Type::Bool | Type::Float64 => true,
_ => false,
}
}

fn is_valid_property_type(&self, ty: &Type) -> bool {
match ty {
Type::Array(inner) => self.is_valid_property_type(inner),
Type::Tuple(fields) => fields.iter().all(|f| self.is_valid_property_type(f)),
Type::Map(a, b) => self.is_valid_property_type(a) && self.is_valid_property_type(b),
Type::Int64 | Type::String | Type::Bool | Type::Float64 => true,
Type::Function(_, _) => false,
Type::Operator(_) => false,
}
}

fn validate_properties(&mut self, properties: &Properties) -> Result<(), String> {
for field in &properties.fields {
if !self.is_valid_property_type(&field.ty) {
return Err(format!("Invalid type in properties: {:?}", field.ty));
}
}

self.logical_properties = properties
.fields
.iter()
.map(|field| field.name.clone())
.collect();

Ok(())
}

fn validate_operator(&mut self, operator: &Operator) -> Result<(), String> {
match operator {
Operator::Scalar(scalar_op) => {
if self.operators.contains(&scalar_op.name) {
return Err(format!("Duplicate operator name: {}", scalar_op.name));
}
self.operators.insert(scalar_op.name.clone());

for field in &scalar_op.fields {
if !self.is_valid_scalar_type(&field.ty) {
return Err(format!("Invalid type in scalar operator: {:?}", field.ty));
}
}
}
Operator::Logical(logical_op) => {
if self.operators.contains(&logical_op.name) {
return Err(format!("Duplicate operator name: {}", logical_op.name));
}
self.operators.insert(logical_op.name.clone());

for field in &logical_op.fields {
if !self.is_valid_logical_type(&field.ty) {
return Err(format!("Invalid type in logical operator: {:?}", field.ty));
}
}

// Check that every derived property is declared in the logical properties
for (prop_name, _) in &logical_op.derived_props {
if !self.logical_properties.contains(prop_name) {
return Err(format!(
"Derived property not found in logical properties: {}",
prop_name
));
}
}

// Check that all logical properties fields have corresponding derived properties
for field in &self.logical_properties {
if !logical_op.derived_props.contains_key(field) {
return Err(format!(
"Logical property field '{}' is missing a derived property",
field
));
}
}
}
}
Ok(())
}

// Validate a function definition
fn validate_function(&mut self, function: &Function) -> Result<(), String> {
if self.function_names.contains(&function.name) {
return Err(format!("Duplicate function name: {}", function.name));
}
self.function_names.insert(function.name.clone());

self.enter_scope();
for (param_name, _) in &function.params {
self.add_identifier(param_name.clone())?;
}
self.validate_expr(&function.body)?;
self.exit_scope();

Ok(())
}

// Validate an expression
fn validate_expr(&mut self, expr: &Expr) -> Result<(), String> {
match expr {
Expr::Var(name) => {
if !self.lookup_identifier(name) {
return Err(format!("Undefined identifier: {}", name));
}
}
Expr::Val(name, expr1, expr2) => {
self.validate_expr(expr1)?;
self.add_identifier(name.clone())?;
self.validate_expr(expr2)?;
}
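Expr::Match(expr, arms) => {
self.validate_expr(expr)?;
for arm in arms {
// Each arm gets its own scope so pattern bindings do not leak into sibling arms.
self.enter_scope();
self.validate_pattern(&arm.pattern)?;
self.validate_expr(&arm.expr)?;
self.exit_scope();
}
}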
Expr::If(cond, then_expr, else_expr) => {
self.validate_expr(cond)?;
self.validate_expr(then_expr)?;
self.validate_expr(else_expr)?;
}
Expr::Binary(left, _, right) => {
self.validate_expr(left)?;
self.validate_expr(right)?;
}
Expr::Unary(_, expr) => {
self.validate_expr(expr)?;
}
Expr::Call(func, args) => {
self.validate_expr(func)?;
for arg in args {
self.validate_expr(arg)?;
}
}
Expr::Member(expr, _) => {
self.validate_expr(expr)?;
}
Expr::MemberCall(expr, _, args) => {
self.validate_expr(expr)?;
for arg in args {
self.validate_expr(arg)?;
}
}
Expr::ArrayIndex(array, index) => {
self.validate_expr(array)?;
self.validate_expr(index)?;
}
Expr::Literal(_) => {}
Expr::Fail(_) => {}
Expr::Closure(params, body) => {
self.enter_scope();
for param in params {
self.add_identifier(param.clone())?;
}
self.validate_expr(body)?;
self.exit_scope();
}
_ => {}
}
Ok(())
}

// Validate a pattern
fn validate_pattern(&mut self, pattern: &Pattern) -> Result<(), String> {
match pattern {
Pattern::Bind(name, pat) => {
self.add_identifier(name.clone())?;
self.validate_pattern(pat)?;
}
Pattern::Constructor(_, pats) => {
for pat in pats {
self.validate_pattern(pat)?;
}
}
Pattern::Literal(_) => {}
Pattern::Wildcard => {}
Pattern::Var(name) => {
self.add_identifier(name.clone())?;
}
}
Ok(())
}

// Validate a complete file
pub fn validate_file(&mut self, file: &File) -> Result<(), String> {
self.validate_properties(&file.properties)?;

for operator in &file.operators {
self.validate_operator(operator)?;
}

for function in &file.functions {
self.validate_function(function)?;
}

Ok(())
}
}
*/
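For readers skimming the commented-out analyzer above, its identifier handling boils down to a stack of scopes: declare into the innermost scope, resolve from the innermost scope outward. Below is a minimal standalone sketch of that idea, not the PR's code. `ScopeStack`, `declare`, and `resolve` are hypothetical names, and unlike `add_identifier` above, this sketch assumes that declaring with no open scope should be an error rather than a silent no-op.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the analyzer's identifier tracking (not the PR's type).
#[derive(Debug, Default)]
pub struct ScopeStack {
    // Innermost scope is the last element.
    scopes: Vec<HashSet<String>>,
}

impl ScopeStack {
    pub fn new() -> Self {
        Self::default()
    }

    /// Open a new innermost scope.
    pub fn enter_scope(&mut self) {
        self.scopes.push(HashSet::new());
    }

    /// Close the innermost scope, dropping its bindings.
    pub fn exit_scope(&mut self) {
        self.scopes.pop();
    }

    /// Declare `name` in the innermost scope.
    /// Fails if no scope is open (an assumption of this sketch) or if the
    /// name is already declared in that same scope. Shadowing a name from
    /// an outer scope is allowed.
    pub fn declare(&mut self, name: &str) -> Result<(), String> {
        let scope = self
            .scopes
            .last_mut()
            .ok_or_else(|| format!("No open scope for identifier: {name}"))?;
        // HashSet::insert returns false when the value was already present.
        if !scope.insert(name.to_string()) {
            return Err(format!("Duplicate identifier name: {name}"));
        }
        Ok(())
    }

    /// Resolve `name` by searching from the innermost scope outward.
    pub fn resolve(&self, name: &str) -> Result<(), String> {
        if self.scopes.iter().rev().any(|scope| scope.contains(name)) {
            Ok(())
        } else {
            Err(format!("Undefined identifier: {name}"))
        }
    }
}
```

A caller would wrap each function body, closure, or match arm in an `enter_scope` / `exit_scope` pair, so that bindings introduced inside it are not visible once it ends.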
2 changes: 2 additions & 0 deletions optd-core/src/dsl/mod.rs
@@ -0,0 +1,2 @@
pub mod analyzer;
pub mod parser;