Commit

Support for AddColumns and RemoveColumns.
mkromberg committed Jan 18, 2016
1 parent 4c331cc commit e7b477d
Showing 4 changed files with 208 additions and 94 deletions.
3 changes: 1 addition & 2 deletions README.md
@@ -1,4 +1,4 @@
# README #
# README #

`vecdb`
Current version: 0.2.0
@@ -48,7 +48,6 @@ Query results are returned as a vector with one element per database column, eac

There are ideas to add support for timeseries and versioning. This would include:

1. Add a single-byte indexed Char type (perhaps denoted lowercase "c"), indexing up to 127 unique strings
1. Support for deleting records
1. Performing all updates without overwriting data, and tagging old data with the timestamps defining its lifetime, allowing efficient queries on the database as it appeared at any given time in the past.
1. Built-in support for the computation of aggregate values as part of the parallel query mechanism, based on timeseries or other key values.
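
The non-overwriting update idea in the list above could look roughly like the sketch below. It is written in Python purely for illustration (vecdb itself is Dyalog APL), and every name in it (`VersionedColumn`, `update`, `read_as_of`) is hypothetical rather than part of vecdb. Each update closes the lifetime of the live version and appends a new one, so a read "as of" a past time only has to find the version whose lifetime contains that time.

```python
import math


class VersionedColumn:
    """Toy append-only column: every stored value carries a [start, end) lifetime."""

    def __init__(self):
        # Each entry is [record_id, value, valid_from, valid_to]
        self.versions = []

    def update(self, record_id, value, now):
        # Close the lifetime of the live version instead of overwriting it
        for version in self.versions:
            if version[0] == record_id and version[3] == math.inf:
                version[3] = now
        # Append the new version, live from `now` onwards
        self.versions.append([record_id, value, now, math.inf])

    def read_as_of(self, record_id, when):
        # Return the value whose lifetime contains `when`, or None if there is none
        for rid, value, start, end in self.versions:
            if rid == record_id and start <= when < end:
                return value
        return None


prices = VersionedColumn()
prices.update(1, 100, now=10)
prices.update(1, 120, now=20)
print(prices.read_as_of(1, 15))  # 100
print(prices.read_as_of(1, 25))  # 120
print(prices.read_as_of(1, 5))   # None
```

The linear scans keep the sketch short; an efficient version would need an index on the lifetimes, which is not attempted here.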
7 changes: 5 additions & 2 deletions TODO.md
@@ -1,11 +1,14 @@
# TODO #

1. Add columns after creation
1. Generalization of Symbol Tables + Add One, Four & Eight Byte Symbol Tables
1. Enhance queries to support conditional functions... Eg. ('price' '>' 100)('Name' 'like' 'A%')
1. Prototype parallel queries
1. Beef up error checking on file creation
1. Database status reporting function (# shards, records in each, statistics, etc)
1. Parallel queries using isolates
1. No Symbol Table Char Type
1. User Guide
1. "c" data type (single byte indices)
1. RESTful / ODATA? API
1. Timestamped non-overwriting updates
1. Delete records (AFTER non-overwriting updates)
1. Database cleanup (throw away history)
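
As a rough illustration of the "c" data type item (single byte indices), here is a minimal sketch in Python, used only for illustration since vecdb itself is Dyalog APL. The class and method names are hypothetical. It assumes 1-based indices held in a signed byte, which is one way to arrive at the limit of 127 unique strings mentioned for this type; the actual vecdb design may differ.

```python
from array import array

MAX_SYMBOLS = 127  # largest value a signed byte can hold


class ByteSymbolTable:
    """Toy column that stores each string as a one-byte index into a symbol table."""

    def __init__(self):
        self.symbols = []          # position i-1 holds the string for index i
        self.lookup = {}           # string -> 1-based index
        self.indices = array("b")  # one signed byte per stored record

    def append(self, value):
        index = self.lookup.get(value)
        if index is None:
            if len(self.symbols) >= MAX_SYMBOLS:
                raise ValueError("symbol table full: 127 unique strings already stored")
            self.symbols.append(value)
            index = len(self.symbols)  # assumption: indices start at 1
            self.lookup[value] = index
        self.indices.append(index)

    def read(self):
        # Map the stored byte indices back to their strings
        return [self.symbols[i - 1] for i in self.indices]


col = ByteSymbolTable()
for name in ["IBM", "MSFT", "IBM", "AAPL"]:
    col.append(name)
print(list(col.indices))  # [1, 2, 1, 3]
print(col.read())         # ['IBM', 'MSFT', 'IBM', 'AAPL']
```

Repeated strings cost one byte each after their first occurrence, at the price of a hard cap on the number of distinct values.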
43 changes: 30 additions & 13 deletions TestVecdb.dyalog
@@ -1,6 +1,6 @@
:Namespace TestVecdb

⍝ Updated to version 0.2.2 with sharding, summary queries and add/remove of columns
⍝ Updated to version 0.2.3 with sharding, summary queries and add/remove of columns
⍝ Call TestVecdb.RunAll to run a full system test
⍝   assumes vecdb is loaded in #.vecdb
⍝   returns memory usage statistics (result of "memstats 0")
@@ -16,12 +16,12 @@
path←{(-⌊/(⌽⍵)⍳'\/')↓⍵}source
⎕←'Testing vecdb version ',#.vecdb.Version
Basic

Sharding

∇ z←Sharding;columns;data;options;params;folder;types;name;db;ix;rotate
∇ z←Sharding;columns;data;options;params;folder;types;name;db;ix;rotate;newcols;colsnow;m
⍝ Test database with 2 shards
⍝ Also acts as test for add/remove columns

folder←path,'/',(name←'shardtest'),'/'

@@ -49,10 +49,26 @@
db.Append time columns(3¨data)

assert 5=db.Count
assert(1 2,¨4 1)ixdb.Query('Name'((columns'Name')data)) Should find everything
ixdb.Query('Name'((columns'Name')data)) Should find everything
assert(1 2,¨4 1)ix
TEST'Read it all back'
assert data≡db.Read time ix columns

newcols←columns,¨'2'
TEST'Add columns'
db.AddColumns time newcols types
db.Update ix newcols data ⍝ Populate new columns
assert(db.Read ix columns)≡(db.Read ix newcols)

TEST'Remove columns'
m(columns)db.ShardCols not the shard col
db.RemoveColumns time(m/columns),(~m)/newcols
colsnow←((~m)/columns),m/newcols
types←((~m)/types),m/types
data←((~m)/data),m/data
assert(db.(Columns Types))≡(colsnow types) ⍝ should now only have the new columns
assert data≡db.Read time ix colsnow ⍝ Check database is "undamaged"

TEST'Erase database'
assert 0={db.Erase}time

@@ -64,8 +80,11 @@
∇ z←Basic;columns;types;folder;name;db;tnms;data;numrecs;recs;select;where;expect;indices;options;params;range;rcols;rcoli;newvals;i;t;vals;ix
⍝ Create and delete some tables

numrecs←5000000 ⍝ 5 million records
memstats 1 ⍝ Clear memory statistics
numrecs←50000000 ⍝ 50 million records
memstats 1 ⍝ Clear memory statistics
:If (8×numrecs)>2000⌶16
'*** Warning: workspace size should be at least: ',((8×numrecs)÷1000000)',Mb ***'
:EndIf

folder←path,'/',(name←'testdb1'),'/'
⎕←'Clearing: ',folder
@@ -118,25 +137,23 @@

TEST'Single key, single data group by'
expect(1data){,+/}2data
assert expectdb.Query 'sum col_I2' 'col_I1' select sum(col_I2) group by col_I1'
assert expectdb.Query time 'sum col_I2' 'col_I1' select sum(col_I2) group by col_I1'

TEST'Single CHAR key, single data group by'
expect(6data){,+/}2data
assert expectdb.Query 'sum col_I2' 'col_C' select sum(col_I2) group by col_C'
assert expectdb.Query time 'sum col_I2' 'col_C' select sum(col_I2) group by col_C'

TEST'Single key, multiple data group by'
expect(1data){,(+/[;1]),/[;2]}[0.5]data[2 3]
assert expectdb.Query ('sum col_I2' 'max col_I4')'col_I1' select sum(col_I2),max(col_I4) group by col_I1'
assert expectdb.Query time ('sum col_I2' 'max col_I4')'col_I1' select sum(col_I2),max(col_I4) group by col_I1'

TEST'Two key, single data group by'
expect([0.5]data[1 5]){,+/}2data
assert expectdb.Query 'sum col_I2'('col_I1' 'col_B') select sum(col_I2) group by col_I1'
assert expectdb.Query time 'sum col_I2'('col_I1' 'col_B') select sum(col_I2) group by col_I1'

TEST'Two key, multiple data group by'
expect([0.5]data[1 5]){,(+/[;1]),/[;2]}[0.5]data[2 3]
assert expectdb.Query ('sum col_I2' 'max col_I4')('col_I1' 'col_B') select sum(col_I2),max(col_I4) group by col_I1,col_B'


assert expectdb.Query time ('sum col_I2' 'max col_I4')('col_I1' 'col_B') select sum(col_I2),max(col_I4) group by col_I1,col_B'

Test vecdb.Replace
indicesdb.Query where
