|
| 1 | +# CEVFS: Compression & Encryption VFS for SQLite 3 |
| 2 | +CEVFS is a SQLite 3 Virtual File System for compressing and encrypting data at the pager level. Once set up, you use SQLite as you normally would and the compression and encryption is transparently handled during database read/write operations via the SQLite pager. |
| 3 | + |
| 4 | +## Introduction |
| 5 | +CEVFS is an open source SQLite extension that combines some of the functionality of [CEROD](http://www.sqlite.org/cerod/doc/trunk/www/index.wiki) and [ZIPVFS](http://www.sqlite.org/zipvfs/doc/trunk/www/index.wiki) into one package. Even though no testing has been done yet beyond macOS and iOS, the goal is to make it as portable as SQLite itself -- with the help of the community. |
| 6 | + |
| 7 | +CEVFS gives you convenient hooks into SQLite that allow you to easily implement your own compression and encryption functions. As different operating systems come preloaded with different compression and encryption libraries, no default compression or encryption functions are supplied with CEVFS. The **cevfs_example** project uses Zlib and CCCryptor (3cc), both of which are included in macOS and iOS. |
| 8 | + |
| 9 | +## How It Works |
| 10 | +CEVFS is a [SQLite Virtual File System](http://www.sqlite.org/vfs.html) which uses its own pager and is inserted between the pager used by the b-tree (herein referred to as the upper pager) and the OS interface. This allows it to intercept the read/write operations to the database and seamlessly compress/decompress and encrypt/decrypt the data. See "[How ZIPVFS Works](http://www.sqlite.org/zipvfs/doc/trunk/www/howitworks.wiki)" for more details. Unlike ZIPVFS, the page size of the pager used by CEVFS (herein referred to as the lower pager) is determined when the CEVFS database is created and is a persistent property of the database. WAL mode is not yet supported. |
| 11 | + |
| 12 | +## How To Build |
| 13 | +The `cevfs.workspace` Xcode workspace shows how you can use the SQLite Split Amalgamation files to add the SQLite and CEVFS source directly to your project. My creating a separate project for SQlite+CEVFS vs. your main app, you can adjust the build setting independently. |
| 14 | + |
| 15 | +For illustration, here's how you could do it command-line style: |
| 16 | + |
| 17 | +#### Get the SQLite Source Code |
| 18 | +Currently, it is recommended that you use SQLite version 3.10.2. It is a known issue that CEVFS currently does _not_ work properly with SQLite version 3.13.0. Other versions haven't been tested yet. This will be resolved in a future release. |
| 19 | + |
| 20 | +It's a little tricky obtaining previous releases of SQLite as you now have to get them directly from the source code repository which generates the Zip or Tarball on the fly. |
| 21 | + |
| 22 | +1. Go to the [Downloads page](http://www.sqlite.org/download.html). |
| 23 | +1. Select one of the geographically located sites ([Dallas TX](http://www.sqlite.org/cgi/src), [Newark NJ](http://www2.sqlite.org/cgi/src), [Fremont CA](http://www3.sqlite.org/cgi/src)) from the Source Code Repositories section at the bottom of the page. |
| 24 | +1. Select the **Tags** tab at the top of the page. |
| 25 | +1. Select the SQLite version you're interested in. |
| 26 | +1. Select the Fossil SHA1 (10-char hex string) commit link. |
| 27 | +1. Finally, download the tarball or ZIP archive. |
| 28 | + |
| 29 | +_For easy reference, it will be assumed that the SQLite source code is in the `sqlite/` directory._ |
| 30 | + |
| 31 | +#### Create the Amalgamation file. |
| 32 | +1. Decompress the archive and `cd` to the root of the unarchived directory. |
| 33 | +1. `./configure` |
| 34 | +1. `make sqlite3.c` |
| 35 | + |
| 36 | +#### Build a Static Library |
| 37 | +1. Create a temporary `build` directory and `cd` to it. |
| 38 | +1. Copy `sqlite3.c` and `cevfs.c` to the `build` directory. |
| 39 | +1. Combine the files: `cat sqlite3.c cevfs.c > cevfs-all.c` |
| 40 | +1. Compile: `clang -c cevfs-all.c -o sqlite3.o -oS` |
| 41 | +1. Create static lib: `libtool -static sqlite3.o -o sqlite3.a` |
| 42 | + |
| 43 | +### Creating a Command-Line Build Tool |
| 44 | +If you are using macOS, you can use the `cevfs_build` example which implements compression using Zlib and encryption using 3cc. Otherwise, modify the _xFunctions_ first to accommodate your operating system. |
| 45 | + |
| 46 | +Copy the following files to your temporary build directory: |
| 47 | +- sqlite/sqlite3.c |
| 48 | +- cevfs/cevfs.c |
| 49 | +- cevfs_build/cevfs_build.c |
| 50 | + |
| 51 | +Build: |
| 52 | +``` |
| 53 | +build> $ cat sqlite3.c cevfs.c > cevfs-all.c |
| 54 | +build> $ clang cevfs-all.c cevfs_build.c -O2 -o cevfs_build -lz |
| 55 | +``` |
| 56 | + |
| 57 | +Then to create a CEVFS database: |
| 58 | +``` |
| 59 | +./cevfs_build UNCOMPRESSED COMPRESSED KEY VFS_NAME |
| 60 | +``` |
| 61 | + |
| 62 | +parameters: |
| 63 | +- **UNCOMPRESSED**: path to uncompressed database |
| 64 | +- **COMPRESSED**: path to new compressed database |
| 65 | +- **KEY**: encryption key |
| 66 | +- **VFS_NAME**: name to embed in header (10 chars. max.) |
| 67 | + |
| 68 | +### Creating a Custom Version of SQLite |
| 69 | +It is helpful to have a custom command-line version of `sqlite3` on your development workstation for opening/testing your newly created databases. |
| 70 | + |
| 71 | +Again, modify your _xFunctions_ to accommodate your operating system. You may also need to install the Readline lib. |
| 72 | + |
| 73 | +Copy the following files to your temporary `build` directory. |
| 74 | +- sqlite3.c (from SQLite source) |
| 75 | +- shell.c (from SQLite source) |
| 76 | +- cevfs/cevfs.c |
| 77 | +- cevfs_build/cevfs_mod.c |
| 78 | + |
| 79 | +Build: |
| 80 | +``` |
| 81 | +build> $ cat sqlite3.c cevfs.c cevfs_mod.c > cevfs-all.c |
| 82 | +build> $ clang cevfs-all.c shell.c -DSQLITE_ENABLE_CEROD=1 -DHAVE_READLINE=1 -O2 -o sqlite3 -lz -lreadline |
| 83 | +``` |
| 84 | + |
| 85 | +Then, to open a CEVFS database: |
| 86 | + |
| 87 | +``` |
| 88 | +$ ./sqlite3 |
| 89 | +sqlite> PRAGMA activate_extensions("cerod-x'<your hex key goes here>'"); |
| 90 | +sqlite> .open path/to/your/cerod/db |
| 91 | +``` |
| 92 | + |
| 93 | +By specifying `SQLITE_ENABLE_CEROD` we can make use of an API hook that's built into SQLite for the CEROD extension that will allow you to conveniently activate CEVFS. It has the following signature: |
| 94 | + |
| 95 | +``` |
| 96 | +SQLITE_API void SQLITE_STDCALL sqlite3_activate_cerod(const char *zPassPhrase); |
| 97 | +``` |
| 98 | + |
| 99 | +If you are using CEVFS, chances are that you are _not_ currently making use of this API hook. You can use the `const char *` param to pass something other than the intended activation key, such as the encryption key. This `sqlite3_activate_cerod` function has been implemented in `cevfs_build/cevfs_mod.c` as an example. Alternatively, you can roll out your own [Run-Time Loadable Extension](http://www.sqlite.org/loadext.html). |
| 100 | + |
| 101 | +## Limitations |
| 102 | +1. Currently tested with SQLite version 3.10.2 only. |
| 103 | +- WAL mode is not (yet) supported. |
| 104 | +- Free nodes are not managed which could result in wasted space. Not an issue if the database you create with `cevfs_build()` is intended to be used as a read-only database. |
| 105 | +- VACUUM not yet implemented to recover lost space. |
| 106 | + |
| 107 | +## Knows Issues |
| 108 | +1. Does not work yet with SQLite version 3.13.0. |
| 109 | + |
| 110 | +## Contributing to the project |
| 111 | +If you would like to contribute back to the project, please fork the repo and submit pull requests. |
| 112 | + |
| 113 | +Here are some things that need to be implemented: |
| 114 | +- Proper mutex & multi-threading support |
| 115 | +- TCL unit tests |
| 116 | +- WAL support |
| 117 | +- Full text indexing support |
| 118 | +- Keep track of free space lost when data is moved to a new page and reuse free space during subsequent write operations |
| 119 | +- Implement database compacting (using 'vacuum' keyword if possible) |
0 commit comments