| 1 | -## Work in progress
| 2 | -You may ask questions in the chat window below or
| 3 | -refer to [legacy documentation](https://docs.datajoint.org/)
| 1 | +# Declaration Syntax |
| 2 | + |
| 3 | +## Creating Tables |
| 4 | + |
| 5 | +### Classes represent tables |
| 6 | + |
| 7 | +To make it easy to work with tables in MATLAB and Python, DataJoint programs create a |
| 8 | +separate class for each table. |
| 9 | +Computer programmers refer to this concept as |
| 10 | +[object-relational mapping](https://en.wikipedia.org/wiki/Object-relational_mapping). |
| 11 | +For example, the class `experiment.Subject` in the DataJoint client language may |
| 12 | +correspond to the table called `subject` on the database server. |
| 13 | +Users never need to see the database directly; they only interact with data in the |
| 14 | +database by creating and interacting with DataJoint classes. |
| 15 | + |
| 16 | +#### Data tiers |
| 17 | + |
| 18 | +The table class must inherit from one of the following superclasses to indicate its |
| 19 | +data tier: `dj.Lookup`, `dj.Manual`, `dj.Imported`, `dj.Computed`, or `dj.Part`. |
| 20 | +See :ref:`tiers` and :ref:`master-part`. |
| 21 | + |
| 22 | +### Defining a table |
| 23 | + |
| 24 | +To define a DataJoint table in Python: |
| 25 | + |
| 26 | +1. Define a class inheriting from the appropriate DataJoint class: `dj.Lookup`, |
| 27 | +`dj.Manual`, `dj.Imported` or `dj.Computed`. |
| 28 | + |
| 29 | +2. Decorate the class with the schema object (see :ref:`schema`).
| 30 | + |
| 31 | +3. Define the class property `definition` to define the table heading. |
| 32 | + |
| 33 | +For example, the following code defines the table ``Person``: |
| 34 | + |
| 35 | +```python |
| 36 | +import datajoint as dj |
| 37 | +schema = dj.Schema('alice_experiment') |
| 38 | + |
| 39 | +@schema |
| 40 | +class Person(dj.Manual): |
| 41 | + definition = ''' |
| 42 | + username : varchar(20) # unique user name |
| 43 | + --- |
| 44 | + first_name : varchar(30) |
| 45 | + last_name : varchar(30) |
| 46 | + ''' |
| 47 | +``` |
| 48 | + |
| 49 | +The `@schema` decorator uses the class name and the data tier to check whether an |
| 50 | +appropriate table exists on the database. |
| 51 | +If a table does not already exist, the decorator creates one on the database using the |
| 52 | +definition property. |
| 53 | +The decorator attaches the information about the table to the class, and then returns |
| 54 | +the class. |
| 55 | + |
| 56 | +The class will become usable after you define the `definition` property as described in :ref:`definitions`. |
| 57 | + |
| 58 | +#### DataJoint classes in Python |
| 59 | + |
| 60 | +DataJoint for Python is implemented through the use of classes providing access to the |
| 61 | +actual tables stored on the database. |
| 62 | +Since only a single table exists on the database for any class, interactions with all |
| 63 | +instances of the class are equivalent. |
| 64 | +As such, most methods can be called on the classes themselves rather than on an object, |
| 65 | +for convenience. |
| 66 | +Whether calling a DataJoint method on a class or on an instance, the result will only |
| 67 | +depend on or apply to the corresponding table. |
| 68 | +All of the basic functionality of DataJoint is built to operate on the classes |
| 69 | +themselves, even when called on an instance. |
| 70 | +For example, calling `Person.insert(...)` (on the class) and `Person().insert(...)` (on
| 71 | +an instance) both have the identical effect of inserting data into the table on the
| 72 | +database server.
| 73 | +DataJoint does not prevent a user from working with instances, but the workflow is |
| 74 | +complete without the need for instantiation. |
| 75 | +It is up to the user whether to implement additional functionality as class methods or |
| 76 | +methods called on instances. |
| 77 | + |
| 78 | +### Valid class names |
| 79 | + |
| 80 | +Note that in both MATLAB and Python, the class names must follow the CamelCase compound |
| 81 | +word notation: |
| 82 | + |
| 83 | +- start with a capital letter and |
| 84 | +- contain only alphanumerical characters (no underscores). |
| 85 | + |
| 86 | +Examples of valid class names: |
| 87 | + |
| 88 | +`TwoPhotonScan`, `Scan2P`, `Ephys`, `MembraneVoltage` |
| 89 | + |
| 90 | +Invalid class names: |
| 91 | + |
| 92 | +`Two_photon_Scan`, `twoPhotonScan`, `2PhotonScan`, `membranePotential`, `membrane_potential` |
| 93 | + |
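The naming rules above can be checked mechanically. The following is a minimal sketch in plain Python, assuming the two rules as stated (leading capital letter, alphanumeric characters only); `is_valid_class_name` and its regular expression are hypothetical helpers for illustration and are not part of the DataJoint API, whose own validation may differ.

```python
import re

# Hypothetical pattern (an assumption, not DataJoint's own check):
# one capital ASCII letter followed by ASCII letters and digits only.
CLASS_NAME = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_valid_class_name(name: str) -> bool:
    """Return True if `name` follows the CamelCase rules listed above."""
    return CLASS_NAME.fullmatch(name) is not None


valid_names = ["TwoPhotonScan", "Scan2P", "Ephys", "MembraneVoltage"]
invalid_names = [
    "Two_photon_Scan",
    "twoPhotonScan",
    "2PhotonScan",
    "membranePotential",
    "membrane_potential",
]

valid_results = [is_valid_class_name(n) for n in valid_names]
invalid_results = [is_valid_class_name(n) for n in invalid_names]
```

A check like this can run in a test suite before any class is declared, so a misnamed class is caught without touching the database.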
| 94 | +## Table Definition |
| 95 | + |
| 96 | +DataJoint models data as sets of **entities** with shared **attributes**, often |
| 97 | +visualized as tables with rows and columns. |
| 98 | +Each row represents a single entity and the values of all of its attributes. |
| 99 | +Each column represents a single attribute with a name and a datatype, applicable to
| 100 | +every entity in the table.
| 101 | +Unlike rows in a spreadsheet, entities in DataJoint don't have names or numbers: they |
| 102 | +can only be identified by the values of their attributes. |
| 103 | +Defining a table means defining the names and datatypes of the attributes as well as |
| 104 | +the constraints to be applied to those attributes. |
| 105 | +Both MATLAB and Python use the same syntax to define tables.
| 106 | + |
| 107 | +For example, the following code defines the table `User`, which contains users of the
| 108 | +database.
| 109 | +
| 110 | +The table definition is contained in the `definition` property of the class:
| 111 | + |
| 112 | +```python |
| 113 | +@schema |
| 114 | +class User(dj.Manual): |
| 115 | + definition = """ |
| 116 | + # database users |
| 117 | + username : varchar(20) # unique user name |
| 118 | + --- |
| 119 | + first_name : varchar(30) |
| 120 | + last_name : varchar(30) |
| 121 | + role : enum('admin', 'contributor', 'viewer') |
| 122 | + """ |
| 123 | +``` |
| 124 | + |
| 125 | +This defines the class `User` that creates the table in the database and provides all |
| 126 | +its data manipulation functionality. |
| 127 | + |
| 128 | +### Table creation on the database server |
| 129 | + |
| 130 | +Users do not need to do anything special to have a table created in the database. |
| 131 | +Tables are created at the time of class definition. |
| 132 | +In fact, table creation on the database is one of the jobs performed by the decorator |
| 133 | +`@schema` of the class. |
| 134 | + |
| 135 | +### Changing the definition of an existing table |
| 136 | + |
| 137 | +Once the table is created in the database, the definition string has no further effect. |
| 138 | +In other words, changing the definition string in the class of an existing table will |
| 139 | +not actually update the table definition. |
| 140 | +To change the table definition, one must first [drop](../drop.md) the existing table. |
| 141 | +This means that all the data will be lost, and the new definition will be applied to |
| 142 | +create the new empty table. |
| 143 | + |
| 144 | +Therefore, in the initial phases of designing a DataJoint pipeline, it is common to |
| 145 | +experiment with variations of the design before populating it with substantial amounts |
| 146 | +of data. |
| 147 | + |
| 148 | +It is possible to modify a table without dropping it. |
| 149 | +This topic is covered separately. |
| 150 | + |
| 151 | +### Reverse-engineering the table definition |
| 152 | + |
| 153 | +DataJoint objects provide the `describe` method, which displays the table definition |
| 154 | +used to define the table when it was created in the database. |
| 155 | +This definition may differ from the definition string of the class if the definition |
| 156 | +string has been edited after creation of the table. |
| 157 | + |
| 158 | +Example:
| 159 | + |
| 160 | +```python |
| 161 | +s = lab.User.describe() |
| 162 | +``` |
| 163 | + |
| 164 | +## Definition Syntax |
| 165 | + |
| 166 | +The table definition consists of one or more lines. |
| 167 | +Each line can be one of the following: |
| 168 | + |
| 169 | +- The optional first line starting with a `#` provides a description of the table's purpose. |
| 170 | + It may also be thought of as the table's long title. |
| 171 | +- A new attribute definition in any of the following forms (see :ref:`datatypes` for valid datatypes):
| 172 | +  - `name : datatype`
| 173 | +  - `name : datatype # comment`
| 174 | +  - `name = default : datatype`
| 175 | +  - `name = default : datatype # comment`
| 176 | +- The divider `---` (at least three hyphens) separating primary key attributes above |
| 177 | +from secondary attributes below. |
| 178 | +- A foreign key in the format `-> ReferencedTable`. |
| 179 | + (See [Dependencies](dependencies.md).) |
| 180 | + |
| 181 | +For example, the table for Persons may have the following definition: |
| 182 | + |
| 183 | +```python |
| 184 | +# Persons in the lab |
| 185 | +username : varchar(16) # username in the database |
| 186 | +--- |
| 187 | +full_name : varchar(255) |
| 188 | +start_date : date # date when joined the lab |
| 189 | +``` |
| 190 | + |
| 191 | +This will define the table with attributes `username`, `full_name`, and `start_date`, |
| 192 | +in which `username` is the [primary key](primary.md). |
| 193 | + |
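To make the line types above concrete, here is a rough sketch of a parser that splits a definition string into its description, primary key attributes, and secondary attributes. It is written in plain Python under stated assumptions: `parse_definition` and the `ATTRIBUTE` pattern are hypothetical names introduced for illustration, they are not DataJoint's own parser, and the pattern does not handle default values that contain a colon.

```python
import re

# Hypothetical pattern for one attribute line (an assumption for illustration):
#   name [= default] : datatype [# comment]
ATTRIBUTE = re.compile(
    r"^(?P<name>[a-z][a-z0-9_]*)\s*"
    r"(=\s*(?P<default>.+?)\s*)?"
    r":\s*(?P<type>[^#]+?)\s*"
    r"(#\s*(?P<comment>.*))?$"
)


def parse_definition(definition: str) -> dict:
    """Split a definition into description, primary, secondary, and foreign keys."""
    result = {"description": None, "primary": [], "secondary": [], "foreign_keys": []}
    section = "primary"  # attributes above the divider belong to the primary key
    first_line = True
    for raw in definition.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only a leading '#' line is treated as the table description.
            if first_line:
                result["description"] = line.lstrip("#").strip()
        elif re.fullmatch(r"-{3,}", line):
            section = "secondary"  # divider: at least three hyphens
        elif line.startswith("->"):
            result["foreign_keys"].append((section, line[2:].strip()))
        else:
            match = ATTRIBUTE.match(line)
            if match is None:
                raise ValueError(f"unrecognized definition line: {line!r}")
            result[section].append(
                {key: match.group(key) for key in ("name", "default", "type", "comment")}
            )
        first_line = False
    return result


PERSON_DEFINITION = """
# Persons in the lab
username : varchar(16)      # username in the database
---
full_name : varchar(255)
start_date : date           # date when joined the lab
"""

parsed = parse_definition(PERSON_DEFINITION)
primary_names = [attr["name"] for attr in parsed["primary"]]
secondary_names = [attr["name"] for attr in parsed["secondary"]]
```

In this sketch, everything above the `---` divider lands in `parsed["primary"]` and everything below it in `parsed["secondary"]`, mirroring how the divider separates the primary key from the remaining attributes.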
| 194 | +### Attribute names |
| 195 | + |
| 196 | +Attribute names must be in lowercase and must start with a letter. |
| 197 | +They can only contain alphanumerical characters and underscores. |
| 198 | +The attribute name cannot exceed 64 characters. |
| 199 | + |
| 200 | +Valid attribute names:
| 201 | +`first_name`, `two_photon_scan`, `scan_2p`
| 202 | +
| 203 | +Invalid attribute names:
| 204 | +`firstName`, `first name`, `2photon_scan`, `two-photon_scan`, `TwoPhotonScan`
| 205 | + |
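The attribute naming rules can be expressed as a small check. This is a sketch only, assuming the rules exactly as stated above (lowercase, leading letter, letters, digits, and underscores, at most 64 characters); `is_valid_attribute_name` is a hypothetical helper and not a DataJoint function.

```python
import re

MAX_NAME_LENGTH = 64  # length limit stated in the text above

# Hypothetical pattern (an assumption): a lowercase letter followed by
# lowercase letters, digits, or underscores.
ATTRIBUTE_NAME = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_attribute_name(name: str) -> bool:
    """Return True if `name` satisfies the attribute naming rules above."""
    return len(name) <= MAX_NAME_LENGTH and ATTRIBUTE_NAME.fullmatch(name) is not None


valid_attributes = ["first_name", "two_photon_scan", "scan_2p"]
invalid_attributes = [
    "firstName",
    "first name",
    "2photon_scan",
    "two-photon_scan",
    "TwoPhotonScan",
]

valid_attribute_results = [is_valid_attribute_name(n) for n in valid_attributes]
invalid_attribute_results = [is_valid_attribute_name(n) for n in invalid_attributes]
```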
| 206 | +Ideally, attribute names should be unique across all tables that are likely to be used |
| 207 | +in queries together. |
| 208 | +For example, tables often have attributes representing the start times of sessions, |
| 209 | +recordings, etc. |
| 210 | +Such attributes should be uniquely named in each table, such as `session_start_time` or
| 211 | +`recording_start_time`. |
| 212 | + |
| 213 | +### Default values |
| 214 | + |
| 215 | +Secondary attributes can be given default values. |
| 216 | +A default value will be used for an attribute if no other value is given at the time |
| 217 | +the entity is [inserted](../../manipulation/insert.md) into the table. |
| 218 | +Generally, default values are numerical values or character strings. |
| 219 | +Default values for dates must be given as strings as well, contained within quotes |
| 220 | +(with the exception of `CURRENT_TIMESTAMP`). |
| 221 | +Note that default values take effect only when an entity is inserted as a mapping (for example, a Python `dict`) that omits the attribute.
| 222 | +Primary key attributes cannot have default values (with the exceptions of |
| 223 | +`auto_increment` and `CURRENT_TIMESTAMP` attributes; see [primary-key](primary.md)). |
| 224 | + |
| 225 | +An attribute with a default value of `NULL` is called a **nullable attribute**. |
| 226 | +A nullable attribute can be thought of as applying to all entities in a table but |
| 227 | +having an optional *value* that may be absent in some entities. |
| 228 | +Nullable attributes should *not* be used to indicate that an attribute is inapplicable |
| 229 | +to some entities in a table (see [normalization](../normalization.md)). |
| 230 | +Use nullable attributes sparingly, and only for values that are optional yet still
| 231 | +meaningful for every entity in the table.
| 232 | +`NULL` is a special literal value and does not need to be enclosed in quotes. |
| 233 | + |
| 234 | +Here are some examples of attributes with default values: |
| 235 | + |
| 236 | +```python |
| 237 | +failures = 0 : int |
| 238 | +due_date = "2020-05-31" : date |
| 239 | +additional_comments = NULL : varchar(256) |
| 240 | +``` |