Commit 466a84d

Add declare page
1 parent ccb601f commit 466a84d

1 file changed

+240
-3
lines changed

docs/src/design/tables/declare.md

Lines changed: 240 additions & 3 deletions
@@ -1,3 +1,240 @@
1-
## Work in progress
2-
You may ask questions in the chat window below or
3-
refer to [legacy documentation](https://docs.datajoint.org/)
1+
# Declaration Syntax
2+
3+
## Creating Tables
4+
5+
### Classes represent tables
6+
7+
To make it easy to work with tables in MATLAB and Python, DataJoint programs create a
8+
separate class for each table.
9+
Computer programmers refer to this concept as
10+
[object-relational mapping](https://en.wikipedia.org/wiki/Object-relational_mapping).
11+
For example, the class `experiment.Subject` in the DataJoint client language may
12+
correspond to the table called `subject` on the database server.
13+
Users never need to see the database directly; they only interact with data in the
14+
database by creating and interacting with DataJoint classes.
15+
16+
#### Data tiers
17+
18+
The table class must inherit from one of the following superclasses to indicate its
19+
data tier: `dj.Lookup`, `dj.Manual`, `dj.Imported`, `dj.Computed`, or `dj.Part`.
20+
See the documentation on data tiers and on master-part relationships.
21+
22+
### Defining a table
23+
24+
To define a DataJoint table in Python:
25+
26+
1. Define a class inheriting from the appropriate DataJoint class: `dj.Lookup`,
27+
`dj.Manual`, `dj.Imported` or `dj.Computed`.
28+
29+
2. Decorate the class with the schema object (see the documentation on schemas).
30+
31+
3. Define the class property `definition` to define the table heading.
32+
33+
For example, the following code defines the table `Person`:
34+
35+
```python
import datajoint as dj

schema = dj.Schema('alice_experiment')

@schema
class Person(dj.Manual):
    definition = '''
    username : varchar(20)   # unique user name
    ---
    first_name : varchar(30)
    last_name : varchar(30)
    '''
```
48+
49+
The `@schema` decorator uses the class name and the data tier to check whether an
50+
appropriate table exists on the database.
51+
If a table does not already exist, the decorator creates one on the database using the
52+
definition property.
53+
The decorator attaches the information about the table to the class, and then returns
54+
the class.
55+
56+
The class becomes usable once its `definition` property is defined, as described under Table Definition below.
57+
58+
#### DataJoint classes in Python
59+
60+
DataJoint for Python is implemented through the use of classes providing access to the
61+
actual tables stored on the database.
62+
Since only a single table exists on the database for any class, interactions with all
63+
instances of the class are equivalent.
64+
As such, most methods can be called on the classes themselves rather than on an object,
65+
for convenience.
66+
Whether calling a DataJoint method on a class or on an instance, the result will only
67+
depend on or apply to the corresponding table.
68+
All of the basic functionality of DataJoint is built to operate on the classes
69+
themselves, even when called on an instance.
70+
For example, calling `Person.insert(...)` (on the class) and `Person().insert(...)` (on
71+
an instance) both have the identical effect of inserting data into the table on the
72+
database server.
73+
DataJoint does not prevent a user from working with instances, but the workflow is
74+
complete without the need for instantiation.
75+
It is up to the user whether to implement additional functionality as class methods or
76+
methods called on instances.
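
The equivalence of class-level and instance-level calls can be illustrated with a toy metaclass. This is only a sketch of the idea, not DataJoint's implementation: `ToyTableMeta`, `FORWARDED`, and the in-memory `_rows` list are hypothetical names introduced for illustration, and no database is involved.

```python
# Hypothetical set of "table methods" that may be called on the class itself.
FORWARDED = {"insert", "fetch"}


class ToyTableMeta(type):
    """Toy metaclass: route selected attribute lookups on the class to a fresh instance."""

    def __getattribute__(cls, name):
        if name in FORWARDED:
            # Person.insert behaves like Person().insert
            return getattr(cls(), name)
        return super().__getattribute__(name)


class Person(metaclass=ToyTableMeta):
    # Stands in for the single table on the database server:
    # shared by the class and by every instance.
    _rows = []

    def insert(self, rows):
        type(self)._rows.extend(rows)

    def fetch(self):
        return list(type(self)._rows)


Person.insert([{"username": "alice"}])   # called on the class
Person().insert([{"username": "bob"}])   # called on an instance
print(Person.fetch())
```

Both calls append to the same shared storage, which mirrors the statement that the result depends only on the corresponding table and not on any particular instance.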
77+
78+
### Valid class names
79+
80+
Note that in both MATLAB and Python, the class names must follow the CamelCase compound
81+
word notation:
82+
83+
- start with a capital letter and
84+
- contain only alphanumerical characters (no underscores).
85+
86+
Examples of valid class names:
87+
88+
`TwoPhotonScan`, `Scan2P`, `Ephys`, `MembraneVoltage`
89+
90+
Invalid class names:
91+
92+
`Two_photon_Scan`, `twoPhotonScan`, `2PhotonScan`, `membranePotential`, `membrane_potential`
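
The two rules above can be expressed as a regular expression. The following is a minimal sketch, assuming ASCII letters and digits only; `is_valid_class_name` is a hypothetical helper and not part of DataJoint, whose own check may differ.

```python
import re

# Assumed pattern: one capital letter, then letters or digits only (no underscores).
_CLASS_NAME = re.compile(r"[A-Z][A-Za-z0-9]*")


def is_valid_class_name(name: str) -> bool:
    """Return True if `name` starts with a capital letter and is purely alphanumeric."""
    return _CLASS_NAME.fullmatch(name) is not None


for name in ["TwoPhotonScan", "Scan2P", "Two_photon_Scan", "twoPhotonScan", "2PhotonScan"]:
    print(f"{name}: {is_valid_class_name(name)}")
```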
93+
94+
## Table Definition
95+
96+
DataJoint models data as sets of **entities** with shared **attributes**, often
97+
visualized as tables with rows and columns.
98+
Each row represents a single entity and the values of all of its attributes.
99+
Each column represents a single attribute with a name and a datatype, applicable to
100+
every entity in the table.
101+
Unlike rows in a spreadsheet, entities in DataJoint don't have names or numbers: they
102+
can only be identified by the values of their attributes.
103+
Defining a table means defining the names and datatypes of the attributes as well as
104+
the constraints to be applied to those attributes.
105+
Both MATLAB and Python use the same syntax to define tables.
106+
107+
The table definition is contained in the `definition` property of the class.
108+
For example, the following Python code defines the table `User`, which contains users of
109+
the database:
110+
111+
112+
```python
@schema  # `schema` is the dj.Schema object created as in the earlier example
class User(dj.Manual):
    definition = """
    # database users
    username : varchar(20)   # unique user name
    ---
    first_name : varchar(30)
    last_name : varchar(30)
    role : enum('admin', 'contributor', 'viewer')
    """
```
124+
125+
This defines the class `User` that creates the table in the database and provides all
126+
its data manipulation functionality.
127+
128+
### Table creation on the database server
129+
130+
Users do not need to do anything special to have a table created in the database.
131+
Tables are created at the time of class definition.
132+
In fact, table creation on the database is one of the jobs performed by the decorator
133+
`@schema` of the class.
134+
135+
### Changing the definition of an existing table
136+
137+
Once the table is created in the database, the definition string has no further effect.
138+
In other words, changing the definition string in the class of an existing table will
139+
not actually update the table definition.
140+
To change the table definition, one must first [drop](../drop.md) the existing table.
141+
This means that all the data will be lost, and the new definition will be applied to
142+
create the new empty table.
143+
144+
Therefore, in the initial phases of designing a DataJoint pipeline, it is common to
145+
experiment with variations of the design before populating it with substantial amounts
146+
of data.
147+
148+
It is possible to modify a table without dropping it.
149+
This topic is covered separately.
150+
151+
### Reverse-engineering the table definition
152+
153+
DataJoint objects provide the `describe` method, which displays the table definition
154+
used to define the table when it was created in the database.
155+
This definition may differ from the definition string of the class if the definition
156+
string has been edited after creation of the table.
157+
158+
Examples
159+
160+
```python
# definition as recorded for the table on the database server
s = lab.User.describe()
```
163+
164+
## Definition Syntax
165+
166+
The table definition consists of one or more lines.
167+
Each line can be one of the following:
168+
169+
- The optional first line starting with a `#` provides a description of the table's purpose.
170+
It may also be thought of as the table's long title.
171+
- A new attribute definition in any of the following forms (see the datatypes documentation for valid datatypes):
172+
``name : datatype``
173+
``name : datatype # comment``
174+
``name = default : datatype``
175+
``name = default : datatype # comment``
176+
- The divider `---` (at least three hyphens) separating primary key attributes above
177+
from secondary attributes below.
178+
- A foreign key in the format `-> ReferencedTable`.
179+
(See [Dependencies](dependencies.md).)
180+
181+
For example, the table for Persons may have the following definition:
182+
183+
```python
# Persons in the lab
username : varchar(16)   # username in the database
---
full_name : varchar(255)
start_date : date   # date when joined the lab
```
190+
191+
This will define the table with attributes `username`, `full_name`, and `start_date`,
192+
in which `username` is the [primary key](primary.md).
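
To make the line types concrete, here is a rough sketch that sorts the lines of a definition string into the categories listed above. It is not DataJoint's parser: `classify_definition` is a hypothetical helper, it ignores datatypes and defaults, and it assumes that any `#` line other than the first can simply be skipped.

```python
import re


def classify_definition(definition: str) -> dict:
    """Sort definition lines into description, primary/secondary attributes, and foreign keys."""
    result = {"description": None, "primary": [], "secondary": [], "foreign_keys": []}
    in_primary = True  # attributes above the divider belong to the primary key
    lines = [line.strip() for line in definition.strip().splitlines() if line.strip()]
    for index, line in enumerate(lines):
        if line.startswith("#"):
            if index == 0:  # optional first line: table description
                result["description"] = line.lstrip("#").strip()
            continue
        if re.fullmatch(r"-{3,}", line):  # divider: at least three hyphens
            in_primary = False
        elif line.startswith("->"):  # foreign key
            result["foreign_keys"].append(line[2:].strip())
        else:  # attribute: name [= default] : datatype [# comment]
            name = re.split(r"[=:]", line, maxsplit=1)[0].strip()
            result["primary" if in_primary else "secondary"].append(name)
    return result


parsed = classify_definition("""
# Persons in the lab
username : varchar(16)   # username in the database
---
full_name : varchar(255)
start_date : date   # date when joined the lab
""")
print(parsed["description"])  # Persons in the lab
print(parsed["primary"])      # ['username']
print(parsed["secondary"])    # ['full_name', 'start_date']
```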
193+
194+
### Attribute names
195+
196+
Attribute names must be in lowercase and must start with a letter.
197+
They can only contain alphanumerical characters and underscores.
198+
The attribute name cannot exceed 64 characters.
199+
200+
Valid attribute names:
201+
`first_name`, `two_photon_scan`, `scan_2p`
202+
203+
Invalid attribute names:
204+
`firstName`, `first name`, `2photon_scan`, `two-photon_scan`, `TwoPhotonScan`
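
The naming rules can be checked mechanically. The sketch below assumes ASCII characters only; `is_valid_attribute_name` and `MAX_ATTRIBUTE_NAME_LENGTH` are hypothetical names for illustration, not DataJoint API.

```python
import re

MAX_ATTRIBUTE_NAME_LENGTH = 64  # limit stated above

# Assumed pattern: a lowercase letter, then lowercase letters, digits, or underscores.
_ATTRIBUTE_NAME = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_attribute_name(name: str) -> bool:
    """Return True if `name` follows the attribute naming rules described above."""
    return (
        len(name) <= MAX_ATTRIBUTE_NAME_LENGTH
        and _ATTRIBUTE_NAME.fullmatch(name) is not None
    )


for name in ["first_name", "scan_2p", "firstName", "first name", "2photon_scan"]:
    print(f"{name!r}: {is_valid_attribute_name(name)}")
```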
205+
206+
Ideally, attribute names should be unique across all tables that are likely to be used
207+
in queries together.
208+
For example, tables often have attributes representing the start times of sessions,
209+
recordings, etc.
210+
Such attributes should be uniquely named in each table, such as `session_start_time` or
211+
`recording_start_time`.
212+
213+
### Default values
214+
215+
Secondary attributes can be given default values.
216+
A default value will be used for an attribute if no other value is given at the time
217+
the entity is [inserted](../../manipulation/insert.md) into the table.
218+
Generally, default values are numerical values or character strings.
219+
Default values for dates must be given as strings as well, contained within quotes
220+
(with the exception of `CURRENT_TIMESTAMP`).
221+
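Note that default values can only be used when inserting as a mapping (for example, a Python `dict` keyed by attribute name), since only then can an attribute be omitted.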
222+
Primary key attributes cannot have default values (with the exceptions of
223+
`auto_increment` and `CURRENT_TIMESTAMP` attributes; see [primary-key](primary.md)).
224+
225+
An attribute with a default value of `NULL` is called a **nullable attribute**.
226+
A nullable attribute can be thought of as applying to all entities in a table but
227+
having an optional *value* that may be absent in some entities.
228+
Nullable attributes should *not* be used to indicate that an attribute is inapplicable
229+
to some entities in a table (see [normalization](../normalization.md)).
230+
Use nullable attributes sparingly, and only when the attribute applies to every entity
231+
in the table but its value may be absent for some.
232+
`NULL` is a special literal value and does not need to be enclosed in quotes.
233+
234+
Here are some examples of attributes with default values:
235+
236+
```python
failures = 0 : int
due_date = "2020-05-31" : date
additional_comments = NULL : varchar(256)
```
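
The effect of these defaults on a mapping-style insert can be sketched in plain Python. This illustrates the behavior described above, not DataJoint's own code: `parse_default` is a hypothetical helper that handles only the three kinds of literal shown (integers or decimals, quoted strings, and `NULL`) and assumes the default itself contains no colon.

```python
import re

# name = default : datatype
_ATTRIBUTE_WITH_DEFAULT = re.compile(
    r"^\s*(?P<name>\w+)\s*=\s*(?P<default>.+?)\s*:\s*(?P<type>.+?)\s*$"
)


def parse_default(line: str):
    """Return (attribute name, default value) for a `name = default : datatype` line."""
    match = _ATTRIBUTE_WITH_DEFAULT.match(line)
    if match is None:
        raise ValueError(f"no default value found in: {line!r}")
    literal = match["default"]
    if literal.upper() == "NULL":  # NULL is a special unquoted literal
        value = None
    elif literal[0] in "\"'" and literal[-1] == literal[0]:  # quoted string
        value = literal[1:-1]
    else:  # numeric literal
        value = float(literal) if "." in literal else int(literal)
    return match["name"], value


defaults = dict(
    parse_default(line)
    for line in [
        "failures = 0 : int",
        'due_date = "2020-05-31" : date',
        "additional_comments = NULL : varchar(256)",
    ]
)

# A mapping may omit attributes; the omitted ones take their default values.
row = {"failures": 2}
complete_row = {**defaults, **row}
print(complete_row)
# {'failures': 2, 'due_date': '2020-05-31', 'additional_comments': None}
```

A positional insert (a tuple or list of values) has no way to say which attribute was left out, which is why defaults only come into play for mappings.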
