Summary
The current documentation for CHAR(n) / CHARACTER(n) is misleading about how trailing spaces are handled, and there is no mention of the significant impact this has on CDC / Changefeeds.
Current (misleading) docs statement
For CHAR(n)/CHARACTER(n) types, if the value is under the column's length limit, CockroachDB adds space padding from the end of the value to the length limit.
This implies that a value like ' ' (one space) stored in a CHAR(3) column would be read back as ' ' (three spaces). That is not what happens in SQL expressions.
Actual behavior (matches PostgreSQL)
Trailing spaces are treated as semantically insignificant in CHAR/BPCHAR types, per the SQL standard and PostgreSQL behavior:
- Trailing spaces are stripped when a value is stored into a
CHAR column (CRDB: datum.go#L6283-L6285, cast.go#L540-L543).
- Space padding is re-added when values are returned to the client over pgwire (
types.go#L105-L114), so SELECT v FROM fab shows ' ' in a psql client — but SQL expressions like length(v) operate on the stripped value.
Demonstration (CockroachDB v25.4.5 and v24.1.25, matches PostgreSQL 18.3):
CREATE TABLE fab (id INT NOT NULL PRIMARY KEY, v CHAR(3));
INSERT INTO fab VALUES (1, NULL), (2, ''), (3, ' '), (4, ' '),
(5, 'abc'), (6, 'a '), (7, ' b '), (8, ' c');
SELECT *, length(v) FROM fab;
-- id | v | length
-- ----+------+--------
-- 1 | NULL | NULL
-- 2 | | 0
-- 3 | | 0 -- trailing space stripped
-- 4 | | 0 -- all spaces stripped
-- 5 | abc | 3
-- 6 | a | 1 -- trailing spaces stripped
-- 7 | b | 2 -- trailing space stripped
-- 8 | c | 3
Compare with STRING(3) / VARCHAR(3), where trailing spaces are preserved:
-- 3 | | 1 -- space preserved
-- 4 | | 3 -- spaces preserved
-- 6 | a | 3
-- 7 | b | 3
This matches the PostgreSQL docs:
Trailing spaces are treated as semantically insignificant and disregarded when comparing two values of type character. Trailing spaces are removed when converting a character value to one of the other string types.
CDC / Changefeed impact (undocumented)
This behavior causes data loss in CDC exports because the changefeed reads values via SQL, not pgwire padding. A CHAR(3) column with value ' ' (one space) is exported as an empty string:
$ cat changefeed_output.csv | sort
1,NULL
2,
3, # ← was ' ', exported as empty
4, # ← was ' ', exported as empty
5,abc
6,a # ← was 'a ', trailing spaces lost
7," b"
8," c"
This is particularly impactful for customers with large datasets (e.g., hundreds of millions of rows) using CHAR columns expecting space-preservation in exports.
Recommended workaround for CDC
Use rpad() in a changefeed SELECT projection to explicitly restore padding:
CREATE CHANGEFEED INTO '<sink>'
WITH initial_scan = 'only', format = csv
AS SELECT id, rpad(v, 3) FROM fab;
Output:
1,NULL
2," "
3," "
4," "
5,abc
6,a
7," b "
8," c"
Suggested doc improvements
- Clarify the
CHAR(n) description: Explain that trailing spaces are stripped at storage time and re-added only when values are returned via the wire protocol — they are NOT present in SQL expression evaluation (e.g., length(), comparisons, casts).
- Add a note on
CHAR vs STRING/VARCHAR: If trailing space preservation is required, STRING(n) / VARCHAR(n) should be used instead.
- Add a CDC/Changefeed note: Document that
CHAR(n) columns will have trailing spaces stripped in changefeed CSV/Avro exports, and recommend using rpad(v, n) in a changefeed SELECT projection as a workaround.
References
Jira issue: DOC-17107
Summary
The current documentation for
CHAR(n)/CHARACTER(n)is misleading about how trailing spaces are handled, and there is no mention of the significant impact this has on CDC / Changefeeds.Current (misleading) docs statement
This implies that a value like
' '(one space) stored in aCHAR(3)column would be read back as' '(three spaces). That is not what happens in SQL expressions.Actual behavior (matches PostgreSQL)
Trailing spaces are treated as semantically insignificant in
CHAR/BPCHARtypes, per the SQL standard and PostgreSQL behavior:CHARcolumn (CRDB:datum.go#L6283-L6285,cast.go#L540-L543).types.go#L105-L114), soSELECT v FROM fabshows' 'in a psql client — but SQL expressions likelength(v)operate on the stripped value.Demonstration (CockroachDB v25.4.5 and v24.1.25, matches PostgreSQL 18.3):
Compare with
STRING(3)/VARCHAR(3), where trailing spaces are preserved:This matches the PostgreSQL docs:
CDC / Changefeed impact (undocumented)
This behavior causes data loss in CDC exports because the changefeed reads values via SQL, not pgwire padding. A
CHAR(3)column with value' '(one space) is exported as an empty string:This is particularly impactful for customers with large datasets (e.g., hundreds of millions of rows) using
CHARcolumns expecting space-preservation in exports.Recommended workaround for CDC
Use
rpad()in a changefeedSELECTprojection to explicitly restore padding:Output:
Suggested doc improvements
CHAR(n)description: Explain that trailing spaces are stripped at storage time and re-added only when values are returned via the wire protocol — they are NOT present in SQL expression evaluation (e.g.,length(), comparisons, casts).CHARvsSTRING/VARCHAR: If trailing space preservation is required,STRING(n)/VARCHAR(n)should be used instead.CHAR(n)columns will have trailing spaces stripped in changefeed CSV/Avro exports, and recommend usingrpad(v, n)in a changefeedSELECTprojection as a workaround.References
#help-database, 2026-05-15Jira issue: DOC-17107