Skip to content

Docs: CHAR(n) trailing-space behavior is misleading and CDC impact is undocumented #23365

@fabiog1901

Description

@fabiog1901

Summary

The current documentation for CHAR(n) / CHARACTER(n) is misleading about how trailing spaces are handled, and there is no mention of the significant impact this has on CDC / Changefeeds.


Current (misleading) docs statement

For CHAR(n)/CHARACTER(n) types, if the value is under the column's length limit, CockroachDB adds space padding from the end of the value to the length limit.

This implies that a value like ' ' (one space) stored in a CHAR(3) column would be read back as ' ' (three spaces). That is not what happens in SQL expressions.


Actual behavior (matches PostgreSQL)

Trailing spaces are treated as semantically insignificant in CHAR/BPCHAR types, per the SQL standard and PostgreSQL behavior:

  • Trailing spaces are stripped when a value is stored into a CHAR column (CRDB: datum.go#L6283-L6285, cast.go#L540-L543).
  • Space padding is re-added when values are returned to the client over pgwire (types.go#L105-L114), so SELECT v FROM fab shows ' ' in a psql client — but SQL expressions like length(v) operate on the stripped value.

Demonstration (CockroachDB v25.4.5 and v24.1.25, matches PostgreSQL 18.3):

CREATE TABLE fab (id INT NOT NULL PRIMARY KEY, v CHAR(3));

INSERT INTO fab VALUES (1, NULL), (2, ''), (3, ' '), (4, '   '),
                       (5, 'abc'), (6, 'a  '), (7, ' b '), (8, '  c');

SELECT *, length(v) FROM fab;
--  id |  v   | length
-- ----+------+--------
--   1 | NULL |   NULL
--   2 |      |      0
--   3 |      |      0   -- trailing space stripped
--   4 |      |      0   -- all spaces stripped
--   5 | abc  |      3
--   6 | a    |      1   -- trailing spaces stripped
--   7 |  b   |      2   -- trailing space stripped
--   8 |   c  |      3

Compare with STRING(3) / VARCHAR(3), where trailing spaces are preserved:

--   3 |      |      1   -- space preserved
--   4 |      |      3   -- spaces preserved
--   6 | a    |      3
--   7 |  b   |      3

This matches the PostgreSQL docs:

Trailing spaces are treated as semantically insignificant and disregarded when comparing two values of type character. Trailing spaces are removed when converting a character value to one of the other string types.


CDC / Changefeed impact (undocumented)

This behavior causes data loss in CDC exports because the changefeed reads values via SQL, not pgwire padding. A CHAR(3) column with value ' ' (one space) is exported as an empty string:

$ cat changefeed_output.csv | sort
1,NULL
2,
3,         # ← was ' ', exported as empty
4,         # ← was '   ', exported as empty
5,abc
6,a        # ← was 'a  ', trailing spaces lost
7," b"
8,"  c"

This is particularly impactful for customers with large datasets (e.g., hundreds of millions of rows) using CHAR columns expecting space-preservation in exports.


Recommended workaround for CDC

Use rpad() in a changefeed SELECT projection to explicitly restore padding:

CREATE CHANGEFEED INTO '<sink>'
WITH initial_scan = 'only', format = csv
AS SELECT id, rpad(v, 3) FROM fab;

Output:

1,NULL
2,"   "
3,"   "
4,"   "
5,abc
6,a  
7," b "
8,"  c"

Suggested doc improvements

  1. Clarify the CHAR(n) description: Explain that trailing spaces are stripped at storage time and re-added only when values are returned via the wire protocol — they are NOT present in SQL expression evaluation (e.g., length(), comparisons, casts).
  2. Add a note on CHAR vs STRING/VARCHAR: If trailing space preservation is required, STRING(n) / VARCHAR(n) should be used instead.
  3. Add a CDC/Changefeed note: Document that CHAR(n) columns will have trailing spaces stripped in changefeed CSV/Avro exports, and recommend using rpad(v, n) in a changefeed SELECT projection as a workaround.

References

Jira issue: DOC-17107

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions