← Back to TIL

octet_length()

Jan 19, 2026

postgresqlsqlunicode

the problem

select length('𝓦𝓮𝓲𝓻𝓭 𝓒𝓱𝓪𝓻𝓪𝓬𝓽𝓮𝓻');

returns:

7

this is not incorrect. length() counts characters, and those unicode glyphs are single code points.

the problem shows up later.

you validate with length(), but the database enforces limits in bytes.

what actually breaks

example:

varchar(10)

your app logic:

  • length(text) <= 10 → passes

postgres storage:

  • utf-8 bytes > 10 → insert FAILS

result: validation passes but error on insert.

the fix

validate using bytes when storage limits matter:

select octet_length(text) <= 10

mental model

  • length() → semantic length (characters)
  • octet_length() → physical length (bytes on disk / wire)

variation

  • bit_length() → bit length (bits)
  • byte_length() → byte length (bigquery)
  • char_length() → character length (characters)

read more: