07:50
An Interlude: Why you do not want to use C/POSIX locale.
30 October 2024
(A quick interlude in our series on locales and character encodings in PostgreSQL.)
C/POSIX locale is only useful if you know that you will never have characters in your strings other than 7-bit ASCII, or you are 100% sure that you are never going to have mixed client encodings. If you do have non-7-bit-ASCII characters with mixed client encodings, zaniness can result:
c=# \l List of databases Name | Owner | Encoding | Locale Provider | Collate | Ctype | Locale | ICU Rules | Access privileges -----------+------------------+----------+-----------------+-------------+-------------+-----------+-----------+------------------- c | xof | UTF8 | libc | POSIX | POSIX | | | (17 rows [othes deleted]) c=# set client_encoding='UTF8'; SET c=# CREATE TABLE m (v varchar); CREATE TABLE c=# INSERT INTO m VALUES('Í'); INSERT 0 1 c=# set client_encoding='LATIN1'; SET c=# INSERT INTO m VALUES('é'); INSERT 0 1 c=# SELECT * FROM m; v ---- ? é (2 rows) c=# set client_encoding='UTF8'; SET c=# SELECT * FROM m; v ---- Í Ã© (2 rows) c=# INSERT INTO m VALUES('‰'); INSERT 0 1 c=# SELECT * FROM m; v ---- Í Ã© ‰ (3 rows) c=# set client_encoding='LATIN1'; SET c=# SELECT * FROM m; ERROR: character with byte sequence 0xe2 0x80 0xb0 in encoding "UTF8" has no equivalent in encoding "LATIN1"
Even worse (in this regard) than C locale is SQL_ASCII
locale. Please never use this locale.
There are no comments yet.
Comment