Troubleshooting
Problem
When using LENGTH command to determine the size of a column BigSQL is returning a different size than the same column in Hive.
Symptom
db2 "CREATE HADOOP TABLE BIGSQL.TEST_TABLE (col1 VARCHAR(512)) STORED AS PARQUETFILE "
The SQL command completed successfully.
db2 "insert into bigsql.test_table values('•')"
The SQL command completed successfully.
db2 "select length(col1) from bigsql.test_table"
1
-----------
13
1 record(s) selected
hive> select length(col1) from bigsql.test_table;
OK
6
The SQL command completed successfully.
db2 "insert into bigsql.test_table values('•')"
The SQL command completed successfully.
db2 "select length(col1) from bigsql.test_table"
1
-----------
13
1 record(s) selected
hive> select length(col1) from bigsql.test_table;
OK
6
Cause
This is expected behavior. The LENGTH function in Big SQL counts bytes, whereas LENGTH function in Hive counts characters.
For Big SQL, the CHARACTER_LENGTH function counts characters, and for Hive, the OCTET_LENGTH function counts bytes.
Resolving The Problem
To compare accurately:
use LENGTH in Hive to compare to CHARACTER_LENGTH in BigSQL
OR
use OCTET_LENGTH in Hive to compare to LENGTH in BigSQL
use LENGTH in Hive to compare to CHARACTER_LENGTH in BigSQL
OR
use OCTET_LENGTH in Hive to compare to LENGTH in BigSQL
Related Information
Document Location
Worldwide
[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSCRJT","label":"IBM Db2 Big SQL"},"Component":"","Platform":[{"code":"PF016","label":"Linux"}],"Version":"4.2, 4.2.5, 5.0.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]
Was this topic helpful?
Document Information
Modified date:
08 April 2021
UID
ibm10870828