ORC

| Input | Output | Alias |
|-------|--------|-------|
| ✔     | ✔      |       |

Description

Apache ORC is a columnar storage format widely used in the Hadoop ecosystem.

Data Types Matching

The table below shows the supported ORC data types and their corresponding ClickHouse data types in INSERT and SELECT queries.

| ORC data type (INSERT) | ClickHouse data type | ORC data type (SELECT) |
|------------------------|----------------------|------------------------|
| Boolean | UInt8 | Boolean |
| Tinyint | Int8/UInt8/Enum8 | Tinyint |
| Smallint | Int16/UInt16/Enum16 | Smallint |
| Int | Int32/UInt32 | Int |
| Bigint | Int64/UInt64 | Bigint |
| Float | Float32 | Float |
| Double | Float64 | Double |
| Decimal | Decimal | Decimal |
| Date | Date32 | Date |
| Timestamp | DateTime64 | Timestamp |
| String, Char, Varchar, Binary | String | Binary |
| List | Array | List |
| Struct | Tuple | Struct |
| Map | Map | Map |
| Int | IPv4 | Int |
| Binary | IPv6 | Binary |
| Binary | Int128/UInt128/Int256/UInt256 | Binary |
| Binary | Decimal256 | Binary |
  • Other types are not supported.
  • Arrays can be nested and can have a value of the Nullable type as an argument. Tuple and Map types can also be nested.
  • The data types of ClickHouse table columns do not have to match the corresponding ORC data fields. When inserting data, ClickHouse interprets data types according to the table above and then casts the data to the data type set for the ClickHouse table column, as in the sketch after this list.
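
For example, a minimal sketch of such a cast (the table name orc_cast_example and the file name data.orc are hypothetical; it assumes the file's id column is an ORC Bigint, which ClickHouse reads as Int64 and then casts to the UInt16 column type):

# The target column id is UInt16; the ORC Bigint value is read as Int64 and cast on insert
$ clickhouse-client --query="CREATE TABLE orc_cast_example (id UInt16, name String) ENGINE = MergeTree ORDER BY id"
$ cat data.orc | clickhouse-client --query="INSERT INTO orc_cast_example FORMAT ORC"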

Example Usage

Inserting Data

You can insert ORC data from a file into a ClickHouse table using the following command:

$ cat filename.orc | clickhouse-client --query="INSERT INTO some_table FORMAT ORC"
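
Alternatively, assuming a ClickHouse version recent enough to support the FROM INFILE clause, the client can read the file itself:

$ clickhouse-client --query="INSERT INTO some_table FROM INFILE 'filename.orc' FORMAT ORC"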

Selecting Data

You can select data from a ClickHouse table and save it to a file in the ORC format using the following command:

$ clickhouse-client --query="SELECT * FROM {some_table} FORMAT ORC" > {filename.orc}
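
The same can be done with the INTO OUTFILE clause (a sketch; {some_table} and {filename.orc} remain placeholders):

$ clickhouse-client --query="SELECT * FROM {some_table} INTO OUTFILE '{filename.orc}' FORMAT ORC"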

Format Settings

| Setting | Description | Default |
|---------|-------------|---------|
| output_format_arrow_string_as_string | Use Arrow String type instead of Binary for String columns. | false |
| output_format_orc_compression_method | Compression method used in output ORC format. | none |
| input_format_arrow_case_insensitive_column_matching | Ignore case when matching Arrow columns with ClickHouse columns. | false |
| input_format_arrow_allow_missing_columns | Allow missing columns while reading Arrow data. | false |
| input_format_arrow_skip_columns_with_unsupported_types_in_schema_inference | Allow skipping columns with unsupported types during schema inference for the Arrow format. | false |
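
These settings can be applied per query. As a sketch, assuming the zstd codec is available in your build, the compression method can be passed as a clickhouse-client option:

$ clickhouse-client --output_format_orc_compression_method=zstd --query="SELECT * FROM {some_table} FORMAT ORC" > {filename.orc}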

To exchange data with Hadoop, you can use the HDFS table engine.
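
For example, a minimal sketch of an HDFS-backed table reading ORC data (the URI hdfs://namenode:9000/orc/data.orc and the column list are hypothetical):

# Declare a table over a remote ORC file, then query it like any other table
$ clickhouse-client --query="CREATE TABLE hdfs_orc (id UInt32, name String) ENGINE = HDFS('hdfs://namenode:9000/orc/data.orc', 'ORC')"
$ clickhouse-client --query="SELECT * FROM hdfs_orc"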