WebAssembly User-Defined Functions
ClickHouse supports creating user-defined functions (UDFs) written in WebAssembly. This allows you to execute custom logic written in languages like Rust, C, C++, or others by compiling them to WebAssembly modules.
Overview
A WebAssembly module is a compiled binary file that contains one or more functions that can be called from ClickHouse. Think of a module as a library or shared object that you load once and reuse many times.
A WebAssembly module containing UDFs can be written in any language that compiles to WebAssembly, such as Rust, C, or C++.
Code compiled to WebAssembly (the "guest") is executed by ClickHouse (the "host") in a sandboxed environment and has access only to its own dedicated memory space.
Guest code exports functions that ClickHouse can invoke - these include the functions that implement your custom logic (used to define UDFs) as well as support functions required for memory management and data exchange between ClickHouse and the WebAssembly code.
Your code should be compiled to "freestanding" WebAssembly (also known as wasm32-unknown-unknown), without any dependencies on an operating system or standard library. Only the default 32-bit WebAssembly target is supported; the wasm64 extension is not.
The module must follow one of the supported communication protocols (ABIs) for interacting with ClickHouse.
Once compiled, the module's binary code is loaded into ClickHouse by inserting it into the system.webassembly_modules table.
After that, you can create UDFs that reference functions exported by the module using the CREATE FUNCTION ... LANGUAGE WASM statement.
Prerequisites
Enable WebAssembly support in your ClickHouse configuration:
Available Engine Implementations:
Quick Start
This example demonstrates the complete workflow of creating a WebAssembly UDF by implementing the Collatz conjecture calculator.
We'll write the code in WebAssembly Text format (WAT), a human-readable representation of WebAssembly, so no programming language toolchain is required at this stage.
ClickHouse requires the module in binary format, so the WAT source must first be converted to WASM.
For this conversion you can use wat2wasm from the WebAssembly Binary Toolkit (WABT) or the parse command from wasm-tools.
In the snippet above, we pipe the binary WASM code directly into the ClickHouse client, using FORMAT RawBlob to insert it into the system.webassembly_modules table.
Then we define the UDF that references the steps function exported by the module:
Note that we specify the function name from the module after ::, because it differs from the UDF name.
Now we can use the collatz_steps function in our queries:
The number column is explicitly cast to UInt32, because arguments passed to a WebAssembly function must exactly match the types in the signature given in the CREATE FUNCTION statement.
The result is the sequence of Collatz step counts for the numbers 1 to 100, which corresponds to sequence A006577 in the OEIS.
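One way to sanity-check the UDF output is to compare it against a native reference implementation. The Rust sketch below is not the module's code. It assumes the exported steps function counts the halving and 3n + 1 moves needed to reach 1, which is what A006577 tabulates, and its handling of 0 is a choice made here purely for illustration.

```rust
/// Reference Collatz step counter (a sketch, not the module's code).
/// Counts how many halving / 3n+1 moves it takes for `n` to reach 1.
pub fn collatz_steps(n: u32) -> u32 {
    // Assumption for this sketch: 0 never reaches 1, so report 0 steps.
    if n == 0 {
        return 0;
    }
    // Widen to u64 to leave headroom for the 3n + 1 step.
    let mut value = n as u64;
    let mut steps = 0u32;
    while value != 1 {
        value = if value % 2 == 0 { value / 2 } else { 3 * value + 1 };
        steps += 1;
    }
    steps
}
```

For instance, collatz_steps(27) returns 111, which matches the OEIS entry for n = 27.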
Manage WASM modules via system table
WebAssembly modules are stored in the system.webassembly_modules table having the following structure:
Columns:
- name (String): Module name. Non-empty, word characters only.
- code (String): Raw binary WASM code. Write-only, reads return empty string.
- hash (UInt256): SHA256 of the module binary (zero if present on disk but not yet loaded).
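The naming rule can be checked on the client before attempting an insert. The following Rust sketch assumes that "word characters" means ASCII letters, digits, and the underscore. That reading is an assumption, the helper name is hypothetical, and the server remains the authority on what it accepts.

```rust
/// Client-side pre-check for a module name (hypothetical helper).
/// Assumption: "word characters" are ASCII letters, digits, and '_'.
pub fn is_valid_module_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_')
}
```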
Module management happens through standard SQL operations on this table:
Insert a module
Optionally, provide an integrity hash:
If the provided hash does not match the computed SHA256 of the module code, the insertion fails. This is useful when loading modules from external sources such as S3 or HTTP.
Distribute a module across a cluster
system.webassembly_modules is a per-instance table — an INSERT lands only on the replica handling the connection. There is no ON CLUSTER form of the INSERT statement, so a subsequent CREATE FUNCTION ... ON CLUSTER will fail on replicas that do not have the module:
To fan an insert out to every node, write to the cluster table function instead of the local system.webassembly_modules table:
This pattern relies on the underlying distributed-write path visiting every replica within each shard, which only happens when the cluster is configured with internal_replication=false. With internal_replication=true (the default for clusters that use ReplicatedMergeTree to drive replication themselves), the insert is delivered to a single healthy replica per shard, and system.webassembly_modules is not replicated by that path — so some replicas will still be missing the module. In that configuration you need to insert against each replica individually, for example by iterating over system.clusters and writing via remote(...) per host, or by copying the binary into user_scripts/wasm/ on every host.
You can inspect internal_replication for a cluster with SELECT cluster, shard_num, internal_replication FROM system.clusters.
After the fanned-out insert, the module is present on every replica and CREATE FUNCTION ... ON CLUSTER succeeds:
You can verify the module is loaded everywhere with clusterAllReplicas:
Inserts into system.webassembly_modules are idempotent for the same (name, hash) pair, so re-running the fanned-out insert is safe and is a reasonable way to repair state after a replica has been replaced. Note that newly added servers do not retroactively receive existing modules — you must re-run the insert against the updated cluster, or place the binary into the user_scripts/wasm/ directory on the new host.
List modules
Delete a module
Deletion is performed with the DELETE FROM system.webassembly_modules WHERE name = '...' statement.
The predicate must be either name = 'literal' for exact match or name LIKE 'pattern' to delete every module whose name matches the pattern; no other shapes are accepted.
If any existing UDFs reference one of the matched modules, the deletion fails, so you must drop those UDFs first.
Create a WebAssembly UDF
Syntax:
Parameters:
- function_name: Name of the function in ClickHouse. May differ from the exported function name in the module.
- FROM 'module_name' :: 'source_function_name': Name of the loaded WASM module and the name of the function in that module to use (the function name defaults to function_name).
- ARGUMENTS: List of argument names and types. Names are optional and are used by serialization formats that support named fields.
- ABI: Application Binary Interface version:
  - ROW_DIRECT: Direct type mapping, row-by-row processing.
  - BUFFERED_V1: Block-based processing with serialization.
  - ASSEMBLYSCRIPT: Row-by-row processing for modules produced by the AssemblyScript compiler. Numeric types map to AssemblyScript primitives; ClickHouse String maps to AssemblyScript string.
- DETERMINISTIC: Declares the function as deterministic, meaning it always returns the same output for the same input. When specified, ClickHouse may constant-fold calls where all arguments are constants: the function is evaluated once at query analysis time and the result is reused for every row.
- SHA256_HASH: Expected module hash for verification (auto-filled if omitted). Can be used to ensure that the correct WASM module is loaded across different replicas.
- SETTINGS: Per-function settings:
  - serialization_format (String): Serialization format, for ABIs that require one. Default: MsgPack.
ABI Versions
To interact with ClickHouse, WebAssembly modules must adhere to one of the supported ABIs (Application Binary Interfaces).
- ROW_DIRECT: Direct type mapping (primitive types Int32, UInt32, Int64, UInt64, Float32, Float64 only)
- BUFFERED_V1: Complex types with serialization
- ASSEMBLYSCRIPT: Row-by-row interop with AssemblyScript modules; supports numeric types and String.
ABI ROW_DIRECT
Calls an exported WASM function directly per row.
- Arguments and return types are numeric types: Int32/UInt32/Int64/UInt64/Float32/Float64/Int128/UInt128.
- Strings are not supported in this ABI.
- Signatures must match the WASM export (i32/i64/f32/f64/v128).
- No support functions are required to be exported by the module.
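The correspondence between the two lists above can be written out as a lookup. This Rust sketch is illustrative only: the function name is hypothetical, and the pairing (in particular 128-bit integers with v128) is inferred from the order of the lists rather than taken from ClickHouse itself.

```rust
/// Maps a ClickHouse type name to the WebAssembly value type that an
/// exported function would use for it. The pairing is inferred from the
/// two lists above, not taken from ClickHouse source code.
pub fn wasm_value_type(clickhouse_type: &str) -> Option<&'static str> {
    match clickhouse_type {
        "Int32" | "UInt32" => Some("i32"),
        "Int64" | "UInt64" => Some("i64"),
        "Float32" => Some("f32"),
        "Float64" => Some("f64"),
        // Assumption: 128-bit integers travel as the v128 vector type.
        "Int128" | "UInt128" => Some("v128"),
        // Strings and other types are not available in ROW_DIRECT.
        _ => None,
    }
}
```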
For example, a function with the signature:
Can be created as:
WebAssembly does not distinguish between signed and unsigned arguments; instead, it uses different instructions to interpret the values. Thus, the size of each argument must match exactly, while signedness is determined by the operations inside the function.
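The point about signedness can be seen with ordinary integer arithmetic. The Rust sketch below is not UDF code; it takes the same 32-bit pattern and divides it once as a signed value and once as an unsigned value, which is comparable to choosing between the i32.div_s and i32.div_u instructions in WebAssembly.

```rust
/// Halve a 32-bit value, treating its bits as a signed integer
/// (comparable to what WebAssembly's i32.div_s does).
pub fn halve_signed(bits: u32) -> u32 {
    ((bits as i32) / 2) as u32
}

/// Halve the same bits, treating them as unsigned (comparable to i32.div_u).
pub fn halve_unsigned(bits: u32) -> u32 {
    bits / 2
}
```

For the bit pattern 0xFFFFFFFE, the first function yields 0xFFFFFFFF (that is, -2 / 2 = -1) and the second yields 0x7FFFFFFF. The argument width is identical in both cases, and only the operation applied inside the function differs.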
ABI BUFFERED_V1
This ABI is experimental and subject to change in future releases.
Processes entire blocks at once, using (de)serialization through WASM memory. Supports any argument and return types.
Serialized data is copied into WASM memory and passed to the UDF as a pointer to a buffer (a structure consisting of a pointer to the data and the size of the data), along with the number of rows in the input. Thus, on the WASM side the user-defined function always accepts two i32 arguments and returns a single i32 value.
Guest code processes the data and returns a pointer to the result buffer with serialized result data.
The guest code must provide two functions to create and destroy these buffers.
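As a rough illustration of the buffer handling described above, here is a Rust sketch that runs natively. The struct layout, field names, and function signatures are assumptions made for this example, not the definitive ABI. In a real module the two helpers would be exported under the names ClickHouse expects (clickhouse_create_buffer and clickhouse_destroy_buffer), and on wasm32 every pointer below is a 32-bit value.

```rust
use std::ptr;

/// Buffer descriptor: a pointer to the data and the size of the data.
/// Field names and order are assumptions made for this sketch.
#[repr(C)]
pub struct Buffer {
    pub data: *mut u8,
    pub size: u32,
}

/// Allocates a zero-filled buffer of `size` bytes and returns its descriptor.
pub fn create_buffer(size: u32) -> *mut Buffer {
    let bytes = vec![0u8; size as usize].into_boxed_slice();
    let data = Box::into_raw(bytes) as *mut u8;
    Box::into_raw(Box::new(Buffer { data, size }))
}

/// Frees a buffer previously returned by `create_buffer`.
pub unsafe fn destroy_buffer(buffer: *mut Buffer) {
    if buffer.is_null() {
        return;
    }
    unsafe {
        // Reclaim the descriptor, then the byte array it points to.
        let descriptor = Box::from_raw(buffer);
        let bytes = ptr::slice_from_raw_parts_mut(descriptor.data, descriptor.size as usize);
        drop(Box::from_raw(bytes));
    }
}

/// UDF-shaped function: takes an input buffer and a row count and returns a
/// new buffer. This one simply copies the serialized input bytes unchanged.
pub unsafe fn echo_udf(input: *const Buffer, _num_rows: u32) -> *mut Buffer {
    unsafe {
        let len = (*input).size as usize;
        let output = create_buffer((*input).size);
        ptr::copy_nonoverlapping((*input).data as *const u8, (*output).data, len);
        output
    }
}
```

Which side frees the input and output buffers is not specified in this sketch; a caller exercising it natively would destroy both buffers itself.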
Example C definitions:
ABI ASSEMBLYSCRIPT
Targets modules produced by the AssemblyScript compiler. Each row triggers one call into the exported function, mapping ClickHouse values to AssemblyScript primitives and string objects.
Supported types:
- Numeric: Int8/UInt8, Int16/UInt16 (widened to i32 at the boundary), Int32/UInt32, Int64/UInt64, Float32, Float64.
- String: maps to AssemblyScript string (UTF-16 in WASM memory). ClickHouse handles the UTF-8 ↔ UTF-16 conversion automatically.
- Custom AssemblyScript classes are not supported as argument or return types, because their runtime class ids are not stable across compilations (see AssemblyScript#2982).
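To make the String conversion concrete, the following Rust sketch shows what a UTF-8 to UTF-16 round trip looks like using only the standard library. It illustrates the encoding change itself, not ClickHouse's internal implementation, and the helper names are hypothetical.

```rust
/// Encodes a UTF-8 Rust string as UTF-16 code units.
pub fn utf8_to_utf16(text: &str) -> Vec<u16> {
    text.encode_utf16().collect()
}

/// Decodes UTF-16 code units back into a UTF-8 string.
/// Returns None for invalid input such as an unpaired surrogate.
pub fn utf16_to_utf8(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}
```

A practical consequence is that lengths differ between the two sides: the five-character string "héllo" occupies 6 bytes in UTF-8 but 5 code units (10 bytes) in UTF-16, and a character outside the Basic Multilingual Plane takes two UTF-16 code units.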
Module requirements:
The module must be compiled with the AssemblyScript managed runtime so that __new, __pin, and __unpin are exported. The standard incoming/outgoing string handling expects these. The recommended invocation:
AssemblyScript also imports env.abort for runtime traps (out-of-memory, bounds checks, etc.). ClickHouse provides this import automatically: when an abort is triggered, the active query fails with a WASM_ERROR exception that includes the decoded AssemblyScript message and source location.
Example:
After compiling with asc and loading the resulting .wasm into system.webassembly_modules, declare the UDFs as:
Note for developing UDFs in Rust
For Rust programs we provide a helper crate, clickhouse-wasm-udf, to simplify development of WebAssembly UDFs for ClickHouse. The crate provides functions for memory management, so you don't need to implement clickhouse_create_buffer and clickhouse_destroy_buffer manually; adding the crate as a dependency is enough. It also provides the #[clickhouse_wasm_udf] macro, which wraps your regular Rust functions into the required ABI format.
With the crate you can write UDFs like this:
The macro generates a wrapper function that accepts and returns buffer structures and handles serialization and deserialization automatically using serde.
Host API available to modules
The following host functions may be imported and used by modules:
- clickhouse_server_version() -> i64: returns the ClickHouse server version as an integer (e.g. 25011001 for v25.11.1.1).
- clickhouse_throw(ptr: i32, size: i32): throws an error with the provided message. Accepts a pointer to the memory location containing the error message string and the size of the string.
- clickhouse_log(ptr: i32, size: i32): logs a message to the ClickHouse server text log.
- clickhouse_random(ptr: i32, size: i32): fills memory with random bytes.
- env.abort(message: i32, fileName: i32, line: i32, column: i32): supplied for AssemblyScript-compatible modules. Calling it (or triggering an AssemblyScript runtime trap that calls it) terminates the UDF with a WASM_ERROR exception containing the decoded message and source location. Modules that do not import env.abort are unaffected.
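A module that needs to branch on the server version has to unpack the integer returned by clickhouse_server_version. The Rust sketch below assumes the layout major * 1,000,000 + minor * 1,000 + patch, which is inferred from the single example above (25011001 for v25.11.1.1) and is not otherwise confirmed here. Both helper names are hypothetical.

```rust
/// Splits a packed version integer into (major, minor, patch).
/// Assumed layout: major * 1_000_000 + minor * 1_000 + patch.
pub fn decode_version(packed: i64) -> (i64, i64, i64) {
    (packed / 1_000_000, (packed / 1_000) % 1_000, packed % 1_000)
}

/// True if the packed version is at least `major.minor`.
pub fn is_at_least(packed: i64, major: i64, minor: i64) -> bool {
    let (actual_major, actual_minor, _) = decode_version(packed);
    (actual_major, actual_minor) >= (major, minor)
}
```

Under that assumption, decode_version(25011001) gives (25, 11, 1).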
Settings
The following query-level settings control WebAssembly UDF execution:
- webassembly_udf_max_fuel: Fuel limit per WebAssembly UDF instance execution. Each WebAssembly instruction consumes some amount of fuel. Set to 0 for no limit.
- webassembly_udf_max_memory: Memory limit in bytes per WebAssembly UDF instance.
- webassembly_udf_max_input_block_size: Maximum number of rows passed to a WebAssembly UDF in a single block. Set to 0 to process all rows at once.
- webassembly_udf_max_instances: Maximum number of WebAssembly UDF instances that can run in parallel per function.
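The effect of webassembly_udf_max_input_block_size can be pictured as splitting the input rows into consecutive ranges. The Rust sketch below only models the documented semantics (a positive limit caps the rows per call, and 0 means everything in one call). It is not ClickHouse's implementation, and the function name is hypothetical.

```rust
/// Splits `total_rows` into half-open (start, end) ranges of at most
/// `max_block_size` rows each; a limit of 0 means "no splitting".
pub fn block_ranges(total_rows: usize, max_block_size: usize) -> Vec<(usize, usize)> {
    if total_rows == 0 {
        return Vec::new();
    }
    if max_block_size == 0 {
        return vec![(0, total_rows)];
    }
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < total_rows {
        // saturating_add guards against overflow for very large limits.
        let end = std::cmp::min(start.saturating_add(max_block_size), total_rows);
        ranges.push((start, end));
        start = end;
    }
    ranges
}
```

With 10 rows and a limit of 4, the function would be invoked on three blocks covering rows 0..4, 4..8, and 8..10.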
Example usage: