> ## Documentation Index
> Fetch the complete documentation index at: https://docs.odigos.io/llms.txt
> Use this file to discover all available pages before exploring further.

# ClickHouse

> Configuring the ClickHouse backend (Self-Hosted)

### Getting Started

<img src="https://d15jtxgb40qetw.cloudfront.net/clickhouse.svg" alt="clickhouse" className="not-prose h-20" />

ClickHouse is very popular database for storing telemetry data and running efficient queries on it.

**Use Cases**

* ClickHouse can handle all 3 signals (metrics, logs, traces) in one single database. less overhead to manage multiple databases.
* ClickHouse is a columnar database, which is optimized for time-series immutable data like OpenTelemetry telemetry data.
* ClickHouse is a distributed database, which can scale horizontally and handle large amounts of data with efficient storage and query performance.
* ClickHouse is open-source and has a large community.

**Prerequisites**

* To use ClickHouse as a destination in Odigos, you need to have a ClickHouse deployment running somewhere and accessible from cluster where Odigos is running.
* You should know the service endpoint where ClickHouse listens for incoming client connections.
* If you haven't already, create a database in ClickHouse where you want to store the telemetry data. the default database name is `otel` (configurable). To create it, run the following SQL command: `CREATE DATABASE otel;`
* Understanding of how to maintain, scale, and optimize your self-hosted ClickHouse deployment, as well as how to fine tune setting based on your queries and use case.

**Schema**

When using the ClickHouse destination with Odigos, logs metrics and traces are going to be "INSERT INTO" the ClickHouse database.

It is important to understand what schema is used, e.g. what table names are used, column names, data types, etc.
Then you can run queries on this data, or modify and optimize it for your specific use case.

There are 2 modes of operation for ClickHouse destination in Odigos:

**Create Schema**

In this option, the schema will be automatically created by odigos with reasonable defaults.

The benefit of this option is that you can see the value fast, without needing to apply and manage any schema yourself.
The downside is that the schema may not be optimized for your specific use case, and may make changes more complicated.

To use it:

* Odigos UI - When adding a new ClickHouse destination, select the `Create Scheme` checkbox field.
* Destination K8s Manifest - Set the `CLICKHOUSE_CREATE_SCHEME` setting to value `true`.

The schema which will be used by default can be found [here](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/a80fdb94cecb1cd3c1fbd7fbe3000396cd588970/exporter/clickhouseexporter/example/default_ddl).

* Indexes - The default schema includes indexes for the data, including trace\_id, resource and scope attributes, span/metric/log attributes etc.
* TTL is set to 180 days. This means data will be kept in the database for this period of time and then deleted.
* Partitioning - The default schema includes partitioning by day, which means data is stored in separate partitions for each day.
* Order By - Optimized for trace queries on service\_name + span\_name + time, for logs on service\_name + time, and for metrics on service\_name + metric\_name + attributes + time.

This option is not recommended for production workloads:

* You may want to adjust the settings to better fit your use case, scale performance requirements, and costs.
* Each new exporter will attempt to create the schema, which is less robust and harder to manage than a pre-created schema.

**Self Managed Schema**

With this option, you are responsible for creating and managing the schema yourself.

To use it:

* Odigos UI - Unselect the `Create Scheme` checkbox field.
* Destination K8s Manifest - Set the `CLICKHOUSE_CREATE_SCHEME` setting to value `false`.

The benefit of this option is that you have full control over the schema, and can optimize it for your specific use case.

How it works?

* Browse to the [default DDL example](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/a80fdb94cecb1cd3c1fbd7fbe3000396cd588970/exporter/clickhouseexporter/example/default_ddl).
* Copy the `sql` files to your local machine.
* Make any changes you need to the schema.
* Run the SQL files in your ClickHouse database to create the schema.

This option is recommended for production workloads:

* You can optimize the schema for your specific use case, scale performance requirements, and costs.
* You can manage the schema in a version control system, and apply changes in a controlled way.
* Applying changes to the schema is more robust and easier to manage than attempting to create it on the fly with each new connection.

Important Settings:

* Indexes - You may want to add or remove indexes based on your queries, to optimize performance and costs.
* TTL - You may want to adjust the TTL based on your retention policy. If you need traces for auditing purposes, you may extend it to 365 days. For high-throughput, low-latency systems, you might want to reduce it to 90 days or even less.
* Partitioning - partitioning by day is a good default, but in high throughput systems you may want to partition by hour or even minute. You may also consider partitioning by service\_name, or other attributes.
* Order By - You may want to adjust the order by clause based on your queries, so that common columns are used first in the query, to optimize performance.

### Configuring Destination Fields

<Accordion title="Supported Signals:">
  ✅ Traces
  ✅ Metrics
  ✅ Logs
  ❌ Profiles
</Accordion>

* **CLICKHOUSE\_ENDPOINT** `string` : Endpoint. The ClickHouse endpoint is the URL where the ClickHouse server is listening for incoming connections.
  * This field is required
  * Example: `http://host:port`
* **CLICKHOUSE\_USERNAME** `string` : Username. If Clickhouse Authentication is used, provide the username
  * This field is optional
* **CLICKHOUSE\_PASSWORD** `string` : Password. If Clickhouse Authentication is used, provide the password
  * This field is optional
* **CLICKHOUSE\_CREATE\_SCHEME** `boolean` : Create Schema. Should the destination create the schema for you? Set to `false` if you manage your own schema, or `true` to have Odigos create the schema for you
  * This field is required and defaults to `True`
* **CLICKHOUSE\_DATABASE\_NAME** `string` : Database Name. The name of the Clickhouse Database where the telemetry data will be stored. The Database will not be created when not exists, so make sure you have created it before
  * This field is required and defaults to `otel`
* **CLICKHOUSE\_TLS\_ENABLED** `boolean` : Enable TLS. Secure connection
  * This field is optional and defaults to `False`
* **CLICKHOUSE\_INSECURE\_SKIP\_VERIFY** `boolean` : Insecure Skip Verify. Skip TLS certificate verification
  * This field is optional and defaults to `False`
* **CLICKHOUSE\_USE\_CUSTOM\_CA** `boolean` : Use Custom CA Certificate. Use a custom CA certificate instead of the system root CA
  * This field is optional and defaults to `False`
* **CLICKHOUSE\_CA\_PEM** `string` : Certificate Authority (CA PEM). CA certificate in PEM format to verify the server
  * This field is optional
  * Example: `-----BEGIN CERTIFICATE-----`
* **CLICKHOUSE\_TRACES\_TABLE** `string` : Traces Table. Name of the ClickHouse Table to use for storing trace spans. This name should be used in span queries
  * This field is required and defaults to `otel_traces`
* **CLICKHOUSE\_LOGS\_TABLE** `string` : Logs Table. Name of the ClickHouse Table to use for storing logs
  * This field is required and defaults to `otel_logs`
* **CLICKHOUSE\_METRICS\_TABLE\_EXP\_HISTOGRAM** `string` : Metrics Table - Exp. Histogram. Name of the table for storing exponential histogram metrics
  * This field is required and defaults to `otel_metrics_exponential_histogram`
* **CLICKHOUSE\_METRICS\_TABLE\_GAUGE** `string` : Metrics Table - Gauge. Name of the table for storing gauge metrics
  * This field is required and defaults to `otel_metrics_gauge`
* **CLICKHOUSE\_METRICS\_TABLE\_HISTOGRAM** `string` : Metrics Table - Histogram. Name of the table for storing histogram metrics
  * This field is required and defaults to `otel_metrics_histogram`
* **CLICKHOUSE\_METRICS\_TABLE\_SUM** `string` : Metrics Table - Sum. Name of the table for storing sum metrics
  * This field is required and defaults to `otel_metrics_sum`
* **CLICKHOUSE\_METRICS\_TABLE\_SUMMARY** `string` : Metrics Table - Summary. Name of the table for storing summary metrics
  * This field is required and defaults to `otel_metrics_summary`
* **HYPERDX\_LOG\_NORMALIZER** `boolean` : Enable Log Normalizer. Enable HyperDX log normalization processor. This parses JSON from log bodies, infers severity levels, and normalizes log attributes for better querying.
  * This field is optional and defaults to `False`

### Adding Destination to Odigos

There are two primary methods for configuring destinations in Odigos:

##### **Using the UI**

<Steps>
  <Step>
    Use the [Odigos CLI](https://docs.odigos.io/cli/odigos_ui) to access the UI

    ```bash theme={null}
    odigos ui
    ```
  </Step>

  <Step>
    Click on `Add Destination`, select `Clickhouse` and follow the on-screen instructions
  </Step>
</Steps>

##### **Using Kubernetes manifests**

<Steps>
  <Step>
    Save the YAML below to a file (e.g. `clickhouse.yaml`)

    ```yaml theme={null}
    apiVersion: odigos.io/v1alpha1
    kind: Destination
    metadata:
      name: clickhouse-example
      namespace: odigos-system
    spec:
      data:
        CLICKHOUSE_CREATE_SCHEME: '<Create Schema (default: True)>'
        CLICKHOUSE_DATABASE_NAME: '<Database Name (default: otel)>'
        CLICKHOUSE_ENDPOINT: <Endpoint>
        CLICKHOUSE_LOGS_TABLE: '<Logs Table (default: otel_logs)>'
        CLICKHOUSE_METRICS_TABLE_EXP_HISTOGRAM: '<Metrics Table - Exp. Histogram (default:
          otel_metrics_exponential_histogram)>'
        CLICKHOUSE_METRICS_TABLE_GAUGE: '<Metrics Table - Gauge (default: otel_metrics_gauge)>'
        CLICKHOUSE_METRICS_TABLE_HISTOGRAM: '<Metrics Table - Histogram (default: otel_metrics_histogram)>'
        CLICKHOUSE_METRICS_TABLE_SUM: '<Metrics Table - Sum (default: otel_metrics_sum)>'
        CLICKHOUSE_METRICS_TABLE_SUMMARY: '<Metrics Table - Summary (default: otel_metrics_summary)>'
        CLICKHOUSE_TRACES_TABLE: '<Traces Table (default: otel_traces)>'
        # Note: The commented fields below are optional.
        # CLICKHOUSE_USERNAME: <Username>
        # CLICKHOUSE_TLS_ENABLED: <Enable TLS>
        # CLICKHOUSE_INSECURE_SKIP_VERIFY: <Insecure Skip Verify>
        # CLICKHOUSE_USE_CUSTOM_CA: <Use Custom CA Certificate>
        # HYPERDX_LOG_NORMALIZER: <Enable Log Normalizer>
      destinationName: clickhouse
      # Uncomment the 'secretRef' below if you are using the optional Secret.
      # secretRef:
      #   name: clickhouse-secret
      signals:
      - TRACES
      - METRICS
      - LOGS
      type: clickhouse

    ---

    # The following Secret is optional. Uncomment the entire block if you need to use it.
    # apiVersion: v1
    # data:
    #   CLICKHOUSE_CA_PEM: <Base64 Certificate Authority (CA PEM)>
    #   CLICKHOUSE_PASSWORD: <Base64 Password>
    # kind: Secret
    # metadata:
    #   name: clickhouse-secret
    #   namespace: odigos-system
    # type: Opaque
    ```
  </Step>

  <Step>
    Apply the YAML using `kubectl`

    ```bash theme={null}
    kubectl apply -f clickhouse.yaml
    ```
  </Step>
</Steps>
