ClickHouse Integration with Proxus
Learn how to integrate Proxus with an existing ClickHouse instance, send device data, and leverage high-performance analytics for real-time monitoring and insights.
Introduction
This guide explains how to integrate Proxus with an existing ClickHouse instance, a high-performance columnar database optimized for real-time analytics. By connecting Proxus to ClickHouse, you can store and analyze device data at scale, enabling actionable insights and robust monitoring.
What is ClickHouse?
ClickHouse is an open-source columnar database management system designed for online analytical processing (OLAP). It excels at handling large volumes of data with low-latency queries, making it ideal for time-series and real-time analytics.
Why Integrate Proxus with ClickHouse?
Integrating Proxus with ClickHouse offers several benefits:
- High-Performance Analytics: Process massive datasets with sub-second query times.
- Real-Time Data Ingestion: Stream device data for immediate analysis.
- Scalability: Handle growing data volumes effortlessly.
- Cost Efficiency: Leverage an open-source solution for enterprise-grade analytics.
Preparing ClickHouse
This guide assumes you already have a running ClickHouse instance. Ensure it’s configured to accept connections from Proxus.
Verify Connectivity
Test connectivity to your ClickHouse server using the HTTP interface (e.g., http://<clickhouse-host>:8123/ping). A reachable server responds with the text Ok. (trailing period included).
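The connectivity check above can also be scripted. The following is a minimal sketch using only the Python standard library; the function names ping_url and is_reachable are hypothetical helpers, not part of Proxus or ClickHouse.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen


def ping_url(host: str, port: int = 8123, scheme: str = "http") -> str:
    """Build the URL of ClickHouse's HTTP health-check endpoint."""
    return urlunsplit((scheme, f"{host}:{port}", "/ping", "", ""))


def is_reachable(host: str, port: int = 8123, timeout: float = 3.0) -> bool:
    """Return True if the server answers /ping with a body starting with 'Ok'."""
    try:
        with urlopen(ping_url(host, port), timeout=timeout) as response:
            body = response.read().decode().strip()
            return response.status == 200 and body.lower().startswith("ok")
    except OSError:  # URLError, timeouts, and refused connections all derive from OSError
        return False
```

Calling is_reachable("localhost") from the machine that runs Proxus is a quick way to rule out firewall or DNS problems before configuring the channel; replace localhost with your own host.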
Check Credentials
Confirm you have the correct username and password for ClickHouse access (e.g., admin/clickhouse123).
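One way to confirm the credentials is to run a trivial query over the HTTP interface. The sketch below assumes the server accepts the X-ClickHouse-User and X-ClickHouse-Key authentication headers on its HTTP port; build_auth_check and credentials_work are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_auth_check(host: str, user: str, password: str, port: int = 8123) -> Request:
    """Prepare a 'SELECT 1' request that authenticates via ClickHouse's HTTP headers."""
    url = f"http://{host}:{port}/?{urlencode({'query': 'SELECT 1'})}"
    return Request(url, headers={"X-ClickHouse-User": user, "X-ClickHouse-Key": password})


def credentials_work(request: Request, timeout: float = 3.0) -> bool:
    """Send the request; a response body of '1' means the login succeeded."""
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.read().decode().strip() == "1"
    except OSError:  # rejected logins surface as HTTPError, a subclass of OSError
        return False
```

For example, credentials_work(build_auth_check("localhost", "admin", "clickhouse123")) should return True when the example account from this guide exists on the server.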
Configuring Proxus
Defining a ClickHouse Outbound Channel
Log in to Proxus
Log in to the Proxus web interface.
Navigate to Outbound Channels
Go to Integrations > Outbound Channels.
Create New Channel
Click + New to add a new outbound channel.
Fill in Channel Details
Provide the following details:
- Target Type: ClickHouse
- Profile Name: ClickHouseAnalytics (example name)
- Description: ClickHouse Integration for Device Data
- Transport Strategy: Pass Through Strategy
Adding ClickHouse Parameters
Go to Parameters Tab
Navigate to the Parameters tab in the channel configuration.
Add Key-Value Pairs
Add these parameters to connect to your ClickHouse instance:
Parameter | Details |
---|---|
Host | ClickHouse server hostname or IP |
Port | ClickHouse HTTP port (default: 8123) |
Username | ClickHouse username |
Password | ClickHouse password |
Database | Target database name |
Table | Target table name |
WriteIntervalSeconds | Buffer flush interval in seconds |
TTLExpression | Optional: Time-to-live expression for data retention |
- Key: Host, Value: <clickhouse-host> (e.g., localhost)
- Key: Port, Value: 8123 (or your ClickHouse HTTP port)
- Key: Username, Value: admin (or your username)
- Key: Password, Value: clickhouse123 (or your password)
- Key: Database, Value: proxus (default or your database)
- Key: Table, Value: DeviceRawData (default or your table)
- Key: WriteIntervalSeconds, Value: 5 (flush the buffer every 5 seconds)
- Key: TTLExpression, Value: See the TTL Expression Examples section (optional)
Save Configuration
Click Save to apply the settings.
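Before typing the key-value pairs into the Parameters tab, it can help to sanity-check them. The sketch below is a hypothetical pre-flight check, not something Proxus runs; treating every key except TTLExpression as required is an assumption based on the parameter list above.

```python
# Example values taken from this guide; replace them with your own.
EXAMPLE_PARAMETERS = {
    "Host": "localhost",
    "Port": "8123",
    "Username": "admin",
    "Password": "clickhouse123",
    "Database": "proxus",
    "Table": "DeviceRawData",
    "WriteIntervalSeconds": "5",
}

# Assumption: everything except the optional TTLExpression must be filled in.
REQUIRED_KEYS = tuple(EXAMPLE_PARAMETERS)
NUMERIC_KEYS = ("Port", "WriteIntervalSeconds")


def validate_parameters(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means the set looks sane."""
    problems = []
    for key in REQUIRED_KEYS:
        if not str(params.get(key, "")).strip():
            problems.append(f"missing value for {key}")
    for key in NUMERIC_KEYS:
        value = str(params.get(key, "")).strip()
        if value and not (value.isdecimal() and int(value) > 0):
            problems.append(f"{key} must be a positive integer, got {value!r}")
    return problems
```

validate_parameters(EXAMPLE_PARAMETERS) returns an empty list, while a non-numeric Port or a missing Host is reported as a readable message.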
TTL Expression Examples
The TTLExpression parameter defines how long data remains in ClickHouse before automatic deletion. Use ClickHouse’s TTL syntax based on the Time column. Here are some examples:
30 Days Retention
Description: Keeps data for 30 days from the timestamp.
1 Week Retention
Description: Retains data for 7 days.
6 Months Retention
Description: Stores data for 6 months.
Custom Retention with Condition
Description: Deletes data after 90 days, except for devices named ‘Critical’.
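The four retention policies described above could be written as the expressions generated below. This is a sketch that follows the toDateTime(Time) + INTERVAL pattern used elsewhere in this guide; the helper ttl_expression is hypothetical, and whether Proxus passes a DELETE WHERE clause through unchanged is an assumption to verify on a test table.

```python
from typing import Optional

VALID_UNITS = {"DAY", "WEEK", "MONTH", "YEAR"}


def ttl_expression(amount: int, unit: str, delete_where: Optional[str] = None) -> str:
    """Build a ClickHouse TTL expression anchored on the Time column."""
    unit = unit.upper()
    if amount <= 0 or unit not in VALID_UNITS:
        raise ValueError(f"unsupported interval: {amount} {unit}")
    expression = f"toDateTime(Time) + INTERVAL {amount} {unit}"
    if delete_where:
        # ClickHouse TTL syntax allows restricting deletion to matching rows.
        expression += f" DELETE WHERE {delete_where}"
    return expression


TTL_EXAMPLES = {
    "30 Days Retention": ttl_expression(30, "DAY"),
    "1 Week Retention": ttl_expression(1, "WEEK"),
    "6 Months Retention": ttl_expression(6, "MONTH"),
    # Rows whose DeviceName is 'Critical' never match the condition, so they are kept.
    "Custom Retention with Condition": ttl_expression(90, "DAY", "DeviceName != 'Critical'"),
}
```

Each value in TTL_EXAMPLES is a plain string that can be pasted into the TTLExpression parameter.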
Sending Data to ClickHouse
Linking ClickHouse Profile to a Device
Navigate to Devices
Go to Data Management > Devices in Proxus.
Select Device
Choose the device to send data from.
Go to Target Profiles
In the device details, navigate to Target Profiles.
Add ClickHouse Profile
Add the profile:
- Profile Name:
ClickHouseAnalytics
- Target Type:
ClickHouse
- Transport Strategy:
Pass Through Strategy
Verifying Data Transmission
Activate Device
Ensure the device’s Status is “Active.”
Check Logs
Verify Proxus logs for messages like [ClickHouse] Added X records to buffer and [ClickHouse] Flushed Y records.
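If the logs are long, a small script can total the buffered and flushed record counts. The sketch below assumes only the two message shapes quoted above; any timestamp or log-level prefix in real Proxus logs is unknown, so the pattern matches anywhere in the line.

```python
import re

# Matches "[ClickHouse] Added 120 records to buffer" and "[ClickHouse] Flushed 120 records".
LOG_PATTERN = re.compile(r"\[ClickHouse\] (Added|Flushed) (\d+) records")


def summarize_clickhouse_logs(lines) -> dict:
    """Sum the record counts reported by buffer and flush messages."""
    totals = {"Added": 0, "Flushed": 0}
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            totals[match.group(1)] += int(match.group(2))
    return totals
```

A Flushed total that stays well below the Added total over several flush intervals suggests the buffer is not reaching ClickHouse.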
Querying Data in ClickHouse
Accessing ClickHouse
Connect to your ClickHouse instance using a client (e.g., clickhouse-client) or the HTTP interface (e.g., http://<clickhouse-host>:8123).
Basic Queries
Verify Data
Run this query to see recent data:
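One way to write such a query, assuming the default proxus database and DeviceRawData table from this guide, is sketched here. The helper recent_rows_query is hypothetical; it only builds the SQL text, which can then be pasted into clickhouse-client or sent over HTTP.

```python
def recent_rows_query(database: str = "proxus", table: str = "DeviceRawData", limit: int = 10) -> str:
    """Build a query returning the newest rows first."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Database and table names are inserted as-is; pass trusted values only.
    return (
        "SELECT `Time`, DeviceName, `Key`, NumericValue, StringValue "
        f"FROM {database}.{table} "
        "ORDER BY `Time` DESC "
        f"LIMIT {limit}"
    )
```

If the result is empty, give the buffer one flush interval to write its first batch before investigating further.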
Aggregate Data
Calculate average values by device:
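A possible form of this aggregation is sketched here. The one-hour window and the grouping by both device and key are assumptions chosen for illustration, and average_by_device_query is a hypothetical helper that only produces the SQL text.

```python
def average_by_device_query(database: str = "proxus", table: str = "DeviceRawData", hours: int = 1) -> str:
    """Build a query averaging NumericValue per device and key over a recent window."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (
        "SELECT DeviceName, `Key`, avg(NumericValue) AS AvgValue, count() AS Samples "
        f"FROM {database}.{table} "
        f"WHERE `Time` >= now() - INTERVAL {hours} HOUR "
        "GROUP BY DeviceName, `Key` "
        "ORDER BY DeviceName, `Key`"
    )
```

Grouping by key as well as device keeps unrelated measurements, such as temperature and pressure, from being averaged together.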
Data Structure
The DeviceRawData table in ClickHouse has the following schema:
Column | Type | Description | Example |
---|---|---|---|
Time | DateTime64(3) | Timestamp of data collection | 2025-02-25 09:36:00.000 |
DeviceId | UInt32 | Unique device identifier | 1 |
DeviceName | String | Name of the device | Sensor1 |
Key | String | Data key (e.g., temperature) | temperature |
DataType | Enum8 | Type of data (e.g., Double, String) | 8 (Double) |
NumericValue | Float64 | Numeric value of the data | 25.5 |
StringValue | String | String value (if applicable) | "" or "High" |
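To make the row shape concrete, the sketch below models one record as a Python dataclass and serializes it as a JSON object keyed by the column names. It only illustrates the schema and is not how Proxus writes data; the class name DeviceRawRecord and the use of the enum label Double as a string are assumptions.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DeviceRawRecord:
    time: datetime
    device_id: int
    device_name: str
    key: str
    data_type: str  # enum label such as "Double" or "String"
    numeric_value: float = 0.0
    string_value: str = ""

    def to_json_row(self) -> str:
        """Serialize using the column names of the DeviceRawData table."""
        millis = self.time.microsecond // 1000  # DateTime64(3) keeps millisecond precision
        return json.dumps({
            "Time": self.time.strftime("%Y-%m-%d %H:%M:%S") + f".{millis:03d}",
            "DeviceId": self.device_id,
            "DeviceName": self.device_name,
            "Key": self.key,
            "DataType": self.data_type,
            "NumericValue": self.numeric_value,
            "StringValue": self.string_value,
        })
```

The example row from the table corresponds to DeviceRawRecord(datetime(2025, 2, 25, 9, 36), 1, "Sensor1", "temperature", "Double", 25.5).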
Optimizing Performance
Asynchronous Inserts
The integration uses ClickHouse’s asynchronous insert feature to buffer data, reducing disk writes. Proxus also buffers data client-side and flushes it at the configured interval (every 5 seconds with WriteIntervalSeconds=5). Matching this interval to the server-side flush timeout, if you configure one (e.g., async_insert_busy_timeout_ms=5000), keeps both buffers on the same cadence.
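The relationship between the two settings is a unit conversion from seconds to milliseconds. The sketch below derives the server-side values from WriteIntervalSeconds and renders them as URL parameters, which is one way ClickHouse's HTTP interface accepts settings; the helper names are hypothetical, and the settings Proxus itself sends are not documented here.

```python
from urllib.parse import urlencode


def async_insert_settings(write_interval_seconds: int) -> dict:
    """Derive server-side async insert settings that match the client flush interval."""
    if write_interval_seconds <= 0:
        raise ValueError("write_interval_seconds must be positive")
    return {
        "async_insert": 1,
        "async_insert_busy_timeout_ms": write_interval_seconds * 1000,
    }


def settings_query_string(write_interval_seconds: int) -> str:
    """Render the settings as URL parameters for the HTTP interface."""
    return urlencode(async_insert_settings(write_interval_seconds))
```

The same two settings can instead be set in a ClickHouse user profile if you prefer to configure them once on the server.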
Monitoring Parts
Check active parts to ensure efficient data ingestion:
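A query along these lines against the system.parts table reports the number of active parts per partition. It is a sketch assuming the default database and table names from this guide; active_parts_query is a hypothetical helper that only builds the SQL text.

```python
def active_parts_query(database: str = "proxus", table: str = "DeviceRawData") -> str:
    """Build a query summarizing active parts, rows, and disk usage per partition."""
    # Names are embedded as string literals; pass trusted values only.
    return (
        "SELECT partition, count() AS active_parts, sum(rows) AS total_rows, "
        "formatReadableSize(sum(bytes_on_disk)) AS size_on_disk "
        "FROM system.parts "
        f"WHERE active AND database = '{database}' AND table = '{table}' "
        "GROUP BY partition "
        "ORDER BY partition"
    )
```

A part count that keeps climbing between checks usually means inserts are arriving in batches too small for background merges to keep up.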
Troubleshooting
Common Issues
Connection Failed
- Cause: Incorrect Host or Port.
- Solution: Verify Host and Port match your ClickHouse instance.
Too Many Parts
- Cause: Frequent inserts bypassing the buffer.
- Solution: Increase WriteIntervalSeconds (e.g., to 10) or ensure server-side async_insert is enabled.
Data Not Visible
- Cause: Buffer not flushed yet.
- Solution: Wait for the next flush (one WriteIntervalSeconds period, 5 seconds in this guide) or check logs for [ClickHouse] Flushed messages.
Best Practices
- Buffer Tuning: Set WriteIntervalSeconds based on data volume (e.g., 5-10 seconds for high throughput).
- Retention Management: Use TTLExpression to control data lifecycle. Examples:
  - Short-term: toDateTime(Time) + INTERVAL 1 WEEK
  - Long-term: toDateTime(Time) + INTERVAL 1 YEAR
- Monitoring: Regularly query system.parts to optimize part count and storage.
Conclusion
This guide has shown you how to integrate Proxus with an existing ClickHouse instance using the ClickHouseIntegrationActor. You can now efficiently store device data, perform real-time analytics, and scale your monitoring solution with ease.