Back
Jul 08, 2021

ClickHouse tips #9: Filling gaps in time-series on ClickHouse

This simple trick will teach you how to fill date and datetime gaps in time-series on ClickHouse
Xoel López
Founder at TheirStack

Filling gaps and resampling time series

In the previous tip, we showed you how to fill null values on ClickHouse. Many times, you will group data by a Date or a DateTime column, and there will be rows missing because for those periods there was no data.

To chart data, you’ll ideally have time series where the intervals between one data point and the next one are constant, so you’d want to fill in the empty periods with a value of 0. This is how you’d do it on ClickHouse.

If you have a Data Source like this, with a DataTime column and an Int column counting visits:

Start building with Tinybird
If you've read this far, you might want to use Tinybird as your analytics backend. Start for free with no time limit.

WITH FILL

ClickHouse introduced the WITH FILL modifier to ORDER BY clause a couple of years ago. You can read its full documentation here, but with this example you’ll understand what it does at a glance. It solves the problem we described above in one line, and in a very elegant way:

This works with Date, DateTime and any numerical column. The step is 1 by default, and on DateTime columns it’s one second and on Date columns it’s one day.

Subscribe to our newsletter
Links to our blog and other great reads sent every other Saturday.

If you can’t use WITH FILL

If you’re running an older version of ClickHouse, there’s still a way to do this. The way to fill those empty rows would be to generate a time series that contained every second in that time interval, that could be done like this:

And then joining that with the original Data Source would give us this, with all the rows present, including those where there are no values present in the original one:

Last: if you were using Tinybird, you could split the query in two nodes, and write the subquery in another node, that you could name full_dt for example (with the same content as the timeseries_without_gaps node from above). Doing that, the previous query could be rewritten like this, in a much more legible way (plus you get to see the results of intermediate subqueries in our notebook-like interface - quite handy).

If you don’t have a Tinybird account yet, sign up for one here

Do you like this post?

Related posts

ClickHouse tips #7: Forward and backfilling null values
ClickHouse tips #5: Adding and subtracting intervals
ClickHouse tips #3: the transform function
Clickhouse Tips #1: Calculating Aggregations After a Given Date
Changelog #18: High-frequency ingestion, handling NDJSON files and more product enhancements

Tinybird

Team

Feb 01, 2022
Tinybird at South Summit Madrid 2019
A big performance boost, adding columns and more

Tinybird

Team

May 31, 2021
Tinybird is out of beta and open to everyone
ClickHouse tips #6: Filtering data in subqueries to avoid joins
You can now explore and analyze time series data in Tinybird

Build fast data products, faster.

Try Tinybird and bring your data sources together and enable engineers to build with data in minutes. No credit card required, free to get started.
Tinybird is not affiliated with, associated with, or sponsored by ClickHouse, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.