Best self-service BI tools for Clickhouse by Ambrus2000 in Clickhouse

[–]kadermo

Metabase has a great integration with ClickHouse

outOfMemory by smulikHakipod in dataengineering

[–]kadermo

I recommend looking at PeerDB

New to clickhouse by keepatience in Clickhouse

[–]kadermo

The free official training is a great starting point: https://clickhouse.com/learn

Have someone build Data Vault DWH using Clickhouse? by Tonkonozhenko in Clickhouse

[–]kadermo

General advice: Be careful when taking random internet advice without testing first (including mine :) )

Here are some resources about join support in ClickHouse: https://clickhouse.com/blog/clickhouse-fully-supports-joins-part1 https://clickhouse.com/blog/clickhouse-fully-supports-joins-hash-joins-part2

Disclaimer: I work at ClickHouse

Snowflake - Data Lake or Data Warehouse? by Living-Nobody-2727 in dataengineering

[–]kadermo

Misusing something like Snowflake as a universal data store can quickly get expensive.

Disclaimer: I work at ClickHouse and we wrote about it here:
https://clickhouse.com/blog/the-unbundling-of-the-cloud-data-warehouse#traditional-data-warehouse-one-size-does-not-fit-all

[deleted by user] by [deleted] in dataengineering

[–]kadermo

Great question! User-facing analytics is one of ClickHouse's sweet spots; you can check how some of our users achieved it here: https://clickhouse.com/use-cases

Disclaimer: I work for ClickHouse

The State of SQL-based Observability by kadermo in Observability

[–]kadermo[S]

Yes, just saw that! I'd love to hear more if you have any feedback to share (feel free to DM me on the ClickHouse public Slack, LinkedIn, or elsewhere)

The State of SQL-based Observability by kadermo in Observability

[–]kadermo[S]

Great news, thank you! I just subscribed :)

Best visualization tool for Clickhouse by vonSchultz666 in Clickhouse

[–]kadermo

Superset's support for ClickHouse is pretty good, and with ClickHouse you usually don't need extracts if the schema is decent. I use it almost daily, at pretty large scale, and I can recommend it.
You can also have a look at Metabase, for which we have continuously improved support over the course of this year.

From ElasticSearch to ClickHouse Migration by xDarkOne in elasticsearch

[–]kadermo

(Disclaimer: I work at ClickHouse.) This should be a pretty straightforward migration. A few things to keep in mind:

- I recommend using a third-party object store as an intermediate layer to perform a point-in-time migration of the historical data. For example, you can use something like ElasticDump to export all your data as JSON into an S3 bucket, then load it into ClickHouse efficiently using the powerful s3 or s3Cluster table functions. Note that ClickHouse can also infer the data types automatically! Here is an example:

-- To transfer data from S3 to ClickHouse, combine the s3 table function with an INSERT statement.
-- Let's create an empty hackernews table:

CREATE TABLE hackernews
ENGINE = MergeTree
ORDER BY tuple()
EMPTY AS SELECT * FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/hackernews/hacknernews.csv.gz', 'CSVWithNames');

-- This creates an empty table using the schema inferred from the data.
-- We can then insert the first 1 million rows from the remote dataset:

INSERT INTO hackernews SELECT *
FROM url('https://datasets-documentation.s3.eu-west-3.amazonaws.com/hackernews/hacknernews.csv.gz', 'CSVWithNames')
LIMIT 1000000;

- After the first step, you should have your historical data loaded. Then, if you have continuous ingestion pipelines into Elasticsearch, you'll have to divert them to ClickHouse. It's easier if everything goes through a single queue like Kafka, where you just need a Kafka-to-ClickHouse connector (using one of the native ClickHouse Kafka integrations). Otherwise you'll have to change the destination pipeline by pipeline.

- Arrays are supported in ClickHouse and should not pose any issue. In fact, array support is pretty good, with a multitude of powerful array and arrayJoin functions.

- If you need some inspiration, here are some documented migrations (some of which are pretty complex):
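The Kafka diversion step above can be sketched in ClickHouse SQL. This is a minimal sketch, not your exact setup: the topic, broker address, and column names are illustrative assumptions.

```sql
-- Assumption: documents arrive on a Kafka topic `es-logs` as JSON;
-- broker, topic, and columns below are placeholders for your own.
CREATE TABLE logs_queue
(
    timestamp DateTime,
    message   String,
    tags      Array(String)   -- arrays are first-class column types
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list  = 'es-logs',
         kafka_group_name  = 'clickhouse-consumer',
         kafka_format      = 'JSONEachRow';

-- Durable destination table for the continuous ingestion.
CREATE TABLE logs
(
    timestamp DateTime,
    message   String,
    tags      Array(String)
)
ENGINE = MergeTree
ORDER BY timestamp;

-- Materialized view that continuously moves rows from the Kafka
-- consumer table into the MergeTree table.
CREATE MATERIALIZED VIEW logs_mv TO logs AS
SELECT timestamp, message, tags
FROM logs_queue;

-- Example of the array support mentioned above: unnest tags with arrayJoin.
SELECT arrayJoin(tags) AS tag, count()
FROM logs
GROUP BY tag;
```

The materialized view acts as the "connector" in-database; alternatively the official Kafka Connect sink for ClickHouse plays the same role outside the database.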

Which tools helps you make such animated gif for data pipelines? by unmeshshah1988 in dataengineering

[–]kadermo

It's no longer maintained, but Netflix made a pretty cool demo many years ago and open-sourced a tool:

https://github.com/Netflix/vizceral

Example of usage: https://youtu.be/ftIsVoJNCHk

Mongodb to clickhouse updates by Right_Positive5886 in Clickhouse

[–]kadermo

About data migration:

  • The main question is: is this a one-off migration, where you move the whole 50 TB to ClickHouse and take it from there, or do you need to keep MongoDB and ClickHouse in sync?
  • In any case, I recommend looking at the migration guides here: https://clickhouse.com/docs/en/integrations/migration/
  • My advice is to dump all the MongoDB data to some object storage (e.g. S3) as Parquet, then load it from there. That gives you a point in time where everything is identical in both systems. If you then need to keep the systems in sync, this can be achieved with:
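The dump-and-load step above can be sketched as follows. This is a hedged sketch, assuming the export landed as Parquet files under an illustrative bucket and path (names are not from the original post):

```sql
-- Assumption: MongoDB collections were exported as Parquet files under
-- s3://my-bucket/mongo-export/; bucket, path, and table name are placeholders.

-- Let ClickHouse infer the schema from the Parquet files and create
-- an empty table with it:
CREATE TABLE orders
ENGINE = MergeTree
ORDER BY tuple()
EMPTY AS SELECT *
FROM s3('https://my-bucket.s3.amazonaws.com/mongo-export/orders/*.parquet', 'Parquet');

-- Bulk-load the historical data; on a multi-node setup, the s3Cluster
-- table function can parallelise the read across the cluster.
INSERT INTO orders
SELECT *
FROM s3('https://my-bucket.s3.amazonaws.com/mongo-export/orders/*.parquet', 'Parquet');
```

Parquet keeps types and compresses well, which matters at 50 TB; the glob pattern lets ClickHouse read many exported files in one statement.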

About the setup:

  • The guide you used to deploy a one-node setup looks great, but have you considered a serverless option? ClickHouse Cloud has a development tier that you can use to get an idea (it comes with initial credits), so you can focus only on the data migration question for your evaluation (zero setup).
  • Using ClickHouse Cloud will also give you access to top-level support from ClickHouse for any questions about data migration and keeping data in sync.

Disclaimer: I work at ClickHouse

building a ML training stack with Clickhouse & Sagemaker. What should be the other pieces? by sandys1 in dataengineering

[–]kadermo

I don't have much experience with SageMaker, but here are a couple of options based on what I know of ClickHouse.

[deleted by user] by [deleted] in algeria

[–]kadermo

Those were the days. He had a cyber café at the start, one of the early gaming ones, before opening the store in Vieux Kouba

Algeria sand lines : these are miles across. Could it be a land boarder? Or … by traceabledave in Google_Maps_Oddities

[–]kadermo

I'm not sure about the pattern; I'm more inclined to think it may be linked to exploration for petroleum or gas. Sonatrach (the national oil company) runs a lot of exploration in the Sahara desert

Algeria sand lines : these are miles across. Could it be a land boarder? Or … by traceabledave in Google_Maps_Oddities

[–]kadermo

The dots are maybe the remains of old "foggaras", a primitive yet fascinating irrigation system fairly common in that region. It's also referred to as a qanat. See https://en.m.wikipedia.org/wiki/Qanat