docs/integrations/data-ingestion/data-formats/arrow-avro-orc.md (3 additions, 3 deletions)
@@ -48,7 +48,7 @@ FORMAT Avro;
### Avro and ClickHouse data types {#avro-and-clickhouse-data-types}
-Consider [data types matching](/interfaces/formats.md/#data_types-matching) when importing or exporting Avro files. Use explicit type casting to convert when loading data from Avro files:
+Consider [data types matching](/interfaces/formats/Avro#data-types-matching) when importing or exporting Avro files. Use explicit type casting to convert when loading data from Avro files:
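As a quick illustration of the explicit casting that paragraph recommends, a minimal sketch (the `data.avro` file and the `id`/`created_at` columns are hypothetical):

```sql
SELECT
    id,
    toDate(created_at) AS created_at -- cast the value read from Avro to a ClickHouse Date explicitly
FROM file('data.avro', Avro);
```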
```sql
SELECT
@@ -100,7 +100,7 @@ INTO OUTFILE 'export.arrow'
FORMAT Arrow
```
-Also, check [data types matching](/interfaces/formats.md/#data-types-matching-arrow) to know if any should be converted manually.
+Also, check [data types matching](/interfaces/formats/Arrow#data-types-matching) to know if any should be converted manually.
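A manual conversion on re-import might look like the following sketch; `export.arrow` comes from the snippet above, while the `path` and `time` columns are assumed:

```sql
SELECT
    path,
    toDateTime(time) AS time -- the Arrow file stores this as an integer; convert it back to DateTime
FROM file('export.arrow', Arrow);
```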
### Arrow data streaming {#arrow-data-streaming}
@@ -150,7 +150,7 @@ FROM INFILE 'data.orc'
FORMAT ORC;
```
-Also, check [data types matching](/interfaces/formats.md/#data-types-matching-orc) as well as [additional settings](/interfaces/formats.md/#parquet-format-settings) to tune export and import.
+Also, check [data types matching](/interfaces/formats/ORC) as well as [additional settings](/interfaces/formats/Parquet#format-settings) to tune export and import.
-You can see a lot of the columns are detected as Nullable. We [do not recommend using the Nullable](/sql-reference/data-types/nullable#storage-features) type when not absolutely needed. You can use [schema_inference_make_columns_nullable](/interfaces/schema-inference#schema_inference_make_columns_nullable) to control the behavior of when Nullable is applied.
+You can see a lot of the columns are detected as Nullable. We [do not recommend using the Nullable](/sql-reference/data-types/nullable#storage-features) type when not absolutely needed. You can use [schema_inference_make_columns_nullable](/operations/settings/formats#schema_inference_make_columns_nullable) to control the behavior of when Nullable is applied.
:::
We can see that most columns have automatically been detected as `String`, with `update_date` column correctly detected as a `Date`. The `versions` column has been created as an `Array(Tuple(created String, version String))` to store a list of objects, with `authors_parsed` being defined as `Array(Array(String))` for nested arrays.
:::note Controlling type detection
-The auto-detection of dates and datetimes can be controlled through the settings [`input_format_try_infer_dates`](/interfaces/schema-inference#input_format_try_infer_dates) and [`input_format_try_infer_datetimes`](/interfaces/schema-inference#input_format_try_infer_datetimes) respectively (both enabled by default). The inference of objects as tuples is controlled by the setting [`input_format_json_try_infer_named_tuples_from_objects`](/operations/settings/formats#input_format_json_try_infer_named_tuples_from_objects). Other settings which control schema inference for JSON, such as the auto-detection of numbers, can be found [here](/interfaces/schema-inference#text-formats).
+The auto-detection of dates and datetimes can be controlled through the settings [`input_format_try_infer_dates`](/operations/settings/formats#input_format_try_infer_dates) and [`input_format_try_infer_datetimes`](/operations/settings/formats#input_format_try_infer_datetimes) respectively (both enabled by default). The inference of objects as tuples is controlled by the setting [`input_format_json_try_infer_named_tuples_from_objects`](/operations/settings/formats#input_format_json_try_infer_named_tuples_from_objects). Other settings which control schema inference for JSON, such as the auto-detection of numbers, can be found [here](/interfaces/schema-inference#text-formats).
:::
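The settings discussed above can be applied while inspecting the inferred schema; a rough sketch, with a hypothetical file name:

```sql
DESCRIBE file('arxiv.json.gz', JSONEachRow)
SETTINGS schema_inference_make_columns_nullable = 0, -- do not wrap inferred columns in Nullable
         input_format_try_infer_dates = 1            -- keep date auto-detection on (the default)
```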
## Querying JSON {#querying-json}
@@ -183,7 +183,7 @@ ORDER BY update_date
SETTINGS index_granularity = 8192
```
-The above is the correct schema for this data. Schema inference is based on sampling the data and reading the data row by row. Column values are extracted according to the format, with recursive parsers and heuristics used to determine the type for each value. The maximum number of rows and bytes read from the data in schema inference is controlled by the settings [`input_format_max_rows_to_read_for_schema_inference`](/interfaces/schema-inference#input_format_max_rows_to_read_for_schema_inferenceinput_format_max_bytes_to_read_for_schema_inference) (25000 by default) and [`input_format_max_bytes_to_read_for_schema_inference`](/interfaces/schema-inference#input_format_max_rows_to_read_for_schema_inferenceinput_format_max_bytes_to_read_for_schema_inference) (32MB by default). In the event detection is not correct, users can provide hints as described [here](/interfaces/schema-inference#schema_inference_hints).
+The above is the correct schema for this data. Schema inference is based on sampling the data and reading the data row by row. Column values are extracted according to the format, with recursive parsers and heuristics used to determine the type for each value. The maximum number of rows and bytes read from the data in schema inference is controlled by the settings [`input_format_max_rows_to_read_for_schema_inference`](/operations/settings/formats#input_format_max_rows_to_read_for_schema_inference) (25000 by default) and [`input_format_max_bytes_to_read_for_schema_inference`](/interfaces/schema-inference#input_format_max_rows_to_read_for_schema_inferenceinput_format_max_bytes_to_read_for_schema_inference) (32MB by default). In the event detection is not correct, users can provide hints as described [here](/interfaces/schema-inference#schema_inference_hints).
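A sketch of raising those sampling limits and supplying a hint; the file name is again hypothetical:

```sql
DESCRIBE file('arxiv.json.gz', JSONEachRow)
SETTINGS input_format_max_rows_to_read_for_schema_inference = 100000,   -- sample more rows
         input_format_max_bytes_to_read_for_schema_inference = 134217728,
         schema_inference_hints = 'update_date Date'                    -- pin a type inference might miss
```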
### Creating tables from snippets {#creating-tables-from-snippets}
@@ -272,7 +272,7 @@ FORMAT PrettyJSONEachRow
## Handling errors {#handling-errors}
-Sometimes, you might have bad data. For example, specific columns that do not have the right type or an improperly formatted JSON. For this, you can use the setting [`input_format_allow_errors_ratio`](/operations/settings/formats#input_format_allow_errors_ratio) to allow a certain number of rows to be ignored if the data is triggering insert errors. Additionally, [hints](/interfaces/schema-inference#schema_inference_hints) can be provided to assist inference.
+Sometimes, you might have bad data. For example, specific columns that do not have the right type or an improperly formatted JSON. For this, you can use the setting [`input_format_allow_errors_ratio`](/operations/settings/formats#input_format_allow_errors_ratio) to allow a certain number of rows to be ignored if the data is triggering insert errors. Additionally, [hints](/operations/settings/formats#schema_inference_hints) can be provided to assist inference.
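As a concrete sketch of that error tolerance (the `events` table and `dirty.json.gz` file are hypothetical):

```sql
INSERT INTO events
SELECT *
FROM file('dirty.json.gz', JSONEachRow)
SETTINGS input_format_allow_errors_ratio = 0.1 -- skip up to 10% of unparsable rows instead of failing
```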
docs/integrations/data-ingestion/data-formats/json/schema.md (1 addition, 1 deletion)
@@ -508,7 +508,7 @@ SELECT JSONExtractString(tags, 'holidays') as holidays FROM people
1 row in set. Elapsed: 0.002 sec.
```
-Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [JSON_QUERY](/sql-reference/functions/json-functions.md/#json_queryjson-path) AND [JSON_VALUE](/sql-reference/functions/json-functions.md/#json_valuejson-path).
+Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [JSON_QUERY](/sql-reference/functions/json-functions#json_query) AND [JSON_VALUE](/sql-reference/functions/json-functions#json_value).
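Restated as a runnable query against the `people` table and `tags` column from that example, with `JSON_VALUE` shown as the shorter path-based alternative:

```sql
SELECT
    JSONExtractUInt(JSONExtractString(tags, 'car'), 'year') AS car_year,
    JSON_VALUE(tags, '$.car.year') AS car_year_via_path -- same value, addressed with a JSON path
FROM people;
```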
Consider the extreme case with the `arxiv` dataset where we consider the entire body to be a `String`.
-By default, ClickHouse is strict with column names, types, and values. But sometimes, we can skip nonexistent columns or unsupported values during import. This can be managed with [Parquet settings](/interfaces/formats.md/#parquet-format-settings).
+By default, ClickHouse is strict with column names, types, and values. But sometimes, we can skip nonexistent columns or unsupported values during import. This can be managed with [Parquet settings](/interfaces/formats/Parquet#format-settings).
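One such setting, sketched with a hypothetical table, file, and structure (the explicit structure includes an `extra` column the Parquet file may not contain):

```sql
INSERT INTO imported
SELECT *
FROM file('data.parquet', Parquet, 'id UInt32, name String, extra String')
SETTINGS input_format_parquet_allow_missing_columns = 1 -- columns absent from the file are filled with defaults
```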
## Exporting to Parquet format {#exporting-to-parquet-format}
@@ -146,7 +146,7 @@ FORMAT Parquet
This will create the `export.parquet` file in a working directory.
## ClickHouse and Parquet data types {#clickhouse-and-parquet-data-types}
-ClickHouse and Parquet data types are mostly identical but still [differ a bit](/interfaces/formats.md/#data-types-matching-parquet). For example, ClickHouse will export `DateTime` type as a Parquets' `int64`. If we then import that back to ClickHouse, we're going to see numbers ([time.parquet file](assets/time.parquet)):
+ClickHouse and Parquet data types are mostly identical but still [differ a bit](/interfaces/formats/Parquet#data-types-matching-parquet). For example, ClickHouse will export `DateTime` type as a Parquets' `int64`. If we then import that back to ClickHouse, we're going to see numbers ([time.parquet file](assets/time.parquet)):
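Converting those numbers back on import might look like the sketch below; the `time` column name is an assumption about `time.parquet`:

```sql
SELECT toDateTime(time) AS time -- re-interpret the exported int64 as a DateTime
FROM file('time.parquet', Parquet);
```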
docs/integrations/data-ingestion/dbms/jdbc-with-clickhouse.md (1 addition, 1 deletion)
@@ -17,7 +17,7 @@ Using JDBC requires the ClickHouse JDBC bridge, so you will need to use `clickho
**Overview:** The <a href="https://github.com/ClickHouse/clickhouse-jdbc-bridge" target="_blank">ClickHouse JDBC Bridge</a> in combination with the [jdbc table function](/sql-reference/table-functions/jdbc.md) or the [JDBC table engine](/engines/table-engines/integrations/jdbc.md) allows ClickHouse to access data from any external data source for which a <a href="https://en.wikipedia.org/wiki/JDBC_driver" target="_blank">JDBC driver</a> is available:
-This is handy when there is no native built-in [integration engine](/engines/table-engines/index.md#integration-engines-integration-engines), table function, or external dictionary for the external data source available, but a JDBC driver for the data source exists.
+This is handy when there is no native built-in [integration engine](/engines/table-engines/integrations), table function, or external dictionary for the external data source available, but a JDBC driver for the data source exists.
You can use the ClickHouse JDBC Bridge for both reads and writes. And in parallel for multiple external data sources, e.g. you can run distributed queries on ClickHouse across multiple external and internal data sources in real time.
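For orientation, a read through the bridge via the `jdbc` table function might look like this sketch; the JDBC URL, schema, and table are hypothetical, and the bridge must already be running:

```sql
SELECT *
FROM jdbc('jdbc:postgresql://postgres-host:5432/mydb?user=app&password=secret', 'public', 'users')
LIMIT 10;
```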
docs/integrations/data-ingestion/insert-local-files.md (1 addition, 1 deletion)
@@ -37,7 +37,7 @@ ENGINE = MergeTree
ORDER BY toYYYYMMDD(timestamp)
```
-3. We want to lowercase the `author` column, which is easily done with the [`lower` function](/sql-reference/functions/string-functions/#lower-lcase). We also want to split the `comment` string into tokens and store the result in the `tokens` column, which can be done using the [`extractAll` function](/sql-reference/functions/string-search-functions/#extractallhaystack-pattern). You do all of this in one `clickhouse-client` command - notice how the `comments.tsv` file is piped into the `clickhouse-client` using the `<` operator:
+3. We want to lowercase the `author` column, which is easily done with the [`lower` function](/sql-reference/functions/string-functions#lower). We also want to split the `comment` string into tokens and store the result in the `tokens` column, which can be done using the [`extractAll` function](/sql-reference/functions/string-search-functions#extractall). You do all of this in one `clickhouse-client` command - notice how the `comments.tsv` file is piped into the `clickhouse-client` using the `<` operator:
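The transformation described there, sketched with the `input` table function; the column list is illustrative rather than the guide's exact schema:

```sql
INSERT INTO comments
SELECT
    id,
    timestamp,
    lower(author) AS author,
    comment,
    extractAll(comment, '\\w+') AS tokens -- split the comment into word tokens
FROM input('id UInt32, timestamp DateTime, author String, comment String')
FORMAT TSV
```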
docs/integrations/data-ingestion/kafka/confluent/kafka-connect-http.md (1 addition, 1 deletion)
@@ -137,7 +137,7 @@ The following additional parameters are relevant to using the HTTP Sink with Cli
* `ssl.enabled` - set to true if using SSL.
* `connection.user` - username for ClickHouse.
* `connection.password` - password for ClickHouse.
-* `batch.max.size` - The number of rows to send in a single batch. Ensure this set is to an appropriately large number. Per ClickHouse [recommendations](../../../../concepts/why-clickhouse-is-so-fast.md#performance-when-inserting-data) a value of 1000 is should be considered a minimum.
+* `batch.max.size` - The number of rows to send in a single batch. Ensure this set is to an appropriately large number. Per ClickHouse [recommendations](/sql-reference/statements/insert-into#performance-considerations) a value of 1000 should be considered a minimum.
* `tasks.max` - The HTTP Sink connector supports running one or more tasks. This can be used to increase performance. Along with batch size this represents your primary means of improving performance.
* `key.converter` - set according to the types of your keys.
* `value.converter` - set based on the type of data on your topic. This data does not need a schema. The format here must be consistent with the FORMAT specified in the parameter `http.api.url`. The simplest here is to use JSON and the org.apache.kafka.connect.json.JsonConverter converter. Treating the value as a string, via the converter org.apache.kafka.connect.storage.StringConverter, is also possible - although this will require the user to extract a value in the insert statement using functions. [Avro format](../../../../interfaces/formats.md#data-format-avro) is also supported in ClickHouse if using the io.confluent.connect.avro.AvroConverter converter.
docs/integrations/data-ingestion/kafka/kafka-connect-jdbc.md (1 addition, 1 deletion)
@@ -55,7 +55,7 @@ The following parameters are relevant to using the JDBC connector with ClickHous
* `_connection.url_` - this should take the form of `jdbc:clickhouse://<clickhouse host>:<clickhouse http port>/<target database>`
* `connection.user` - a user with write access to the target database
* `table.name.format`- ClickHouse table to insert data. This must exist.
-* `batch.size` - The number of rows to send in a single batch. Ensure this set is to an appropriately large number. Per ClickHouse [recommendations](../../../concepts/why-clickhouse-is-so-fast.md#performance-when-inserting-data) a value of 1000 should be considered a minimum.
+* `batch.size` - The number of rows to send in a single batch. Ensure this set is to an appropriately large number. Per ClickHouse [recommendations](/sql-reference/statements/insert-into#performance-considerations) a value of 1000 should be considered a minimum.
* `tasks.max` - The JDBC Sink connector supports running one or more tasks. This can be used to increase performance. Along with batch size this represents your primary means of improving performance.
* `value.converter.schemas.enable` - Set to false if using a schema registry, true if you embed your schemas in the messages.
* `value.converter` - Set according to your datatype e.g. for JSON, `io.confluent.connect.json.JsonSchemaConverter`.