Refine tiflash FAQ and configuration docs #20252
Signed-off-by: JaySon-Huang <[email protected]>
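## PD scheduling parameters

You can adjust these parameters with [pd-ctl](/pd-control.md). If you deployed the cluster using TiUP, you can use `tiup ctl:v<CLUSTER_VERSION> pd` instead of the `pd-ctl -u <pd_ip:pd_port>` command.

- [`replica-schedule-limit`](/pd-configuration-file.md#replica-schedule-limit): controls how fast replica-related operators are generated. Operations that take nodes offline or add replicas are governed by this parameter.

> **Note:**
>
> Do not set this value higher than `region-schedule-limit`. Otherwise, normal Region scheduling among TiKV nodes is affected.

- `store-balance-rate`: limits the Region scheduling rate of each TiKV store or TiFlash store. Note that this parameter takes effect only for stores newly added to the cluster. To make a change take effect immediately, use the method below.

> **Note:**
>
> Starting from v4.0.2 (inclusive), the `store-balance-rate` parameter is deprecated and the `store limit` command has partially changed. For details about the command changes, see the [store-limit documentation](/configure-store-limit.md).

- Use the `pd-ctl -u <pd_ip:pd_port> store limit <store_id> <value>` command to set the Region scheduling rate of a specific store. (You can get `store_id` by running `pd-ctl -u <pd_ip:pd_port> store`.) If a store has no individual setting, it inherits the `store-balance-rate` setting. You can also run `pd-ctl -u <pd_ip:pd_port> store limit` to view the current values.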
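
The two `store limit` invocations quoted above can be assembled programmatically. The following is a minimal Python sketch, not part of the PR: the helper name `store_limit_cmd`, the PD address `127.0.0.1:2379`, the store ID `42`, and the value `10` are all hypothetical, and the command forms are only those quoted in the hunk (they may differ in v4.0.2 and later, as the note says).

```python
import shlex


def store_limit_cmd(pd_addr, store_id=None, value=None):
    """Build the pd-ctl argv for viewing or setting a store limit.

    Hypothetical helper: it only assembles the command forms quoted in the
    text and does not run pd-ctl.
    """
    argv = ["pd-ctl", "-u", pd_addr, "store", "limit"]
    if store_id is None:
        # No store given: the "view current values" form.
        return argv
    if value is None:
        raise ValueError("a value is required when store_id is given")
    # Per-store form: store limit <store_id> <value>
    return argv + [str(store_id), str(value)]


# Assumed example values; replace with your own PD address and store ID.
print(shlex.join(store_limit_cmd("127.0.0.1:2379")))
print(shlex.join(store_limit_cmd("127.0.0.1:2379", store_id=42, value=10)))
```

Returning an argv list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting.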
This content is somewhat outdated, and the instructions on how to speed up TiFlash replication duplicate those in create-tiflash-replicas.md#speed-up-tiflash-replication.

- Use the `pd-ctl -u <pd_ip:pd_port> store limit <store_id> <value>` command to set the Region scheduling rate of a specific store. (You can get `store_id` by running `pd-ctl -u <pd_ip:pd_port> store`.) If a store has no individual setting, it inherits the `store-balance-rate` setting. You can also run `pd-ctl -u <pd_ip:pd_port> store limit` to view the current values.

- [`replication.location-labels`](/pd-configuration-file.md#location-labels): describes the topology of TiKV instances. The order of the keys represents the hierarchy of the labels. When TiFlash is enabled, you need to use [`pd-ctl config placement-rules`](/pd-control.md#config-show--set-option-value--placement-rules) to set the default value. For details, see [geo-distributed-deployment-topology](/geo-distributed-deployment-topology.md).
This duplicates the later section "通过拓扑 label 进行副本调度" (Schedule replicas by topology labels).
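#### TiDB cluster versions earlier than v4.0.9

In versions earlier than TiDB v4.0.9, TiFlash only supports distributing the main data of the storage engine across multiple disks. Multi-disk deployment is configured with the `path` (`data_dir` in TiUP) and `path_realtime_mode` parameters.

Multiple data directories in `path` are separated by commas, for example `/nvme_ssd_a/data/tiflash,/sata_ssd_b/data/tiflash,/sata_ssd_c/data/tiflash`. If your node has multiple disks, it is recommended to put the directory on the best-performing disk first to make better use of the node's performance.

If the node has multiple disks of the same specification, you can leave `path_realtime_mode` empty (or explicitly set it to `false`). This means that data is balanced across all storage directories. However, because the latest data is still written only to the first directory, the disk hosting that directory is busier than the others.

If the node has multiple disks of different specifications, it is recommended to set `path_realtime_mode` to `true` and to put the directory on the best-performing disk first in `path`. This means that the first directory stores only the latest data, and older data is balanced across the other directories. Note that in this case, the planned capacity of the first directory should be about 10% of the total capacity.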
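
The `path` ordering and the roughly-10% capacity guideline described above can be checked with a small Python sketch. This is an illustration only: the helper names `build_path` and `first_dir_share` and the capacities (200, 900, and 900 GiB) are hypothetical, while the directory names come from the example in the text.

```python
def build_path(dirs_fastest_first):
    """Join data directories into the comma-separated `path` value.

    The caller is expected to list the best-performing disk first.
    """
    dirs = [d.strip() for d in dirs_fastest_first if d.strip()]
    if not dirs:
        raise ValueError("at least one data directory is required")
    return ",".join(dirs)


def first_dir_share(capacities):
    """Return the first directory's share of the total planned capacity."""
    return capacities[0] / sum(capacities)


dirs = [
    "/nvme_ssd_a/data/tiflash",  # fastest disk goes first
    "/sata_ssd_b/data/tiflash",
    "/sata_ssd_c/data/tiflash",
]
capacities_gib = [200, 900, 900]  # assumed capacities, same order as dirs

print(build_path(dirs))
# With path_realtime_mode = true, the text suggests this be about 10%.
print(f"{first_dir_share(capacities_gib):.0%}")
```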

#### TiDB cluster versions v4.0.9 and later
v4.0.x reached EOL on 2024-04-02, and users reading the latest version of the docs are unlikely to deploy a TiDB cluster earlier than v4.0.9.
fec10d7 to 7f6d922
lgtm
tiflash/troubleshoot-tiflash.md
Outdated
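## TiFlash data replication gets stuck

If TiFlash data replication starts normally but, after a period of time, all or some of the data can no longer be replicated, take the following steps to confirm or resolve the issue:

1. Check the disk space.

    Check whether the disk usage ratio is higher than the value of `low-space-ratio` (0.8 by default: when a node's space usage exceeds 80%, PD avoids migrating data to that node as much as possible to prevent the disk from being exhausted).

    - If the disk usage ratio is greater than or equal to `low-space-ratio`, the disk space is insufficient. In this case, delete unnecessary files, such as the `space_placeholder_file` file under the `${data}/flash/` directory (if necessary, set `reserve-space` to 0MB after deleting the file).
    - If the disk usage ratio is less than `low-space-ratio`, the disk space is sufficient. Go to the next step.

2. Check whether there is any `down peer` (a `down peer` that is not cleaned up might cause replication to get stuck).

    - Run the `pd-ctl region check-down-peer` command to check whether there is any `down peer`.
    - If a `down peer` exists, run the `pd-ctl operator add remove-peer <region-id> <tiflash-store-id>` command to remove it.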
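
The disk-space check in step 1 can be approximated locally with a short Python sketch. It assumes that OS-level usage of the mount point is a reasonable proxy for what PD sees, which may not hold exactly because PD relies on the capacity reported by the store. The function names and the `/` mount point are hypothetical; `0.8` is the default `low-space-ratio` quoted above.

```python
import shutil

LOW_SPACE_RATIO = 0.8  # default value quoted in the text


def exceeds_low_space_ratio(used, total, ratio=LOW_SPACE_RATIO):
    """Return True when used/total is greater than or equal to the ratio."""
    if total <= 0:
        raise ValueError("total must be positive")
    return used / total >= ratio


def disk_is_low_space(path, ratio=LOW_SPACE_RATIO):
    """Check the filesystem that holds `path` against the ratio."""
    usage = shutil.disk_usage(path)
    return exceeds_low_space_ratio(usage.used, usage.total, ratio)


# Assumed mount point; use the TiFlash data directory on a real node.
print(disk_is_low_space("/"))
```

If this returns `True`, the first branch of step 1 applies; otherwise continue with the `down peer` check in step 2.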
This part has been merged into the "TiFlash 副本始终处于不可用状态" (TiFlash replicas are always unavailable) section.
@@ -34,46 +34,6 @@ aliases: ['/docs-cn/dev/tiflash/troubleshoot-tiflash/','/docs-cn/dev/tiflash/tif

If the problem cannot be resolved by the methods above, package the TiFlash log folder and ask for help in the [AskTUG](http://asktug.com) community.

## TiFlash replicas are always unavailable
This part has been moved to before the "TiFlash 数据不同步" (TiFlash data is not replicated) section.
lgtm
First-time contributors' checklist
What is changed, added or deleted? (Required)
- `remove-peer` restriction when rebalancing Regions from old to new TiFlash nodes.
- `profiles.default.max_memory_usage_for_all_queries` since v6.6.0.

Which TiDB version(s) do your changes apply to? (Required)
Tips for choosing the affected version(s):
By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.
For details, see tips for choosing the affected versions (in Chinese).
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?