Reads seem very slow and resources do not appear to be used effectively [Bug]: #148

Open
1 of 2 tasks
limingoo opened this issue Jul 26, 2024 · 2 comments

@limingoo

What happened?

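The catalog I am using can only read the ClickHouse table with a single thread. With parallelism set to 10, only one task is RUNNING while the other 9 show FINISHED. The read throughput of that single task does not seem very high (roughly 30,000 rows/s), and the remaining physical resources are not fully utilized. Is my configuration wrong?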

Affects Versions

1.16

What are you seeing the problem on?

Flink-Table-Api (SQL)

How to reproduce

No response

Relevant log output

No response

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct
@itinycheng
Owner

Hi @limingoo,

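Parallel reads are supported.

For a distributed table, you can set scan.partition.num; this option splits the read parallelism by shard.
If the table has a numeric column, you can set scan.partition.num, scan.partition.lower-bound, and scan.partition.upper-bound; this is similar to the parallelism splitting strategy of the JDBC connector.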
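
To make the numeric-range strategy concrete, here is a minimal Python sketch of how a lower bound, an upper bound, and a partition count could be turned into one predicate per parallel reader. It is only an illustration of the general idea, not the connector's actual code; the function names and the column name `id` are hypothetical.

```python
from typing import List, Tuple


def split_range(lower: int, upper: int, num: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [lower, upper] into at most `num` contiguous sub-ranges."""
    if num < 1:
        raise ValueError("num must be >= 1")
    if lower > upper:
        raise ValueError("lower must be <= upper")

    total = upper - lower + 1
    num = min(num, total)  # never create empty sub-ranges
    base, extra = divmod(total, num)

    ranges = []
    start = lower
    for i in range(num):
        # The first `extra` sub-ranges take one additional value.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def partition_predicates(column: str, lower: int, upper: int, num: int) -> List[str]:
    """Build one WHERE predicate per sub-range (hypothetical helper)."""
    return [
        f"{column} BETWEEN {lo} AND {hi}"
        for lo, hi in split_range(lower, upper, num)
    ]


if __name__ == "__main__":
    # `id` is an assumed numeric column; 1..100 with 10 partitions.
    for predicate in partition_predicates("id", 1, 100, 10):
        print(predicate)
```

Note that each sub-range covers the same span of values, not the same number of rows, so a column with skewed values can still produce unbalanced tasks.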

@limingoo
Author

Hi @limingoo,

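Parallel reads are supported.

For a distributed table, you can set scan.partition.num; this option splits the read parallelism by shard. If the table has a numeric column, you can set scan.partition.num, scan.partition.lower-bound, and scan.partition.upper-bound; this is similar to the parallelism splitting strategy of the JDBC connector.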

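Are there other options for this parallelism splitting strategy, for example one implemented with LIMIT, so that the data is distributed more evenly?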
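
For reference, the LIMIT-based idea could look roughly like the Python sketch below. This only illustrates the suggestion and is not something the thread confirms the connector supports; the function names, the table name `my_table`, and the ordering column `id` are hypothetical, and the total row count is assumed to be known in advance (for example from a prior `SELECT count()`).

```python
from typing import List, Tuple


def limit_offset_splits(total_rows: int, num: int) -> List[Tuple[int, int]]:
    """Return (limit, offset) pairs that cover `total_rows` rows in `num` near-equal chunks."""
    if num < 1:
        raise ValueError("num must be >= 1")
    if total_rows < 0:
        raise ValueError("total_rows must be >= 0")

    base, extra = divmod(total_rows, num)
    splits = []
    offset = 0
    for i in range(num):
        # The first `extra` chunks take one additional row.
        limit = base + (1 if i < extra else 0)
        if limit == 0:
            break  # fewer rows than partitions
        splits.append((limit, offset))
        offset += limit
    return splits


def split_queries(table: str, order_by: str, total_rows: int, num: int) -> List[str]:
    """Build one query per chunk; ORDER BY keeps the chunks disjoint and stable."""
    return [
        f"SELECT * FROM {table} ORDER BY {order_by} LIMIT {limit} OFFSET {offset}"
        for limit, offset in limit_offset_splits(total_rows, num)
    ]


if __name__ == "__main__":
    for query in split_queries("my_table", "id", 10, 3):
        print(query)
```

The row counts per task are balanced by construction, but there is a trade-off: without a deterministic ORDER BY the chunks may overlap or miss rows, and each query still has to sort and skip the rows before its offset, so later chunks tend to cost more than earlier ones.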
