[feat] add vocab file for features #97

Merged
merged 6 commits into from
Feb 5, 2025

tiankongdeguiji
Collaborator

No description provided.

@tiankongdeguiji tiankongdeguiji merged commit 3bee923 into alibaba:master Feb 5, 2025
5 checks passed
@@ -87,6 +87,9 @@ def num_embeddings(self) -> int:
            num_embeddings = len(self.vocab_list)
        elif len(self.vocab_dict) > 0:
            num_embeddings = max(list(self.vocab_dict.values())) + 1
        elif len(self.vocab_file) > 0:
            self.init_fg()
            num_embeddings = self._fg_op.vocab_list_size()
        else:
            raise ValueError(
                f"{self.__class__.__name__}[{self.name}] must set hash_bucket_size"
Collaborator

The hint in this error message should also mention vocab_file.
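A minimal sketch of what the review comment above seems to ask for: the fallback error should name vocab_file alongside the other ways of sizing the embedding table. This is not the project's code. `resolve_num_embeddings`, `count_vocab_file_entries`, and the error wording are hypothetical, only the three branches visible in the hunk are covered, and counting non-empty lines is a stand-in for `self._fg_op.vocab_list_size()`, whose exact semantics are not shown here.

```python
from typing import Dict, List, Optional


def count_vocab_file_entries(path: str) -> int:
    """Count non-empty lines in a vocab file.

    Assumption: stands in for the fg op's vocab_list_size(); the real op
    may count entries differently (for example, reserved default buckets).
    """
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def resolve_num_embeddings(
    name: str,
    vocab_list: Optional[List[str]] = None,
    vocab_dict: Optional[Dict[str, int]] = None,
    vocab_file: str = "",
) -> int:
    """Pick the embedding table size from whichever vocab option is set."""
    if vocab_list:
        return len(vocab_list)
    if vocab_dict:
        # Indices may be sparse, so size by the largest index.
        return max(vocab_dict.values()) + 1
    if vocab_file:
        return count_vocab_file_entries(vocab_file)
    # Hypothetical wording: list every supported option, including vocab_file.
    raise ValueError(
        f"feature[{name}] must set hash_bucket_size, vocab_list, "
        "vocab_dict or vocab_file"
    )
```

With this shape, a feature configured with none of the options fails with a message that points the user at vocab_file as well.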

        elif len(self.vocab_file) > 0:
            fg_cfg["vocab_file"] = self.vocab_file
            fg_cfg["default_bucketize_value"] = self.default_bucketize_value
            fg_cfg["value_type"] = "string"
        elif self.config.HasField("num_buckets"):
            fg_cfg["num_buckets"] = self.config.num_buckets
            if self.config.default_value:
Collaborator

As stated earlier, when num_buckets is used the value should be an integer. Why is the type set to string here when default_value is not set? fg_cfg["value_type"] = "string"

3 participants