Releases: Qihoo360/tensornet
v0.2.1
Full Changelog: v0.2.0...v0.2.1
Changes:
- Addition of CVM Plugin:
- Based on the CVM plugin in PCTR-DNN, we integrated and tested its functionality in our TN environment. The plugin extends embeddings with per-feature exposure and click data during training. In the forward pass, while feature embeddings are pulled from the sparse table, each feature's exposure and click values are also fetched and appended as two additional output columns; in the backward pass, the batch's show/click counts are pushed back to the sparse table along with the gradients (see the sketch after this list).
- Normalization Enhancements with PCTRDNN Statistics Logic:
- We introduced new statistics logic in the normalization process, following PCTR-DNN: incremental counts and sums are accumulated, and the incremental variance is computed from the squared deviations, i.e. (data - mean).square().
- Removal of Environment Variable Control for Sparse Initialization:
- To reduce performance overhead, we have eliminated the use of environment variables for controlling the initialization of sparse data structures.
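For intuition, here is a minimal sketch of how exposure/click columns can be appended to the embedding output in the forward pass. The function name and the log/CTR transform are illustrative assumptions, not the CVM plugin's actual implementation.

```python
import tensorflow as tf

def append_cvm_columns(embedding, show, click):
    """Illustrative only: append per-feature exposure/click statistics as two extra
    columns on the embedding output. The log/ratio transform is an assumption,
    not the exact formula used by the CVM plugin."""
    show = tf.maximum(show, 1.0)          # guard against log(0) and division by zero
    log_show = tf.math.log(show)          # smoothed exposure column (assumed)
    ctr = click / show                    # click-through-rate column (assumed)
    return tf.concat([embedding, log_show, ctr], axis=1)

# usage: embedding [batch, dim], show/click [batch, 1] fetched from the sparse table
emb = tf.random.normal([4, 8])
show = tf.constant([[10.0], [3.0], [1.0], [50.0]])
click = tf.constant([[2.0], [0.0], [1.0], [5.0]])
out = append_cvm_columns(emb, show, click)   # shape [4, 10]
```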
v0.2.0
Full Changelog: v0.1.3...v0.2.0
Add Global Normalization
Code: normalization_layer.py
Changes:
- Use Cumulative Sum and Sum of Squares Across Samples:
- Instead of relying solely on the mean and variance within a single batch, normalization now uses the cumulative sum and sum of squares accumulated across all samples.
- Synchronize Across Nodes After Accumulating Partial Batches:
- After a portion of batches has been accumulated, the statistics are synchronized across all nodes to keep them consistent.
- Update BN Table Statistics Per Batch:
- For each batch, the statistics in the Batch Normalization (BN) table are updated; the overall mean and variance are then computed and output (a sketch of this bookkeeping follows the list).
- Store Mean and Variance in Checkpoint:
- The mean and variance are stored in the checkpoint and serve as parameters for prediction.
- The statistics are also stored in HDFS for future reference and use.
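A rough sketch of the bookkeeping described above: keep a cumulative count, sum, and sum of squares, merge them across nodes, and reduce them to an overall mean/variance. Class and method names are illustrative, not the actual normalization_layer.py API.

```python
import numpy as np

class RunningMoments:
    """Sketch of global-normalization statistics (illustrative names only)."""

    def __init__(self, dim):
        self.count = 0.0
        self.sum = np.zeros(dim)
        self.sum_sq = np.zeros(dim)

    def update(self, batch):
        # batch: [batch_size, dim]; accumulate count, sum, and sum of squares
        self.count += batch.shape[0]
        self.sum += batch.sum(axis=0)
        self.sum_sq += (batch ** 2).sum(axis=0)

    def merge(self, other):
        # cross-node synchronization reduces to summing the three accumulators
        self.count += other.count
        self.sum += other.sum
        self.sum_sq += other.sum_sq

    def moments(self):
        mean = self.sum / self.count
        var = self.sum_sq / self.count - mean ** 2   # E[x^2] - E[x]^2
        return mean, np.maximum(var, 0.0)

# usage sketch
stats = RunningMoments(dim=8)
stats.update(np.random.randn(32, 8))
mean, var = stats.moments()
```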
Ref
Local Testing
0.1.3.post2-tool
Full Changelog: v0.1.3.post1...0.1.3.post2-tool
Added the qihoo-tensornet-tool pip package, which mainly includes:
- Merge sparse: sparse files scattered across different directory levels are merged into one or more files under a single HDFS directory.
- Change sparse/dense parallelism: previously the parallelism could not be changed once fixed, so the cluster could not be scaled up or down (see the re-sharding sketch after this list).
- Merge external embeddings: by passing in a mapping between existing signs and new sign values, new embeddings can be added.
These features are currently invoked manually via scripts; embedding them into routine training is not yet supported.
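To illustrate the parallelism change conceptually: if sparse parameters are sharded by `sign % parallelism` (an assumption made for this sketch), re-partitioning to a new parallelism amounts to re-bucketing every entry. The function below is illustrative only, not the tool's code.

```python
from collections import defaultdict

def reshard(entries, new_parallelism):
    """Re-bucket (sign, value) pairs into new_parallelism shards.
    Assumes shard id = sign % parallelism, an illustrative rule only."""
    shards = defaultdict(list)
    for sign, value in entries:
        shards[sign % new_parallelism].append((sign, value))
    return shards

# example: redistribute entries from any old layout into 4 shards
shards = reshard([(12345, b"emb-a"), (67890, b"emb-b")], new_parallelism=4)
```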
v0.1.3.post1
Full Changelog: v0.1.3...v0.1.3.post1
v0.1.3
What's Changed
- Format tensornet build env
- Format release pipeline
- Add deleteByShow for longtail embeddings
- Compat input format and file pattern
- Add sequence_embedding_features.py
- Provide tool to merge sparse table
Full Changelog: 0.1.1...v0.1.3
tensornet-0.1.1
enhance:
- optimize parameter push and pull performance
- compatible with tf-2.3, tf-2.4
- support saving the sparse table with a name
- add a feature-drop show threshold and update show decay with a moving average (see the sketch below)
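A rough sketch of the show-decay and drop-threshold idea, assuming an exponential moving average and a simple cutoff; the constants and function names are illustrative, not tensornet's actual values.

```python
DECAY = 0.98           # illustrative decay factor
DROP_THRESHOLD = 0.1   # illustrative minimum decayed show count

def update_show(decayed_show, batch_show):
    """Exponential moving average of a feature's show counter (assumed formulation)."""
    return decayed_show * DECAY + batch_show

def should_drop(decayed_show):
    """Long-tail features whose decayed show count falls below the threshold
    become candidates for deletion from the sparse table."""
    return decayed_show < DROP_THRESHOLD
```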
add:
- add ftrl optimizer
- add deepfm demo.
delete:
- delete the `version` field in the sparse table
bug fix:
- fix model reload warning bug
tensornet-0.1.0
TensorNet V-0.1.0
First version of TensorNet.
This version publishes tensornet with asynchronous training support, which has been thoroughly tested.
The main APIs are:
- `tn.distribute.PsStrategy`: has the same interface as TensorFlow's distribution strategies and is used for cluster management.
- `tn.feature_column.category_column`: one of the most important tensornet APIs; it defines sparse feature columns whose dimension can approach 2**64 and has the same interface as `tf.feature_column`.
- `tn.layers.EmbeddingFeatures`: the second important API; it pulls sparse embedding vectors from the parameter server and pushes updates back.
- `tn.optimizer.Optimizer`: wraps a TensorFlow optimizer; it is mainly used in async train mode, where it intercepts TensorFlow's gradient-update logic and applies the gradients on the parameter server asynchronously.
- `tn.model.Model`: inherits from `tf.keras.Model` and overrides its save method to support saving and loading sparse features on the parameter server.
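A rough composition sketch of how the APIs above fit together. Argument names, keyword arguments, and shapes are assumptions, not verified tensornet signatures; see the demos in the repository (e.g. the deepfm demo) for real usage.

```python
import tensorflow as tf
import tensornet as tn

strategy = tn.distribute.PsStrategy()        # cluster management, tf-strategy-like interface

# sparse feature columns, each supporting up to ~2**64 distinct ids
# (the 'key' keyword is assumed for this sketch)
feature_names = ["user_id", "item_id"]
columns = [tn.feature_column.category_column(key=name) for name in feature_names]

inputs = {name: tf.keras.layers.Input(name=name, shape=(None,), dtype="int64", sparse=True)
          for name in feature_names}

# pulls embedding vectors from the parameter server in the forward pass and pushes
# gradients back (constructor arguments and output layout are assumed)
emb = tn.layers.EmbeddingFeatures(columns, dim=8)(inputs)
logit = tf.keras.layers.Dense(1, activation="sigmoid")(tf.keras.layers.Flatten()(emb))

# tn.model.Model handles saving/loading sparse parameters on the PS;
# tn.optimizer.Optimizer wraps a TF optimizer for asynchronous PS updates
model = tn.model.Model(inputs=inputs, outputs=logit)
model.compile(optimizer=tn.optimizer.Optimizer(tf.keras.optimizers.Adam()),
              loss="binary_crossentropy")
```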