Open
Labels
data:sequential (related to timeseries datasets), feature request (request for a new feature)
Description
Environment details
- SDV version: 0.17.2
- Python version: 3.9.13
- Operating System: Windows 10
Question description
I have a dataset whose real features seem to follow a negative binomial distribution, so, per the paper,
I want to force the training loss for some features to use the NegativeBinomial distribution.
From deepecho.py, the data type of each column is inferred from its dtype only:
    for field in self._output_columns:
        dtype = timeseries_data[field].dtype
        kind = dtype.kind
        if kind in ('i', 'f'):
            data_type = 'continuous'
        elif kind in ('O', 'b'):
            data_type = 'categorical'
        else:
            raise ValueError(f'Unsupported dtype {dtype}')
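To make the request concrete, here is a minimal sketch of the behavior I am asking for, not existing SDV or DeepEcho code. The function `classify_column` and the `count_columns` argument are hypothetical names made up for illustration; the only thing taken from the snippet above is the mapping from `dtype.kind` to a data type.

```python
def classify_column(name, kind, count_columns=frozenset()):
    """Map a dtype.kind character to a data type label.

    `count_columns` is a hypothetical opt-in set of column names that
    should be modeled as counts; it is not an existing option.
    """
    if kind in ('i', 'f'):
        # Same rule as the quoted snippet, plus the hypothetical override.
        return 'count' if name in count_columns else 'continuous'
    if kind in ('O', 'b'):
        return 'categorical'
    raise ValueError(f'Unsupported dtype kind {kind!r}')


# Example columns given as name -> dtype.kind (illustrative values only).
columns = {'visits': 'i', 'price': 'f', 'region': 'O'}

# Current behavior: every numeric column becomes 'continuous'.
default_types = {name: classify_column(name, kind) for name, kind in columns.items()}

# Requested behavior: 'visits' is forced to 'count'.
forced_types = {
    name: classify_column(name, kind, count_columns={'visits'})
    for name, kind in columns.items()
}
```

With something like this, `visits` would reach the `'count'` branch of the loss while `price` would stay on the Gaussian branch.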
So every numeric feature is classified as continuous, and during training only the first branch below is ever used for them:
    for key, props in self._data_map.items():
        if props['type'] in ['continuous', 'timestamp']:
            mu_idx, sigma_idx, missing_idx = props['indices']
            mu = Y_padded[:, :, mu_idx]
            sigma = torch.nn.functional.softplus(Y_padded[:, :, sigma_idx])
            missing = torch.nn.LogSigmoid()(Y_padded[:, :, missing_idx])
            for i in range(batch_size):
                dist = torch.distributions.normal.Normal(
                    mu[:seq_len[i], i], sigma[:seq_len[i], i])
                log_likelihood += torch.sum(dist.log_prob(X_padded[-seq_len[i]:, i, mu_idx]))
                p_true = X_padded[:seq_len[i], i, missing_idx]
                p_pred = missing[:seq_len[i], i]
                log_likelihood += torch.sum(p_true * p_pred)
                log_likelihood += torch.sum((1.0 - p_true) * torch.log(
                    1.0 - torch.exp(p_pred)))
        elif props['type'] in ['count']:
            r_idx, p_idx, missing_idx = props['indices']
            r = torch.nn.functional.softplus(Y_padded[:, :, r_idx]) * props['range']
            p = torch.sigmoid(Y_padded[:, :, p_idx])
            x = X_padded[:, :, r_idx] * props['range']
            missing = torch.nn.LogSigmoid()(Y_padded[:, :, missing_idx])
            for i in range(batch_size):
                dist = torch.distributions.negative_binomial.NegativeBinomial(
                    r[:seq_len[i], i], p[:seq_len[i], i], validate_args=False)
                log_likelihood += torch.sum(dist.log_prob(x[:seq_len[i], i]))
                p_true = X_padded[:seq_len[i], i, missing_idx]
                p_pred = missing[:seq_len[i], i]
                log_likelihood += torch.sum(p_true * p_pred)
                log_likelihood += torch.sum((1.0 - p_true) * torch.log(
                    1.0 - torch.exp(p_pred)))
As a result, all my features are modeled as Gaussian, which is not correct for my case. Is there a way to force the 'count' type for specific columns?
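To illustrate why the choice of distribution matters for this kind of data, here is a rough standard-library sketch that compares the two log-likelihoods on a small made-up sample of overdispersed counts. The sample values and the moment-matched parameters are assumptions for illustration only. The negative binomial formula is written in the `(total_count, probs)` parametrization that, as far as I understand, `torch.distributions.NegativeBinomial` uses (mean `total_count * probs / (1 - probs)`). Comparing a probability mass with a density is only a heuristic, but for integer data it gives a fair first impression.

```python
import math


def negative_binomial_log_prob(k, total_count, probs):
    """Log-pmf of k under a negative binomial with mean r * p / (1 - p)."""
    return (math.lgamma(k + total_count) - math.lgamma(k + 1)
            - math.lgamma(total_count)
            + total_count * math.log1p(-probs) + k * math.log(probs))


def normal_log_prob(x, mu, sigma):
    """Log-density of x under a Gaussian with mean mu and std sigma."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


# Made-up overdispersed counts: many zeros and a long right tail.
counts = [0, 0, 1, 0, 3, 0, 12, 1, 0, 7]

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)

# Moment matching: var = mean / (1 - p), so p = 1 - mean / var
# (requires var > mean, i.e. overdispersion).
probs = 1.0 - mean / var
total_count = mean * (1.0 - probs) / probs

nb_ll = sum(negative_binomial_log_prob(c, total_count, probs) for c in counts)
normal_ll = sum(normal_log_prob(c, mean, math.sqrt(var)) for c in counts)

print(f'negative binomial log-likelihood: {nb_ll:.2f}')
print(f'Gaussian log-likelihood:          {normal_ll:.2f}')
```

On this sample the negative binomial fit assigns a higher log-likelihood than the Gaussian fit with matching mean and variance, which is the mismatch I would like to avoid in the training loss.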