gluonnlp.loss¶
Gluon NLP Toolkit provides tools for easily setting up task-specific losses.
Activation Regularizers¶
We provide activation regularization (AR) and temporal activation regularization (TAR) as defined in the following work:
@article{merity2017revisiting,
  title={Revisiting Activation Regularization for Language RNNs},
  author={Merity, Stephen and McCann, Bryan and Socher, Richard},
  journal={arXiv preprint arXiv:1708.01009},
  year={2017}
}
ActivationRegularizationLoss: Computes Activation Regularization Loss.
TemporalActivationRegularizationLoss: Computes Temporal Activation Regularization Loss.
API Reference¶
NLP loss.

class gluonnlp.loss.ActivationRegularizationLoss(alpha=0, weight=None, batch_axis=None, **kwargs)¶
Computes Activation Regularization Loss. (alias: AR)
The formulation is as below:
\[L = \alpha L_2(h_t)\]where \(L_2(\cdot) = {\lVert \cdot \rVert}_2\), \(h_t\) is the output of the RNN at timestep \(t\), and \(\alpha\) is the scaling coefficient.
The implementation follows Merity et al. (2017), arXiv:1708.01009, cited above.
Parameters:
alpha (int, default 0) – The scaling coefficient of the regularization.
weight (float or None) – Global scalar weight for loss.
batch_axis (int, default None) – The axis that represents the mini-batch.
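To make the formula concrete, here is a minimal pure-Python sketch of the AR penalty, computing \(\alpha\) times the mean squared activation over all timesteps, one common realization of the L2 penalty. The function name and the list-of-lists representation of the hidden states are illustrative, not the library's API:

```python
def activation_regularization(hidden, alpha=0.5):
    """AR sketch: alpha * mean(h_t ** 2) over all timesteps and units.

    hidden -- list of timesteps, each a list of hidden-unit activations.
    """
    total, count = 0.0, 0
    for h_t in hidden:          # one hidden-state vector per timestep
        for x in h_t:
            total += x * x      # squared activation
            count += 1
    return alpha * total / count
```

Large activations are penalized directly, which pushes the RNN toward smaller hidden states unless the extra magnitude actually reduces the task loss.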

class gluonnlp.loss.TemporalActivationRegularizationLoss(beta=0, weight=None, batch_axis=None, **kwargs)¶
Computes Temporal Activation Regularization Loss. (alias: TAR)
The formulation is as below:
\[L = \beta L_2(h_t - h_{t+1})\]where \(L_2(\cdot) = {\lVert \cdot \rVert}_2\), \(h_t\) is the output of the RNN at timestep \(t\), \(h_{t+1}\) is the output of the RNN at timestep \(t+1\), and \(\beta\) is the scaling coefficient.
The implementation follows Merity et al. (2017), arXiv:1708.01009, cited above.
Parameters:
beta (int, default 0) – The scaling coefficient of the regularization.
weight (float or None) – Global scalar weight for loss.
batch_axis (int, default None) – The axis that represents the mini-batch.
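A matching pure-Python sketch of the TAR penalty: \(\beta\) times the mean squared difference between consecutive hidden states, which penalizes the RNN for changing its state abruptly between timesteps. As above, the function name and the list-of-lists representation are illustrative, not the library's API:

```python
def temporal_activation_regularization(hidden, beta=0.5):
    """TAR sketch: beta * mean((h_t - h_{t+1}) ** 2) over consecutive steps.

    hidden -- list of timesteps, each a list of hidden-unit activations.
    """
    total, count = 0.0, 0
    for h_t, h_next in zip(hidden, hidden[1:]):  # consecutive timestep pairs
        for a, b in zip(h_t, h_next):
            d = a - b           # change in each hidden unit
            total += d * d
            count += 1
    return beta * total / count
```

Note that TAR needs at least two timesteps; with a single timestep there are no consecutive pairs to compare.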