During the training of RNNs for sequence generation, what is the common technique used to mitigate the vanishing gradient problem?
Answer options
A
L1 regularization
B
Batch normalization
C
Dropout
D
Gradient clipping
E
Data augmentation
Correct answer: Gradient clipping
Explanation
The source marks the correct answer as: Gradient clipping.