Sometimes you need label indices, for example in the discrete action space of reinforcement learning, where the policy outputs an action index; sometimes you need one-hot vectors, for example for training data, or when feeding the previous step's action back in as part of the input. Being able to convert easily between the two is therefore useful.
A one-hot encoding can be generated quickly with np.eye(action_dims)[actions]:
>>> import numpy as np
>>> label = [1,2,2,3]
>>> np.eye(4)[label]
array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
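In an RL setting the same one-liner converts a whole batch of sampled action indices to one-hot vectors in a single step. A minimal sketch, continuing the session above (action_dims and the action values here are only illustrative placeholders):

>>> action_dims = 4                 # size of the discrete action space
>>> actions = np.array([0, 3, 1])   # a batch of sampled action indices
>>> np.eye(action_dims)[actions]    # one identity-matrix row per action
array([[1., 0., 0., 0.],
       [0., 0., 0., 1.],
       [0., 1., 0., 0.]])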
For the reverse conversion, from one-hot back to label indices, NumPy can use np.argmax(onehot, 1), and PyTorch can use torch.topk(onehot, 1)[1].squeeze(1):
>>> import torch
>>> onehot = np.eye(4)[label]
>>> onehot
array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
>>> np.argmax(onehot,1)
array([1, 2, 2, 3], dtype=int64)
>>> torch.topk(torch.tensor(onehot), 1)[1].squeeze(1)
tensor([1, 2, 2, 3])
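Since the single 1 in each row is also that row's maximum, torch.argmax with dim=1 gives the same result a little more directly than topk; just an alternative, continuing the same session:

>>> torch.argmax(torch.tensor(onehot), dim=1)
tensor([1, 2, 2, 3])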