While working through the official tutorial, I came across the usage self.v = torch.nn.Parameter(torch.FloatTensor(hidden_size)). Even after reading the explanation in the official tutorial I was still in a fog, so I looked up an explanation on Stack Overflow and ran a few experiments until I fully understood it. First of all, you can think of this function as a type conversion: it converts a plain Tensor, which the module does not train, into a trainable Parameter and binds that Parameter to the module (it then shows up in net.parameters(), so the optimizer can update it). After the conversion, self.v becomes part of the model: a parameter whose value changes as training proceeds. The purpose of using this function is to let certain variables keep adjusting their values during learning until they reach an optimum.
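To make the binding concrete, here is a minimal sketch. The Scale module and its forward pass are hypothetical and are not taken from the tutorial. It also uses torch.randn instead of torch.FloatTensor(hidden_size), because the latter returns uninitialized memory.

```python
import torch
import torch.nn as nn

class Scale(nn.Module):  # hypothetical toy module, for illustration only
    def __init__(self, hidden_size):
        super().__init__()
        # Wrapping the tensor in nn.Parameter registers it on the module.
        self.v = nn.Parameter(torch.randn(hidden_size))

    def forward(self, x):
        return (x * self.v).sum()

net = Scale(hidden_size=4)
param_names = [name for name, _ in net.named_parameters()]
print(param_names)  # ['v']

# Because v is registered, net.parameters() hands it to the optimizer.
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
before = net.v.detach().clone()
loss = net(torch.ones(4))  # d(loss)/d(v) is a vector of ones
loss.backward()
optimizer.step()
after = net.v.detach().clone()
print(after - before)  # each entry decreased by lr * grad = 0.1
```

The optimizer never sees self.v by name; it only receives whatever net.parameters() yields, which is why the registration matters.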
$$\mathrm{score}(\mathtt{h}_t) = \mathtt{h}_t^{\top}\,\mathtt{\bar h}_s \quad \text{(dot)}$$
In the concat attention mechanism, the weight v is learned continuously during training, so it has to be of type Parameter.
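As a rough sketch of where v sits in a concat score, assuming the common formulation v^T tanh(W [h_t; h_s]): the class name ConcatScore, the attn layer name, and the tensor shapes below are my own assumptions, not code from the tutorial.

```python
import torch
import torch.nn as nn

class ConcatScore(nn.Module):  # hypothetical name and layout
    def __init__(self, hidden_size):
        super().__init__()
        self.attn = nn.Linear(2 * hidden_size, hidden_size)  # plays the role of W
        self.v = nn.Parameter(torch.randn(hidden_size))      # learned weight v

    def forward(self, h_t, h_s):
        # h_t: (batch, hidden), h_s: (batch, src_len, hidden)
        src_len = h_s.size(1)
        h_t_exp = h_t.unsqueeze(1).expand(-1, src_len, -1)
        energy = torch.tanh(self.attn(torch.cat((h_t_exp, h_s), dim=2)))
        return energy @ self.v  # (batch, src_len)

scorer = ConcatScore(hidden_size=8)
scores = scorer(torch.randn(2, 8), torch.randn(2, 5, 8))
print(scores.shape)  # torch.Size([2, 5])

scores.sum().backward()
names = sorted(name for name, _ in scorer.named_parameters())
print(names)  # ['attn.bias', 'attn.weight', 'v']
```

After backward(), scorer.v.grad is populated just like the gradients of the Linear layer, so v is trained together with the rest of the model.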
From my experiments I found that the weight and bias inside Linear are of type Parameter and cannot be replaced with a plain Tensor. The weight of a Linear layer can even be reassigned to a Parameter whose shape differs from the one used at initialization, which changes the model.
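The experiments can be reproduced roughly like this. It is a sketch with arbitrary sizes that I picked, and bias=False is used in the reshaping part so that the old bias does not clash with the new output size.

```python
import torch
import torch.nn as nn

full = nn.Linear(4, 3)
both_are_params = (isinstance(full.weight, nn.Parameter)
                   and isinstance(full.bias, nn.Parameter))
print(both_are_params)  # True

lin = nn.Linear(4, 3, bias=False)

# Assigning a plain tensor to a registered parameter name is rejected.
try:
    lin.weight = torch.randn(3, 4)
    assign_error = None
except TypeError as err:
    assign_error = err
print(type(assign_error).__name__)  # TypeError

# Assigning a Parameter works, even with a different shape.
lin.weight = nn.Parameter(torch.randn(5, 4))
out = lin(torch.randn(2, 4))
print(out.shape)  # torch.Size([2, 5])

# Note: the bookkeeping attribute is not updated by this assignment.
print(lin.out_features)  # 3
```

Reassigning the weight this way is mostly a curiosity: the layer computes with the new shape, but attributes such as out_features still describe the original layer.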
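As for the difference from torch.tensor([1., 2., 3.], requires_grad=True): that only makes the tensor track gradients, so it is trainable in principle, but it is not bound to the module's parameter list. (The values must be floating point, because integer tensors cannot require gradients.)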
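A small side-by-side check of the two, as a sketch: the Holder module and the attribute names t and p are made up for illustration.

```python
import torch
import torch.nn as nn

class Holder(nn.Module):  # hypothetical module
    def __init__(self):
        super().__init__()
        # Plain tensor: tracks gradients, but is just a Python attribute.
        self.t = torch.tensor([1., 2., 3.], requires_grad=True)
        # Parameter: tracks gradients and is registered on the module.
        self.p = nn.Parameter(torch.tensor([1., 2., 3.]))

m = Holder()
registered = [name for name, _ in m.named_parameters()]
print(registered)  # ['p']
print(m.t.requires_grad, m.p.requires_grad)  # True True

saved_keys = list(m.state_dict().keys())
print(saved_keys)  # ['p']
```

So an optimizer built from m.parameters() would update p and ignore t, and t would also be missing when the model is saved via state_dict().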