During training, dynamically adjusting the size of the input data improves the model's adaptability to objects of different sizes, and is one way to improve its generalization ability. The PaddleDetection toolkit, for example, provides parameter settings for a variety of size combinations. In addition, while reading the SegFormer paper, I noticed that the authors scale the training data by a ratio of 0.5-2.0 (their pipeline also includes random horizontal flipping and random cropping). The operation in this post was implemented for that purpose. Image resizing is done entirely with PIL.Image and is independent of any AI framework, so it can be used with paddle, pytorch, and tensorflow alike. Object detection is not supported for the time being; it will be added later if needed.
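The random scaling described above can be sketched roughly as follows. This is a minimal illustration, not the implementation from this post or from SegFormer: the function name `random_scale`, its parameters, and the assumption that the image is an HWC uint8 array with a two-dimensional uint8 label map are all hypothetical.

```python
import random

import numpy as np
from PIL import Image


def random_scale(image, label, min_scale=0.5, max_scale=2.0, rng=random):
    """Resize an image and its label map by one shared random factor.

    Hypothetical helper: `image` is assumed to be an HWC uint8 array and
    `label` a 2-D uint8 array of class ids.
    """
    scale = rng.uniform(min_scale, max_scale)
    h, w = label.shape[:2]
    new_w = max(1, int(round(w * scale)))
    new_h = max(1, int(round(h * scale)))

    # PIL expects the target size as (width, height).
    img = Image.fromarray(image).resize((new_w, new_h), Image.BILINEAR)
    # Nearest-neighbour interpolation never invents new class ids.
    lab = Image.fromarray(label).resize((new_w, new_h), Image.NEAREST)
    return np.asarray(img), np.asarray(lab)
```

The image and the label share one scale factor so that they stay aligned pixel for pixel; only the interpolation mode differs between them.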
Keep in mind that random_size converts the picture to uint8, so the data passed in must not be standardized (mean subtracted, then divided by the standard deviation); only normalized data may be passed in. Otherwise the results will be extremely poor and the model will not converge, because standardized data contains negative numbers and information is lost when it is converted to uint8.
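A small NumPy experiment shows why. It assumes that "normalized" means pixel values scaled to [0, 1] and that the conversion to uint8 multiplies by 255; the sample pixels, mean, and standard deviation are made up for illustration.

```python
import numpy as np

# Made-up 2x2 single-channel image, normalized to [0, 1].
normalized = np.array([[0.0, 0.25], [0.5, 1.0]], dtype=np.float32)

# Normalized data survives a round trip through uint8 almost unchanged.
as_uint8 = np.round(normalized * 255).astype(np.uint8)
restored = as_uint8.astype(np.float32) / 255
max_error = float(np.abs(restored - normalized).max())

# Standardized data contains negative values (illustrative mean and std).
mean, std = 0.5, 0.25
standardized = (normalized - mean) / std

# uint8 cannot represent negatives: after clipping to the valid range,
# every pixel at or below the mean collapses to 0.
clipped = np.clip(standardized, 0, 255).astype(np.uint8)

print("round-trip error for normalized data:", max_error)
print("distinct values before:", np.unique(standardized).size)
print("distinct values after: ", np.unique(clipped).size)
```

In this toy case four distinct pixel values are reduced to two, which is the information loss in question.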
1.1 Implementing the resize functions
When resizing during image processing, note that the label is in HW format, i.e. two-dimensional data, while the image is in CHW or HWC format, i.e. three-dimensional data. Images are further divided into single-channel and three-channel images: three-channel data is usually RGB, and single-channel data is usually a grayscale image. Different kinds of data need different resize functions, so there are the following three functions.
from PIL import