- Similarity between faces: one person can closely resemble another, which causes many problems for security and surveillance systems.
- Facial recognition systems have difficulty distinguishing between the main person and other people with highly similar features. This study therefore proposes a generative convolutional neural architecture that can generate many faces with a high degree of similarity to the main person.
- The study generates a large number of useful facial image samples that help biometric neural networks focus on the more informative features for distinguishing between people.
- The generative architecture controls the latent space through a subset of its dimensions, so that it generates a new image that keeps the features of the base image with some changes.
- The latent space is split into two parts: 70 dimensions hold the basic features that define the main person, and 10 dimensions control the regeneration of the image.
- Manipulating the 10 control dimensions slightly changes the features of the generated image, so that it stays similar to the base image but differs from it in some details.
- Each time the 10 control inputs take different values, the generative network redraws the face so that it resembles the base face image but differs from it in some features.
- Adaptive instance normalization (AdaIN) is used to re-render the base image of the person, changing some features while keeping the result close to the original person.
- The key idea is to apply the normalization inside the decoder rather than inside the encoder.
- The encoder learns the basic features of the face image and projects them to their position in the 70-dimensional latent space.
- During decoding, the decoder first receives the position of the base image in the 70-dimensional latent space. The 10 control dimensions are then passed in to influence the regeneration, so that the regeneration is conditionally normalized.
- Note: This study uses the InfoGAN methodology for semi-supervised conditional generation, with the InfoGAN architecture modified in its decoder.
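
The conditional normalization step described above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's actual implementation: the linear mapping `control_to_style`, the random stand-in weights, the 8-channel 4x4 feature map, and the sampling range of the control code are all assumptions made for the example. Only the 70/10 latent split and the use of adaptive instance normalization in the decoder come from the text.

```python
import numpy as np

IDENTITY_DIM = 70  # dimensions holding the main person's features (from the text)
CONTROL_DIM = 10   # dimensions used to vary the generated face (from the text)


def adain(features, scale, shift, eps=1e-5):
    """Adaptive instance normalization on a (channels, height, width) map.

    Each channel is normalized to zero mean and unit variance over its
    spatial positions, then rescaled and shifted by per-channel values.
    """
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return scale[:, None, None] * normalized + shift[:, None, None]


def control_to_style(control, weight, bias):
    """Hypothetical linear layer: 10-D control code -> per-channel scale and shift."""
    params = weight @ control + bias   # shape: (2 * channels,)
    n_channels = params.shape[0] // 2
    scale = 1.0 + params[:n_channels]  # stay close to identity scaling
    shift = params[n_channels:]
    return scale, shift


rng = np.random.default_rng(0)
channels = 8

# Stand-ins for learned parameters (assumption: a real model learns these).
weight = rng.normal(scale=0.1, size=(2 * channels, CONTROL_DIM))
bias = np.zeros(2 * channels)

# Stand-in for a decoder feature map computed from the 70-D identity code.
identity_code = rng.normal(size=IDENTITY_DIM)
projection = rng.normal(size=(channels * 4 * 4, IDENTITY_DIM))
decoder_features = (projection @ identity_code).reshape(channels, 4, 4)

# Two different control codes modulate the same identity features differently.
control_a = rng.uniform(-1.0, 1.0, size=CONTROL_DIM)
control_b = rng.uniform(-1.0, 1.0, size=CONTROL_DIM)
variant_a = adain(decoder_features, *control_to_style(control_a, weight, bias))
variant_b = adain(decoder_features, *control_to_style(control_b, weight, bias))
```

Because the identity features are fixed and only the control code changes, `variant_a` and `variant_b` share the same normalized structure but carry different per-channel statistics. In a trained decoder, this kind of modulation is what would let one identity code yield several similar but distinct faces.
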
Examples | #Outputs |
---|---|
- Note: The vast majority of the data sets used contain pictures of people with white skin. They include only a small number of Black people and not many pictures of Asian people. You can therefore retrain with a larger data set than the one I used.