Siamese models

facerec.siamese_models.get_base_model(base_architecture: str = 'ResNet50', input_shape: Tuple[int, int, int] = (150, 150, 3)) → Sequential

Get a pre-trained base model for a Siamese network.

Parameters:

base_architecture (str, optional) – The architecture of the pre-trained model (default: 'ResNet50').
input_shape (tuple[int, int, int], optional) – The input shape for the base model (default: (150, 150, 3)).

Returns:

Pre-trained base model.

Return type:

Sequential

Raises:

ValueError – If an unsupported base architecture is provided.

Example

>>> base_model = get_base_model('VGG16', (224, 224, 3))

facerec.siamese_models.get_siamese_model(base_architecture: str = 'ResNet50', input_shape: Tuple[int, int, int] = (150, 150, 3)) → Model

Create a Siamese network model.

Parameters:

base_architecture (str, optional) – The architecture of the pre-trained base model (default: 'ResNet50').
input_shape (tuple[int, int, int], optional) – The input shape for the model (default: (150, 150, 3)).

Returns:

Siamese network model.

Return type:

Model

Raises:

ValueError – If an unsupported base architecture is provided.

Example

>>> siamese_model = get_siamese_model('VGG16', (224, 224, 3))

facerec.siamese_models.train_model(siamese_network: Model, model_name: str, training_dataset_generator: DataGenerator, validation_dataset_generator: DataGenerator, num_epochs: int = 50) → History

Train a Siamese network model.

Parameters:

siamese_network (Model) – The Siamese network model to train.
model_name (str) – Name of the model for saving checkpoints and logs.
training_dataset_generator (DataGenerator) – The generator for the training dataset.
validation_dataset_generator (DataGenerator) – The generator for the validation dataset.
num_epochs (int, optional) – Number of training epochs (default: 50).

Returns:

Training history.

Return type:

History

Example

>>> history = train_model(siamese_model, 'my_siamese_model', training_data_gen, validation_data_gen)

Utils

facerec.utils.accuracy(y_true: Tensor, y_pred: Tensor) → Tensor

Calculate accuracy.

Parameters:

y_true (tf.Tensor) – Ground truth binary labels.
y_pred (tf.Tensor) – Predicted distances.

Returns:

Accuracy score.

Return type:

tf.Tensor

Example

>>> true_labels = tf.constant([1, 0, 1])
>>> predicted_scores = tf.constant([0.8, 0.2, 0.7])
>>> acc = accuracy(true_labels, predicted_scores)

facerec.utils.contrastive_loss(y_true: Tensor, y_pred: Tensor) → Tensor

Calculate the contrastive loss for Siamese networks.

Parameters:

y_true (tf.Tensor) – Ground truth labels (0 for dissimilar, 1 for similar pairs).
y_pred (tf.Tensor) – Predicted similarity scores.

Returns:

Contrastive loss value.

Return type:

tf.Tensor

Example

>>> true_labels = tf.constant([1, 0, 1])
>>> predicted_scores = tf.constant([0.8, 0.2, 0.7])
>>> loss = contrastive_loss(true_labels, predicted_scores)
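For orientation, a contrastive loss is commonly computed as in the pure-Python sketch below. It is not the facerec implementation. It assumes that y_pred holds distances (the parameter description above says similarity scores, so the library may differ), that the margin is 1.0, and that labels follow the convention documented above (1 for similar pairs). The name contrastive_loss_sketch and the margin parameter are hypothetical.

```python
from typing import Sequence


def contrastive_loss_sketch(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    margin: float = 1.0,  # assumed value; the facerec margin is not documented
) -> float:
    """Mean contrastive loss over a batch of pairs.

    Assumes y_true is 1 for similar pairs and 0 for dissimilar pairs,
    and that y_pred holds non-negative distances.
    """
    total = 0.0
    for label, distance in zip(y_true, y_pred):
        # Similar pairs are penalised for being far apart.
        similar_term = label * distance ** 2
        # Dissimilar pairs are penalised only when closer than the margin.
        dissimilar_term = (1 - label) * max(margin - distance, 0.0) ** 2
        total += similar_term + dissimilar_term
    return total / len(y_true)


loss = contrastive_loss_sketch([1, 0, 1], [0.8, 0.2, 0.7])
```

Under these assumptions a similar pair at distance 0 and a dissimilar pair beyond the margin both contribute zero loss.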

facerec.utils.euclidean_distance(vectors: Tensor) → Tensor

Calculate the Euclidean distance between two vectors.

Parameters:

vectors (tf.Tensor) – A pair of tensors (x, y) between which the distance is calculated.

Returns:

The Euclidean distance between the two input tensors.

Return type:

tf.Tensor

Example

>>> x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
>>> y = tf.constant([[5.0, 6.0], [7.0, 8.0]])
>>> distance = euclidean_distance((x, y))
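The computation can be illustrated without TensorFlow. The sketch below assumes the distance is taken row by row, one value per pair in the batch; whether facerec keeps a trailing dimension or adds a small epsilon for numerical stability is not documented here. The name euclidean_distance_sketch is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def euclidean_distance_sketch(vectors: Tuple[Matrix, Matrix]) -> List[float]:
    """Row-wise Euclidean distance between two equally shaped batches."""
    x, y = vectors
    distances = []
    for row_x, row_y in zip(x, y):
        # Sum of squared component differences for one pair of rows.
        squared_sum = sum((a - b) ** 2 for a, b in zip(row_x, row_y))
        distances.append(math.sqrt(squared_sum))
    return distances


x = [[1.0, 2.0], [3.0, 4.0]]
y = [[5.0, 6.0], [7.0, 8.0]]
distances = euclidean_distance_sketch((x, y))
```

With the inputs from the example above, each row differs by 4 in both components, so both distances equal the square root of 32.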

facerec.utils.precision(y_true: Tensor, y_pred: Tensor) → Tensor

Calculate precision.

Parameters:

y_true (tf.Tensor) – Ground truth binary labels.
y_pred (tf.Tensor) – Predicted distances.

Returns:

Precision score.

Return type:

tf.Tensor

Example

>>> true_labels = tf.constant([1, 0, 1])
>>> predicted_scores = tf.constant([0.8, 0.2, 0.7])
>>> prec = precision(true_labels, predicted_scores)

facerec.utils.recall(y_true: Tensor, y_pred: Tensor) → Tensor

Calculate recall.

Parameters:

y_true (tf.Tensor) – Ground truth binary labels.
y_pred (tf.Tensor) – Predicted distances.

Returns:

Recall score.

Return type:

tf.Tensor

Example

>>> true_labels = tf.constant([1, 0, 1])
>>> predicted_scores = tf.constant([0.8, 0.2, 0.7])
>>> rec = recall(true_labels, predicted_scores)

facerec.utils.specificity(y_true: Tensor, y_pred: Tensor) → Tensor

Calculate specificity.

Parameters:

y_true (tf.Tensor) – Ground truth binary labels.
y_pred (tf.Tensor) – Predicted distances.

Returns:

Specificity score.

Return type:

tf.Tensor

Example

>>> true_labels = tf.constant([1, 0, 1])
>>> predicted_scores = tf.constant([0.8, 0.2, 0.7])
>>> spec = specificity(true_labels, predicted_scores)
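The four metrics above are all derived from the same confusion-matrix counts. The sketch below shows one way to obtain them from distances. It rests on two assumptions that this reference does not confirm: a pair is predicted as similar when its distance is below a threshold, and the threshold is 0.5. The helper names confusion_counts and ratio are hypothetical.

```python
from typing import Dict, Sequence


def confusion_counts(
    y_true: Sequence[int],
    y_pred: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; not documented by facerec
) -> Dict[str, int]:
    """Count true/false positives/negatives for distance predictions."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for label, distance in zip(y_true, y_pred):
        # Assumption: a small distance means "same person".
        predicted_similar = distance < threshold
        if predicted_similar:
            counts["tp" if label == 1 else "fp"] += 1
        else:
            counts["fn" if label == 1 else "tn"] += 1
    return counts


def ratio(numerator: int, denominator: int) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


c = confusion_counts([1, 0, 1, 0, 1], [0.1, 0.9, 0.7, 0.3, 0.2])
accuracy_value = ratio(c["tp"] + c["tn"], sum(c.values()))
precision_value = ratio(c["tp"], c["tp"] + c["fp"])
recall_value = ratio(c["tp"], c["tp"] + c["fn"])
specificity_value = ratio(c["tn"], c["tn"] + c["fp"])
```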

Dataset generator

class facerec.dataset_generator.DataGenerator(positive_pairs_path: str, negative_pairs_path: str, images_path: str, input_shape: Tuple[int, int, int], batch_size: int, seed: int, shuffle: bool, debug: bool = False)

Custom data generator for Siamese network training.

Parameters:

positive_pairs_path (str) – Path to the CSV file containing positive pairs.
negative_pairs_path (str) – Path to the CSV file containing negative pairs.
images_path (str) – Path to the folder containing image data.
input_shape (Tuple[int, int, int]) – The desired shape for input images (height, width, channels).
batch_size (int) – Batch size for data generation.
seed (int) – Random seed for shuffling data.
shuffle (bool) – Whether to shuffle the data at the end of each epoch.
debug (bool, optional) – Whether to enable debug mode. Defaults to False.

__getitem__(index: int) → Tuple[Tuple[ndarray, ndarray], ndarray]

Generate a batch of data.

Parameters:

index (int) – Index of the batch.

Returns:

A tuple containing the left and right input image batches and the label batch.

Return type:

tuple[tuple[np.ndarray, np.ndarray], np.ndarray]

__len__() → int

Get the number of batches per epoch.

Returns:

The number of batches per epoch.

Return type:

int

on_epoch_end(): Shuffle data indices at the end of each epoch.

facerec.dataset_generator.train_val_test_split(pairs_path: str, train_size: float, val_size: float, test_size: float, seed: int)

Split data from a pairs file into training, validation, and test sets and save them to separate files.

Parameters:

pairs_path (str) – Path to the pairs file containing the data.
train_size (float) – Proportion of data to include in the training set (0.0 to 1.0).
val_size (float) – Proportion of data to include in the validation set (0.0 to 1.0).
test_size (float) – Proportion of data to include in the test set (0.0 to 1.0).
seed (int) – Random seed for shuffling data.

Raises:

AssertionError – If the sum of train_size, val_size, and test_size is not equal to 1.

Note

train_size, val_size, and test_size must sum to 1.

Example

>>> train_val_test_split("pairs.txt", 0.7, 0.2, 0.1, 42)
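The splitting logic can be sketched on an in-memory list. This sketch does not write any files, because the names of the output files produced by train_val_test_split are not documented here. The function name split_lines_sketch, the rounding of subset sizes, and the tolerance used for the sum check are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def split_lines_sketch(
    lines: Sequence[str],
    train_size: float,
    val_size: float,
    test_size: float,
    seed: int,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle lines with a fixed seed and cut them into three subsets."""
    assert math.isclose(train_size + val_size + test_size, 1.0), \
        "train_size, val_size, and test_size must sum to 1"
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    # round() avoids losing an item to floating-point truncation.
    train_end = round(len(shuffled) * train_size)
    val_end = train_end + round(len(shuffled) * val_size)
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]


lines = [f"pair_{i}" for i in range(10)]
train, val, test = split_lines_sketch(lines, 0.7, 0.2, 0.1, seed=42)
```

Reusing the same seed reproduces the same split, which keeps the test set fixed across runs.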

Pairs generator

class facerec.pairs_generator.PairsGenerator(images_directory: str)

Generates positive and negative pairs of images for training a Siamese network.

Parameters:

images_directory (str) – The path to the directory containing the images.

images_directory

The path to the directory containing the images.

Type:

str

pairs_directory

The path to the directory where generated pairs will be stored.

Type:

str

positive_pairs

A list to store positive pairs of images.

Type:

list[list[str, list[str]]]

negative_pairs

A list to store negative pairs of images.

Type:

list[list[str, str, list[str]]]

export_negative_pairs_to_csv(filename: str = 'negative_pairs.csv')

Exports negative pairs to a CSV file.

Parameters:

filename (str, optional) – The name of the CSV file to export negative pairs to (default: 'negative_pairs.csv').

export_positive_pairs_to_csv(filename: str = 'positive_pairs.csv')

Exports positive pairs to a CSV file.

Parameters:

filename (str, optional) – The name of the CSV file to export positive pairs to (default: 'positive_pairs.csv').

generate_negative_combinations(combinations_num: int = 15080)

Generates negative pairs of images.

This method generates negative pairs by selecting random combinations of two images from different people’s directories.

Parameters:

combinations_num (int, optional) – The number of negative combinations to generate (default: 15080).

Notes

This method uses random sampling to create negative pairs.
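Random sampling of negative pairs can be sketched as follows. The sketch works on an in-memory mapping from person to image names instead of a directory tree, and it may produce duplicate pairs; whether facerec deduplicates is not documented here. The function name sample_negative_pairs_sketch and its seed parameter are hypothetical.

```python
import random
from typing import Dict, List, Tuple


def sample_negative_pairs_sketch(
    images_by_person: Dict[str, List[str]],
    combinations_num: int,
    seed: int = 0,  # hypothetical parameter, added for reproducibility
) -> List[Tuple[str, str]]:
    """Draw pairs of images that belong to two different people."""
    rng = random.Random(seed)
    people = sorted(images_by_person)
    pairs = []
    for _ in range(combinations_num):
        # sample() returns two distinct people, so the pair is negative.
        person_a, person_b = rng.sample(people, 2)
        image_a = rng.choice(images_by_person[person_a])
        image_b = rng.choice(images_by_person[person_b])
        pairs.append((image_a, image_b))
    return pairs


images_by_person = {
    "alice": ["alice_1.jpg", "alice_2.jpg"],
    "bob": ["bob_1.jpg"],
    "carol": ["carol_1.jpg", "carol_2.jpg", "carol_3.jpg"],
}
negative_pairs = sample_negative_pairs_sketch(images_by_person, 20)
```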

generate_positive_pairs()

Generates positive pairs of images.

This method generates positive pairs by selecting combinations of two images from the same person’s directory.

Notes

At most 49 positive pairs are generated per person.
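Pair generation with a per-person cap can be sketched using itertools. The cap of 49 comes from the note above; keeping the first 49 combinations in iteration order is an assumption, since the library may select them differently. The function name positive_pairs_sketch and the in-memory mapping are hypothetical.

```python
from itertools import combinations, islice
from typing import Dict, List, Tuple

MAX_PAIRS_PER_PERSON = 49  # cap stated in the note above


def positive_pairs_sketch(
    images_by_person: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Build same-person image pairs, capped per person."""
    pairs = []
    for person in sorted(images_by_person):
        images = images_by_person[person]
        # Assumption: keep the first combinations up to the cap.
        capped = islice(combinations(images, 2), MAX_PAIRS_PER_PERSON)
        pairs.extend(capped)
    return pairs


images_by_person = {
    "alice": [f"alice_{i}.jpg" for i in range(3)],   # 3 possible pairs
    "bob": [f"bob_{i}.jpg" for i in range(12)],      # 66 possible pairs
}
positive_pairs = positive_pairs_sketch(images_by_person)
```

Here the three images of the first person yield 3 pairs, while the 66 possible pairs of the second person are cut to 49.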

get_all_images_names()

Gets the names of all images in the directory.

Returns:

A list of image names.

Return type:

list[str]