Comparing 3 ways to store face features when developing facial recognition search

Thanks to the development of innovative technologies, miracles disappear from our lives one by one and become common everyday occurrences... and it's not so bad since in return we get a chance to benefit from these great achievements. And this also applies to facial recognition software.

The area is still quite new, but many companies have already managed to profit from it. That’s why the market is saturated with popular face recognition apps which have attracted numerous loyal users and started making good money. And if you also want to carve out your own piece of the pie, take your time to read our article carefully.

You'll learn about the purpose of facial recognition software, understand how it works, find out a few efficient options for storing face features when developing facial recognition search, and get other relevant information on the topic.

What is facial recognition technology?

In fact, we're talking about an online or mobile program able to isolate a human face in an image or video in order to determine the identity of the person it belongs to. By the way, the software can recognize other objects in the same way too (which will be discussed below).

Current face recognition apps identify such details as:

  • sex of the person;

  • approximate age;

  • emotional state.

Facial recognition software captures, analyzes, and compares patterns based on the person's facial details.

  1. The face detection process is an essential step as it detects and locates human faces in images and videos.

  2. The face capture process transforms analog information (a face) into a set of digital information (data) based on the person's facial features.

  3. The face match process verifies if two faces belong to the same person.

Today, facial recognition is considered to be the most natural of all biometric measurements. And for a good reason: we recognize ourselves not by looking at our fingerprints or irises, for example, but by looking at our faces.

Before we go any further, let's quickly define two keywords: "identification" and "authentication".

Face recognition data to identify and verify

Biometrics are used to identify and authenticate a person using a set of recognizable and verifiable data unique and specific to that person. 

  • Identification answers the question: "Who are you?"

  • Authentication answers the question: "Are you really who you say you are?"

Here are some examples:

  • In the case of facial biometrics, a 2D or 3D sensor "captures" a face. It then transforms it into digital data by applying an algorithm before comparing the image captured to those held in a database.

  • These automated systems can be used to identify or check the identity of individuals in just a few seconds based on their facial features: spacing of the eyes, bridge of the nose, the contour of the lips, ears, chin, etc.

  • They can even do this in the middle of a crowd and within dynamic and unstable environments. Proof of this can be seen in the performance achieved by Thales' Live Face Identification System (LFIS), an advanced solution resulting from the company's long-standing expertise in biometrics.

  • Owners of the iPhone X have already been introduced to facial recognition technology. However, the Face ID biometric solution developed by Apple was heavily criticized in China in late 2017 because of its inability to differentiate between individual Chinese faces. 

Of course, other signatures via the human body also exist, such as fingerprints, iris scans, voice recognition, digitization of veins in the palm, and behavioral measurements. 

What are the recognition features?

When it comes to pattern recognition, a face identification system must store hundreds of thousands of feature vectors. The number of vectors depends on the number of users. When identifying a person, the system not only detects the presence of a face in the frame but also isolates that face and matches it to a specific user.

For example, when it comes to face recognition, the system measures such features as the distance between the eyes, the diameter of the eyes, the diameter of the nasal openings, the sharpness of the chin, and many other nuances. There are 128 such features in total.

The point of such painstaking measurements is that simply comparing one face image with another is not enough. Indeed, depending on the lighting, the haircut, or the amount of stubble, the same face can be identified in different ways. Such precise details are what keep the recognition accurate.

The main point of this article is to identify the best way to store all these features, since the more users you have in the system, the more feature vectors you have to store.

Also, we want to draw your attention to the fact that these features may apply not only to the face but also to objects and even sounds. At the same time, storage methods do not depend on what features are in question.

What is challenging in storing feature vectors in the recognition systems?

The difficulty is that the system stores several feature vectors for each user to make recognition more accurate. Therefore, our task is to design a database that can store millions of feature vectors. A further difficulty is that no feature vector will ever be exactly equal to another, so a 100% match is impossible.

In this case, the principle of "as close as possible" identification is used. That is, the system looks for the stored vector that has the minimal deviation from the vector being identified. The deviation is calculated as in the least-squares method, i.e. the sum of the squared differences between the corresponding elements of the two vectors should be minimal.
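To make the "as close as possible" principle more concrete, here is a minimal Python sketch. The function names and the toy three-element vectors are our assumptions for illustration only; a real system would compare vectors of 128 features.

```python
from typing import Dict, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between corresponding elements."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query: Sequence[float],
            stored: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return the key of the stored vector closest to the query, with its distance."""
    best_key = min(stored, key=lambda k: squared_distance(query, stored[k]))
    return best_key, squared_distance(query, stored[best_key])


if __name__ == "__main__":
    # Toy data (hypothetical): two users, three features each.
    users = {"alice": [0.0, 0.0, 1.0], "bob": [1.0, 1.0, 1.0]}
    print(nearest([0.9, 1.0, 1.1], users))
```

In practice, a threshold on the returned distance would probably be needed as well, so that a face that is far from every stored vector is treated as unknown.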

Technically, we face two challenges. The first is the large amount of data, which leads to lengthy processing. The second is the lack of a ready-made search algorithm, since an exact-match lookup cannot be used here.

It is also worth noting that we chose a custom solution to the problem because cloud services would be much more expensive, and technically it is not that difficult to build.

3 best methods of storing features for the fastest facial recognition search

Here we will only talk about SQL storage, i.e. databases like MySQL, PostgreSQL, and MS SQL, since this is where the performance issues lie. An alternative would be NoSQL storage such as Cassandra, Redis, or MongoDB, which can offer higher performance but comes with completely different data structures and metrics. But that's a topic for another article...

Method 1: All features serialized in one text column

All features are stored in a single column: a text field containing a serialized array of 128 features.

With this option, every search has to select all the values in the features column, so each query takes longer as the table fills up.
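Here is a rough sketch of what method 1 could look like. To keep it self-contained, we assume SQLite (via Python's built-in sqlite3 module) instead of MySQL, PostgreSQL, or MS SQL, and JSON as the serialization format; the table and column names are hypothetical.

```python
import json
import sqlite3
from typing import Optional, Sequence, Tuple


def create_table(conn: sqlite3.Connection) -> None:
    # One text column holds the whole serialized vector.
    conn.execute(
        "CREATE TABLE faces_text ("
        "id INTEGER PRIMARY KEY, user_name TEXT NOT NULL, features TEXT NOT NULL)"
    )


def add_vector(conn: sqlite3.Connection, user_name: str,
               vector: Sequence[float]) -> None:
    conn.execute(
        "INSERT INTO faces_text (user_name, features) VALUES (?, ?)",
        (user_name, json.dumps(list(vector))),
    )


def find_nearest(conn: sqlite3.Connection,
                 query: Sequence[float]) -> Optional[Tuple[str, float]]:
    """Fetch every row, decode it, and compare it with the query in Python."""
    best = None
    for user_name, raw in conn.execute("SELECT user_name, features FROM faces_text"):
        stored = json.loads(raw)
        dist = sum((q - s) ** 2 for q, s in zip(query, stored))
        if best is None or dist < best[1]:
            best = (user_name, dist)
    return best


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    add_vector(conn, "alice", [0.0] * 128)
    add_vector(conn, "bob", [1.0] * 128)
    print(find_nearest(conn, [0.9] * 128))
```

The database cannot compute anything on the serialized text, so all rows travel to the application and are parsed on every search.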

Initially, the table contains 200 vectors. We run a loop of 1,000 search queries, and after each query we add a new vector to the database, since this is usually what face search systems do.
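A test loop of this kind can be sketched as follows. This is not our actual benchmark code: the function name, the random test vectors, and the `search` and `insert` callbacks are assumptions made for illustration, and any of the storage methods can be plugged in through the callbacks.

```python
import random
import time
from typing import Callable, List, Sequence


def run_benchmark(search: Callable[[Sequence[float]], object],
                  insert: Callable[[List[float]], object],
                  dim: int = 128, initial: int = 200,
                  queries: int = 1000, seed: int = 0) -> List[float]:
    """Pre-fill the storage, then time each search and add a vector after it."""
    rng = random.Random(seed)

    def random_vector() -> List[float]:
        return [rng.random() for _ in range(dim)]

    for _ in range(initial):
        insert(random_vector())

    timings = []
    for _ in range(queries):
        query = random_vector()
        start = time.perf_counter()
        search(query)
        timings.append(time.perf_counter() - start)
        insert(query)  # the table grows by one vector per query
    return timings


if __name__ == "__main__":
    # Stand-in storage: a plain list scanned linearly.
    store: List[List[float]] = []

    def linear_search(query: Sequence[float]) -> List[float]:
        return min(store, key=lambda v: sum((a - b) ** 2 for a, b in zip(query, v)))

    times = run_benchmark(linear_search, store.append, queries=100)
    print(f"average response time: {sum(times) / len(times):.6f} s")
```

Plotting the returned timings against the query index shows how the response time changes as the table grows.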

[Figure: method 1, distribution of the number of requests by response time (X axis: seconds, Y axis: number of requests).]

Average response time: 5.3s.

As you can see in the following graph, the response time grows linearly as the database grows. That is not very convenient.

[Figure: method 1, response time as the table grows.]

Method 2: One column per each feature

Features are contained in 128 columns (f1, f2, ..., f128). 
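A minimal sketch of method 2 might look like this. We again assume SQLite through Python's sqlite3 module to keep the example runnable on its own; the table name and the idea of computing the distance inside the query are our assumptions, not a description of the exact queries used in the tests.

```python
import sqlite3
from typing import Optional, Sequence, Tuple

DIM = 128
COLUMNS = [f"f{i}" for i in range(1, DIM + 1)]  # f1, f2, ..., f128


def create_table(conn: sqlite3.Connection) -> None:
    column_defs = ", ".join(f"{c} REAL NOT NULL" for c in COLUMNS)
    conn.execute(
        "CREATE TABLE faces_wide ("
        f"id INTEGER PRIMARY KEY, user_name TEXT NOT NULL, {column_defs})"
    )


def add_vector(conn: sqlite3.Connection, user_name: str,
               vector: Sequence[float]) -> None:
    column_list = ", ".join(COLUMNS)
    placeholders = ", ".join("?" for _ in range(DIM + 1))
    conn.execute(
        f"INSERT INTO faces_wide (user_name, {column_list}) VALUES ({placeholders})",
        (user_name, *vector),
    )


def find_nearest(conn: sqlite3.Connection,
                 query: Sequence[float]) -> Optional[Tuple[str, float]]:
    """Let the database compute the sum of squared differences per row."""
    distance = " + ".join(f"({c} - ?) * ({c} - ?)" for c in COLUMNS)
    params = [value for q in query for value in (q, q)]  # each value is used twice
    sql = (
        f"SELECT user_name, {distance} AS dist "
        "FROM faces_wide ORDER BY dist LIMIT 1"
    )
    return conn.execute(sql, params).fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    add_vector(conn, "alice", [0.0] * DIM)
    add_vector(conn, "bob", [1.0] * DIM)
    print(find_nearest(conn, [0.9] * DIM))
```

Because every feature is a numeric column, the distance can be computed on the database side and only the best row is returned to the application.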

[Figure: method 2, distribution of the number of requests by response time.]

Average response time: 2.8s.

As you can see in the next graph, the response time does not change as the table grows.

[Figure: method 2, response time as the table grows.]

P.S. The response time for getting the features from a photo is about 1 s.

Method 3: One-to-many related tables

Vectors are stored in two tables connected by a one-to-many relationship. Table 1 stores only the vectors themselves, one row per vector with no features in it, and table 2 stores all features of all vectors (one row per feature).
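The two-table layout can be sketched like this. As before, SQLite via Python's sqlite3 module is assumed for the sake of a runnable example, and the table names, column names, and the temporary table that holds the query vector are all hypothetical.

```python
import sqlite3
from typing import Optional, Sequence, Tuple


def create_tables(conn: sqlite3.Connection) -> None:
    # Table 1: one row per vector. Table 2: one row per feature.
    conn.execute(
        "CREATE TABLE vectors (id INTEGER PRIMARY KEY, user_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE features ("
        "vector_id INTEGER NOT NULL REFERENCES vectors(id), "
        "position INTEGER NOT NULL, value REAL NOT NULL, "
        "PRIMARY KEY (vector_id, position))"
    )


def add_vector(conn: sqlite3.Connection, user_name: str,
               vector: Sequence[float]) -> None:
    cursor = conn.execute("INSERT INTO vectors (user_name) VALUES (?)", (user_name,))
    vector_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO features (vector_id, position, value) VALUES (?, ?, ?)",
        ((vector_id, i, v) for i, v in enumerate(vector, start=1)),
    )


def find_nearest(conn: sqlite3.Connection,
                 query: Sequence[float]) -> Optional[Tuple[str, float]]:
    """Join the stored features with the query vector and sum per vector."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS query_features ("
        "position INTEGER PRIMARY KEY, value REAL NOT NULL)"
    )
    conn.execute("DELETE FROM query_features")
    conn.executemany(
        "INSERT INTO query_features (position, value) VALUES (?, ?)",
        enumerate(query, start=1),
    )
    sql = (
        "SELECT v.user_name, "
        "SUM((f.value - q.value) * (f.value - q.value)) AS dist "
        "FROM features AS f "
        "JOIN query_features AS q ON q.position = f.position "
        "JOIN vectors AS v ON v.id = f.vector_id "
        "GROUP BY v.id, v.user_name ORDER BY dist LIMIT 1"
    )
    return conn.execute(sql).fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_tables(conn)
    add_vector(conn, "alice", [0.0] * 128)
    add_vector(conn, "bob", [1.0] * 128)
    print(find_nearest(conn, [0.9] * 128))
```

Note that with this layout every stored vector adds 128 rows to the features table, so the join and the grouping have more rows to process as the database grows.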

[Figure: method 3, response times.]

As you can see, the third method is much faster than the first. However, its response time trends upward as the table grows, which means much worse results for us on a large database.

Method 2 turned out to be the most workable. Don't be put off by having to create 128 columns for 128 features: it may look scary, but in terms of performance it is far better than methods 1 and 3. Storing data in tables this way is not typical and is not usually recommended, but this is exactly the case when one should step back from "architectural beauty" for the sake of high performance.

Our experience 

In one of our custom software projects for a retail market, we used a face tracker to identify regular customers.

Also, a similar system was used for another project to identify people who came to the store after seeing a certain ad.

Final thoughts

Identification systems based on faces, sounds, and other parameters are gaining more and more popularity every year. Accordingly, the question of how to store large numbers of feature vectors in recognition systems is also being raised more and more often.

This task is quite solvable and, unlike cloud services, does not require large expenses, but you do have to work on reducing the response time of the database. If your SaaS project needs this kind of custom development service, our team is ready to solve this issue! We are waiting for your applications in the form below.
