BIOMETRICS PUBLICATIONS
PHILIPPINE SOCIAL SECURITY SYSTEM AFIS BENCHMARK TEST PLAN
I. INTRODUCTION
The purpose of the Social Security Identification System (SSIS) automatic fingerprint identification system (AFIS) benchmark testing described in this document is to obtain performance data to verify that all AFIS performance requirements set forth in the "Request for Proposal" (RFP) can be met by the proposed system. It is our goal that these tests, to the fullest extent possible, be:
1) simple;
2) easy and rapid to perform;
3) easy and rapid to evaluate;
4) scientifically and statistically sound, such that results can be used as benchmarks by the vendors and other user organizations.
Our primary concern with regard to the AFIS system is the trade-off between false non-match errors (operationally allowing the possibility of a duplicate enrollment of a previously enrolled cardholder) and system search and match (S&M) speed, as driven by the fifth-year enrollment schedule given as Figure 1 in the SSIS RFP. This schedule calls for four-finger processing of approximately 0.2 people per second (based on 98% system availability over 365 24-hour days annually) into a system that already holds about 20,000,000 enrollees. The S&M process, involving the search of the new enrollees against the entire database, must be accomplished on a time scale to allow compliance with the 36-hour turnaround required in Section III-C-6 of the RFP, while ensuring no more than a 5% false non-match rate on searches involving cardholders already enrolled in the system. The S&M throughput rates will require some combination of hardware speed and parallelism with clever algorithmic search strategies. It is the goal of this testing to assure that these search strategies do not drive the system non-match error rate over the allowable 5% level.
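As a rough check of these figures, the following sketch works out the annual volume implied by the stated rate and the comparison load of an exhaustive search. The function names are illustrative and do not come from the RFP, and the model of one primary-finger comparison per enrolled record is a simplifying assumption that ignores any binning.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def annual_enrollments(rate_per_second, availability, days=365):
    """People processed per year at the stated rate and availability."""
    return rate_per_second * availability * days * SECONDS_PER_DAY

def exhaustive_comparisons_per_second(rate_per_second, database_size):
    """Comparisons per second if every new enrollee is compared against
    every enrolled record (assumption: one finger, no binning)."""
    return rate_per_second * database_size

yearly = annual_enrollments(0.2, 0.98)
load = exhaustive_comparisons_per_second(0.2, 20_000_000)
print(f"{yearly:,.0f} enrollments per year")
print(f"{load:,.0f} comparisons per second without binning")
```

Under these assumptions the schedule corresponds to roughly six million enrollments per year and several million comparisons per second, which is why hardware parallelism or search-space reduction is expected.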
Simultaneously, we require a false match rate of not more than 0.001 (1/10th of 1%) over all 1:N searches, even as N surpasses 20,000,000 during the fifth year. Given the presence of four fingerprints for almost all users, we feel that this goal will be attainable by most systems. We have set no goals for the performance of the 1:1 "verification" algorithm to be used during some applications of the issued Social Security Identification Card. It is our assumption that the basic 1:1 match algorithm will be the same in both systems and that accept/reject thresholds can be set based on future system policy decisions.
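To illustrate how stringent this requirement is, the sketch below converts the search-level false match rate into a per-comparison budget. It assumes, as a simplification that is not stated in the RFP, that the N comparisons in a search are independent and share one false match probability p, so that a search produces at least one false match with probability 1 - (1 - p)^N. The function names are hypothetical.

```python
import math

def search_false_match_rate(p, n):
    """Probability of at least one false match in n independent comparisons."""
    return -math.expm1(n * math.log1p(-p))

def max_per_comparison_rate(target, n):
    """Largest per-comparison rate that keeps the search-level rate at target."""
    return -math.expm1(math.log1p(-target) / n)

# Budget for a four-finger comparison against each of 20,000,000 records.
p = max_per_comparison_rate(0.001, 20_000_000)
print(f"per-comparison budget: {p:.3e}")
print(f"search-level rate at that budget: {search_false_match_rate(p, 20_000_000):.6f}")
```

The `expm1` and `log1p` functions are used because the probabilities involved are far too small for a direct `1 - (1 - p) ** n` to be numerically reliable.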
As the system false non-match rate is of such operational concern, the RFP specifies (Section III-B-3c) that no "filtering" based on information exogenous to the fingerprint (such as sex or birth year) may be employed, with the exception that prints may be classified as left and right forefingers and thumbs. On the other hand, it is our assumption (although by no means a requirement) that AFIS vendors will extensively employ "binning" techniques based on data endogenous to the fingerprints themselves. We realize that good estimations of the efficiencies of the proposed S&M strategies require an understanding of the bin distributions of our SSS cardholder population. The benchmark tests will attempt to provide an estimate of that distribution.
Under the requirement of no exogenous filtering, the system false match rate will be driven only by physiological, presentation and image device changes. Such changes will create differences in the presented and enrolled (template) fingerprint images, stressing the AFIS algorithms during both the binning and match processes. Exhibit F of the SSIS RFP requires that enrollees with "damaged" fingers reschedule for appointments. Nonetheless, we project that some small percentage of all enrollees will have fingerprint images non-matchable by vendor algorithms.
It is assumed, but by no means required, that the vendors' S&M strategy will be to search first for matches against a primary finger, then confirm any matches against the remaining sample fingers. For this benchmark, the "false rejection rate" of 5% specified in Section III F of the RFP will be the false non-match rate against the primary finger, to be considered the combination of the binning error and 1:1 match algorithm error rates. The false acceptance rate of 0.1%, as given in Section III F, will be interpreted as a false match of all four fingers. The disposition of intermediate combinations of matches and non-matches will be done by a system policy, which may include human intervention, to be determined only after the system is operational. These cases will not be considered in this study. The goal of the AFIS benchmark testing will be to predict binning strategy efficiencies and error rates for both the binning and matching algorithms so that compliance with both the throughput requirement of Figure 1 and the system error requirements of Section III F can be predicted.
II. THE BENCHMARK DATA SETS
All AFIS benchmark testing will be done solely on fingerprint data sets to be provided by the SSIS team. Two data sets will be provided to all "short listed" vendors: a training set, consisting of 4,000 images; and a test set, consisting also of 4,000 images. There is a strong correspondence, although not 1:1, between the fingers imaged in both sets. Particular care has been taken to ensure that all images in the training set are of the best quality reasonably attainable from each volunteer. The test images more closely approximate the quality expected in system operation, but in all cases contain a core. All images are of resolution 512 pixels per inch, approximately 2.5 cm by 2.0 cm in size, 8 bit gray-scale and stored as TIFF files. All images were obtained using scanners of a single model from a single manufacturing production run. The two data sets will be distributed to "short listed" vendors on CD-ROM or other common media to be determined.
III. TECHNICAL PLAN
The vendors will be required to submit to the SSIS team sufficient technical detail of the hardware/software system to allow evaluation of the system capability for meeting the fifth-year throughput requirement. Vendors are required to establish a rationale for their claims regarding compliance with the fifth-year throughput rate. For the purposes of these calculations, the vendor may ignore the processing time required for actual or false matches. Details should include a description of the parallelization and "cold match" (1:1 fingerprint comparison) processing rates of the hardware system. A description of any algorithmic strategies designed to limit the number of cold matches required should also be included, including the proposed use of binning. We do not require disclosure of the technical details determining the assignment of prints to bins, but do ask that the number of bins on each partitioning level, and the nature of any special bins ("search all" or any "soft" search designators), be disclosed so that efficiency gains can be predicted for searches over our population. Vendors using general "soft search" (statistically based) systems are required to disclose the details of the search strategy.
IV. THE BENCHMARK TESTS
Two tests, a binning test and a matching test, will be required of all short listed vendors, each to be conducted using only the 8,000 fingerprint images (4,000 training and 4,000 test) of the benchmark data described in Section II above.
A. THE BINNING TEST
The first test is designed to determine whether the endogenous binning scheme can provide the required increase in search efficiency while meeting the error rate requirements. Vendors will be required to determine and report to us the bin or bins of each fingerprint image in both the training and test files. This data will allow us to determine the distribution of fingerprints among bins and to determine the frequency of binning errors. From the binning scheme and the bin distributions, the expected search depth coefficient (S&M efficiency increase due to binning) will be estimated for our population.
Results must be returned to the SSS on a PC-compatible disk in an ASCII file with the following format:
    training file name,    bin(s)
    ::                     ::
    ::                     ::
    test file name,        bin(s)
    ::                     ::
    ::                     ::
The bin column may include a number or word designator identifying the chosen bin(s) for each file. If multiple bins are named for any file, the vendor must explain in separate documentation whether the bins represent exclusive partitions or multiple searches of all named bins. Any single bin representing multiple searches must likewise be explained. Vendors using "soft" binning schemes may depart from the above format, providing that sufficient statistical information for each file is included for the SSIS team to determine search efficiencies and expected error rates by standard statistical methods. Vendors who do not propose to use any endogenous binning should submit the required ASCII file with "0" placed in the bin column. Such vendors will be deemed to have zero binning errors and zero binning-improved search efficiency.
The SSIS team will evaluate bin distributions, search depth coefficient and error rates for our population. This data will be transmitted to the vendor.
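The kind of evaluation described above might be sketched as follows. This is an illustration under stated assumptions, not the SSIS team's procedure: each file is assumed to carry exactly one exclusive bin label, the search depth is estimated as the sum of squared bin proportions (which presumes searched and enrolled prints share one bin distribution), and the file names, bin labels and list of mated pairs are hypothetical.

```python
from collections import Counter

def parse_bin_report(text):
    """Parse 'file name, bin' lines into a {file name: bin} dict.
    Assumes exactly one bin label per file."""
    bins = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, label = line.split(",", 1)
        bins[name.strip()] = label.strip()
    return bins

def bin_distribution(bins):
    """Fraction of files falling in each bin."""
    counts = Counter(bins.values())
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def search_depth(distribution):
    """Expected fraction of the database searched: sum of squared proportions."""
    return sum(p * p for p in distribution.values())

def binning_error_rate(test_bins, training_bins, mated_pairs):
    """Fraction of mated (test, training) pairs placed in different bins."""
    errors = sum(1 for t, r in mated_pairs if test_bins[t] != training_bins[r])
    return errors / len(mated_pairs)

# Hypothetical reports and ground truth, for illustration only.
training_bins = parse_bin_report("r1.tif, loop\nr2.tif, loop\nr3.tif, whorl\nr4.tif, arch\n")
test_bins = parse_bin_report("t1.tif, loop\nt2.tif, whorl\nt3.tif, whorl\nt4.tif, arch\n")
mated_pairs = [("t1.tif", "r1.tif"), ("t2.tif", "r2.tif"),
               ("t3.tif", "r3.tif"), ("t4.tif", "r4.tif")]

depth = search_depth(bin_distribution(training_bins))
bin_error = binning_error_rate(test_bins, training_bins, mated_pairs)
print(depth, bin_error)
```

A vendor reporting "0" for every file yields a single bin holding all prints, so this estimate gives a search depth of 1 and a binning error of 0, consistent with the treatment of non-binning vendors described above.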
B. THE MATCHING TEST
The second test is designed to determine both the 1:1 false match and false non-match error rates. Vendors will be required to match image files from the test set to image files in the training set. It is highly recommended that NO binning be done during the matching tests to avoid "double jeopardy" with regard to bin errors. Vendors may choose to report either "hard" or "soft" results. "Hard" results represent the vendor's best guess as to which files, if any, in the training set match each file in the test set. Files in the test set found not to match any file in the training set are to be reported as a null match. "Soft" results represent a one-dimensional distance measure between each test file and each of the training files. Results must be returned on a PC-compatible disk as an ASCII file.
Vendors reporting "hard" results must use the following format:
    test file name,    matching training file name,    matching training file name,
    ::                 ::                              ::
    ::                 ::                              ::
Test files not matching any training file should include "NULL" in the second column. The SSIS team will evaluate the number of false matches by counting the number of files listed as "matching training files" which do not indeed come from the same finger as the test file. The false non-matches will be determined by counting the rows in which the actual matching training file(s) are not listed. False match and non-match rates will be established independently for thumb-to-thumb and finger-to-finger matches. Although the vendor will have compared each thumb in the test set to all fingers in the data set, the number of comparisons will be taken to be the number of thumbs in the training set. Finger comparisons will be handled accordingly. This will have the effect of penalizing any finger mistaken for a thumb and vice versa. The error rates determined for use in system calculations will be the average of the thumb and finger error rates, reflecting the composition of the proposed four-finger system.
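The counting rules for "hard" results can be sketched as below. The parser, the ground-truth mapping and the file names are hypothetical, and the sketch adopts one reading of the false non-match rule: a row counts as a false non-match when any actual matching training file is missing from it. Conversion of the counts into separate thumb and finger rates is left out.

```python
def parse_hard_results(text):
    """Parse 'test file, training file, ...' rows into {test file: set of claims}.
    A row whose only claim is NULL yields an empty set."""
    rows = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",") if f.strip()]
        if not fields:
            continue
        test_name, claimed = fields[0], fields[1:]
        rows[test_name] = set() if claimed == ["NULL"] else set(claimed)
    return rows

def score_hard_results(rows, truth):
    """Return (false matches, false non-matches).
    truth maps each test file to the training files from the same finger."""
    false_matches = 0
    false_non_matches = 0
    for test_name, claimed in rows.items():
        mates = truth.get(test_name, set())
        # Every listed file that is not a true mate is one false match.
        false_matches += len(claimed - mates)
        # A row missing any true mate is one false non-match.
        if mates and not mates <= claimed:
            false_non_matches += 1
    return false_matches, false_non_matches

# Hypothetical vendor report and ground truth, for illustration only.
report = "t1.tif, r1.tif,\nt2.tif, r9.tif, r2.tif,\nt3.tif, NULL\n"
truth = {"t1.tif": {"r1.tif"}, "t2.tif": {"r2.tif"}, "t3.tif": {"r3.tif"}}
counts = score_hard_results(parse_hard_results(report), truth)
print(counts)
```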
Vendors choosing to report "soft" results must deliver a 4,000 x 4,000 matrix containing the distances (integer or float with no delimiters between entries except a line delimiter between rows) between test and training files. The columns of the matrix will represent training files in the order supplied and the rows will represent test files in the order supplied. For instance, element (1,1) will represent the distance between the first test file and the first training file, while element (1,2) will represent the distance between the first test file and the second training file.
The SSIS team will evaluate the data to establish the distance threshold giving the lowest possible false non-match rate while meeting the fifth-year false match rate requirements.
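One way such a threshold might be chosen is sketched below. It is a minimal illustration, not the SSIS team's method: a pair is assumed to be declared a match when its distance is at or below the threshold, only observed distances are tried as candidate thresholds, and `max_fmr` stands in for whatever per-comparison false match limit is derived from the fifth-year requirement. The small matrix is made-up data.

```python
def pick_threshold(distances, mated, max_fmr):
    """Return (threshold, fmr, fnmr) with the lowest FNMR whose FMR <= max_fmr,
    or None if no observed distance qualifies.
    distances[i][j]: distance between test file i and training file j.
    mated[i][j]: True when both files come from the same finger."""
    genuine, impostor = [], []
    for d_row, m_row in zip(distances, mated):
        for d, m in zip(d_row, m_row):
            (genuine if m else impostor).append(d)
    best = None
    for t in sorted(set(genuine + impostor)):
        fmr = sum(d <= t for d in impostor) / len(impostor)
        fnmr = sum(d > t for d in genuine) / len(genuine)
        if fmr <= max_fmr and (best is None or fnmr < best[2]):
            best = (t, fmr, fnmr)
    return best

# Made-up 3 x 3 example: mated pairs lie on the diagonal.
distances = [[0.1, 0.9, 0.8],
             [0.7, 0.2, 0.3],
             [0.9, 0.8, 0.6]]
mated = [[i == j for j in range(3)] for i in range(3)]

strict = pick_threshold(distances, mated, max_fmr=0.0)
loose = pick_threshold(distances, mated, max_fmr=0.2)
print(strict, loose)
```

The double loop over thresholds and pairs is quadratic in the number of distinct distances, so a full 4,000 x 4,000 matrix would call for a single sorted sweep instead; the simple form is kept here for readability.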
The SSIS team will determine 1:1 false match and false non-match error rates for our population and transmit these results to the vendor.
V. EVALUATION OF THE AFIS BENCHMARK TEST
Based on the expected search depth coefficient calculated from the binning tests, the binning scheme proposed, and the proposed hardware parallelism and processing rates, the throughput rates of the proposed system will be established at the fifth-year target of 20,000,000 previous enrollees. Systems calculated to be within 10% of the target throughput rate of 0.2 persons per second will be deemed to have passed this test.
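A throughput estimate of this kind might look like the following sketch. The model is an assumption made for illustration: only primary-finger cold matches are counted, each search touches the fraction of the database given by the search depth coefficient, and "within 10% of the target" is read as at least 90% of 0.2 persons per second. The hardware figures are invented.

```python
def achievable_throughput(match_units, matches_per_unit_per_second,
                          database_size, search_depth):
    """Persons per second the hardware can sustain, counting only
    primary-finger cold matches (assumption)."""
    comparisons_per_person = database_size * search_depth
    return match_units * matches_per_unit_per_second / comparisons_per_person

def passes_throughput(rate, target=0.2, tolerance=0.10):
    """One reading of 'within 10% of the target': at least 90% of it."""
    return rate >= target * (1 - tolerance)

# Invented figures: 32 parallel matchers at 50,000 cold matches per second each,
# searching 37.5% of a 20,000,000-record database per enrollee.
rate = achievable_throughput(32, 50_000, 20_000_000, 0.375)
print(f"{rate:.3f} persons per second, pass: {passes_throughput(rate)}")
```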
Using the proposed binning scheme and the bin error rate discovered in the binning tests, the system bin error rate will be established. This system bin error will be added to the 1:1 false non-match rate measured in the matching tests, to estimate the system non-match rate under the assumption that system non-match occurs only when the primary finger is not matched. Systems calculated to be within 10% of the target error rate of 5% will be deemed to have passed this test.
Based upon the expected search depth coefficient, the target database size of 20,000,000 enrollees, and the 1:1 false match rate, the system false match rate will be estimated under the assumption that system false match occurs only when all four prints are falsely matched. Systems calculated to be within 10% of the 0.001 false match rate will be deemed to have passed this test.
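The two error estimates described above can be sketched as follows. The additive non-match rate follows the text; the false match calculation additionally assumes that the four fingers fail independently and that comparisons against different records are independent, and it reads "within 10%" as no more than 110% of the target. None of these readings or the sample values comes from the RFP.

```python
import math

def system_non_match_rate(bin_error_rate, fnmr_1to1):
    """Bin error added to the 1:1 false non-match rate for the primary finger."""
    return bin_error_rate + fnmr_1to1

def system_false_match_rate(fmr_1to1, database_size, search_depth, fingers=4):
    """Probability that at least one searched record falsely matches on all
    fingers (assumption: fingers and records are independent)."""
    per_record = fmr_1to1 ** fingers
    records_searched = database_size * search_depth
    return -math.expm1(records_searched * math.log1p(-per_record))

def passes_error_target(value, target, tolerance=0.10):
    """One reading of 'within 10% of the target': at most 110% of it."""
    return value <= target * (1 + tolerance)

# Invented measurements, for illustration only.
snmr = system_non_match_rate(bin_error_rate=0.02, fnmr_1to1=0.03)
sfmr = system_false_match_rate(fmr_1to1=0.001, database_size=20_000_000,
                               search_depth=0.375)
print(passes_error_target(snmr, 0.05), passes_error_target(sfmr, 0.001))
```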
BENCHMARK TEST SUMMARY
- Prepare detailed hardware system plan showing architecture parallelism and component 1:1 match rates.
- Prepare detailed "binning" plan if S&M efficiencies are required to meet throughput goals.
- Load "training set" TIFF files into AFIS system.
- Report bin(s) for each "training set" image.
- Create feature (minutiae) records for each training image and store without binning.
- Load and pre-process "test set" TIFF files reporting bin(s) for each image.
- Match "test set" images against "training set" database without binning.