A SOLUTION TO THE PROBLEM OF LINKING MULTIVARIATE DOCUMENTS

Cited by: 14
Author
DUBOIS, NSD
Affiliation
[1] University of California, Los Angeles
DOI
10.1080/01621459.1969.10500961
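Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract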
In many scientific investigations, it is desired to bring together, or link, two or more documents that represent the same individual, even though these documents do not contain a unique identifier and were derived from different sources. In medical and public health research and elsewhere, this is known as the document linkage problem. This paper considers some aspects of classifying pairs of documents into one of two populations when their items are identifying information and each item can take on one of three distinct values: correct, incorrect, or missing. Section 1 identifies three document linkage problems. Sections 2 and 3 deal with the mathematical formulation of the multivariate document linkage problem. Section 4 gives the classification procedure, and Section 5 applies the theory to a problem in the field of public health. © Taylor & Francis Group, LLC.
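The setup described in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes the items are conditionally independent and uses a plain log-likelihood-ratio rule, which is a swapped-in simplification. The field names, the probability tables `P_SAME` and `P_DIFF`, and the zero threshold are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical identifying items; the abstract does not name any.
FIELDS = ["surname", "birth_year", "sex"]

# Hypothetical outcome probabilities per item (assumed, not from the paper).
# P_SAME: the two documents represent the same individual.
# P_DIFF: the two documents represent different individuals.
P_SAME = [{"correct": 0.90, "incorrect": 0.05, "missing": 0.05} for _ in FIELDS]
P_DIFF = [{"correct": 0.05, "incorrect": 0.90, "missing": 0.05} for _ in FIELDS]


def compare(doc_a, doc_b, fields):
    """Reduce a pair of documents to one three-valued outcome per item."""
    outcome = []
    for field in fields:
        a, b = doc_a.get(field), doc_b.get(field)
        if a is None or b is None:
            outcome.append("missing")
        elif a == b:
            outcome.append("correct")
        else:
            outcome.append("incorrect")
    return outcome


def log_likelihood_ratio(outcome, p_same, p_diff):
    """Sum per-item log ratios, assuming items are conditionally independent."""
    return sum(
        math.log(p_same[i][value] / p_diff[i][value])
        for i, value in enumerate(outcome)
    )


def classify(outcome, p_same, p_diff, threshold=0.0):
    """Assign the pair to one of the two populations (threshold is assumed)."""
    llr = log_likelihood_ratio(outcome, p_same, p_diff)
    return "same" if llr > threshold else "different"


if __name__ == "__main__":
    a = {"surname": "SMITH", "birth_year": 1930, "sex": None}
    b = {"surname": "SMITH", "birth_year": 1931, "sex": "F"}
    outcome = compare(a, b, FIELDS)
    print(outcome, classify(outcome, P_SAME, P_DIFF))
```

In this sketch a missing item contributes nothing to the ratio because it is given the same probability in both populations. That is a choice made for the illustration, not a claim about how the paper treats missing items.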
Pages: 163 / &