Purpose Claims databases are an important source for pharmacoepidemiological studies, although they often lack information on some confounders. Two-phase methodology was used to estimate the bleeding risk in patients treated with phenprocoumon from claims data combined with additional information on body mass index (BMI) and smoking. Methods We conducted a nested case-control study using claims data from 2004 to 2007 (phase 1). Additional information was obtained from interviews in a subset of 505 insurants (phase 2). Adjusted odds ratios (ORs) for bleeding were calculated by logistic regression on the complete case-control dataset. Furthermore, a two-phase analysis was conducted that took into account the phase 2 data on BMI and smoking. Results The phase 1 sample included 1248 cases and 24 960 controls. In phase 1, we observed an adjusted bleeding OR of 3.93 (95% CI: 2.75-5.61) for male subjects aged 55 years taking phenprocoumon. The bleeding risk associated with phenprocoumon use decreased with increasing age. The two-phase analysis identified smoking and a high BMI as risk factors for bleeding. The OR for phenprocoumon obtained from the two-phase analysis was similar in size to the phase 1 estimate. Discussion Phase 2 data added valuable information on smoking and BMI. However, phase 1 results did not change dramatically after accounting for phase 2 information, which is reassuring for the validity of database studies. Copyright (C) 2012 John Wiley & Sons, Ltd.
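Illustrative note on the reported estimate. An OR from logistic regression is the exponentiated regression coefficient, and its confidence interval is usually built on the log-odds scale. The sketch below assumes a Wald-type 95% interval (an assumption; the abstract does not state how the interval was computed) and uses only the reported phase 1 figures (OR 3.93, 95% CI 2.75-5.61). It back-calculates the implied standard error of the log OR and checks that the reported interval is reproduced. The function names are hypothetical helpers, not part of the study's analysis.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Implied standard error of log(OR), assuming a Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def wald_ci(odds_ratio: float, se: float, z: float = Z_95) -> tuple[float, float]:
    """Wald interval for an OR, computed on the log scale and exponentiated."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Figures reported in the abstract (phase 1, male subjects aged 55 years).
reported_or, reported_lower, reported_upper = 3.93, 2.75, 5.61

se_log_or = se_from_ci(reported_lower, reported_upper)
ci_lower, ci_upper = wald_ci(reported_or, se_log_or)

print(f"implied SE of log(OR): {se_log_or:.3f}")
print(f"reconstructed 95% CI: {ci_lower:.2f}-{ci_upper:.2f}")
```

The reconstructed bounds agree with the reported interval to within rounding, which is what one would expect if the interval is symmetric around the log OR. The check does not depend on the covariates in the model, only on the reported point estimate and bounds.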