Background
Large databases are increasingly used to examine the epidemiology and outcomes of digestive and liver disorders. The complexity and rigor of the methods used to conduct these studies are often underestimated.

Aims
For the most commonly used databases, we provide a brief description of the contents, highlight strengths and weaknesses, and provide links for more detailed information. We also present a systematic approach to utilizing large databases for addressing research questions, highlighting commonly encountered study design issues, as well as strategies for resolving these issues.

Conclusions
1. Research using large databases requires the same essential skills needed to conduct research studies using other data sources. These include a rigorous study design, expertise in analytic methods, and relevant research questions.
2. The completeness and accuracy of information contained in the database must be assessed. Methods for improving the quality and completeness of this information should be considered.
3. Despite similarities among large databases, gaining insight into, and experience with, the structure and content of each database is essential.

· Large databases can be a powerful source of information to examine the clinical epidemiology and outcomes of digestive and liver disorders.
· Research using large databases requires the same essential skills needed to conduct research studies using other data sources. These include a rigorous study design, expertise in analytic methods, and relevant research questions.
· The completeness and accuracy of information contained in the database must be assessed. Methods for improving the quality and completeness of this information should be considered.
· Despite similarities among large databases, gaining insight into, and experience with, the structure and content of each database is essential.
· Examples of commonly used large databases are presented with a synopsis of information contained in the database, as well as strengths and limitations of using the database for research.