Building a pipeline to solicit expert knowledge from the community to aid gene summary curation

Cited by: 4
Authors
Antonazzo, Giulia [1]
Urbano, Jose M. [1]
Marygold, Steven J. [1]
Millburn, Gillian H. [1]
Brown, Nicholas H. [1]
Affiliations
[1] Univ Cambridge, Dept Physiol Dev & Neurosci, Downing St, Cambridge CB2 3DY, England
Source
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2020
Funding
UK Medical Research Council; US National Institutes of Health
Keywords
ONTOLOGY
DOI
10.1093/database/baz152
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Brief summaries describing the function of each gene's product(s) are of great value to the research community, especially when interpreting genome-wide studies that reveal changes to hundreds of genes. However, manually writing such summaries, even for a single species, is a daunting task; for example, the Drosophila melanogaster genome contains almost 14 000 protein-coding genes. One solution is to use computational methods to generate summaries, but this often fails to capture the key functions or express them eloquently. Here, we describe how we solicited help from the research community to generate manually written summaries of D. melanogaster gene function. Based on the data within the FlyBase database, we developed a computational pipeline to identify researchers who have worked extensively on each gene. We e-mailed these researchers to ask them to draft a brief summary of the main function(s) of the gene's product, which we edited for consistency to produce a 'gene snapshot'. This approach yielded 1800 gene snapshot submissions within a 3-month period. We discuss the general utility of this strategy for other databases that capture data from the research literature.
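The abstract does not say how the pipeline ranks researchers, so the following is only a minimal sketch of one plausible approach: count, per gene, how many papers each author has that are linked to that gene, and keep the most frequent authors as candidates to contact. The function name `rank_authors_by_gene`, the input dictionaries, and the toy data are all hypothetical and are not taken from FlyBase or from the paper.

```python
from collections import Counter, defaultdict


def rank_authors_by_gene(paper_genes, paper_authors, top_n=3):
    """Return, for each gene, the top_n authors by number of linked papers.

    paper_genes:   dict mapping a paper ID to the genes it is linked to
    paper_authors: dict mapping a paper ID to its list of authors
    Both structures are assumptions made for this illustration.
    """
    counts = defaultdict(Counter)
    for paper_id, genes in paper_genes.items():
        # Use a set so an author is counted once per paper.
        authors = set(paper_authors.get(paper_id, []))
        for gene in genes:
            counts[gene].update(authors)

    ranked = {}
    for gene, counter in counts.items():
        # Sort by descending paper count, then by name for a stable order.
        ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        ranked[gene] = ordered[:top_n]
    return ranked


# Hypothetical toy data, for illustration only.
paper_genes = {"P1": ["geneA"], "P2": ["geneA", "geneB"], "P3": ["geneA"]}
paper_authors = {
    "P1": ["Author X", "Author Y"],
    "P2": ["Author X"],
    "P3": ["Author X", "Author Z"],
}
candidates = rank_authors_by_gene(paper_genes, paper_authors, top_n=2)
```

A real pipeline would probably also need to disambiguate author names, weight recent or gene-focused papers, and find current e-mail addresses. None of that is attempted here.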
Pages: 10