behind the research of 'nctc3000: a century of bacterial strain collecting leads to a rich genomic data resource'
posted on june 9, 2023 by jo dicks
jo dicks takes us behind the scenes of her latest publication 'nctc3000: a century of bacterial strain collecting leads to a rich genomic data resource' published in microbial genomics.
my name is jo dicks. i’m lead bioinformatician within culture collections at the uk health security agency. i originally studied maths at university but became fascinated by genetics during a master’s degree in statistics. i’ve studied and worked in biological organisations ever since my msc project. that was my first tentative step into bioinformatics, which i’m not even sure had a name then. thirty years on, it’s a thriving discipline.
i’ve enjoyed working with biological collections throughout my career. i have vivid memories of visiting the international rice research institute (irri) in the philippines towards the end of the 1990s, learning about the international crop germplasm collections and how molecular markers were revolutionising their analysis and development. this was a time when we had a handful of microbial genome sequences and very little else. i remember wondering if we would be able to sequence the genomes of the organisms in our collections while i was still working. fast forward to today and we’re doing precisely that, whatever the organism. my research involves adding value to the ukhsa culture collections through analysis of the constituent genomes, either in-house or in collaboration with other scientists.
we’ve just published a description of the nctc3000 project in microbial genomics. this project was a very ambitious effort to long-read sequence, assemble and annotate the genomes of approximately half of the 6,000 bacterial strains within the national collection of type cultures (nctc). it was a wellcome-funded collaboration between what was then public health england and the wellcome sanger institute, with significant additional support from pacific biosciences. the project has been a group effort involving a huge number of individuals. as well as the scientific work presented in the paper, so many people in the operational teams supporting the day-to-day work of the collection and the partner institutions were involved too. i was lucky enough to join the team in the latter stages of the project, working on dataset validation and analysis, and developing the paper with my colleagues. it’s now around ten years since the project began, which gives you some idea of what a monumental effort it has been.
the nctc has a wonderful history and in 2020 celebrated its centenary. its 6,000 strains represent a unique and irreplaceable snapshot of the bacterial history of the last century. scientists around the globe use nctc strains in their laboratories every day. the nctc3000 datasets are beginning to explain many of the unique properties of individual strains and are also revealing previously unknown facts about them. we hope the scientific community will find this information useful and will be able to use the strains and the datasets in new ways, as well as the old ones.
i’m lucky to be able to work from home in a computer-based job. most days will be a mixture of teams meetings with colleagues, data analysis, reading papers and writing reports of one kind or another. i always enjoy the teams meetings, not only because i’m a remote worker but because i have genuinely great colleagues. and there are always a lot of emails, aren’t there?
i enjoy working with my colleagues. in culture collections, there are so many different types of jobs and types of people, so there are lots of opportunities for interaction. and now that i’ve been at ukhsa for a while, i’ve recently started having ukhsa-based students. helping early career scientists learn and develop their careers has always been a particular joy for me. and i love data analysis. there’s something about a spreadsheet of data, finding the patterns and trends, that calms the mind.