archaea and crispr biology
issue: archaea
08 august 2017 article
the crispr-cas system is an adaptive immune system encoded in prokaryotes to defend against invasion by foreign genetic elements. current research data indicate that these immune systems are prevalent in archaea, the third domain of life. however, this prevalence probably reflects the fact that many of the current archaeal model organisms co-exist with a wide variety of viruses and are therefore enriched for antiviral immunity. furthermore, an additional layer of complexity has recently been discovered: crispr functionality is modulated by a widespread class of proteins named cas accessory proteins. for these reasons, archaeal organisms provide unique resources for investigations to uncover the diversity and complexity of the immune system.
crispr-cas as an anti-viral weapon in prokaryotes
the clustered regularly interspaced short palindromic repeats (crispr) and crispr-associated (cas) system encodes an adaptive immunity in prokaryotes that defends against invasive genetic elements, including viruses and plasmids. the system is composed of crispr loci and cas gene cassettes. the former contain repetitive sequences that are interrupted by unique dna sequences (spacers) derived from invasive genetic elements, which together represent a memory of past infections; the latter code for proteins with rna-binding, helicase and nuclease domains. the immune system functions in three distinct stages (fig. 1): first, dna segments in foreign genetic elements are acquired as new spacers in crispr loci (adaptation); then, crispr loci are transcribed, yielding precursor crispr rnas (pre-crrnas) that are processed to produce mature crrnas (biogenesis); and finally, crrnas guide cas proteins to specifically target nucleic acids for destruction (interference).
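the three stages can be illustrated with a deliberately simplified python sketch. it is a toy model, not a biological simulation: the class `CrisprLocus`, the function `interfere`, the repeat and phage sequences and the 8-nucleotide spacer length are all invented for illustration. it also assumes that a target is recognised by exact string identity (real crrnas recognise targets by base-pairing), that rna can be written in the dna alphabet, and that mature crrnas carry no repeat-derived sequence.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprLocus:
    """Toy CRISPR array: a repeat sequence alternating with acquired spacers."""
    repeat: str
    spacers: list = field(default_factory=list)

    def adapt(self, invader_dna, start, length=8):
        """Adaptation: store a segment of invader DNA as a new spacer."""
        spacer = invader_dna[start:start + length]
        self.spacers.append(spacer)
        return spacer

    def transcribe(self):
        """Biogenesis, step 1: transcribe the whole array into one pre-crRNA."""
        return self.repeat + "".join(s + self.repeat for s in self.spacers)

    def process(self, pre_crrna):
        """Biogenesis, step 2: cut at the repeats to release mature crRNAs."""
        return [piece for piece in pre_crrna.split(self.repeat) if piece]


def interfere(crrnas, target_dna):
    """Interference: the target is destroyed if any crRNA matches it."""
    return any(crrna in target_dna for crrna in crrnas)


# Hypothetical sequences, chosen only to make the example readable.
phage = "ATGCCGTACGATTGCAAGCTTGGC"
locus = CrisprLocus(repeat="GTTTTAGA")

# First exposure: no spacer matches, so the cell has no immunity.
naive = interfere(locus.process(locus.transcribe()), phage)

# A surviving cell acquires a spacer from the phage genome.
locus.adapt(phage, start=4)

# Re-infection: the matching crRNA now guides interference.
immune = interfere(locus.process(locus.transcribe()), phage)
print(naive, immune)
```

in this sketch the array is the only state that changes between the first and second exposure, which mirrors the idea that spacers are the memory of the system.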
antiviral immunity was first demonstrated for streptococcus thermophilus, a lactic acid bacterium, in 2007. upon the first exposure to a new bacteriophage, most bacterial cells are killed. however, a small proportion of cells survive the infection and, commonly, one or more dna fragments from the bacteriophage genome are inserted into the chromosomal crispr loci of the host. when the same phage infects again, the bacterium is immune, and this immunity relies on the integrity of the crispr-cas system. investigation of many other crispr-cas systems in archaea and bacteria has revealed that all systems studied function on the same principle. further studies on crispr-cas effector complexes containing crrnas and cas proteins have led to the elucidation of the molecular mechanisms of target dna destruction for each type of crispr-cas system.
fig. 1. basic mechanisms of the three-step antiviral pathway by crispr-cas systems.
striking diversity of crispr-cas systems
the prevalence of the crispr-cas system in prokaryotes allowed the identification of >45 families of cas proteins in 2005, two years before the demonstration of crispr immunity. most cas proteins are not well conserved in amino acid sequence, but they form superfamilies of cas proteins that are structurally and functionally related. nevertheless, type-specific cas proteins have been identified. in 2015, a major effort was made in the crispr community to classify crispr-cas systems based on conservation of cas proteins and the molecular mechanisms involved. this has yielded six main types of crispr-cas systems, belonging to two main classes: those of class 1 require multiple cas proteins for interference whereas those of class 2 use a single cas protein for antiviral immunity. each type of crispr-cas system has a signature cas protein, which is type-specific. for example, the signature cas proteins for the three classic types of crispr-cas systems (types i, ii and iii) are cas3, cas9 and cas10, respectively. each type is further divided into subtypes. it is estimated that about 80% of archaea and about 40% of bacteria contain at least one crispr-cas system. since only a very small fraction of prokaryotes are known, the true diversity of crispr-cas systems is probably far greater than currently recognised. indeed, a recent investigation using a metagenomic approach has led to the identification of several novel crispr-cas systems.
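the idea of a type-specific signature protein can be expressed as a small lookup table. the python sketch below covers only the three classic types named above; the names `CrisprType`, `CLASSIC_TYPES`, `predict_types` and `effector_kind` are hypothetical, and the class assignments for types i and iii follow the 2015 classification rather than anything stated in this article. real classification also weighs locus architecture and other cas genes, so this is an illustration of the principle rather than a usable classifier.

```python
from typing import NamedTuple


class CrisprType(NamedTuple):
    crispr_class: int  # 1 = multi-protein effector, 2 = single-protein effector
    signature: str     # type-specific signature cas protein


# Only the three classic types are listed; class numbers are an added assumption
# taken from the 2015 classification.
CLASSIC_TYPES = {
    "I": CrisprType(1, "cas3"),
    "II": CrisprType(2, "cas9"),
    "III": CrisprType(1, "cas10"),
}


def predict_types(cas_genes):
    """Return the classic types whose signature gene occurs in a gene list."""
    present = {gene.lower() for gene in cas_genes}
    return [name for name, info in CLASSIC_TYPES.items() if info.signature in present]


def effector_kind(type_name):
    """Describe the interference machinery implied by the class of a type."""
    info = CLASSIC_TYPES[type_name]
    return "multiple cas proteins" if info.crispr_class == 1 else "a single cas protein"


# A hypothetical genome annotated with two signature genes carries two system types.
print(predict_types(["cas1", "cas2", "cas3", "cas6", "cas10"]))
```

a genome that returns more than one type here corresponds to the situation described for many archaea, where several crispr-cas systems co-exist in a single cell.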
in addition, some small archaeal plasmids carry a minimal crispr locus where no cas genes are identified. nevertheless, spacers in the plasmid minimal crispr arrays match some viruses, suggesting that these plasmids could have developed a strategy to hijack the host crispr-cas systems to silence virus infection.
the essence of uneven distribution of crispr-cas systems in archaea and bacteria
the huge diversity of crispr-cas systems raises the question of how the systems evolve. a crispr classification study found that crispr-cas systems show a biased distribution in archaea and bacteria. whereas type i crispr-cas systems are abundant in both prokaryotic domains, all known class 2 crispr-cas systems are from bacteria, although a few uncommon class 2 systems have been predicted in archaea, including a type v system from the euryarchaeon ‘candidatus methanomethylophilus alvus’ and two type ii systems from uncultivated nanoarchaea. on the other hand, archaea possess many more type iii systems than bacteria. for historical reasons, most known archaea are extremophiles, in which crispr-cas systems are prevalent. in particular, all known extremely thermophilic archaea carry more than one crispr-cas system. the same is largely true for thermophilic bacteria. this suggests that crispr-cas systems may have additional functions that are important for certain physiological groups of organisms, such as thermophiles. interestingly, crispr-cas systems are absent from thaumarchaea and several bacterial taxa, further arguing for co-evolution between crispr-cas systems and their archaeal and bacterial hosts. thus, the apparent prevalence of crispr-cas systems in archaea may reflect the fact that known archaea are dominated by those containing crispr-cas systems. possibly, more crispr-lacking phyla remain to be identified in archaea.
another possible reason for the archaeal prevalence of crispr-cas systems is the occurrence of highly diverse archaeal viruses that infect the archaeal model organisms. the arms race between archaea and their diverse viruses may therefore account for the presence of multiple diverse crispr-cas systems in a single cell. in this respect, archaea and their crispr-cas systems provide excellent resources for further studying crispr-cas systems and their biological functions.
cas accessory proteins as modulators of crispr functionality
the complexity of crispr biology has been further increased by the identification of a new class of crispr-related proteins termed ‘cas accessory proteins’. their encoding genes are often clustered together with cas genes, but they also appear in other genomic environments. some of them are implicated in adaptation and others in interference. they are probably not essential for the three-step process of crispr immunity, but may modulate the functionality of the crispr-cas system. many of these proteins contain a carf (crispr-associated rossmann fold) domain, and they constitute the most abundant protein superfamily associated with the crispr system. cas accessory proteins belonging to the csx1/csm6 superfamily are probably among the most interesting. they are carf domain ribonucleases, usually associated with archaeal and bacterial type iii crispr-cas systems, which mediate transcription-dependent dna interference. since these systems require a cognate target rna to activate dna interference, the carf ribonuclease may modulate crispr immunity by degrading viral transcripts. the mechanisms involved are one of the main focuses of crispr biology research, for which several archaea provide good models for investigation.
development of crispr biotechnology
in 2012, cas9-crrna complexes were tested as a programmable endonuclease for genome editing, and the principle was soon applied to genome editing of human cell lines and mouse models. this method was termed crispr technology simply because it was developed from the crispr immune principle. to date, the technology has been extended to transcription regulation, genome imaging and epigenetic regulation. it can also be applied on a genome-wide scale to assay gene functions. focused research in crispr biology and biotechnology will greatly increase our understanding of these unique, prokaryotic adaptive immune systems, and facilitate crispr applications for years to come.
qunxin she & wenyuan han
university of copenhagen, ole maaloes vej 5, dk-2200 copenhagen n, denmark
further reading
burstein, d. & others (2017). new crispr-cas systems from uncultivated microbes. nature 542, 237–241.
makarova, k. s. & others (2015). an updated evolutionary classification of crispr-cas systems. nat rev microbiol 13, 722–736.
mohanraju, p. & others (2016). diverse evolutionary roots and mechanistic variations of the crispr-cas systems. science 353, aad5147.
tamulaitis, g. & others (2017). type iii crispr-cas immunity: major differences brushed aside. trends microbiol 25, 49–61.
images: fig. 1. jennifer doudna, hhmi/uc berkeley. computer model showing the crispr-cas silencing cmr subunits bound to rna (cyan) and dna (red). a number of cmr atoms have been removed in order to show rna and dna. laguna design/science photo library.