Exploring the unknown – Sequencing the CHO mitochondrial genomic landscape

Recently, we published an article in Metabolic Engineering that revealed widespread genetic heteroplasmy within the mitochondrial genome of Chinese hamster ovary (CHO) cells. In other words, the copies of mitochondrial DNA (mtDNA) within a cell are not necessarily identical. For the last 30 years, CHO cells have been asked to make hundreds of protein-based therapeutic agents to treat diseases such as cancer and arthritis across the global population, and to do so very safely. This is a very energy-demanding process for the cell. Yet although the CHO and Chinese hamster nuclear genomes have been sequenced in recent years, the mitochondrial genome has remained unexplored. Given that the mitochondrion is the powerhouse of the cell, we decided to boldly go into uncharted territory and map the mitochondrial genomic landscape of CHO cells using next-generation sequencing technology.

First, let’s go back in time…


The 15-year Human Genome Project, which cost a massive €3 billion, has been one of the greatest milestones of the century, unveiling genetic information that can now be used for personalised medicines and targeted therapies in the clinic. Advances in sequencing technology over the last decade have ushered in the next generation of sequencing platforms, such as Illumina, bringing the cost of sequencing an entire genome down from ~€3 billion to ~€1,000. This has widespread implications for academic research groups: a once-expensive technology can now be accessed for an array of projects in every corner of academia. Whole genome sequencing has now moved beyond the 1000 Genomes Project and on to the ambitious 100,000 Genomes Project.


This is a large-scale study that seeks to sequence the genomes of ~70,000 patients in the UK National Health Service and to identify novel recurrent genetic abnormalities associated with a plethora of diseases, which could then be used for both diagnosis and treatment.

Image sourced from the National Human Genome Research Institute

Biopharmaceutical drug production

A primary focus of research here at the National Institute for Cellular Biotechnology (NICB) is the biopharmaceutical production of recombinant therapeutic proteins. The CHO cell is the primary mammalian cell line used by the biopharmaceutical industry for this purpose. It is heralded as the “workhorse” of biopharma because it can produce high-quality biologics with post-translational modifications (PTMs) similar to those found in humans. These modifications ensure a fully functional and bioactive protein drug for the treatment of diseases such as cancer and rheumatoid arthritis.


The various parental CHO cell lines used today, such as CHO-K1 and CHO-S, were originally isolated in 1957 from the ovary of a Chinese hamster. Today, over 70% of recombinant therapeutic monoclonal antibodies are produced in these cell lines. A long and rigorous cell line development process of ~12 months is required to generate a single clone that exhibits and maintains desirable bioprocess characteristics: growing fast, reaching a high cell density, surviving for a long time in culture and producing large amounts of protein per cell.

Unstable genetics – A double-edged sword

CHO cells are genetically unstable by nature, and the genetic plasticity that comes with this instability is a desirable feature for genetic engineering. The instability was highlighted in a previous study by Lewis and colleagues, which revealed more than 4 million mutations across 6 CHO cell lines and 3 lineages, in addition to sequence variation that arose during the cell line development process alone. Understanding the genomic landscape of CHO has given insights into genetic variation in genes associated with bioprocess-relevant phenotypes such as apoptosis and cell growth. A better understanding of the genetic variations that could affect CHO cell behaviour within the bioprocess could lead to better selection criteria, favouring CHO cells that are genetically predisposed to perform well in the bioreactor. On the other hand, unpredictable genetic heterogeneity can arise among multiple CHO cell clones derived from the same parental line. Mutations can occur throughout the cell line development (CLD) process, and these often result in producer cell lines with sub-optimal performance characteristics.

All of the above means that developing a CHO cell line that reliably produces a therapeutic product, and that meets the stringent safety regulations of regional drug authorities such as the Food and Drug Administration (FDA) or the European Medicines Agency (EMA), is very costly and time-consuming. At the end of this process, the results may still not be desirable. Although production yields of 2-5 g/L are common in the bioprocess, the high-throughput development of new and sophisticated therapies means that existing production cell lines may not produce these therapeutic proteins as efficiently. The CHO group at the NICB harnesses the power of genetic engineering to modify CHO cell lines using a variety of molecular tools, such as CRISPR-Cas, microRNAs (DECOY-7) and inducible expression systems, to enhance their performance within the bioprocess and ultimately boost drug production.


Multiple Ploidy Disorder

Paramount to the efficiency of production CHO cell lines is a balanced diet: the right amount of critical nutrients in the culture media to allow the cells to grow fast, survive longer and ultimately produce large quantities of therapeutic protein. Media development, process optimisation and a basic understanding of CHO cell metabolism have paid the largest dividends in advancing protein drug production. At the centre of all this, on a cellular level, is the mitochondrion, an organelle that is the powerhouse of the cell and generates the majority of cellular energy through a process called respiration. Studies in our own group have shown that enhanced mitochondrial activity is associated with increased recombinant protein production when microRNA-23 (miR-23) is stably depleted using a microRNA sponge.

What makes the mitochondrion unique is that, unlike the other organelles of an animal cell, it carries its own genetic material in the form of a small circular DNA plasmid. The mitochondrial plasmid encodes 37 genes that play a critical role in mitochondrial function and cellular respiration. Interestingly, a single mitochondrion is genetically polyploid, which means that numerous copies of the plasmid DNA exist in the same space. Additionally, the mitochondrial genome is 10 times more susceptible to DNA damage than the nuclear genome, which can usher in a variety of mutations that impede mitochondrial activity. In humans, >250 variants have been associated with metabolic disorders and disease, so it’s not surprising that genetic mutations in the mitochondrion could affect CHO cell behaviour.

Image sourced from Optimal Living Dynamics 

Another fascinating thing about mitochondria is that genetic mutations are quite common and widespread throughout nature, and in many cases are not harmful to the organism. This is due, in part, to a phenomenon called heteroplasmy. Mitochondrial heteroplasmy describes the situation where each mitochondrion contains multiple copies of DNA, and some or all of those copies carry a small genetic difference. When a mutation that leads to a dysfunctional protein is present in only a small subset of mtDNA copies, it often goes unnoticed because the wild-type copies prevail. However, there is only so much a cell can take. As the number of mutated copies increases, a biochemical threshold is reached beyond which normal cellular metabolism cannot be maintained.

Image sourced from Stewart and Chinnery, 2015
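
The threshold idea described above can be sketched in a few lines of Python. This is only an illustration, not the method used in our study: the function names are made up for this post, and the threshold value of 0.6 is a hypothetical placeholder, since the real tipping point depends on the mutation and the cell type.

```python
def heteroplasmy_fraction(mutant_copies: int, wild_type_copies: int) -> float:
    """Return the fraction of mtDNA copies that carry the variant (0.0 to 1.0)."""
    total = mutant_copies + wild_type_copies
    if total == 0:
        raise ValueError("a cell must contain at least one mtDNA copy")
    return mutant_copies / total


def exceeds_threshold(mutant_copies: int, wild_type_copies: int,
                      threshold: float = 0.6) -> bool:
    """Check whether the mutant load has reached the biochemical threshold.

    The default threshold of 0.6 is an assumption chosen for illustration only.
    """
    return heteroplasmy_fraction(mutant_copies, wild_type_copies) >= threshold


# A cell with 200 mutant and 800 wild-type copies: wild-type copies prevail.
print(heteroplasmy_fraction(200, 800), exceeds_threshold(200, 800))

# A cell with 700 mutant and 300 wild-type copies: past the assumed threshold.
print(heteroplasmy_fraction(700, 300), exceeds_threshold(700, 300))
```

With the same mutation present in both cells, only the second one would be expected to show a metabolic defect under this assumed threshold.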

Heteroplasmic variation in the mitochondrial genome could add to the nuclear genomic instability that already contributes to the unpredictable behaviour of CHO cells during development. We therefore sought to explore the genomic landscape of the CHO cell mitochondrial genome using next-generation deep sequencing. To get a good picture of this, we sequenced the mitochondrial genome of 22 CHO cell lines in collaboration with Dr. Colin Clarke from the National Institute for Bioprocessing Research and Training (NIBRT). This panel of 22 cell lines was quite expansive, covering publicly available parental lines such as CHO-K1 and CHO-S, recombinant protein-producing cell lines including CHO-DP12, industrially developed production clones from our partners at Biogen in Boston, Massachusetts, our own in-house transgenic cells and a family of clones all derived from two cell line development programmes. As a first step, we built a reliable reference sequence for Cricetulus griseus by sequencing the mitochondrial genome of a liver tissue sample taken directly from an outbred Chinese hamster, gifted to us by Prof. Michael Betenbaugh from Johns Hopkins University.

Across the 22 CHO cell lines that were sequenced, heteroplasmy was widespread within the mitochondrial genome. Every cell line possessed a mutation that changed the amino acid sequence of a protein-coding gene.

Furthermore, various transfer RNAs (tRNAs) were found to have heteroplasmic variants, which could have huge repercussions for the ability of the mitochondria to efficiently translate their own protein-coding genes. CHO cell metabolism and nutrient requirements have been a central theme of biopharmaceutical research. Until now, however, the genomic architecture of the CHO mitochondrion has remained unexplored. Our work, recently published in Metabolic Engineering, details the genetic variability in the CHO cell mitochondrial genome. By understanding the level of heterogeneity in CHO cells, the rate of its progression and its potential impact on CHO cell performance, it could be possible to predict and select for clones whose production attributes and metabolic programmes are best suited to producing high-quality therapeutic proteins.