Bioinformatics analysis of CHO omics data


The NICB CHO cell research group utilises advanced computational methods to maximise the information extracted from proteomic and gene expression datasets. In recent years, our group has used multivariate statistical and machine learning algorithms to understand the biological mechanisms underlying desirable phenotypes such as rapid cellular growth and high productivity, to develop predictive tools for cell line development, and to integrate data from multiple levels of the biological system.

Examples of our work include:

Predicting cell-specific productivity from CHO gene expression

We developed the first predictive model of productivity in CHO bioprocess culture based on gene expression profiles. The dataset used to construct the model consisted of transcriptomic data from 70 stationary-phase, temperature-shifted CHO production cell line samples for which the cell-specific productivity had been determined. These samples were used to investigate gene expression across a range of high- to low-producing monoclonal antibody and Fc-fusion CHO cell lines. We used a supervised regression algorithm, partial least squares (PLS) incorporating jackknife gene selection, to produce a model of cell-specific productivity (Qp) capable of predicting Qp to within a root mean squared error of 4.44 pg/cell/day in cross model validation (RMSE(CMV)). The final model, consisting of 287 genes, accurately predicted Qp in a further panel of 10 samples that served as an independent validation set. Several of the genes constituting the model are linked with biological processes relevant to protein metabolism.
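To make the idea of a cross-validated prediction error concrete, the following is a minimal sketch of a one-component PLS regression evaluated by leave-one-out cross-validation. It is an illustration only, not the published model: the jackknife gene selection and the full cross model validation scheme are omitted, the function names are our own for this example, and the expression values and Qp figures are invented toy data.

```python
import math


def mean(values):
    return sum(values) / len(values)


def fit_pls1(X, y):
    """Fit a one-component PLS1 model; rows of X are samples, columns are genes."""
    n, p = len(X), len(X[0])
    x_means = [mean([row[j] for row in X]) for j in range(p)]
    y_mean = mean(y)
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]
    # Weight vector: unit-length direction of maximal covariance between genes and Qp.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Sample scores on the single latent component.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Least-squares slope of centred Qp on the scores.
    slope = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_means, y_mean, w, slope


def predict(model, row):
    """Predict Qp for one expression profile."""
    x_means, y_mean, w, slope = model
    score = sum((row[j] - x_means[j]) * w[j] for j in range(len(w)))
    return y_mean + slope * score


def loo_rmse(X, y):
    """Root mean squared error when each sample is predicted by a model fitted without it."""
    squared_errors = []
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((predict(model, X[i]) - y[i]) ** 2)
    return math.sqrt(mean(squared_errors))


# Invented toy data: 6 samples x 3 genes, with Qp in pg/cell/day.
expression = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.8, 0.5],
    [5.0, 10.1, 0.4],
    [6.0, 12.0, 0.6],
]
qp = [5.2, 9.8, 15.1, 20.3, 24.9, 30.2]

print("Leave-one-out RMSE:", loo_rmse(expression, qp))
```

In a real analysis any gene selection step would need to be repeated inside each cross-validation round, so that the held-out sample never influences which genes enter the model.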

Large scale microarray profiling and coexpression network analysis of CHO cells

Weighted gene coexpression network analysis (WGCNA) was utilised to explore Chinese hamster ovary (CHO) cell transcriptome patterns associated with bioprocess-relevant phenotypes. The dataset used in this study consisted of 295 microarrays from 121 individual CHO cultures producing a range of biologics including monoclonal antibodies, fusion proteins and therapeutic factors; non-producing cell lines were also included. Samples were taken from a wide range of process scales and formats that varied in terms of seeding density, temperature, medium, feed medium, culture duration and product type. Cells were sampled for gene expression analysis at various stages of the culture, and bioprocess-relevant characteristics including cell density, growth rate, viability, lactate, ammonium and cell-specific productivity (Qp) were determined. WGCNA identified six distinct clusters of coexpressed genes, five of which were found to have associations with bioprocess variables. Two coexpression clusters were found to be associated with culture growth rate (1 positive and 1 negative). In addition, associations between a further three coexpression clusters and Qp were observed (1 positive and 2 negative). Gene set enrichment analysis (GSEA) identified a number of significant biological processes within coexpressed gene clusters including cell cycle, protein secretion and vesicle transport. In summary, the approach presented in this study provides a novel perspective on the CHO cell transcriptome. We have also developed the CHO gene coexpression database to allow user-friendly access to the findings of this study for the CHO cell community.
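The core idea behind a weighted coexpression network can be sketched in a few lines. The example below covers only the first step, turning pairwise gene correlations into a soft-thresholded (unsigned) adjacency; the later steps of a full WGCNA analysis, such as clustering genes into modules and relating modules to growth rate or Qp, are not shown. The gene names and expression values are invented, and the exponent of 6 is an illustrative assumption, since in practice the power is chosen from the data.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = sorted(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


def connectivity(adjacency, gene):
    """Sum of a gene's connection strengths to all other genes."""
    return sum(weight for (g, _), weight in adjacency.items() if g == gene)


# Invented toy profiles across four samples (gene names are hypothetical).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],    # falls as geneA rises
    "geneD": [1.0, -1.0, -1.0, 1.0],  # unrelated to the others
}

adjacency = soft_threshold_adjacency(profiles)
for gene in sorted(profiles):
    print(gene, round(connectivity(adjacency, gene), 3))
```

Raising the correlation to a power keeps strong relationships close to 1 while pushing weak ones towards 0, which is why no hard cut-off on the correlation is needed.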

Integrated miRNA, mRNA and proteomic expression profiling to study CHO cell clonal growth rate variation

To study the role of microRNA (miRNA) in the regulation of Chinese hamster ovary (CHO) cell growth, qPCR, microarray and quantitative LC-MS/MS analysis were utilised for simultaneous expression profiling of miRNA, mRNA and protein. The sample set under investigation consisted of clones with variable cellular growth rates derived from the same population. In addition to providing a systems level perspective on cell growth, the integration of multiple profiling datasets can facilitate the identification of non-seed miRNA targets, complement computational prediction tools and reduce false positive and false negative rates.

Utilising multiple datasets to identify high-confidence putative miRNA targets associated with CHO cell growth rate.
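One common way to combine such datasets, shown here as a hedged sketch rather than the exact procedure used in our study, is to keep only those computationally predicted targets whose mRNA or protein level changes in the opposite direction to the miRNA, since miRNAs generally repress their targets. All names, fold changes and the threshold below are invented for illustration; the values are assumed to be log2 fold changes between fast- and slow-growing clones.

```python
def high_confidence_targets(mirna_change, predicted_targets,
                            mrna_change, protein_change, threshold=1.0):
    """Return {miRNA: [targets]} supported by an inverse change at mRNA or protein level."""
    results = {}
    for mirna, mirna_fc in mirna_change.items():
        # Skip miRNAs that are not themselves differentially expressed.
        if abs(mirna_fc) < threshold:
            continue
        supported = []
        for gene in predicted_targets.get(mirna, ()):
            changes = (mrna_change.get(gene), protein_change.get(gene))
            # Keep the gene if either level changes strongly in the opposite direction.
            if any(c is not None and abs(c) >= threshold and c * mirna_fc < 0
                   for c in changes):
                supported.append(gene)
        if supported:
            results[mirna] = sorted(supported)
    return results


# Invented example data (log2 fold change, fast- versus slow-growing clones).
mirna_change = {"miR-X": 1.8, "miR-Y": -0.2}
predicted_targets = {"miR-X": ["GeneA", "GeneB", "GeneC"], "miR-Y": ["GeneD"]}
mrna_change = {"GeneA": -1.4, "GeneB": 0.1, "GeneC": 1.2, "GeneD": -2.0}
protein_change = {"GeneA": -1.1, "GeneB": -1.6, "GeneC": 0.9}

targets = high_confidence_targets(
    mirna_change, predicted_targets, mrna_change, protein_change
)
print(targets)
```

In this toy example GeneB is retained on protein evidence alone, the pattern expected when a miRNA blocks translation without degrading the transcript, and one that mRNA profiling by itself would miss.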




Padraig Doolan (




  • Clarke, C., Madden, S.F., Doolan, P., Aherne, S.T., Joyce, H., O'Driscoll, L., Gallagher, W.M., Hennessy, B.T., Moriarty, M., Crown, J., Kennedy, S. and Clynes, M. (2013) Correlating transcriptional networks to breast cancer survival: a large-scale coexpression analysis. Carcinogenesis, 34, 2300-2308.


  • Doolan, P., Clarke, C., Kinsella, P., Breen, L., Meleady, P., Leonard, M., Zhang, L., Clynes, M., Aherne, S.T. and Barron, N. (2013) Transcriptomic analysis of clonal growth rate variation during CHO cell line development. J Biotechnol, 166, 105-113.


  • Clarke, C., Henry, M., Doolan, P., Kelly, S., Aherne, S., Sanchez, N., Kelly, P., Kinsella, P., Breen, L., Madden, S.F., Zhang, L., Leonard, M., Clynes, M., Meleady, P. and Barron, N. (2012) Integrated miRNA, mRNA and protein expression analysis reveals the role of post-transcriptional regulation in controlling CHO cell growth rate. BMC Genomics, 13, 656.


  • Clarke, C., Doolan, P., Barron, N., Meleady, P., Madden, S.F., DiNino, D., Leonard, M. and Clynes, M. (2012) CGCDB: a web-based resource for the investigation of gene coexpression in CHO cell culture. Biotechnol Bioeng, 109, 1368-1370.


  • Clarke, C., Doolan, P., Barron, N., Meleady, P., O'Sullivan, F., Gammell, P., Melville, M., Leonard, M. and Clynes, M. (2011) Large scale microarray profiling and coexpression network analysis of CHO cells identifies transcriptional modules associated with growth and productivity. J Biotechnol, 155, 350-359.


  • Clarke, C., Doolan, P., Barron, N., Meleady, P., O'Sullivan, F., Gammell, P., Melville, M., Leonard, M. and Clynes, M. (2011) Predicting cell-specific productivity from CHO gene expression. J Biotechnol, 151, 159-165.