Biostatistics & Bioinformatics Shared Resource Facility (BB SRF)
The mission of the Biostatistics and Bioinformatics Shared Resource Facility (BB SRF) is to provide expertise and a variety of services to Markey Cancer Center (MCC) investigators conducting basic, pre-clinical, and clinical research.
The Biostatistics and Bioinformatics Shared Resource Facility (BB SRF) is a cancer center-managed shared resource that plays a key collaborative role in providing scientific and statistical input across the entire spectrum of cancer research being performed at the Markey Cancer Center (MCC). The mission of the BB SRF is to apply statistical principles to enhance the execution of scientific research through collaborative interactions with MCC members. To this end, our primary goal is to provide a comprehensive, centralized, and accessible biostatistics and bioinformatics infrastructure that supports all phases of cancer studies, including study development, implementation, and post-study analyses.
Biostatistics Component (Associate Director: Brent Shelton, PhD)
- Study design, power, and sample size calculations for grant applications involving preclinical and translational studies, clinical trials, and population-based interventions.
- Clinical trials development and conduct.
- Statistical analysis, including interim and final analysis.
- Statistical programming for data quality control.
- MCC teaching, mentoring, and general statistical consultation.
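As a rough illustration of the power and sample size calculations listed above, the sketch below uses the standard normal approximation for a two-sided, two-sample comparison of means with equal group sizes. The function name and the default values are illustrative assumptions, not a BB SRF tool; calculations for an actual grant application would normally also account for the study design, expected dropout, and multiple comparisons.

```python
from math import ceil
from statistics import NormalDist


def per_group_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference (mean difference / common SD).
    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Medium standardized effect (0.5), 5% two-sided alpha, 80% power.
    print(per_group_sample_size(0.5))  # prints 63
```

Because this relies on the normal approximation, a t-distribution-based calculation will give a slightly larger sample size when the groups are small.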
Bioinformatics Component (Associate Director: Chi Wang, PhD)
- Next generation sequencing data processing and analysis.
- Metabolomics data processing and analysis.
- Microarray data processing and analysis.
- Genomic data mining.
- Omics integration.
- Grant writing, training and consultation.
Prioritization of Services
- Level 1: Collaboration with MCC members with cancer-related, peer-reviewed studies with BB SRF funding.
- Level 2: MCC investigators preparing cancer-related, peer-reviewed grant applications with proposed BB SRF funding.
- Level 3: MCC pilot/feasibility projects.
- Level 4: Data analysis and manuscript preparation for MCC investigators, with BB SRF statisticians as co-authors.
- Level 5: MCC investigators developing non-peer-reviewed studies (such as IITs with industry support) with a request for BB SRF funding.
- Level 6: Unfunded cancer research projects that require limited statistical input.
Request BB SRF Services
The BB SRF uses iLab Solutions to manage service requests and project tracking. To start using BB SRF Services, click the link below, which will take you to a landing page with more detailed instructions, including a one-time account setup. Once your account is set up, iLab will enable you to place BB SRF service requests, provide the required approvals, and monitor the progress of your project.
Returning investigators can log in to iLab here.
The Bioinformatics Section of the Biostatistics and Bioinformatics Shared Resource Facility (BB SRF) aims to build and maintain robust, state-of-the-art pipelines for analyzing, interpreting, and visualizing large-scale genomic, epigenomic, transcriptomic, and metabolomic data generated by the Markey Cancer Center's (MCC) cancer research experiments. While these pipelines can be used for general-purpose bioinformatics applications, they are specifically tailored to reveal mutations and complex behaviors of cancer genomes. We work closely with MCC investigators and provide custom bioinformatics solutions for them. Our current services focus on the following areas. New services will be added depending on demand.
- Next Generation Sequencing Data Analysis. The Bioinformatics Section has developed comprehensive pipelines to process and analyze data from:
  - Whole-genome or whole-exome sequencing.
  - Whole-genome bisulfite sequencing.
- Metabolomics Data Analysis. The Bioinformatics Section provides informatics support for raw and intermediate data analysis of metabolomics datasets, especially stable isotope-resolved metabolomics datasets. Results of these analyses can feed into other biostatistical analyses provided by the section. Custom downstream metabolic modeling and relative flux analyses can be provided on a limited basis.
- Microarray Data Analysis. The Bioinformatics Section has developed a pipeline for microarray data processing and analysis, including data normalization, quality assessment, differential expression identification and visualization, and pathway/functional analysis.
- Integrative Analysis of Multiple Omics Datasets. The Bioinformatics Section provides bioinformatics support for analyzing interactions or correlations across multiple types of genomic data. Examples include the integrative analysis of DNA-methylation data and RNA-seq data to look at the regulation of global gene expression, the detection of aberrant transcripts using both DNA-seq and RNA-seq data, and correlation analysis between RNA-seq and existing microarray data. The section also provides support for soft multi-omics integration using CategoryCompare, which provides integration at the level of annotations across omics datasets. Please contact us for more details.
- Genomic Data Mining. The section uses genomic data repositories such as GEO, Oncomine, and TCGA to correlate genomic data from specific gene(s) of interest with clinical outcomes.
- Other Large-Scale Genomic Data Analysis. The section provides bioinformatics support for other genomic experimental platforms such as the NanoString nCounter system.
- Grant-writing Support. The section will help investigators with genomic study design, sample size/power calculation, data analysis plans, and writing bioinformatics sections.
- Training and Consultation. It is important to establish a rapport and dialogue between biomedical researchers, bioinformaticians and computational biologists. The Bioinformatics Section works with investigators to establish new data analysis pipelines. The section's personnel host training/consultation/courses/lectures on bioinformatics study designs, tools, resources and databases. New services will be advertised as they become available.
Storage and Computational Resources
The BB SRF is equipped with state-of-the-art computing resources to support all levels of research, particularly clinical and bioinformatics studies. Computing resources for the BB SRF and other MCC facilities are centrally coordinated by the MCC Cancer Research Informatics Shared Resource Facility (CRI SRF). The CRI SRF also takes the lead in coordinating resources from the university’s Center for Computational Sciences (CCS, see below) to support big data initiatives. The BB SRF leverages MCC resources and University of Kentucky campus-wide resources to ensure optimal and secure infrastructure for computational and analytical activities.
The CCS offers a Dell HPC cluster (supercomputer) rated at just over 140 teraflops, consisting of general-purpose CPUs, highly parallel GPUs, large-memory nodes, a high-speed InfiniBand network, and a high-performance file system.
Data Management Section
- Heidi Weiss (firstname.lastname@example.org)
Goal: The mission of the Data Management Section (DMS) is to ensure adequate and standardized data management support for research projects, in particular, those emanating from the four Research Programs at the Markey Cancer Center (MCC).
Summary of Services:
The DMS provides services during all phases of studies, including planning, execution, and analyses.
- Planning: Services include assistance in eCRF/questionnaire development, assurance of sufficient linkage between all data, creation of automated alerts for interim analyses and safety stops, and interaction with MCC's Cancer Research Informatics Shared Resource Facility (CRI SRF) for informatics technology support.
- Execution: Data will be exported periodically on a pre-determined schedule, data validation scripts will be run on the exported data, and queries will be sent for clarification. Enrollment and quality reports will be created.
- Analyses: At the end of data collection, a final data validation will be performed.
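To make the idea of a data validation script concrete, here is a minimal sketch that checks exported records against simple range rules and produces query messages for follow-up. The record layout, the field names, and the function `validate_records` are hypothetical and chosen only for illustration; they do not describe the DMS's actual scripts or any particular data export format.

```python
def validate_records(records, rules):
    """Return query strings for values that are missing or out of range.

    records: list of dicts, each with a 'subject_id' key (hypothetical layout).
    rules: dict mapping a field name to its (minimum, maximum) allowed values.
    """
    queries = []
    for record in records:
        subject = record.get("subject_id", "<unknown>")
        for field, (low, high) in rules.items():
            value = record.get(field)
            if value is None:
                queries.append(f"{subject}: {field} is missing")
            elif not low <= value <= high:
                queries.append(f"{subject}: {field}={value} outside [{low}, {high}]")
    return queries


if __name__ == "__main__":
    # Hypothetical export with one clean record and two problem records.
    export = [
        {"subject_id": "S001", "age": 54, "weight_kg": 72.5},
        {"subject_id": "S002", "age": 154, "weight_kg": 80.0},
        {"subject_id": "S003", "age": 61},
    ]
    rules = {"age": (18, 110), "weight_kg": (30.0, 250.0)}
    for query in validate_records(export, rules):
        print(query)
```

In practice the same pattern would be rerun on each scheduled export, so that new queries are raised only for records that still fail a rule.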
Statistical and Bioinformatics Tools and Software
Specialized Software Tools Developed in House
- An R package ordcrm provides the setup and calculations needed to design and implement a likelihood-based, continual reassessment method (CRM) dose-finding trial incorporating either binary or ordinal toxicities. Additionally, this package can perform simulations to assess design performance under various criteria. Documentation files and functions are described in more detail here.
- Stacey Slone and Dr. Emily Dressler have developed email notifications for interim analyses and safety triggers that integrate with the OnCore system to assist with implementation of investigator-initiated clinical trials at MCC. Please email Stacey for more information at email@example.com.
- Dr. Jinze Liu has developed multiple functions for bioinformatics applications in R. These include software for alignment (MapSplice, MapPER) and software for differential transcriptome analysis (FDM). For more information, please email Dr. Liu at firstname.lastname@example.org.
- bacr is an R package developed by Dr. Chi Wang for implementing the Bayesian Adjustment for Confounding (BAC) method for estimating the average causal effect of a treatment on an outcome from cohort studies.
- NanoStringDiff is an R/Bioconductor package developed by Dr. Chi Wang to perform differential expression analysis based on gene expression data generated from the NanoString nCounter system. In addition, a user-friendly web application, NanoStringDiffWeb, is available here.
- “paf” R package: Calculate unadjusted/adjusted attributable fraction function of a set of covariates for a censored survival outcome from a Cox model using the method proposed by Chen, Lin and Zeng (Biometrika 97, 713-726, 2010).
- “KENDL” R package: Calculate the kernel-smoothed nonparametric estimator for the exposure distribution in the presence of detection limits using the method proposed by Yang et al. (Stat. Med. 36(18), 2935-2946, 2017).
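For readers unfamiliar with the likelihood-based continual reassessment method mentioned in the ordcrm item above, the following sketch shows the general idea in its simplest textbook form: a one-parameter power model with binary toxicity. It is not the ordcrm implementation and does not handle ordinal toxicities; the skeleton values, the grid search over the model parameter, and the function name `crm_recommend` are all illustrative assumptions.

```python
from math import exp, log


def crm_recommend(skeleton, target, doses_given, toxicities, grid_points=2001):
    """One-parameter power-model CRM with binary toxicity (illustrative only).

    Model: P(toxicity at dose i) = skeleton[i] ** exp(a).
    The parameter a is estimated by maximizing the binomial likelihood over a
    grid on [-3, 3]; the recommended dose is the one whose estimated toxicity
    probability is closest to the target.
    """
    def log_likelihood(a):
        total = 0.0
        for dose, tox in zip(doses_given, toxicities):
            p = skeleton[dose] ** exp(a)
            total += log(p) if tox else log(1.0 - p)
        return total

    grid = [-3.0 + 6.0 * k / (grid_points - 1) for k in range(grid_points)]
    a_hat = max(grid, key=log_likelihood)
    estimates = [s ** exp(a_hat) for s in skeleton]
    best = min(range(len(skeleton)), key=lambda i: abs(estimates[i] - target))
    return best, estimates


if __name__ == "__main__":
    skeleton = [0.05, 0.10, 0.20, 0.35]  # assumed prior toxicity guesses
    doses_given = [0, 0, 0, 1, 1, 1]     # 0-based dose index per patient
    toxicities = [0, 0, 1, 1, 1, 0]      # 1 = dose-limiting toxicity observed
    dose, estimates = crm_recommend(skeleton, 0.20, doses_given, toxicities)
    print(dose, [round(p, 3) for p in estimates])
```

A maximum likelihood estimate only exists once both toxic and non-toxic outcomes have been observed; before that point the grid search simply returns a boundary value, which is why likelihood-based designs usually start with a rule-based escalation stage.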
Commercially Available Software and Freeware Tools
Our faculty and staff have extensive experience and expertise with many statistical software packages to help develop studies, calculate sample sizes with adequate power, and perform data analyses for MCC investigators across all four Research Programs. We are happy to collaborate with researchers on a project that may require specialized software. Our statistical package experience includes (but is not limited to):
- General Purpose: SAS, SUDAAN, SPSS, MINITAB, StatXact, LogXact, STATA
- Design and Sample Size: nQuery Advisor 7.0, NCSS and PASS 2011, EAST, EAST SURV, EAST Adapt
- Adaptive Designs: ExpDesign Studio, ADDPLAN
- Shareware: R, SaTScan for spatial analyses, WinBUGS, MD Anderson Biostatistics Software
- Bioinformatics: FastQC, SAMtools, Picard, Cutadapt, Trimmomatic, BBDuk, MapSplice2, STAR, BWA, Bowtie1, Bowtie2, TopHat, HTSeq, RSEM, DESeq2, edgeR, DSS, GSEA, GOseq, IPA, DAVID, GATK, MuTect1, MuTect2, VarScan2, GISTIC2, MutSigCV, Oncotator, SnpEff, ANNOVAR, Bismark, methylKit, MACS2, HOMER, QIIME2, mothur.
- Bioinformatics Databases: Oncomine, Gene Expression Omnibus, Sequence Read Archive, Genomic Data Commons, cBioPortal, COSMIC, dbSNP and 1000 Genomes.
MCC Custom Database Applications
The Cancer Research Informatics Shared Resource Facility, in collaboration with BB SRF personnel, works closely with MCC investigators to develop database applications needed for a specific project.
Investigators are required to acknowledge the Markey Cancer Center Biostatistics and Bioinformatics Shared Resource Facility (BB SRF) in any publications that result from the use of biostatistics, bioinformatics or information received through the MCC BB SRF. For your convenience, you are welcome to use the following statement:
The research was supported by the Biostatistics and Bioinformatics Shared Resource Facility of the University of Kentucky Markey Cancer Center (P30CA177558).
For more information, contact email@example.com.
Faculty and Staff
Director of Biostatistics and Bioinformatics Shared Resource Facility
Bayesian designs and interim monitoring for early phase clinical trials
Design of chemoprevention and immunotherapy clinical trials
Study design and sample size planning for in vitro, in vivo mouse cancer models and translational studies
Chief and Professor
Associate Director of Biostatistics
Missing Data and Selection Bias
Categorical Data Analyses
Screening for early detection and cancer prevention
Chi Wang, PhD
Associate Director of Bioinformatics
Next generation sequencing data analysis
Microarray data analysis
Bayesian model selection
Research Interests - read more at Li Chen's website
Semiparametric and nonparametric methods
Population-based secondary cancer data analyses
Missing data analysis
Geo-spatial data analyses
Comparative effectiveness analyses
Department of Computer Science
College of Engineering
235 Hardymon Building
Lexington, KY 40506-0046
Bioinformatics methods for next-generation sequencing data
Computing methods for alignment and differential transcriptome analysis of RNA-seq data
Hunter Moseley, PhD
Markey Cancer Center
Ben F. Roach Building, Rm CC436
Lexington, KY 40536
Developing computational methods for analyzing and interpreting biological and biophysical data that leverage relevant information from public scientific databases and integrate system-wide analyses across omics-level datasets
Jinpeng Liu, MS
Quan Chen, DrPH
Markey Cancer Center
2365 Harrodsburg Rd Suite A230
Rani Jayswal, MS
Daheng He, PhD
Markey Cancer Center
800 Rose Street
Andrew Shearer, MS
Lauren Corum, MS
Markey Cancer Center
Biostatistics, Division of Cancer Biostatistics
800 Rose Street