
Frequently Asked Questions about Pharmacology-AI

A. General Questions

What is Pharmacology-AI? 

Pharmacology-AI is cloud-based software that uses machine learning to analyse big data, with the objective of identifying the genomic or medical features that are important drivers of a drug response. Identifying these key features makes it possible to find the sub-groups of patients most likely to respond to a drug, enabling patient stratification.

What are the outputs of Pharmacology-AI? 

Pharmacology-AI ranks the features (e.g. features from 'omic' or medical history data) that have the greatest impact on the drug/biomarker response and displays them in an interactive report. The relative importance of each feature to the drug response is shown as a percentage of the total drug response.
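
As a rough illustration of what a percentage-based ranking looks like, the sketch below rescales a set of raw importance scores so that they sum to 100 and sorts them. The feature names, the raw scores, and the `to_percentages` helper are all hypothetical; this is not a description of how Pharmacology-AI computes its figures.

```python
def to_percentages(raw_scores):
    """Rescale non-negative importance scores so that they sum to 100."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return {name: 100.0 * score / total for name, score in raw_scores.items()}


# Hypothetical raw scores for three features (illustrative values only).
raw = {"rs_example_1": 3.0, "age": 1.5, "prior_steroid_use": 0.5}
percentages = to_percentages(raw)

# Rank features from most to least important.
ranked = sorted(percentages.items(), key=lambda item: item[1], reverse=True)
for name, pct in ranked:
    print(f"{name}: {pct:.1f}%")
```

In this made-up example the first feature accounts for the largest share, and the shares of all features add up to 100.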

How does Pharmacology-AI work? 

Pharmacology-AI uses machine learning (a form of AI) to reveal the key features within a patient population that drive a change in a functional response. The system is automated: the user only needs to upload datasets that contain information on each patient (for example genomic data or electronic health record data) together with some type of functional response or outcome measure, and the software does the rest.

What is unique about Pharmacology-AI? 

Pharmacology-AI is unique in multiple respects.  

Firstly, it is the only system specifically aiming to identify and rank the most important reasons for patient variations in drug responses, by linking pharmacology with omics (of any type) and electronic health records.  

Secondly, it uses IBM's machine learning technology, which automatically selects the optimum machine learning model, providing confidence in patient stratification strategies.

Finally, the results are provided in an easy-to-access interactive format with the option to include predicted responses for specific patient cohorts of your choosing, without the need for specialist bioinformatics expertise.  There is even a "Predict" mode that allows you to upload data on a previously unseen patient and obtain a prediction of the expected response for that patient. 

How can I access Pharmacology-AI? 

REPROCELL's newly launched platform offers an end-to-end solution on a fee-for-service basis. Data is shared via a secure folder on REPROCELL's system, where the data is wrangled to create a file suitable for upload.

REPROCELL then runs the data through the machine learning platform to generate and test various machine learning models. The Sponsor is provided with a link to an interactive, easy-to-interpret report showing the key features driving drug/biomarker responses and the predictive accuracy of the model.  

REPROCELL can also provide enhanced reports predicting the responses/outcomes in patient groups with selected features (e.g. to model drug effects in patient sub-populations).

Do I need specialist bioinformatics or programming skills?

No, the system is designed to be accessible to all scientists. Both the data ingestion and analysis processes are designed to be user friendly and no programming skills are required. 

In what format should data be provided?

Data is uploaded to the system as CSV files, with each column representing a feature of interest (e.g. a SNP, or a medical or demographic feature, such as age). Each column should be named using a description of the feature (e.g. the reference SNP cluster ID for a particular SNP, or the name of a drug that the patients may or may not have been prescribed).

Features should be represented by numbers. For categorical features, such as whether or not a patient has received a particular drug, 0 is entered for 'no' and 1 is entered for 'yes'. For continuous features, such as age, gene expression levels, or a drug response, the numerical value can simply be entered. Pharmacology-AI automatically recognises data as either categorical or continuous and allows you to name groups of features as having similar qualities, e.g. columns containing clinical, genomic or transcriptomic data.

Each row of data represents a different patient. 
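
The layout described above can be checked before upload with a few lines of Python. The sketch below uses only the standard library; the column names, the values, and the helper functions are hypothetical, and the check for 0/1 columns is a simple stand-in, not the rule Pharmacology-AI uses to recognise categorical data.

```python
import csv
import io

# Hypothetical example file: one column per feature, one row per patient.
EXAMPLE_CSV = """rs_example_1,prescribed_drug_x,age,response
1,0,54,12.5
0,1,61,8.2
1,1,47,15.0
"""


def load_patient_rows(text):
    """Parse CSV text into (header, rows), converting every cell to float.

    Raises ValueError if a cell is not numeric or a row has the wrong length.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for line_no, raw in enumerate(reader, start=2):
        if len(raw) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} values, got {len(raw)}"
            )
        rows.append([float(cell) for cell in raw])
    return header, rows


def binary_columns(header, rows):
    """Return the names of columns whose values are all 0 or 1."""
    return [
        name
        for index, name in enumerate(header)
        if all(row[index] in (0.0, 1.0) for row in rows)
    ]


header, rows = load_patient_rows(EXAMPLE_CSV)
print(len(rows), "patients,", len(header), "columns")
print("0/1 columns:", binary_columns(header, rows))
```

A non-numeric cell (for example the text "yes") makes `float` raise a `ValueError`, which flags data that still needs converting to numbers.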

What if my data is not already in a suitable format for machine learning?

For data not already formatted in this way (i.e. as numbers representing categorical or continuous features), REPROCELL offers a data wrangling service to convert your data to a suitable CSV file for machine learning. Please contact us for more information. 
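
For a sense of what such a conversion involves, here is a minimal sketch that recodes yes/no answers as 1/0 and turns the remaining fields into numbers. The field names, the accepted spellings in `YES_NO`, and both helper functions are assumptions made for illustration; this does not describe REPROCELL's data wrangling service.

```python
# Hypothetical mapping from free-text answers to the 0/1 coding for categorical features.
YES_NO = {"yes": 1, "y": 1, "no": 0, "n": 0}


def encode_yes_no(value):
    """Convert a yes/no style string to 1 or 0; raise on anything unrecognised."""
    key = value.strip().lower()
    if key not in YES_NO:
        raise ValueError(f"cannot encode {value!r} as 0/1")
    return YES_NO[key]


def wrangle_record(record, yes_no_fields):
    """Return a copy of record with yes/no fields coded as 0/1 and the rest as floats."""
    return {
        name: encode_yes_no(value) if name in yes_no_fields else float(value)
        for name, value in record.items()
    }


# One hypothetical patient record as it might arrive from a spreadsheet export.
raw_record = {"prescribed_drug_x": "Yes", "smoker": "no", "age": "54", "response": "12.5"}
clean = wrangle_record(raw_record, yes_no_fields={"prescribed_drug_x", "smoker"})
print(clean)
```

Rejecting unrecognised answers, instead of guessing, keeps ambiguous entries such as "unknown" from silently becoming a 0 or a 1.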

B. Input Data

What is meant by "features"? 

Features are any type of record associated with the patient, whether that be 'omic' information or an entry in the patient's medical records.  

Examples of omic features are SNPs from genome sequencing (whole exome/whole genome etc.), transcriptomic, metabolomic, proteomic or microbiomic data. The feature type is labelled during the upload procedure. 

Examples of medical history are patient demographic data such as age, sex, height, weight, previous or current morbidities, or current or former medications.  

Any feature, whether nature or nurture, that might impact on a patient's response to a therapy can be accommodated in the system. Pharmacology-AI is capable of handling almost any data type. REPROCELL offers a "data wrangling" bioinformatics service to ensure the data is in a suitable format for the machine learning system.  

What type of drug/biomarker responses can be used in the system? 

Pharmacology-AI can handle any type of quantifiable measurement. For example, a response might be measured during an ex vivo or in vitro tissue experiment, a change in a biomarker from a blood sample of a patient or volunteer, a change in a clinical measurement such as blood pressure, or even a clinical outcome, such as 5-year survival. The user is required to tell the system the units of the measurement, but otherwise any numerical value representing a "response" can be uploaded.  

How many patients/donors are needed to run an analysis in Pharmacology-AI? 

There is no set number of patients. Ideally, the dataset is a true representation of the patient population of interest; therefore, the number of patients required may vary quite considerably based on the variation in drug response across the patient population. During development, we were able to identify key genomic and medical features driving anti-inflammatory responses to drugs tested in human ex vivo tissues using as few as 25 donors; however, typically 50-100 or in some cases hundreds to thousands of donors may be required to provide a good reflection of the true patient population.  

No matter the number of donors that are uploaded, the software will provide information on the accuracy of its predictions.  

Can REPROCELL generate big data for analysis? 

Yes. REPROCELL is unique in offering access to human fresh tissue assays, where tissues from multiple donors provide a readout of drug response in combination with omic data and/or the deidentified medical histories of each donor. These combined data can be uploaded to Pharmacology-AI to reveal insights into the reasons for patient variation in drug response. Most commonly, such research at REPROCELL has been in IBD and COPD, where biopsies of COPD lung, or Crohn's or ulcerative colitis disease tissue have been tested ex vivo and whole exome or transcriptome data has also been generated. Examples of these types of studies can be found in recent publications by REPROCELL, linked on our website. 

Is my data held securely?

Yes. Pharmacology-AI is designed in line with industry best practices, including the guidance of the Open Web Application Security Project (OWASP), and is hosted on a secure cloud environment with appropriate security controls.

Is my data visible alongside other users' data?

No. Each client of REPROCELL has their own unique instance of the software, held in distinct locations in the cloud, which can be accessed only by you and authorised REPROCELL staff. 

C. Machine Learning

How are the machine learning models generated? 

The key to a successful machine learning model is the quality of data uploaded to the system. Users should be confident that the dataset is a good representation of the patient population of interest. The machine learning software first creates a model using some of the dataset to "train" the model. The model is trained by providing data on some type of functional response, such as a change in a biomarker in response to a drug, a measured change in an in vitro/ex vivo tissue (such as a human fresh tissue), or even an outcome of a clinical trial. The drug response data is uploaded together with available information on each patient or volunteer, such as their medical history or genomic/transcriptomic/proteomic/metabolomic or microbiomic data (i.e. "big data").

When the model is created, part of the dataset is set aside as "unseen" patients, which are later used to test how well the model predicts a drug/biomarker response in patients it has not seen before. The accuracy with which Pharmacology-AI predicts a drug response is displayed, so the margin of error in any prediction is clearly understood.
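
The idea of holding back "unseen" patients can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the data is synthetic, the 20% test fraction is arbitrary, and a one-feature least-squares line stands in for whatever model the platform actually builds.

```python
import random


def split_patients(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and return (train, test), with test held back as unseen."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(pairs):
    """Ordinary least squares for response = intercept + slope * feature."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def mean_absolute_error(model, pairs):
    """Average absolute gap between predicted and measured responses."""
    intercept, slope = model
    return sum(abs(y - (intercept + slope * x)) for x, y in pairs) / len(pairs)


# Synthetic (age, response) pairs; not real patient data.
rng = random.Random(42)
patients = [(age, 0.5 * age + rng.gauss(0, 2.0)) for age in range(30, 80)]

train, test = split_patients(patients)
model = fit_line(train)  # the model never sees the test patients
print(f"trained on {len(train)} patients, tested on {len(test)}")
print(f"held-out mean absolute error: {mean_absolute_error(model, test):.2f}")
```

Because the error is measured only on patients excluded from training, it gives an estimate of how far off a prediction for a new patient might be.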

What machine learning/AI methods does it use? 

Pharmacology-AI uses a number of state-of-the-art machine learning approaches and automatically selects the machine learning model that provides the best predictions with the lowest margin of error. The selected machine learning model is displayed in the report. Ultimately, as with all models, the predictive accuracy of the model is dependent on the quality of the data that is uploaded.
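
Automatic model selection of this kind can be pictured as fitting several candidate models and keeping the one with the lowest error on held-out data. The sketch below is a toy version under stated assumptions: the two candidates (a mean-only baseline and a straight line), the synthetic data, and the use of mean absolute error are illustrative choices, not the models or metric used by Pharmacology-AI.

```python
import random


def fit_mean(train):
    """Baseline candidate: always predict the mean training response."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Least-squares straight line through the training data."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mae(predict, pairs):
    """Mean absolute error of a prediction function on (feature, response) pairs."""
    return sum(abs(y - predict(x)) for x, y in pairs) / len(pairs)


def select_best(candidates, train, test):
    """Fit every candidate on train and return the name with the lowest test error."""
    scores = {name: mae(fit(train), test) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Synthetic (feature, response) pairs; not real patient data.
rng = random.Random(7)
data = [(x, 2.0 * x + rng.gauss(0, 1.0)) for x in range(40)]
rng.shuffle(data)
train, test = data[:30], data[30:]

best, scores = select_best({"mean_only": fit_mean, "linear": fit_line}, train, test)
print("selected model:", best)
for name, score in scores.items():
    print(f"  {name}: held-out error {score:.2f}")
```

On this synthetic data the straight line wins because the response really does depend on the feature; with poor-quality input, no candidate would score well.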


Last Update: 25 May 2023
