A Machine Learning-Based Test for Predicting Response to Psoriasis Biologics

Objective: This study was designed to develop and prospectively validate a machine learning-based algorithm that could predict patient response to the most common biologic drug classes used in the management of psoriasis. Such a tool would give clinicians greater confidence that a given patient will respond to a specific drug class, which could lead to improved health outcomes and reduced wasted healthcare spend.
Methods: Patients were enrolled into one of two observational studies (STAMP studies) in which dermal biomarker patches (DBPs) were applied at baseline prior to drug exposure, followed by clinical evaluations at 12 weeks after exposure. PASI measurements were made at baseline and at 12 weeks to evaluate clinical response. Responders were defined as those who reached PASI75 at 12 weeks. The transcriptomes obtained from the DBPs were sequenced and analyzed to derive and/or validate classifiers for each biologic class, which were then combined to yield predictive responses for all three biologic drug classes (IL-23i, IL-17i, and TNFi).
Results: A total of 242 psoriasis patients were enrolled in these studies, including 118 patients (49.6%) treated with IL-23i, 79 patients (33.2%) treated with IL-17i, 35 patients (14.7%) treated with TNFi, and 6 patients (2.5%) treated with IL-12/23i. The IL-23i predictive classifier was developed from the earlier enrolled patients and independently validated with the later enrolled patients. IL-17i and TNFi predictive classifiers were developed using publicly available datasets and independently validated with patients from the STAMP studies. In the independent validation, positive predictive values for the three classifiers (IL-23i, IL-17i, and TNFi) were 93.1%, 92.3%, and 85.7%, respectively. Over the entire cohort, 99.5% of patients were predicted to respond to at least one drug class.
Conclusion: This study demonstrates the power of using baseline dermal biomarkers and machine learning methods as applied to the prediction of patient response to psoriasis biologics prior to drug exposure. Using this test, patients, physicians, and the health care system all can benefit in distinct ways. Precision medicine can be realized for individual patients as most will likely respond to their prescribed biologic the first time. Physicians can prescribe these drugs with increased confidence, and the healthcare system will realize lower net costs as well as greatly reduced wasted spend by significantly improving initial response rates to expensive biologic therapeutics.

INTRODUCTION
The promise of personalized medicine has been touted for many years but has been elusive in some specialties. 1 Recently, with the influx of large data sets from "omics"-based methods including genomics, transcriptomics, and metabolomics, personalized approaches to medical practice have come to the forefront, and many specialties now use some form of personalized medicine in research and clinical care. This is particularly true in oncology, where biomarker-guided treatment paradigms are increasingly commonplace. 2 However, personalized medicine in dermatology has traditionally lagged behind other medical specialties. Advances in the molecular understanding of the skin and in cutaneous pathophysiology have initiated new lines of thinking for the application of personalized medicine to the treatment of the skin. These successes were first realized in melanoma, and current treatment guidelines for metastatic melanoma recommend testing tissue for relevant mutations (NRAS, BRAF, KIT, GNAQ/11, and/or BAP1) with the goal of treatment that is personalized for a specific patient. 3 However, inflammatory skin diseases continue to have a need for personalized approaches. Indeed, the American Academy of Dermatology (AAD) and National Psoriasis Foundation (NPF) joint guidelines on the treatment of psoriasis with biologic agents highlighted the urgent need to identify biomarkers that can guide efficient biologic selection for individual patients. 4 Psoriasis is a T-cell mediated inflammatory skin disease characterized by discrete erythematous plaques and papules with micaceous scale. 5 This is a common disease worldwide; approximately 2.8% of the United States population, or 7.5 million people, have been diagnosed with psoriasis.
The pathology of the disease has been heavily studied and is known to be triggered by a complex inflammatory circuit that stimulates keratinocyte proliferation via upregulation of a host of cytokines including tumor necrosis factor-alpha (TNF), interleukin (IL)-17, and IL-23. 6 Current treatment paradigms for psoriasis are distinguished by topical medications and/or phototherapy for patients with mild to moderate disease, and systemic medications for patients with moderate to severe disease. The advent of biologic therapy as one of these systemic agents has revolutionized the management and treatment of psoriasis patients and is a direct result of the increased molecular understanding of the disease. 7 Presently, there are eleven biologic agents approved for use in the United States for the treatment of psoriasis, with more under development. These monoclonal antibodies are highly specific immunomodulators and have proven to be particularly effective and safe in the clearance of skin lesions. This increase in treatment options has come with a concomitant increase in patient expectations for disease control. 8 Even with the plethora of treatment options available today, the most common reason patients discontinue biologic treatments is lack of efficacy. 9 Indeed, recently published real world evidence reported response rates to biologics that are significantly lower than those observed in clinical trials. 10 While broad stroke patient stratification measures have been reported, their value is limited in clinical practice. Biologic drugs, while effective, are also particularly expensive; the medication necessary to reach skin clearance (PASI 100) can cost up to $366,645 per patient annually. 11 When one considers the lack of clarity as to which biologic is most effective for a given patient along with the significant cost of biologics, the need for biomarkers that predict treatment efficacy has never been greater.
We recently described a novel biomarker capture platform that utilizes a proprietary dermal biomarker patch to capture the whole transcriptome including mRNA biomarkers from the epidermis and upper dermis. 12 This platform showed excellent concordance with biopsy and provides a scalable method to access skin biomarkers in a minimally invasive manner. Furthermore, we have reported the use of this platform in preliminary machine learning classifier builds for the prediction of response and nonresponse to IL-17 and IL-23 inhibitors. Herein, we extend this preliminary study to the development and prospective validation of an actionable clinical test for predicting patient response to psoriasis biologics for all three drug classes.

METHODS

Dermal Biomarker Patch Platform
Dermal biomarker patches (DBPs) used in this study were fabricated and modified as previously described and used according to the manufacturer's specifications. 12

Human Subject Recruitment and Enrollment
Data were analyzed from past and ongoing observational, multicenter (20 centers), single-arm, open-label, 12-week studies, referred to as STAMP studies. The protocols for these studies were approved by local institution ethics committees and conform to the provisions of the Declaration of Helsinki and the International Council for Harmonisation (ICH) Guidelines on Good Clinical Practice (GCP). All patients who received treatment provided written informed consent. The primary objective of the study protocols was to examine whether baseline or on-therapy transcriptomics can be used to help predict selection of medications and provide new therapeutic targets for drug development (Supplemental Table 1). Visits included screening, baseline, week 1, week 4, week 8, and week 12. PASI, PGA, and BSA scoring was performed at every visit excluding the screening visit. The Dermal Biomarker Patch was applied to subjects at every visit excluding the screening visit. Subject medical history, physical exam, and demographics were collected at screening.

Study population
These studies enrolled both male and female patients who were aged 18 years or older, diagnosed with psoriasis by either a rheumatologist or a dermatologist with at least one identifiable study lesion of 2 cm in diameter or greater, and were planned for treatment with IL-23 inhibitor (IL-23i), IL-17 inhibitor (IL-17i), or TNFα inhibitor (TNFαi) therapy once enrolled in the study. The exclusion criteria included use of topical steroids on the study lesion within 2 weeks prior to the baseline visit and concurrent use of Plaquenil. All study participants were also instructed to refrain from the use of all topical steroids throughout the study until the end of study treatment.

Dermal Biomarker Patch Application
To apply DBPs to the skin, a customized spring-loaded applicator was used. This applicator served to standardize the application pressure across subjects and users. The loaded applicator was placed against the skin and the trigger pressed, applying the patch to the skin. The patch was then held in place against the skin for 5 minutes by a ring of medical tape. After this time, the patch was removed from the subject, immediately placed into storage buffer (LiCl, Triton X-100, Tris-EDTA), and stored at 4 ˚C until processing.

Dermal Biomarker Patch Processing
Dermal transcriptomes were processed within 96 hours of collection from subjects. The applied DBPs were washed with chilled 1X PBS and then dried under a stream of nitrogen. mRNA extraction from the patch was performed by applying PCR grade water (50 µL, 95 ˚C) to the DBP. The patch was then heated for 1 minute at 95 ˚C to elute the bound mRNA from the DBP. This eluted mRNA was then converted to cDNA using the Takara SMART-Seq® Single Cell kit according to the manufacturer's instructions. Amplified cDNA samples were then stored at -20 ˚C until analysis.

Next-Generation Sequencing Procedures
Amplified cDNA was sequenced by a commercial vendor (Psomagen, Inc., Rockville, MD) according to standard procedures.
Library preparation was accomplished using Illumina Nextera DNA Flex kits according to the manufacturer's instructions. Prepared indexed libraries were then loaded onto a NovaSeq 6000 S4 flow cell with a paired-end read length of 150 bp (150PE) for sequencing of 40 M reads per sample. During sequencing, the quality score (Q30) was maintained over 75%. Upon completion of sequencing runs, FASTQ file quality was checked with FastQC and reads were trimmed with the Trim Galore program. The trimmed FASTQ files were aligned and mapped to human reference genome GRCh38 using the HISAT2 program. The number of reads was counted for each Ensembl gene ID using the featureCounts program and the Homo sapiens GRCh38.84 GTF annotation. RNA expression analysis was further processed using the Bioconductor package edgeR.
Genes were filtered using filterByExpr before logCPM (log counts per million reads) were calculated as a measure of gene expression level. For downstream classifier builds, logCPM values were used.
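As an illustration of the logCPM measure described above, the following is a minimal Python sketch, not the edgeR code used in the study. It assumes a simplified formula with a fixed pseudo-count (`prior_count`, a hypothetical parameter name); edgeR additionally scales the prior by each sample's library size, and the filterByExpr filtering step is not reproduced here. The gene names and counts are made up.

```python
import math


def log_cpm(counts, prior_count=2.0):
    """Convert one sample's raw gene counts to log2 counts per million.

    counts: dict mapping gene ID -> raw read count for a single sample.
    prior_count: pseudo-count added to every gene so zero counts stay finite.
    This is a simplified form; edgeR scales the prior by library size.
    """
    library_size = sum(counts.values())
    # The library size is inflated by twice the prior so that proportions
    # computed from the adjusted counts stay below one.
    adjusted_size = library_size + 2.0 * prior_count
    return {
        gene: math.log2((count + prior_count) / adjusted_size * 1e6)
        for gene, count in counts.items()
    }


# Hypothetical counts for three genes in one sample.
sample = {"GENE_A": 0, "GENE_B": 150, "GENE_C": 9850}
for gene, value in log_cpm(sample).items():
    print(gene, round(value, 2))
```

Working on the log scale keeps highly expressed genes from dominating downstream models, and the pseudo-count keeps genes with zero reads finite.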

IL-23 Classifier Development
Five common classifiers were selected and applied for predicting responders under IL-23i treatment using the R package caret. The selected classifiers have been frequently used in the medical field for exploring predictive or prognostic biomarkers and included glmnet (Lasso and Elastic-Net Regularized Generalized Linear Model), PAM (Nearest Shrunken Centroids), LM (Linear Regression Model), SVM (Support Vector Machine), and RF (Random Forest).
The five classifiers were compared for their predictive performance using the following experimental design: 1) the data set was split into ten stratified outer folds; 2) for each fold, the data were preprocessed for feature selection, with the top 20, 50, or 200 differentially expressed genes (features) selected using a linear regression model; 3) the hyperparameters were tuned in the training set via ten-fold cross-validation repeated five times; 4) based on the selected hyperparameters, a model was derived from the training set and applied to the test set. Performance metrics on the test set were then calculated. This process was repeated five times for each classifier.
The earliest enrolled IL-23i treated patients in the STAMP studies were used for IL-23i classifier training. Baseline PASI filters (none, 6+, 8+, and 10+) were applied to explore the impact of disease severity on classifier performance. Classifier training was performed using the machine learning approaches stated above, and test performance was assessed using 10-fold cross-validation. The IL-23i classifier was locked once the desired performance (>85% PPV and >85% sensitivity) was achieved. The IL-23i treated patients enrolled after the classifier lock were used as the independent validation set.
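The outer loop of the comparison design above (stratified folds, with feature selection performed inside each training fold so the held-out patients never influence gene ranking) can be sketched in plain Python. This is an illustrative outline under stated assumptions, not the caret workflow used in the study: `rank_features` ranks genes by the absolute difference in class means as a stand-in for the linear-model ranking, and `midpoint_rule` is a toy classifier standing in for glmnet, PAM, LM, SVM, or RF. Inner hyperparameter tuning would take place inside the callback and is omitted. All function names are hypothetical.

```python
import random
from statistics import mean


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for members in by_class.values():
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[(pos + offset) % k].append(idx)
        offset += len(members)
    return folds


def rank_features(X, y, train_idx, n_top):
    """Rank genes on training samples only (stand-in for linear-model ranking)."""
    scores = []
    for g in range(len(X[0])):
        pos = [X[i][g] for i in train_idx if y[i] == 1]
        neg = [X[i][g] for i in train_idx if y[i] == 0]
        scores.append((abs(mean(pos) - mean(neg)), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:n_top]]


def midpoint_rule(X, y, train_idx, test_idx, genes):
    """Toy classifier: threshold the top gene midway between the class means."""
    g = genes[0]
    pos_mean = mean([X[i][g] for i in train_idx if y[i] == 1])
    neg_mean = mean([X[i][g] for i in train_idx if y[i] == 0])
    cut = (pos_mean + neg_mean) / 2
    return [int((X[i][g] > cut) == (pos_mean > neg_mean)) for i in test_idx]


def outer_cross_validation(X, y, k, n_top, fit_and_predict, seed=0):
    """Return held-out predictions, selecting features within each training fold."""
    predictions = {}
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        genes = rank_features(X, y, train_idx, n_top)
        predictions.update(zip(test_idx, fit_and_predict(X, y, train_idx, test_idx, genes)))
    return predictions


# Hypothetical data: 8 patients, 2 genes; only gene 0 separates responders (1).
X = [[1.0, 0.3], [0.9, 0.3], [1.1, 0.3], [1.0, 0.3],
     [0.0, 0.3], [0.1, 0.3], [-0.1, 0.3], [0.0, 0.3]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
print(outer_cross_validation(X, y, k=2, n_top=1, fit_and_predict=midpoint_rule))
```

Selecting features inside the loop matters with small cohorts: ranking genes on all patients before splitting would leak outcome information into the test folds and inflate the reported performance.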

IL-17 Classifier Development
We have previously reported a list of 17 genes that were predictive of psoriasis patients' response to IL-17i, identified by analyzing a publicly available data set. 13,14 In brief, moderate to severe psoriasis patients (baseline PASI ≥ 10) were treated with brodalumab and followed for 12 weeks. PASI measurements were performed at baseline and week 12, and patients' treatment response was assessed using week 12 PASI75. Lesional and nonlesional skin biopsy samples were collected at baseline and week 12. RNA profiling was performed using an Affymetrix microarray platform. The lesional samples collected at baseline were used in our predictive biomarker analysis.
The 17 predictive genes correlated negatively with patients' response to brodalumab. The 17 genes were mapped to 14 Ensembl gene IDs reported in RNASeq data from the STAMP studies. The resulting 14-gene classifier was validated in the STAMP studies.

TNFi Classifier Development
Publicly available data sets in the NCBI Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) and European Bioinformatics Institute (EMBL-EBI) big data database (https://www.ebi.ac.uk/) were used as the classifier training data sets. For initial data selection, search terms covering psoriasis patients with biologics treatment and transcriptome profiles were used to identify either array or sequencing data. Supervised predictive biomarker selection was applied to each individual training data set to filter genes based on the following assessments: 1) correlation between gene expression and patient response; 2) median gene expression level; 3) gene expression dynamic range; 4) difference between the average gene expression of responders and non-responders. Ratios of genes down-regulated to genes up-regulated in TNFi responders were used to develop a prediction of TNFi treatment response.
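One way a ratio-based score of this kind could be computed is sketched below in Python. This is an assumption-laden illustration, not the study's algorithm: the gene names, the averaging of genes within each group, and the cutoff are all hypothetical. Because expression is on the log scale, the ratio of down-regulated to up-regulated genes becomes a difference of mean logCPM values, and the sketch assumes that a lower score indicates a more likely responder.

```python
from statistics import mean


def ratio_score(log_expr, down_genes, up_genes):
    """Log-scale ratio of down-regulated to up-regulated gene expression.

    log_expr: dict mapping gene ID -> logCPM for one patient.
    down_genes / up_genes: genes lower / higher in responders (hypothetical lists).
    On the log scale a ratio is a difference of means.
    """
    down = mean([log_expr[g] for g in down_genes])
    up = mean([log_expr[g] for g in up_genes])
    return down - up


def predict_responder(score, cutoff):
    """Call a responder when the score falls below a preset cutoff (assumed rule)."""
    return score < cutoff


# Hypothetical patient profile and gene lists.
patient = {"DOWN_1": 2.0, "DOWN_2": 4.0, "UP_1": 5.0, "UP_2": 7.0}
score = ratio_score(patient, ["DOWN_1", "DOWN_2"], ["UP_1", "UP_2"])
print(score, predict_responder(score, cutoff=0.0))
```

A within-sample ratio of this kind does not depend on absolute expression units, which is one plausible reason to prefer it when the training data (arrays, biopsies) and the test data (sequencing, patches) come from different platforms.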

Prospective classifier validation
IL-23i, IL-17i and TNFi classifiers were independently validated using the patients enrolled in STAMP studies. Each classifier discretely predicted a patient as either a responder or non-responder for biologic class. Response was defined as achieving PASI75 at week 12. The cross-tabulation of observed and predicted classes with associated statistics was calculated with the confusionMatrix function of the R caret package.

RESULTS

Characteristics of study subjects
A total of 242 psoriasis patients were enrolled in the STAMP studies (Figure 1) at the time of data lock, including 38 patients who were still in follow up. STAMP is an actively recruiting study designed to continue enrolling new patients to support psoriasis biomarker research. Varied demographics and clinical features of the study subjects were observed (Table 1). With regard to drug class, 49.6% of patients were treated with IL-23i, 33.2% were treated with IL-17i, 14.7% were treated with TNFi, and 2.5% were treated with IL-12/23i.
Of the 242 patients initially identified for this study, 185 patients completed the study, meaning that both baseline and week 12 PASI scores were collected, and 57 patients were either screen failures, lost to follow up, or still in follow up. Of the 185 patients, 177 had baseline DBP samples collected, while 8 patients failed DBP sample collection. Of these 177 samples, 167 passed sequencing data QC metrics and were included in the biomarker analysis, with 10 (5.6%, 10/177) failing either sample processing or sequencing data QC.
The patient response rate was 64.1% for the whole cohort and ranged from 47.6% to 72.5% for different biologics (Supplemental Table 2). High PASI patients (baseline PASI ≥ 8) had a 26.4% higher response rate than low PASI patients across all drug classes.
High PASI patients (baseline PASI ≥ 8) were used for predictive classifier development and validation. The IL-23i treated patient population was divided into two subsets: 17 IL-23i treated patients were used for training an IL-23i predictive classifier, and the remaining 43 patients were used for prospective validation. All high PASI IL-17i and TNFαi patients were used for classifier validation.

IL-23i Classifier Development and Performance in Training Set
A subset of 17 IL-23i treated high PASI (≥8) patients was used for IL-23i predictive classifier training, including 9 responders and 8 non-responders. The best performing model was built on glmnet using the top 50 features selected with a linear regression model. Test performance was assessed with ten-fold cross-validation, and the positive predictive value (PPV), sensitivity, and balanced accuracy were 89.7%, 96.3%, and 91.9%, respectively.

TNFi Data Source and Predictive Biomarker Discovery
Four publicly available datasets (Table 2) were identified and used for the TNFi response classifier development. 15,16,17,18 A total of 73 patients were included in these datasets, of whom 58 had both transcriptome data and outcome assessment data for predictive biomarker discovery. Patient outcome was assessed with PASI75 at week 12 or 16, or with histological response.
Supervised predictive biomarker selection was applied to the four training data sets. Nine genes were determined to be predictive of TNFi response in at least two datasets (CNFN, CTSC, GBAP1, CRABP2, PCDH7, PPIG, RAB31, C3, EGR1). Interestingly, CNFN was previously reported as a key gene determining psoriasis molecular classes, and EGR1 is well known as a key regulator in the psoriatic transcriptome. 19,20 The output of the classifier was a TNFi response prediction score; in this scoring system, the lower the prediction score, the higher the chance that the patient will respond to TNFi treatment. With this training set, the classifier showed PPV, sensitivity, and balanced accuracy of 78.9%, 44.1%, and 63.7%, respectively (Supplemental Table 3).

Mind.Px Classifier Validation
Patient demographics and disease characteristics for the 95 patients included in the prospective validation can be found in Supplemental Table 4. Only patients with baseline PASI≥8 were included in the validation. For the three classifiers, positive predictive value ranged from 85.7% to 93.1% (Table 3). Correlation between observed week 12 PASI changes and predicted drug response was assessed (Figure 2).
The same analysis was repeated for 66 moderate to severe disease patients (i.e., PASI≥10), and similar overall test performance was observed with PPV ranging from 90% to 100% in this smaller cohort (Supplemental Table 5).
Patients with baseline PASI<8 were also analyzed to determine the classifier performance in patients with milder disease. In this case, the balanced accuracy ranged from 44.4% to 52.8% for the three classifiers, suggesting that the developed classifiers were optimized for moderate to severe psoriasis patients.

Mind.Px Predicted Response Prevalence
The predicted response prevalence of patients by all three classifiers (IL-23i, IL-17i, and TNFi) was assessed using 195 patients who had baseline DBP samples and completed RNASeq sequencing data (Figure 3 and Table 4). Individually, the predicted response prevalence was 72.3%, 51.7%, and 67.1% for the IL-23i, IL-17i, and TNFi classifiers, respectively. Critically, 99.5% (194/195) of patients were predicted to respond to at least one of the three drug classes. All possible combinations of the three drug classes were represented, with 17.4% (34/195) of patients predicted as a responder to all three drug classes, 56.9% (111/195) of patients predicted as a responder to two of the three drug classes, and 25.1% (49/195) of patients predicted as a responder to one of the three drug classes.
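The tallies above can be reproduced with a few lines of Python. The sketch below is illustrative only: `tally_predicted_classes` is a hypothetical helper that counts, for each patient, how many of the three classifiers predicted a response, and the `reported` dictionary simply restates the counts given in the text (with the single remaining patient assigned to zero classes) to confirm that the percentages are internally consistent.

```python
from collections import Counter


def tally_predicted_classes(calls):
    """Count patients by the number of drug classes they are predicted to respond to.

    calls: list of dicts mapping drug class name -> bool (predicted responder).
    """
    return Counter(sum(patient.values()) for patient in calls)


# Hypothetical per-patient calls showing the expected input shape.
example_calls = [
    {"IL-23i": True, "IL-17i": False, "TNFi": True},
    {"IL-23i": False, "IL-17i": False, "TNFi": False},
]
print(tally_predicted_classes(example_calls))

# Counts restated from the text: number of predicted classes -> patients.
reported = {3: 34, 2: 111, 1: 49, 0: 1}
total = sum(reported.values())
for n_classes, count in reported.items():
    print(n_classes, count, f"{100 * count / total:.1f}%")
at_least_one = total - reported[0]
print(f"at least one class: {100 * at_least_one / total:.1f}%")
```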

STAMP Study Demographics
There were 242 patients included in this analysis, with demographics that largely were consistent with previous studies with respect to gender, race, and age (Table 2). Similarly, the average patient in these studies was obese (BMI>30), with a mean age of 48.5 years. Most interestingly, in this study, the vast majority of patients (86%) were biologic naïve or had not been administered a biologic within the past 12 weeks. Given that many moderate to severe psoriasis patients have been exposed to biologics, this finding was particularly surprising, but analysis of the classifier response of biologic naïve versus biologic exposed patients showed no difference between the predictive value of the algorithms in either of these patient groups (data not shown).

Classifier Development and Validation
The final IL-23i classifier was developed and validated using patients enrolled in the STAMP studies. A subset of the total IL-23i enrolled patients was used as the training set, and the remaining patients in the cohort were used for classifier validation. Since the training and test sets were from the same study, with samples and data processed in the same manner, the classifier developed with the training set can be applied to the test set.
The strategy for the development of the IL-17i and TNFi classifiers was different from that for the IL-23i classifier. The IL-17i and TNFi patient sample sizes in the STAMP studies were smaller than the IL-23i sample size and were not sufficient to divide into separate training and test sets, so publicly available datasets were used for the training of these two classifiers. The training sets from the public data differed significantly from the test sets in some aspects, including sample collection method (punch biopsy vs. dermal patch), RNA preparation protocol, and transcriptome profiling method (array vs. sequencing based). Because of these differences, the public training sets were used primarily for feature selection. Once the predictive genes were identified, a simplified algorithm that utilized gene expression values, or the ratio of gene expression values, was applied as the predictive classifier. The cutoffs were preset prior to the validation using percentile values calculated from the prediction scores of STAMP patients, which allowed for an assessment of classifier performance while minimizing the risk of overfitting.
Here, week 12 PASI75 was used as the patient outcome determination. In a clinical setting where a better response (e.g., PASI 90 or PASI 100) is desired, our classifier can potentially be used for the identification of this group of patients with certain clinical cutoff adjustment. Further classifier development to conclusively identify "super-responders" or "super-non-responders" is ongoing and will be reported in due course.
All three classifiers were validated for baseline PASI ≥ 8 patients. However, the predictive value of these classifiers in patients with lower starting PASI scores was limited. This could be because the three classifiers were developed with high (≥ 10) PASI patients as the training set in order to match the types of patients that were enrolled in the pivotal clinical studies for each biologic. It is possible that a mild patient might biologically have different transcriptomic biomarkers. Alternatively, in patients with low starting PASI scores, the reliability of the response determination measure (PASI 75) is low given the reduced dynamic range of the measurement.
Other clinical variables have been previously used to stratify psoriasis patients or have been correlated with poorer outcomes. In particular, BMI and age have been reported as having clinical prognostic value in assessing biologic treatment response. 21,22,23 We explored the predictive significance of BMI and age as possible orthogonal input variables in the classifier. However, adding either variable as a covariate in the predictive models did not improve predictive accuracy or positive predictive value (data not shown).
The prevalence of the biomarker predicted response revealed key features of this test. Only a single patient out of the 195 patients tested was predicted to not respond to any of the three biologic drug classes. Given the rarity of the "triple negative" outcome, the biological rationale for this patient's predicted failure is impossible to determine without a larger data set. It is possible that patients who fail all three classes of biologics have an altered immune system; greater study of these patients is required to fully elucidate the mechanism underpinning this phenotype. These data concur with a widely accepted clinical fact that the treatment and management of psoriasis has dramatically changed since the introduction of biologics; almost all patients will respond to one of the three

1 Medical history includes prescription and over-the-counter medication history.
2 Only applicable to subjects who have not been examined by a rheumatologist or dermatologist within 30 days prior to screening. Height, weight, and BMI are included.
3 If screening and baseline occur on the same day, clinician must ensure subject has refrained from any topical steroid use 2 weeks prior to the application of the Mindera Dermal Biomarker Patch.
4 For screening/baseline assessments, the physical exam, PASI, PGA, BSA, and/or Mindera Dermal Biomarker Patch application can be completed at either the screening visit or the baseline visit (PASI, PGA, and BSA should be completed at the same visit and before Mindera Dermal Biomarker Patch application).