Ultraviolet (UV) radiation may lead to melanoma and non-melanoma skin cancers by inducing helix-distorting DNA lesions such as cyclobutane pyrimidine dimers (CPDs). These lesions, if located in important genes and not repaired promptly, are mutagenic and may eventually result in carcinogenesis. Examining CPD formation and repair across the genome can shed light on the mutagenesis mechanisms associated with UV damage in relevant cancers. We recently developed CPD-Seq, a high-throughput, single-nucleotide-resolution sequencing technique that specifically captures UV-induced CPD lesions across the genome. This novel technique has been increasingly used in studies of UV damage and can be adapted to sequence other clinically relevant DNA lesions. While the library preparation protocol has been established, a systematic protocol for analyzing CPD-Seq data has not yet been described. To streamline the general and specific analysis steps, we developed a protocol named CPDSeqer to assist researchers with CPD-Seq data processing. CPDSeqer is flexible enough to accommodate both single- and multiple-sample experimental designs, and it supports both genome-wide analyses and regional scrutiny (for example, of suspected UV damage hotspots). The runtime of CPDSeqer scales with raw data size and is roughly four hours per sample, with the possibility of acceleration by parallel computing. Various diagnostic graphics are generated to help assess the performance of the experiment and to show regional enrichment of CPD formation. UV damage comparison analyses are supported in three analysis scenarios, and the resulting HTML pages report the direction of damage trends and their statistical significance.


CPDSeqer is available on GitHub.


Sheng Q, Hui Y, Duan M, Ness S, He J, Kang H, Jiang L, Wyrick J, Mao P, Guo Y. A streamlined solution for quality control, processing, and elucidating cyclobutane pyrimidine dimer sequencing data. Nature Protocols, 2020.