Algorithm for counting and collapsing the number of UMIs supporting a specific ligation.
counterUMI4C(wk_dir, pos_viewpoint, res_enz, digested_genome, filter_bp = 1e+07)
Argument | Description |
---|---|
wk_dir | Working directory in which to save the outputs generated by the UMI-4C analysis. |
pos_viewpoint | GRanges object containing the genomic position of the viewpoint. |
res_enz | Character string containing the restriction enzyme sequence. |
digested_genome | Path to the digested genome file generated using the digestGenome function. |
filter_bp | Integer indicating the number of bp upstream and downstream of the viewpoint to select for further analysis. Default: 1e+07 (10e6). |
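As a rough illustration of what filter_bp describes, the sketch below computes a window of filter_bp base pairs on each side of a viewpoint and checks which fragments overlap it. It uses base R only. The helpers viewpoint_window and in_window are hypothetical and not part of the package, and measuring the window from the viewpoint start and end (rather than its midpoint) is an assumption.

```r
# Hypothetical helper (not a package function): window of +/- filter_bp
# around a viewpoint, clipped at position 1.
viewpoint_window <- function(vp_start, vp_end, filter_bp = 1e7) {
  stopifnot(vp_start <= vp_end, filter_bp >= 0)
  c(start = max(1, vp_start - filter_bp), end = vp_end + filter_bp)
}

# Hypothetical helper: TRUE for fragments that overlap the window.
in_window <- function(frag_start, frag_end, win) {
  frag_end >= win[["start"]] & frag_start <= win[["end"]]
}

# Viewpoint from the example on this page: chr16:10972515-10972548
win <- viewpoint_window(10972515, 10972548)

# Three made-up fragments: one before, one inside and one after the window
keep <- in_window(
  frag_start = c(500000, 15000000, 25000000),
  frag_end   = c(500400, 15000300, 25000250),
  win        = win
)
```

With the default filter_bp, only the middle fragment falls inside the window.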
Creates a compressed tab-delimited file in wk_dir/count named "basename(fastq)_counts.tsv.gz", containing the coordinates of the viewpoint fragment, the contact fragment, and the number of UMIs detected in the ligation.
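To inspect the result, the counts files can be read back with base R. The following is a minimal sketch: read_counts is a hypothetical helper, not a package function, and whether the files carry a header row is an assumption, so it is exposed as an argument.

```r
# Hypothetical helper (not a package function): read every
# *_counts.tsv.gz file found under wk_dir/count into a named list.
read_counts <- function(wk_dir, header = TRUE) {
  files <- list.files(
    file.path(wk_dir, "count"),
    pattern = "_counts\\.tsv\\.gz$",
    full.names = TRUE
  )
  # read.delim() opens gzip-compressed files transparently
  tables <- lapply(files, read.delim, header = header, stringsAsFactors = FALSE)
  names(tables) <- basename(files)
  tables
}
```

For the example on this page, a call such as read_counts(file.path(path, "CIITA")) would return one data frame per counts file.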
When collapsing different molecules into the same UMI, the function takes into account both the ligation position and the number of mismatches between UMI sequences.
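The idea can be illustrated with a small base R sketch. This is not the package's implementation: it assumes a simple greedy rule in which, among reads sharing a ligation position, a UMI is treated as a new molecule only if it differs from every UMI already kept by more than max_mismatch bases. The helper names, the max_mismatch parameter and the toy data are all hypothetical.

```r
# Hamming distance between two equal-length UMI strings
hamming <- function(a, b) {
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Greedy collapse (assumed rule, for illustration only): keep a UMI only if
# it differs from every previously kept UMI by more than max_mismatch bases.
collapse_umis <- function(umis, max_mismatch = 1) {
  kept <- character(0)
  for (u in unique(umis)) {
    d <- vapply(kept, hamming, integer(1), b = u)
    if (all(d > max_mismatch)) kept <- c(kept, u)
  }
  kept
}

# Toy reads: ligation position (made-up coordinate) and UMI sequence
reads <- data.frame(
  pos = c(100, 100, 100, 250, 250),
  umi = c("ACGTAC", "ACGTAT", "TTGCAA", "ACGTAC", "ACGTAC"),
  stringsAsFactors = FALSE
)

# Collapse separately within each ligation position, then count UMIs
counts <- vapply(
  split(reads$umi, reads$pos),
  function(u) length(collapse_umis(u)),
  integer(1)
)
```

In this toy input, "ACGTAC" and "ACGTAT" at position 100 differ by one base and are merged, so position 100 is supported by 2 UMIs and position 250 by 1.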
if (interactive()) {
    path <- downloadUMI4CexampleData(reduced = TRUE)

    hg19_dpnii <- digestGenome(
        cut_pos = 0,
        res_enz = "GATC",
        name_RE = "DpnII",
        sel_chr = "chr16", # digest only chr16 to make example faster
        ref_gen = BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19,
        out_path = file.path(path, "digested_genome")
    )

    viewpoint <- GenomicRanges::GRanges("chr16:10972515-10972548")

    counterUMI4C(
        wk_dir = file.path(path, "CIITA"),
        pos_viewpoint = viewpoint,
        res_enz = "GATC",
        digested_genome = hg19_dpnii
    )
}