Algorithm for collapsing and counting the UMIs supporting a specific ligation.

counterUMI4C(
  wk_dir,
  pos_viewpoint,
  res_enz,
  digested_genome,
  filter_bp = 1e+07
)

Arguments

wk_dir

Working directory in which to save the outputs generated by the UMI-4C analysis.

pos_viewpoint

GRanges object containing the genomic position of the viewpoint.

res_enz

Character containing the restriction enzyme sequence.

digested_genome

Path to the digested genome file generated using the digestGenome function.

filter_bp

Integer indicating the number of bp upstream and downstream of the viewpoint to select for further analysis. Default: 1e7 (10 Mb).
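
As a rough illustration of what filter_bp selects, the sketch below keeps only fragments lying within filter_bp of the viewpoint. The fragment start positions and the use of fragment starts as the selection criterion are assumptions made for this example; the function's internal selection logic may differ.

```r
# Viewpoint coordinates taken from the example on this page; filter_bp is the default.
viewpoint_start <- 10972515
viewpoint_end   <- 10972548
filter_bp       <- 1e7

# Window spanning filter_bp upstream and downstream of the viewpoint.
window_start <- max(1, viewpoint_start - filter_bp)
window_end   <- viewpoint_end + filter_bp

# Hypothetical fragment start positions on the viewpoint chromosome.
frag_start <- c(500000, 972600, 15000000, 20972548, 25000000)

# Keep fragments whose start falls inside the window (assumed criterion).
in_window <- frag_start >= window_start & frag_start <= window_end
selected  <- frag_start[in_window]
print(selected)
```

With the default, fragments more than 10 Mb away from the viewpoint are excluded from further analysis.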

Value

Creates a compressed tab-delimited file in wk_dir/count named "basename(fastq)_counts.tsv.gz", containing the coordinates of the viewpoint fragment, the coordinates of the contact fragment, and the number of UMIs detected for that ligation.

Details

To decide whether different molecules are collapsed into the same UMI, the function takes into account the ligation position and the number of mismatches between UMI sequences.
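
The idea can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helpers hamming() and collapse_umis(), the toy reads, and the threshold max_mm = 1 are all assumptions introduced for this example, and the mismatch tolerance actually used by counterUMI4C is not stated on this page.

```r
# Hypothetical helper: number of positions at which two equal-length UMIs differ.
hamming <- function(a, b) {
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Hypothetical helper: greedily collapse the UMIs seen at ONE ligation position.
# A UMI is merged into an already kept UMI if it is within max_mm mismatches.
collapse_umis <- function(umis, max_mm = 1) {
  kept <- character(0)
  for (u in unique(umis)) {
    is_close <- vapply(kept, hamming, integer(1), b = u) <= max_mm
    if (!any(is_close)) kept <- c(kept, u)
  }
  kept
}

# Toy reads (assumed data): ligation position and UMI sequence of each read.
reads <- data.frame(
  ligation_pos = c(100, 100, 100, 100, 250, 250),
  umi = c("AACGT", "AACGA", "TTGCA", "AACGT", "AACGT", "GGGGG"),
  stringsAsFactors = FALSE
)

# Collapse separately per ligation position, then count the surviving UMIs.
umi_counts <- vapply(
  split(reads$umi, reads$ligation_pos),
  function(u) length(collapse_umis(u, max_mm = 1)),
  integer(1)
)
print(umi_counts)
```

In this toy data, "AACGA" is merged with "AACGT" at position 100 because they differ by a single base, while the same UMI "AACGT" seen at position 250 is counted separately because the ligation position differs.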

Examples

if (interactive()) {
  path <- downloadUMI4CexampleData(reduced = TRUE)

  hg19_dpnii <- digestGenome(
    cut_pos = 0,
    res_enz = "GATC",
    name_RE = "DpnII",
    sel_chr = "chr16", # digest only chr16 to make example faster
    ref_gen = BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19,
    out_path = file.path(path, "digested_genome")
  )

  viewpoint <- GenomicRanges::GRanges("chr16:10972515-10972548")

  counterUMI4C(
    wk_dir = file.path(path, "CIITA"),
    pos_viewpoint = viewpoint,
    res_enz = "GATC",
    digested_genome = hg19_dpnii
  )
}