scbiot.pp.remove_promoter_proximal_peaks

scbiot.pp.remove_promoter_proximal_peaks(adata, gtf_file, promoter_up=2000, promoter_down=500, chrom_col=None, start_col=None, end_col=None)

Remove peaks that overlap promoter windows defined from a GTF.

Parameters

adata:

ATAC AnnData whose peak coordinates are stored in adata.var columns or encoded in adata.var_names.

gtf_file:

Path to the GTF annotation used to define gene promoters.

promoter_up / promoter_down:

Upstream/downstream distances (bp) from the TSS defining promoter windows.
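
As a rough illustration of how promoter_up and promoter_down can define a window around a TSS, the sketch below uses the common strand-aware convention, in which "upstream" points toward lower coordinates on the + strand and toward higher coordinates on the - strand. Whether scbiot handles strand this way is an assumption; the helper name promoter_window is hypothetical and not part of the library.

```python
def promoter_window(tss, strand, promoter_up=2000, promoter_down=500):
    """Return (start, end) of a promoter window around a TSS.

    Assumption: strand-aware windows. This is a common convention,
    not a confirmed description of scbiot's internals.
    """
    if strand == "+":
        start, end = tss - promoter_up, tss + promoter_down
    elif strand == "-":
        # On the minus strand, upstream lies at higher coordinates.
        start, end = tss - promoter_down, tss + promoter_up
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    # Clamp so that windows near the chromosome start stay non-negative.
    return max(start, 0), end


plus_window = promoter_window(10_000, "+")
minus_window = promoter_window(10_000, "-")
```

With the defaults, a TSS at position 10,000 gives a 2,500 bp window on either strand; only its placement relative to the TSS differs.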

chrom_col / start_col / end_col:

Optional adata.var columns holding peak coordinates. If these are not provided, or the named columns are missing, the function tries standard column names and otherwise parses adata.var_names.
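
To make the var_names fallback concrete, here is a minimal sketch of parsing coordinates out of peak names. The accepted formats ("chr1:1000-2000", "chr1-1000-2000", "chr1_1000_2000") are assumptions based on common ATAC conventions, and parse_peak_name is a hypothetical helper; the formats scbiot actually recognizes are not stated here.

```python
import re

# Chromosome, then ':' / '-' / '_', start, then '-' / '_', end.
# The lazy chromosome group lets names such as "chr1_KI270706v1_random" through.
_PEAK_NAME = re.compile(r"(?P<chrom>.+?)[:\-_](?P<start>\d+)[\-_](?P<end>\d+)")


def parse_peak_name(name):
    """Split a peak name into (chrom, start, end); raise ValueError if it does not fit."""
    match = _PEAK_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"cannot parse peak name: {name!r}")
    return match["chrom"], int(match["start"]), int(match["end"])


parsed = [parse_peak_name(n) for n in ("chr1:1000-2000", "chr2_5_10")]
```

If your peak names follow a different scheme, passing chrom_col, start_col, and end_col explicitly avoids relying on any parsing at all.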

Returns

AnnData

Copy of adata with promoter-proximal peaks removed. The input adata itself is not filtered; it is annotated in place with adata.var["is_promoter_proximal"].
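
The relationship between the annotation and the filtered copy can be sketched with plain Python lists. This assumes half-open (chrom, start, end) intervals and a simple pairwise overlap test; the names overlaps, peaks, and promoters are illustrative only and do not reflect how scbiot stores or compares intervals.

```python
def overlaps(peak, window):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    p_chrom, p_start, p_end = peak
    w_chrom, w_start, w_end = window
    return p_chrom == w_chrom and p_start < w_end and w_start < p_end


peaks = [("chr1", 7_900, 8_100), ("chr1", 10_500, 10_900), ("chr2", 9_000, 9_500)]
promoters = [("chr1", 8_000, 10_500)]

# Analogous to adata.var["is_promoter_proximal"]: one flag per peak.
is_promoter_proximal = [any(overlaps(p, w) for w in promoters) for p in peaks]

# Analogous to the returned copy: only peaks that are not flagged.
kept = [p for p, flag in zip(peaks, is_promoter_proximal) if not flag]
```

In this sketch the second peak starts exactly where the promoter window ends, so under the half-open assumption it is kept.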

Examples

Basic usage:

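>>> import scbiot as scb
>>> # `atac` is an existing ATAC AnnData and `data_dir` is your data directory.
>>> # Download a GTF from GENCODE; the vM25 file below is a mouse release:
>>> # https://www.gencodegenes.org/mouse/
>>> gtf_file = f"{data_dir}/inputs/gencode.vM25.chr_patch_hapl_scaff.annotation.gtf.gz"
>>> adata_atac = scb.pp.remove_promoter_proximal_peaks(atac, gtf_file)
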
Parameters:
  • adata (anndata.AnnData)
  • gtf_file (Path | str)
  • promoter_up (int)
  • promoter_down (int)
  • chrom_col (str | None)
  • start_col (str | None)
  • end_col (str | None)

Return type:

anndata.AnnData