episcanpy.api¶
API¶
Import epiScanpy’s high-level API as:
import episcanpy.api as epi
Count Matrices: CT¶
Loading data and annotations, building count matrices, and filtering lowly covered methylation variables and lowly covered cells.
Load features¶
To build a count matrix for either methylation or open chromatin data, you must first load a segmentation of the genome of interest or the set of features of interest.
|
The load features function transforms a BED file into a usable set of units over which to measure methylation levels. |
|
Generate windows/bins of the given size for the appropriate genome (default choice is human). |
|
If the loaded features are too small or of different sizes, they can be normalised to a single given size by extending the feature coordinates in both directions. |
|
Plot the different feature sizes in a histogram. |
|
Extract the names of the loaded features, specifying the chromosome they originated from. |
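The two coordinate operations described above (tiling a genome into fixed-size windows and bringing features to a common size) can be sketched in plain Python. This is only an illustration of the idea, not epiScanpy's implementation: the function names `tile_genome` and `resize_features`, the toy chromosome lengths, and the half-open coordinate convention are all assumptions. Note that this sketch also shrinks features larger than the target size, which the text above does not mention.

```python
def tile_genome(chrom_sizes, size):
    """Tile each chromosome into consecutive windows of at most `size` bp.

    chrom_sizes: {chromosome name: length in bp}
    Returns {chromosome name: [(start, end), ...]} with half-open intervals.
    """
    windows = {}
    for chrom, length in chrom_sizes.items():
        windows[chrom] = [
            (start, min(start + size, length))
            for start in range(0, length, size)
        ]
    return windows


def resize_features(features, size, chrom_length):
    """Resize (start, end) features to `size` bp centred on their midpoint.

    Coordinates are clipped so they stay within [0, chrom_length].
    """
    half = size // 2
    resized = []
    for start, end in features:
        mid = (start + end) // 2
        new_start = max(0, mid - half)
        new_end = min(chrom_length, new_start + size)
        resized.append((new_start, new_end))
    return resized


# Toy chromosome lengths (hypothetical values, not a real genome build).
toy_genome = {"chr1": 2500, "chr2": 1000}
print(tile_genome(toy_genome, 1000)["chr1"])
# [(0, 1000), (1000, 2000), (2000, 2500)]
print(resize_features([(100, 140), (900, 1300)], 200, 2500))
# [(20, 220), (1000, 1200)]
```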
Reading methylation file¶
Functions to read methylation files, extract methylation levels and build the count matrices:
|
Build methylation count matrix for a given annotation. |
|
Read the file from which to extract methylation levels and, assuming it follows the Ecker/Methylpy format, extract the number of methylated reads and the total number of reads for each covered cytosine in the requested genomic context (CG or CH). Parameter sample_name: name of the file to read. |
|
Read the raw count matrix and convert it into an AnnData object. |
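To make the extraction step concrete, here is a minimal sketch of how per-cytosine read counts could be aggregated into one methylation level per feature. It is not epiScanpy's code. The column layout (chromosome, position, strand, trinucleotide context, methylated reads, total reads) is an assumption modelled on Methylpy-style files, and `in_context` and `methylation_per_feature` are hypothetical helper names.

```python
from collections import defaultdict


def in_context(trinucleotide, wanted):
    """True if the cytosine is in the wanted context ("CG" or "CH")."""
    is_cg = trinucleotide[:2] == "CG"
    return is_cg if wanted == "CG" else not is_cg


def methylation_per_feature(rows, features, context="CG"):
    """Return {feature name: methylated reads / total reads}.

    rows: tab-separated lines (assumed layout, see the lead-in above)
    features: list of (name, chrom, start, end), half-open coordinates
    Features without any covered cytosine are left out of the result.
    """
    methylated = defaultdict(int)
    covered = defaultdict(int)
    for row in rows:
        chrom, pos, _strand, tri, mc, cov = row.rstrip("\n").split("\t")[:6]
        if not in_context(tri, context):
            continue
        for name, f_chrom, start, end in features:
            if chrom == f_chrom and start <= int(pos) < end:
                methylated[name] += int(mc)
                covered[name] += int(cov)
    return {
        name: methylated[name] / covered[name]
        for name in covered
        if covered[name] > 0
    }


# Toy input rows, made up for illustration only.
toy_rows = [
    "chr1\t10\t+\tCGA\t3\t4",
    "chr1\t20\t-\tCGT\t1\t4",
    "chr1\t30\t+\tCAT\t5\t5",
    "chr1\t150\t+\tCGG\t0\t2",
]
toy_features = [("f1", "chr1", 0, 100), ("f2", "chr1", 100, 200), ("f3", "chr1", 200, 300)]
print(methylation_per_feature(toy_rows, toy_features, context="CG"))
# {'f1': 0.5, 'f2': 0.0}
```

The feature `f3` is absent from the result because no cytosine covers it. Keeping uncovered features distinct from features with a methylation level of zero is what makes a later imputation step necessary.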
Reading open chromatin (ATAC) file¶
ATAC-seq specific functions to build count matrices and load data:
|
Build a count matrix one set of features at a time. |
|
Convert a regular ATAC matrix into a sparse AnnData object. |
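Single-cell ATAC matrices are mostly zeros, which is why a sparse representation is worthwhile. The sketch below shows the underlying idea with the standard library only: keep just the non-zero entries as coordinate (COO) triplets. The function name `to_coo` and the toy matrix are assumptions for illustration; this is not how epiScanpy performs the conversion.

```python
def to_coo(dense):
    """Return (rows, cols, values, shape) holding only non-zero entries.

    dense: list of rows (cells), each a list of counts (features).
    """
    rows, cols, values = [], [], []
    for i, row in enumerate(dense):
        for j, value in enumerate(row):
            if value != 0:
                rows.append(i)
                cols.append(j)
                values.append(value)
    shape = (len(dense), len(dense[0]) if dense else 0)
    return rows, cols, values, shape


# Toy cells x features matrix, made up for illustration.
toy_counts = [
    [0, 2, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
rows, cols, values, shape = to_coo(toy_counts)
print(rows, cols, values, shape)
# [0, 1] [1, 3] [2, 1] (3, 4)
```

Triplets of this form are what `scipy.sparse.coo_matrix((values, (rows, cols)), shape=shape)` accepts, and a SciPy sparse matrix can serve as the data matrix of an AnnData object.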
Preprocessing: PP¶
Imputing missing data (methylation), filtering lowly covered cells or variables, and correcting for batch effects.
|
Histogram of the number of open features (in the case of ATAC-seq data) per cell. |
|
|
|
Convert the count matrix into a binary matrix. |
|
Automatically computes PCA coordinates, loadings and variance decomposition, a neighborhood graph of observations, a t-distributed stochastic neighbor embedding (tSNE) and a Uniform Manifold Approximation and Projection (UMAP). |
|
Load observational metadata in adata.obs. |
|
Load sparse matrix (including matrices corresponding to 10x data) as AnnData objects. |
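Binarisation and coverage filtering are simple enough to sketch without any library. The example below is an assumption-laden illustration, not epiScanpy's implementation: the helper names `binarize`, `open_features_per_cell` and `cells_passing` are hypothetical, and "open" is taken to mean a count greater than zero.

```python
def binarize(matrix):
    """Turn counts into 0/1: 1 if the feature has at least one read."""
    return [[1 if value > 0 else 0 for value in row] for row in matrix]


def open_features_per_cell(binary):
    """Number of open features for each cell (row)."""
    return [sum(row) for row in binary]


def cells_passing(binary, min_features):
    """Indices of cells with at least `min_features` open features."""
    return [
        i
        for i, n_open in enumerate(open_features_per_cell(binary))
        if n_open >= min_features
    ]


# Toy cells x features count matrix, made up for illustration.
toy_counts = [
    [0, 3, 1],
    [0, 0, 0],
    [2, 0, 0],
]
binary = binarize(toy_counts)
print(binary)                          # [[0, 1, 1], [0, 0, 0], [1, 0, 0]]
print(open_features_per_cell(binary))  # [2, 0, 1]
print(cells_passing(binary, 1))        # [0, 2]
```

The per-cell totals are the values a coverage histogram would display, and they give a basis for choosing the filtering threshold.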
Methylation matrices¶
Methylation-specific count matrices.
|
Impute missing values in methylation level matrices. |
|
Read the raw count matrix and convert it into an AnnData object. |
|
Temporary function to load and impute a methylation count matrix into an AnnData object. |
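The page does not say which imputation strategy is used, so the following is only one plausible sketch: replacing each missing methylation level with the mean of the observed values for that feature. The function name `impute_feature_mean`, the use of `None` for missing values, and the fallback of 0.0 for a feature with no observations are all assumptions.

```python
def impute_feature_mean(matrix):
    """Replace None entries with the mean of the observed values per feature.

    matrix: list of rows (cells), each a list of methylation levels or None.
    A feature with no observed value at all falls back to 0.0 (assumption).
    """
    n_features = len(matrix[0]) if matrix else 0
    means = []
    for j in range(n_features):
        observed = [row[j] for row in matrix if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in matrix
    ]


# Toy cells x features methylation levels, made up for illustration.
toy_levels = [
    [0.5, None],
    [1.0, 0.5],
    [None, None],
]
print(impute_feature_mean(toy_levels))
# [[0.5, 0.5], [1.0, 0.5], [0.75, 0.5]]
```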
Tools: TL¶
|
A wrapper around scanpy's sc.tl.rank_genes_groups function. |
|
Compute silhouette scores. |
|
Automatically computes PCA coordinates, loadings and variance decomposition, a neighborhood graph of observations, a t-distributed stochastic neighbor embedding (tSNE) and a Uniform Manifold Approximation and Projection (UMAP). |
|
Convert a list of known cell type markers from the literature into a dictionary. Input: list of known marker genes. The first row is considered the header. |
|
Use the markers of a given cell type to plot peak openness for peaks in the promoters of those markers. Input: cell type, cell type markers, peak promoter intersections. |
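The marker conversion described above can be sketched as follows. The input layout is an assumption: a comma-separated table whose header row holds cell type names and whose columns list marker genes, with empty cells where one cell type has fewer markers. The function name `markers_to_dict` and the toy table are illustrative and are not taken from epiScanpy.

```python
import csv
import io


def markers_to_dict(handle):
    """Read a marker table and return {cell type: [marker genes]}.

    The first row is treated as the header (cell type names); empty cells
    are skipped.
    """
    reader = csv.reader(handle)
    header = [name.strip() for name in next(reader)]
    markers = {cell_type: [] for cell_type in header}
    for row in reader:
        for cell_type, gene in zip(header, row):
            if gene.strip():
                markers[cell_type].append(gene.strip())
    return markers


# Toy marker table for illustration only.
toy_table = io.StringIO("T cell,B cell\nCD3E,CD19\nCD4,MS4A1\nCD8A,\n")
print(markers_to_dict(toy_table))
# {'T cell': ['CD3E', 'CD4', 'CD8A'], 'B cell': ['CD19', 'MS4A1']}
```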
Plotting: PL¶
The plotting module episcanpy.plotting largely parallels the tl.* and a few of the pp.* functions.
For most tools and for some preprocessing functions, you’ll find a plotting function with the same name.