episcanpy.api¶

API¶

Import epiScanpy’s high-level API as:

import episcanpy.api as epi

Count Matrices: CT¶

Loading data and annotations, building count matrices, and filtering lowly covered methylation variables and cells.

Load features¶

To build a count matrix for either methylation or open chromatin data, you must first load the segmentation of the genome of interest or the set of features of interest.

`ct.load_features`(file_features[, …])	Transform a BED file into a usable set of units in which to measure methylation levels.
`ct.make_windows`(size[, chromosomes, max_length])	Generate windows/bins of the given size for the appropriate genome (default choice is human).
`ct.size_feature_norm`(loaded_feature, size)	If the loaded features are too small or of different sizes, they can be normalised to a single given size by extending the feature coordinates in both directions.
`ct.plot_size_features`(loaded_feature[, …])	Plot the different feature sizes in a histogram.
`ct.name_features`(loaded_features)	Extract the names of the loaded features, specifying the chromosome they originated from.
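
To make the two ideas above concrete (cutting a genome into fixed-size windows, and resizing features to one common size), here is a minimal pure-Python sketch. It is not epiScanpy's implementation: the helper names `toy_make_windows` and `toy_size_norm`, the half-open `(start, end)` tuples, the midpoint-centred resizing and the chromosome lengths are all assumptions made for illustration.

```python
def toy_make_windows(size, chrom_lengths):
    """Split each chromosome into consecutive bins of at most `size` bp.

    Hypothetical helper; the last bin is truncated at the chromosome end.
    """
    windows = {}
    for chrom, length in chrom_lengths.items():
        windows[chrom] = [
            (start, min(start + size, length))
            for start in range(0, length, size)
        ]
    return windows


def toy_size_norm(features, size):
    """Resize (start, end) features to exactly `size` bp around their midpoint.

    Hypothetical helper; the start is clamped at 0 (an assumption).
    """
    resized = []
    for start, end in features:
        mid = (start + end) // 2
        new_start = max(0, mid - size // 2)
        resized.append((new_start, new_start + size))
    return resized


# Toy chromosome lengths (made-up values, not a real genome build).
toy_lengths = {"chr1": 2500, "chr2": 1000}
windows = toy_make_windows(1000, toy_lengths)

# Two features of different sizes, both resized to 200 bp.
resized = toy_size_norm([(100, 140), (900, 1300)], 200)
```

In this sketch every resized feature has the same width, which is the property that makes per-feature counts comparable across features.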

Reading methylation file¶

Functions to read methylation files, extract methylation levels and build the count matrices:

`ct.build_count_mtx`(cells, annotation[, …])	Build methylation count matrix for a given annotation.
`ct.read_cyt_summary`(sample_name, meth_type, …)	Read a methylation file (assumed to be in the Ecker/Methylpy format) and extract the number of methylated reads and the total number of reads for the covered cytosines in the right genomic context (CG or CH). `sample_name` is the name of the file to read.
`ct.load_met_noimput`(matrix_file[, path, save])	Read the raw count matrix and convert it into an AnnData object.
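
The step from per-cytosine read counts to a per-feature methylation level can be sketched as follows. This is an illustration only, not the code behind `ct.build_count_mtx`: the record layout `(chrom, position, methylated_reads, total_reads)`, the half-open feature intervals and the use of `None` for uncovered features are assumptions, and the function name is hypothetical.

```python
def toy_methylation_levels(cytosines, features):
    """Return one methylation level per feature for a single cell.

    The level is (methylated reads) / (total reads) summed over the
    cytosines that fall inside the feature; None marks an uncovered feature.
    """
    levels = []
    for chrom, start, end in features:
        meth = total = 0
        for c_chrom, pos, n_meth, n_total in cytosines:
            if c_chrom == chrom and start <= pos < end:
                meth += n_meth
                total += n_total
        levels.append(meth / total if total else None)
    return levels


# Made-up cytosine records for one cell.
cytosines = [
    ("chr1", 10, 3, 4),
    ("chr1", 50, 1, 4),
    ("chr1", 150, 0, 2),
]
features = [("chr1", 0, 100), ("chr1", 100, 200), ("chr1", 200, 300)]

levels = toy_methylation_levels(cytosines, features)
```

The third feature has no covered cytosine, so its level stays undefined rather than being set to zero. Missing values of this kind are what the imputation functions in the preprocessing module are meant to handle.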

Reading open chromatin (ATAC) file¶

ATAC-seq specific functions to build count matrices and load data:

`ct.bld_atac_mtx`(list_bam_files, loaded_feat)	Build a count matrix one set of features at a time.
`ct.save_sparse_mtx`(initial_matrix[, …])	Convert a regular ATAC matrix into a sparse AnnData object.

General functions¶

Functions that are not omic-specific:

`ct.save_sparse_mtx`

Preprocessing: PP¶

Imputing missing data (methylation), filtering lowly covered cells or variables, and correcting for batch effects.

`pp.coverage_cells`(adata[, bins, key_added, …])	Histogram of the number of open features (in the case of ATAC-seq data) per cell.
`pp.commoness_features`
`pp.binarize`(adata[, copy])	Convert the count matrix into a binary matrix.
`pp.lazy`(adata[, pp_pca, nb_pcs, …])	Automatically computes PCA coordinates, loadings and variance decomposition, a neighborhood graph of observations, a t-distributed stochastic neighbor embedding (tSNE), and a Uniform Manifold Approximation and Projection (UMAP).
`pp.load_metadata`(adata, metadata_file[, …])	Load observational metadata in adata.obs.
`pp.read_ATAC_10x`(matrix[, cell_names, …])	Load sparse matrix (including matrices corresponding to 10x data) as AnnData objects.
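
As a rough illustration of what binarising a count matrix and counting open features per cell mean, here is a pure-Python sketch on a toy matrix (rows are cells, columns are features). It does not reproduce `pp.binarize` or `pp.coverage_cells`: the threshold "any count above zero is open", the list-of-lists layout and the helper names are assumptions.

```python
def toy_binarize(matrix):
    """Map every positive count to 1 and everything else to 0."""
    return [[1 if value > 0 else 0 for value in row] for row in matrix]


def toy_open_features_per_cell(binary):
    """Number of open (non-zero) features in each cell (row)."""
    return [sum(row) for row in binary]


# Toy count matrix: 3 cells x 3 features (made-up values).
counts = [
    [0, 3, 1],
    [2, 0, 0],
    [0, 0, 0],
]

binary = toy_binarize(counts)
coverage = toy_open_features_per_cell(binary)
```

A histogram of `coverage` is the kind of summary used to spot lowly covered cells such as the third one here, which has no open feature at all.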

Methylation matrices¶

Methylation-specific count matrices.

`pp.imputation_met`(adata[, …])	Impute missing values in methylation level matrices.
`pp.load_met_noimput`(matrix_file[, path, save])	Read the raw count matrix and convert it into an AnnData object.
`pp.readandimputematrix`(file_name[, min_coverage])	Temporary function to load a methylation count matrix, impute missing values and return an AnnData object.

Tools: TL¶

`tl.rank_features`(adata, groupby[, omic, …])	Wrapper around scanpy's sc.tl.rank_genes_groups function.
`tl.silhouette`(adata_name, cluster_annot[, …])	Compute silhouette scores.
`tl.lazy`(adata[, pp_pca, copy])	Automatically computes PCA coordinates, loadings and variance decomposition, a neighborhood graph of observations, a t-distributed stochastic neighbor embedding (tSNE), and a Uniform Manifold Approximation and Projection (UMAP).
`tl.load_markers`(path, marker_list_file)	Convert a list of known cell type markers from the literature into a dictionary. Input: list of known marker genes; the first row is considered the header.
`tl.identify_cluster`(adata, cell_type, …[, …])	Use markers of a given cell type to plot peak openness for peaks in the promoters of those markers. Input: cell type, cell type markers, peak-promoter intersections.

Plotting: PL¶

The plotting module episcanpy.plotting largely parallels the tl.* and a few of the pp.* functions. For most tools and for some preprocessing functions, you’ll find a plotting function with the same name.