ovrlpy.Visualizer
=================

.. py:class:: ovrlpy.Visualizer(KDE_bandwidth=1.5, celltyping_min_expression=10, celltyping_min_distance=5, n_components_pca=0.7, dtype=np.float32, umap_kwargs=UMAP_2D_PARAMS, cumap_kwargs=UMAP_RGB_PARAMS)

   A class to visualize spatial transcriptomics data.
   Contains a latent gene expression UMAP and RGB embedding.

   :param KDE_bandwidth: The bandwidth of the KDE.
   :type KDE_bandwidth: float, optional
   :param celltyping_min_expression: Minimum expression level for cell typing.
   :type celltyping_min_expression: int, optional
   :param celltyping_min_distance: Minimum distance for cell typing.
   :type celltyping_min_distance: int, optional
   :param n_components_pca: Number of components for PCA.
   :type n_components_pca: float, optional
   :param dtype: Datatype for the KDE.
   :param umap_kwargs: Keyword arguments for 2D UMAP embedding.
   :type umap_kwargs: dict, optional
   :param cumap_kwargs: Keyword arguments for 3D UMAP embedding.
   :type cumap_kwargs: dict, optional

   .. attribute:: KDE_bandwidth

      The bandwidth of the KDE.

      :type: float

   .. attribute:: celltyping_min_expression

      Minimum expression level for cell typing.

      :type: int

   .. attribute:: celltyping_min_distance

      Minimum distance for cell typing.

      :type: int

   .. attribute:: pseudocell_locations_x

      x-coordinates of cell typing regions of interest obtained through gene expression localmax sampling.

      :type: numpy.ndarray

   .. attribute:: pseudocell_locations_y

      y-coordinates of cell typing regions of interest obtained through gene expression localmax sampling.

      :type: numpy.ndarray

   .. attribute:: pseudocell_expression_samples

      Gene expression matrix of the cell typing regions of interest.

      :type: pandas.DataFrame

   .. attribute:: signatures

      A matrix of celltypes x gene signatures to use to annotate the UMAP.

      :type: pandas.DataFrame

   .. attribute:: celltype_centers

      The center of gravity of each celltype in the 2d embedding, used for UMAP annotation.

      :type: numpy.ndarray

   .. attribute:: celltype_class_assignments

      The class assignments of the cell types.

      :type: numpy.ndarray

   .. attribute:: pca_2d

      The PCA object used for the 2d embedding.

      :type: sklearn.decomposition.PCA

   .. attribute:: embedder_2d

      The UMAP object used for the 2d embedding.

      :type: umap.UMAP

   .. attribute:: pca_3d

      The PCA object used for the 3d RGB embedding.

      :type: sklearn.decomposition.PCA

   .. attribute:: embedder_3d

      The UMAP object used for the 3d RGB embedding.

      :type: umap.UMAP

   .. attribute:: n_components_pca

      Number of components for PCA.

      :type: float

   .. attribute:: umap_kwargs

      Keyword arguments for 2D UMAP embedding object.

      :type: dict

   .. attribute:: cumap_kwargs

      Keyword arguments for 3D UMAP RGB embedding object.

      :type: dict

   .. attribute:: genes

      A list of genes to utilize in the model.

      :type: list

   .. attribute:: embedding

      The 2d embedding of pseudocell gene expression.

      :type: numpy.ndarray

   .. attribute:: colors

      The RGB embedding.

      :type: numpy.ndarray

   .. attribute:: colors_min_max

      The minimum and maximum values of the RGB embedding, necessary for normalization of the transform method.

      :type: list

   .. attribute:: integrity_map

      The integrity map of the tissue.

      :type: numpy.ndarray

   .. attribute:: signal_map

      A pixel map of overall signal strength in the tissue, used to mask out low-signal regions that are difficult to interpret.

      :type: numpy.ndarray

   .. py:method:: fit_transcripts(coordinate_df, genes=None, gene_key = 'gene', signature_matrix=None, fit_umap = True, patch_length = 500, n_workers = 8)

      Fits the visualizer to a spatial transcripts dataset using the SSAM algorithm.

      :param coordinate_df: A dataframe of coordinates.
      :type coordinate_df: pandas.DataFrame
      :param genes: A list of genes to utilize in the model. None uses all genes.
      :type genes: list
      :param gene_key: The key in the dataframe containing the gene names.
      :type gene_key: str
      :param signature_matrix: A matrix of celltypes x gene signatures to use to annotate the UMAP.
      :type signature_matrix: pandas.DataFrame
      :param fit_umap: Whether to fit the UMAP to the data.
      :type fit_umap: bool
      :param patch_length: Side length, in each dimension, of the patches used when calculating signal integrity. Smaller values will use less memory, but may take longer to compute.
      :type patch_length: int
      :param n_workers: The number of workers to use in the SSAM algorithm.
      :type n_workers: int

   .. py:method:: fit_pseudocells(pseudocell_expression_samples, *, genes = None, fit_umap = True)

      Fits the visualizer to a given pseudocell expression sample.

      :param pseudocell_expression_samples: A cell x gene matrix of gene expression.
      :type pseudocell_expression_samples: pandas.DataFrame
      :param genes: A list of genes to utilize in the model.
      :type genes: list, optional
      :param fit_umap: Whether to fit the UMAP to the data.
      :type fit_umap: bool

   .. py:method:: fit_signatures(signature_matrix=None)

      Fits the visualizer with a given signature matrix.

      :param signature_matrix: A matrix of celltypes x gene signatures to use to annotate the UMAP. None defaults to displaying individual genes.
      :type signature_matrix: pandas.DataFrame

   .. py:method:: subsample_df(x, y, coordinate_df, window_size = 30)

      Subsamples the coordinate dataframe spatially based on given x, y coordinates and window size.

      :param x: x-coordinate to center the sampling window.
      :type x: float
      :param y: y-coordinate to center the sampling window.
      :type y: float
      :param coordinate_df: DataFrame of gene-annotated molecule coordinates to create the subsample from.
      :type coordinate_df: pandas.DataFrame
      :param window_size: The window size of the sampling window. Molecules within this window around (x,y) are sampled and returned as a new DataFrame.
      :type window_size: int, optional

   .. py:method:: transform_transcripts(coordinate_df)

      Transforms the coordinate dataframe to the visualizer's 2d and 3d embedding space.

      :param coordinate_df: DataFrame of gene-annotated molecule coordinates to transform.
      :type coordinate_df: pandas.DataFrame

   .. py:method:: transform_pseudocells(pseudocell_expression_samples)

      Transforms a matrix of gene expression to the visualizer's 2d and 3d embedding space.

      :param pseudocell_expression_samples: A cell x gene matrix of gene expression.
      :type pseudocell_expression_samples: pandas.DataFrame

   .. py:method:: pseudocell_df()

      Returns a pandas.DataFrame containing the gene-count matrix of the fitted tissue's determined pseudo-cells.

      :rtype: pandas.DataFrame

   .. py:method:: plot_region_of_interest(subsample, subsample_embedding_color, x = None, y = None, window_size = None, rasterized = True, scalebar = SCALEBAR_PARAMS)

      Plots an instance of the visualized data.

      :param subsample: A dataframe of molecule coordinates and gene assignments.
      :type subsample: pandas.DataFrame
      :param subsample_embedding_color: A list of rgb values for each molecule.
      :type subsample_embedding_color: pandas.DataFrame
      :param x: Center x-coordinate for the region-of-interest.
      :type x: float
      :param y: Center y-coordinate for the region-of-interest.
      :type y: float
      :param window_size: Window size of the region-of-interest.
      :type window_size: float, optional
      :param rasterized: If True all plots will be rasterized.
      :type rasterized: bool, optional
      :param scalebar: If ``None`` no scalebar will be plotted. Otherwise a dictionary with additional kwargs for ``matplotlib_scalebar.scalebar.ScaleBar``. By default :py:attr:`ovrlpy.SCALEBAR_PARAMS`
      :type scalebar: dict[str, typing.Any] | None

   .. py:method:: plot_umap(ax = None, rasterized = False, **kwargs)

      Plots the UMAP embedding.

      :param ax: Axis object to plot on.
      :type ax: typing.Optional[matplotlib.axes.Axes]
      :param rasterized: If True the plot will be rasterized.
      :type rasterized: bool, optional
      :param kwargs: Keyword arguments for :py:func:`matplotlib.pyplot.scatter`.

   .. py:method:: plot_tissue(rasterized = False, scalebar = SCALEBAR_PARAMS, **kwargs)

      Plots the tissue embedding.

      :param rasterized: If True the plot will be rasterized.
      :type rasterized: bool, optional
      :param scalebar: If ``None`` no scalebar will be plotted. Otherwise a dictionary with additional kwargs for ``matplotlib_scalebar.scalebar.ScaleBar``. By default :py:attr:`ovrlpy.SCALEBAR_PARAMS`
      :type scalebar: dict[str, typing.Any] | None
      :param kwargs: Keyword arguments for matplotlib's scatter plot function.

   .. py:method:: plot_fit(rasterized = True, umap_kwargs={'scatter_kwargs': {'s': 1}}, tissue_kwargs={'s': 1})

      Plots the fitted model.

      :param rasterized: If True all plots will be rasterized.
      :type rasterized: bool, optional
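
The windowed sampling described for ``subsample_df`` above can be illustrated with plain pandas. The following is a minimal sketch and not the library's implementation: the helper name ``subsample_window``, the column names ``x`` and ``y``, the inclusive bounds, and the reading of ``window_size`` as the half-width of a square window are all assumptions made for illustration.

```python
import pandas as pd


def subsample_window(coordinate_df, x, y, window_size=30):
    """Hypothetical helper: keep molecules in a square window centred on (x, y).

    Assumes numeric columns named "x" and "y", and treats window_size as the
    half-width of the window with inclusive bounds. None of this is confirmed
    by the reference text above.
    """
    in_window = (
        coordinate_df["x"].between(x - window_size, x + window_size)
        & coordinate_df["y"].between(y - window_size, y + window_size)
    )
    # Boolean indexing returns a new DataFrame holding only the matching rows.
    return coordinate_df[in_window]


# Toy molecule table; column names and values are illustrative only.
molecules = pd.DataFrame(
    {
        "x": [10.0, 45.0, 120.0, 55.0],
        "y": [12.0, 50.0, 48.0, 95.0],
        "gene": ["A", "B", "A", "C"],
    }
)

roi = subsample_window(molecules, x=50.0, y=50.0, window_size=30)
print(roi)
```

A subsample obtained this way has the same shape of data that ``transform_transcripts`` and ``plot_region_of_interest`` are documented to accept, namely a DataFrame of gene-annotated molecule coordinates.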