ovrlpy.Visualizer
=================

.. py:class:: ovrlpy.Visualizer(KDE_bandwidth=1.5, celltyping_min_expression=10, celltyping_min_distance=5, n_components_pca=0.7, dtype=np.float32, umap_kwargs=UMAP_2D_PARAMS, cumap_kwargs=UMAP_RGB_PARAMS)

   A class to visualize spatial transcriptomics data.
   Contains a latent gene expression UMAP and RGB embedding.

   :param KDE_bandwidth: The bandwidth of the KDE.
   :type KDE_bandwidth: float, optional
   :param celltyping_min_expression: Minimum expression level for cell typing.
   :type celltyping_min_expression: int, optional
   :param celltyping_min_distance: Minimum distance for cell typing.
   :type celltyping_min_distance: int, optional
   :param n_components_pca: Number of components for PCA.
   :type n_components_pca: float, optional
   :param dtype: Datatype for the KDE.
   :type dtype: numpy.dtype, optional
   :param umap_kwargs: Keyword arguments for 2D UMAP embedding.
   :type umap_kwargs: dict, optional
   :param cumap_kwargs: Keyword arguments for 3D UMAP embedding.
   :type cumap_kwargs: dict, optional

   .. attribute:: KDE_bandwidth

      The bandwidth of the KDE.

      :type: float

   .. attribute:: celltyping_min_expression

      Minimum expression level for cell typing.

      :type: int

   .. attribute:: celltyping_min_distance

      Minimum distance for cell typing.

      :type: int

   .. attribute:: pseudocell_locations_x

      x-coordinates of the cell-typing regions of interest, obtained by sampling local maxima of gene expression.

      :type: numpy.ndarray

   .. attribute:: pseudocell_locations_y

      y-coordinates of the cell-typing regions of interest, obtained by sampling local maxima of gene expression.

      :type: numpy.ndarray

   .. attribute:: pseudocell_expression_samples

      Gene expression matrix of the cell typing regions of interest.

      :type: pandas.DataFrame

   .. attribute:: signatures

      A matrix of celltypes x gene signatures to use to annotate the UMAP.

      :type: pandas.DataFrame

   .. attribute:: celltype_centers

      The center of gravity of each celltype in the 2d embedding, used for UMAP annotation.

      :type: numpy.ndarray
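
      As an illustration of what a per-celltype "center of gravity" in the 2d
      embedding can look like, here is a minimal NumPy sketch. It assumes an
      unweighted mean and integer class labels; the helper ``class_centers`` is
      hypothetical and not part of ovrlpy, which may compute the centers
      differently.

      .. code-block:: python

         import numpy as np


         def class_centers(embedding, assignments, n_classes):
             """Unweighted mean embedding position of each class (NaN if empty)."""
             centers = np.full((n_classes, embedding.shape[1]), np.nan)
             for c in range(n_classes):
                 members = embedding[assignments == c]
                 if len(members):
                     centers[c] = members.mean(axis=0)
             return centers


         # Toy 2d embedding with three points assigned to two classes.
         embedding = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
         assignments = np.array([0, 0, 1])
         centers = class_centers(embedding, assignments, n_classes=2)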

   .. attribute:: celltype_class_assignments

      The class assignments of the cell types.

      :type: numpy.ndarray

   .. attribute:: pca_2d

      The PCA object used for the 2d embedding.

      :type: sklearn.decomposition.PCA

   .. attribute:: embedder_2d

      The UMAP object used for the 2d embedding.

      :type: umap.UMAP

   .. attribute:: pca_3d

      The PCA object used for the 3d RGB embedding.

      :type: sklearn.decomposition.PCA

   .. attribute:: embedder_3d

      The UMAP object used for the 3d RGB embedding.

      :type: umap.UMAP

   .. attribute:: n_components_pca

      Number of components for PCA.

      :type: float
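
      If this value is passed unchanged to
      :py:class:`sklearn.decomposition.PCA` (an assumption; this reference does
      not say so), a float between 0 and 1 is read as the fraction of variance
      to retain rather than as a component count. The NumPy sketch below mirrors
      that selection rule. The helper ``n_components_for_variance`` is
      hypothetical and not part of ovrlpy.

      .. code-block:: python

         import numpy as np


         def n_components_for_variance(X, fraction):
             """Smallest number of principal components whose cumulative
             explained-variance ratio exceeds ``fraction``."""
             X_centered = X - X.mean(axis=0)
             # Squared singular values are proportional to per-component variance.
             singular_values = np.linalg.svd(X_centered, compute_uv=False)
             ratio = singular_values**2 / np.sum(singular_values**2)
             cumulative = np.cumsum(ratio)
             return int(np.searchsorted(cumulative, fraction, side="right") + 1)


         rng = np.random.default_rng(0)
         # Toy data: one dominant axis of variation plus four weak ones.
         X = rng.normal(size=(200, 5)) * np.array([10.0, 1.0, 1.0, 1.0, 1.0])
         k = n_components_for_variance(X, 0.7)

      With one dominant axis, a single component already covers 70% of the
      variance, so ``k`` is 1 here. A stricter fraction keeps more components.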

   .. attribute:: umap_kwargs

      Keyword arguments for 2D UMAP embedding object.

      :type: dict

   .. attribute:: cumap_kwargs

      Keyword arguments for 3D UMAP RGB embedding object.

      :type: dict

   .. attribute:: genes

      A list of genes to utilize in the model.

      :type: list

   .. attribute:: embedding

      The 2d embedding of pseudocell gene expression.

      :type: numpy.ndarray

   .. attribute:: colors

      The RGB embedding.

      :type: numpy.ndarray

   .. attribute:: colors_min_max

      The minimum and maximum values of the RGB embedding, needed to normalize the output of the transform methods.

      :type: list
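
      Storing the extrema from fitting lets later transforms be scaled into the
      same color range. The sketch below shows one way this could work. It
      assumes a ``[min, max]`` layout with one value per channel and clips
      out-of-range values; both are assumptions, and the helper names are
      hypothetical.

      .. code-block:: python

         import numpy as np


         def fit_min_max(raw):
             """Per-channel minimum and maximum of a raw 3d embedding."""
             return [raw.min(axis=0), raw.max(axis=0)]


         def normalize_colors(raw, min_max):
             """Scale a raw 3d embedding to [0, 1] using stored extrema."""
             lo, hi = min_max
             scaled = (raw - lo) / (hi - lo)
             # New data can fall outside the fitted range, so clip to valid RGB.
             return np.clip(scaled, 0.0, 1.0)


         raw_fit = np.array([[-2.0, 0.0, 5.0], [2.0, 4.0, 9.0], [0.0, 2.0, 7.0]])
         min_max = fit_min_max(raw_fit)
         colors = normalize_colors(raw_fit, min_max)

         # A later transform reuses the stored extrema instead of refitting them.
         raw_new = np.array([[3.0, -1.0, 7.0]])
         colors_new = normalize_colors(raw_new, min_max)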

   .. attribute:: integrity_map

      The integrity map of the tissue.

      :type: numpy.ndarray

   .. attribute:: signal_map

      A pixel map of overall signal strength in the tissue, used to mask out low-signal regions that are difficult to interpret.

      :type: numpy.ndarray


   .. py:method:: fit_transcripts(coordinate_df, genes=None, gene_key='gene', signature_matrix=None, fit_umap=True, patch_length=500, n_workers=8)

      Fits the visualizer to a spatial transcripts dataset using the SSAM algorithm.

      :param coordinate_df: A dataframe of coordinates.
      :type coordinate_df: pandas.DataFrame
      :param genes: A list of genes to utilize in the model. None uses all genes.
      :type genes: list
      :param gene_key: The key in the dataframe containing the gene names.
      :type gene_key: str
      :param signature_matrix: A matrix of celltypes x gene signatures to use to annotate the UMAP.
      :type signature_matrix: pandas.DataFrame
      :param fit_umap: Whether to fit the UMAP to the data.
      :type fit_umap: bool
      :param patch_length: Side length, in each dimension, of the patches in which signal integrity is calculated.
                           Smaller values use less memory but may take longer to compute.
      :type patch_length: int
      :param n_workers: The number of workers to use in the SSAM algorithm.
      :type n_workers: int
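
      A minimal usage sketch follows. The coordinate column names ``x``, ``y``
      and ``z`` are assumptions, since only ``gene_key`` is documented here, and
      the helpers ``make_toy_transcripts`` and ``fit_on_transcripts`` are
      hypothetical. Random toy data like this is only meant to show the expected
      table layout, not to produce a meaningful fit.

      .. code-block:: python

         import numpy as np
         import pandas as pd


         def make_toy_transcripts(n_molecules=1000, seed=0):
             """One row per detected molecule: coordinates plus a gene label."""
             rng = np.random.default_rng(seed)
             return pd.DataFrame(
                 {
                     "x": rng.uniform(0, 100, n_molecules),
                     "y": rng.uniform(0, 100, n_molecules),
                     "z": rng.uniform(0, 10, n_molecules),
                     "gene": rng.choice(["GeneA", "GeneB", "GeneC"], n_molecules),
                 }
             )


         def fit_on_transcripts(coordinate_df):
             """Fit a Visualizer with the documented keyword arguments."""
             import ovrlpy  # imported lazily so the helper above runs without it

             visualizer = ovrlpy.Visualizer()
             visualizer.fit_transcripts(coordinate_df, gene_key="gene", n_workers=2)
             return visualizer


         coordinate_df = make_toy_transcripts()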


   .. py:method:: fit_pseudocells(pseudocell_expression_samples, *, genes = None, fit_umap = True)

      Fits the visualizer to a given pseudocell expression sample.

      :param pseudocell_expression_samples: A cell x gene matrix of gene expression.
      :type pseudocell_expression_samples: pandas.DataFrame
      :param genes: A list of genes to utilize in the model.
      :type genes: list, optional
      :param fit_umap: Whether to fit the UMAP to the data.
      :type fit_umap: bool


   .. py:method:: fit_signatures(signature_matrix=None)

      Fits the visualizer with a given signature matrix.

      :param signature_matrix: A matrix of celltypes x gene signatures to use to annotate the UMAP.
                               None defaults to displaying individual genes.
      :type signature_matrix: pandas.DataFrame
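
      How the signatures are derived is outside the scope of this reference. As
      one common option, the sketch below averages an annotated cell x gene
      matrix per cell type, which gives the celltypes x genes orientation
      described above. The toy data and labels are made up for illustration.

      .. code-block:: python

         import pandas as pd

         # Toy annotated expression matrix: rows are cells, columns are genes.
         expression = pd.DataFrame(
             {"GeneA": [10, 12, 0, 1], "GeneB": [0, 2, 8, 6]},
             index=["cell1", "cell2", "cell3", "cell4"],
         )
         labels = pd.Series(
             ["neuron", "neuron", "glia", "glia"], index=expression.index
         )

         # Mean expression per cell type: rows are cell types, columns are genes.
         signature_matrix = expression.groupby(labels).mean()

      The resulting frame could then be passed as ``signature_matrix``.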


   .. py:method:: subsample_df(x, y, coordinate_df, window_size = 30)

      Subsamples the coordinate dataframe spatially based on given x, y coordinates and window
      size.

      :param x: x-coordinate to center the sampling window.
      :type x: float
      :param y: y-coordinate to center the sampling window.
      :type y: float
      :param coordinate_df: DataFrame of gene-annotated molecule coordinates to create the subsample from.
      :type coordinate_df: pandas.DataFrame
      :param window_size: The window size of the sampling window. Molecules within this window around (x,y)
                          are sampled and returned as a new DataFrame.
      :type window_size: int, optional
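
      The windowing idea can be sketched in plain pandas. This is not the
      library's implementation: the column names ``x`` and ``y`` are assumed,
      and ``window_size`` is treated here as a half-width around the center.
      Whether ovrlpy interprets it as a half-width or a full width is not stated
      in this reference.

      .. code-block:: python

         import pandas as pd


         def window_subsample(coordinate_df, x, y, window_size=30):
             """Return the molecules inside a square window centered on (x, y)."""
             in_window = coordinate_df["x"].between(
                 x - window_size, x + window_size
             ) & coordinate_df["y"].between(y - window_size, y + window_size)
             return coordinate_df[in_window].copy()


         coordinate_df = pd.DataFrame(
             {
                 "x": [0.0, 50.0, 55.0, 200.0],
                 "y": [0.0, 50.0, 45.0, 50.0],
                 "gene": ["A", "B", "A", "C"],
             }
         )
         subsample = window_subsample(coordinate_df, x=50, y=50, window_size=30)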


   .. py:method:: transform_transcripts(coordinate_df)

      Transforms the coordinate dataframe to the visualizer's 2d and 3d embedding space.

      :param coordinate_df: Data frame of gene-annotated molecule coordinates to transform.
      :type coordinate_df: pandas.DataFrame


   .. py:method:: transform_pseudocells(pseudocell_expression_samples)

      Transforms a matrix of gene expression to the visualizer's 2d and 3d embedding space.

      :param pseudocell_expression_samples: A cell x gene matrix of gene expression.
      :type pseudocell_expression_samples: pandas.DataFrame
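
      A transform generally needs the same genes, in the same order, as the fit
      used. Whether ovrlpy aligns the columns internally is not stated here, so
      the sketch below shows one way to do it up front with pandas. The helper
      ``align_to_genes`` is hypothetical, and filling missing genes with zero is
      an assumption.

      .. code-block:: python

         import pandas as pd


         def align_to_genes(expression, genes):
             """Reorder columns to ``genes``; drop extras, zero-fill missing ones."""
             return expression.reindex(columns=genes, fill_value=0)


         # Gene list as it might be stored after fitting (toy values).
         fitted_genes = ["GeneA", "GeneB", "GeneC"]
         new_samples = pd.DataFrame(
             {"GeneC": [3, 0], "GeneA": [1, 5], "GeneX": [9, 9]}
         )
         aligned = align_to_genes(new_samples, fitted_genes)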


   .. py:method:: pseudocell_df()

      Returns a pandas.DataFrame containing the gene-count matrix of the fitted
      tissue's determined pseudo-cells.

      :rtype: pandas.DataFrame


   .. py:method:: plot_region_of_interest(subsample, subsample_embedding_color, x = None, y = None, window_size = None, rasterized = True, scalebar = SCALEBAR_PARAMS)

      Plots an instance of the visualized data.

      :param subsample: A dataframe of molecule coordinates and gene assignments.
      :type subsample: pandas.DataFrame
      :param subsample_embedding_color: A list of rgb values for each molecule.
      :type subsample_embedding_color: pandas.DataFrame
      :param x: Center x-coordinate for the region-of-interest.
      :type x: float
      :param y: Center y-coordinate for the region-of-interest.
      :type y: float
      :param window_size: Window size of the region-of-interest.
      :type window_size: float, optional
      :param rasterized: If True all plots will be rasterized.
      :type rasterized: bool, optional
      :param scalebar: If ``None``, no scalebar will be plotted. Otherwise a dictionary with
                       additional kwargs for ``matplotlib_scalebar.scalebar.ScaleBar``.
                       By default :py:attr:`ovrlpy.SCALEBAR_PARAMS`
      :type scalebar: dict[str, typing.Any] | None


   .. py:method:: plot_umap(ax = None, rasterized = False, **kwargs)

      Plots the UMAP embedding.

      :param ax: Axis object to plot on.
      :type ax: typing.Optional[matplotlib.axes.Axes]
      :param rasterized: If True the plot will be rasterized.
      :type rasterized: bool, optional
      :param kwargs: Keyword arguments for :py:func:`matplotlib.pyplot.scatter`.


   .. py:method:: plot_tissue(rasterized = False, scalebar = SCALEBAR_PARAMS, **kwargs)

      Plots the tissue embedding.

      :param rasterized: If True the plot will be rasterized.
      :type rasterized: bool, optional
      :param scalebar: If ``None``, no scalebar will be plotted. Otherwise a dictionary with
                       additional kwargs for ``matplotlib_scalebar.scalebar.ScaleBar``.
                       By default :py:attr:`ovrlpy.SCALEBAR_PARAMS`
      :type scalebar: dict[str, typing.Any] | None
      :param kwargs: Keyword arguments for matplotlib's scatter plot function.


   .. py:method:: plot_fit(rasterized = True, umap_kwargs={'scatter_kwargs': {'s': 1}}, tissue_kwargs={'s': 1})

      Plots the fitted model.

      :param rasterized: If True all plots will be rasterized.
      :type rasterized: bool, optional