MERFISH mouse liver
In this notebook, we use ovrlpy to investigate Vizgen's MERFISH mouse liver dataset.
We want to create a signal embedding of the transcriptome and a vertical signal incoherence map that identifies locations with a high risk of containing spatial doublets.
Settings and Imports
First, let’s define settings and input files.
from pathlib import Path
import matplotlib.pyplot as plt
import ovrlpy
sample_nr = 1
slice_nr = 1
data_path = Path("/dh-projects/ag-ishaque/raw_data/vizgen-merfish/vz-liver-showcase")
coordinate_file = (
    data_path / f"Liver{sample_nr}Slice{slice_nr}" / "detected_transcripts.csv"
)
Loading the data
Next, we want to load the data.
coordinate_df = ovrlpy.io.read_MERFISH(coordinate_file)
print(f"Number of transcripts: {len(coordinate_df):,}")
Number of transcripts: 417,243,171
coordinate_df.head()
| | x | y | z | gene |
|---|---|---|---|---|
| 0 | 2506.4070 | -95.451480 | 0.0 | Comt |
| 1 | 2531.8447 | -95.187020 | 0.0 | Comt |
| 2 | 2483.7969 | -91.360115 | 0.0 | Comt |
| 3 | 2505.7693 | -84.081650 | 0.0 | Comt |
| 4 | 2501.3940 | -81.387090 | 0.0 | Comt |
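For readers who want to see what such a coordinate table involves, here is a minimal pandas sketch that produces the same four columns from a tiny in-memory CSV. It is not how `ovrlpy.io.read_MERFISH` is implemented. The input column names (`global_x`, `global_y`, `global_z`, `gene`), the helper name `read_transcripts`, and the chosen dtypes are assumptions made for illustration.

```python
import io

import pandas as pd

# Tiny stand-in for a detected_transcripts.csv file.
# The column names below are an assumption about the file layout.
csv_text = """barcode_id,global_x,global_y,global_z,x,y,fov,gene
0,2506.4070,-95.451480,0.0,10.0,20.0,0,Comt
0,2531.8447,-95.187020,0.0,11.0,21.0,0,Comt
"""


def read_transcripts(source):
    """Hypothetical helper: load only coordinates and gene labels."""
    df = pd.read_csv(
        source,
        # skip every column that is not needed downstream
        usecols=["global_x", "global_y", "global_z", "gene"],
        # compact dtypes matter when the file holds hundreds of millions of rows
        dtype={
            "global_x": "float32",
            "global_y": "float32",
            "global_z": "float32",
            "gene": "category",
        },
    )
    return df.rename(columns={"global_x": "x", "global_y": "y", "global_z": "z"})


toy_df = read_transcripts(io.StringIO(csv_text))
print(toy_df.head())
```

Restricting the columns and storing gene names as a categorical keeps memory use manageable for a table of this size.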
The dataset is quite large, so we will subset to a smaller region.
# subset to a rectangular region of interest
x_lims = (2000, 9000)
y_lims = (1000, 9000)
coordinate_df = coordinate_df.loc[
    lambda df: (df["x"] > x_lims[0])
    & (df["x"] < x_lims[1])
    & (df["y"] > y_lims[0])
    & (df["y"] < y_lims[1])
]
# shift the coordinates so that the subset starts at the origin
coordinate_df = coordinate_df.assign(
    x=lambda df: df["x"] - df["x"].min(), y=lambda df: df["y"] - df["y"].min()
)
print(f"Number of transcripts: {len(coordinate_df):,}")
Number of transcripts: 300,693,961
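The crop-and-shift step can be checked on a toy table before running it on the full dataset. The sketch below wraps the same logic in a helper; the name `crop_and_shift` and the toy values are made up for illustration.

```python
import pandas as pd


def crop_and_shift(df, x_lims, y_lims):
    """Keep rows strictly inside the window, then shift the minimum x/y to 0."""
    inside = df["x"].between(*x_lims, inclusive="neither") & df["y"].between(
        *y_lims, inclusive="neither"
    )
    cropped = df.loc[inside]
    return cropped.assign(
        x=cropped["x"] - cropped["x"].min(),
        y=cropped["y"] - cropped["y"].min(),
    )


# hypothetical transcripts: the first and last row fall outside the x window
toy = pd.DataFrame(
    {
        "x": [1500.0, 2500.0, 4000.0, 9500.0],
        "y": [1200.0, 1500.0, 3000.0, 2000.0],
        "gene": ["A", "B", "C", "D"],
    }
)
cropped = crop_and_shift(toy, x_lims=(2000, 9000), y_lims=(1000, 9000))
print(cropped)
```

Note that the shift uses the minimum of the remaining coordinates, not the window limits, so the new origin sits at the first transcript inside the window.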
Tissue overview
plt.scatter(coordinate_df.loc[::100, "x"], coordinate_df.loc[::100, "y"], s=0.1)
_ = plt.gca().set_aspect("equal", adjustable="box")
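The overview plot draws every 100th transcript. Because the table appears to be ordered by gene, a strided slice is a systematic sample. A random sample is an alternative if the row order is a concern. The sketch below compares both on synthetic coordinates; the toy data and variable names are assumptions.

```python
import numpy as np
import pandas as pd

# synthetic coordinates standing in for the transcript table
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    {"x": rng.uniform(0, 100, 10_000), "y": rng.uniform(0, 100, 10_000)}
)

# systematic subsample: every 100th row, which depends on the row order
every_100th = toy.iloc[::100]

# random subsample of the same size, which does not depend on the row order
random_1pct = toy.sample(frac=0.01, random_state=0)

print(len(every_100th), len(random_1pct))
```

Either subsample can be passed to `plt.scatter` in the same way as above.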
Compute & Visualize coherence map
integrity, signal, visualizer = ovrlpy.run(
    df=coordinate_df, cell_diameter=7, n_expected_celltypes=10, n_workers=8
)
Running vertical adjustment
Creating gene expression embeddings for visualization:
Analyzing in 3d mode:
determining pseudocells:
found 137299 pseudocells
sampling expression:
100%|██████████| 120/120 [08:21<00:00, 4.18s/it]
Modeling 10 pseudo-celltype clusters;
Creating signal integrity map:
0%| | 1/224 [00:03<14:37, 3.94s/it]/dh-projects/ag-ishaque/analysis/muellni/envs/ovrlpy/lib/python3.12/site-packages/ovrlpy/_utils.py:397: RuntimeWarning: invalid value encountered in divide
spatial_patch_cosine_similarity[patch_signal_mask] = np.sum(
100%|██████████| 224/224 [13:35<00:00, 3.64s/it]
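The `RuntimeWarning: invalid value encountered in divide` in the log above is raised on a line that computes a cosine similarity. A likely cause is a 0/0 division in pixels without any signal. The sketch below is not ovrlpy's implementation. It only illustrates, under that assumption, how a cosine similarity between two expression vectors becomes undefined when one of them is all zeros, and how to return NaN there without a warning. The function name and the toy vectors are hypothetical.

```python
import numpy as np


def cosine_similarity_safe(top, bottom):
    """Row-wise cosine similarity; NaN where either vector has zero norm."""
    top = np.asarray(top, dtype=float)
    bottom = np.asarray(bottom, dtype=float)
    dot = np.sum(top * bottom, axis=-1)
    norm = np.linalg.norm(top, axis=-1) * np.linalg.norm(bottom, axis=-1)
    # pre-fill with NaN and divide only where the denominator is non-zero
    out = np.full(dot.shape, np.nan)
    np.divide(dot, norm, out=out, where=norm > 0)
    return out


# three toy pixels with two genes each: identical, empty top, orthogonal
top = [[1, 0], [0, 0], [1, 1]]
bottom = [[1, 0], [1, 0], [1, -1]]
sims = cosine_similarity_safe(top, bottom)
print(sims)
```

Pixels with an undefined similarity carry no expression signal, which is consistent with applying a signal threshold in the steps that follow.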
visualizer.plot_fit()
Signal integrity of the tissue sample
fig, ax = ovrlpy.plot_signal_integrity(integrity, signal, signal_threshold=3)
Doublet probability
doublet_df = ovrlpy.detect_doublets(
    integrity, signal, minimum_signal_strength=3, integrity_sigma=3
)
# plt.scatter returns the plotted collection, which the colorbar needs
scatter = plt.scatter(
    doublet_df["x"], doublet_df["y"], c=doublet_df["integrity"], s=0.1, cmap="viridis_r"
)
plt.gca().set_aspect("equal", adjustable="box")
_ = plt.colorbar(scatter)
To inspect a specific doublet event, plot the region of interest around its coordinates.
doublet_case = 100
x, y = doublet_df.loc[doublet_case, ["x", "y"]]
_ = ovrlpy.plot_region_of_interest(
    x, y, coordinate_df, visualizer, integrity, signal, window_size=40
)
/dh-projects/ag-ishaque/analysis/muellni/envs/ovrlpy/lib/python3.12/site-packages/sklearn/base.py:493: UserWarning: X does not have valid feature names, but PCA was fitted with feature names
warnings.warn(
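Instead of picking an arbitrary row such as `doublet_case = 100`, one option is to rank the candidates by their integrity score and start with the lowest ones. The sketch below uses a toy table with the `x`, `y`, and `integrity` columns used above. The values are made up, and the row order that `ovrlpy.detect_doublets` returns is not assumed.

```python
import pandas as pd

# toy stand-in for doublet_df; the values are hypothetical
toy_doublets = pd.DataFrame(
    {
        "x": [10.0, 20.0, 30.0, 40.0],
        "y": [5.0, 15.0, 25.0, 35.0],
        "integrity": [0.62, 0.18, 0.45, 0.30],
    }
)

# lowest integrity first, i.e. the strongest doublet candidates on top
ranked = toy_doublets.sort_values("integrity").reset_index(drop=True)

# coordinates of the top-ranked candidate
x, y = ranked.loc[0, ["x", "y"]]
print(x, y)
```

The resulting `x` and `y` can then be passed to `ovrlpy.plot_region_of_interest` as in the cell above.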