gimVI seq and spatial mixing

scrnaseq · December 11, 2020, 5:51pm

Thank you for this great set of tools! I’ve been following the gimVI tutorial to integrate my scRNA-seq data with spatial data, but I’ve noticed that the seq and spatial data don’t mix well when I plot the model’s latent representations on a UMAP.

Could you offer some tips on parameters to tweak during model training that might improve the seq/spatial mixing?

adamgayoso · December 14, 2020, 5:19am

I would start by increasing the number of encoder and decoder layers. Could you post the script you’re using?

scrnaseq · December 15, 2020, 5:01pm

Thanks, here is my script:

import os
import copy
import numpy as np
import pandas as pd
import scanpy as sc
import anndata as ad
from scvi.data import setup_anndata
from scvi.model import GIMVI

home_path = 'drive/My Drive/_output'
in_path = os.path.join(home_path, '121120_spatial_data.h5ad')
spatial_data = sc.read_h5ad(in_path)
in_path = os.path.join(home_path, '121120_seq_data.h5ad')
seq_data = sc.read_h5ad(in_path)

seq_data = seq_data[:, spatial_data.var_names].copy()
seq_gene_names = seq_data.var_names

train_size = 0.8
n_genes = seq_data.n_vars
n_train_genes = int(n_genes * train_size)

rand_train_gene_idx = np.random.choice(range(n_genes), n_train_genes, replace=False)
rand_test_gene_idx = sorted(set(range(n_genes)) - set(rand_train_gene_idx))
rand_train_genes = seq_gene_names[rand_train_gene_idx]
rand_test_genes = seq_gene_names[rand_test_gene_idx]

spatial_data_partial = spatial_data[:, rand_train_genes].copy()

sc.pp.filter_cells(spatial_data_partial, min_counts=1)
sc.pp.filter_cells(seq_data, min_counts=1)

setup_anndata(spatial_data_partial, labels_key='ClusterName', batch_key='TMA_12')
setup_anndata(seq_data, labels_key='cl_CellType', batch_key='cType')

spatial_data = spatial_data[spatial_data_partial.obs_names, :]

model = GIMVI(seq_data, spatial_data_partial)
model.train(200)

latent_seq, latent_spatial = model.get_latent_representation()

n = 150000
np.random.seed(54321)
seq_idxs = np.random.choice(latent_seq.shape[0], n, replace=False)
spatial_idxs = np.random.choice(latent_spatial.shape[0], n, replace=False)

latent_representation = np.concatenate([latent_seq[seq_idxs, :], latent_spatial[spatial_idxs, :]])
latent_adata = ad.AnnData(latent_representation)
latent_labels = (['seq'] * n) + (['spatial'] * n)
latent_adata.obs['labels'] = latent_labels

sc.tl.pca(latent_adata)
sc.pp.neighbors(latent_adata)
sc.tl.umap(latent_adata)
sc.pl.umap(latent_adata[np.random.permutation(np.arange(latent_adata.obs.shape[0])), :], color='labels')

adamgayoso · December 21, 2020, 4:43pm

You might try increasing the weight of the adversarial loss, which would be the kappa parameter when you call the .train() method.
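
When comparing runs trained with different kappa values, it may help to put a number on the mixing instead of judging it from the UMAP alone. The following is a minimal sketch of one possible check, not part of scvi-tools: the function name `knn_mixing_score` and the default `k=15` are assumptions made for illustration. It computes, for every cell, the fraction of its k nearest neighbors in latent space that come from the other modality, and averages that over all cells.

```python
import numpy as np


def knn_mixing_score(latent_a, latent_b, k=15):
    """Mean fraction of each cell's k nearest neighbors that belong to the other dataset.

    Hypothetical helper for illustration. Uses brute-force pairwise distances,
    so subsample large datasets (a few thousand cells per modality) first.
    """
    latent = np.concatenate([latent_a, latent_b])
    # True for rows that came from latent_b, False for rows from latent_a
    is_b = np.concatenate(
        [np.zeros(len(latent_a), dtype=bool), np.ones(len(latent_b), dtype=bool)]
    )

    # Pairwise squared Euclidean distances between all cells
    sq_norms = (latent ** 2).sum(axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * latent @ latent.T
    np.fill_diagonal(dists, np.inf)  # a cell is not its own neighbor

    # Indices of the k closest cells for each row (unordered within the k)
    neighbors = np.argpartition(dists, k, axis=1)[:, :k]

    # Fraction of neighbors whose modality differs from the query cell's
    from_other = is_b[neighbors] != is_b[:, None]
    return from_other.mean()


# Example call on subsampled latents from the script above (names taken from that script):
# score = knn_mixing_score(latent_seq[seq_idxs[:3000]], latent_spatial[spatial_idxs[:3000]])
```

A score near 0 means the two modalities occupy separate regions of the latent space. For equally sized, well-mixed samples the score should sit around 0.5, so one could retrain with a few kappa values and see whether the score moves in that direction.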

scrnaseq · December 22, 2020, 4:16am

Thank you! I will try higher values of kappa. I also noticed that the spatial_data clusters don’t separate well on the UMAP, even though the seq_data clusters do separate well. Are there any parameters that would influence one but not the other?
