You need to call the method on the instance of the model you trained, not on the scvi.model.MULTIVI class itself. In the tutorial, adata_mvi is the AnnData object; the trained model is the object created by scvi.model.MULTIVI(adata_mvi, ...). Assuming you stored that object in a variable called model, you need to replace
dge_df = scvi.model.MULTIVI.differential_expression(adata = adata_mvi, groupby = 'leiden')
with
dge_df = model.differential_expression(adata = adata_mvi, groupby = 'leiden')
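To illustrate why the call has to go through the trained instance, here is a minimal sketch with a toy class. ToyModel and its return value are made up for illustration and are not part of scvi-tools; only the method name and the adata/groupby keywords mirror the call above.

```python
class ToyModel:
    """Toy stand-in for a trained model (hypothetical, not the scvi-tools class)."""

    def __init__(self, adata):
        # A real model would keep its trained weights here as well.
        self.adata = adata

    def differential_expression(self, adata=None, groupby=None):
        # Fall back to the data the model was created with.
        adata = self.adata if adata is None else adata
        return f"DE on {len(adata)} cells grouped by {groupby!r}"


cells = ["cell_a", "cell_b", "cell_c"]
model = ToyModel(cells)

# Calling the method on the class: there is no instance to bind to `self`,
# so Python raises a TypeError.
class_call_error = None
try:
    ToyModel.differential_expression(adata=cells, groupby="leiden")
except TypeError as err:
    class_call_error = str(err)

# Calling the method on the instance works.
result = model.differential_expression(adata=cells, groupby="leiden")
print(class_call_error)
print(result)
```

The same pattern applies to the real model: the trained parameters live on the instance, so that is the object the method needs.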
Regarding your other questions, I haven’t used this particular model, but the tutorial takes 36 minutes to train for 12k cells with a GPU. So if you have ~120k cells and are also using a GPU, it sounds reasonable. Without a GPU, training will be slower (I don’t know by how much).
With the RNA-seq scVI models my experience is that you can use fewer epochs for training if you have more cells, which cuts down training time. My typical workflow is to run quick training with fewer epochs while I explore hyperparameters/settings. Then, when I think I understand the variation in the data, I start a longer training run and save the model so I can just load it when I want to use it for some analysis in the future.
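As a rough sketch of the "fewer epochs for more cells" idea, the helper below scales the epoch count inversely with the number of cells. The function name and the constants (a cap of 400 epochs and a reference size of 20,000 cells) are assumptions chosen for illustration, not values from the tutorial or a recommendation for this model.

```python
def suggest_max_epochs(n_cells, epochs_cap=400, reference_cells=20_000):
    """Hypothetical rule of thumb: fewer epochs as the dataset grows.

    Datasets up to `reference_cells` get `epochs_cap` epochs; larger
    datasets get proportionally fewer, with a minimum of 1.
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    scaled = round(epochs_cap * reference_cells / n_cells)
    return max(1, min(epochs_cap, scaled))


for n in (12_000, 40_000, 120_000):
    print(n, suggest_max_epochs(n))
```

The result could then be passed as the max_epochs argument when training, with a smaller cap for quick exploratory runs and a larger one for the final run that you save.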
Regarding the UMAP: UMAP training is non-deterministic. In the tutorial a manual random seed is set for scvi, but it doesn’t seem like a manual random seed is set for the UMAP training. The resulting UMAP will therefore look different from run to run, but the general structure in the plot (number of clusters, overlap between labels, etc.) should be consistent between UMAP runs.
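If you want the UMAP step to use a fixed seed as well, one option is to pass it explicitly through scanpy. This is a sketch under assumptions: the helper name umap_with_seed is made up, and rep_key stands for whatever adata.obsm key you stored the model's latent representation under. Even with a fixed seed, results can still differ across machines or library versions.

```python
def umap_with_seed(adata, rep_key, seed=0):
    """Compute neighbors and UMAP with an explicit random seed.

    `rep_key` is the adata.obsm key holding the latent representation
    (an assumption here; use the key from your own analysis).
    """
    # Imported inside the function so this sketch can be defined
    # without scanpy installed.
    import scanpy as sc

    # Build the neighbor graph on the latent space, then embed it.
    sc.pp.neighbors(adata, use_rep=rep_key, random_state=seed)
    sc.tl.umap(adata, random_state=seed)
    return adata.obsm["X_umap"]
```

Calling it twice with the same seed on the same latent representation should give the same coordinates on the same setup, which makes plots easier to compare between sessions.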