SCANVI best practices

Hi,

I’ve been reviewing the latest changes in SCANVI. I noticed that the built-in SCVI pretraining step was removed and that the option to pretrain was moved to the from_scvi_model() method.

Could you elaborate on whether it is still best practice to pretrain with scvi, or whether just using scanvi on its own is better?

Thanks
Charlotte


It is still best practice to pretrain a SCVI model and then instantiate SCANVI with the from_scvi_model class method. We moved this around for API reasons.


Thanks for the quick response!


Does it make a difference if I already pass labels_key when setting up the data for the SCVI model, i.e.

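import scvi
# adata is an AnnData object loaded beforehand; labels are registered up front
scvi.data.setup_anndata(adata, batch_key="batch", labels_key="seed_labels")
scvi_model = scvi.model.SCVI(adata)
scvi_model.train()
# initialize SCANVI from the pretrained SCVI model; "Unknown" marks unlabeled cells
scanvi_model = scvi.model.SCANVI.from_scvi_model(scvi_model, "Unknown")
scanvi_model.train()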

or if I add the labels only when running SCANVI?

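import scvi
# adata is an AnnData object loaded beforehand; only the batch is registered for SCVI
scvi.data.setup_anndata(adata, batch_key="batch")
scvi_model = scvi.model.SCVI(adata)
scvi_model.train()
# register the labels afterwards and pass the re-registered adata to SCANVI
scvi.data.setup_anndata(adata, batch_key="batch", labels_key="seed_labels")
scanvi_model = scvi.model.SCANVI.from_scvi_model(scvi_model, "Unknown", adata=adata)
scanvi_model.train()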

The former method is used in the “seed labelling” tutorial, the latter in the “atlas-level integration” tutorial.

@grst Either way will work. setup_anndata just creates a dictionary in adata.uns["_scvi"], and SCVI does not use the labels. This will be more explicit in a future release.
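
To illustrate why the order does not matter, here is a toy sketch in plain Python. It is not scvi-tools code: toy_setup, toy_unsupervised_inputs, make_store and the "_registry" key are hypothetical names made up for this example. It only assumes what is described above, namely that the setup step records metadata and that the unsupervised model never reads the labels.

```python
# Toy illustration only: these helpers are hypothetical and are NOT scvi-tools code.

def toy_setup(store, batch_key, labels_key=None):
    """Record which columns hold batch and label info; nothing is trained here."""
    store["_registry"] = {"batch_key": batch_key, "labels_key": labels_key}
    return store


def toy_unsupervised_inputs(store):
    """Stand-in for an unsupervised model: it reads the batch column, never the labels."""
    registry = store["_registry"]
    return {"batch": store["obs"][registry["batch_key"]]}


def make_store():
    """Build a tiny stand-in for an annotated dataset with three cells."""
    return {
        "obs": {
            "batch": ["a", "a", "b"],
            "seed_labels": ["T cell", "Unknown", "B cell"],
        }
    }


# Register labels up front in one case, and leave them out in the other.
with_labels = toy_setup(make_store(), batch_key="batch", labels_key="seed_labels")
without_labels = toy_setup(make_store(), batch_key="batch")

# The unsupervised stand-in sees identical inputs in both cases.
same_inputs = toy_unsupervised_inputs(with_labels) == toy_unsupervised_inputs(without_labels)
print(same_inputs)
```

In this toy version the registered labels only matter once something downstream actually reads labels_key, which is the role SCANVI plays in the two snippets above.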
