Training on GPU substantially slower with 0.10.0 vs 0.9.1

I upgraded to 0.10.0 (via conda), but training is extremely slow compared to the fast performance I was getting with 0.9.1.

I am using:
scvi-tools 0.9.1 (downgraded after 0.10.0)
torch 1.8.1
cudatoolkit 10.2.89

In both cases the GPU was detected and in use.
Same data and same code:
model = scvi.model.SCVI(adata)
model.train()

Any ideas?
Thanks,
Ben

We didn’t change any of the training code in 0.10.0. I would try upgrading to 0.10.0 again, then uninstalling PyTorch and reinstalling it with conda, taking care to select the build that matches your CUDA version.
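
After reinstalling, it may help to confirm which PyTorch build actually ended up in the environment. Here is a minimal sketch that only uses standard PyTorch attributes; the helper name `describe_torch_install` is made up for illustration and is not part of scvi-tools:

```python
import importlib


def describe_torch_install():
    """Collect the PyTorch / CUDA details worth comparing between environments.

    Hypothetical helper for illustration only; not part of scvi-tools.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the binary was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are visible to PyTorch.
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


for key, value in describe_torch_install().items():
    print(f"{key}: {value}")
```

If `built_with_cuda` comes back as `None`, or differs from the cudatoolkit version you expect (10.2 in your case), conda most likely pulled in a different PyTorch build during the upgrade, and comparing this output between the 0.9.1 and 0.10.0 environments should show it.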

That’s what I thought, hmmm. Will give it another go.