Thank you for your excellent tutorial.
I read the article and the code. It seems you split the raw data into source data and target data. Why do you train a separate VAE on each dataset? Could I instead train the model on source_datasets first and then continue training it on target_datasets, similar to fine-tuning?
I also see that there are two encoders. Their front parts look similar (1003×128, 128×128). Are the weights of these two parts shared?