Generalized Multi-Task Learning from Substantially Unlabeled Multi-Source Medical Image Data

Ayaan Haque1, Abdullah-Al-Zubaer Imran2, Adam Wang2, Demetri Terzopoulos3,4
1: Saratoga High School, Saratoga, 2: Stanford University, Stanford, 3: University of California, Los Angeles, 4: VoxelCloud, Inc., Los Angeles
October 2021 issue
Publication date: 2021/10/27

Abstract

Deep learning-based models, when trained in a fully-supervised manner, can be effective in performing complex image analysis tasks, although contingent upon the availability of large labeled datasets. Especially in the medical imaging domain, however, expert image annotation is expensive, time-consuming, and prone to variability. Semi-supervised learning from limited quantities of labeled data has shown promise as an alternative. Maximizing knowledge gains from copious unlabeled data benefits semi-supervised learning models. Moreover, learning multiple tasks within the same model further improves its generalizability. We propose MultiMix, a new multi-task learning model that jointly learns disease classification and anatomical segmentation in a semi-supervised manner while preserving explainability through a novel saliency bridge between the two tasks. Our experiments with varying quantities of multi-source labeled data in the training sets confirm the effectiveness of MultiMix in the simultaneous classification of pneumonia and segmentation of the lungs in chest X-ray images. Moreover, both in-domain and cross-domain evaluations across these tasks further showcase the potential of our model to adapt to challenging generalization scenarios. Our code is available at https://github.com/ayaanzhaque/MultiMix

Keywords

Multi-Task Learning · Semi-Supervised Learning · Data Augmentation · Saliency Bridge · Classification · Segmentation · Chest X-Ray · Lungs · Pneumonia



1 Introduction

Learning-based medical image analysis has become widespread with the advent of deep learning. However, most deep learning models rely on large pools of labeled data. Especially in the medical imaging domain, obtaining copious labeled imagery is often infeasible, as annotation requires substantial domain expertise and manual labor. Therefore, developing large-scale deep learning methodologies for medical image analysis tasks is challenging. In confronting the limited labeled data problem, Semi-Supervised Learning (SSL) has been gaining attention. In semi-supervised learning, unlabeled training examples are leveraged in combination with labeled examples to maximize information gains (Chapelle et al., 2009). Specifically within the medical domain, where collecting data is generally easier than annotating those data, the use of deep learning for medical image analysis tasks can be fostered by leveraging semi-supervised learning.

Recent research has yielded a variety of semi-supervised learning techniques (Imran, 2020). Pseudo-labeling (Lee, 2013) trains a model with labeled data and unlabeled data simultaneously, generating labels for the unlabeled data by assuming the model-predicted labels to be reliable. Similarly, entropy minimization (Grandvalet and Bengio, 2005) trains so as to match the predicted data distribution of unlabeled data with that of the labeled data, under the assumption that unlabeled examples should yield prediction distributions that are similar to those from labeled examples (Ouali et al., 2020). Domain adaptation (Beijbom, 2012) is a form of inductive transfer learning, where a model is trained on labeled data from the source domain as well as labeled plus unlabeled data from the target domain, which improves model generalization for the target domain, but lacks clinical value if the target domain data is inaccessible during training.

Thus, progress has been made in learning from limited labeled data, although mainly within the confines of single-task learning. In particular, individual medical imaging tasks, such as diagnostic classification and anatomical segmentation, have been addressed using state-of-the-art Convolutional Neural Network (CNN) models (Anwar et al., 2018); e.g., for medical image segmentation, encoder-decoder networks (Ronneberger et al., 2015), variational auto-encoder networks (Myronenko, 2018), context encoder networks (Gu et al., 2019), multiscale adversarial learning (Imran and Terzopoulos, 2021a), etc.

By contrast, Multi-Task Learning (MTL) is defined as optimizing more than one loss in a single model such that multiple related tasks are performed by sharing the learned representation (Ruder, 2017). Jointly training multiple tasks within a model improves the generalizability of the model as each of the tasks regularizes the others (Caruana, 1993). Assuming that training data with limited annotations come from different distributions for different tasks, multi-task learning may be useful in such scenarios for learning in a scarcely-supervised manner (Imran et al., 2020; Imran and Terzopoulos, 2021b).

Combining the objectives of substantially unlabeled data training and multi-task learning, Semi-Supervised Multi-Task Learning (SSMTL) is a promising research area in the context of medical image analysis. While there have been prior efforts on multi-tasking (Mehta et al., 2018; Girard et al., 2019), rarely do they focus on incorporating semi-supervised learning particularly within the medical realm. Liu et al. (2008) proposed a general semi-supervised multi-tasking method that uses soft-parameter sharing to allow multiple classification tasks in a single model. Gao et al. (2019) performed multi-tasking on tasks within the same medical domain by exploiting feature transfer. Adversarial learning (Salimans et al., 2016) combines a classifier with a discriminator to perform semi-supervised, adversarial multi-tasking. Imran and Terzopoulos (2019) introduced semi-supervised multi-task learning using adversarial learning and attention masking. Zhou et al. (2019) proposed a semi-supervised multi-tasking model that uses an attention mechanism to grade segmented retinal images. None of the aforecited works, however, take into consideration the disparity in the training data distributions for multiple tasks.

To learn diagnostic classification and anatomical segmentation jointly from substantially unlabeled multi-source data, we propose MultiMix, a novel, better-generalized multi-tasking model that incorporates confidence-based augmentation and a module that bridges the classification and segmentation tasks. This saliency bridge module produces a saliency map by computing the gradient of the class score with respect to the input image, thus not only enabling the analysis of the model’s predictions, but also improving the model’s performance of both tasks. While the explainability of any deep learning model can be based on visualizing saliency maps (Simonyan et al., 2014; Zhang et al., 2016; Hu et al., 2019), to our knowledge a saliency bridge between two shared tasks within a single model has not previously been explored. We demonstrate that the saliency bridge module in conjunction with a simple yet effective semi-supervised learning method in a multi-tasking setting can yield improved and consistent performance across multiple domains.

This article is a revised and extended version of our ISBI publication (Haque et al., 2021), with an augmented literature review; a more detailed explanation of the methods, model architecture, and training algorithm; further details about the datasets; saliency map visualizations from multiple datasets; and additional results and discussion supported by quantitative (performance metrics tables) and qualitative (mask predictions, Bland-Altman plots, ROC curves, consistency plots) characteristics. Our main contributions may be summarized as follows:

  • A new semi-supervised learning model, MultiMix, that exploits confidence-based data augmentation and consistency regularization to jointly learn diagnostic classification and anatomical segmentation from multi-source, multi-domain medical image datasets.

  • Incorporation of an innovative saliency bridge module connecting the segmentation and classification branches of the model, resulting in the improved performance of both tasks.

  • Substantiation of the improved generalizability (both in-domain and cross-domain) of the proposed model via experimentation with varied quantities of labeled data and mixed data sources related to multiple tasks, specifically in the classification of pneumonia and the simultaneous segmentation of the lungs in chest X-ray images.

  • MultiMix software made available at https://github.com/ayaanzhaque/MultiMix.

2 The MultiMix Model

To formulate our approach, we assume unknown data distributions $p(X^c, C)$ over images $X^c$ and class labels $C$, as well as $p(X^s, S)$ over images $X^s$ and segmentation labels $S$. Hence, segmentation labels for the $X^c$ images and class labels for the $X^s$ images are unavailable. We also assume access to labeled training sets $\mathcal{D}^c_l$ sampled i.i.d. from $p(X^c, C)$ and $\mathcal{D}^s_l$ sampled i.i.d. from $p(X^s, S)$, along with unlabeled training sets $\mathcal{D}^c_u$ sampled i.i.d. from $p(X^c)$ and $\mathcal{D}^s_u$ sampled i.i.d. from $p(X^s)$, after marginalizing out $C$ and $S$, respectively.

Figure 1: Schematic of the MultiMix model. Classification: Using predictions on unlabeled weakly augmented images, pseudo-labels are generated with confidence, and loss is computed with these labels and the strongly augmented versions of those images. Segmentation: Saliency maps generated from the class predictions are concatenated via the saliency bridge module to guide the decoder in generating the segmentation masks.

In our MultiMix model (Figure 1), we utilize a U-Net-like (Ronneberger et al., 2015) encoder-decoder architecture for image deconstruction and reconstruction. The encoder functions similarly to a standard CNN. To perform multi-tasking, we use pooling layers followed by fully-connected layers, allowing the encoder to output class predictions through the classification branch of the model. Furthermore, in the segmentation branch of the model, the segmentation predictions are obtained as the output of the decoder.
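The following minimal PyTorch sketch illustrates this shared encoder-decoder layout with a classification head; the depth, channel widths, and module names are simplified placeholders rather than the exact MultiMix configuration.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Convolution -> InstanceNorm -> LeakyReLU, mirroring the blocks described in Section 3.2
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    )

class TinyMultiTaskUNet(nn.Module):
    """Illustrative shared encoder with a classification head and a decoder (not the full model)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.enc1 = conv_block(1, 16)
        self.enc2 = conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        # Classification branch: pooling followed by a fully-connected layer
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )
        # Segmentation branch: decoder with a skip connection from the first encoder block
        self.up = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.dec1 = conv_block(32, 16)
        self.seg_head = nn.Conv2d(16, 1, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                        # full-resolution features
        e2 = self.enc2(self.pool(e1))            # bottleneck features
        logits = self.cls_head(e2)               # class prediction
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        mask = torch.sigmoid(self.seg_head(d1))  # segmentation prediction
        return logits, mask
```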

MultiMix performs multi-tasking in a semi-supervised learning manner, assuming the training data for the two tasks come from disparate distributions. It is well established that a multi-tasking model usually outperforms its single-task counterparts (Imran and Terzopoulos, 2019; Imran et al., 2020; Imran, 2020). The shared encoder in the MultiMix model learns features useful for addressing both the classification and segmentation tasks. This joint representation learning enables the model to avoid overfitting and generalize better. Most importantly, it exploits the relatedness of the tasks, which is crucial for effective multi-tasking.

In the following sections, we describe the classification and segmentation branches of the MultiMix model, explain the saliency bridge module that bridges the two branches, and specify the MultiMix training procedure.

2.1 Classification Branch

For semi-supervised classification, we leverage data augmentation and pseudo-labeling. Inspired by the work of Sohn et al. (2020), for each unlabeled image we perform two degrees of augmentation: weak and strong. The former consists of standard augmentations (random horizontal flipping and random cropping) and is applied to the labeled data as well, whereas the latter is performed by randomly applying any number of augmentations from a pool of "heavy" augmentations. (This pool includes augmentations such as horizontal flip, crop, autocontrast, brightness, contrast, equalize, identity, posterize, rotate, sharpness, shearX, shearY, solarize, translateX, and translateY; autocontrast, brightness, contrast, and equalize are all severe image intensity modifications.) An unlabeled image $x^c_u$ is first weakly augmented, $x^c_w = \mathrm{WAug}(x^c_u)$, and a pseudo-label $c_p = \arg\max(\hat{c}_w) \geq t$ is synthesized from $x^c_w$ using the model prediction $\hat{c}_w$. The image-label pair is retained only if the confidence with which the model generates the pseudo-label exceeds the experimentally tuned threshold $t$, thus deterring learning from poor and incorrect labels. Second, $x^c_g = \mathrm{GAug}(x^c_u)$ are strongly augmented versions of $x^c_u$.
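The confidence-based pseudo-labeling step can be sketched as follows; here `classifier` stands for any callable mapping a batch of images to class logits, `weak_aug` and `strong_aug` stand in for WAug and GAug, and the threshold default mirrors the value reported in Section 3.2.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(classifier, x_unlabeled, weak_aug, strong_aug, threshold=0.7):
    """Unsupervised classification loss on one unlabeled batch (illustrative sketch)."""
    x_weak = weak_aug(x_unlabeled)      # WAug: flip + crop
    x_strong = strong_aug(x_unlabeled)  # GAug: heavy augmentations

    with torch.no_grad():
        probs_weak = torch.softmax(classifier(x_weak), dim=1)
        conf, pseudo = probs_weak.max(dim=1)   # confidence and arg-max pseudo-label
        keep = conf >= threshold               # retain only confident image-label pairs

    if keep.sum() == 0:                        # nothing confident enough in this batch
        return torch.zeros((), device=x_unlabeled.device)

    # Cross-entropy between the pseudo-labels and predictions on the strong views
    return F.cross_entropy(classifier(x_strong)[keep], pseudo[keep])
```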

Our training strategy promotes effective learning from large amounts of unlabeled data, which is challenging. At first, the predictions are less reliable, as the model learns mainly from the labeled data; as it gains confidence in the labels it generates for the unlabeled images, it becomes more proficient. Since the unlabeled examples are incrementally added to the training set, subject to the confidence threshold, the model learns to predict more accurately in a progressive manner, and its performance improves at an increasing rate. Furthermore, employing two degrees of augmentation enables the model to maximize its knowledge gain from the unlabeled data, both through the enhanced image diversity and through consistency learning: in theory, two augmented versions of the same image should yield the same prediction, which we encourage with an unsupervised loss. In other words, requiring the model to produce the same predictions on images subjected to two different degrees of augmentation results in better classification performance.

The classification objective

$L^c(c_l, \hat{c}_l, c_p, \hat{c}_g) = L_l(c_l, \hat{c}_l) + \lambda L_u(c_p, \hat{c}_g)$   (1)

includes a supervised loss component $L_l$ for the labeled data, which uses cross-entropy between the reference class label $c_l$ and the model prediction $\hat{c}_l$, as well as an unsupervised loss component $L_u$ for the unlabeled data, which uses cross-entropy between the pseudo-label $c_p$ and the model prediction $\hat{c}_g$.

Note that the model is trained to ignore GAug, as it is provided the pseudo-label $c_p$. Since the underlying data distribution is the same for both augmentations, the model is compelled to learn that consistency. Weak augmentations are used to produce reliable and usable pseudo-labels, whereas strong augmentations present a difficult challenge to the model. This difficulty forces the model to learn more effective representations in order to be accurate, and it also prevents overfitting by keeping the loss from being minimized too early. Under the assumption that the weakly augmented image provides the correct label for the strongly augmented image, the model is empowered to discern the augmentations in the image and, as a result, its performance improves through learning the underlying features crucial to the diagnosis. This helps achieve better generalization despite the differences in data distributions across domains: by teaching the model to learn only the most salient representations, which exist to some extent in all domains, it can generalize and be effective across domains.

2.2 Segmentation Branch

For segmentation, the predictions are made through the encoder-decoder architecture with skip connections. For the labeled samples $x^s_l$, we calculate the direct segmentation loss in the form of a Dice loss $L_l(s_l, \hat{s}_l)$ between the reference lung mask $s_l$ and the predicted segmentation $\hat{s}_l$. Since we do not have segmentation masks for the unlabeled examples $x^s_u$, we cannot directly calculate their segmentation loss. To ensure consistency, we compute the KL divergence $L_u(\hat{s}_l, \hat{s}_u)$ between the segmentation predictions for the labeled examples $\hat{s}_l$ and those for the unlabeled examples $\hat{s}_u$. This penalizes the model for making predictions that increasingly differ from those for the labeled data, which helps the model fit the unlabeled data. The total segmentation objective is therefore

$L^s(s_l, \hat{s}_l, \hat{s}_u) = \alpha L_l(s_l, \hat{s}_l) + \beta L_u(\hat{s}_l, \hat{s}_u)$,   (2)

where $\alpha$ and $\beta$ are weights.
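One possible realization of this objective is sketched below, assuming sigmoid mask predictions in $[0, 1]$ and, for the consistency term, labeled and unlabeled minibatches of equal size so that their per-pixel prediction distributions can be compared element-wise; the default weights follow the values reported in Section 3.2.

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between a predicted mask in [0, 1] and a binary reference mask."""
    inter = (pred * target).sum(dim=(1, 2, 3))
    union = pred.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

def segmentation_objective(pred_labeled, mask_labeled, pred_unlabeled,
                           alpha=5.0, beta=0.01, eps=1e-6):
    """Eq. (2) sketch: alpha * Dice(labeled) + beta * KL(labeled preds || unlabeled preds)."""
    sup = dice_loss(pred_labeled, mask_labeled)
    # One simple reading of the consistency term: KL divergence between the per-pixel
    # foreground/background distributions predicted for the two batches.
    p = torch.stack([pred_labeled, 1.0 - pred_labeled], dim=-1).clamp_min(eps)
    q = torch.stack([pred_unlabeled, 1.0 - pred_unlabeled], dim=-1).clamp_min(eps)
    kl = (p * (p.log() - q.log())).sum(dim=-1).mean()
    return alpha * sup + beta * kl
```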

2.3 Saliency Bridge Module

We incorporate a saliency bridge module between the classification and segmentation branches of the MultiMix model, as indicated in Figure 1. To learn which image regions are most relevant to classification, saliency maps

$y_l = \mathrm{Saliency}(\hat{c}^s_l) \quad \text{and} \quad y_u = \mathrm{Saliency}(\hat{c}^s_u)$,   (3)

where $\hat{c}^s_l$ and $\hat{c}^s_u$ denote the class predictions for the input images $x^s_l$ and $x^s_u$, respectively, are generated from the classification branch by computing the gradient of the predicted class with respect to the input image. (These saliency maps should not be confused with simultaneous segmentation and saliency detection or prediction, where a semantic segmentation model is trained to produce saliency maps to accompany the output segmentation masks; e.g., (Zeng et al., 2019). Our saliency bridge module is novel in that it performs a saliency analysis of MultiMix's classification branch and leverages it to improve the performance of its semantic segmentation branch.) It cannot be directly known whether the image samples in $\mathcal{D}^s$ represent normal or diseased cases; thus, $x^s_l$ and $x^s_u$ are considered unlabeled for the classification task. Therefore, the saliency maps generated via the class prediction are not true segmentation maps, but they nonetheless highlight the lungs or lung regions relevant to the particular disease class (see Appendix C).

The outputs of the saliency bridge module,

$b_l = y_l \oplus x^s_l \quad \text{and} \quad b_u = y_u \oplus x^s_u$,   (4)

obtained by concatenating the saliency maps with the associated input images, are further downsampled before they are concatenated with the encoder-decoder bottleneck in the segmentation branch. This results in a tighter connection between the classification and segmentation tasks and improves the effectiveness of the bridge module, which retains important information from the encoder that may otherwise be lost because of the repeated convolutions. The saliency maps serve to guide the segmentation during the decoding phase, yielding improved segmentation while learning from limited labeled data. With improving classification performance, the saliency maps become more accurate, thus yielding improved segmentations, since the shared parameters responsible for improved classification produce a feedback loop that allows both tasks to improve jointly.
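A sketch of the bridge computation is given below, assuming a model whose forward pass returns a (class logits, mask) pair as in the earlier architecture sketch; the absolute-value normalization of the input gradient is one common choice for forming the saliency map, not necessarily the exact MultiMix formulation.

```python
import torch
import torch.nn.functional as F

def saliency_map(model, x):
    """Gradient of the predicted class score w.r.t. the input image (cf. Eq. 3), illustrative."""
    x = x.clone().detach().requires_grad_(True)
    logits, _ = model(x)                                        # classification branch output
    score = logits.gather(1, logits.argmax(dim=1, keepdim=True)).sum()
    grad, = torch.autograd.grad(score, x)
    sal = grad.abs()
    # Normalize each map to [0, 1] for stable concatenation
    return sal / (sal.amax(dim=(1, 2, 3), keepdim=True) + 1e-8)

def bridge_features(model, x, bottleneck_hw):
    """Concatenate the saliency map with the input (cf. Eq. 4) and downsample to the bottleneck size."""
    sal = saliency_map(model, x)
    b = torch.cat([sal, x], dim=1)                              # b = y ⊕ x
    return F.interpolate(b, size=bottleneck_hw, mode='bilinear', align_corners=False)
```

In the full model, the downsampled tensor returned by `bridge_features` would be concatenated with the encoder-decoder bottleneck features before decoding.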

Conventionally, saliency maps are used to analyze which features and areas of the image are most relevant for classification, thereby enhancing understanding of the model’s learning process. Similarly, our saliency module is explainable, as it is a relevant connection between the classification and segmentation tasks (although model explainability in and of itself is not the main focus of our work). Since the saliency maps are comparable to segmentation masks, it is sensible to employ them to guide the decoder in the task of segmentation. Multi-tasking requires the tasks to be somewhat related, so our task-relevant bridge fosters a tighter bond between classification and segmentation.

2.4 MultiMix Training Procedure

Algorithm 1 MultiMix Mini-Batch Training

Require:
  Training set of labeled classification data $\mathcal{D}^c_l$
  Training set of labeled segmentation data $\mathcal{D}^s_l$
  Training set of unlabeled classification data $\mathcal{D}^c_u$
  Training set of unlabeled segmentation data $\mathcal{D}^s_u$
  Network architecture $\mathcal{F}_\theta$ with learnable parameters $\theta$
  Minibatch size $m$

repeat
  Create labeled classification minibatch: $\{{}^{1}x^c_l, \dots, {}^{m}x^c_l\} \sim \mathcal{D}^c_l$
  Create labeled segmentation minibatch: $\{{}^{1}x^s_l, \dots, {}^{m}x^s_l\} \sim \mathcal{D}^s_l$
  Create unlabeled classification minibatch: $\{{}^{1}x^c_u, \dots, {}^{m}x^c_u\} \sim \mathcal{D}^c_u$
  Create unlabeled segmentation minibatch: $\{{}^{1}x^s_u, \dots, {}^{m}x^s_u\} \sim \mathcal{D}^s_u$
  Compute predictions for the labeled data: ${}^{i}\hat{c}_l \leftarrow \mathcal{F}_\theta({}^{i}x^c_l)$; ${}^{i}\hat{s}_l \leftarrow \mathcal{F}_\theta({}^{i}x^s_l)$
  Generate weakly-augmented samples: ${}^{i}x^c_w \leftarrow \mathrm{WAug}({}^{i}x^c_u)$
  Generate strongly-augmented samples: ${}^{i}x^c_g \leftarrow \mathrm{GAug}({}^{i}x^c_u)$
  Compute predictions for the unlabeled data: ${}^{i}\hat{c}_w \leftarrow \mathcal{F}_\theta({}^{i}x^c_w)$; ${}^{i}\hat{c}_g \leftarrow \mathcal{F}_\theta({}^{i}x^c_g)$; ${}^{i}\hat{s}_u \leftarrow \mathcal{F}_\theta({}^{i}x^s_u)$
  Compute pseudo-labels: ${}^{i}c_p \leftarrow \arg\max({}^{i}\hat{c}_w) \geq t$
  Update $\mathcal{F}_\theta$ by backpropagating the loss gradient $\nabla_\theta L$
until convergence

Algorithm 1 presents the main steps of the MultiMix training procedure applied to labeled and unlabeled classification and segmentation training data. The model is trained simultaneously on the classification objective (1) and segmentation objective (2) using the following total loss for a minibatch size of m𝑚mitalic_m:

$L = \frac{1}{m}\sum_{i=1}^{m}\left(L^c\!\left({}^{i}c_l, {}^{i}\hat{c}_l, {}^{i}c_p, {}^{i}\hat{c}_g\right) + L^s\!\left({}^{i}s_l, {}^{i}\hat{s}_l, {}^{i}\hat{s}_u\right)\right)$   (5)
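Putting the pieces together, one mini-batch update of Algorithm 1 under the total loss (5) could look roughly as follows, reusing the `pseudo_label_loss` and `segmentation_objective` sketches from Sections 2.1 and 2.2 and the default hyper-parameter values of Section 3.2; the saliency bridge is assumed to be folded into the model's forward pass and is omitted here for brevity.

```python
import torch.nn.functional as F

def multimix_step(model, optimizer, batch, weak_aug, strong_aug,
                  lam=0.25, alpha=5.0, beta=0.01, threshold=0.7):
    """One illustrative MultiMix training step (cf. Eq. 5); model returns (logits, mask)."""
    (x_cls, c), (x_seg, s), x_cls_u, x_seg_u = batch   # labeled and unlabeled minibatches

    logits_l, _ = model(x_cls)        # classification on labeled classification data
    _, mask_l = model(x_seg)          # segmentation on labeled segmentation data
    _, mask_u = model(x_seg_u)        # segmentation on unlabeled segmentation data

    classify = lambda x: model(x)[0]  # classification-branch view of the shared model
    loss_c = F.cross_entropy(logits_l, c) \
           + lam * pseudo_label_loss(classify, x_cls_u, weak_aug, strong_aug, threshold)
    loss_s = segmentation_objective(mask_l, s, mask_u, alpha=alpha, beta=beta)

    loss = loss_c + loss_s            # total minibatch loss of Eq. (5)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```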

3 Experimental Evaluation

3.1 Data

Models were trained and tested on the combined classification and segmentation tasks using chest X-ray images from two different sources: pneumonia detection (CheX) (Kermany et al., 2018) and the Japanese Society of Radiological Technology (JSRT) (Shiraishi et al., 2000). We further validated the models using the Montgomery County chest X-rays (MCU) (Jaeger et al., 2014) and a subset of the NIH chest X-ray dataset (NIHX) (Wang et al., 2017) (Figure 2(a)). Table 1 presents details about the datasets used in our experiments. In addition to the diversity in source, image quality, size, and proportion of normal and abnormal images, the disparity in the intensity distributions of the four datasets is also evident (Figure 2(b)). All images were normalized and resized to $256 \times 256 \times 1$ before being passed to the models.
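For reference, such a preprocessing step could be written with torchvision transforms; the normalization statistics below are placeholders rather than the values used in our experiments.

```python
from torchvision import transforms as T

# Illustrative preprocessing: each X-ray is converted to a single-channel tensor,
# resized to 256x256, and intensity-normalized (placeholder statistics).
preprocess = T.Compose([
    T.Grayscale(num_output_channels=1),
    T.Resize((256, 256)),
    T.ToTensor(),                     # scales pixel values to [0, 1]
    T.Normalize(mean=[0.5], std=[0.5]),
])
```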

Figure 2: (a) Sample (normal, abnormal) images from the CheX and NIHX datasets. (b) Intensity distributions of the four chest X-ray image datasets.
Table 1: Details of the datasets used for training and testing.
Mode         | Dataset | Total | Normal | Abnormal | Train | Val | Test
in-domain    | JSRT    | 247   | -      | -        | 111   | 13  | 123
in-domain    | CheX    | 5,856 | 1583   | 4273     | 5216  | 16  | 624
cross-domain | MCU     | 138   | -      | -        | 93    | 10  | 35
cross-domain | NIHX    | 4185  | 2754   | 1431     | -     | -   | 4185

3.2 Implementation Details

Baselines:

We used the U-Net and encoder-only (Enc) networks separately for the single-task baseline models in both the fully supervised and semi-supervised schemes. Using the same backbone network, we also trained a multi-tasking U-Net with the classification branch (UMTL). All these models incorporate an INorm, LReLU, and dropout at every convolutional block (see Appendix A). Moreover, we performed ablation experiments to assess the impact of each key piece of our MultiMix model: single-task Enc-SSL (encoder with confidence-based augmentation SSL), single-task Enc-MM (an implementation of MixMatch (Berthelot et al., 2019)), UMTL-S (UMTL with saliency bridge), UMTL-SSL (UMTL with SSL classification), and UMTL-SSL-S (UMTL with saliency bridge and confidence-based augmentation).

Augmentations:

We performed random horizontal flip and $32 \times 32$ crop in WAug for the examples in $\mathcal{D}^c_u$. On the other hand, GAug was applied through a random combination from the pool of augmentations: random horizontal flip, crop ($32 \times 32$), autocontrast, brightness, contrast, equalize, identity, posterize, rotate ($30^\circ$), sharpness, shearX, shearY, solarize, translateX (30%), and translateY (30%).
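An illustrative WAug/GAug pair built from torchvision transforms is sketched below; the magnitudes are rough stand-ins for the settings above, and inputs are assumed to be PIL images or uint8 tensors.

```python
import random
from torchvision import transforms as T

# Weak augmentation: horizontal flip plus crop jitter at the original 256x256 size.
WAUG = T.Compose([
    T.RandomHorizontalFlip(),
    T.RandomCrop(256, padding=32, padding_mode='reflect'),
])

# Strong-augmentation (GAug) pool; parameters are illustrative approximations.
GAUG_POOL = [
    T.RandomHorizontalFlip(p=1.0),
    T.RandomCrop(256, padding=32, padding_mode='reflect'),
    T.RandomAutocontrast(p=1.0),
    T.ColorJitter(brightness=0.5),                     # brightness
    T.ColorJitter(contrast=0.5),                       # contrast
    T.RandomEqualize(p=1.0),
    T.Lambda(lambda img: img),                         # identity
    T.RandomPosterize(bits=4, p=1.0),
    T.RandomRotation(30),
    T.RandomAdjustSharpness(sharpness_factor=2, p=1.0),
    T.RandomAffine(degrees=0, shear=(-15, 15, 0, 0)),  # shearX
    T.RandomAffine(degrees=0, shear=(0, 0, -15, 15)),  # shearY
    T.RandomSolarize(threshold=128, p=1.0),
    T.RandomAffine(degrees=0, translate=(0.3, 0)),     # translateX (30%)
    T.RandomAffine(degrees=0, translate=(0, 0.3)),     # translateY (30%)
]

def gaug(img, max_ops=4):
    """GAug sketch: apply a random number of randomly chosen transforms from the pool."""
    for op in random.sample(GAUG_POOL, random.randint(1, max_ops)):
        img = op(img)
    return img
```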

Training:

All the models (single-task or multi-task) were trained on varying $|\mathcal{D}^s_l|$ (10, 50, full) and $|\mathcal{D}^c_l|$ (100, 1000, full). Each experiment was repeated 5 times and the average performance is reported. We implemented the models in Python with the PyTorch framework and trained them on an Nvidia K80 GPU.

Hyper-parameters:

We used the Adam optimizer with an initial learning rate of 0.0001, decayed by a factor of 0.1 every 8 epochs. A negative slope of 0.2 was used for the Leaky ReLU, and the dropout rate was set to 0.25. We set $t = 0.7$, $\lambda = 0.25$, $\alpha = 5.0$ (for smaller $|\mathcal{D}^s_l|$), and $\beta = 0.01$. Each model was trained with a mini-batch size of $m = 10$. All model-specific hyperparameters were experimentally tuned; we found that performance varied only minimally with the different choices.
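Reading the schedule as a step decay of 0.1 every 8 epochs, the optimizer setup would look as follows; `model`, `train_loader`, and the augmentation callables are assumed to come from the earlier sketches, and the epoch count is illustrative.

```python
import torch

# Adam with initial learning rate 1e-4, decayed by a factor of 0.1 every 8 epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=8, gamma=0.1)

num_epochs = 50  # illustrative only, not the value used in the paper
for epoch in range(num_epochs):
    for batch in train_loader:           # yields the labeled/unlabeled minibatch tuple
        multimix_step(model, optimizer, batch, WAUG, gaug)
    scheduler.step()
```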

Evaluation:

For classification, along with the overall accuracy (Acc), we recorded the class-wise F1 scores (F1-N for normal and F1-P for pneumonia). To evaluate segmentation performance, we used the Dice similarity (DS), Jaccard similarity (JS), structural similarity measure (SSIM), average Hausdorff distance (HD), precision (P), and recall (R) scores.
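As an example, the two overlap metrics can be computed from binarized masks as follows (a per-image sketch; the distance-based and structural metrics would typically come from an image-processing library).

```python
import torch

def dice_jaccard(pred_mask, ref_mask, eps=1e-6):
    """Dice (DS) and Jaccard (JS) similarity between binarized masks (illustrative)."""
    pred = (pred_mask > 0.5).float()
    ref = (ref_mask > 0.5).float()
    inter = (pred * ref).sum()
    union = pred.sum() + ref.sum()
    ds = (2 * inter + eps) / (union + eps)           # Dice = 2|A∩B| / (|A|+|B|)
    js = (inter + eps) / (union - inter + eps)       # Jaccard = |A∩B| / |A∪B|
    return ds.item(), js.item()
```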

Figure 3: Distributions of the Dice scores demonstrate the superiority of the MultiMix model over the baseline models in segmenting lungs from the chest X-ray images in both domains: (a) in-domain; (b) cross-domain.
Table 2: Classification and segmentation performance figures with varying label proportions in in-domain evaluations: CheX (classification) and JSRT (segmentation) datasets. The best scores from fully-supervised models are underlined and the best scores from semi-supervised models are bolded. Scores are given as mean ± std.
Columns: Model | $|\mathcal{D}^c_l|$ | Classification (Acc, F1-Nor, F1-Abn) | $|\mathcal{D}^s_l|$ | Segmentation (DS, JS, SSIM, HD, P, R)
U-Net100.634±0.017plus-or-minus0.6340.0170.634\pm 0.0170.634 ± 0.0170.695±0.024plus-or-minus0.6950.0240.695\pm 0.0240.695 ± 0.0240.810±0.019plus-or-minus0.8100.0190.810\pm 0.0190.810 ± 0.0192.899±0.272plus-or-minus2.8990.2722.899\pm 0.2722.899 ± 0.2720.779±0.021plus-or-minus0.7790.0210.779\pm 0.0210.779 ± 0.0210.865±0.023plus-or-minus0.8650.0230.865\pm 0.0230.865 ± 0.023
500.855±0.004plus-or-minus0.8550.0040.855\pm 0.0040.855 ± 0.0040.854±0.008plus-or-minus0.8540.0080.854\pm 0.0080.854 ± 0.0080.904±0.003plus-or-minus0.9040.0030.904\pm 0.0030.904 ± 0.0030.341±0.071plus-or-minus0.3410.0710.341\pm 0.0710.341 ± 0.0710.918±0.009plus-or-minus0.9180.0090.918\pm 0.0090.918 ± 0.0090.925±0.009plus-or-minus0.9250.0090.925\pm 0.0090.925 ± 0.009
Full0.915±0.001plus-or-minus0.9150.0010.915\pm 0.0010.915 ± 0.0010.906±0.002plus-or-minus0.9060.0020.906\pm 0.0020.906 ± 0.0020.929±0.002plus-or-minus0.9290.0020.929\pm 0.0020.929 ± 0.0020.104±0.025plus-or-minus0.1040.0250.104\pm 0.0250.104 ± 0.0250.949±0.007plus-or-minus0.9490.0070.949\pm 0.0070.949 ± 0.0070.953±0.005plus-or-minus0.9530.0050.953\pm 0.0050.953 ± 0.005
Enc1000.732±0.044plus-or-minus0.7320.0440.732\pm 0.0440.732 ± 0.0440.424±0.122plus-or-minus0.4240.1220.424\pm 0.1220.424 ± 0.1220.806±0.026plus-or-minus0.8060.0260.806\pm 0.0260.806 ± 0.026
10000.773±0.037plus-or-minus0.7730.0370.773\pm 0.0370.773 ± 0.0370.546±0.020plus-or-minus0.5460.0200.546\pm 0.0200.546 ± 0.0200.842±0.018plus-or-minus0.8420.0180.842\pm 0.0180.842 ± 0.018
Full0.737±0.021plus-or-minus0.7370.0210.737\pm 0.0210.737 ± 0.0210.534±0.058plus-or-minus0.5340.0580.534\pm 0.0580.534 ± 0.0580.838±0.012plus-or-minus0.8380.0120.838\pm 0.0120.838 ± 0.012
Enc-MM1000.738±0.043plus-or-minus0.7380.0430.738\pm 0.0430.738 ± 0.0430.452±0.103plus-or-minus0.4520.1030.452\pm 0.1030.452 ± 0.1030.800±0.024plus-or-minus0.8000.0240.800\pm 0.0240.800 ± 0.024
10000.745±0.036plus-or-minus0.7450.0360.745\pm 0.0360.745 ± 0.0360.560±0.078plus-or-minus0.5600.0780.560\pm 0.0780.560 ± 0.0780.584±0.101plus-or-minus0.5840.1010.584\pm 0.1010.584 ± 0.101
Full0.751±0.036plus-or-minus0.7510.0360.751\pm 0.0360.751 ± 0.0360.605±0.065plus-or-minus0.6050.0650.605\pm 0.0650.605 ± 0.0650.846±0.025plus-or-minus0.8460.0250.846\pm 0.0250.846 ± 0.025
Enc-SSL1000.780±0.035plus-or-minus0.7800.0350.780\pm 0.0350.780 ± 0.0350.570±0.083plus-or-minus0.5700.0830.570\pm 0.0830.570 ± 0.0830.844±0.021plus-or-minus0.8440.0210.844\pm 0.0210.844 ± 0.021
10000.822±0.027plus-or-minus0.8220.0270.822\pm 0.0270.822 ± 0.0270.692±0.058plus-or-minus0.6920.0580.692\pm 0.0580.692 ± 0.0580.876±0.015plus-or-minus0.8760.0150.876\pm 0.0150.876 ± 0.015
Full0.817±0.016plus-or-minus0.8170.0160.817\pm 0.0160.817 ± 0.0160.680±0.042plus-or-minus0.6800.0420.680\pm 0.0420.680 ± 0.0420.872±0.013plus-or-minus0.8720.0130.872\pm 0.0130.872 ± 0.013
UMTL1000.707±0.024plus-or-minus0.7070.0240.707\pm 0.0240.707 ± 0.0240.443±0.071plus-or-minus0.4430.0710.443\pm 0.0710.443 ± 0.0710.797±0.016plus-or-minus0.7970.0160.797\pm 0.0160.797 ± 0.016100.626±0.008plus-or-minus0.6260.0080.626\pm 0.0080.626 ± 0.0080.871±0.014plus-or-minus0.8710.0140.871\pm 0.0140.871 ± 0.0140.908±0.010plus-or-minus0.9080.0100.908\pm 0.0100.908 ± 0.0104.323±1.268plus-or-minus4.3231.2684.323\pm 1.2684.323 ± 1.2680.900±0.014plus-or-minus0.9000.0140.900\pm 0.0140.900 ± 0.0140.964±0.007plus-or-minus0.9640.0070.964\pm 0.0070.964 ± 0.007
1000.655±0.052plus-or-minus0.6550.0520.655\pm 0.0520.655 ± 0.0520.683±0.129plus-or-minus0.6830.1290.683\pm 0.1290.683 ± 0.1290.853±0.030plus-or-minus0.8530.0300.853\pm 0.0300.853 ± 0.030500.647±0.022plus-or-minus0.6470.0220.647\pm 0.0220.647 ± 0.0220.854±0.040plus-or-minus0.8540.0400.854\pm 0.0400.854 ± 0.0400.881±0.025plus-or-minus0.8810.0250.881\pm 0.0250.881 ± 0.0254.733±1.238plus-or-minus4.7331.2384.733\pm 1.2384.733 ± 1.2380.864±0.043plus-or-minus0.8640.0430.864\pm 0.0430.864 ± 0.0430.989±0.004plus-or-minus0.9890.0040.989\pm 0.0040.989 ± 0.004
1000.706±0.046plus-or-minus0.7060.0460.706\pm 0.0460.706 ± 0.0460.416±0.138plus-or-minus0.4160.1380.416\pm 0.1380.416 ± 0.1380.804±0.028plus-or-minus0.8040.0280.804\pm 0.0280.804 ± 0.028Full0.696±0.014plus-or-minus0.6960.0140.696\pm 0.0140.696 ± 0.0140.872±0.0252plus-or-minus0.8720.02520.872\pm 0.02520.872 ± 0.02520.911±0.016plus-or-minus0.9110.0160.911\pm 0.0160.911 ± 0.0163.908±0.795plus-or-minus3.9080.7953.908\pm 0.7953.908 ± 0.7950.892±0.025plus-or-minus0.8920.0250.892\pm 0.0250.892 ± 0.0250.986±0.004plus-or-minus0.9860.0040.986\pm 0.0040.986 ± 0.004
10000.750±0.010plus-or-minus0.7500.0100.750\pm 0.0100.750 ± 0.0100.490±0.020plus-or-minus0.4900.0200.490\pm 0.0200.490 ± 0.0200.825±0.005plus-or-minus0.8250.0050.825\pm 0.0050.825 ± 0.005100.761±0.004plus-or-minus0.7610.0040.761\pm 0.0040.761 ± 0.0040.904±0.009plus-or-minus0.9040.0090.904\pm 0.0090.904 ± 0.0090.926±0.002plus-or-minus0.9260.0020.926\pm 0.0020.926 ± 0.0023.050±0.531plus-or-minus3.0500.5313.050\pm 0.5313.050 ± 0.5310.924±0.001plus-or-minus0.9240.0010.924\pm 0.0010.924 ± 0.0010.977±0.009plus-or-minus0.9770.0090.977\pm 0.0090.977 ± 0.009
10000.749±0.024plus-or-minus0.7490.0240.749\pm 0.0240.749 ± 0.0240.510±0.064plus-or-minus0.5100.0640.510\pm 0.0640.510 ± 0.0640.833±0.009plus-or-minus0.8330.0090.833\pm 0.0090.833 ± 0.009500.768±0.001plus-or-minus0.7680.0010.768\pm 0.0010.768 ± 0.0010.927±0.003plus-or-minus0.9270.0030.927\pm 0.0030.927 ± 0.0030.938±0.002plus-or-minus0.9380.0020.938\pm 0.0020.938 ± 0.0022.606±0.205plus-or-minus2.6060.2052.606\pm 0.2052.606 ± 0.2050.940±0.004plus-or-minus0.9400.0040.940\pm 0.0040.940 ± 0.0040.985±0.001plus-or-minus0.9850.0010.985\pm 0.0010.985 ± 0.001
10000.747±0.045plus-or-minus0.7470.0450.747\pm 0.0450.747 ± 0.0450.530±0.140plus-or-minus0.5300.1400.530\pm 0.1400.530 ± 0.1400.840±0.024plus-or-minus0.8400.0240.840\pm 0.0240.840 ± 0.024Full0.759±0.005plus-or-minus0.7590.0050.759\pm 0.0050.759 ± 0.0050.928±0.010plus-or-minus0.9280.0100.928\pm 0.0100.928 ± 0.0100.930±0.008plus-or-minus0.9300.0080.930\pm 0.0080.930 ± 0.0082.955±0.483plus-or-minus2.9550.4832.955\pm 0.4832.955 ± 0.4830.924±0.015plus-or-minus0.9240.0150.924\pm 0.0150.924 ± 0.0150.981±0.006plus-or-minus0.9810.0060.981\pm 0.0060.981 ± 0.006
Full0.744±0.011plus-or-minus0.7440.0110.744\pm 0.0110.744 ± 0.0110.515±0.022plus-or-minus0.5150.0220.515\pm 0.0220.515 ± 0.0220.828±0.016plus-or-minus0.8280.0160.828\pm 0.0160.828 ± 0.016100.909±0.021plus-or-minus0.9090.0210.909\pm 0.0210.909 ± 0.0210.919±0.037plus-or-minus0.9190.0370.919\pm 0.0370.919 ± 0.0370.521±0.028plus-or-minus0.5210.0280.521\pm 0.0280.521 ± 0.0280.903±0.296plus-or-minus0.9030.2960.903\pm 0.2960.903 ± 0.2960.912±0.050plus-or-minus0.9120.0500.912\pm 0.0500.912 ± 0.0500.962±0.013plus-or-minus0.9620.0130.962\pm 0.0130.962 ± 0.013
Full0.738±0.004plus-or-minus0.7380.0040.738\pm 0.0040.738 ± 0.0040.438±0.013plus-or-minus0.4380.0130.438\pm 0.0130.438 ± 0.0130.820±0.000plus-or-minus0.8200.0000.820\pm 0.0000.820 ± 0.000500.930±0.000plus-or-minus0.9300.0000.930\pm 0.0000.930 ± 0.0000.948±0.000plus-or-minus0.9480.0000.948\pm 0.0000.948 ± 0.0000.954±0.000plus-or-minus0.9540.0000.954\pm 0.0000.954 ± 0.0000.444±0.142plus-or-minus0.4440.1420.444\pm 0.1420.444 ± 0.1420.969±0.001plus-or-minus0.9690.0010.969\pm 0.0010.969 ± 0.0010.977±0.001plus-or-minus0.9770.0010.977\pm 0.0010.977 ± 0.001
Full0.731±0.018plus-or-minus0.7310.0180.731\pm 0.0180.731 ± 0.0180.447±0.067plus-or-minus0.4470.0670.447\pm 0.0670.447 ± 0.0670.822±0.009plus-or-minus0.8220.0090.822\pm 0.0090.822 ± 0.009Full0.932±0.000plus-or-minus0.9320.0000.932\pm 0.0000.932 ± 0.0000.951±0.000plus-or-minus0.9510.0000.951\pm 0.0000.951 ± 0.0000.957±0.000plus-or-minus0.9570.0000.957\pm 0.0000.957 ± 0.0000.372¯±0.052plus-or-minus¯0.3720.052\underline{0.372}\pm 0.052under¯ start_ARG 0.372 end_ARG ± 0.0520.965±0.001plus-or-minus0.9650.0010.965\pm 0.0010.965 ± 0.0010.977±0.001plus-or-minus0.9770.0010.977\pm 0.0010.977 ± 0.001
UMTL-S1000.704±0.052plus-or-minus0.7040.0520.704\pm 0.0520.704 ± 0.0520.358±0.223plus-or-minus0.3580.2230.358\pm 0.2230.358 ± 0.2230.806±0.024plus-or-minus0.8060.0240.806\pm 0.0240.806 ± 0.024100.922±0.007plus-or-minus0.9220.0070.922\pm 0.0070.922 ± 0.0070.848±0.013plus-or-minus0.8480.0130.848\pm 0.0130.848 ± 0.0130.891±0.009plus-or-minus0.8910.0090.891\pm 0.0090.891 ± 0.0094.005±0.413plus-or-minus4.0050.4134.005\pm 0.4134.005 ± 0.4130.871±0.017plus-or-minus0.8710.0170.871\pm 0.0170.871 ± 0.0170.966±0.005plus-or-minus0.9660.0050.966\pm 0.0050.966 ± 0.005
1000.701±0.033plus-or-minus0.7010.0330.701\pm 0.0330.701 ± 0.0330.336±0.130plus-or-minus0.3360.1300.336\pm 0.1300.336 ± 0.1300.796±0.019plus-or-minus0.7960.0190.796\pm 0.0190.796 ± 0.019500.926±0.002plus-or-minus0.9260.0020.926\pm 0.0020.926 ± 0.0020.867±0.003plus-or-minus0.8670.0030.867\pm 0.0030.867 ± 0.0030.894±0.003plus-or-minus0.8940.0030.894\pm 0.0030.894 ± 0.0034.393±0.217plus-or-minus4.3930.2174.393\pm 0.2174.393 ± 0.2170.873±0.005plus-or-minus0.8730.0050.873\pm 0.0050.873 ± 0.0050.891±0.002plus-or-minus0.8910.0020.891\pm 0.0020.891 ± 0.002
  | 100 | 0.713±0.041 | 0.442±0.164 | 0.794±0.025 | Full | 0.931±0.003 | 0.890±0.006 | 0.920±0.003 | 3.983±0.375 | 0.906±0.008 | 0.980±0.004
  | 1000 | 0.740±0.020 | 0.482±0.052 | 0.828±0.012 | 10 | 0.948±0.001 | 0.908±0.003 | 0.924±0.003 | 2.546±0.217 | 0.931±0.005 | 0.972±0.002
  | 1000 | 0.771±0.041 | 0.566±0.010 | 0.844±0.024 | 50 | 0.965±0.001 | 0.931±0.003 | 0.941±0.001 | 2.083±0.217 | 0.949±0.005 | 0.981±0.002
  | 1000 | 0.742±0.019 | 0.497±0.059 | 0.830±0.014 | Full | 0.962±0.005 | 0.925±0.010 | 0.935±0.008 | 1.758±0.132 | 0.958±0.015 | 0.985±0.005
  | Full | 0.747±0.006 | 0.500±0.021 | 0.830±0.006 | 10 | 0.955±0.020 | 0.914±0.035 | 0.936±0.027 | 0.568±0.136 | 0.954±0.037 | 0.956±0.005
  | Full | 0.737±0.016 | 0.433±0.054 | 0.820±0.008 | 50 | 0.972±0.006 | 0.944±0.011 | 0.953±0.009 | 0.560±0.427 | 0.966±0.014 | 0.977±0.004
  | Full | 0.723±0.005 | 0.413±0.019 | 0.817±0.005 | Full | 0.974±0.000 | 0.953±0.000 | 0.957±0.000 | 0.539±0.437 | 0.967±0.002 | 0.981±0.001
UMTL-SSL | 100 | 0.790±0.043 | 0.618±0.105 | 0.856±0.024 | 10 | 0.906±0.002 | 0.925±0.004 | 0.940±0.002 | 0.626±0.280 | 0.954±0.006 | 0.953±0.003
  | 100 | 0.818±0.039 | 0.688±0.087 | 0.872±0.024 | 50 | 0.919±0.001 | 0.946±0.001 | 0.952±0.001 | 0.561±0.115 | 0.962±0.003 | 0.963±0.002
  | 100 | 0.852±0.039 | 0.670±0.095 | 0.868±0.022 | Full | 0.937±0.001 | 0.954±0.001 | 0.958±0.001 | 0.613±0.386 | 0.969±0.004 | 0.981±0.002
  | 1000 | 0.794±0.020 | 0.630±0.046 | 0.860±0.012 | 10 | 0.893±0.000 | 0.926±0.001 | 0.941±0.001 | 0.524±0.107 | 0.961±0.002 | 0.962±0.001
  | 1000 | 0.822±0.038 | 0.693±0.096 | 0.877±0.026 | 50 | 0.903±0.000 | 0.945±0.000 | 0.952±0.001 | 0.712±0.167 | 0.963±0.002 | 0.980±0.002
  | 1000 | 0.818±0.005 | 0.707±0.019 | 0.867±0.005 | Full | 0.899±0.001 | 0.953±0.001 | 0.958±0.001 | 0.724±0.400 | 0.968±0.005 | 0.982±0.003
  | Full | 0.812±0.022 | 0.688±0.050 | 0.870±0.012 | 10 | 0.905±0.005 | 0.921±0.008 | 0.935±0.004 | 0.627±0.150 | 0.946±0.014 | 0.973±0.009
  | Full | 0.813±0.012 | 0.683±0.020 | 0.873±0.008 | 50 | 0.927±0.001 | 0.947±0.001 | 0.954±0.001 | \textbf{0.397}±0.172 | 0.968±0.001 | 0.977±0.001
  | Full | 0.816±0.008 | 0.678±0.019 | 0.873±0.004 | Full | 0.935±0.001 | \underline{0.954}±0.001 | 0.958±0.001 | 0.625±0.208 | \underline{0.970}±0.002 | 0.981±0.001
UMTL-SSL-S | 100 | 0.798±0.030 | 0.628±0.081 | 0.860±0.018 | 10 | 0.951±0.004 | 0.911±0.008 | 0.935±0.004 | 0.792±0.313 | 0.940±0.006 | 0.963±0.006
  | 100 | 0.834±0.033 | 0.696±0.074 | 0.874±0.019 | 50 | 0.972±0.001 | 0.946±0.001 | 0.952±0.001 | 0.727±0.340 | 0.965±0.002 | 0.977±0.002
  | 100 | 0.817±0.036 | 0.688±0.100 | 0.860±0.021 | Full | 0.975±0.001 | 0.951±0.001 | 0.954±0.001 | 0.812±0.315 | 0.968±0.002 | 0.981±0.002
  | 1000 | 0.806±0.020 | 0.652±0.055 | 0.872±0.014 | 10 | 0.956±0.002 | 0.916±0.004 | 0.937±0.003 | 0.852±0.275 | 0.943±0.005 | 0.966±0.003
  | 1000 | 0.808±0.013 | 0.662±0.038 | 0.862±0.010 | 50 | 0.971±0.000 | 0.944±0.001 | 0.952±0.000 | 0.917±0.239 | 0.965±0.002 | 0.978±0.003
  | 1000 | 0.801±0.020 | 0.646±0.049 | 0.862±0.010 | Full | 0.975±0.001 | 0.952±0.001 | 0.954±0.001 | 0.753±0.228 | 0.969±0.001 | 0.981±0.001
  | Full | 0.796±0.033 | 0.632±0.086 | 0.864±0.018 | 10 | 0.960±0.002 | 0.923±0.004 | 0.940±0.002 | 0.782±0.229 | 0.954±0.005 | 0.967±0.003
  | Full | 0.808±0.014 | 0.662±0.030 | 0.868±0.007 | 50 | 0.972±0.001 | 0.945±0.001 | 0.953±0.001 | 0.645±0.196 | 0.966±0.003 | 0.978±0.003
  | Full | 0.800±0.016 | 0.632±0.038 | 0.628±0.009 | Full | 0.961±0.008 | 0.924±0.016 | 0.940±0.009 | 0.392±0.337 | 0.948±0.014 | 0.969±0.007
MultiMix | 100 | 0.800±0.025 | 0.594±0.064 | 0.856±0.015 | 10 | 0.954±0.004 | 0.920±0.008 | 0.938±0.004 | 0.695±0.198 | 0.949±0.010 | 0.969±0.007
  | 100 | 0.824±0.022 | 0.613±0.056 | 0.854±0.014 | 50 | 0.971±0.001 | 0.943±0.002 | 0.951±0.001 | 0.681±0.086 | 0.964±0.003 | 0.976±0.002
  | 100 | 0.792±0.035 | 0.593±0.101 | 0.854±0.021 | Full | 0.973±0.012 | 0.948±0.022 | 0.954±0.015 | 0.636±0.070 | 0.966±0.025 | 0.981±0.004
  | 1000 | 0.817±0.016 | 0.647±0.038 | 0.865±0.006 | 10 | 0.954±0.004 | 0.910±0.008 | 0.932±0.004 | 0.902±0.186 | 0.942±0.005 | 0.968±0.007
  | 1000 | 0.825±0.016 | 0.650±0.033 | 0.860±0.011 | 50 | 0.970±0.001 | 0.941±0.002 | 0.950±0.001 | 0.811±0.112 | 0.964±0.004 | 0.977±0.002
  | 1000 | 0.830±0.048 | 0.586±0.138 | 0.856±0.029 | Full | \textbf{0.974}±0.011 | 0.919±0.020 | 0.953±0.014 | 0.643±0.126 | 0.933±0.024 | 0.984±0.004
  | Full | 0.840±0.025 | 0.730±0.060 | 0.880±0.016 | 10 | 0.954±0.002 | 0.913±0.004 | 0.935±0.001 | 0.621±0.123 | 0.949±0.006 | 0.968±0.006
  | Full | \textbf{0.854}±0.022 | \textbf{0.760}±0.055 | \textbf{0.890}±0.014 | 50 | 0.972±0.001 | \textbf{0.950}±0.001 | \textbf{0.956}±0.001 | 0.692±0.036 | \textbf{0.970}±0.003 | \textbf{0.980}±0.003
  | Full | \underline{0.843}±0.024 | \underline{0.740}±0.065 | \underline{0.890}±0.017 | Full | \underline{0.975}±0.000 | 0.952±0.001 | \underline{0.960}±0.001 | 0.528±0.037 | \underline{0.970}±0.001 | \underline{0.982}±0.001
Table 3: Classification and segmentation performance figures with varying label proportions in cross-domain evaluations: NIHX (classification) and MCU (segmentation) datasets. The best scores from fully-supervised models are underlined and the best scores from semi-supervised models are bolded. Scores are given as mean ± std.
Model | $|\mathcal{D}^{c}_{l}|$ | Acc | F1-Nor | F1-Abn | $|\mathcal{D}^{s}_{l}|$ | DS | JS | SSIM | HD | P | R
U-Net | - | - | - | - | 10 | 0.555±0.047 | 0.480±0.053 | 0.680±0.059 | 8.691±1.100 | 0.553±0.070 | 0.866±0.032
  | - | - | - | - | 50 | 0.763±0.026 | 0.736±0.037 | 0.870±0.019 | 2.895±0.832 | 0.752±0.035 | 0.887±0.019
  | - | - | - | - | Full | 0.838±0.023 | 0.906±0.035 | 0.929±0.017 | 1.414±0.529 | 0.793±0.041 | 0.910±0.013
Enc | 100 | 0.352±0.035 | 0.070±0.131 | 0.506±0.008 | - | - | - | - | - | - | -
  | 1000 | 0.390±0.037 | 0.192±0.124 | 0.508±0.007 | - | - | - | - | - | - | -
  | Full | 0.434±0.026 | 0.296±0.068 | 0.524±0.005 | - | - | - | - | - | - | -
Enc-MM | 100 | 0.360±0.030 | 0.110±0.065 | 0.500±0.024 | - | - | - | - | - | - | -
  | 1000 | 0.406±0.078 | 0.242±0.114 | 0.460±0.010 | - | - | - | - | - | - | -
  | Full | 0.452±0.040 | 0.316±0.068 | 0.502±0.030 | - | - | - | - | - | - | -
Enc-SSL | 100 | 0.402±0.052 | 0.222±0.136 | 0.510±0.012 | - | - | - | - | - | - | -
  | 1000 | 0.486±0.050 | 0.380±0.109 | 0.530±0.010 | - | - | - | - | - | - | -
  | Full | 0.510±0.024 | 0.472±0.056 | 0.538±0.004 | - | - | - | - | - | - | -
UMTL | 100 | 0.350±0.034 | 0.045±0.123 | 0.510±0.003 | 10 | 0.586±0.023 | 0.708±0.035 | 0.836±0.020 | 7.156±2.316 | 0.731±0.032 | 0.950±0.014
  | 100 | 0.363±0.029 | 0.085±0.096 | 0.515±0.005 | 50 | 0.580±0.040 | 0.684±0.061 | 0.825±0.038 | 7.013±1.576 | 0.697±0.065 | 0.975±0.012
  | 100 | 0.342±0.049 | 0.015±0.153 | 0.508±0.006 | Full | 0.607±0.023 | 0.742±0.037 | 0.863±0.021 | 6.398±1.331 | 0.759±0.041 | 0.968±0.007
  | 1000 | 0.413±0.004 | 0.263±0.033 | 0.507±0.005 | 10 | 0.676±0.020 | 0.674±0.030 | 0.833±0.016 | 3.268±0.768 | 0.712±0.025 | 0.927±0.013
  | 1000 | 0.400±0.022 | 0.203±0.069 | 0.513±0.005 | 50 | 0.704±0.025 | 0.811±0.040 | 0.896±0.015 | 3.232±0.229 | 0.828±0.034 | 0.964±0.014
  | 1000 | 0.430±0.041 | 0.293±0.110 | 0.517±0.005 | Full | 0.638±0.015 | 0.795±0.024 | 0.890±0.013 | 3.893±0.465 | 0.810±0.031 | 0.966±0.010
  | Full | 0.455±0.015 | 0.365±0.038 | 0.525±0.004 | 10 | 0.737±0.037 | 0.765±0.056 | 0.879±0.028 | 0.917±0.478 | 0.801±0.059 | 0.930±0.013
  | Full | 0.444±0.026 | 0.332±0.075 | 0.522±0.007 | 50 | 0.868±0.022 | 0.793±0.038 | 0.894±0.014 | 0.742±0.212 | 0.898±0.0378 | 0.946±0.004
  | Full | 0.443±0.038 | 0.328±0.099 | 0.520±0.010 | Full | 0.854±0.018 | 0.828±0.030 | 0.913±0.012 | 0.792±0.490 | 0.866±0.026 | 0.942±0.008
UMTL-S | 100 | 0.344±0.021 | 0.006±0.074 | 0.510±0.005 | 10 | 0.797±0.029 | 0.670±0.042 | 0.807±0.027 | 5.754±1.047 | 0.698±0.043 | 0.938±0.009
  | 100 | 0.364±0.035 | 0.098±0.019 | 0.506±0.006 | 50 | 0.828±0.032 | 0.715±0.049 | 0.826±0.033 | 6.412±1.753 | 0.731±0.053 | 0.971±0.009
  | 100 | 0.342±0.016 | 0.008±0.070 | 0.510±0.000 | Full | 0.838±0.041 | 0.715±0.063 | 0.834±0.043 | 6.321±1.573 | 0.740±0.068 | 0.966±0.009
  | 1000 | 0.378±0.017 | 0.138±0.057 | 0.512±0.007 | 10 | 0.844±0.018 | 0.718±0.027 | 0.854±0.014 | 3.921±0.480 | 0.754±0.023 | 0.939±0.011
  | 1000 | 0.392±0.024 | 0.186±0.078 | 0.514±0.005 | 50 | 0.883±0.016 | 0.793±0.025 | 0.888±0.011 | 3.017±0.191 | 0.821±0.027 | 0.959±0.002
  | 1000 | 0.370±0.014 | 0.130±0.057 | 0.510±0.000 | Full | 0.898±0.016 | 0.831±0.027 | 0.905±0.014 | 4.150±1.269 | 0.845±0.031 | 0.970±0.010
  | Full | 0.470±0.026 | 0.398±0.015 | 0.524±0.010 | 10 | 0.881±0.019 | 0.785±0.031 | 0.888±0.014 | 0.862±0.199 | 0.830±0.034 | 0.939±0.009
  | Full | 0.413±0.009 | 0.270±0.008 | 0.510±0.014 | 50 | 0.917±0.009 | 0.848±0.014 | 0.919±0.006 | 0.658±0.227 | 0.966±0.012 | 0.888±0.007
  | Full | 0.433±0.026 | 0.315±0.007 | 0.513±0.015 | Full | 0.916±0.008 | 0.850±0.013 | 0.921±0.005 | 0.882±0.151 | 0.886±0.014 | 0.952±0.002
UMTL-SSL | 100 | 0.442±0.045 | 0.316±0.004 | 0.524±0.007 | 10 | 0.833±0.025 | 0.778±0.042 | 0.884±0.019 | 0.895±0.415 | 0.810±0.042 | 0.948±0.007
  | 100 | 0.398±0.010 | 0.166±0.004 | 0.520±0.019 | 50 | 0.853±0.023 | 0.839±0.038 | 0.907±0.017 | 0.851±0.157 | 0.864±0.039 | 0.952±0.007
  | 100 | 0.385±0.062 | 0.165±0.006 | 0.515±0.013 | Full | 0.841±0.023 | 0.818±0.039 | 0.911±0.015 | 0.853±0.410 | 0.854±0.039 | 0.949±0.007
  | 1000 | 0.445±0.038 | 0.333±0.091 | 0.525±0.005 | 10 | 0.818±0.032 | 0.781±0.051 | 0.892±0.023 | 1.085±0.464 | 0.825±0.051 | 0.938±0.008
  | 1000 | 0.526±0.080 | 0.486±0.017 | 0.544±0.016 | 50 | 0.826±0.018 | 0.804±0.029 | 0.904±0.014 | 0.811±0.137 | 0.792±0.031 | 0.949±0.006
  | 1000 | 0.485±0.043 | 0.413±0.097 | 0.538±0.008 | Full | 0.843±0.010 | 0.837±0.018 | 0.924±0.007 | 0.983±0.429 | 0.882±0.016 | 0.953±0.006
  | Full | 0.526±0.023 | 0.504±0.036 | 0.546±0.012 | 10 | 0.824±0.020 | 0.765±0.0312 | 0.873±0.017 | 0.994±0.379 | 0.790±0.031 | 0.943±0.010
  | Full | 0.530±0.030 | 0.514±0.058 | 0.542±0.012 | 50 | 0.867±0.020 | 0.839±0.034 | 0.917±0.013 | 0.566±0.282 | 0.881±0.033 | 0.945±0.005
  | Full | \underline{0.520}±0.031 | \underline{0.490}±0.061 | 0.542±0.007 | Full | 0.884±0.011 | 0.884±0.021 | 0.934±0.008 | 0.599±0.201 | 0.918±0.021 | 0.955±0.003
UMTL-SSL-S | 100 | 0.370±0.046 | 0.114±0.129 | 0.510±0.011 | 10 | 0.853±0.026 | 0.747±0.041 | 0.866±0.019 | 1.048±0.186 | 0.782±0.041 | 0.944±0.010
  | 100 | 0.400±0.067 | 0.192±0.171 | 0.518±0.012 | 50 | 0.889±0.019 | 0.799±0.031 | 0.899±0.014 | 0.854±0.298 | 0.834±0.031 | 0.950±0.006
  | 100 | 0.370±0.061 | 0.114±0.159 | 0.514±0.016 | Full | 0.915±0.019 | 0.848±0.032 | 0.920±0.013 | 0.987±0.328 | 0.880±0.033 | 0.956±0.003
  | 1000 | 0.432±0.043 | 0.286±0.019 | 0.524±0.010 | 10 | 0.871±0.034 | 0.785±0.054 | 0.884±0.025 | 1.327±0.135 | 0.818±0.053 | 0.944±0.008
  | 1000 | 0.458±0.077 | 0.342±0.186 | 0.530±0.013 | 50 | 0.893±0.008 | 0.803±0.014 | 0.895±0.006 | 1.123±0.215 | 0.835±0.015 | 0.946±0.005
  | 1000 | 0.462±0.050 | 0.350±0.115 | 0.536±0.014 | Full | 0.930±0.008 | 0.860±0.013 | 0.925±0.006 | 1.042±0.206 | 0.912±0.013 | 0.955±0.003
  | Full | 0.482±0.042 | 0.412±0.087 | 0.536±0.008 | 10 | 0.880±0.031 | 0.765±0.050 | 0.885±0.022 | 0.745±0.219 | 0.818±0.045 | 0.941±0.014
  | Full | 0.490±0.021 | 0.426±0.041 | 0.540±0.006 | 50 | 0.912±0.014 | 0.845±0.024 | 0.909±0.009 | 0.956±0.215 | 0.881±0.025 | 0.952±0.002
  | Full | 0.510±0.038 | 0.474±0.071 | 0.540±0.011 | Full | 0.875±0.032 | 0.809±0.053 | 0.875±0.026 | 0.722±0.335 | 0.851±0.057 | 0.944±0.008
MultiMix | 100 | 0.440±0.058 | 0.164±0.019 | 0.510±0.014 | 10 | 0.857±0.028 | 0.732±0.044 | 0.863±0.018 | 1.227±0.534 | 0.767±0.044 | 0.943±0.016
  | 100 | 0.370±0.086 | 0.036±0.003 | 0.510±0.013 | 50 | 0.889±0.021 | 0.790±0.036 | 0.890±0.015 | 1.061±0.434 | 0.866±0.035 | 0.947±0.008
  | 100 | 0.500±0.080 | 0.300±0.002 | 0.510±0.006 | Full | 0.899±0.022 | 0.825±0.036 | 0.906±0.017 | 0.647±0.074 | 0.852±0.040 | 0.952±0.012
  | 1000 | 0.520±0.041 | 0.386±0.009 | 0.530±0.014 | 10 | 0.862±0.017 | 0.775±0.026 | 0.878±0.011 | 1.307±0.325 | 0.816±0.029 | 0.939±0.006
  | 1000 | 0.540±0.018 | 0.500±0.036 | 0.536±0.005 | 50 | 0.912±0.018 | 0.831±0.031 | 0.907±0.012 | 1.293±0.375 | 0.865±0.030 | 0.955±0.007
  | 1000 | \textbf{0.570}±0.088 | \textbf{0.620}±0.003 | 0.510±0.008 | Full | \textbf{0.936}±0.026 | \textbf{0.880}±0.043 | \textbf{0.932}±0.022 | 0.803±0.178 | 0.917±0.050 | \textbf{0.979}±0.008
  | Full | 0.550±0.038 | 0.430±0.006 | 0.534±0.010 | 10 | 0.886±0.013 | 0.802±0.022 | 0.894±0.011 | 0.746±0.284 | 0.839±0.028 | 0.948±0.007
  | Full | 0.560±0.040 | 0.570±0.008 | \textbf{0.550}±0.007 | 50 | 0.935±0.017 | 0.878±0.030 | 0.930±0.012 | \textbf{0.515}±0.232 | \textbf{0.928}±0.033 | 0.957±0.005
  | Full | \underline{0.520}±0.022 | \underline{0.490}±0.064 | \underline{0.550}±0.008 | Full | \underline{0.943}±0.009 | \underline{0.892}±0.015 | \underline{0.937}±0.006 | \underline{0.417}±0.181 | \underline{0.928}±0.016 | \underline{0.958}±0.002
(a) JSRT (in-domain): ground truth masks and predictions from U-Net, UMTL, UMTL-S, UMTL-SSL, UMTL-SSL-S, and MultiMix at $|\mathcal{D}^{s}|$ = 10, 50, and Full.
(b) MCU (cross-domain): ground truth masks and predictions from the same models at $|\mathcal{D}^{s}|$ = 10, 50, and Full.
Figure 4: Visualization of the ground truth reference (green) and predicted (red) segmentation boundaries in a chest X-ray reveals the superiority of MultiMix.
(a) in-domain
(b) cross-domain
Figure 5: Bland-Altman plots at varying training labels show good agreement between the number of ground truth pixels and MultiMix-predicted pixels for the (a) in-domain and (b) cross-domain evaluations, as well as consistent improvement with increasing quantities of labeled data.

3.3 Results and Discussion

As the results in Table 2 reveal, the performance of our model improves with the inclusion of each novel component in the backbone network. For the classification task, our confidence-based augmentation approach for semi-supervised learning yields significantly better performance than the baseline models. Even with the minimum $|\mathcal{D}^{c}_{l}|$ and $|\mathcal{D}^{s}_{l}|$, our MultiMix-100-10 model outperforms the fully-supervised baseline (Enc) in classifying normal and abnormal chest X-rays. As confirmed by Student's t-tests, MultiMix exhibits significant improvements over the classification baselines Enc, Enc-SSL, and UMTL ($p<0.05$).
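To make the confidence-guided training signal concrete, the following is a minimal PyTorch-style sketch of such a loss for the classification branch. The 0.9 threshold, the weak_aug/strong_aug callables, and the function name are illustrative assumptions rather than the exact MultiMix implementation; model is assumed to map a batch of images to class logits.

```python
import torch
import torch.nn.functional as F

def ssl_classification_loss(model, x_lab, y_lab, x_unlab,
                            weak_aug, strong_aug, threshold=0.9):
    # Supervised cross-entropy on the labeled minibatch.
    loss_sup = F.cross_entropy(model(x_lab), y_lab)

    # Pseudo-labels from weakly augmented unlabeled images; keep confident ones only.
    with torch.no_grad():
        probs = model(weak_aug(x_unlab)).softmax(dim=-1)
        conf, pseudo = probs.max(dim=-1)
        mask = (conf >= threshold).float()

    # Consistency: the strongly augmented view should match the retained pseudo-labels.
    loss_unsup = (F.cross_entropy(model(strong_aug(x_unlab)), pseudo,
                                  reduction="none") * mask).mean()
    return loss_sup + loss_unsup
```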

For the segmentation task, the inclusion of the saliency bridge module yields large improvements over the baseline U-Net and UMTL models. Again, with the minimum $|\mathcal{D}^{s}_{l}|$, we observed a 30% performance gain over the counterpart models, demonstrating the effectiveness of MultiMix. The improvement in Dice scores of MultiMix with minimal supervision over the segmentation baselines U-Net, UMTL, and UMTL-S is statistically significant ($p<0.05$), confirming the quantitative efficacy of MultiMix. Figure 3 shows improved and more consistent segmentation performance by the MultiMix model relative to the baselines. For a fair comparison, we used the same backbone U-Net and the same classification branch in all the models.
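A hedged sketch of how such a saliency bridge can be realized is shown below: class-specific saliency maps are computed as input gradients of the top predicted class score, then resized and concatenated with the decoder features. The exact placement, scaling, and gradient handling in the released code may differ, and the function and argument names are assumptions.

```python
import torch
import torch.nn.functional as F

def saliency_bridge(images, logits, decoder_feats):
    # images must have requires_grad=True before the encoder/classifier forward pass.
    top_scores = logits.gather(1, logits.argmax(dim=1, keepdim=True)).sum()
    grads = torch.autograd.grad(top_scores, images, create_graph=True)[0]
    saliency = grads.abs().amax(dim=1, keepdim=True)   # one saliency map per image

    # Resize the saliency maps (and the inputs) to the decoder's spatial size.
    h, w = decoder_feats.shape[-2:]
    sal = F.interpolate(saliency, size=(h, w), mode="bilinear", align_corners=False)
    img = F.interpolate(images, size=(h, w), mode="bilinear", align_corners=False)

    # Concatenate along the channel axis; the decoder consumes the result.
    return torch.cat([decoder_feats, sal, img], dim=1)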

In Figure 4, the segmented lung boundary visualizations also show that MultiMix agrees with the reference masks more closely than the other models (also see Appendix B). For both the in-domain and cross-domain segmentations, we observe that the predicted boundaries are nearly identical to the reference boundaries, as they substantially overlap. Moreover, the noise in the predictions diminishes as each additional component is introduced into the intermediate models, which justifies the value of those components in the MultiMix model. The good agreement between the ground truth lung masks and the MultiMix-predicted segmentation masks is confirmed by the Bland-Altman plots for varying quantities of labeled data, shown in Figure 5(a).

The generalization test on the cross-domain datasets (MCU and NIHX) demonstrates the effectiveness of the MultiMix model. It consistently performs well on both domains, with improved generalizability in either task. As reported in Table 3, the performance of MultiMix is as promising as in the in-domain evaluations. MultiMix achieved better classification scores than all the baseline models. Due to the significant differences between the NIHX and CheX datasets, the scores are not as good as the in-domain results, yet our model performs significantly better than the other classification models Enc, Enc-SSL, and UMTL ($p<0.05$). For the segmentation task, our MultiMix model again achieved better scores across the various metrics, with improved consistency over the baselines (Figure 3). As with the in-domain results, MultiMix shows significant improvements in Dice scores over the segmentation baselines U-Net, UMTL, and UMTL-S ($p<0.05$), demonstrating the generalizability of our method. The Bland-Altman plots in Figure 5(b), for the MultiMix model in the cross-domain segmentation evaluations with varying quantities of labeled data, confirm the good agreement between the ground truth lung segmentation masks and the MultiMix-predicted segmentation masks.
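For reference, the Bland-Altman analysis behind Figure 5 compares, for each test image, the number of ground-truth lung-mask pixels with the number of predicted lung-mask pixels. A small sketch of that standard computation is given below; the variable and function names are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def bland_altman(gt_counts, pred_counts, ax=None):
    gt = np.asarray(gt_counts, dtype=float)
    pred = np.asarray(pred_counts, dtype=float)
    mean = (gt + pred) / 2.0            # average of the paired measurements
    diff = gt - pred                    # their difference
    bias, sd = diff.mean(), diff.std(ddof=1)

    ax = ax or plt.gca()
    ax.scatter(mean, diff, s=8)
    ax.axhline(bias, color="gray")                       # mean difference (bias)
    ax.axhline(bias + 1.96 * sd, color="gray", ls="--")  # upper limit of agreement
    ax.axhline(bias - 1.96 * sd, color="gray", ls="--")  # lower limit of agreement
    ax.set_xlabel("Mean lung-mask pixel count")
    ax.set_ylabel("Difference (ground truth - predicted)")
    return bias, sd
```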

(a) in-domain
(b) cross-domain
Figure 6: Classification accuracies of the supervised and semi-supervised baselines at different training data sizes. The in-domain and cross-domain plots show that MultiMix achieves higher accuracy and consistency than the baselines.
(a) in-domain
(b) cross-domain
Figure 7: ROC curves for the supervised and semi-supervised baselines with 50 segmentation labels show higher AUC values for our MultiMix in both the in-domain and cross-domain evaluations.

Figure 6 demonstrates the superior and more consistent performance of our MultiMix models over the baselines in classifying normal and abnormal (pneumonia) X-rays. Figure 7 further showcases the superior classification performance of MultiMix over the baseline single-task and multi-task models. Together, Figures 6 and 7 show that MultiMix outperforms all baselines across multiple metrics, with the accuracy and Area Under the Curve (AUC) values confirming its superiority. With regard to the cross-domain ROC curves, however, although MultiMix performs best relative to the baselines, its absolute performance indicates room for improvement.
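For completeness, the accuracy and AUC values discussed above can be computed with standard tooling; the sketch below assumes a vector of 0/1 ground-truth labels and predicted pneumonia probabilities, with illustrative names.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def classification_summary(y_true, p_abnormal):
    # y_true: 0/1 ground-truth labels; p_abnormal: predicted probability of pneumonia.
    fpr, tpr, _ = roc_curve(y_true, p_abnormal)
    auc = roc_auc_score(y_true, p_abnormal)
    acc = np.mean((np.asarray(p_abnormal) >= 0.5) == np.asarray(y_true))
    return fpr, tpr, auc, acc
```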

4 Conclusions

We have presented MultiMix, a novel semi-supervised, multi-task learning model that jointly learns classification and segmentation tasks. Through the incorporation of confidence-guided data augmentation and a novel saliency bridge module, MultiMix performs improved and consistent pneumonia detection and lung segmentation when trained on multi-source chest X-ray datasets with varying quantities of ground truth labels. Our thorough experimentation on four different chest X-ray datasets demonstrated the effectiveness of MultiMix in both in-domain and cross-domain evaluations for both tasks, outperforming a number of baseline models.

Beyond chest X-rays, which constitute the most frequently performed radiologic procedure worldwide (about 40% of all imaging tests, or 1.4 billion examinations annually (World Health Organization, 2016)), our future work will focus on generalizing the MultiMix concept, particularly the saliency bridge module, to other applications and imaging modalities, including volumetric images.


Ethical Standards

Appropriate ethical standards were maintained in writing this manuscript and conducting the reported research, following all applicable laws and regulations regarding the treatment of animals or human subjects.


Conflicts of Interest

DT is a founder of VoxelCloud, Inc., Los Angeles, CA, USA.

A Model Architecture

Architectural details of the MultiMix model are presented in Table 4 for the encoder network and in Table 5 for the decoder network. The encoder and decoder are composed of double-convolution blocks; the encoder has 5 blocks and the decoder has 4. Each block comprises a 2D convolutional layer, an instance normalization layer, and a Leaky ReLU activation, with this conv-norm-activation sequence repeated twice per block.
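A minimal PyTorch sketch of such a double-convolution block, consistent with Table 4, is given below; the 3×3 kernel size and unit padding are assumptions, since the table lists only feature-map shapes.

```python
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two repetitions of Conv2d -> InstanceNorm2d -> LeakyReLU, as described above."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),  # kernel size assumed
            nn.InstanceNorm2d(out_ch),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.InstanceNorm2d(out_ch),
            nn.LeakyReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```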

In the encoder, each double-convolution block is followed by a dropout layer and a maxpooling layer. The encoder terminates in a classification branch, comprising a fifth maxpooling layer, an average pooling layer, and a fully-connected layer that produces the classification prediction.
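A corresponding sketch of the classification branch is shown below; the channel count of the deepest encoder block and the pooling configuration are assumptions inferred from the description above.

```python
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Sketch of the classification branch: final maxpool, global average pooling, FC layer."""
    def __init__(self, in_channels, n_classes=2):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, n_classes)

    def forward(self, feats):                     # feats: deepest encoder feature maps
        x = self.gap(self.pool(feats)).flatten(1)
        return self.fc(x)
```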

The decoder begins with an upsampling layer. Then, in its first double-convolution block, the downsampled saliency maps and the original inputs are concatenated into the block's input. The increase in dimensions at the beginning of each decoder block is due to the skip connections. These convolutional layers are likewise followed by a dropout layer. This sequence is repeated for three more blocks. To output the final segmentation prediction, the decoder ends with a single convolutional layer that reduces the feature maps to a single channel.
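The following sketch captures one such decoder stage, with the saliency map and downsampled input passed in as an optional extra tensor for the first block; the dropout rate and kernel sizes are assumptions, and in_channels must equal the total channel count of the concatenated tensors.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Upsample, concatenate skip connection (plus optional saliency/input tensor),
    then a double-convolution block followed by dropout."""
    def __init__(self, in_channels, out_channels, p_drop=0.3):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, padding=1),
            nn.InstanceNorm2d(out_channels), nn.LeakyReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
            nn.InstanceNorm2d(out_channels), nn.LeakyReLU(inplace=True),
        )
        self.drop = nn.Dropout2d(p_drop)

    def forward(self, x, skip, extra=None):
        x = self.up(x)
        parts = [x, skip] if extra is None else [x, skip, extra]
        return self.drop(self.conv(torch.cat(parts, dim=1)))
```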

B Segmentation Visualization

Figure 8 shows the ground truth lung masks and the masks predicted by the MultiMix model (MultiMix-50-1000) for a number of images from the JSRT dataset (in-domain) and the MCU dataset (cross-domain). Both parts of the figure show accurate predicted segmentation masks, in-domain and cross-domain alike, with almost no noise in the predictions, demonstrating the effectiveness of our algorithm even when trained with limited labeled data.

C Saliency Visualization

Figure 9 shows the class-specific saliency maps generated by our MultiMix-50-1000 model for both in-domain and cross-domain classification data ($X^{c}$). The maps consistently highlight particular regions in the input X-rays for the Normal and Pneumonia classes. Similarly, Figure 10 shows the saliency maps for the in-domain and cross-domain segmentation data ($X^{s}$). Although class labels are not available for these images, two distinct types of saliency maps are generated, as for the classification data.

Class-specific saliency maps generated for images in $X^{c}$ consistently highlight the regions responsible for predicting the particular classes of the images (Figure 9), enabling the use of these maps to improve the segmentation of images in $X^{s}$ (Figure 10).

Table 4: Architectural details of the MultiMix encoder for minibatch size $m$.
Name | Input Feature Maps | Output Feature Maps
Conv Layer - 1 | m × 256 × 256 × 1 | m × 256 × 256 × 16
InstanceNorm - 1 | m × 256 × 256 × 16 | m × 256 × 256 × 16
LReLU - 1 | m × 256 × 256 × 16 | m × 256 × 256 × 16
Conv Layer - 2 | m × 256 × 256 × 16 | m × 256 × 256 × 16
InstanceNorm - 2 | m × 256 × 256 × 16 | m × 256 × 256 × 16
LReLU - 2 | m × 256 × 256 × 16 | m × 256 × 256 × 16
Dropout - 1 | m × 256 × 256 × 16 | m × 256 × 256 × 16
Maxpool - 1 | m × 256 × 256 × 16 | m × 128 × 128 × 16
Conv Layer - 3 | m × 128 × 128 × 16 | m × 128 × 128 × 32
InstanceNorm - 3 | m × 128 × 128 × 32 | m × 128 × 128 × 32
LReLU - 3 | m × 128 × 128 × 32 | m × 128 × 128 × 32
Conv Layer - 4 | m × 128 × 128 × 32 | m × 128 × 128 × 32
InstanceNorm - 4 | m × 128 × 128 × 32 | m × 128 × 128 × 32
LReLU - 4 | m × 128 × 128 × 32 | m × 128 × 128 × 32
Dropout - 2 | m × 128 × 128 × 32 | m × 128 × 128 × 32
Maxpool - 2 | m × 128 × 128 × 32 | m × 64 × 64 × 32
Conv Layer - 5 | m × 64 × 64 × 32 | m × 64 × 64 × 64
InstanceNorm - 5 | m × 64 × 64 × 64 | m × 64 × 64 × 64
LReLU - 5 | m × 64 × 64 × 64 | m × 64 × 64 × 64
Conv Layer - 6 | m × 64 × 64 × 64 | m × 64 × 64 × 64
InstanceNorm - 6 | m × 64 × 64 × 64 | m × 64 × 64 × 64
LReLU - 6 | m × 64 × 64 × 64 | m × 64 × 64 × 64
Dropout - 3 | m × 64 × 64 × 64 | m × 64 × 64 × 64
Maxpool - 3 | m × 64 × 64 × 64 | m × 32 × 32 × 64
Conv Layer - 7 | m × 32 × 32 × 64 | m × 32 × 32 × 128
InstanceNorm - 7 | m × 32 × 32 × 128 | m × 32 × 32 × 128
LReLU - 7 | m × 32 × 32 × 128 | m × 32 × 32 × 128
Conv Layer - 8 | m × 32 × 32 × 128 | m × 32 × 32 × 128
InstanceNorm - 8 | m × 32 × 32 × 128 | m × 32 × 32 × 128
LReLU - 8 | m × 32 × 32 × 128 | m × 32 × 32 × 128
Dropout - 4 | m × 32 × 32 × 128 | m × 32 × 32 × 128
Maxpool - 4m×32×32×128𝑚3232128m\times 32\times 32\times 128italic_m × 32 × 32 × 128m×16×16×128𝑚1616128m\times 16\times 16\times 128italic_m × 16 × 16 × 128
Conv Layer - 9m×16×16×128𝑚1616128m\times 16\times 16\times 128italic_m × 16 × 16 × 128m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256
InstanceNorm - 9m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256
LReLU - 9m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256
Conv Layer - 10m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256
InstanceNorm - 10m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256
LReLU - 10m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256
Dropout - 5m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256
Maxpool - 5m×16×16×256𝑚1616256m\times 16\times 16\times 256italic_m × 16 × 16 × 256m×8×8×256𝑚88256m\times 8\times 8\times 256italic_m × 8 × 8 × 256
Avgpoolm×8×8×256𝑚88256m\times 8\times 8\times 256italic_m × 8 × 8 × 256m×1×1×256𝑚11256m\times 1\times 1\times 256italic_m × 1 × 1 × 256
GAPm×1×1×256𝑚11256m\times 1\times 1\times 256italic_m × 1 × 1 × 256m×256𝑚256m\times 256italic_m × 256
Fully Connected Layerm×256𝑚256m\times 256italic_m × 256m×2𝑚2m\times 2italic_m × 2
Table 5: Architectural details of the MultiMix Decoder for minibatch size m.

Name | Input Feature Maps | Output Feature Maps
Upsample - 1 | m × 16 × 16 × 256 | m × 32 × 32 × 256
Conv Layer - 1 | m × 32 × 32 × 386 | m × 32 × 32 × 128
InstanceNorm - 1 | m × 32 × 32 × 128 | m × 32 × 32 × 128
LReLU - 1 | m × 32 × 32 × 128 | m × 32 × 32 × 128
Conv Layer - 2 | m × 32 × 32 × 128 | m × 32 × 32 × 128
InstanceNorm - 2 | m × 32 × 32 × 128 | m × 32 × 32 × 128
LReLU - 2 | m × 32 × 32 × 128 | m × 32 × 32 × 128
Dropout - 1 | m × 32 × 32 × 128 | m × 32 × 32 × 128
Upsample - 2 | m × 32 × 32 × 128 | m × 64 × 64 × 128
Conv Layer - 3 | m × 64 × 64 × 192 | m × 64 × 64 × 64
InstanceNorm - 3 | m × 64 × 64 × 64 | m × 64 × 64 × 64
LReLU - 3 | m × 64 × 64 × 64 | m × 64 × 64 × 64
Conv Layer - 4 | m × 64 × 64 × 64 | m × 64 × 64 × 64
InstanceNorm - 4 | m × 64 × 64 × 64 | m × 64 × 64 × 64
LReLU - 4 | m × 64 × 64 × 64 | m × 64 × 64 × 64
Dropout - 2 | m × 64 × 64 × 64 | m × 64 × 64 × 64
Upsample - 3 | m × 64 × 64 × 64 | m × 128 × 128 × 64
Conv Layer - 5 | m × 128 × 128 × 96 | m × 128 × 128 × 32
InstanceNorm - 5 | m × 128 × 128 × 32 | m × 128 × 128 × 32
LReLU - 5 | m × 128 × 128 × 32 | m × 128 × 128 × 32
Conv Layer - 6 | m × 128 × 128 × 32 | m × 128 × 128 × 32
InstanceNorm - 6 | m × 128 × 128 × 32 | m × 128 × 128 × 32
LReLU - 6 | m × 128 × 128 × 32 | m × 128 × 128 × 32
Dropout - 3 | m × 128 × 128 × 32 | m × 128 × 128 × 32
Upsample - 4 | m × 128 × 128 × 32 | m × 256 × 256 × 32
Conv Layer - 7 | m × 256 × 256 × 48 | m × 256 × 256 × 16
InstanceNorm - 7 | m × 256 × 256 × 16 | m × 256 × 256 × 16
LReLU - 7 | m × 256 × 256 × 16 | m × 256 × 256 × 16
Conv Layer - 8 | m × 256 × 256 × 16 | m × 256 × 256 × 16
InstanceNorm - 8 | m × 256 × 256 × 16 | m × 256 × 256 × 16
LReLU - 8 | m × 256 × 256 × 16 | m × 256 × 256 × 16
Dropout - 4 | m × 256 × 256 × 16 | m × 256 × 256 × 16
Final Conv Layer | m × 256 × 256 × 16 | m × 256 × 256 × 1
Figure 8: Visualizations of the lung masks segmented by MultiMix-50-1000 on (a) the in-domain JSRT dataset and (b) the cross-domain MCU dataset. Each panel shows the input image, the ground-truth mask, and the predicted mask. The results show good agreement between the ground-truth and predicted masks.
Figure 9: Examples from $X^{c}$: class-specific MultiMix saliency maps for Normal and Pneumonia images from (a) the in-domain CheX dataset and (b) the cross-domain NIHX dataset. Each panel shows the input image and its saliency map. The maps highlight regions crucial for detecting pneumonia in the input X-ray images, demonstrating the effective predictions by the classifier and providing useful information for improved segmentation.
Figure 10: Examples from $X^{s}$: MultiMix saliency maps for images from (a) the in-domain JSRT dataset and (b) the cross-domain MCU dataset. Each panel shows the input image and its saliency map. The maps consistently highlight the crucial regions in the input X-ray images, thus providing useful information for improved segmentation.
