A study from the HKUST Smart Lab on Optical Coherence Tomography Angiography (OCTA) image super-resolution has been accepted for publication in the IEEE Transactions on Neural Networks and Learning Systems (TNNLS). The research introduces a learnable texture generator within a reference-based super-resolution model to enhance the resolution of OCTA images with a broader field of view (FOV).
Optical Coherence Tomography Angiography (OCTA) is an advanced imaging technique that provides high-resolution, depth-resolved vascular flow images for observing the retinal microvasculature. However, OCTA faces a significant trade-off: as the field of view increases, the resolution decreases, because the number of B-scans is fixed and B-scan density therefore drops with larger FOVs. For instance, small-FOV 3×3 mm OCTA images can clearly depict the microvasculature and the foveal avascular zone (FAZ) of the macula. In contrast, larger-FOV images, such as 6×6 mm scans, suffer from significant resolution degradation, potentially missing or distorting fine capillary structures. Yet large-FOV images are crucial in clinical practice for detecting pathological features such as microaneurysms and non-perfusion areas. There is therefore a pressing need for OCTA images that combine a large FOV with high resolution. Acquiring such images directly is challenging, however, because of the increased scan time, which causes patient discomfort and artifacts from involuntary movements such as blinking. Consequently, leveraging image enhancement techniques to improve the resolution of OCTA images is a more viable strategy.
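To make the density argument concrete, consider a fixed scan budget spread over the two FOVs; the figure of 304 B-scans below is an assumption typical of commercial devices, not a number taken from the study:

```latex
% Illustrative B-scan density under a fixed scan budget
% (304 B-scans is an assumed figure, not from the study).
\[
\text{density}_{3\,\mathrm{mm}} = \frac{304\ \text{B-scans}}{3\ \mathrm{mm}} \approx 101\ \text{scans/mm},
\qquad
\text{density}_{6\,\mathrm{mm}} = \frac{304\ \text{B-scans}}{6\ \mathrm{mm}} \approx 51\ \text{scans/mm}.
\]
```

Halving the sampling density in this way is what degrades the visibility of fine capillaries in the larger scan.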
Image super-resolution (SR) aims to reconstruct high-resolution (HR) images from given low-resolution (LR) images. Current SR methods mainly fall into two categories: Single Image Super-Resolution (SISR) and Reference-based Super-Resolution (RefSR). SISR uses only the information in the input LR image, learning the mapping to its HR counterpart from numerous training samples; its limitation is that recovering high-frequency details directly from an LR image is difficult. RefSR addresses this issue by leveraging additional high-resolution reference images to aid the SR process, enabling the network to handle more demanding settings, such as 8× magnification, while producing finer, more realistic details.
We propose a new RefSR model named LTGNet, which integrates a traditional RefSR framework with a Learnable Texture Generator (LTG). The framework first extracts features from the input image with an encoder, then performs a texture search to find the reference textures most similar to the LR features, merges them with the LR features, and generates the super-resolved image through a decoder. During training, the LTG is optimized on numerous LR-HR image pairs so that, at inference, it can generate textures independent of any specific reference image, significantly expanding the texture space and improving model robustness. Additionally, we introduce a hybrid feature fusion decoder that takes its textures from the texture search during training and from the LTG during inference, producing the final super-resolved image through a series of residual blocks and multi-scale feature processing blocks. A simplified code sketch of this training/inference split follows Figure 3.
Figure 1: Overview of LTGNet. (a) Illustration of the overall reference-based framework. (b) Detailed structure of LTG. Different paths involved in different stages are marked with different colors.
Figure 2: Texture search process.
Figure 3: The architecture of the hybrid feature fusion decoder.
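Here is a minimal, hypothetical PyTorch sketch of the pipeline described above. All module names, layer choices, and shapes are illustrative assumptions rather than the authors' released implementation; in particular, the texture search is reduced to a trivial stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LTG(nn.Module):
    """Learnable Texture Generator: maps LR features to HR-domain textures."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, lr_feat: torch.Tensor) -> torch.Tensor:
        # Textures are conditioned only on LR features, so no reference
        # image is needed at inference time.
        return self.body(lr_feat)

class LTGNetSketch(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(            # LR feature extraction
            nn.Conv2d(1, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.ltg = LTG(channels)
        self.decoder = nn.Sequential(            # stand-in for the hybrid
            nn.Conv2d(2 * channels, channels, 3, padding=1),  # feature fusion decoder
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def texture_search(self, lr_feat, ref_feat):
        # Trivial stand-in: align reference features to the LR grid. The
        # real search matches each LR patch to its most similar HR patch.
        return F.interpolate(ref_feat, size=lr_feat.shape[-2:],
                             mode="bilinear", align_corners=False)

    def forward(self, lr, ref_feat=None):
        feat = self.encoder(lr)
        if self.training and ref_feat is not None:
            texture = self.texture_search(feat, ref_feat)   # searched textures
        else:
            texture = self.ltg(feat)                        # generated textures
        fused = torch.cat([feat, texture], dim=1)           # merge with LR features
        return self.decoder(fused)
```

Note the branch in `forward`: at test time `model(lr)` suffices and no reference image is passed, which is the property the paper highlights.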
Datasets: To obtain LR-HR image pairs, original 6×6 mm OCTA images were upscaled by a factor of two using bicubic interpolation and registered with their paired 3×3 mm OCTA images; the largest inscribed rectangle of the overlapping area was then cropped to form each pair. This study is the first to propose acquiring ground truth across the entire 6×6 mm region to validate model performance. In total, 233 image pairs were used for training, 56 pairs were reserved for validation, and an independent test set of 36 image groups covered the entire 6×6 mm region. A simplified sketch of the pairing step follows Figure 4.
Figure 4: Examples of LR-HR image pairs from the dataset.
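As an illustration of the pairing procedure, the sketch below assumes pre-registered grayscale en-face images loaded as NumPy arrays; the function name is hypothetical, and the center crop is a placeholder for the actual largest-inscribed-rectangle crop of the registered overlap.

```python
import cv2
import numpy as np

def make_lr_hr_pair(octa_6mm: np.ndarray, octa_3mm: np.ndarray):
    """Build one LR-HR training pair (shapes and crop are illustrative)."""
    # Upscale the 6x6 mm scan 2x with bicubic interpolation so its pixel
    # spacing roughly matches the 3x3 mm scan.
    lr = cv2.resize(octa_6mm, None, fx=2.0, fy=2.0,
                    interpolation=cv2.INTER_CUBIC)
    # The 3x3 mm FOV covers the central portion of the 6x6 mm scan, so a
    # center crop stands in for the registered largest-inscribed rectangle.
    h, w = octa_3mm.shape
    y0, x0 = (lr.shape[0] - h) // 2, (lr.shape[1] - w) // 2
    lr_patch = lr[y0:y0 + h, x0:x0 + w]
    return lr_patch, octa_3mm  # (LR, HR)
```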
Quantitative Results: We evaluated model performance using Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS); a sketch of these metric computations follows Table 1. Even without reference images at test time, LTGNet automatically generated useful textures and ranked among the top two on most metrics across the two datasets, outperforming several state-of-the-art RefSR models. Notably, our method excelled in SSIM, while DATSR performed more stably on PSNR; LTGNet nevertheless achieved superior perceptual similarity, as indicated by its lower LPIPS values.
Table 1: Quantitative results on the datasets.
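For reference, the three metrics can be computed as follows. This is a minimal sketch assuming uint8 grayscale arrays, using scikit-image for PSNR/SSIM and the `lpips` package (which expects 3-channel tensors scaled to [-1, 1]) for LPIPS.

```python
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(sr: np.ndarray, hr: np.ndarray, lpips_fn: lpips.LPIPS):
    """Return (PSNR, SSIM, LPIPS) for a super-resolved/ground-truth pair."""
    psnr = peak_signal_noise_ratio(hr, sr, data_range=255)
    ssim = structural_similarity(hr, sr, data_range=255)

    def to_tensor(im: np.ndarray) -> torch.Tensor:
        t = torch.from_numpy(im).float() / 127.5 - 1.0  # scale to [-1, 1]
        return t.repeat(3, 1, 1).unsqueeze(0)           # (1, 3, H, W)

    with torch.no_grad():
        lp = lpips_fn(to_tensor(sr), to_tensor(hr)).item()  # lower is better
    return psnr, ssim, lp

# Usage: lpips_fn = lpips.LPIPS(net="alex")  # AlexNet backbone, a common choice
```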
Reconstruction Quality: LTGNet showed enhanced ability to reconstruct subtle vascular details and maintain vascular connectivity. Even in the presence of misleading noise in the low-resolution images, our method accurately identified true vessels and restored their details, whereas other models misinterpreted noise signals as potential neovascularization.
Figure 5: Reconstruction examples of LTGNet and other models.
Ablation Studies: We compared model performance with and without texture information, and with searched versus generated textures. The results indicate that incorporating additional texture information from the high-resolution domain benefits the reconstruction process, with significant improvements in PSNR and SSIM when generated textures are used.
Table 2: Ablation studies on the datasets demonstrating the importance of texture information.
We present LTGNet, a reference-based super-resolution model designed to enhance OCTA images. Rather than relying on a single reference image for texture, LTGNet generates textures through the LTG, expanding the available texture space and reducing dependence on specific reference images, which improves model robustness. Experimental and visual results demonstrate that LTGNet is competitive with state-of-the-art methods in performance and robustness, showcasing its reliability and potential for practical applications. Furthermore, we validate the model's performance across the entire 6×6 mm region, highlighting its superior robustness and generalization capability.
References:
Ruan, Y., Yang, D., Tang, Z., Ran, A. R., Cheung, C. Y., & Chen, H. (2024). Reference-based OCT Angiogram Super-resolution with Learnable Texture Generation. arXiv:2305.05835.