Publication date: Jun 27, 2025
The emergence of novel viral diseases, with SARS-CoV-2 as a stark example, poses a growing threat to public health and causes significant global morbidity and mortality. Accurate identification and segmentation of viruses in images are crucial for tracking virus progression and mutations and for devising new treatment strategies. Virus recognition and segmentation models built on high-performance networks such as U-Net have achieved notable success. However, these models struggle with several challenges, including limited labeled virus images, significant morphological variability, and indistinct boundaries. This study therefore introduces ViruSeg, a model based on the EVA-02 large language-image pre-trained model and data augmentation techniques, designed to perform virus segmentation efficiently. First, ViruSeg applies data augmentation techniques such as cutout and image fine-tuning to enrich electron microscope virus images, which improves model generalization and helps delineate virus boundaries and different forms. Second, ViruSeg uses the EVA-02 pre-trained model to learn a universal representation of virus images, improving adaptability to data scarcity. Finally, virus segmentation is performed with the Cascade Mask R-CNN (CMR) model. Comprehensive evaluations on benchmark datasets demonstrate that ViruSeg outperforms advanced virus segmentation methods. We anticipate that the proposed solution will advance virology research and the development of treatments for related diseases. All datasets and code are available at https://github.com/xiachashuanghua/project.
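To make the cutout augmentation named in the abstract concrete, the following is a minimal sketch of the general technique, not the authors' implementation. The function name `cutout`, the parameters `mask_size` and `fill_value`, and the choice of a single square patch are assumptions made for illustration; the abstract does not specify how ViruSeg configures the augmentation or how it treats the segmentation labels.

```python
import numpy as np


def cutout(image, mask_size, rng=None, fill_value=0):
    """Return a copy of `image` with one square patch set to `fill_value`.

    Hypothetical helper for illustration only; patch size, count, and
    fill value are assumptions, not settings taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]

    # Pick the patch centre anywhere in the image.
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))

    # Clip the patch at the image borders, so patches near an edge are smaller.
    half = mask_size // 2
    y0, y1 = max(cy - half, 0), min(cy - half + mask_size, h)
    x0, x1 = max(cx - half, 0), min(cx - half + mask_size, w)

    out = image.copy()  # leave the input image untouched
    out[y0:y1, x0:x1] = fill_value
    return out


# Example: mask a 16x16 patch in a synthetic 64x64 grayscale image.
image = np.ones((64, 64), dtype=np.float32)
augmented = cutout(image, mask_size=16, rng=np.random.default_rng(0))
```

Hiding a random region forces a model to rely on the remaining context instead of a single distinctive area, which is one common reason cutout is used when labeled images are scarce.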
| Concepts | Keywords |
|---|---|
| Accurate | Data augmentation |
| Global | Data scarcity |
| Microscope | Large language-image model |
| Virology | Virus image segmentation |
| Virus | |
Semantics
| Type | Source | Name |
|---|---|---|
| disease | MESH | viral diseases |
| disease | MESH | morbidity |
| drug | DRUGBANK | Spinosad |
| drug | DRUGBANK | Coenzyme M |