Previously, doctors interpreted computed tomography (CT) images based on their experience in diagnosing kidney diseases. However, with the rapid increase in the number of CT images, such interpretation required considerable time and effort and produced inconsistent results. Several novel neural network models have been proposed to automatically identify kidney or tumor areas in CT images to solve this problem. In most of these models, only the neural network structure was modified to improve accuracy. However, data pre-processing is also a crucial step in improving the results. This study systematically discusses the pre-processing methods necessary before feeding medical images to a neural network model. The experimental results show that the proposed pre-processing methods and models significantly improve accuracy compared with the case without data pre-processing. Specifically, the Dice score improved from 0.9436 to 0.9648 for kidney segmentation and reached 0.7294 for all types of tumor detection. Based on the proposed medical image processing methods and deep learning models, the performance is suitable for clinical applications with limited computational resources, enabling accurate automatic kidney volume calculation and tumor detection in a cost-efficient and effective manner.
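The two core quantities above, intensity windowing as a pre-processing step and the Dice score as the evaluation metric, can be sketched as follows. This is a minimal illustration; the window level and width below are generic soft-tissue values, not necessarily the study's actual settings.

```python
import numpy as np

def apply_window(ct_hu, level=40, width=400):
    """Clip CT intensities (Hounsfield units) to a window and rescale to [0, 1].
    Level/width are illustrative defaults, not the study's reported settings."""
    lo, hi = level - width / 2, level + width / 2
    return (np.clip(ct_hu, lo, hi) - lo) / (hi - lo)

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / (pred.sum() + target.sum() + eps)

# Toy example: a mask compared with itself gives a Dice score of 1.0.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
print(dice_score(mask, mask))
```

Windowing maps the wide Hounsfield-unit range onto [0, 1] so the network sees consistent contrast over the organ of interest; the Dice score then measures overlap between the predicted and ground-truth masks.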
This paper proposes an encoder-decoder architecture for kidney segmentation. A hyperparameter optimization process is implemented, including the development of a model architecture, the selection of a windowing method and a loss function, and data augmentation. The model, consisting of EfficientNet-B5 as the encoder and a feature pyramid network as the decoder, yields the best performance with a Dice score of 0.969 on the 2019 Kidney and Kidney Tumor Segmentation Challenge dataset. The proposed model is tested with different voxel spacings, anatomical planes, and kidney and tumor volumes. Moreover, case studies are conducted to analyze segmentation outliers. Finally, five-fold cross-validation and the 3D-IRCAD-01 dataset are used to evaluate the developed model in terms of the following evaluation metrics: the Dice score, recall, precision, and the Intersection over Union score. This paper demonstrates a new development and application of artificial intelligence algorithms for image analysis and interpretation. Overall, our experimental results show that the proposed kidney segmentation solution for CT images can be applied to clinical needs to assist surgeons in surgical planning. It enables the calculation of total kidney volume for kidney function estimation in ADPKD and supports radiologists and doctors in disease diagnosis and in tracking disease progression.
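The four evaluation metrics named above can all be derived from the true-positive, false-positive, and false-negative counts of a pair of binary masks. A minimal NumPy sketch (illustrative, not the paper's evaluation code):

```python
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    """Dice, recall, precision, and IoU for binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()   # predicted and true
    fp = np.logical_and(pred, ~target).sum()  # predicted but false
    fn = np.logical_and(~pred, target).sum()  # missed
    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "recall": tp / (tp + fn + eps),
        "precision": tp / (tp + fp + eps),
        "iou": tp / (tp + fp + fn + eps),
    }

# Toy example: one correct pixel, one over-segmented pixel.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
m = segmentation_metrics(pred, target)
```

Note that Dice and IoU are monotonically related, but Dice weighs the overlap more heavily, which is why segmentation papers commonly report both.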
Beamforming is regarded as a promising technique for future wireless communication systems. In this regard, codebook-based beamforming offers satisfactory performance with acceptable computational complexity; however, it requires a high beam-sweeping overhead with considerable power consumption. To achieve a balance between the beam overhead used and the spectral efficiency achieved, we propose a super-resolution-based scheme using a hierarchical codebook. We formulate beam sweeping as an inference problem, where low-resolution beam radiating responses are the input and high-resolution beam sweeping responses are the output. Simulation results confirm that the proposed scheme exhibits extraordinary performance-overhead tradeoffs compared with state-of-the-art codebook-based beamforming designs.
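The input/output pairing of that inference problem can be illustrated with a uniform linear array and a simple DFT codebook at two resolutions: a coarse layer of wide beams supplies the low-resolution responses, and a fine layer supplies the high-resolution target. This is a hedged sketch of the data setup only; the super-resolution network itself and the paper's actual hierarchical codebook construction are not reproduced here.

```python
import numpy as np

def steering_vector(n_ant, theta):
    """ULA array response at angle theta (half-wavelength spacing)."""
    n = np.arange(n_ant)
    return np.exp(1j * np.pi * n * np.sin(theta)) / np.sqrt(n_ant)

def dft_codebook(n_ant, n_beams):
    """Simple DFT codebook; each column is one unit-norm beamforming codeword."""
    k = np.arange(n_beams)
    n = np.arange(n_ant)[:, None]
    return np.exp(1j * 2 * np.pi * n * k / n_beams) / np.sqrt(n_ant)

n_ant = 16
low = dft_codebook(n_ant, 4)     # coarse layer: 4 wide beams (low overhead)
high = dft_codebook(n_ant, 16)   # fine layer: 16 narrow beams (high overhead)
theta = 0.3                      # channel direction in radians
resp_low = np.abs(low.conj().T @ steering_vector(n_ant, theta))
resp_high = np.abs(high.conj().T @ steering_vector(n_ant, theta))
```

Sweeping only the 4 coarse beams measures `resp_low`; the scheme's goal is to infer something like `resp_high` from it, so the fine sweep (and its power cost) can be skipped.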
In visual search, the gallery set can grow incrementally as new data are added to the database in practice. However, existing methods rely on a model trained on the entire dataset, ignoring the continual updating of the model. Besides, as the model updates, the new model must re-extract features for the entire gallery set to maintain a compatible feature space, imposing a high computational cost for a large gallery set. To address these issues of long-term visual search, we introduce a continual learning (CL) approach that can handle the incrementally growing gallery set with backward embedding consistency. We enforce the losses of inter-session data coherence, neighbor-session model coherence, and intra-session discrimination to construct a continual learner. In addition to the disjoint setup, our CL solution also tackles the situation of increasingly adding new classes under a blurry boundary, without assuming all categories are known at the beginning or during model updates. To our knowledge, this is the first CL method that both tackles the issue of backward-consistent feature embedding and allows novel classes to occur in new sessions. Extensive experiments on various benchmarks show the efficacy of our approach under a wide range of setups.
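The core idea of backward-consistent feature embedding can be sketched as a consistency penalty that keeps the new model's embeddings aligned with the frozen old model's embeddings for the same images, so old gallery features need not be re-extracted. The function below is a hypothetical illustration of that one ingredient; the paper's actual three losses (inter-session, neighbor-session, intra-session) are not reproduced here.

```python
import numpy as np

def l2norm(x):
    """Row-wise L2 normalization."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def backward_consistency_loss(new_emb, old_emb):
    """Mean (1 - cosine similarity) between the new model's embeddings and
    the frozen old model's embeddings for the same inputs. Minimizing this
    keeps the new feature space compatible with the stored gallery features."""
    new_n, old_n = l2norm(new_emb), l2norm(old_emb)
    cos = np.sum(new_n * old_n, axis=-1)
    return float(np.mean(1.0 - cos))

rng = np.random.default_rng(0)
old = rng.normal(size=(8, 16))
loss_same = backward_consistency_loss(old, old)               # identical -> ~0
loss_diff = backward_consistency_loss(rng.normal(size=(8, 16)), old)
```

When the penalty is near zero, a query embedded by the new model can be matched directly against gallery features extracted by the old model, which is exactly what removes the re-extraction cost.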
The key to making contemporary deep learning-based object and action localization algorithms work is large-scale annotated data. However, in real-world scenarios, there are vast amounts of unlabeled data beyond the categories of publicly available datasets, so annotating all the data is not only time- and manpower-consuming but also requires substantial computational resources to train the detectors. To address these issues, we present a simple and reliable baseline that can be easily obtained and works directly for zero-shot text-guided object and action localization tasks without introducing additional training costs. It uses Grad-CAM, the widely used class visual saliency map generator, together with the recently released Contrastive Language-Image Pre-Training (CLIP) model by OpenAI, which is trained contrastively on a dataset of 400 million image-sentence pairs with rich cross-modal information between text semantics and image appearances. Extensive experiments on the Open Images and HICO-DET datasets demonstrate the effectiveness of the proposed approach for text-guided unseen object and action localization in images.
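The Grad-CAM arithmetic at the heart of this baseline is small: pool the gradients of the score (here, the CLIP image-text similarity) over each feature map to get channel weights, take the weighted sum of feature maps, and apply ReLU. The sketch below shows only this computation on toy arrays; obtaining the actual feature maps and gradients requires a backward pass through CLIP's image encoder, which is omitted.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Core Grad-CAM computation.
    feature_maps, gradients: arrays of shape (channels, H, W), where
    gradients are d(score)/d(feature_maps) from a backward pass."""
    weights = gradients.mean(axis=(1, 2))              # alpha_k: pooled gradients
    cam = np.tensordot(weights, feature_maps, axes=1)  # sum_k alpha_k * A_k
    cam = np.maximum(cam, 0)                           # ReLU keeps positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalize for visualization
    return cam

# Toy example: channel 0 carries the evidence, channel 1 is suppressed.
fmap = np.stack([np.ones((4, 4)), np.zeros((4, 4))])
grad = np.stack([np.full((4, 4), 0.5), np.full((4, 4), -1.0)])
cam = grad_cam(fmap, grad)
```

Because the "score" is a text-conditioned similarity rather than a classifier logit, the resulting map highlights regions relevant to an arbitrary text query, which is what makes the approach zero-shot.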
Vehicle positioning is a key component of autonomous driving. The global positioning system (GPS) is currently the most commonly used vehicle positioning system. However, its accuracy is affected by environmental differences and thus fails to meet the requirement of meter-level accuracy. We consider a coordinate neighboring vehicle positioning system (CNVPS) based on GPS, omnidirectional radar, and vehicle-to-vehicle (V2V) communication to obtain additional information from neighboring vehicles and thereby improve GPS positioning accuracy in various environments. We further use the concept of transfer learning (TL), wherein an adversarial mechanism is designed to eliminate the deviation across multiple environments, to optimize vehicle positioning accuracy in multiple environments using one model. Simulation results show that, compared with existing methods, the proposed system architecture not only improves performance but also effectively reduces the amount of data required for training.
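To make the neighbor-aided idea concrete, a classical baseline for the same fusion is shown below: refining a noisy GPS fix with radar range measurements to neighbors at known positions via Gauss-Newton least squares. This is a hypothetical sketch of the geometry only; the paper's learned, adversarially-adapted model replaces this hand-crafted step.

```python
import numpy as np

def refine_position(gps_xy, neighbor_xy, ranges, n_iter=10):
    """Refine a 2-D GPS estimate using ranges to neighbors at known positions.
    Gauss-Newton on the range residuals; illustrative, not the paper's method."""
    p = np.asarray(gps_xy, dtype=float)
    for _ in range(n_iter):
        diff = p - neighbor_xy                    # (N, 2) offsets to neighbors
        dist = np.linalg.norm(diff, axis=1)       # predicted ranges
        J = diff / dist[:, None]                  # Jacobian of ranges w.r.t. p
        r = dist - ranges                         # range residuals
        p = p - np.linalg.lstsq(J, r, rcond=None)[0]
    return p

# Toy scenario: exact ranges to three neighbors recover the true position.
true = np.array([3.0, 4.0])
neighbors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
ranges = np.linalg.norm(true - neighbors, axis=1)
est = refine_position([3.5, 3.4], neighbors, ranges)
```

With noisy ranges and neighbor positions this classical estimator degrades per environment, which is the gap the learned, environment-invariant model targets.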
Device-free wireless indoor localization is an essential technology for the Internet of Things (IoT), and fingerprint-based methods are widely used. A common challenge for fingerprint-based methods is data collection and labeling. This paper proposes a few-shot transfer learning system that uses only a small amount of labeled data from the current environment and reuses a large amount of existing labeled data previously collected in other environments, thereby significantly reducing the data collection and labeling cost for localization in each new environment. The core method lies in graph neural network (GNN) based few-shot transfer learning and its modifications. Experiments conducted in real-world environments show that the proposed system achieves performance comparable to a convolutional neural network (CNN) model with 40 times more labeled data.
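The GNN primitive underlying such a model is neighborhood aggregation: each node (e.g., a fingerprint sample) averages features over its graph neighbors and applies a learned linear map. A minimal NumPy sketch of one such layer, with illustrative names; the paper's few-shot architecture is considerably more elaborate:

```python
import numpy as np

def gnn_layer(features, adj, weight):
    """One graph convolution step: mean-aggregate each node's neighborhood
    (including a self-loop) and apply a linear map with ReLU."""
    a_hat = adj + np.eye(adj.shape[0])     # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)
    h = (a_hat / deg) @ features           # row-normalized mean aggregation
    return np.maximum(h @ weight, 0.0)

# Toy graph: 3 nodes, one edge (0-1), node 2 isolated; identity weight.
adj = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
x = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
out = gnn_layer(x, adj, np.eye(2))
```

The isolated node keeps its own features, while connected nodes blend with their neighbors; stacking such layers lets label information from the few labeled samples propagate to unlabeled ones.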
Millimeter wave (mmWave) is a key technology for fifth-generation (5G) and beyond communications. Hybrid beamforming has been proposed for large-scale antenna systems in mmWave communications. Existing hybrid beamforming designs based on infinite-resolution phase shifters (PSs) are impractical due to hardware cost and power consumption. In this paper, we propose an unsupervised-learning-based scheme to jointly design the analog precoder and combiner with low-resolution PSs for multiuser multiple-input multiple-output (MU-MIMO) systems. We transform the analog precoder and combiner design problem into a phase classification problem and propose a generic neural network architecture, termed the phase classification network (PCNet), capable of producing solutions of various PS resolutions. Simulation results demonstrate the superior sum-rate and complexity performance of the proposed scheme, as compared to state-of-the-art hybrid beamforming designs for the most commonly used low-resolution PS configurations.
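The constraint that motivates the phase classification view is easy to state: a b-bit phase shifter can only realize 2^b uniformly spaced phases, so choosing an analog weight reduces to choosing one of 2^b classes per antenna. A small sketch of that quantization (illustrative; the PCNet itself, which learns the class choice, is not reproduced):

```python
import numpy as np

def quantize_phase(phases, bits):
    """Map continuous phases to the nearest of the 2**bits levels that a
    b-bit phase shifter can realize (uniform over [0, 2*pi))."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    idx = np.round(np.mod(phases, 2 * np.pi) / step).astype(int) % levels
    return idx * step  # idx is the "class label" a classifier would predict

# 2-bit PSs can realize only 0, pi/2, pi, 3*pi/2.
phases = np.array([0.1, np.pi / 3, 3.0])
q2 = quantize_phase(phases, 2)
```

Directly quantizing an infinite-resolution design like this is the naive baseline; predicting the discrete level index end-to-end, as a classification problem, is what allows one network to serve multiple PS resolutions.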
Item concept modeling is commonly achieved by leveraging textual information. However, many existing models do not leverage the inferential property of concepts to capture word meanings, which therefore ignores the relatedness between correlated concepts, a phenomenon which we term conceptual “correlation sparsity.” In this paper, we distinguish between word modeling and concept modeling and propose an item concept modeling framework centering around the item concept network (ICN). ICN models and further enriches item concepts by leveraging the inferential property of concepts and thus addresses the correlation sparsity issue. Specifically, there are two stages in the proposed framework: ICN construction and embedding learning. In the first stage, we propose a generalized network construction method to build ICN, a structured network which infers expanded concepts for items via matrix operations. The second stage leverages neighborhood proximity to learn item and concept embeddings. With the proposed ICN, the resulting embedding facilitates both homogeneous and heterogeneous tasks, such as item-to-item and concept-to-item retrieval, and delivers related results which are more diverse than traditional keyword-matching-based approaches. As our experiments on two real-world datasets show, the framework encodes useful conceptual information and thus outperforms traditional methods in various item classification and retrieval tasks.
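The "expanded concepts via matrix operations" step can be illustrated with a toy incidence matrix: multiplying item-concept assignments by a concept-concept relation matrix propagates each item's concepts to related concepts. The matrices below are invented for illustration; the paper's generalized construction method determines the actual relation structure.

```python
import numpy as np

# Item-concept incidence: rows are items, columns are concepts.
item_concept = np.array([[1, 0, 0],
                         [0, 1, 0]], dtype=float)

# Concept-concept relation: concept 0 implies 1, concept 1 implies 2
# (illustrative inferential links between concepts).
concept_rel = np.array([[1, 1, 0],
                        [0, 1, 1],
                        [0, 0, 1]], dtype=float)

# One matrix product infers expanded concepts for each item, so items
# described by different but related concepts become connected in the ICN,
# mitigating correlation sparsity.
expanded = (item_concept @ concept_rel) > 0
```

Item 0, originally tagged only with concept 0, now also carries the related concept 1; repeated products would propagate further along the relation graph.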