Recently, numerous imaging methods have been proposed for fast MR imaging, including Single Carrier Wideband MRI proposed by Huang et al. [1] and other techniques performed in k-space. However, these methods suffer from blurring and ringing artifacts due to insufficient k-space sampling. To take advantage of fast imaging techniques while preserving high-resolution MR images, we propose an image-filtering-based algorithm, adaptive guided filtering, which is able to suppress the above artifacts while enhancing or preserving the contrast of image details.
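The adaptive guided filter builds on the classical guided filter of He et al.; a minimal NumPy sketch of that base operation is given below. The adaptive, artifact-aware regularization that the paper proposes is not reproduced here, and the `radius` and `eps` values are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, radius=4, eps=1e-3):
    """Plain (non-adaptive) guided filter: I is the guidance image,
    p the input to be filtered; both are 2-D float arrays in [0, 1].
    An adaptive variant would tune eps per pixel instead of globally."""
    mean = lambda x: uniform_filter(x, size=2 * radius + 1)
    mean_I, mean_p = mean(I), mean(p)
    cov_Ip = mean(I * p) - mean_I * mean_p
    var_I = mean(I * I) - mean_I ** 2
    a = cov_Ip / (var_I + eps)      # per-window linear coefficient
    b = mean_p - a * mean_I
    return mean(a) * I + mean(b)    # averaged coefficients give the output
```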
Recognizing image data across different domains has been a challenging task. For biometrics, heterogeneous face recognition (HFR) deals with recognition problems in which training/gallery images are collected in one modality (e.g., photos), while test/probe images are observed in another (e.g., sketches). In this paper, we present a domain adaptation approach for solving HFR problems. By utilizing external face images (i.e., those collected from subjects not of interest) from both source and target domains, we propose a novel Domain-independent Component Analysis (DiCA) algorithm for deriving a common subspace for relating and representing cross-domain image data. To further improve representation ability, we advance the self-taught learning strategy for learning a domain-independent dictionary in our DiCA subspace, which can be applied to both gallery and probe images of interest to improve representation and recognition. Unlike some prior domain adaptation approaches, we require neither data correspondences (i.e., data pairs) when collecting external cross-domain image data, nor label information when learning the common feature space that associates the different domains. Thus, our method is practical for real-world cross-domain classification problems. In our experiments, we consider sketch-to-photo and near-infrared (NIR) to visible-spectrum (VIS) face recognition problems for evaluating the performance of our proposed approach.
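The abstract does not spell out the DiCA update rules, so the following is only a rough stand-in under stated assumptions: the common subspace is approximated by PCA on pooled external source/target images, and the domain-independent dictionary is learned with scikit-learn's `DictionaryLearning`; all names, sizes, and data below are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA, DictionaryLearning

# External (unlabeled, unpaired) face images from both domains, one row per image.
X_src = np.random.rand(200, 1024)   # e.g., photos   (hypothetical data)
X_tgt = np.random.rand(200, 1024)   # e.g., sketches (hypothetical data)

# Stand-in for the DiCA subspace: PCA on the pooled cross-domain data.
pca = PCA(n_components=64).fit(np.vstack([X_src, X_tgt]))
Z = pca.transform(np.vstack([X_src, X_tgt]))

# Self-taught learning step: a domain-independent dictionary in the subspace.
dico = DictionaryLearning(n_components=128, alpha=1.0, max_iter=20).fit(Z)

# Gallery/probe images of interest are encoded with the same dictionary.
codes_gallery = dico.transform(pca.transform(np.random.rand(10, 1024)))
```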
Due to the ambiguity in describing and discriminating between clothing images of different styles, clothing image characterization has been a challenging task. Based on the use of multiple types of visual features, we propose a novel multi-view nonnegative matrix factorization (NMF) algorithm for solving this task. Our multi-view NMF not only derives image representations that describe clothing images in terms of their visual appearance, but also learns an optimal combination of such features for each clothing style, while preserving the separation between different styles. To verify the effectiveness of our method, we conduct experiments on two image datasets and confirm that our method produces satisfactory performance in terms of both clustering and categorization.
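A minimal multi-view NMF with a coefficient matrix H shared across views illustrates the basic idea; the per-style view weighting and style-separation terms are the paper's contributions and are omitted, so this is a sketch of the plain shared-factorization core, not the authors' full objective.

```python
import numpy as np

def multiview_nmf(views, k=10, n_iter=200, eps=1e-9):
    """Multi-view NMF with a shared coefficient matrix H.
    views: list of nonnegative (d_v x n) feature matrices for the same n images.
    Minimizes sum_v ||X_v - W_v H||_F^2 via multiplicative updates."""
    n = views[0].shape[1]
    H = np.random.rand(k, n)
    Ws = [np.random.rand(X.shape[0], k) for X in views]
    for _ in range(n_iter):
        for v, X in enumerate(views):
            W = Ws[v]
            Ws[v] = W * (X @ H.T) / (W @ H @ H.T + eps)
        num = sum(W.T @ X for W, X in zip(Ws, views))
        den = sum(W.T @ W @ H for W in Ws) + eps
        H = H * num / den
    return Ws, H   # columns of H can be clustered into clothing styles
```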
For the task of robust face recognition, we particularly focus on the scenario in which both training and test image data are corrupted due to occlusion or disguise. Standard face recognition methods such as Eigenfaces, as well as state-of-the-art approaches such as sparse representation-based classification, do not consider possible contamination of data during training, and thus their recognition performance on corrupted test data degrades. In this paper, we propose a novel face recognition algorithm based on low-rank matrix decomposition to address this problem. Besides decomposing raw training data into a set of representative bases for better modeling of the face images, we introduce a constraint of structural incoherence into the proposed algorithm, which enforces the bases learned for different classes to be as independent as possible. As a result, additional discriminating ability is added to the derived basis matrices for improved recognition performance. Experimental results on different face databases with a variety of variations verify the effectiveness and robustness of our proposed method.
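The low-rank decomposition step can be illustrated with standard robust PCA via inexact augmented Lagrange multipliers, applied per class to a matrix whose columns are that class's (possibly corrupted) training faces. The structural-incoherence regularizer that couples the per-class decompositions is the paper's contribution and is omitted from this sketch.

```python
import numpy as np

def rpca(D, lam=None, mu=None, n_iter=100):
    """Robust PCA (principal component pursuit): decompose D into a
    low-rank part A (class bases) plus a sparse error part E."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 0.25 / np.abs(D).mean()
    Y = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(n_iter):
        # Singular value thresholding -> low-rank component A.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = U @ np.diag(np.maximum(s - 1.0 / mu, 0)) @ Vt
        # Soft thresholding -> sparse error component E.
        T = D - A + Y / mu
        E = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0)
        Y = Y + mu * (D - A - E)
    return A, E

# Usage: one decomposition per class, with that class's training faces
# stacked as the columns of D.
```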
We present a trajectory-based approach for detecting salient regions in videos by removing the dominant camera motion. Our approach is designed in a general way so that it can be applied to videos taken by either stationary or moving cameras without any prior information. Moreover, multiple salient regions of different temporal lengths can be detected. To this end, we extract a set of spatially and temporally coherent trajectories of keypoints in a video. Velocity and acceleration entropies are then proposed to represent the trajectories. In this way, long-term object motions are exploited to filter out short-term noise, and object motions of various temporal lengths can be represented uniformly. We are further inspired by the observation that trajectories in the background, i.e., the non-salient trajectories, are usually consistent with the dominant camera motion, no matter whether the camera is stationary or not. We make use of this property to develop a unified approach to saliency generation for both stationary and moving cameras. Specifically, a one-class SVM is employed to remove the trajectories consistent with the dominant motion. The salient regions can then be highlighted by applying a diffusion process to the remaining trajectories. In addition, we create manually annotated ground truth for the collected videos, which is used for performance evaluation and comparison. The promising results on various types of videos demonstrate the effectiveness and broad applicability of our approach.
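A small sketch of the trajectory-filtering step, assuming simple histogram entropies of per-frame speed and acceleration as trajectory descriptors (the exact descriptor design in the paper may differ): the one-class SVM absorbs the dominant, camera-consistent trajectories, and its outliers are kept as salient candidates.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def entropy(x, bins=16):
    """Shannon entropy of a 1-D signal's histogram."""
    p, _ = np.histogram(x, bins=bins)
    p = p[p > 0] / p.sum()
    return -(p * np.log(p)).sum()

# trajs: list of (T_i x 2) keypoint positions over time (hypothetical data).
trajs = [np.cumsum(np.random.randn(30, 2), axis=0) for _ in range(100)]

# Velocity/acceleration entropy descriptor per trajectory.
feats = []
for t in trajs:
    v = np.diff(t, axis=0)
    a = np.diff(v, axis=0)
    feats.append([entropy(np.linalg.norm(v, axis=1)),
                  entropy(np.linalg.norm(a, axis=1))])
feats = np.asarray(feats)

# Trajectories consistent with the dominant camera motion form the dense
# inlier mass; outliers (-1) are candidate salient trajectories.
ocsvm = OneClassSVM(nu=0.2, kernel="rbf", gamma="scale").fit(feats)
salient_mask = ocsvm.predict(feats) == -1
```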
Native Mandarin normal-hearing (NH) listeners can easily perceive lexical tones, even under large voice-pitch variations across speakers, by using the pitch contrast between context and target stimuli. It is, however, unclear whether cochlear implant (CI) users, who have limited access to pitch cues, can make similar use of context pitch cues for tone normalization. In this study, native Mandarin NH listeners and pre-lingually deafened, unilaterally implanted CI users were asked to recognize a series of Mandarin tones varying from Tone 1 (high-flat) to Tone 2 (mid-rising), with or without a preceding sentence context. Most of the CI subjects used a hearing aid (HA) in the non-implanted ear (i.e., bimodal users) and were tested both with CI alone and with CI + HA. In the test without context, typical S-shaped tone recognition functions were observed for most CI subjects, and the function slopes and perceptual boundaries were similar with either CI alone or CI + HA. Compared to NH subjects, CI subjects were less sensitive to the pitch changes in target tones. In the test with context, NH subjects gave more (resp. fewer) Tone-2 responses in a context with high (resp. low) fundamental frequencies, known as the contrastive context effect. For CI subjects, a similar contrastive context effect was found to be statistically significant for tone recognition with CI + HA but not with CI alone. The results suggest that the pitch cues from CIs may not be sufficient to consistently support the pitch-contrast processing required for tone normalization. The additional pitch cues from aided residual acoustic hearing can, however, provide CI users with a tone normalization capability similar to that of NH listeners.
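The slopes and perceptual boundaries of such S-shaped recognition functions are commonly summarized by fitting a logistic psychometric function; a sketch with made-up response proportions is given below (the analysis actually used in the study may differ).

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    """Proportion of Tone-2 responses along the Tone 1 -> Tone 2 continuum."""
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

# Hypothetical data: 7 stimulus steps, proportion of Tone-2 responses.
steps = np.arange(1, 8)
p_tone2 = np.array([0.02, 0.05, 0.15, 0.55, 0.85, 0.95, 0.98])

(boundary, slope), _ = curve_fit(logistic, steps, p_tone2, p0=[4.0, 1.0])
print(f"perceptual boundary = {boundary:.2f}, slope = {slope:.2f}")
```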
While bag-of-features (BOF) models have been widely applied to image retrieval problems, the resulting performance is typically limited because they disregard the spatial information of local image descriptors (and of the associated visual words). In this paper, we present a novel spatial pooling scheme, called extended bag-of-features (EBOF), for solving this task. Besides improving image representation capability, the incorporation of our EBOF model with a proposed circular-correlation-based similarity measure allows us to perform translation-, rotation-, and scale-invariant image retrieval. We conduct experiments on two benchmark image datasets, and the performance confirms the effectiveness and robustness of our proposed approach.
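Circular-correlation similarity can be computed efficiently with the FFT. The sketch below treats an EBOF descriptor as a 1-D circular histogram, which is an assumption made for illustration, and shows why maximizing over cyclic shifts yields rotation invariance.

```python
import numpy as np

def circular_corr_similarity(h1, h2):
    """Maximum circular cross-correlation over all cyclic shifts,
    computed in O(n log n) via the FFT and normalized to [0, 1]-ish.
    Taking the max over shifts makes the matching rotation-invariant."""
    corr = np.fft.ifft(np.fft.fft(h1) * np.conj(np.fft.fft(h2))).real
    return corr.max() / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12)

# A cyclically shifted copy of a histogram scores (near) 1 under this measure.
h = np.random.rand(64)
print(circular_corr_similarity(h, np.roll(h, 13)))  # ~1.0
```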
We present a novel domain adaptation approach for solving cross-domain pattern recognition problems, i.e., problems in which the data or features to be processed and recognized are collected from different domains of interest. Inspired by canonical correlation analysis (CCA), we utilize the derived correlation subspace as a joint representation for associating data across different domains, and we advance reduced kernel techniques for kernel CCA (KCCA) when a nonlinear correlation subspace is desirable. Such techniques not only make KCCA computationally more efficient but also alleviate potential over-fitting problems. Instead of directly performing recognition in the derived CCA subspace (as prior CCA-based domain adaptation methods did), we advocate exploiting the domain transfer ability of this subspace, in which each dimension has a unique capability for associating cross-domain data. In particular, we propose a novel support vector machine (SVM) with a correlation regularizer, named correlation-transfer SVM, which incorporates the domain adaptation ability into classifier design for cross-domain recognition. We show that our proposed domain adaptation and classification approach can be successfully applied to a variety of cross-domain recognition tasks such as cross-view action recognition, handwritten digit recognition with different features, and image-to-text or text-to-image classification. Our empirical results verify that our proposed method outperforms state-of-the-art domain adaptation approaches in terms of recognition performance.
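As a rough stand-in, a linear-CCA version of the pipeline can be sketched with scikit-learn: paired source/target samples define the correlation subspace, and a plain SVM on the projected data replaces the paper's correlation-transfer SVM, which additionally exploits the per-dimension canonical correlations. Data and sizes here are hypothetical.

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.svm import SVC

# Paired cross-domain training data (hypothetical): e.g., two camera views
# of the same actions, one feature vector per sample in each domain.
X_src = np.random.rand(100, 50)
X_tgt = np.random.rand(100, 40)
y = np.random.randint(0, 5, 100)

# Correlation subspace relating the two domains (linear CCA; the paper
# uses reduced-kernel KCCA for the nonlinear case).
cca = CCA(n_components=10).fit(X_src, X_tgt)
Z_src, Z_tgt = cca.transform(X_src, X_tgt)

# Stand-in classifier: train on source projections, test on target ones.
clf = SVC(kernel="linear").fit(Z_src, y)
pred = clf.predict(Z_tgt)
```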
The recent advances in RGB-D cameras have allowed us to better solve increasingly complex computer vision tasks. However, modern RGB-D cameras are still restricted by their short effective range. This limitation may make RGB-D data unavailable at recognition time in practice and degrade their applicability. We propose an alternative scenario to address this problem and illustrate it with an application to action recognition. We use Kinect to collect, offline, an auxiliary multi-modal database in which not only the RGB videos but also the depth maps and skeleton structures of the actions of interest are available. Our approach aims to enhance action recognition in RGB videos by leveraging this extra database. Specifically, it optimizes a feature transformation by which the actions to be recognized can be concisely reconstructed from entries in the auxiliary database. In this way, the inter-database variations are adapted. More importantly, each action can be augmented with additional depth and skeleton data retrieved from the auxiliary database. The proposed approach has been evaluated on three action recognition benchmarks. The promising results show that the augmented depth and skeleton features lead to a remarkable boost in recognition accuracy.
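The reconstruction-and-retrieval idea can be sketched as follows, with hypothetical feature sizes: an RGB-only test action is sparsely coded over the auxiliary RGB features, and the same code retrieves the aligned depth and skeleton features. The learned feature transformation of the paper is replaced by an identity mapping here for brevity.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Auxiliary Kinect database (hypothetical sizes): RGB features plus the
# aligned depth/skeleton features of the same auxiliary actions.
A_rgb  = np.random.rand(300, 128)   # one row per auxiliary action
A_dep  = np.random.rand(300, 64)
A_skel = np.random.rand(300, 32)

def augment(x_rgb, alpha=0.01):
    """Sparsely reconstruct an RGB-only action from auxiliary RGB entries
    and reuse the code to retrieve its depth/skeleton counterparts."""
    code = Lasso(alpha=alpha, positive=True).fit(A_rgb.T, x_rgb).coef_
    return np.concatenate([x_rgb, code @ A_dep, code @ A_skel])

x_aug = augment(np.random.rand(128))   # augmented multi-modal feature
```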
In this paper, we address the high annotation cost of acquiring training data for semantic segmentation. Most modern approaches to semantic segmentation are based on graphical models, such as conditional random fields, and rely on sufficient training data in the form of object contours. To reduce the manual effort of pixel-wise contour annotation, we consider the setting in which the training set for semantic segmentation is a mixture of a few object contours and an abundant set of object bounding boxes. Our idea is to borrow the knowledge derived from the object contours to infer the unknown object contours enclosed by the bounding boxes. The inferred contours can then serve as training data for semantic segmentation. To this end, we generate multiple contour hypotheses for each bounding box, under the assumption that at least one hypothesis is close to the ground truth. This paper proposes an approach, called augmented multiple instance regression (AMIR), that formulates hypothesis selection as a multiple instance regression (MIR) problem and augments information derived from the object contours to guide and regularize the training of MIR. In this way, a bounding box is treated as a bag with its contour hypotheses as instances, and the positive instances are the hypotheses close to the ground truth. The proposed approach has been evaluated on the Pascal VOC segmentation task. The promising results demonstrate that AMIR can precisely infer the object contours in the bounding boxes and hence provide an effective alternative to manually labeled contours for semantic segmentation.
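A plain multiple instance regression baseline conveys the bag/instance formulation: each bounding box is a bag of contour-hypothesis feature vectors, the bag label is a score of closeness to the ground truth, and training alternates between picking the best-fitting instance per bag and refitting the regressor. The contour-derived guidance that AMIR adds on top is not reproduced; all data below are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge

def mir_select(bags, y, n_iter=10):
    """Prime-instance MIR baseline: bags is a list of (m_i x d) instance
    feature matrices, y the bag-level regression targets. Alternates
    between instance selection and regressor refitting."""
    picks = [bag[0] for bag in bags]           # start from the first hypothesis
    reg = Ridge(alpha=1.0)
    for _ in range(n_iter):
        reg.fit(np.vstack(picks), y)
        picks = [bag[np.argmin((reg.predict(bag) - t) ** 2)]
                 for bag, t in zip(bags, y)]   # instance best matching its bag target
    return reg, picks

# Hypothetical data: 20 boxes, each with 5 hypotheses of 16-D features.
bags = [np.random.rand(5, 16) for _ in range(20)]
y = np.random.rand(20)
reg, chosen = mir_select(bags, y)
```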