Domain adaptation techniques have been applied to cross-domain recognition problems, particularly in scenarios where ample labeled data are available in the source domain but only a few labeled instances can be obtained in the target domain during training. In this paper, we propose a domain adaptation approach that transfers labeled source-domain data to the target domain, so that a sufficient amount of training data can be collected there for recognition purposes. By advancing low-rank matrix decomposition to obtain representative cross-domain data, our proposed model transfers labeled source-domain data to the target domain while preserving class label information. This introduces additional discriminating ability into our model, and thus improved recognition can be expected. Empirical results on cross-domain image datasets confirm the effectiveness of our proposed model for solving cross-domain recognition problems.
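As an illustrative sketch (not the paper's exact model), the low-rank idea underlying such decompositions can be shown with a truncated SVD, which recovers the shared low-rank structure of noisy data. The synthetic data and the rank choice below are assumptions for illustration only.

```python
import numpy as np

def low_rank_approx(X, rank):
    """Best rank-r approximation of X (Eckart-Young) via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]

rng = np.random.default_rng(0)
# Synthetic "cross-domain" data: a shared rank-2 structure plus noise.
basis = rng.standard_normal((100, 2))
coeff = rng.standard_normal((2, 50))
X = basis @ coeff + 0.01 * rng.standard_normal((100, 50))

X2 = low_rank_approx(X, 2)
rel_err = np.linalg.norm(X - X2) / np.linalg.norm(X)
print(rel_err < 0.05)  # the rank-2 part dominates the signal
```

The rank-2 reconstruction captures nearly all of the signal, which is the property such models exploit to obtain representative cross-domain data.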
FingerPad: Private and Subtle Interaction Under Fingertips
Anomaly detection has been an important research topic in data mining and machine learning. Many real-world applications, such as intrusion or credit card fraud detection, require an effective and efficient framework to identify deviated data instances. However, most anomaly detection methods are typically implemented in batch mode, and thus cannot easily be extended to large-scale problems without sacrificing computation and memory requirements. In this paper, we propose an online oversampling principal component analysis (osPCA) algorithm to address this problem, aiming to detect the presence of outliers in a large amount of data via an online updating technique. Unlike prior principal component analysis (PCA)-based approaches, we do not store the entire data matrix or covariance matrix, so our approach is especially suited to online or large-scale problems. By oversampling the target instance and extracting the principal direction of the data, the proposed osPCA determines whether the target instance is anomalous according to the variation of the resulting dominant eigenvector. Since our osPCA does not need to perform eigen-analysis explicitly, the proposed framework is favored for online applications with computation or memory limitations. Experimental comparisons with the well-known power method for PCA and other popular anomaly detection algorithms verify the feasibility of our proposed method in terms of both accuracy and efficiency.
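A minimal sketch of the oversampling idea (batch SVD here, not the paper's online update; data, duplication count, and score definition are illustrative assumptions): duplicate the target instance, recompute the dominant eigenvector, and score the instance by how much that eigenvector rotates.

```python
import numpy as np

def ospca_score(X, x, n_dup=10):
    """Oversampling-PCA anomaly score: angle change of the dominant
    eigenvector when the target instance x is duplicated n_dup times."""
    def dominant_direction(A):
        A = A - A.mean(axis=0)
        _, _, Vt = np.linalg.svd(A, full_matrices=False)
        return Vt[0]  # first principal direction
    u = dominant_direction(X)
    X_over = np.vstack([X, np.tile(x, (n_dup, 1))])
    u_over = dominant_direction(X_over)
    return 1.0 - abs(u @ u_over)  # larger -> more deviated

rng = np.random.default_rng(1)
# Normal data lies along one direction; the outlier does not.
X = rng.standard_normal((200, 1)) @ np.array([[1.0, 0.2]])
X += 0.05 * rng.standard_normal(X.shape)
normal_score = ospca_score(X, np.array([2.0, 0.4]))   # on the trend
outlier_score = ospca_score(X, np.array([0.0, 3.0]))  # off the trend
print(outlier_score > normal_score)
```

An instance that lies along the dominant direction barely perturbs the eigenvector; a deviated instance tilts it noticeably, which is the signal osPCA thresholds.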
This paper presents a saliency-based video object extraction (VOE) framework. The proposed framework aims to automatically extract foreground objects of interest without any user interaction or the use of any training data (i.e., it is not limited to any particular type of object). To separate foreground and background regions within and across video frames, the proposed method utilizes visual and motion saliency information extracted from the input video. A conditional random field is applied to effectively combine the saliency-induced features, which allows us to deal with unknown pose and scale variations of the foreground object (and its articulated parts). Since the proposed VOE framework preserves both spatial continuity and temporal consistency, experiments on a variety of videos verify that our method produces quantitatively and qualitatively satisfactory results.
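As a toy sketch of the cues involved (a simple linear fusion stands in for the paper's conditional random field; the images, weights, and threshold below are assumptions): a motion-saliency map from frame differencing is combined with a visual-saliency map to produce a foreground mask.

```python
import numpy as np

def motion_saliency(prev, curr):
    """Toy motion-saliency map: normalized absolute frame difference."""
    diff = np.abs(curr - prev)
    return diff / (diff.max() + 1e-12)

def fuse_saliency(visual, motion, w=0.5):
    """Linear fusion of visual and motion saliency into a foreground mask;
    the paper fuses such cues with a conditional random field instead."""
    s = w * visual + (1 - w) * motion
    return s > 0.5

rng = np.random.default_rng(7)
prev = rng.uniform(0, 0.1, (8, 8))
curr = prev.copy()
curr[2:5, 2:5] += 0.8            # a moving bright object
visual = curr / curr.max()       # toy visual saliency: brightness
mask = fuse_saliency(visual, motion_saliency(prev, curr))
print(mask[3, 3], mask[0, 0])    # object pixel True, background False
```

Pixels that are salient in both cues end up in the foreground mask; the CRF additionally enforces the spatial and temporal smoothness that this per-pixel fusion lacks.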
In this paper, we propose a novel approach for visualizing and recognizing different emotion categories using facial expression images. Extending the unsupervised nonlinear dimensionality reduction technique of locally linear embedding (LLE), we propose a supervised LLE (sLLE) algorithm that utilizes the emotion labels of facial expression images. While existing works typically train on such labeled data for emotion recognition, our approach allows one to derive subspaces for visualizing facial expression images within and across different emotion categories, so that emotion recognition can be properly performed. In our work, we relate the resulting two-dimensional subspace to the valence-arousal emotion space, in which our method is observed to automatically identify and discriminate emotions of different degrees. Experimental results on two facial emotion datasets verify the effectiveness of our algorithm. With reduced numbers of feature dimensions (2D or beyond), our approach is shown to achieve promising emotion recognition performance.
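A common ingredient of supervised LLE variants is a label-adjusted distance, so that a point's nearest neighbors tend to share its class before the embedding is computed. The additive penalty and the `alpha` value below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def supervised_distances(X, y, alpha=0.5):
    """Label-adjusted pairwise distances: distances between samples of
    different classes are inflated by alpha * max(D), a common device in
    supervised LLE variants (illustrative, not the paper's exact rule)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    penalty = alpha * D.max()
    return D + penalty * (y[:, None] != y[None, :])

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(0.5, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
D_adj = supervised_distances(X, y, alpha=1.0)
np.fill_diagonal(D_adj, np.inf)   # exclude self-matches
nn = D_adj.argmin(axis=1)
print((y[nn] == y).mean())  # -> 1.0: every nearest neighbor shares the label
```

The subsequent LLE steps (neighborhood reconstruction weights, embedding eigenproblem) then operate on these label-aware neighborhoods.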
In this paper, we address the problem of robust face recognition using undersampled data. Given only one or a few face images per class, our proposed method not only handles test images with large intra-class variations such as illumination and expression, but is also able to recognize images corrupted by occlusion or disguise. In our work, we advocate the learning of auxiliary dictionaries from subjects \emph{not} of interest. With the proposed optimization algorithm, which jointly solves the tasks of auxiliary dictionary learning and sparse-representation-based face recognition, our approach is able to model the above intra-class variations and corruptions for improved recognition. Our experiments on two face image datasets confirm the effectiveness and robustness of our approach, which is shown to outperform state-of-the-art sparse representation based methods.
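The sparse-representation classification (SRC) step can be sketched as follows (greedy orthogonal matching pursuit stands in for the paper's solver, and the toy dictionary with disjoint class supports is an assumption for illustration): code the test sample over all class atoms, then assign it to the class whose atoms give the smallest reconstruction residual.

```python
import numpy as np

def omp(D, y, n_nonzero=5):
    """Greedy orthogonal matching pursuit: sparse code of y over dictionary D."""
    residual, idx = y.copy(), []
    for _ in range(n_nonzero):
        idx.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        residual = y - D[:, idx] @ coef
    x = np.zeros(D.shape[1])
    x[idx] = coef
    return x

def src_classify(D, labels, y):
    """SRC rule: pick the class whose atoms best reconstruct y."""
    x = omp(D, y)
    residuals = {c: np.linalg.norm(y - D @ np.where(labels == c, x, 0.0))
                 for c in np.unique(labels)}
    return min(residuals, key=residuals.get)

rng = np.random.default_rng(5)
# Two classes spanned by disjoint sets of unit-norm atoms (toy setup).
D = np.zeros((20, 10))
D[:10, :5] = rng.standard_normal((10, 5))
D[10:, 5:] = rng.standard_normal((10, 5))
D /= np.linalg.norm(D, axis=0)
labels = np.array([0] * 5 + [1] * 5)
y = D[:, :5] @ rng.standard_normal(5)  # a class-0 sample
pred = src_classify(D, labels, y)
print(pred)  # -> 0
```

The paper's contribution is learning additional auxiliary atoms (from subjects not of interest) that absorb illumination, expression, and corruption components before this residual comparison is made.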
Cognitive radio networks (CRNs) consisting of opportunistic links can greatly increase the throughput per unit bandwidth of future wireless communications. In this letter, we show that by encoding a data packet into several coded packets and transmitting them in different time slots, end-to-end transmission in the CRN can be equivalently formulated as a physical-layer multiple-input multiple-output (MIMO) communication problem. Two coding schemes are proposed in such a virtual MIMO scenario to enhance end-to-end communication reliability by exploiting path diversity. A closed-form theoretical error-rate analysis validates that the proposed path-time codes significantly improve error-rate performance.
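The benefit of path diversity can be illustrated with a Monte Carlo sketch (maximal-ratio combining over independent Rayleigh paths stands in for the proposed path-time codes; the SNR, symbol count, and BPSK modulation are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
n, snr = 100_000, 4.0          # symbols and linear SNR (illustrative)
bits = rng.integers(0, 2, n)
s = 2.0 * bits - 1.0           # BPSK symbols

def bit_error_rate(n_paths):
    """Maximal-ratio combining over n_paths independent Rayleigh paths,
    mimicking the path diversity exploited by path-time codes."""
    h = (rng.standard_normal((n_paths, n))
         + 1j * rng.standard_normal((n_paths, n))) / np.sqrt(2)
    noise = (rng.standard_normal((n_paths, n))
             + 1j * rng.standard_normal((n_paths, n))) / np.sqrt(2 * snr)
    r = h * s + noise
    combined = (np.conj(h) * r).sum(axis=0).real  # MRC decision statistic
    return ((combined > 0).astype(int) != bits).mean()

ber1, ber2 = bit_error_rate(1), bit_error_rate(2)
print(ber1 > ber2)  # two independent paths lower the error rate
```

With two independent fading paths the error rate drops sharply, which is the diversity gain the closed-form analysis in the letter quantifies for the proposed codes.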
We present an algorithm that alternates between a Hough transform and an inverted Hough transform to establish feature correspondences, enhancing matching quality in both precision and recall. Inspired by the fact that nearby features on the same object share coherent homographies in matching, we cast feature matching as a density estimation problem in the Hough space spanned by homography hypotheses. Specifically, we project all correspondences into the Hough space and determine their correctness by their respective densities. In this way, mutual verification of relevant correspondences is enabled, and the precision of matching is boosted. On the other hand, we infer the coherent homographies propagated from locally grouped features, enriching the correspondence candidates for each feature and hence increasing recall. The two processes are tightly coupled: through iterative optimization, plausible enrichments are gradually revealed while more correct correspondences are detected. Promising experimental results on three benchmark datasets demonstrate the effectiveness of the proposed approach.
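The density-estimation step can be sketched in a simplified form (2-D translations stand in for the paper's homography hypotheses, and the bin size, vote threshold, and synthetic matches are assumptions): each putative correspondence votes in the transformation space, and correspondences falling in dense bins are kept.

```python
import numpy as np

def hough_filter(src, dst, bin_size=1.0, min_votes=3):
    """Toy Hough-space verification: each putative correspondence votes a
    2-D translation; correspondences in dense bins are deemed correct."""
    votes = np.round((dst - src) / bin_size).astype(int)
    keys, counts = np.unique(votes, axis=0, return_counts=True)
    density = {tuple(k): c for k, c in zip(keys, counts)}
    return np.array([density[tuple(v)] >= min_votes for v in votes])

rng = np.random.default_rng(3)
src = rng.uniform(0, 100, (30, 2))
dst_inliers = src[:20] + np.array([5.0, -3.0])  # coherent motion
dst_outliers = rng.uniform(0, 100, (10, 2))     # random false matches
dst = np.vstack([dst_inliers, dst_outliers])
keep = hough_filter(src, dst)
print(keep[:20].all(), keep[20:].sum() <= 2)
```

Correct matches pile up in one bin and verify each other, while random matches scatter and are rejected; the inverted transform then proposes new candidates near the dense hypotheses to raise recall.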
In this paper, we improve the efficiency of kernelized support vector machines (SVMs) for image classification using a linearized kernel data representation. Inspired by the Nystrom approximation, we propose a decomposition technique for converting the kernel matrix into an approximated primal form, which allows us to perform classification in the kernel space using linear SVMs. Our method offers several benefits. First, we advance basis matrix selection for the proposed Nystrom decomposition, which can be viewed as feature/instance selection with performance guarantees. As a result, classifying the approximated kernelized data in primal form with linear SVMs achieves recognition performance comparable to that of nonlinear SVMs. More importantly, the proposed selection technique significantly reduces the computational complexity of both training and testing, making our computation time comparable to that of linear SVMs. Experiments on two benchmark datasets support the use of our approach for image classification tasks.
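A minimal sketch of the Nystrom linearization (uniform landmark selection and an RBF kernel with illustrative parameters; the paper's basis selection strategy is not reproduced here): map the data to explicit features whose inner products approximate the kernel, so a linear SVM can be trained in primal form.

```python
import numpy as np

def nystrom_features(X, basis_idx, gamma=0.1):
    """Nystrom approximation: explicit features Phi with Phi @ Phi.T ~ K,
    where K is the RBF kernel and basis_idx selects the landmark points."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    C = rbf(X, X[basis_idx])          # n x m cross-kernel
    W = C[basis_idx]                  # m x m landmark kernel
    vals, vecs = np.linalg.eigh(W)    # W is PSD; form W^{-1/2}
    vals = np.maximum(vals, 1e-12)
    W_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return C @ W_inv_sqrt

rng = np.random.default_rng(4)
X = rng.standard_normal((60, 2))
Phi = nystrom_features(X, basis_idx=np.arange(30))
K_true = np.exp(-0.1 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
err = np.linalg.norm(Phi @ Phi.T - K_true) / np.linalg.norm(K_true)
print(err < 0.5)  # small relative error for a smooth kernel
```

Any off-the-shelf linear SVM can then be trained on `Phi`, with training and test cost linear in the number of samples rather than quadratic.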
The use of time-of-flight sensors enables the recording of full-frame depth maps at video frame rates, benefiting a variety of 3D image and video processing applications. However, such depth maps are typically corrupted by noise and limited in resolution. In this paper, we present a learning-based depth map super-resolution framework that solves a Markov random field (MRF) labeling optimization problem. Given the captured depth map and the associated high-resolution color image, our proposed method preserves the edges of range data while suppressing texture-copying artifacts caused by color discontinuities. Quantitative and qualitative experimental results demonstrate the effectiveness and robustness of our approach over prior depth map upsampling work.
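The role of color guidance can be illustrated with a 1-D joint-bilateral-style sketch (a simple stand-in for the paper's MRF labeling formulation; the signals, scale factor, and parameter values are assumptions): depth samples are averaged with weights that combine spatial distance and similarity in the high-resolution guide, so depth edges snap to color edges instead of being blurred.

```python
import numpy as np

def joint_bilateral_upsample_1d(depth_lr, guide_hr, scale,
                                sigma_s=2.0, sigma_g=0.1):
    """1-D joint-bilateral upsampling: each high-res sample averages the
    low-res depth values, weighted by spatial distance and by similarity
    in the high-res guide (color) signal."""
    n_hr = len(guide_hr)
    lr_pos = np.arange(len(depth_lr)) * scale     # low-res sample positions
    out = np.empty(n_hr)
    for i in range(n_hr):
        spatial = np.exp(-((i - lr_pos) ** 2) / (2 * (sigma_s * scale) ** 2))
        guide = np.exp(-((guide_hr[i] - guide_hr[np.clip(lr_pos, 0, n_hr - 1)]) ** 2)
                       / (2 * sigma_g ** 2))
        w = spatial * guide
        out[i] = (w * depth_lr).sum() / w.sum()
    return out

# Step edge: the guide and the true depth share an edge at position 8.
guide = np.repeat([0.0, 1.0], 8)             # high-res color proxy (16 samples)
depth_lr = np.array([0.0, 0.0, 10.0, 10.0])  # low-res depth (scale 4)
up = joint_bilateral_upsample_1d(depth_lr, guide, scale=4)
print(up[7] < 1.0, up[8] > 9.0)  # the depth edge stays sharp
```

Because the guide term vetoes contributions from across the color edge, the upsampled depth keeps a sharp boundary; the MRF formulation additionally learns when to trust the color cue, suppressing texture copying where color edges have no depth counterpart.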