:::
In this paper, we propose a novel approach for visualizing and recognizing different emotion categories using facial expression images. Extending the unsupervised nonlinear dimension reduction technique of locally linear embedding (LLE), we propose a supervised LLE (sLLE) algorithm that utilizes the emotion labels of facial expression images. While existing works typically aim at training on such labeled data for emotion recognition, our approach allows one to derive subspaces for visualizing facial expression images within and across different emotion categories, so that emotion recognition can be properly performed. In our work, we relate the resulting two-dimensional subspace to the valence-arousal emotion space, in which our method is observed to automatically identify and discriminate emotions of different degrees. Experimental results on two facial emotion datasets verify the effectiveness of our algorithm. With a reduced number of feature dimensions (2D or beyond), our approach is shown to achieve promising emotion recognition performance.
In this paper, we address the problem of robust face recognition using undersampled data. Given only one or a few face images per class, our proposed method not only handles test images with large intra-class variations such as illumination and expression, but also recognizes images corrupted by occlusion or disguise. In our work, we advocate the learning of auxiliary dictionaries from the subjects \emph{not} of interest. With the proposed optimization algorithm, which jointly solves the tasks of auxiliary dictionary learning and sparse-representation-based face recognition, our approach is able to model the above intra-class variations and corruptions for improved recognition. Our experiments on two face image datasets confirm the effectiveness and robustness of our approach, which is shown to outperform state-of-the-art sparse-representation-based methods.
Cognitive radio networks (CRNs) consisting of opportunistic links can greatly increase the networking throughput per bandwidth in future wireless communications. In this letter, we show that by encoding a data packet into several coded packets and transmitting them in different time slots, the end-to-end transmission in the CRN can be equivalently formulated as a physical-layer multiple-input multiple-output (MIMO) communication problem. Two coding schemes are proposed in such a virtual MIMO scenario to enhance the end-to-end communication reliability by exploiting the path diversity. The closed-form error rate expression obtained from the theoretical analysis validates that the proposed path-time codes can significantly improve the error rate performance.
We present an algorithm that alternates between the Hough transform and the inverted Hough transform to establish feature correspondences, and enhances the quality of matching in both precision and recall. Inspired by the fact that nearby features on the same object share coherent homographies in matching, we cast the task of feature matching as a density estimation problem in the Hough space spanned by the hypotheses of homographies. Specifically, we project all the correspondences into the Hough space, and determine the correctness of the correspondences by their respective densities. In this way, mutual verification of relevant correspondences is activated, and the precision of matching is boosted. On the other hand, we infer the concerted homographies propagated from the locally grouped features, and enrich the correspondence candidates for each feature. The recall is hence increased. The two processes are tightly coupled. Through iterative optimization, plausible enrichments are gradually revealed while more correct correspondences are detected. Promising experimental results on three benchmark datasets demonstrate the effectiveness of the proposed approach.
The use of time-of-flight sensors enables the recording of full-frame depth maps at video frame rate, which benefits a variety of 3D image or video processing applications. However, such depth maps are typically corrupted by noise and limited in resolution. In this paper, we present a learning-based depth map super-resolution framework that solves an MRF labeling optimization problem. With the captured depth map and the associated high-resolution color image, our proposed method is capable of preserving the edges of range data while suppressing the texture-copying artifacts caused by color discontinuities. Quantitative and qualitative experimental results demonstrate the effectiveness and robustness of our approach over prior depth map upsampling works.
In this paper, we improve the efficiency of kernelized support vector machines (SVMs) for image classification using a linearized kernel data representation. Inspired by the Nyström approximation, we propose a decomposition technique for converting the kernel data matrix into an approximated primal form. This allows us to perform data classification in the kernel space using linear SVMs. Our method offers several benefits. First, we advance basis matrix selection for the decomposition in our proposed Nyström approximation, which can be considered as feature/instance selection with performance guarantees. As a result, classifying approximated kernelized data in primal form using linear SVMs achieves recognition performance comparable to that of nonlinear SVMs. More importantly, the proposed selection technique significantly reduces the computational complexity of both training and testing, and thus makes our computation time comparable to that of linear SVMs. Experiments on two benchmark datasets support the use of our approach for solving image classification tasks.
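As a rough illustration of the general idea in the preceding abstract, the sketch below builds the standard Nyström feature map with NumPy: given a set of landmark (basis) points, it returns explicit features Z such that Z Z^T approximates the kernel matrix, so a linear classifier can be trained on Z. This is not the paper's decomposition or selection technique; the RBF kernel, the `gamma` value, the function names, and the choice of the first five rows as landmarks are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, landmarks, gamma, eps=1e-10):
    """Explicit features Z with Z @ Z.T ~= C @ pinv(W) @ C.T (standard Nystrom)."""
    W = rbf_kernel(landmarks, landmarks, gamma)   # m x m kernel among landmarks
    C = rbf_kernel(X, landmarks, gamma)           # n x m kernel to landmarks
    vals, vecs = np.linalg.eigh(W)                # W = V diag(vals) V^T
    keep = vals > eps                             # drop numerically null directions
    return C @ (vecs[:, keep] / np.sqrt(vals[keep]))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                      # toy data, illustration only
gamma = 0.5                                       # hypothetical kernel width

Z_small = nystrom_features(X, X[:5], gamma)       # 5 landmarks (arbitrary choice)
Z_full = nystrom_features(X, X, gamma)            # all points as landmarks
K = rbf_kernel(X, X, gamma)
print(Z_small.shape, np.abs(Z_full @ Z_full.T - K).max())
```

With every point used as a landmark the approximation reproduces the full kernel matrix up to numerical error; with fewer landmarks the feature dimension, and hence the cost of the downstream linear SVM, shrinks at the price of approximation quality.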
In this paper, we investigate the wireless network deployment problem, which seeks the best deployment of a given limited number of wireless routers. We find that many goals for network deployment, such as maximizing the number of covered users or areas, or the total throughput of the network, can be modeled as submodular set functions. Specifically, given a set of routers, the goal is to find a set of locations S, each of which is equipped with a router, such that S maximizes a predefined submodular set function. However, this deployment problem is more difficult than the traditional submodular set function maximization problem, e.g., the maximum coverage problem, because it requires all the deployed routers to form a connected network. In addition, deploying a router at different locations might incur different costs. To address these challenges, this paper introduces two approximation algorithms, one for homogeneous deployment cost scenarios and the other for heterogeneous deployment cost scenarios. Our simulations, using synthetic data and real census traces from Taipei, show that the proposed algorithms achieve better performance than other heuristics.
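For context, the traditional maximum coverage problem mentioned in the preceding abstract is commonly attacked with a greedy rule that repeatedly picks the location with the largest marginal gain. The sketch below shows only that classic baseline; it ignores the connectivity requirement and the heterogeneous costs that the abstract identifies as the hard part, and it is not one of the paper's approximation algorithms. The function name, the site labels, and the user sets are hypothetical.

```python
def greedy_max_coverage(candidates, budget):
    """Classic greedy for maximum coverage (no connectivity constraint).

    candidates: dict mapping a location to the set of users it would cover.
    budget: number of routers available.
    Returns the chosen locations in pick order and the set of covered users.
    """
    chosen, covered = [], set()
    remaining = dict(candidates)
    for _ in range(budget):
        best, best_gain = None, 0
        for loc in sorted(remaining):              # sorted for deterministic ties
            gain = len(remaining[loc] - covered)   # marginal coverage gain
            if gain > best_gain:
                best, best_gain = loc, gain
        if best is None:                           # nothing adds new coverage
            break
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

# Hypothetical toy instance: four candidate sites, seven users.
sites = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1, 7}}
chosen, covered = greedy_max_coverage(sites, budget=2)
print(chosen, sorted(covered))
```

Because coverage is monotone and submodular, this greedy rule carries the well-known (1 - 1/e) approximation guarantee in the unconstrained, uniform-cost setting; that guarantee does not carry over automatically once the chosen routers must also form a connected network.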
Learning-based approaches for image super-resolution (SR) have attracted attention from researchers in the past few years. In this paper, we present a novel self-learning approach for SR. In our proposed framework, we advance support vector regression (SVR) with image sparse representation, which offers excellent generalization in modeling the relationship between images and their associated SR versions. Unlike most prior SR methods, our proposed framework does not require the collection of low- and high-resolution training image data in advance, and we do not assume the reoccurrence (or self-similarity) of image patches within an image or across image scales. With theoretical support from Bayes decision theory, we verify that our SR framework learns and selects the optimal SVR model when producing an SR image, which results in the minimum SR reconstruction error. We evaluate our method on a variety of images, and obtain very promising SR results. In most cases, our method quantitatively and qualitatively outperforms bicubic interpolation and state-of-the-art learning-based SR approaches.
Adaptation of modulation and transmission bit-rates for video multicast in a multi-rate wireless network is a challenging problem because of network dynamics, variable video bit-rates, and heterogeneous clients who may expect differentiated video qualities. Prior work on leader-based schemes selects the transmission bit-rate that provides reliable transmission for the node that experiences the worst channel condition. However, this may penalize other nodes that can achieve a higher throughput by receiving at a higher rate. In this work, we investigate a rate-adaptive video multicast scheme that can provide heterogeneous clients with differentiated visual qualities matching their channel conditions. We first propose a rate scheduling model that selects the optimal transmission bit-rate for each video frame to maximize the total visual quality for a multicast group subject to the minimum-visual-quality-guaranteed constraint. We then present a practical and easy-to-implement protocol, called QDM, which constructs a cluster-based structure to characterize node heterogeneity and adapts the transmission bit-rate to network dynamics based on the video quality perceived by the representative cluster heads. Since QDM selects the rate by a sample-based technique, it is suitable for real-time streaming even without any preprocessing. We show that QDM can adapt to network dynamics and variable video bit-rates efficiently, and produce a gain of 2-5 dB in terms of the average video quality as compared to the leader-based approach.
TBA