We aim to resolve the difficulties of action recognition arising from large intra-class variations. These unfavorable variations make it infeasible to represent one action instance in terms of other instances of the same action. We hence propose to extract both instance-specific and class-consistent features to facilitate action recognition. Specifically, the instance-specific features explore the self-similarities among frames of each video instance, while the class-consistent features summarize within-class similarities. We introduce a generative formulation to combine the two diverse types of features. The experimental results demonstrate the effectiveness of our approach.
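The instance-specific features above rest on frame-to-frame self-similarity within one video. As a minimal sketch (assuming per-frame descriptors are already extracted; the abstract does not specify a similarity measure, so cosine similarity is used here purely for illustration):

```python
import numpy as np

def self_similarity_matrix(frames):
    """Pairwise cosine similarity between per-frame feature vectors.

    frames: (T, d) array of frame descriptors (hypothetical input).
    Returns a (T, T) self-similarity matrix capturing the temporal
    structure of a single action instance.
    """
    F = np.asarray(frames, dtype=float)
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    F = F / np.maximum(norms, 1e-12)   # unit-normalize each frame
    return F @ F.T
```

The matrix is symmetric with a unit diagonal; its off-diagonal structure encodes how the instance evolves over time, independently of other instances of the same class.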
Rain removal from a single image is a challenging image denoising problem. In this paper, we present a learning-based framework for single-image rain removal, which focuses on learning context information from an input image so that the rain patterns present in it can be automatically identified and removed. We approach the single-image rain removal problem as the integration of image decomposition and self-learning processes. More precisely, our method first performs context-constrained image segmentation on the input image, and we learn dictionaries for the high-frequency components of different context categories via sparse coding for reconstruction purposes. For image regions with rain streaks, dictionaries of distinct context categories will share common atoms which correspond to the rain patterns. By applying PCA and SVM classifiers to the learned dictionaries, our framework automatically identifies the common rain patterns present in them, and thus we can remove rain streaks as particular high-frequency components from the input image. Unlike prior work on rain removal from images/videos, which requires image priors or training data from multiple frames, our proposed self-learning approach only requires the input image itself, saving considerable pre-training effort. Experimental results demonstrate the improvements in both subjective and objective visual quality achieved by our proposed method.
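The key observation above is that rain atoms recur across the dictionaries of distinct context categories. As a hedged sketch of that idea (the paper uses PCA and SVM classifiers for identification; the simple cross-dictionary correlation heuristic below is only an illustrative stand-in):

```python
import numpy as np

def common_atoms(dict_a, dict_b, thresh=0.95):
    """Flag atoms of dict_a that also appear (up to sign) in dict_b.

    dict_a, dict_b: (d, k) dictionaries with unit-norm columns.
    Returns indices of dict_a atoms whose best absolute correlation
    with any dict_b atom exceeds `thresh` -- a stand-in for the
    PCA/SVM identification of shared rain-pattern atoms.
    """
    corr = np.abs(dict_a.T @ dict_b)     # (k_a, k_b) atom correlations
    return np.where(corr.max(axis=1) > thresh)[0]
```

Atoms flagged this way correspond to high-frequency components shared by unrelated contexts, i.e., candidate rain patterns to be removed.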
This paper concerns the development of a music codebook for summarizing local feature descriptors computed over time. Compared to a holistic representation, this text-like representation better captures the rich and time-varying information of music. We systematically compare a number of existing codebook generation techniques and also propose a new one that incorporates labeled data into the dictionary learning process. Several aspects of the encoding system, such as local feature extraction and codeword encoding, are also analyzed. Our results demonstrate the superiority of sparsity-enforced dictionary learning over conventional VQ-based or exemplar-based methods. With the new supervised dictionary learning algorithm and the optimal settings inferred from the performance study, we achieve state-of-the-art accuracy in music genre classification using just the log-power spectrogram as the local feature descriptor. The classification accuracies on two benchmark datasets, GTZAN and ISMIR2004Genre, are 84.7% and 90.8%, respectively.
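For reference, the conventional VQ-based baseline mentioned above can be sketched as hard codeword assignment followed by histogram pooling (a generic bag-of-words encoder, not the paper's sparsity-enforced method):

```python
import numpy as np

def encode_histogram(descriptors, codebook):
    """Hard VQ encoding: assign each local descriptor to its nearest
    codeword and summarize the clip as a normalized codeword histogram.

    descriptors: (n, d) local features (e.g. log-power spectrogram frames)
    codebook:    (k, d) learned codewords
    """
    X = np.asarray(descriptors, float)
    C = np.asarray(codebook, float)
    # squared Euclidean distance from every descriptor to every codeword
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    hist = np.bincount(idx, minlength=len(C)).astype(float)
    return hist / hist.sum()
```

Sparse coding replaces the single nearest-codeword assignment with a sparse combination of atoms; the pooling step is analogous.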
We propose a novel multiple kernel learning (MKL) algorithm with a group lasso regularizer, called group lasso regularized MKL (GL-MKL), for heterogeneous feature fusion and variable selection. For feature fusion problems, assigning a group of base kernels to each feature type in an MKL framework provides a robust way of fitting data extracted from different feature domains. By adding a mixed $\ell_{1,2}$-norm constraint (i.e., group lasso) as the regularizer, we can enforce sparsity at the group/feature level and automatically learn a compact feature set for recognition purposes. More precisely, our GL-MKL determines the optimal base kernels, including the associated weights and kernel parameters, and results in improved recognition performance. Moreover, our GL-MKL can be extended to address heterogeneous variable selection problems, in which we aim to select a compact set of variables (i.e., feature attributes) for comparable or improved performance. Our proposed method does not need to exhaustively search the entire variable space as prior sequential variable selection methods do, nor does it require any prior knowledge of the optimal size of the variable subset. To verify the effectiveness and robustness of our GL-MKL, we conduct experiments on video and image datasets for heterogeneous feature fusion, and perform variable selection on various UCI datasets.
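The group-level sparsity induced by the mixed $\ell_{1,2}$ regularizer can be illustrated by its proximal operator, blockwise soft-thresholding (a generic group-lasso building block, not GL-MKL's full kernel-weight optimization):

```python
import numpy as np

def group_soft_threshold(w, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 (the group-lasso
    regularizer): shrink each group toward zero, zeroing weak groups
    entirely -- sparsity at the group/feature level.

    w:      parameter (e.g. kernel-weight) vector
    groups: list of index lists, one per feature group
    lam:    regularization strength
    """
    w = np.asarray(w, float).copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        w[g] = scale * w[g]
    return w
```

Groups whose $\ell_2$ norm falls below `lam` vanish as a whole, which is exactly the mechanism that discards entire feature types or variables.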
Improving the endurance of phase change memory (PCM) is a fundamental issue when the technology is considered as a main-memory alternative. In the design of memory-based wear-leveling approaches, a major challenge is how to efficiently determine the appropriate memory pages for allocation or swapping. In this paper, we present an efficient wear-leveling design that is compatible with existing virtual memory management. Two implementations, namely bucket-based and array-based wear leveling, with nearly zero search cost are proposed to trade off time and space complexity. The results of experiments conducted on popular benchmarks to evaluate the efficacy of the proposed design are very encouraging.
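The bucket-based idea can be sketched roughly as follows (a hypothetical simplification for illustration, not the paper's actual data structure): pages are grouped into buckets keyed by write count, so the least-worn page is found without scanning the page table.

```python
from collections import defaultdict

class BucketWearLeveler:
    """Minimal sketch of bucket-based wear leveling: buckets indexed
    by write count make least-worn-page selection nearly search-free.
    """
    def __init__(self, num_pages):
        self.count = {p: 0 for p in range(num_pages)}
        self.buckets = defaultdict(set)
        self.buckets[0] = set(range(num_pages))
        self.min_count = 0       # counts only grow, so this never decreases

    def allocate(self):
        # pop any page from the least-worn non-empty bucket
        while not self.buckets[self.min_count]:
            self.min_count += 1
        return self.buckets[self.min_count].pop()

    def record_write(self, page):
        # after a write, the page moves up one bucket
        self.count[page] += 1
        self.buckets[self.count[page]].add(page)
```

Allocation and write bookkeeping are both constant time apart from the (amortized) bucket-pointer advance.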
We address the problem of robust face recognition, in which both training and test image data might be corrupted due to occlusion and disguise. From standard face recognition algorithms such as Eigenfaces to recently proposed sparse representation-based classification (SRC) methods, most prior work does not consider possible contamination of data during training, and thus the associated performance might be degraded. Based on the recent success of low-rank matrix recovery, we propose a novel low-rank matrix approximation algorithm with structural incoherence for robust face recognition. Our method not only decomposes raw training data into a set of representative bases with corresponding sparse errors to better model the face images, but also advocates structural incoherence between the bases learned from different classes. These bases are encouraged to be as independent as possible through the regularization on structural incoherence, which we show provides additional discriminating ability over the original low-rank models for improved performance. Experimental results on public face databases verify the effectiveness and robustness of our method, which is also shown to outperform state-of-the-art SRC-based approaches.
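The core computational step of low-rank matrix recovery is singular value thresholding, the proximal operator of the nuclear norm (sketched below as a generic building block; the paper's full objective additionally includes the sparse-error term and the structural-incoherence regularizer):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: shrink singular values of X by tau,
    zeroing those below it. This extracts the low-rank component of a
    corrupted data matrix in nuclear-norm-regularized recovery.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return U @ np.diag(s) @ Vt
```

Alternating this step with sparse-error shrinkage yields the familiar low-rank-plus-sparse decomposition of the training data.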
In this paper, a MIMO detection scheme is proposed based on a combination of the Monte Carlo technique and list detection. Specifically, a list of Gaussian samples is first generated to determine the search range of constellation points in which the transmitted symbol is most likely to lie. Linear equalization is then applied to mitigate the channel-mixing effect, and a list detector searches within the determined range. By varying the parameters of the Monte Carlo method, different symbol error rate (SER) versus complexity tradeoffs can be obtained to accommodate different system design requirements. Simulation results show that near-ML SER performance can be achieved by the proposed scheme with considerably less computational complexity than the exhaustive search.
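The pipeline above can be sketched in a simplified real-valued form (an illustrative reading, not the paper's exact algorithm: zero-forcing as the linear equalizer, Gaussian sampling around its output to build the candidate list, then minimum-residual selection):

```python
import numpy as np

def mc_list_detect(y, H, constellation, n_samples=100, sigma=0.3, rng=None):
    """Monte Carlo list detection sketch: draw Gaussian samples around
    the zero-forcing estimate, quantize each sample per antenna to the
    constellation to form a candidate list, and return the candidate
    minimizing ||y - H x||.
    """
    rng = np.random.default_rng(rng)
    x_zf = np.linalg.lstsq(H, y, rcond=None)[0]      # linear equalization
    samples = x_zf + sigma * rng.standard_normal((n_samples, len(x_zf)))
    const = np.asarray(constellation, float)
    # nearest constellation point per antenna for every sample
    cand = const[np.abs(samples[:, :, None] - const).argmin(axis=2)]
    cand = np.unique(cand, axis=0)                   # the candidate list
    resid = np.linalg.norm(y - cand @ H.T, axis=1)
    return cand[resid.argmin()]
```

Increasing `n_samples` or `sigma` widens the search range (better SER, more complexity), which is the tradeoff knob the abstract describes.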
ZigBee, a unique communication standard designed for low-rate wireless personal area networks, features extremely low complexity, cost, and power consumption for wireless connectivity in inexpensive, portable, and mobile devices. Among the well-known ZigBee topologies, the ZigBee cluster tree is especially suitable for low-power and low-cost wireless sensor networks because it supports power-saving operations and lightweight routing. In a constructed wireless sensor network, information about some area of interest may require further investigation, generating additional traffic. However, the restricted routing of a ZigBee cluster-tree network may not provide sufficient bandwidth for the increased traffic load, so the additional information may not be delivered successfully. In this paper, we present an adoptive-parent-based framework for a ZigBee cluster-tree network that increases bandwidth utilization without any extra message exchange. To optimize the throughput in the framework, we model the process as a vertex-constrained maximum flow problem and develop a distributed algorithm that is fully compatible with the ZigBee standard. The optimality and convergence of the algorithm are proved theoretically. Finally, the results of simulation experiments demonstrate the significant performance improvement achieved by the proposed framework and algorithm over existing approaches.
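The vertex-constrained maximum flow formulation admits a standard centralized reduction (shown here for illustration; the paper's contribution is a distributed, ZigBee-compatible algorithm): split each node v into v_in → v_out with capacity equal to v's bandwidth, then run any max-flow routine, e.g. Edmonds–Karp.

```python
from collections import defaultdict, deque

def max_flow_vertex_capacities(edges, vcap, s, t):
    """Max flow with per-vertex capacities via node splitting +
    Edmonds-Karp. edges: (u, v, cap) directed links; vcap: vertex caps.
    """
    cap = defaultdict(int)
    adj = defaultdict(set)

    def add(u, v, c):
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)                      # residual arc

    for v, c in vcap.items():
        add((v, 'in'), (v, 'out'), c)      # vertex cap -> internal edge
    for u, v, c in edges:
        add((u, 'out'), (v, 'in'), c)

    S, T = (s, 'in'), (t, 'out')
    flow = 0
    while True:
        parent = {S: None}                 # BFS for an augmenting path
        q = deque([S])
        while q and T not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if T not in parent:
            return flow
        path, v = [], T                    # recover path, find bottleneck
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[e] for e in path)
        for (u, v) in path:                # augment along the path
            cap[(u, v)] -= b
            cap[(v, u)] += b
        flow += b
```

In the cluster-tree setting, the vertex capacities model each router's bandwidth budget and the optimum bounds the achievable throughput.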
The (t, n) visual cryptography (VC) scheme is a secret sharing scheme in which a secret image is encoded into n transparencies, and the stacking of any t out of the n transparencies reveals the secret image; the stacking of t - 1 or fewer transparencies reveals no information about the secret. We discuss the addition and deletion of users in a dynamic user group. To reduce the overhead of generating and distributing transparencies when users change, this paper proposes a (t, n) VC scheme with unlimited n based on a probabilistic model. The proposed scheme allows n to change dynamically so that new transparencies can be included without regenerating and redistributing the original ones. Specifically, an extended VC scheme based on basis matrices and a probabilistic model is proposed. An equation is derived from the fundamental definitions of the (t, n) VC scheme, from which the (t, ∞) VC scheme achieving maximal contrast can be designed. The maximal contrasts for t = 2 to 6 are explicitly solved in this paper.
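As background for the stacking and contrast notions used above, the classic deterministic (2, 2) construction for a single pixel can be sketched as follows (this illustrates basis-matrix VC in its simplest form; it is not the paper's probabilistic (t, ∞) scheme):

```python
import random

def share_pixel(secret_bit, rng=random):
    """Classic (2, 2) VC for one pixel. Each pixel expands to two
    subpixels; stacking is pixel-wise OR of the shares.
      white (0): both shares get the same random pattern -> 1 black subpixel
      black (1): complementary patterns                  -> 2 black subpixels
    """
    pattern = rng.choice([(0, 1), (1, 0)])
    s1 = pattern
    s2 = pattern if secret_bit == 0 else tuple(1 - b for b in pattern)
    return s1, s2

def stack(s1, s2):
    # physical stacking of transparencies: black wins (logical OR)
    return tuple(a | b for a, b in zip(s1, s2))
```

Each share in isolation is uniformly random, while stacking produces a darkness difference between white and black pixels; probabilistic schemes trade this deterministic per-pixel contrast for an expected contrast, which is what permits unlimited n.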
The tree representation of the multiple-input multiple-output (MIMO) detection problem is illuminating for the development, interpretation, and classification of various detection methods. Best-first detection based on Dijkstra's algorithm pursues the tree search according to a sorted list of tree nodes. In the first part of the paper, a new probabilistic sorting scheme is developed and incorporated into a modified Dijkstra's algorithm for MIMO detection. The proposed sorting exploits the statistics of the problem and yields effective tree exploration and truncation in the proposed algorithm. The second part of the paper generalizes the results of the first part and removes some of their limitations. A generalized Dijkstra's algorithm is developed as a unified tree-search detection framework. The proposed framework incorporates a parameter triplet that allows the configuration of the memory usage, detection complexity, and sorting dynamics of the tree-search algorithm. By tuning these parameters, desired performance-complexity tradeoffs are attained, and a fixed-complexity version can be produced. Simulation results and analytical discussions demonstrate that the proposed generalized Dijkstra's algorithm achieves highly favorable performance-complexity tradeoffs.
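The baseline best-first search that the paper builds on can be sketched as follows (plain Dijkstra-style search on the detection tree, without the probabilistic sorting or the parameter triplet): with an upper-triangular channel factor R, partial symbol vectors are expanded in order of accumulated metric, and the first full-length node popped is the ML solution because all branch metrics are nonnegative.

```python
import heapq
import numpy as np

def best_first_detect(y, R, constellation):
    """Best-first (Dijkstra-style) tree search for min ||y - R x||^2
    with upper-triangular R. Layers are fixed bottom-up (index n-1
    first); the sorted node list always expands the cheapest node.
    """
    n = len(y)
    heap = [(0.0, ())]                  # (partial metric, fixed symbols)
    while heap:
        cost, fixed = heapq.heappop(heap)
        k = len(fixed)
        if k == n:
            return np.array(fixed)      # symbols in index order 0..n-1
        i = n - 1 - k                   # next layer to fix
        for s in constellation:
            # branch metric of layer i given the already-fixed symbols
            r = y[i] - R[i, i] * s
            if k:
                r -= R[i, i + 1:] @ np.array(fixed, float)
            heapq.heappush(heap, (cost + r * r, (s,) + fixed))
    raise ValueError("empty search tree")
```

Bounding the heap size, the number of expanded children, or the sorting effort at this point is exactly where a parameter triplet of the kind described above would enter, trading optimality for fixed complexity.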