This paper considers a device-to-device (D2D) communications underlaid multiple-input multiple-output (MIMO) cellular network and studies D2D mode selection from a previously unexamined perspective. Since D2D mode selection affects the network interference profile and vice versa, a joint design of D2D mode selection and interference management is desirable but challenging. In this work, we propose a holistic approach to this problem with interference-free operation in mind. We adopt the degrees of freedom (DoF) as the mode-selection criterion and exploit linear interference alignment (IA) for interference management. We analyze the achievable sum DoF of the potential D2D users according to their mode selections, and derive probabilistic sum-rate relations between the proposed DoF-based mode selection scheme and the common received-signal-strength-indicator (RSSI)-based scheme in Poisson point process (PPP) networks. Simulations illustrate the theoretical insights and show the advantages of the proposed DoF-based scheme over conventional mode selection schemes from various perspectives, making it a promising candidate for D2D mode selection in 5G communications.
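As a rough illustration of a DoF-based mode-selection rule of the kind described above (not the paper's exact analysis), the sketch below bounds the per-user D2D DoF with the well-known properness condition M + N >= d(K + 1) for a symmetric K-user M x N MIMO interference channel under linear IA, and compares it against the DoF a user would earn in cellular mode. The function names and the threshold rule are illustrative assumptions.

```python
def ia_feasible_dof(M: int, N: int, K: int) -> int:
    """Largest per-user DoF d satisfying the properness condition
    M + N >= d * (K + 1) for a symmetric K-user M x N MIMO
    interference channel under linear IA."""
    return (M + N) // (K + 1)

def select_mode(M: int, N: int, K_d2d: int, dof_cellular: int) -> str:
    """Illustrative DoF-based rule (an assumption, not the paper's
    exact criterion): choose D2D mode only if the IA-achievable
    D2D DoF is at least the DoF available in cellular mode."""
    return "D2D" if ia_feasible_dof(M, N, K_d2d) >= dof_cellular else "cellular"

# Example: 4x4 antennas, 3 mutually interfering D2D pairs,
# and 1 DoF available in cellular mode.
print(select_mode(4, 4, 3, 1))  # -> "D2D" (per-user D2D DoF is 2 >= 1)
```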
Multi-label classification is a practical yet challenging task in machine-learning-related fields, since it requires predicting more than one label category for each input instance. We propose a novel deep neural network (DNN) based model, the Canonical Correlated AutoEncoder (C2AE), for solving this task. Aiming at better relating feature- and label-domain data for improved classification, we uniquely perform joint feature and label embedding by deriving a deep latent space, followed by the introduction of a label-correlation-sensitive loss function for recovering the predicted label outputs. Our C2AE is achieved by integrating the DNN architectures of canonical correlation analysis and the autoencoder, which allows end-to-end learning and prediction with the ability to exploit label dependency. Moreover, our C2AE can easily be extended to address learning with missing labels. Our experiments on multiple datasets of different scales confirm the effectiveness and robustness of the proposed method, which is shown to perform favorably against state-of-the-art methods for multi-label classification.
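To make the joint feature and label embedding concrete, here is a minimal PyTorch sketch of the C2AE structure, not the authors' implementation: a feature encoder Fx, a label encoder Fe, and a label decoder Fd share one latent space. The CCA constraint is approximated by a simple L2 latent-alignment term, and the label-correlation-sensitive loss is replaced by binary cross-entropy for brevity; layer sizes and the weight alpha are illustrative.

```python
import torch
import torch.nn as nn

class C2AESketch(nn.Module):
    """Minimal sketch of the C2AE structure (not the authors' code)."""
    def __init__(self, x_dim, y_dim, z_dim=64, h=128):
        super().__init__()
        self.Fx = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU(), nn.Linear(h, z_dim))
        self.Fe = nn.Sequential(nn.Linear(y_dim, h), nn.ReLU(), nn.Linear(h, z_dim))
        self.Fd = nn.Sequential(nn.Linear(z_dim, h), nn.ReLU(), nn.Linear(h, y_dim))

    def loss(self, x, y, alpha=1.0):
        zx, zy = self.Fx(x), self.Fe(y)
        align = ((zx - zy) ** 2).mean()              # surrogate for the CCA term
        y_hat = self.Fd(zy)                          # decode from the label embedding
        recon = nn.functional.binary_cross_entropy_with_logits(y_hat, y)
        return recon + alpha * align

    def predict(self, x):
        # Test time composes Fd(Fx(x)); label embeddings are only needed in training.
        return torch.sigmoid(self.Fd(self.Fx(x)))
```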
Unsupervised domain adaptation deals with scenarios in which labeled data are available in the source domain, but only unlabeled data can be observed in the target domain. Since classifiers trained on source-domain data would not be expected to generalize well in the target domain, how to transfer the label information from source- to target-domain data is a challenging task. A common technique for unsupervised domain adaptation is to match cross-domain data distributions, so that the domain and distribution differences can be suppressed. In this paper, we propose to utilize the label information inferred from the source domain, while the structural information of the unlabeled target-domain data is jointly exploited for adaptation purposes. Our proposed model not only reduces the distribution mismatch between domains but also improves recognition of target-domain data. In the experiments, we show that our approach performs favorably against state-of-the-art unsupervised domain adaptation methods on benchmark datasets. We also provide convergence, sensitivity, and robustness analyses, which support the use of our model for cross-domain classification.
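As a reference point for the distribution-matching step mentioned above, the sketch below computes a squared maximum mean discrepancy (MMD) with an RBF kernel, one standard criterion for matching cross-domain distributions; it is a generic illustration, not this paper's specific objective, and the bandwidth gamma is an arbitrary choice.

```python
import numpy as np

def rbf_mmd2(Xs, Xt, gamma=1.0):
    """Biased estimator of squared MMD with an RBF kernel between
    source features Xs (n x d) and target features Xt (m x d)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
        return np.exp(-gamma * d2)
    return k(Xs, Xs).mean() + k(Xt, Xt).mean() - 2 * k(Xs, Xt).mean()

# Example: a shifted target domain yields a larger discrepancy.
Xs = np.random.randn(100, 16)
Xt = np.random.randn(120, 16) + 0.5
print(rbf_mmd2(Xs, Xt))  # larger values indicate a bigger domain gap
```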
Textural style transfer aims to transfer the textural style identified in a reference image to a source image, while retaining the scene of the source image. This article proposes a context-aware style transfer algorithm based on sparse-representation-based textural synthesis. Whereas sparse representation is designed to extract the style component of the exemplar image, textural synthesis is performed in a context-aware setting to preserve the original scene structure of the source image. Unlike existing solutions that require prior knowledge of the textural style of interest or user interaction, the proposed method performs the transfer automatically. Experimental results demonstrate the effectiveness of the proposed method for automatic style transfer from a single style template image that is not accompanied by its original real image.
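One plausible way to realize the sparse-representation step is sketched below with scikit-learn: a patch dictionary learned from the style exemplar re-expresses source patches, so reconstructions inherit the exemplar's texture, while the per-patch mean (kept aside) roughly preserves the source scene. The patch size, atom count, and DC-separation trick are illustrative assumptions; the paper's context-aware synthesis is more involved.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)

def transfer_style(style_img, source_img, patch=8, atoms=128, alpha=1.0):
    """Sketch: code source patches in a dictionary learned from the
    style exemplar. Assumes 2-D grayscale float images."""
    P_style = extract_patches_2d(style_img, (patch, patch), max_patches=2000)
    X = P_style.reshape(len(P_style), -1)
    X = X - X.mean(1, keepdims=True)              # drop patch DC (content bias)
    dico = MiniBatchDictionaryLearning(n_components=atoms, alpha=alpha).fit(X)

    P_src = extract_patches_2d(source_img, (patch, patch))
    Y = P_src.reshape(len(P_src), -1)
    dc = Y.mean(1, keepdims=True)                 # keep source scene via DC term
    codes = dico.transform(Y - dc)                # sparse codes in the style dictionary
    Y_hat = codes @ dico.components_ + dc
    return reconstruct_from_patches_2d(Y_hat.reshape(P_src.shape), source_img.shape)
```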
We present a comprehensive performance study of a new time-domain approach for estimating the components of an observed monaural audio mixture. Unlike existing time-frequency approaches that use the product of a set of spectral templates and their corresponding activation patterns to approximate the spectrogram of the mixture, the proposed approach uses the sum of a set of convolutions of estimated activations with prelearned dictionary filters to approximate the audio mixture directly in the time domain. The approximation problem can be solved by an efficient convolutional sparse coding algorithm. The effectiveness of this approach for source separation of musical audio was demonstrated in our prior work, but under rather restricted and controlled conditions: the musical score of the mixture had to be known a priori, and there could be little mismatch between the dictionary filters and the source signals. In this paper, we report an evaluation that considers wider, and more practical, experimental settings, including the use of an audio-based multi-pitch estimation algorithm in place of the musical score, and an external dataset of single audio notes to construct the dictionary filters. Our results show that the proposed approach remains effective with a larger dictionary and compares favorably with the state-of-the-art non-negative matrix factorization approach. However, in the absence of the score and with a small dictionary, our approach may not perform better.
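The time-domain model itself is easy to state: the mixture x is approximated by the sum over k of conv(z_k, d_k), with sparse activations z_k and fixed dictionary filters d_k. The NumPy sketch below solves this with plain ISTA purely for illustration; the paper relies on a far more efficient convolutional sparse coding solver, and the step size, penalty, and iteration count here are arbitrary.

```python
import numpy as np

def csc_ista(x, filters, lam=0.1, lr=0.01, iters=200):
    """Illustrative ISTA for time-domain convolutional sparse coding:
    find sparse activations Z so that sum_k conv(Z[k], filters[k]) ~ x.
    Assumes odd-length filters so 'same'-mode alignment is exact."""
    K, L = len(filters), len(x)
    Z = np.zeros((K, L))
    for _ in range(iters):
        approx = sum(np.convolve(Z[k], filters[k], mode="same") for k in range(K))
        residual = x - approx
        for k in range(K):
            # gradient of 0.5 * ||residual||^2 w.r.t. Z[k] is -corr(residual, d_k)
            grad = -np.convolve(residual, filters[k][::-1], mode="same")
            v = Z[k] - lr * grad
            Z[k] = np.sign(v) * np.maximum(np.abs(v) - lr * lam, 0.0)  # soft threshold
    return Z  # component k of the mixture is np.convolve(Z[k], filters[k], mode="same")
```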
Heterogeneous domain adaptation (HDA) addresses the task of associating data not only across dissimilar domains but also described by different types of features. Inspired by recent advances in neural networks and deep learning, we propose Transfer Neural Trees (TNT), which jointly solve cross-domain feature mapping, adaptation, and classification in an NN-based architecture. As the prediction layer in TNT, we further propose the Transfer Neural Decision Forest (Transfer-NDF), which effectively adapts the neurons in TNT via stochastic pruning. Moreover, to address semi-supervised HDA, a unique embedding loss term for preserving prediction and structural consistency among target-domain data is introduced into TNT. Experiments on classification tasks across features, datasets, and modalities successfully verify the effectiveness of our TNT.
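For readers unfamiliar with neural decision forests, the PyTorch sketch below shows a single soft decision tree of the kind such a prediction layer stacks into a forest: sigmoid "decision" neurons give routing probabilities and leaves hold class distributions. The Transfer-NDF specifics, notably stochastic pruning and the cross-domain terms, are omitted, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class SoftDecisionTree(nn.Module):
    """One soft decision tree; a neural decision forest averages several."""
    def __init__(self, in_dim, n_classes, depth=4):
        super().__init__()
        self.depth = depth
        n_inner, n_leaf = 2 ** depth - 1, 2 ** depth
        self.decisions = nn.Linear(in_dim, n_inner)        # routing neurons
        self.leaves = nn.Parameter(torch.randn(n_leaf, n_classes))

    def forward(self, h):                                  # h: (batch, in_dim)
        d = torch.sigmoid(self.decisions(h))               # P(go left) per inner node
        mu = torch.ones(h.size(0), 1, device=h.device)     # root path probability
        begin = 0
        for level in range(self.depth):
            n = 2 ** level
            dl = d[:, begin:begin + n]                      # this level's nodes
            # split each path probability into left/right children
            mu = torch.stack([mu * dl, mu * (1 - dl)], dim=2).reshape(h.size(0), 2 * n)
            begin += n
        pi = torch.softmax(self.leaves, dim=1)              # leaf class distributions
        return mu @ pi                                      # mixture over all leaves
```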
In music auto-tagging, people develop models to automatically label a music clip with attributes such as instruments, styles, or acoustic properties. Many of these tags are actually descriptors of local events in a music clip, rather than a holistic description of the whole clip. Localizing such tags in time can potentially transform the way people retrieve and interact with music, but little work has been done to date, owing to the scarcity of labeled data with granularity specific enough for the frame level. Most labeled data for training a learning-based model for music auto-tagging are at the clip level, providing no cues as to when and for how long these attributes appear in a music clip. To bridge this gap, we propose in this paper a convolutional neural network (CNN) architecture that is able to make accurate frame-level predictions of tags in unseen music clips by using only clip-level annotations in the training phase. Our approach is motivated by recent advances in computer vision for localizing visual objects, but we propose new designs of the CNN architecture to account for the temporal information of music and the variable duration of such local tags in time. We report extensive experiments to gain insights into the problem of event localization in music, and validate the effectiveness of the proposed approach through experiments. In addition to quantitative evaluations, we also present qualitative analyses showing that the model can indeed learn certain characteristics of music tags.
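A minimal sketch of this weak-supervision idea in PyTorch (not the paper's exact architecture): a fully convolutional network emits a tag score for every frame, and max pooling over time reduces these to clip-level logits, so training needs only clip-level labels while the frame-level scores localize each tag at test time. The layer sizes, mel-spectrogram input, and the choice of max pooling are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FrameTagger(nn.Module):
    """Fully convolutional tagger trainable from clip-level labels only."""
    def __init__(self, n_mels=128, n_tags=50):
        super().__init__()
        self.frames = nn.Sequential(
            nn.Conv1d(n_mels, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, n_tags, kernel_size=1),          # per-frame tag logits
        )

    def forward(self, spec):                                # spec: (batch, n_mels, time)
        frame_logits = self.frames(spec)                    # (batch, n_tags, time)
        clip_logits = frame_logits.max(dim=2).values        # temporal max pooling
        return frame_logits, clip_logits

# Training applies BCEWithLogitsLoss to clip_logits against clip-level tags;
# at test time, frame_logits localize each predicted tag in time.
```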