Research Center for Information Technology Innovation | Recent Research Results

D. K. Verma, R. Y. Chang, and F.-T. Chien

Energy-Assisted Decode-and-Forward for Energy Harvesting Cooperative Cognitive Networks

IEEE Transactions on Cognitive Communications and Networking

September 2017

In this paper, we consider a simultaneous wireless information and power transfer (SWIPT)-enabled cooperative cognitive network that addresses energy scarcity and spectral scarcity, two important issues in 5G wireless communications. In the considered network, the self-sustainable, SWIPT-enabled relay assists the primary user's transmission, while the relay itself is also a secondary user with its own information superimposed on the regenerated primary information for transmission. The SWIPT relay employs the proposed energy-assisted decode-and-forward (EDF) protocol, which enhances the conventional decode-and-forward (DF) protocol with energy-dimension-augmented information decoding. We conduct a comparative analysis of the proposed EDF and the conventional DF and amplify-and-forward (AF) protocols in this SWIPT cooperative cognitive framework in terms of capacity, outage probability, and throughput for both primary and secondary networks. Simulation corroborates the analysis and demonstrates performance advantages of EDF over DF/AF from various perspectives.
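
The baselines and metrics named in the abstract can be illustrated with a minimal sketch. It uses the textbook two-hop, half-duplex capacity expressions for conventional DF and variable-gain AF relaying, and it does not model the proposed EDF protocol, SWIPT energy harvesting, or the superimposed secondary signal. The function names, the absence of a direct source-destination link, and the Rayleigh-fading assumption (exponentially distributed SNRs) are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def df_capacity(snr_sr: float, snr_rd: float) -> float:
    """Conventional DF over two hops: the weaker hop limits the rate.

    The factor 0.5 accounts for the two time slots of half-duplex relaying.
    """
    return 0.5 * math.log2(1.0 + min(snr_sr, snr_rd))


def af_capacity(snr_sr: float, snr_rd: float) -> float:
    """Variable-gain AF over two hops, using the standard end-to-end SNR."""
    snr_e2e = snr_sr * snr_rd / (snr_sr + snr_rd + 1.0)
    return 0.5 * math.log2(1.0 + snr_e2e)


def is_outage(capacity: float, target_rate: float) -> bool:
    """An outage occurs when the capacity falls below the target rate."""
    return capacity < target_rate


def outage_probability(capacity_fn, mean_snr_sr, mean_snr_rd,
                       target_rate, trials=10000, seed=0):
    """Monte Carlo outage estimate, ASSUMING Rayleigh fading on both hops."""
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        # Under Rayleigh fading the instantaneous SNR is exponential.
        snr_sr = rng.expovariate(1.0 / mean_snr_sr)
        snr_rd = rng.expovariate(1.0 / mean_snr_rd)
        if is_outage(capacity_fn(snr_sr, snr_rd), target_rate):
            outages += 1
    return outages / trials
```

Because the AF end-to-end SNR is always smaller than the weaker hop's SNR, this sketch always gives AF a lower capacity than DF for the same channel realization. It ignores decoding errors at the relay and any direct link, both of which can change the comparison in practice.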

Chih-Yu Wang, Yan Chen, K.J. Ray Liu

Hidden Chinese Restaurant Game: Grand Information Extraction for Stochastic Network Learning

IEEE Transactions on Signal and Information Processing over Networks

June 2017

Agents in networks often encounter circumstances requiring them to make decisions. However, the effectiveness of these decisions may be uncertain due to the unknown system state and uncontrollable externalities. The uncertainty can be reduced by learning from information sources, such as user-generated content or revealed actions. User-generated content, however, could be untrustworthy, since other agents may maliciously create misleading content for their selfish interests. Passively revealed actions are potentially more trustworthy and also easier to gather through simple observations. In this paper, we propose a new stochastic game-theoretic framework, Hidden Chinese Restaurant Game (H-CRG), to utilize the passively revealed actions in the stochastic social learning process. We propose grand information extraction, a novel Bayesian belief extraction process, to extract the belief on the hidden information directly from the observed actions. We utilize the coupling relation between belief and policy to transform the original continuous belief-state Markov decision process (MDP) into a discrete-state MDP. The optimal policy is then analyzed in both centralized and game-theoretic approaches. We demonstrate how the proposed H-CRG can be applied to the channel access problem in cognitive radio networks. We then conduct data-driven simulations using the CRAWDAD Dartmouth campus wireless local area network (WLAN) trace. The simulation results show that the equilibrium strategy derived in H-CRG provides higher expected utilities for new users and maintains a reasonably high social welfare compared with other candidate strategies.
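
The idea of extracting a belief about hidden information from observed actions can be illustrated with a plain Bayes update. This is only a generic sketch: the abstract does not specify how grand information extraction is computed, so the state names, the action names, and the likelihood table below are hypothetical.

```python
def bayes_update(prior, likelihood, action):
    """Return the posterior belief over hidden states after one observed action.

    prior:      dict mapping state -> P(state)
    likelihood: dict mapping state -> dict mapping action -> P(action | state)
    action:     the action that was observed
    """
    unnormalized = {s: prior[s] * likelihood[s][action] for s in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observed action has zero probability under the prior")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical example: is a channel "good" or "bad", given that another
# agent was observed to join it?
prior = {"good": 0.5, "bad": 0.5}
likelihood = {
    "good": {"join": 0.8, "leave": 0.2},
    "bad": {"join": 0.4, "leave": 0.6},
}
posterior = bayes_update(prior, likelihood, "join")
```

In this toy example, observing a "join" raises the belief that the channel is good from 1/2 to 2/3.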

A. Chern, Y.-H. Lai, Y. Chang, Y. Tsao, R. Y. Chang, and H.-W. Chang

A Smartphone-Based Multi-Functional Hearing Assistive System to Facilitate Speech Recognition in the Classroom

IEEE Access

June 2017

In this paper, we propose a smartphone-based hearing assistive system (termed SmartHear) to facilitate speech recognition for various target users who could benefit from enhanced listening clarity in the classroom. The SmartHear system consists of transmitter and receiver devices (e.g., smartphone and Bluetooth headset) for voice transmission, and an Android mobile application that controls and connects the different devices via Bluetooth or WiFi technology. The wireless transmission of voice signals between devices overcomes the reverberation and ambient noise effects in the classroom. The main functionalities of SmartHear include: 1) configurable transmitter/receiver assignment, to allow flexible designation of transmitter/receiver roles; 2) advanced noise-reduction techniques; 3) audio recording; and 4) voice-to-text conversion, to give students visual text aid. All the functions are implemented as a mobile application with an easy-to-navigate user interface. Experiments show the effectiveness of the noise-reduction schemes at low signal-to-noise ratios (SNR) in terms of standard speech perception and quality indices, and show the effectiveness of SmartHear in maintaining voice-to-text conversion accuracy regardless of the distance between the speaker and listener. Future applications of SmartHear are also discussed.

Tsun-Yi Yang, Jo-Han Hsu, Yen-Yu Lin, and Yung-Yu Chuang

DeepCD: Learning Deep Complementary Descriptors for Patch Representations

IEEE International Conference on Computer Vision (ICCV), Poster Session

October 2017

This paper presents the DeepCD framework, which jointly learns a pair of complementary descriptors for image patch representation by employing deep learning techniques. This is achieved by taking any descriptor learning architecture for learning a leading descriptor and augmenting the architecture with an additional network stream for learning a complementary descriptor. To enforce the complementary property, a new network layer, called the data-dependent modulation (DDM) layer, is introduced for adaptively learning the augmented network stream with an emphasis on the training data that are not well handled by the leading stream. By optimizing the proposed joint loss function with late fusion, the obtained descriptors are complementary to each other and their fusion improves performance. Experiments on several problems and datasets show that the proposed method is simple yet effective, outperforming state-of-the-art methods.

Han-Yi Lin, Pi-Cheng Hsiu, and Tei-Wei Kuo

ShiftMask: Dynamic OLED Power Shifting Based on Visual Acuity for Interactive Mobile Applications

IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)

July 2017

OLED power management on mobile devices is very challenging due to the dynamic nature of human-screen interaction. This paper presents the design, algorithms, and implementation of a lightweight mobile app called ShiftMask, which allows the user to dynamically shift OLED power to the portion of interest, while dimming the remainder of the screen based on visual acuity. To adapt to the user’s focus of attention, we propose efficient algorithms that consider visual fixation in static scenes, as well as changes in focus and screen scrolling. The results of experiments conducted on a commercial smartphone with popular interactive apps demonstrate that ShiftMask can achieve substantial energy savings, while preserving acceptable readability.

H.-W. Lee, C.-P. Wei, and Y.-C. F. Wang

Learning Grassmann Manifolds for Object State Discovery

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Poster Session

March 2017

N/A

Chuan-Ju Wang, Ting-Hsiang Wang*, Hsiu-Wei Yang*, Bo-Sin Chang, and Ming-Feng Tsai

ICE: Item Concept Embedding via Textual Information

Machine Learning

August 2017

This paper proposes an item concept embedding (ICE) framework to model item concepts via textual information. Specifically, in the proposed framework there are two stages: graph construction and embedding learning. In the first stage, we propose a generalized network construction method to build a network involving heterogeneous nodes and a mixture of both homogeneous and heterogeneous relations. The second stage leverages the concept of neighborhood proximity to learn the embeddings of both items and words. With the proposed carefully designed ICE networks, the resulting embedding facilitates both homogeneous and heterogeneous retrieval, including item-to-item and word-to-item retrieval. Moreover, as a distributed embedding approach, the proposed ICE approach not only generates related retrieval results but also delivers more diverse results than traditional keyword-matching-based approaches. As our experiments on two real-world datasets show, ICE encodes useful textual information and thus outperforms traditional methods in various item classification and retrieval tasks.

Ya-Fang Shih, Yang-Ming Yeh, Yen-Yu Lin, Ming-Feng Weng, Yi-Chang Lu, and Yung-Yu Chuang

Deep Co-occurrence Feature Learning for Visual Object Recognition

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Poster Session

July 2017

This paper addresses three issues in integrating part-based representations into convolutional neural networks (CNNs) for object recognition. First, most part-based models rely on a few pre-specified object parts. However, the optimal object parts for recognition often vary from category to category. Second, acquiring training data with part-level annotation is labor-intensive. Third, modeling spatial relationships between parts in CNNs often involves an exhaustive search of part templates over multiple network streams. We tackle the three issues by introducing a new network layer, called the co-occurrence layer. It extends a convolutional layer to encode the co-occurrence between the visual parts detected by the numerous neurons, instead of a few pre-specified parts. To this end, the feature maps serve as both filters and images, and mutual correlation filtering is conducted between them. The co-occurrence layer is end-to-end trainable. The resultant co-occurrence features are rotation- and translation-invariant, and are robust to object deformation. By applying this new layer to VGG-16 and ResNet-152, we achieve recognition rates of 83.6% and 85.8% on the Caltech-UCSD bird benchmark, respectively. The source code is available at https://github.com/yafangshih/Deep-COOC.
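
The phrase "the feature maps serve as both filters and images" can be made concrete with a small sketch. It reflects one plausible reading of the abstract: each pair of feature maps is cross-correlated, and the maximum response over all spatial offsets is kept as that pair's co-occurrence score. The function names, the zero padding, and the use of plain nested lists are assumptions for illustration. The authors' actual layer is in the repository linked above and may differ in its details.

```python
def max_cross_correlation(a, b):
    """Maximum over spatial offsets of sum_p a[p] * b[p + offset].

    a and b are equally sized 2-D feature maps given as nested lists.
    Positions that fall outside b are treated as zeros.
    """
    h, w = len(a), len(a[0])
    best = float("-inf")
    for dy in range(-(h - 1), h):
        for dx in range(-(w - 1), w):
            total = 0.0
            for y in range(h):
                for x in range(w):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += a[y][x] * b[yy][xx]
            best = max(best, total)
    return best


def cooccurrence_vector(feature_maps):
    """Co-occurrence scores for every ordered pair of feature maps."""
    return [max_cross_correlation(a, b)
            for a in feature_maps
            for b in feature_maps]


# Two toy 2x2 maps whose activations sit at different positions.
map_a = [[1.0, 0.0], [0.0, 0.0]]
map_b = [[0.0, 0.0], [0.0, 2.0]]
scores = cooccurrence_vector([map_a, map_b])
```

Taking the maximum over offsets means the score for a pair depends on how strongly the two parts fire, not on how far apart they are in this toy example, which hints at why such features tolerate deformation of the object.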

Tak-Shing Chan and Yi-Hsuan Yang

Informed Group-Sparse Representation for Singing Voice Separation

IEEE Signal Processing Letters

February 2017

Singing voice separation attempts to separate the vocal and instrumental parts of a music recording, which is a fundamental problem in music information retrieval. Recent work on singing voice separation has shown that the low-rank representation and informed separation approaches are both able to improve separation quality. However, low-rank optimizations are computationally inefficient due to the use of singular value decompositions. Therefore, in this paper, we propose a new linear-time algorithm called informed group-sparse representation, and use it to separate the vocals from music using pitch annotations as side information. Experimental results on the iKala dataset confirm the efficacy of our approach, suggesting that the music accompaniment follows a group-sparse structure given a pretrained instrumental dictionary. We also show how our work can be easily extended to accommodate multiple dictionaries using the DSD100 dataset.
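
To illustrate what "group-sparse" means and why it avoids singular value decompositions, the sketch below shows group soft-thresholding, the standard proximal step for a group-lasso (l2,1) penalty. It is a generic building block and not the paper's algorithm, which the abstract does not spell out. The function name and the choice of groups are assumptions.

```python
import math


def group_soft_threshold(groups, tau):
    """Shrink each group of coefficients toward zero by tau in l2 norm.

    Groups whose l2 norm is at most tau are set entirely to zero, which is
    what produces group-level sparsity. The cost is linear in the number of
    coefficients and requires no matrix decomposition.
    """
    shrunk = []
    for group in groups:
        norm = math.sqrt(sum(v * v for v in group))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0.0 else 0.0
        shrunk.append([scale * v for v in group])
    return shrunk


# One strong group (norm 5) and one weak group (norm 0.5), with tau = 1.
result = group_soft_threshold([[3.0, 4.0], [0.3, 0.4]], tau=1.0)
```

Here the strong group is scaled by 0.8 and survives, while the weak group is zeroed out as a whole. In a dictionary-based separation setting, each group could hypothetically correspond to the activations of one dictionary atom across time frames.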

W.-J. Ko, J.-Y. Yu, W.-Y. Chen, and Y.-C. F. Wang

Enhanced Canonical Correlation Analysis with Local Density for Cross-Domain Visual Classification

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Poster Session

March 2017

N/A