Research Center for Information Technology Innovation | Recent Research Results

Chun-Hsiang Wang, Kang-Chun Fan, Chuan-Ju Wang, and Ming-Feng Tsai

UGSD: User Generated Sentiment Dictionaries from Online Customer Reviews

Machine Learning

January 2019

Customer reviews on platforms such as TripAdvisor and Amazon provide rich information about the ways that people convey sentiment on certain domains. Given these kinds of user reviews, this paper proposes UGSD, a representation learning framework for constructing domain-specific sentiment dictionaries from online customer reviews, in which we leverage the relationship between user-generated reviews and the ratings of the reviews to associate the reviewer sentiment with certain entities. The proposed framework has the following three main advantages. First, no additional annotations of words or external dictionaries are needed for the proposed framework; the only resources needed are the review texts and entity ratings. Second, the framework is applicable across a variety of user-generated content from different domains to construct domain-specific sentiment dictionaries. Finally, each word in the constructed dictionary is associated with a low-dimensional dense representation and a degree of relatedness to a certain rating, which enable us to obtain more fine-grained dictionaries and enhance the application scalability of the constructed dictionaries as the word representations can be adopted for various tasks or applications, such as entity ranking and dictionary expansion. The experimental results on three real-world datasets show that the framework is effective in constructing high-quality domain-specific sentiment dictionaries from customer reviews.

Yu-Lun Liu, Yi-Tung Liao, Yen-Yu Lin, and Yung-Yu Chuang

Deep Video Frame Interpolation using Cyclic Frame Generation

AAAI Conference on Artificial Intelligence (AAAI), Poster Session

January 2019

Video frame interpolation algorithms predict intermediate frames to produce videos with higher frame rates and smooth view transitions given two consecutive frames as inputs. We propose that synthesized frames are more reliable if they can be used to reconstruct the input frames with high quality. Based on this idea, we introduce a new loss term, the cycle consistency loss. The cycle consistency loss makes better use of the training data, not only enhancing the interpolation results but also maintaining performance better with less training data. It can be integrated into any frame interpolation network and trained in an end-to-end manner. In addition to the cycle consistency loss, we propose two extensions: motion linearity loss and edge-guided training. The motion linearity loss approximates the motion between two input frames to be linear and regularizes the training. By applying edge-guided training, we further improve results by integrating edge information into training. Both qualitative and quantitative experiments demonstrate that our model outperforms the state-of-the-art methods. The source code of the proposed method and more experimental results will be available at https://github.com/alex04072000/CyclicGen.
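One possible reading of the cycle consistency idea can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation: frames are flat lists of intensities, `average_interpolator` is a hypothetical stand-in for a learned network, and the choice of three consecutive frames and an L1 distance is an assumption made for the example.

```python
from typing import Callable, List

Frame = List[float]  # toy "frame": a flat list of pixel intensities


def average_interpolator(a: Frame, b: Frame) -> Frame:
    """Hypothetical stand-in for a learned interpolation network: per-pixel mean."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def l1_distance(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two frames of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    interpolate: Callable[[Frame, Frame], Frame],
    frame0: Frame,
    frame1: Frame,
    frame2: Frame,
) -> float:
    """Interpolate twice, then check that the middle input frame is recovered."""
    mid_01 = interpolate(frame0, frame1)  # synthesized frame between 0 and 1
    mid_12 = interpolate(frame1, frame2)  # synthesized frame between 1 and 2
    reconstructed_1 = interpolate(mid_01, mid_12)  # should resemble frame1
    return l1_distance(reconstructed_1, frame1)


# Linearly changing intensities are reconstructed exactly by the toy interpolator.
linear_loss = cycle_consistency_loss(
    average_interpolator, [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]
)
# A non-linear change leaves a reconstruction error that the loss penalizes.
nonlinear_loss = cycle_consistency_loss(
    average_interpolator, [0.0, 0.0], [2.0, 2.0], [2.0, 2.0]
)
```

In a real training setup this term would be added to the usual reconstruction loss of the interpolation network; the weighting between the two is not specified in the abstract.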

C. M. Yetis and R. Y. Chang

Distributed Multi-Stream Beamforming in MIMO Multi-Relay Interference Networks

IEEE Access

January 2019

In this paper, multi-stream transmission in interference networks aided by multiple amplify-and-forward (AF) relays in the presence of direct links is considered. The objective is to minimize the sum power of transmitters and relays by beamforming optimization under the stream signal-to-interference-plus-noise ratio (SINR) constraints. For transmit beamforming optimization, the problem is a well-known non-convex quadratically constrained quadratic program (QCQP) that is NP-hard to solve. After semidefinite relaxation (SDR), the problem can be optimally solved via the alternating direction method of multipliers (ADMM) algorithm for distributed implementation. Analytical and extensive numerical analyses demonstrate that the proposed ADMM solution converges to the optimal centralized solution. The convergence rate, computational complexity, and message exchange load of the proposed algorithm outperform those of the existing solutions. Furthermore, by SINR approximation at the relay side, distributed joint transmit and relay beamforming optimization is also proposed, which further improves the total power saving at the cost of increased complexity.

J.-Y. Lin, R. Y. Chang, C.-H. Lee, H.-W. Tsao, and H.-J. Su

Multi-Agent Distributed Beamforming with Improper Gaussian Signaling for MIMO Interference Broadcast Channels

IEEE Transactions on Wireless Communications

January 2019

For rate optimization in interference-limited networks, improper Gaussian signaling has shown its capability to outperform the conventional proper Gaussian signaling. In this work, we study a weighted sum-rate maximization problem with improper Gaussian signaling for the multiple-input multiple-output interference broadcast channel (MIMO-IBC). To solve this nonconvex and NP-hard problem, we propose an effective algorithm that optimizes the covariance and pseudo-covariance matrices separately. In the covariance optimization, a weighted minimum mean square error (WMMSE) algorithm is adopted, and, in the pseudo-covariance optimization, an alternating optimization (AO) algorithm is proposed, which guarantees convergence to a stationary solution and ensures a sum-rate improvement over proper Gaussian signaling. An alternating direction method of multipliers (ADMM)-based multi-agent distributed algorithm is proposed to solve an AO subproblem with the globally optimal solution in a parallel and scalable fashion. The proposed scheme exhibits favorable convergence, optimality, and complexity properties for future large-scale networks. Simulation results demonstrate the superior sum-rate performance of the proposed algorithm as compared to existing schemes with proper as well as improper Gaussian signaling under various network configurations.
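For readers unfamiliar with the terminology, the distinction between proper and improper signaling can be stated with the standard second-order statistics of a zero-mean complex random vector. The symbols below are notation chosen here for illustration and are not necessarily those used in the paper.

```latex
\mathbf{C}_x = \mathbb{E}\!\left[\mathbf{x}\,\mathbf{x}^{H}\right],
\qquad
\widetilde{\mathbf{C}}_x = \mathbb{E}\!\left[\mathbf{x}\,\mathbf{x}^{T}\right]
```

Here $\mathbf{C}_x$ is the covariance matrix and $\widetilde{\mathbf{C}}_x$ the pseudo-covariance matrix. The signal is proper when $\widetilde{\mathbf{C}}_x = \mathbf{0}$ and improper otherwise, so improper signaling has an additional matrix available for optimization.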

Yi-Wen Chen, Yi-Hsuan Tsai, Chu-Ya Yang, Yen-Yu Lin, and Ming-Hsuan Yang

Unseen Object Segmentation in Videos via Transferable Representations

Asian Conference on Computer Vision (ACCV)

December 2018

In order to learn object segmentation models in videos, conventional methods require a large amount of pixel-wise ground truth annotations. However, collecting such supervised data is time-consuming and labor-intensive. In this paper, we exploit existing annotations in source images and transfer such visual information to segment videos with unseen object categories. Without using any annotations in the target video, we propose a method to jointly mine useful segments and learn feature representations that better adapt to the target frames. The entire process is decomposed into two tasks: 1) solving a submodular function for selecting object-like segments, and 2) learning a CNN model with a transferable module for adapting seen categories in the source domain to the unseen target video. We present an iterative update scheme between the two tasks to self-learn the final solution for object segmentation. Experimental results on numerous benchmark datasets show that the proposed method performs favorably against the state-of-the-art algorithms.

Shang-Hong Hsu, Chi-Han Lin, Chih-Yu Wang, and Wen-Tsuen Chen

Breaking Bandwidth Limitation for Mission-Critical IoTs using Semi-Sequential Multiple Relays

IEEE Internet of Things Journal

October 2018

Most existing or currently developing Internet of Things (IoT) communication standards are based on the assumption that the IoT services only require low data rate transmission and therefore can be supported by limited resources such as narrow-band channels. This assumption rules out those IoT services with burst traffic, critical missions, and low latency requirements. In this paper, we propose to utilize the idle devices in mission-critical IoT networks to boost the transmission data rate for critical tasks through multiple concurrent transmissions. This approach virtually expands the existing narrow-band IoT protocols to break the bandwidth limitation in order to provide low latency services for critical tasks. In this approach, we propose the task-balance method and the first-link descending order to determine the relay order and data partition in a given relay set. We theoretically prove that the optimal relay configuration that minimizes the uploading latency in the single-source scenario can be derived by the proposed algorithms in polynomial time when a sufficient number of channels is available. We also propose a greedy algorithm to approximate the optimal solution within a 1/2 performance lower bound in general scenarios. The simulation results show that the proposed approach can reduce the latency of critical tasks by up to 76% compared with traditional approaches.

Yuan-Yao Shih, Pi-Cheng Hsiu, and Ai-Chun Pang

A Data Parasitizing Scheme for Effective Health Monitoring in Wireless Body Area Networks

IEEE Transactions on Mobile Computing

January 2019

Wireless body area networks (WBANs) have emerged recently to provide health monitoring for chronic patients. In a WBAN, the patient's smartphone is deemed an appropriate sink to help forward the sensing data to back-end servers. Through a real-world case study, we observe that temporary disconnection between sensors and the associated smartphone can happen frequently due to postural changes, causing a significant amount of data to be lost forever. In this paper, we propose a scheme to parasitize the data in surrounding Wi-Fi networks whenever temporary disconnection occurs. Specifically, we model data parasitizing as an optimization problem, with the objective of maximizing the system lifetime without any data loss. Then, we propose an optimal offline algorithm to solve the problem, as well as an online algorithm that allows practical implementations. We have also implemented a prototype system, where the online algorithm serves as the underlying technique, based on Arduino. To evaluate our scheme, we conduct a series of experiments with the prototype system in controlled and real-world environments. The results show that the lifetime is prolonged by 100 times, and it could be further doubled if the health monitoring application permits a few packet losses.

Bryan Wang and Yi-Hsuan Yang

PerformanceNet: Score-to-audio music generation with multi-band convolutional residual network

AAAI Conference on Artificial Intelligence

January 2019

Music creation is typically composed of two parts: composing the musical score, and then performing the score with instruments to make sounds. While recent work has made much progress in automatic music generation in the symbolic domain, few attempts have been made to build an AI model that can render realistic music audio from musical scores. Directly synthesizing audio with sound sample libraries often leads to mechanical and deadpan results, since musical scores do not contain performance-level information, such as subtle changes in timing and dynamics. Moreover, while the task may sound like a text-to-speech synthesis problem, there are fundamental differences since music audio has rich polyphonic sounds. To build such an AI performer, we propose in this paper a deep convolutional model that learns in an end-to-end manner the score-to-audio mapping between a symbolic representation of music called the pianoroll and an audio representation of music called the spectrogram. The model consists of two subnets: the ContourNet, which uses a U-Net structure to learn the correspondence between pianorolls and spectrograms and to give an initial result; and the TextureNet, which further uses a multi-band residual network to refine the result by adding the spectral texture of overtones and timbre. We train the model to generate music clips of the violin, cello, and flute, with a dataset of moderate size. We also present the result of a user study that shows our model achieves a higher mean opinion score (MOS) in naturalness and emotional expressivity than a WaveNet-based model and two off-the-shelf synthesizers. We open our source code at https://github.com/bwang514/PerformanceNet.
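To make the pianoroll representation concrete, the following minimal sketch builds a binary pitch-by-time matrix from a list of note events. The note format (MIDI pitch, start step, end step), the function name `notes_to_pianoroll`, and the 128-pitch range are assumptions for illustration and are not taken from the PerformanceNet code.

```python
from typing import List, Tuple

# Hypothetical note format: (MIDI pitch, start step, end step), end exclusive.
Note = Tuple[int, int, int]


def notes_to_pianoroll(
    notes: List[Note], num_steps: int, num_pitches: int = 128
) -> List[List[int]]:
    """Return a num_pitches x num_steps binary matrix; 1 marks a sounding pitch."""
    roll = [[0] * num_steps for _ in range(num_pitches)]
    for pitch, start, end in notes:
        # Clip each note to the time range covered by the roll.
        for step in range(max(start, 0), min(end, num_steps)):
            roll[pitch][step] = 1
    return roll


# Two overlapping notes: C4 (pitch 60) for steps 0-1, E4 (pitch 64) for steps 1-3.
roll = notes_to_pianoroll([(60, 0, 2), (64, 1, 4)], num_steps=4)
```

Because several rows can be active in the same time step, the matrix naturally encodes polyphony, which is one of the differences from text-to-speech input that the abstract points out.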

Juhan Nam, Keunwoo Choi, Jongpil Lee, Szu-Yu Chou, and Yi-Hsuan Yang

Deep learning for audio-based music classification and tagging

IEEE Signal Processing Magazine

January 2019

Over the last decade, music-streaming services have grown dramatically. Pandora, one company in the field, has pioneered and popularized streaming music by successfully deploying the Music Genome Project [1] (https://www.pandora.com/about/mgp) based on human-annotated content analysis. Another company, Spotify, has a catalog of over 40 million songs and over 180 million users as of mid-2018 (https://press.spotify.com/us/about/), making it a leading music service provider worldwide. Giant technology companies such as Apple, Google, and Amazon have also been strengthening their music service platforms. Furthermore, artificial intelligence speakers, such as Amazon Echo, are gaining popularity, providing listeners with a new and easily accessible way to listen to music.

Chun-Han Lin, Chih-Kai Kang, and Pi-Cheng Hsiu

Quality-Enhanced OLED Power Savings on Mobile Devices

ACM Transactions on Design Automation of Electronic Systems

January 2019

In the future, mobile systems will increasingly feature more advanced organic light-emitting diode (OLED) displays. The power consumption of these displays is highly dependent on the image content. However, existing OLED power-saving techniques either change the visual experience of users or degrade the visual quality of images in exchange for a reduction in the power consumption. Some techniques attempt to enhance the image quality by employing a compound objective function. In this paper, we present a win-win scheme that always enhances the image quality while simultaneously reducing the power consumption. We define metrics to assess the benefits and cost for potential image enhancement and power reduction. We then introduce algorithms that ensure the transformation of images into their quality-enhanced power-saving versions. Next, the win-win scheme is extended to process videos at a justifiable computational cost. All the proposed algorithms are shown to possess the win-win property without assuming accurate OLED power models. Finally, the proposed scheme is realized through a practical camera application and a video camcorder on mobile devices. The results of experiments conducted on a commercial tablet with a popular image database and on a smartphone with real-world videos are very encouraging and provide valuable insights for future research and practices.