In this paper, multi-stream transmission in interference networks aided by multiple amplify-and-forward (AF) relays in the presence of direct links is considered. The objective is to minimize the sum power of the transmitters and relays through beamforming optimization under per-stream signal-to-interference-plus-noise-ratio (SINR) constraints. For transmit beamforming optimization, the problem is a well-known non-convex quadratically constrained quadratic program (QCQP) that is NP-hard. After semidefinite relaxation (SDR), the problem can be solved optimally via the alternating direction method of multipliers (ADMM) for distributed implementation. Analytical and extensive numerical analyses demonstrate that the proposed ADMM solution converges to the optimal centralized solution. The proposed algorithm outperforms existing solutions in convergence rate, computational complexity, and message exchange load. Furthermore, by approximating the SINR at the relay side, a distributed joint transmit and relay beamforming optimization is also proposed, which further improves the total power saving at the cost of increased complexity.
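To make the distributed structure that ADMM enables after relaxation concrete, the following is a minimal consensus-ADMM sketch, not the paper's beamforming subproblem: each node minimizes a purely illustrative local quadratic cost plus the augmented-Lagrangian term, a global averaging step enforces consensus, and scaled dual variables accumulate the disagreement. The local data `a` and penalty `rho` are hypothetical.

```python
import numpy as np

# Illustrative consensus-ADMM loop: each node i holds a local cost
# f_i(x) = 0.5 * (x - a_i)**2 and all nodes must agree on a common x.
# In the paper, the local subproblems would instead be the per-node
# SDR beamforming problems.
a = np.array([1.0, 3.0, -2.0, 4.0])   # hypothetical local data
rho = 1.0                              # ADMM penalty parameter
x = np.zeros_like(a)                   # local copies
z = 0.0                                # global consensus variable
u = np.zeros_like(a)                   # scaled dual variables

for _ in range(100):
    # local update: argmin_x 0.5*(x - a_i)^2 + (rho/2)*(x - z + u_i)^2
    x = (a + rho * (z - u)) / (1.0 + rho)
    # global update: average of local variables (consensus step)
    z = np.mean(x + u)
    # dual update: accumulate the consensus violation
    u = u + x - z

print("consensus value:", z, "expected:", a.mean())
```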
Video frame interpolation algorithms predict intermediate frames to produce videos with higher frame rates and smooth view transitions, given two consecutive frames as inputs. We posit that synthesized frames are more reliable if they can be used to reconstruct the input frames with high quality. Based on this idea, we introduce a new loss term, the cycle consistency loss. This loss makes better use of the training data: it not only enhances the interpolation results but also better maintains performance when less training data is available. It can be integrated into any frame interpolation network and trained in an end-to-end manner. In addition to the cycle consistency loss, we propose two extensions: a motion linearity loss and edge-guided training. The motion linearity loss assumes the motion between the two input frames to be approximately linear and regularizes the training. Edge-guided training further improves the results by integrating edge information into the training process. Both qualitative and quantitative experiments demonstrate that our model outperforms state-of-the-art methods. The source code of the proposed method and more experimental results will be available at https://github.com/alex04072000/CyclicGen.
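As a minimal sketch of one plausible instantiation of the cycle consistency idea on a training triplet (I0, I1, I2): the same network synthesizes the two half-step frames, re-interpolates between them, and the result is penalized against the real middle frame. The callable `interp_net` and the L1 penalty are assumptions for illustration, not necessarily the exact formulation used in the paper.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(interp_net, I0, I1, I2):
    """Illustrative cycle consistency loss on a frame triplet
    (I0, I1, I2), where I1 is the real middle frame. interp_net(a, b)
    is assumed to predict the frame halfway between its two inputs.
    """
    I_05 = interp_net(I0, I1)        # synthesized frame at t = 0.5
    I_15 = interp_net(I1, I2)        # synthesized frame at t = 1.5
    I1_rec = interp_net(I_05, I_15)  # re-interpolation should land on I1
    # synthesized frames are "reliable" if they can reconstruct the input
    return F.l1_loss(I1_rec, I1)
```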
Customer reviews on platforms such as TripAdvisor and Amazon provide rich information about how people convey sentiment in certain domains. Given these user reviews, this paper proposes UGSD, a representation learning framework for constructing domain-specific sentiment dictionaries from online customer reviews, in which we leverage the relationship between user-generated reviews and their ratings to associate reviewer sentiment with certain entities. The proposed framework has three main advantages. First, no additional word annotations or external dictionaries are needed; the only resources required are the review texts and entity ratings. Second, the framework is applicable to a variety of user-generated content from different domains for constructing domain-specific sentiment dictionaries. Finally, each word in the constructed dictionary is associated with a low-dimensional dense representation and a degree of relatedness to a certain rating, which enables us to obtain more fine-grained dictionaries and enhances the scalability of the constructed dictionaries across applications, since the word representations can be adopted for various tasks or applications, such as entity ranking and dictionary expansion. Experimental results on three real-world datasets show that the framework is effective in constructing high-quality domain-specific sentiment dictionaries from customer reviews.
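For illustration only (this is not the UGSD training procedure, which learns the representations jointly from reviews and ratings), the sketch below shows how a trained word embedding and one embedding per rating level could yield a per-word degree of relatedness via cosine similarity; the embedding dimension and random vectors are placeholders.

```python
import numpy as np

def relatedness(word_vec, rating_vecs):
    """Illustrative scoring step: given a word embedding and one
    embedding per rating level (e.g. 1-5 stars), return the word's
    degree of relatedness to each rating via cosine similarity.
    """
    w = word_vec / np.linalg.norm(word_vec)
    R = rating_vecs / np.linalg.norm(rating_vecs, axis=1, keepdims=True)
    return R @ w   # one score per rating level

# hypothetical 8-dimensional embeddings for a word and 5 rating levels
rng = np.random.default_rng(0)
scores = relatedness(rng.normal(size=8), rng.normal(size=(5, 8)))
best_rating = int(np.argmax(scores)) + 1   # rating the word relates to most
```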
To learn object segmentation models in videos, conventional methods require a large amount of pixel-wise ground truth annotations. However, collecting such supervised data is time-consuming and labor-intensive. In this paper, we exploit existing annotations in source images and transfer such visual information to segment videos with unseen object categories. Without using any annotations in the target video, we propose a method to jointly mine useful segments and learn feature representations that better adapt to the target frames. The entire process is decomposed into two tasks: 1) solving a submodular function for selecting object-like segments, and 2) learning a CNN model with a transferable module for adapting seen categories in the source domain to the unseen target video. We present an iterative update scheme between the two tasks to self-learn the final solution for object segmentation. Experimental results on numerous benchmark datasets show that the proposed method performs favorably against state-of-the-art algorithms.
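As a minimal sketch of the kind of routine used for the first task, the following is the standard greedy procedure for maximizing a monotone submodular set objective under a cardinality budget; the callable `objective` is a placeholder for the paper's actual segment-selection objective, which is not specified here.

```python
def greedy_select(candidates, objective, budget):
    """Generic greedy maximization of a (monotone, submodular) set
    objective under a cardinality budget: repeatedly add the segment
    with the largest marginal gain. `objective` maps a set of
    candidate segments to a score.
    """
    selected = set()
    remaining = set(candidates)
    while remaining and len(selected) < budget:
        base = objective(selected)
        best = max(remaining, key=lambda s: objective(selected | {s}) - base)
        if objective(selected | {best}) - base <= 0:
            break                      # no remaining segment helps
        selected.add(best)
        remaining.remove(best)
    return selected
```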
Most existing or currently developing Internet of Things (IoT) communication standards assume that IoT services require only low-data-rate transmission and can therefore be supported by limited resources such as narrow-band channels. This assumption rules out IoT services with bursty traffic, critical missions, and low-latency requirements. In this paper, we propose to utilize idle devices in mission-critical IoT networks to boost the transmission data rate for critical tasks through multiple concurrent transmissions. This approach virtually expands existing narrow-band IoT protocols to break the bandwidth limitation and provide low-latency service for critical tasks. Within this approach, we propose the task-balance method and the first-link descending order to determine the relay order and data partition for a given relay set. We theoretically prove that the optimal relay configuration minimizing the uploading latency in the single-source scenario can be derived by the proposed algorithms in polynomial time when a sufficient number of channels is available. We also propose a greedy algorithm that approximates the optimal solution with a 1/2 performance lower bound in general scenarios. Simulation results show that the proposed approach can reduce the latency of critical tasks by up to 76% compared with traditional approaches.
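To make the intuition behind balancing a task across concurrent relays concrete, here is a deliberately simplified, single-hop sketch; it ignores the two-hop relay ordering and channel assignment the paper actually optimizes. Splitting the payload in proportion to each relay's effective rate equalizes the finish times, and the resulting latency is the payload divided by the aggregate rate. All numbers below are hypothetical.

```python
def balanced_partition(payload_bits, relay_rates_bps):
    """Simplified illustration of balancing a critical task across
    concurrent relays: give each relay a share proportional to its
    effective rate so that all relays finish at the same time.
    """
    total_rate = sum(relay_rates_bps)
    shares = [payload_bits * r / total_rate for r in relay_rates_bps]
    latency = payload_bits / total_rate      # common finish time (seconds)
    return shares, latency

# example: a 2 Mb critical task split over three idle devices
shares, latency = balanced_partition(2e6, [250e3, 500e3, 1e6])
```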
Wireless body area networks (WBANs) have recently emerged to provide health monitoring for chronic patients. In a WBAN, the patient's smartphone is a natural sink that helps forward the sensing data to back-end servers. Through a real-world case study, we observe that temporary disconnections between the sensors and the associated smartphone can happen frequently due to postural changes, causing a significant amount of data to be lost forever. In this paper, we propose a scheme that parasitizes the data in surrounding Wi-Fi networks whenever a temporary disconnection occurs. Specifically, we model data parasitizing as an optimization problem with the objective of maximizing the system lifetime without any data loss. We then propose an optimal offline algorithm to solve the problem, as well as an online algorithm that allows practical implementation. We have also implemented an Arduino-based prototype system in which the online algorithm serves as the underlying technique. To evaluate our scheme, we conduct a series of experiments with the prototype in controlled and real-world environments. The results show that the lifetime is prolonged by a factor of 100, and it could be further doubled if the health monitoring application permits a few packet losses.
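For illustration only, a minimal buffer-threshold policy captures the basic decision the sensor faces; the paper's actual online algorithm is derived from its lifetime-maximization model and is more involved. The function name, threshold, and action labels below are assumptions.

```python
def next_action(buffer_level, buffer_capacity,
                phone_reachable, wifi_reachable, threshold=0.9):
    """Hypothetical decision rule: prefer the smartphone sink; otherwise
    keep buffering; once the buffer nears capacity, parasitize the data
    in a reachable Wi-Fi network instead of letting it be lost.
    """
    if phone_reachable:
        return "forward_to_phone"
    if buffer_level >= threshold * buffer_capacity and wifi_reachable:
        return "parasitize_to_wifi"
    return "buffer_locally"
```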
Over the last decade, music-streaming services have grown dramatically. Pandora, one company in the field, has pioneered and popularized streaming music by successfully deploying the Music Genome Project [1] (https://www.pandora.com/about/mgp) based on human-annotated content analysis. Another company, Spotify, has a catalog of over 40 million songs and over 180 million users as of mid-2018 (https://press.spotify.com/us/about/), making it a leading music service provider worldwide. Giant technology companies such as Apple, Google, and Amazon have also been strengthening their music service platforms. Furthermore, artificial intelligence speakers, such as Amazon Echo, are gaining popularity, providing listeners with a new and easily accessible way to listen to music.
Music creation typically consists of two parts: composing the musical score, and then performing the score with instruments to produce sound. While recent work has made much progress in automatic music generation in the symbolic domain, few attempts have been made to build an AI model that can render realistic music audio from musical scores. Directly synthesizing audio with sound sample libraries often leads to mechanical and deadpan results, since musical scores do not contain performance-level information, such as subtle changes in timing and dynamics. Moreover, while the task may sound like a text-to-speech synthesis problem, there are fundamental differences, since music audio has rich polyphonic sounds. To build such an AI performer, we propose in this paper a deep convolutional model that learns, in an end-to-end manner, the score-to-audio mapping between a symbolic representation of music called the pianoroll and an audio representation of music called the spectrogram. The model consists of two subnets: the ContourNet, which uses a U-Net structure to learn the correspondence between pianorolls and spectrograms and produces an initial result; and the TextureNet, which further uses a multi-band residual network to refine the result by adding the spectral texture of overtones and timbre. We train the model to generate music clips of the violin, cello, and flute with a dataset of moderate size. We also present the results of a user study showing that our model achieves a higher mean opinion score (MOS) in naturalness and emotional expressivity than a WaveNet-based model and two off-the-shelf synthesizers. We open our source code at https://github.com/bwang514/PerformanceNet.
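As a heavily simplified sketch of the score-to-audio mapping idea (not the actual ContourNet/TextureNet, which are much deeper and trained on paired data), the toy module below encodes a pianoroll along the time axis, decodes it back, concatenates the input through a U-Net-style skip connection, and projects to spectrogram frequency bins. Layer sizes and pitch/frequency dimensions are assumptions.

```python
import torch
import torch.nn as nn

class TinyContourNet(nn.Module):
    """Toy score-to-audio mapper: pianoroll (pitch x time) in,
    magnitude spectrogram (frequency x time) out."""
    def __init__(self, n_pitch=128, n_freq=1024):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv1d(n_pitch, 256, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.up = nn.Sequential(
            nn.ConvTranspose1d(256, 256, kernel_size=2, stride=2),
            nn.ReLU(),
        )
        # skip connection: concatenate the input pianoroll back in
        self.out = nn.Conv1d(256 + n_pitch, n_freq, kernel_size=3, padding=1)

    def forward(self, pianoroll):            # (batch, n_pitch, time)
        h = self.up(self.down(pianoroll))    # encode then decode over time
        h = torch.cat([h, pianoroll], dim=1) # U-Net-style skip connection
        return self.out(h)                   # (batch, n_freq, time)

# example: a batch of 4 pianoroll clips, 256 time steps each
spec = TinyContourNet()(torch.rand(4, 128, 256))
```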
In the future, mobile systems will increasingly feature more advanced organic light-emitting diode (OLED) displays. The power consumption of these displays depends heavily on the image content. However, existing OLED power-saving techniques either change the visual experience of users or degrade the visual quality of images in exchange for reduced power consumption. Some techniques attempt to enhance the image quality by employing a compound objective function. In this paper, we present a win-win scheme that always enhances the image quality while simultaneously reducing the power consumption. We define metrics to assess the benefit and cost of potential image enhancement and power reduction. We then introduce algorithms that ensure the transformation of images into their quality-enhanced, power-saving versions. Next, the win-win scheme is extended to process videos at a justifiable computational cost. All the proposed algorithms are shown to possess the win-win property without assuming accurate OLED power models. Finally, the proposed scheme is realized in a practical camera application and a video camcorder on mobile devices. The results of experiments conducted on a commercial tablet with a popular image database and on a smartphone with real-world videos are very encouraging and provide valuable insights for future research and practice.
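To ground why OLED power is content-dependent, the sketch below uses the commonly cited linear approximation in which each subpixel consumes power roughly in proportion to its drive level (blue typically being the most expensive), plus a content-independent static term. This is an illustrative model with hypothetical weights, not the measured power model discussed in the paper.

```python
import numpy as np

def oled_power_estimate(rgb, w=(0.6, 0.8, 1.0), static=0.1):
    """Rough linear OLED power approximation (illustrative): weighted
    sum of mean R, G, B drive levels plus a static term. `rgb` is an
    HxWx3 array with values in [0, 1]; the weights are hypothetical.
    """
    per_channel = rgb.reshape(-1, 3).mean(axis=0)   # mean R, G, B levels
    return float(np.dot(per_channel, w) + static)   # relative power units

# a darker, power-saving version of an image should score lower
img = np.random.rand(480, 640, 3)
print(oled_power_estimate(img), oled_power_estimate(img * 0.8))
```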
Real-time computing provides insightful ways to explore optimization in resource usage, especially from the perspective of time. Nevertheless, real-time task scheduling is known for its high complexity when there are non-preemptive shared resources and multiple processors. When more practical factors in system design are considered, such as energy consumption and memory allocation, even some sub-problems of real-time task scheduling become intractable. Although the artificial assumptions made in real-time task scheduling are often criticized, the ideas of real-time computing and their extensions, such as trading off cost, performance, energy, and even quality of service, can be applied to multi-dimensional optimization in system design. In this direction, we have witnessed the rapid development of the embedded system industry and joined the task force in system design, especially for mobile devices and non-volatile memory systems. Resource management on mobile devices, with a special emphasis on user experience, should consider not only the response time but also the visual perception of users. Non-volatile memory has also blurred the boundary between memory and storage: it enables unified treatment of main memory and storage as well as in-memory computing, and it shows ways to break the boundaries between hardware and software layers and to better integrate computing and memory/storage units. The advances in mobile systems and memory innovations inspire the evolution of embedded system design and have brought us insights into how systems should be restructured and how computing should be done. They might also feed back into real-time computing and even shape its future direction in various innovative ways.