Electrophoretic displays are ideal for self-powered systems, but currently require an uninterrupted power supply to carry out the full display update cycle. Although sensible for battery-powered devices, when directly applied to intermittently-powered systems, guaranteeing display update atomicity usually results in repeated execution until completion or can incur high hardware/software overheads, heavy programmer intervention and large energy buffering requirements to provide sufficient display update energy. This paper introduces the concept, design and implementation of accumulative display updating, which relaxes the atomicity constraints of display updating, such that the display update process can be accumulatively completed across power cycles, without the need for sufficient energy for the entire display update. To allow for process logical continuity, we track the update progress during execution and facilitate a safe display shutdown procedure to overcome physical and operability issues related to abrupt power failure. Additionally, a context-aware updating policy is proposed to handle data freshness issues, where the delay in addressing new update requests can cause the display contents to be in conflict with new data available. Experimental results on a Texas Instruments device with an integrated electrophoretic display show that, compared to atomic display updating, our design can significantly increase accurate forward progress, decrease the average response time of display updating and reduce time and energy wastage when displaying fresh data.
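The difference between atomic and accumulative updating can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the display update is assumed to split into equal-cost steps, `NonvolatileState` is a hypothetical stand-in for nonvolatile memory that holds the progress counter, and the safe shutdown procedure and context-aware policy from the abstract are not modeled.

```python
from dataclasses import dataclass


@dataclass
class NonvolatileState:
    """Hypothetical stand-in for nonvolatile memory: survives power failures."""
    next_step: int = 0


def atomic_update(total_steps, energy_budget, cost_per_step=1):
    """Atomic updating succeeds only if one power cycle covers every step."""
    return energy_budget >= total_steps * cost_per_step


def accumulative_update(nv, total_steps, energy_budget, cost_per_step=1):
    """Run as many steps as the budget allows; return True once all are done."""
    while nv.next_step < total_steps:
        if energy_budget < cost_per_step:
            return False  # simulated power failure: progress is kept in nv
        energy_budget -= cost_per_step
        nv.next_step += 1  # commit progress after each completed step
    return True


def power_cycles_needed(total_steps, budget_per_cycle, cost_per_step=1):
    """Count the power cycles an accumulative update takes to finish."""
    if budget_per_cycle < cost_per_step:
        raise ValueError("budget per cycle cannot pay for even one step")
    nv = NonvolatileState()
    cycles = 0
    while True:
        cycles += 1
        if accumulative_update(nv, total_steps, budget_per_cycle, cost_per_step):
            return cycles
```

With a 10-step update and enough energy for 3 steps per power cycle, the atomic variant never completes, whereas the accumulative variant finishes in the fourth cycle.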
Device-free Wi-Fi indoor localization has received significant attention as a key enabling technology for many Internet of Things (IoT) applications. Machine learning-based location estimators, such as the deep neural network (DNN), carry proven potential in achieving high-precision localization performance by automatically learning discriminative features from the noisy wireless signal measurements. However, the inner workings of DNNs are not transparent and not adequately understood, especially in the indoor localization application. In this paper, we provide quantitative and visual explanations for the DNN learning process as well as the critical features that the DNN has learned during the process. Toward this end, we propose to use several visualization techniques, including: 1) dimensionality reduction visualization, to project the high-dimensional feature space to the 2D space to facilitate visualization and interpretation, and 2) visual analytics and information visualization, to quantify relative contributions of each feature with the proposed feature manipulation procedures. The results provide insightful views and plausible explanations of the DNN in device-free Wi-Fi indoor localization using channel state information (CSI) fingerprints.
Self-powered intermittent systems waste considerable I/O energy because volatile I/O modules repeatedly issue identical operations under power failure conditions, and also due to the use of the inefficient I/O stack originally developed for battery-powered systems. This paper presents the concept, design, and implementation of autonomous I/O, which can accumulatively and transparently complete I/O operations regardless of power stability. We define its two essential functionalities, separate the general I/O stack to make accumulatively-completed I/O operations transparent to application tasks, and propose an access protocol that allows for energy efficiency and compatibility with the general I/O stack. To evaluate the efficacy, we implement our design and conduct extensive experiments on a Texas Instruments device with commodity sensor and Wi-Fi modules. Experimental results show that autonomous I/O can achieve 1.8 times the throughput achieved with nonvolatile I/O when the power is relatively steady, while reducing the completion time of individual I/O operations by at least 34% with relatively unstable power.
Image deblurring aims to restore the latent sharp images from the corresponding blurred ones. In this paper, we present an unsupervised method for domain-specific single-image deblurring based on disentangled representations. The disentanglement is achieved by splitting the content and blur features in a blurred image using content encoders and blur encoders. We enforce a KL divergence loss to regularize the distribution range of extracted blur attributes such that little content information is contained. Meanwhile, to handle the unpaired training data, a blurring branch and the cycle-consistency loss are added to guarantee that the content structures of the deblurred results match the original images. We also add an adversarial loss on deblurred results to generate visually realistic images and a perceptual loss to further mitigate the artifacts. We perform extensive experiments on the tasks of face and text deblurring using both synthetic datasets and real images, and achieve improved results compared to recent state-of-the-art deblurring methods.
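The KL divergence regularizer on the blur attributes can be sketched as follows. The abstract does not state the prior or the parameterization, so this sketch assumes the common choice of a diagonal Gaussian posterior regularized toward a standard normal prior; the function name and the `(mu, log_var)` inputs are illustrative, not taken from the paper.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one blur feature vector.

    Closed form per dimension: 0.5 * (mu^2 + sigma^2 - log(sigma^2) - 1).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )
```

The loss is zero exactly when the encoded blur distribution already matches the prior, and it grows as the mean drifts away from zero or the variance away from one. Limiting the blur code in this way leaves it less capacity to carry content information.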
Graphics-intensive mobile games place different and varying levels of demand on the associated CPUs and GPUs. In contrast to the workload variability that characterizes games, the current design of the energy governor employed by mobile systems appears to be outdated. In this work, we review the energy-saving mechanism implemented in an Android system coupled with graphics-intensive gaming workloads from three perspectives: user perception, application status, and the interplay between the CPU and GPU. We observe that there are information gaps in the current system, which may result in unnecessary energy wastage. To resolve the problem, we propose an online user-centric CPU-GPU governing framework. To bridge the identified information gaps, we classify rendered game frames into redundant/changing frames to satisfy user demand, categorize an application into GPU-sensitive/insensitive phases to understand the application’s demand, and determine the frequency scaling intents of the CPU and GPU to capture processor demand. In response to the measured demand, we employ a required workload estimator, a unified policy selector, and a frequency-scaling intent communicator in the framework to save energy. The proposed framework was implemented on an LG Nexus 5X smartphone, and extensive experiments with real-world 3D gaming applications were conducted. According to the experimental results, for an application with low interactivity and infrequent phase changes, the proposed framework reduces energy consumption by 25.3% and 39% compared with our previous work and the Android governors, respectively, while maintaining user experience.
Vehicular fog computing (VFC) is a promising approach to provide ultra-low-latency service to vehicles and end users by extending fog computing to the conventional vehicular networks. Parked vehicle assistance (PVA), as a critical technique in VFC, can be integrated with smart parking in order to exploit its full potential. In this paper, we propose a smart VFC system that combines PVA and smart parking. A VFC-aware parking reservation auction is proposed to guide the on-the-move vehicles to the available parking places with less effort and meanwhile exploit the fog capability of parked vehicles to assist the delay-sensitive computing services by monetary rewards to compensate for their service cost. The proposed allocation rule maximizes the aggregate utility of the smart vehicles and the proposed payment rule guarantees incentive compatibility, individual rationality, and budget balance. We further provide an observation stage with dynamic offload pricing update to improve the offload efficiency and the profit of the fog system. The simulation results confirm the win–win performance enhancement to the fog node controller, the smart vehicles, and the parking places from the proposed design.
Covariates are factors that have a debilitating influence on face verification performance. In this paper, we comprehensively study two covariate-related problems for unconstrained face verification: first, how covariates affect the performance of deep neural networks on the large-scale unconstrained face verification problem; second, how to utilize covariates to improve verification performance. To study the first problem, we implement five state-of-the-art deep convolutional networks and evaluate them on three challenging covariate datasets. In total, seven covariates are considered: pose (yaw and roll), age, facial hair, gender, indoor/outdoor, occlusion (nose and mouth visibility, and forehead visibility), and skin tone. We first report the performance of each individual network on the overall protocol and use the score-level fusion method to analyze each covariate. Some of the results confirm and extend the findings of previous studies, and others are new findings that were rarely mentioned previously or did not show consistent trends. For the second problem, we demonstrate that with the assistance of gender information, the quality of a precurated noisy large-scale face dataset for face recognition can be further improved. After retraining the face recognition model using the curated data, performance improvement is observed at low false acceptance rates.
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another (e.g., synthetic to real images). The adapted representations often do not capture pixel-level domain shifts that are crucial for dense prediction tasks (e.g., semantic segmentation). In this paper, we present a novel pixel-wise adversarial domain adaptation algorithm. By leveraging image-to-image translation methods as a data augmentation technique, our key insight is that while the translated images between domains may differ in styles, their predictions for the task should be exactly the same. We exploit this property and introduce a cross-domain consistency loss that enforces our adapted model to produce consistent predictions. We validate our method on a wide variety of unsupervised domain adaptation tasks, including synthetic-to-real semantic segmentation, optical flow estimation, and depth prediction as well as adapting semantic segmentation models across different cities. Extensive experimental results demonstrate that the proposed approach performs favorably against the state of the art.
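One way to express a cross-domain consistency loss for a classification-style dense task is sketched below. The abstract does not give the exact form of the loss, so the symmetric KL divergence between per-pixel class distributions is an assumption made here for illustration. All function names are hypothetical, and the inputs are plain lists of per-pixel logit vectors predicted for an image and for its translated counterpart.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_original, logits_translated):
    """Mean symmetric KL between predictions on an image and its translation.

    Each argument is a list with one logit vector per pixel.
    """
    total = 0.0
    for la, lb in zip(logits_original, logits_translated):
        p, q = softmax(la), softmax(lb)
        total += kl_divergence(p, q) + kl_divergence(q, p)
    return total / len(logits_original)
```

The loss vanishes when the two predictions agree at every pixel and is positive otherwise, so minimizing it pushes the model toward style-invariant predictions.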
In this paper, we address a new task called instance co-segmentation. Given a set of images jointly covering object instances of a specific category, instance co-segmentation aims to identify all of these instances and segment each of them, i.e., generating one mask for each instance. This task is important since instance-level segmentation is preferable for humans and many vision applications. It is also challenging because no pixel-wise annotated training data are available and the number of instances in each image is unknown. We solve this task by dividing it into two sub-tasks, co-peak search and instance mask segmentation. In the former sub-task, we develop a CNN-based network to detect the co-peaks as well as co-saliency maps for a pair of images. A co-peak has two endpoints, one in each image, that are local maxima in the response maps and similar to each other. Thereby, the two endpoints are potentially covered by a pair of instances of the same category. In the latter sub-task, we design a ranking function that takes the detected co-peaks and co-saliency maps as inputs and can select the object proposals to produce the final results. Our method for instance co-segmentation and its variant for object colocalization are evaluated on four datasets, and achieve favorable performance against the state-of-the-art methods. The source code and the collected datasets are available at https://github.com/KuangJuiHsu/DeepCO3/.
This paper proposes a method for head pose estimation from a single image. Previous methods often predict head poses through landmark or depth estimation and would require more computation than necessary. Our method is based on regression and feature aggregation. To obtain a compact model, we employ the soft stagewise regression scheme. Existing feature aggregation methods treat inputs as a bag of features and thus ignore their spatial relationship in a feature map. We propose to learn a fine-grained structure mapping for spatially grouping features before aggregation. The fine-grained structure provides part-based information and pooled values. By utilizing learnable and non-learnable importance over the spatial location, different variant models as a complementary ensemble can be generated. Experiments show that our method outperforms the state-of-the-art methods including both the landmark-free ones and the ones based on landmark or depth estimation. Based on a single RGB frame as input, our method even outperforms methods utilizing multi-modality information (RGB-D, RGB-Time) on estimating the yaw angle. Furthermore, the memory overhead of the proposed model is 100 times smaller than that of previous methods.