:::
Deep neural network (DNN) inference on intermittently-powered battery-less devices has the potential to unlock new possibilities for sustainable and intelligent edge applications. Existing intermittent inference approaches preserve progress information separately from the computed output features during inference. However, we observe that even in highly specialized approaches, the additional overhead incurred for inference progress preservation still accounts for a significant portion of the inference latency. This work proposes the concept of stateful neural networks, which enables a DNN to indicate the inference progress itself. Our runtime middleware embeds state information into the DNN such that the computed and preserved output features intrinsically contain progress indicators, avoiding the need to preserve them separately. The specific position and representation of the embedded states jointly ensure that neither output features nor states are corrupted while maintaining model accuracy, and the embedded states allow the latest output feature to be determined, enabling correct inference recovery upon power resumption. Evaluations were conducted on different Texas Instruments devices under varied intermittent power strengths and network models. Compared to the state of the art, our approach speeds up intermittent inference by 1.3 to 5 times, achieving higher performance when executing modern convolutional networks under weaker power.
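To make the idea of output features that carry their own progress indicator concrete, the following is a minimal sketch, not the middleware's actual encoding. It assumes 16-bit quantized features, a single state bit stored in the least-significant position, and a state value that flips between passes over the same buffer; the names `STATE_BIT`, `embed_state`, and `find_resume_index` are hypothetical.

```python
from typing import List

# Assumption: the least-significant bit of each 16-bit feature holds the state.
STATE_BIT = 0x0001


def embed_state(feature: int, state: int) -> int:
    """Overwrite the state bit of a quantized 16-bit feature with `state` (0 or 1)."""
    return (feature & ~STATE_BIT & 0xFFFF) | (state & 1)


def find_resume_index(buffer: List[int], current_state: int) -> int:
    """Return the index of the first feature not yet written in the current pass.

    Features are written in order, so entries before the resume point carry
    `current_state`, while later entries still carry the previous pass's state.
    """
    for i, value in enumerate(buffer):
        if (value & STATE_BIT) != current_state:
            return i
    return len(buffer)


# Buffer left over from a previous pass that used state 0.
buffer = [embed_state(v, 0) for v in [10, 20, 30, 40]]
# The new pass uses state 1 and writes two features before power is lost.
buffer[0] = embed_state(111, 1)
buffer[1] = embed_state(222, 1)
resume_at = find_resume_index(buffer, 1)  # inference continues from this index
```

In this toy encoding, a single write stores both the feature and its progress marker, at the cost of perturbing each feature by at most one quantization step.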
Internet-of-Things (IoT) devices are gradually adopting battery-less, energy harvesting solutions, thereby driving the development of an intermittent computing paradigm to accumulate computation progress across multiple power cycles. While many attempts have been made to enable standalone intermittent systems, little attention has been paid to IoT networks formed by intermittent devices. We observe that the gains in computation progress from \textit{distributed task concurrency} in an intermittent network can be significantly offset by data unavailability due to frequent system failures. This paper presents an intermittent-aware distributed concurrency control protocol which leverages existing data copies inherently created in the network to improve the computation progress of concurrently executed tasks. In particular, we propose a borrowing-based data management method to increase data availability and an intermittent two-phase commit procedure incorporated with distributed backward validation to ensure data consistency in the network. The proposed protocol was integrated into a FreeRTOS-extended intermittent operating system running on Texas Instruments devices. Experimental results show that the computation progress can be significantly improved, and this improvement is more apparent under weaker power, where more devices remain offline for longer durations.
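The two building blocks named above can be illustrated in their generic textbook form. The sketch below is not the intermittent-aware protocol itself: it ignores offline devices and data borrowing, and the function names and data-item labels are hypothetical.

```python
from typing import Iterable, List, Set


def backward_validate(read_set: Set[str],
                      committed_write_sets: Iterable[Set[str]]) -> bool:
    """Pass if nothing the task read was overwritten by a task that committed meanwhile."""
    return all(read_set.isdisjoint(ws) for ws in committed_write_sets)


def two_phase_commit(votes: List[bool]) -> str:
    """Phase 1 collects one vote per participant; phase 2 commits only if all voted yes."""
    return "commit" if votes and all(votes) else "abort"


# A task read two items; another task committed a write to one of them meanwhile.
vote_a = backward_validate({"temp", "humidity"}, [{"temp"}])  # conflict, votes no
vote_b = backward_validate({"light"}, [{"temp"}])             # no conflict, votes yes
decision = two_phase_commit([vote_a, vote_b])
```

Here each participant's vote comes from validating its own read set, so a single conflict aborts the whole distributed commit.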
Mobile Edge Computing (MEC) is a promising technique in the 5G era for improving the Quality of Experience (QoE) of online video streaming, owing to its ability to reduce backhaul transmission by caching certain content. However, fully supporting the low-latency demand of live video streaming still requires addressing the user association and video quality selection problem under the limited resources of MEC. We find that this optimization problem is a non-linear integer program, for which a globally optimal solution cannot be obtained in polynomial time. In this paper, we formulate the problem and derive a closed-form solution in terms of Lagrangian multipliers; the search for the optimal variables is formulated as a Multi-Armed Bandit (MAB) problem, and we propose a Deep Deterministic Policy Gradient (DDPG) based algorithm that exploits the supply-demand interpretation of the Lagrange dual problem. Simulation results show that our proposed approach achieves significant QoE improvement over other baselines, especially in scenarios with scarce wireless resources and a large number of users.
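The supply-demand interpretation of the Lagrange dual can be sketched with a plain projected subgradient update, which is swapped in here for the DDPG-based search purely for illustration. The multiplier acts as a price on a shared resource: each user picks the option that maximizes utility minus priced cost, and the price rises when total demand exceeds supply. The option values and function names below are made up.

```python
from typing import List, Tuple

# Each option is (utility, resource_cost); the numbers below are illustrative only.
Option = Tuple[float, float]


def best_response(options: List[Option], price: float) -> Option:
    """A user picks the option with the largest utility minus priced resource cost."""
    return max(options, key=lambda o: o[0] - price * o[1])


def update_price(price: float, demand: float, supply: float, step: float) -> float:
    """Projected subgradient step on the dual: raise the price when demand exceeds supply."""
    return max(0.0, price + step * (demand - supply))


# Three hypothetical video quality levels with increasing utility and cost.
options = [(1.0, 1.0), (1.8, 2.0), (2.4, 4.0)]
cheap_choice = best_response(options, 0.0)   # resource is free: take the best quality
pricey_choice = best_response(options, 1.0)  # resource is expensive: take the lowest
new_price = update_price(0.0, demand=8.0, supply=6.0, step=0.1)
```

With discrete options the price may oscillate around the point where demand meets supply, which is one reason a learned search over the multipliers can be attractive.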
Beamforming is regarded as a promising technique for future wireless communication systems. In this regard, codebook-based beamforming offers satisfactory performance with acceptable computational complexity; however, it incurs a beam overhead with high power consumption. To balance the beam overhead against the achieved spectral efficiency, we propose a super-resolution-based scheme using a hierarchical codebook. We treat beam sweeping as an inference problem, where low-resolution beam radiating responses are the input and high-resolution beam sweeping responses are the output. Simulation results confirm that the proposed scheme exhibits extraordinary performance-overhead tradeoffs compared with state-of-the-art codebook-based beamforming designs.
Vehicle positioning is a key component of autonomous driving. The global positioning system (GPS) is currently the most commonly used vehicle positioning system. However, its accuracy is affected by environmental differences, so it fails to meet the requirement of meter-level accuracy. We consider a coordinate neighboring vehicle positioning system (CNVPS) based on GPS, omnidirectional radar, and vehicle-to-vehicle (V2V) communication, which obtains additional information from neighboring vehicles to improve the GPS positioning accuracy of vehicles in various environments. We further apply transfer learning (TL), designing an adversarial mechanism that eliminates the deviation between environments so that a single model can optimize vehicle positioning accuracy across multiple environments. The simulation results show that, compared with existing methods, the proposed system architecture not only improves performance but also effectively reduces the amount of data required for training.
Device-free wireless indoor localization is an essential technology for the Internet of Things (IoT), and fingerprint-based methods are widely used. A common challenge for fingerprint-based methods is data collection and labeling. This paper proposes a few-shot transfer learning system that uses only a small amount of labeled data from the current environment and reuses a large amount of existing labeled data previously collected in other environments, thereby significantly reducing the data collection and labeling cost for localization in each new environment. The core method lies in graph neural network (GNN) based few-shot transfer learning and its modifications. Experiments conducted in real-world environments show that the proposed system achieves performance comparable to that of a convolutional neural network (CNN) model while using 40 times less labeled data.
Millimeter wave (mmWave) is a key technology for fifth-generation (5G) and beyond communications. Hybrid beamforming has been proposed for large-scale antenna systems in mmWave communications. Existing hybrid beamforming designs based on infinite-resolution phase shifters (PSs) are impractical due to hardware cost and power consumption. In this paper, we propose an unsupervised-learning-based scheme to jointly design the analog precoder and combiner with low-resolution PSs for multiuser multiple-input multiple-output (MU-MIMO) systems. We transform the analog precoder and combiner design problem into a phase classification problem and propose a generic neural network architecture, termed the phase classification network (PCNet), capable of producing solutions of various PS resolutions. Simulation results demonstrate the superior sum-rate and complexity performance of the proposed scheme, as compared to state-of-the-art hybrid beamforming designs for the most commonly used low-resolution PS configurations.
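Casting phase selection as classification rests on the fact that a $b$-bit phase shifter can realize only $2^b$ phases, so each analog precoder or combiner entry amounts to a choice of class index. The sketch below shows that mapping with a simple nearest-phase rule; it is not PCNet, which is a neural network that predicts the classes. The function names and the $1/\sqrt{N}$ normalization are assumptions.

```python
import cmath
import math


def phase_set(bits: int) -> list:
    """The 2**bits phases a `bits`-bit phase shifter can realize, evenly spaced on [0, 2*pi)."""
    n = 2 ** bits
    return [2 * math.pi * k / n for k in range(n)]


def nearest_class(phase: float, bits: int) -> int:
    """Class index of the realizable phase closest to `phase` on the unit circle."""
    n = 2 ** bits
    return round(phase / (2 * math.pi) * n) % n


def analog_weight(class_index: int, bits: int, num_antennas: int) -> complex:
    """Constant-modulus entry for the chosen class (1/sqrt(N) normalization assumed)."""
    return cmath.exp(1j * phase_set(bits)[class_index]) / math.sqrt(num_antennas)


# A 2-bit phase shifter offers the phases 0, pi/2, pi, and 3*pi/2.
idx = nearest_class(math.pi / 2 + 0.1, bits=2)
weight = analog_weight(idx, bits=2, num_antennas=4)
```

Changing `bits` changes only the number of classes, which is consistent with a single architecture producing solutions for several PS resolutions.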
Device-free wireless indoor localization is a key enabling technology for the Internet of Things (IoT). Fingerprint-based indoor localization techniques are a commonly used solution. This paper proposes a semi-supervised, generative adversarial network (GAN)-based device-free fingerprinting indoor localization system. The proposed system uses a small amount of labeled data and a large amount of unlabeled data (i.e., semi-supervised), thus considerably reducing the expensive data labeling effort. Experimental results show that, compared to the state-of-the-art supervised scheme, the proposed semi-supervised system achieves comparable performance when both are given an equal, sufficient amount of labeled data, and significantly superior performance when both are given an equal, highly limited amount of labeled data. In addition, the proposed semi-supervised system retains its performance over a broad range of labeled data amounts. The interactions between the generator, discriminator, and classifier models of the proposed GAN-based system are visually examined and discussed. A mathematical description of the proposed system is also presented.
Current peripheral execution approaches for intermittently-powered systems require full access to the internal hardware state for checkpointing or rely on application-level energy estimation for task partitioning to make correct forward progress. Both requirements present significant practical challenges for energy-harvesting, intelligent edge IoT devices that perform hardware-accelerated DNN inference. Sophisticated compute peripherals may have inaccessible internal state, and the complexity of DNN models makes it difficult for programmers to partition the application into suitably sized tasks that fit within an estimated energy budget. This paper presents the concept of inference footprinting for intermittent DNN inference, where accelerator progress is cumulatively preserved across power cycles. Our middleware stack, HAWAII, tracks and restores inference footprints efficiently and transparently to make inference forward progress, without requiring access to the accelerator's internal state or application-level energy estimation. Evaluations were carried out on a Texas Instruments device under varied energy budgets and network workloads. Compared to a variety of task-based intermittent approaches, HAWAII improves the inference throughput by 5.7% to 95.7%, achieving particularly high gains on heavily accelerated DNNs.
Spatiotemporal super-resolution (SR) aims to upscale both the spatial and temporal dimensions of input videos, producing videos with higher frame resolutions and frame rates. It involves two essential sub-tasks: spatial SR and temporal SR. In this work, we design a two-stream network for spatiotemporal SR. One stream contains a temporal SR module followed by a spatial SR module, while the other stream has the same two modules in the reverse order. Because the two sub-tasks can be performed in either order, the two network streams should produce consistent spatiotemporal SR results. We therefore present a cross-stream consistency to enforce the similarity between the outputs of the two streams. In this way, the training of the two streams is correlated, which allows the two SR modules to share their supervisory signals and improve each other. In addition, the proposed cross-stream consistency does not consume labeled training data and can guide network training in an unsupervised manner. We leverage this property to carry out semi-supervised spatiotemporal SR. Our method makes the most of the training data and can derive an effective model from only a few high-resolution, high-frame-rate videos, achieving state-of-the-art performance.
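The cross-stream consistency can be illustrated with toy stand-ins for the two SR modules. In the sketch below, frame averaging plays the role of temporal SR and nearest-neighbour upscaling plays the role of spatial SR; both are assumptions chosen for simplicity, as is the mean-absolute-difference form of the loss. Applying the modules in either order and comparing the outputs yields a training signal that needs no ground-truth video.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel values
Video = List[Frame]


def temporal_sr(video: Video) -> Video:
    """Toy temporal SR: insert the average of each pair of neighbouring frames."""
    out = [video[0]]
    for prev, nxt in zip(video, video[1:]):
        mid = [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(prev, nxt)]
        out += [mid, nxt]
    return out


def spatial_sr(video: Video) -> Video:
    """Toy spatial SR: 2x nearest-neighbour upscaling of every frame."""
    def up(frame: Frame) -> Frame:
        rows = []
        for row in frame:
            wide = [p for p in row for _ in range(2)]  # duplicate each pixel
            rows += [wide, list(wide)]                 # duplicate each row
        return rows
    return [up(f) for f in video]


def consistency_loss(a: Video, b: Video) -> float:
    """Mean absolute difference between the two streams' outputs."""
    diffs = [abs(x - y)
             for fa, fb in zip(a, b)
             for ra, rb in zip(fa, fb)
             for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


video = [[[0.0, 1.0], [2.0, 3.0]], [[4.0, 5.0], [6.0, 7.0]]]
stream_a = spatial_sr(temporal_sr(video))  # temporal first, then spatial
stream_b = temporal_sr(spatial_sr(video))  # spatial first, then temporal
loss = consistency_loss(stream_a, stream_b)
```

These toy operators commute exactly, so the loss is zero; with learned modules the two streams generally disagree, and minimizing the disagreement is what couples their training.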