Zixiao ZHANG, Fujun HE, Eiji OKI
This paper introduces a deep reinforcement learning approach to solve the virtual network function scheduling problem in dynamic scenarios. We first formulate an integer linear programming model for the problem in static scenarios. For dynamic scenarios, we define the state, action, and reward to form the learning approach. The learning agents apply the asynchronous advantage actor-critic (A3C) algorithm. We assign a master agent and several worker agents to each network function virtualization node in the problem; the worker agents work in parallel to help the master agent make decisions. We compare the introduced approach with existing approaches, including three greedy approaches, a simulated annealing approach, and an integer linear programming approach, by applying them in simulated environments. The numerical results show that the introduced deep reinforcement learning approach improves performance by 6-27% in the examined cases.
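As a rough illustration of the A3C master/worker pattern this abstract describes, the sketch below trains a linear softmax policy (actor) and a linear value function (critic): workers sync with the master, collect a short rollout, accumulate advantage-weighted gradients locally, and the master applies them. The state size, action set, toy reward, and sequential worker loop are all illustrative assumptions; the paper's actual environment, agent architecture, and parallel execution are not reproduced here.

```python
import numpy as np

N_STATE, N_ACTION, GAMMA, LR = 4, 3, 0.99, 0.01  # assumed toy dimensions

class Master:
    """Shared parameters that worker agents read from and write to."""
    def __init__(self):
        self.theta = np.zeros((N_STATE, N_ACTION))  # policy (actor) weights
        self.w = np.zeros(N_STATE)                  # value (critic) weights
    def apply(self, d_theta, d_w):                  # gradient-ascent step
        self.theta += LR * d_theta
        self.w += LR * d_w

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def worker_update(master, rng):
    """One worker rollout: accumulate actor/critic gradients locally."""
    theta, w = master.theta.copy(), master.w.copy()    # sync with master
    d_theta, d_w = np.zeros_like(theta), np.zeros_like(w)
    s = rng.random(N_STATE)                            # toy state
    for _ in range(5):
        pi = softmax(theta.T @ s)
        a = rng.choice(N_ACTION, p=pi)
        r = -abs(a - 1)                                # toy reward
        s2 = rng.random(N_STATE)                       # toy transition
        advantage = r + GAMMA * (w @ s2) - (w @ s)
        grad_log = -np.outer(s, pi)                    # d log pi / d theta
        grad_log[:, a] += s
        d_theta += advantage * grad_log                # actor gradient
        d_w += advantage * s                           # critic gradient
        s = s2
    return d_theta, d_w

rng = np.random.default_rng(0)
master = Master()
for step in range(100):
    for _ in range(4):                                 # four "worker agents"
        master.apply(*worker_update(master, rng))
```

In a full A3C implementation the workers run in parallel threads or processes against the shared master parameters; iterating them sequentially here keeps the sketch self-contained while preserving the update structure.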
Kyoungsik KIM, Hiroyuki KAMBARA, Duk SHIN, Yasuharu KOIKE
We propose a learning and control model of the arm for a loading task in which an object is loaded onto one hand by the other hand, in the sagittal plane. Postural control during object interaction offers important insights for motor control theories in terms of how humans handle changes in dynamics and use predictive and sensory-feedback information. For the learning and control model, we coupled a feedback-error-learning scheme with an Actor-Critic method serving as the feedback controller. To overcome sensory delays, a feedforward dynamics model (FDM) was used in the sensory feedback path. We tested the proposed model in simulation using a two-joint arm with six muscles, each with time delays in muscle force generation. Applying the proposed model to the loading task, we showed that motor commands began increasing before the object was loaded, stabilizing arm posture. We also found that the FDM contributes to this stabilization by predicting how the hand state changes from the object context and efferent signals. For comparison with other computational models, we also present simulation results for a minimum-variance model.
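The following sketch shows, under heavy simplification, how the abstract's two ingredients fit together: a forward model that rolls the delayed sensory state ahead through stored efferent commands to estimate the current state, and feedback-error learning, where the feedback controller's output serves as the training error for the feedforward command. A 1-D point mass stands in for the two-joint, six-muscle arm, and fixed PD gains stand in for the Actor-Critic feedback controller; every gain, feature, and model choice here is an assumption made for illustration.

```python
import numpy as np
from collections import deque

DT, DELAY, MASS = 0.01, 20, 1.0     # sensory delay of 20 steps (assumed)
KP, KV, LR = 40.0, 8.0, 0.002       # PD gains and learning rate (assumed)

A = np.array([[1.0, DT], [0.0, 1.0]])    # discrete-time point-mass dynamics
B = np.array([0.0, DT / MASS])

x = np.array([0.0, 0.0])                 # true state [position, velocity]
target = 1.0
w_ff = np.zeros(2)                       # feedforward (inverse-model) weights
features = np.array([target, 1.0])       # desired-state features (assumed)

sensed = deque([x.copy() for _ in range(DELAY)], maxlen=DELAY)  # delayed afference
efference = deque([0.0] * DELAY, maxlen=DELAY)                  # efference copies

for t in range(2000):
    # Forward model: roll the delayed sensed state forward through the
    # stored efferent commands to estimate the current (undelayed) state.
    est = sensed[0].copy()
    for u_past in efference:
        est = A @ est + B * u_past

    u_fb = KP * (target - est[0]) - KV * est[1]  # feedback command on estimate
    u_ff = w_ff @ features                       # feedforward command
    u = u_ff + u_fb

    # Feedback-error learning: the feedback command itself is the error
    # signal that trains the feedforward model toward the inverse dynamics.
    w_ff += LR * u_fb * features

    x = A @ x + B * u                            # true plant step
    sensed.append(x.copy())
    efference.append(u)

print(f"final position: {x[0]:.3f} (target {target})")
```

Because the forward model here matches the plant exactly, delay compensation is perfect and the feedback loop remains stable despite the 20-step sensory lag; the paper's setting, with muscle nonlinearities and a learned controller, is substantially harder.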
A function approximation based on an orthonormal wave function expansion in a complex space is derived. Although a probability density function (PDF) cannot always be expanded in an orthogonal series in a real space, because a PDF is a positive real function, the derived function approximation can approximate an arbitrary PDF with high accuracy. It is applied to an actor-critic method of reinforcement learning to derive an optimal policy expressed by an arbitrary PDF in a continuous-state, continuous-action environment. A chaos control problem and a PDF approximation problem are solved using the actor-critic method with the function approximation, and the results show that the function approximation approximates a PDF well and that the actor-critic method with the function approximation achieves high performance.
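The core trick the abstract relies on can be shown in a few lines: expand a wave function psi(x) = sum_n c_n phi_n(x) in an orthonormal basis and take p(x) = |psi(x)|^2, which is nonnegative by construction and integrates to one when sum_n |c_n|^2 = 1. The sketch below uses Hermite functions as the orthonormal basis and a bimodal Gaussian mixture as the target PDF; both choices, and the use of real coefficients in place of the paper's complex expansion, are illustrative assumptions.

```python
import numpy as np

def hermite_functions(x, n_max):
    """Orthonormal Hermite functions phi_0 .. phi_{n_max-1} via recurrence."""
    phi = np.zeros((n_max, x.size))
    phi[0] = np.pi ** -0.25 * np.exp(-0.5 * x**2)
    if n_max > 1:
        phi[1] = np.sqrt(2.0) * x * phi[0]
    for n in range(2, n_max):
        phi[n] = (np.sqrt(2.0 / n) * x * phi[n - 1]
                  - np.sqrt((n - 1) / n) * phi[n - 2])
    return phi

x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]

# Target: a bimodal Gaussian mixture (illustrative choice, not from the paper).
p = 0.6 * np.exp(-0.5 * (x - 1.5) ** 2) + 0.4 * np.exp(-0.5 * (x + 1.5) ** 2)
p /= p.sum() * dx

phi = hermite_functions(x, 20)
c = phi @ (np.sqrt(p) * dx)   # c_n = <phi_n, sqrt(p)>; real here for simplicity
psi = c @ phi                 # reconstructed wave function
p_hat = psi ** 2              # |psi|^2: nonnegative by construction

print("L1 approximation error:", np.abs(p - p_hat).sum() * dx)
```

Note how a direct orthogonal-series expansion of p itself could dip below zero when truncated, whereas squaring the truncated wave function guarantees a valid nonnegative density, which is what makes the expansion usable as a stochastic policy in the actor.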