ME/EE 239: Optimal Data-Driven Control (Spring 2022)

Final Project Description

Description

During this project, you are required to design data-driven optimal controllers for an unknown system using three distinct methods:

(i) one model-based method from class,

(ii) one model-free method from class, and

(iii) one method not covered in class.

Instead of providing you with pre-collected data, you will be given direct access to the system (see below) to collect your own input-output data and test your models and controllers. All you know from the system is the following:

• It has one input and one output channel

• It runs in continuous time

The objective is to design a controller, in open-loop or closed-loop, that minimizes the quadratic cost

J = ∫₀ᵀ [ y²(t) + λ u²(t) ] dt,        T = 1000 s,  λ = 10

If you design your controller in open-loop, you minimize J over all signals u(t), t ∈ [0,T] to find the optimal control signal u*(t), whereas if you design your controller in closed-loop, you minimize J over all functions π : R → R to find the optimal control policy u(t) = π(y(t)), t ∈ [0,T].
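For reference, once you have logged sampled input and output vectors u and y at a common sampling frequency fs (the variable names and fs are assumptions here), one simple way to approximate this integral cost is a Riemann sum over the samples; a minimal MATLAB sketch:

% Riemann-sum approximation of J = integral of y^2(t) + lambda*u^2(t) dt
% from sampled data. Assumes y and u are vectors of equal length sampled
% at frequency fs over the horizon of interest.
lambda = 10;
Ts     = 1/fs;                                   % sampling period
J      = Ts * sum(y(:).^2 + lambda * u(:).^2);   % rectangle-rule approximation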

You can write your code in MATLAB or Python, but the input-output interface with the system uses MATLAB files (details below). You have to implement the two methods that you choose from the class yourself, for the most part (ask if you are unsure about using pre-existing software here), but you can use code from GitHub or any other general-purpose, publicly available software for your third method. You cannot, of course, have someone else (a friend, someone over the internet, etc.) write code for you.

 

Access to the System (Collecting Data, Testing Models and Controllers)

The system is running continuously on the cloud (in simulation). Once the group compositions are finalized, the members of each group will be given access to a Google Drive folder where they have to place any input that they want to apply to the system and collect the resulting outputs from the system. As noted earlier, the system is running in continuous time, so you will have two ways to provide control input to the system (a minimal MATLAB sketch of both modes is given at the end of this section):

(i) Open-loop: you place one file called IO.mat in your group's folder which contains (at least) 3 variables: a vector called u that contains the discretized input sequence you want to apply to the system, a scalar called fs giving the sampling frequency of your input (and the corresponding output), and a logical variable called run. If run == 1, the simulator will run the simulation by first using zero-order hold (ZOH) to turn your discrete input sequence u into a continuous-time signal uc and then applying it to the system:

uc(t) = u(k),        kTs  ≤ t < (k + 1)Ts,        Ts = 1/fs

Therefore, the first element of u is applied throughout the interval [0, Ts), the second element is applied throughout the interval [Ts, 2Ts), and so on. Once the simulation is complete, the system's output is placed in the form of a vector called y in the same file IO.mat and the value of run is set to 0 therein. The simulator will ignore this file until you change run back to 1.


NOTE: When using open-loop simulation, your sampling frequency cannot exceed 100 Hz (mimicking the frequency limit of a digital-to-analogue converter (DAC)) and your total simulation time (number of elements in u times Ts = 1/fs) cannot exceed one hour per simulation. You can, however, run repeated simulations as many times as you want. If the limit on sampling frequency or total simulation time is exceeded, an error will be thrown and the simulation will not be performed.

(ii) Closed-loop: you place one file called IO.mat in your group's folder which contains (at least) 4 variables: a function handle called pi that contains the feedback policy mapping outputs to inputs, a scalar called T that indicates for how long the simulation must be run, a scalar called fs giving the sampling frequency for the output, and a logical variable called run as in the open-loop case. In this case, the simulation is run directly in continuous time and the sampling frequency is only needed for recording the output. At any point in time, the simulator sets

u(t) = π(y(t)),        0 ≤ t ≤ T

where π is the function handle you provide. Once the simulation is complete, the system's output is placed in the form of a vector called y in the same IO.mat, and the value of run is set to 0 as in the open-loop case. You can use anonymous function handles or function handles that refer to a file. In the latter case, make sure you place the needed files in the same folder.

NOTE: As in the open-loop case, the simulations in each run are limited to one hour (T ≤ 3600 s) and the sampling frequency is limited to 100 Hz. Otherwise, an error is thrown and the simulation will not be performed.

Notice that both open-loop and closed-loop simulations use the same file IO.mat. To avoid confusion, once you have collected the results of your simulation, delete this file and write it again with u, fs, run or pi, T, fs, run. To decide whether to run the simulation in open-loop or closed-loop, the simulator will first check (after checking if run == 1) whether the variables u, fs are present. If so, the simulation will be run in open-loop even if pi, T are present. Otherwise, the simulator will check if pi, T, fs are present and, if so, will run the simulation in closed-loop.

In both cases, if an error occurs (due to violating the time/rate limits or organically during the simulation of the system), y will be set to NaN and an additional error structure called ME will be placed in IO.mat. Inspect this structure (for example, by calling error(ME) in MATLAB) to fix the error and re-run the simulation.

The simulator will loop through the groups, so you may need to wait before your updated simulation is run again. However, expect to have your simulation run in less than a minute each time.
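To make the workflow above concrete, here is a minimal MATLAB sketch of one way to use the IO.mat interface in both modes. The folder path, the excitation signal, the polling interval, and the feedback gain are placeholder assumptions, not part of the project specification.

% Minimal sketch of the IO.mat interface described above. The folder path,
% the excitation signal, the polling interval, and the feedback gain are
% placeholders (assumptions), not part of the project specification.
ioFile = fullfile('MyGroupFolder', 'IO.mat');   % placeholder path to your group's folder

% ---- Open-loop: submit a 60 s noise input sampled at 100 Hz ----
fs  = 100;                         % sampling frequency (Hz), must not exceed 100
u   = 0.5*randn(1, 60*fs);         % example excitation (60 s of white noise)
run = true;                        % ask the simulator to run
save(ioFile, 'u', 'fs', 'run');

% ---- Poll until the simulator sets run back to 0 and writes y ----
S = load(ioFile);
while ~(isfield(S, 'run') && S.run == 0)
    pause(10);                     % the queue may take up to about a minute
    S = load(ioFile);
end
if isfield(S, 'ME')
    error(S.ME);                   % inspect the error structure, as suggested above
end
y = S.y;                           % recorded output, sampled at fs

% ---- Closed-loop: submit a static output-feedback policy instead ----
pi  = @(yt) -0.5*yt;               % example policy u = pi(y); the gain is arbitrary
T   = 600;                         % simulation length in seconds (must satisfy T <= 3600)
fs  = 100;                         % output recording frequency (Hz)
run = true;
delete(ioFile);                    % delete and re-write the file, as instructed above
save(ioFile, 'pi', 'T', 'fs', 'run');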

 

Choices for Method 3

You are completely free in choosing your third method, but here are some ideas:

• Deep learning (model-based + MPC or model-free)

• Reinforcement learning (model-based or model-free)

• Any of the myriad methods from Chapter 5 or the more recent tutorial paper:

Schoukens, J., & Ljung, L. (2019). Nonlinear system identification: A user-oriented road map. IEEE Control Systems Magazine, 39(6), 28-99.


• Non-parametric frequency-domain methods (Ljung's Chapter 6) + MPC

• Sparse Identification of Nonlinear Dynamics (SINDy) + MPC

• Proportional-integral-derivative (PID) control

 

State Estimators (Kalman Filter and Extensions) and Delay Embedding

Many methods, both for system identification and control, require full state measurements from your system. In most real-world situations, however, you don't have them. This is why you don't have access to the full system state in this project either. So you need some way of extracting the system's state x(t) from the input u(t) and output y(t) data you have collected. There are many ways to do this; here are some suggestions:

• Kalman Filter (KF) based on Certainty Equivalence: the most standard way to extract the states of a system from its input-output data is the KF (linear) or its nonlinear extensions. The problem is that the KF needs the true model of your system, which in this case you don't know. What you can do is first learn a model from your input-output data and then use that model as the true model of your system for the KF. Treating your learned model as the true model of your system is called Certainty Equivalence (CE). Also, note that the KF needs the model of your system to be in state-space form, so if you have learned an input-output (transfer function) model, you can turn it into a state-space model using realization theory or simply by calling the ss function in MATLAB. For nonlinear systems, the two common extensions of the KF are the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), where the latter often works better. MATLAB has both of them implemented, and you can use them if you want.
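As an illustration, below is a minimal sketch of a certainty-equivalence KF: it treats an identified discrete-time model (A, B, C) as if it were the true model and runs the standard filter recursion on logged SISO data. The noise covariances Q and R and the initializations are assumptions you would have to tune, and the resulting xhat can then be fed to whatever state-feedback design you use.

% Certainty-equivalence Kalman filter sketch: treat an identified
% discrete-time model (A, B, C) as if it were the true model and run the
% standard KF recursion on logged SISO input-output data. Q, R, the initial
% state, and the initial covariance are assumptions to be tuned.
function xhat = ce_kalman_filter(A, B, C, Q, R, u, y)
    nx   = size(A, 1);
    N    = numel(y);
    xhat = zeros(nx, N);
    x    = zeros(nx, 1);                 % initial state estimate (assumption)
    P    = eye(nx);                      % initial error covariance (assumption)
    for k = 1:N
        % measurement update with the k-th output sample
        K = (P*C') / (C*P*C' + R);
        x = x + K*(y(k) - C*x);
        P = (eye(nx) - K*C)*P;
        xhat(:, k) = x;
        % time update: propagate through the identified model (CE assumption)
        x = A*x + B*u(k);
        P = A*P*A' + Q;
    end
end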

• Simultaneous parameter and state estimation:  this is a slightly more sophisticated version of the above which does not use the CE assumption and often works better.  What you need is a parametric model in state space form, linear or nonlinear. Say, in the nonlinear case, you have

x(t + 1) = f(x(t),u(t), θ) + w(t)

y(t) = h(x(t), θ) + v(t)

where θ is an unknown vector of model parameters. Given that this is a constant vector and does not change with time, you can augment the state of the system to get

x(t + 1) = f(x(t), u(t), θ(t)) + w(t)

θ(t + 1) = θ(t) + wθ(t)

y(t) = h(x(t), θ(t)) + v(t)

so now the state of your system is the augmented vector [x(t); θ(t)] and you can run a standard KF/UKF to learn both x(t) and θ(t) at the same time. Note that θ can include unknown noise covariances that you need for the KF as well.
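A small MATLAB sketch of this augmentation might look like the following; f, h, nx, and ntheta stand for your chosen parametric model and its dimensions (all assumptions here), and the commented lines show one possible way to feed the augmented model to MATLAB's unscentedKalmanFilter.

% Sketch of the state augmentation above: wrap a parametric model f, h into
% augmented dynamics over z = [x; theta] so that a standard KF/EKF/UKF can
% estimate states and parameters jointly. f, h, nx, ntheta are assumptions
% (your chosen parametric model and its dimensions).
f_aug = @(z, u) [ f(z(1:nx), u, z(nx+1:end)); ...   % x(t+1) = f(x(t), u(t), theta(t))
                  z(nx+1:end) ];                    % theta(t+1) = theta(t)
h_aug = @(z)      h(z(1:nx), z(nx+1:end));          % y(t) = h(x(t), theta(t))

% z can then be estimated with, e.g., MATLAB's unscentedKalmanFilter:
% ukf = unscentedKalmanFilter(f_aug, h_aug, [x0; theta0]);
% for k = 1:numel(y), correct(ukf, y(k)); predict(ukf, u(k)); end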

• Delay Embedding: the problem with both of the above methods is that you first need to choose a parametric family of models. So if your choice of model is not good, neither will be your state estimates. If you want to estimate your states independently of any model, you can use delay embedding. There are many online tutorials and YouTube videos that you can use to learn the method, but in a nutshell, the idea is that for essentially any linear or nonlinear system,

x(t) = [ y(t), y(t − τ), y(t − 2τ), ..., y(t − (n − 1)τ) ]ᵀ ∈ Rⁿ

is a good estimate of the state of the system provided that the delay parameter τ and the state dimension n are chosen properly. The latter is not trivial, and there is a vast literature just on how to choose τ and n, but you probably don't want to go there. Just use trial and error and pick some combination of τ and n that seems to work. Keep in mind that τ is the more important parameter to tune; n can always be chosen larger than its optimal value, only at the cost of a larger state vector that can slow down your computations.
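As an illustration, a small MATLAB helper along these lines is sketched below; here the delay is taken to be an integer number of samples d (so τ = d·Ts), and both d and n are assumptions to be tuned by trial and error as discussed above.

% Stack delayed output samples into candidate state vectors, following the
% delay-embedding construction above. y is the sampled output, d is the
% delay in samples (tau = d*Ts), and n is the embedding dimension; d and n
% are tuning assumptions.
function X = delay_embed(y, d, n)
    y = y(:);
    N = numel(y) - (n - 1)*d;            % number of complete embedding vectors
    X = zeros(n, N);
    for k = 1:N
        % k-th column: [ y(t); y(t - tau); ... ; y(t - (n-1)*tau) ],
        % where t corresponds to sample index k + (n-1)*d
        X(:, k) = y(k + (n - 1)*d : -d : k);
    end
end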