ESE 605-001: Modern Convex Optimization Spring 2021 Final exam

发布时间：2024-05-09

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

ESE 605-001: Modern Convex Optimization

Spring 2021

Final exam

Date: May 2021

This is a 24 hour take-home inal. Please turn it in on Gradescope within 24 hours of your respective start times. The exam is open textbook, open notes, and open homework, and you may cite equations from the book, notes, and homework. You may also refer to online documentation for CVX* and your programming language of choice. Any other outside or online resources are strictly prohibited.

There are six problems on this exam. All problems have equal weight, but we will only count the best ive out of six questions towards your inal exam grade. Some are (quite) straightforward. Others, not so much. Be sure you are using the most recent version of CVX, CVXPY, or Convex.jl.

You may not discuss the exam with anyone until May 7th, 9PM EST, after everyone has taken the exam. The only exception is that you can ask the teaching staf for clariication, via e-mail – please do not use Piazza so as to avoid accidentally sending a public post that was meant to be private. We have made the exam as unambiguous and clear as possible, so we are unlikely to say much.

Assemble your solutions in order (problem 1, problem 2, problem 3, . . . ), starting a new page for each problem. Put everything associated with each problem (e.g., text, code, plots) together; do not attach code or plots at the end of the inal.

We will deduct points from long, needlessly complex solutions, even if they are correct. Our solutions are not long, so if you ind that your solution to a problem goes on and on for many pages, you should try to igure out a simpler one. You do not have to LATEXyour solutions, but we do expect neat, legible exams from everyone.

When a problem involves computation you must give all of the following: a clear discussion and justii- cation of exactly what you did, the source code that produces the result, and the inal numerical results or plots. Remember, the more information and explanation that you give us regarding your thought process, the easier it is for us to give you partial marks!

Files containing problem data can be found as ProbX.mat and ProbX.* iles here:

https://www.dropbox.com/sh/dd0vvd0wo7kh9uk/AAAix_d9j6HdFFiUBWgm7 s a?dl=0

For problems asking you to generate random data, please use the following random seeds:

. Matlab: randn(’state’,0); rand(’state’,0);

. Python: np.random.seed(0)

. Julia: Random.seed!(0);

Please respect Penn’s Academic Integrity Code: although we allow you to work on homework assignments in small groups, you cannot discuss the inal with anyone until everyone has taken it.

Check your email often during the exam, just in case we need to send out an important announcement.

Some of the data iles generate random data (with a ixed seed), which are not necessarily the same for Matlab, Python, and Julia.

By taking this exam you are agreeing to respect Penn’s Academic Integrity Code. Good luck!

1. An undirected graph with p vertices can be described by its adjacency matrix A 2 Sp given by

1 if there is an edge between nodes i and j

Aij = {

0 otherwise.

Two undirected graphs are isomorphic if the vertices of one graph can be permuted so that it is the same as the other, i.e., such that the same pairs of vertices are connected by edges. If we describe these graphs by their adjacency matrices A1 and A2 , then A1 is isomorphic to A2 if and only if there exists a permutation matrix P 2 Rp p such that PA1 P」 = A2 . Recall that a matrix P is a permutation matrix if each row and each column has exactly one entry equal to 1, and all other entries equal to 0. This problem comes up in several applications, such as determining if two molecular structures are the same, or whether a circuit’s physical layout and schematic diagram are consistent.

(a) Find a set of linear equalities and inequalities on P 2 Rp p , that together with a Boolean con- straint Pij 2 f0, 1g, are necessary and sufficient for P to be a permutation matrix satisfying PA1 P」 = A2 .

(b) Consider the relaxed version of the problem found in part (a), i.e., the LP that results when the constraints Pij 2 f0, 1g are replaced with Pij 2 [0, 1]. When this LP is infeasible, we know that two graphs are not isomorphic, and similarly, if a solution is found such that Pij 2 f0, 1g for all i,j, then the graphs are isomorphic because we have the original Boolean problem. However, this does not always happen even if the graphs are isomorphic.

A standard trick to encourage the entries of P to take on values in f0, 1g is to add a random linear objective to the relaxed feasibility LP, as this doesn’t change whether the problem is feasible or not. In other words, we minimize：i;j WijPij , where the Wij are chosen randomly, say from Ⅵ (0, 1). Carry out this scheme for the two isomorphic graphs with adjacency matrices A1 and A2 given in Prob1.* to ind a permutation matrix P that satisies PA1 P」 = A2 . Report the permutation vector given by the matrix-vector product Pv, where v = (1, 2, . . . , p). Verify that all the required conditions on P hold. To check that the entries of the solution of the LP are (close to) f0, 1g, report maxi;j Pij (1 - Pij ). And yes, you may have to attempt more than one instance of the randomized method described above before you inda permutation that establishes isomorphism of the two graphs: if that is the case, increase the random seed by one at each trial, i.e., trial 0 will have random seed‘0’, trial 1 will have random seed‘1’, etc.

2. A symmetric matrix is positive semideinite if and only if all of its principal minors are nonnegative. Here we consider approximations of the positive-semideinite cone by partially relaxing this condition.

Denote by K1;p the cone of matrices whose 1 × 1 principal minors (i.e., diagonal elements) are nonneg- ative, so that

K1;p = fM 2 Sp : Mii ≥ 0 for all ig.

Similarly, denote by K2;p the cone of matrices whose 1 × 1 and 2 × 2 principal minors are nonnegative:

K2;p = {M 2 Sp : lM(M)ij(ii) M(M)jj(ij)] > 0 for all i j } ,

i.e., the cone of symmetric matrices with positive semideinite 2 × 2 principal submatrices. These two cones are convex and proper and satisfy:

K1(*);p K2(*);p S K2;p K1;p ,

where K1(*);p and K2(*);p are the dual cones of K1;p and K2;p, respectively. The last two inclusions follow

immediately from the deinitions of the cones, and the irst two follow from the second bullet on page 53 of the textbook.

(a) Give an explicit characterization of K1(*);p. Argue that both K1;p and K1(*);p can be represented using

LPs.

(b) Give an explicit characterization of K2(*);p. Argue that both K2;p and K2(*);p can be represented using

SOCPs.

minimize trCX

subject to trAX = b

X 2 K

with variable X 2 Sp. The problem parameters are C 2 Sp , A 2 Sp , b 2 R, and the cone K Sp. Using the data in Prob2.* solve this problem ive times, each time replacing K with one of the

ive cones K1;p, K2;p , S, K2(*);p , K1(*);p.. Report the ive diferent optimal values you obtain.

Note: Python users who run into numerical issues might want to use the SCS solver by using prob.solve(solver=cvxpy.SCS).

Note: For parts (a) and (b), the shorter and clearer your description is, the more points you will receive. At the very least, it should be possible to implement your description in CVX*.

3. Flux balance analysis is based on a very simple model of the reactions going on in a cell, keeping track only of the gross rate of consumption and production of various chemical species within the cell. Based on the known stoichiometry of the reactions, and known upper bounds on some of the reaction rates, we can compute bounds on the other reaction rates, or cell growth, for example.

We focus on m metabolites in a cell, labeled M1 , . . . , Mm . There are n reactions going on, labeled, R1 , . . . , Rn , with nonnegative reaction rates v1 , . . . , vn. Each reaction has a (known) stoichiometry, which tells us the rate of consumption and production of the metabolites per unit of reaction rate. The stoichiometry is given by the stoichiometry matrix S 2 Rm ×n , deined as follows: Sij is the rate of production of Mi due to unit reaction rate vj = 1. Here we consider consumption of a metabolite as negative production; so Sij = 一2, for example, means that reaction Rj causes metabolite Mi to be consumed at a rate 2vj .

As an example, suppose reaction R1 has the form M1 ! M2 + 2M3. The consumption rate of M1 , due to this reaction, is v1 ; the production rate of M2 is v2 ; and the production rate of M3 is 2v1. Here, the reaction R1 has no efect on the metabolites M4 , . . . , Mm . This corresponds to a irst column of S of the form (一1, 1, 2, 0, . . . , 0).

Reactions are also due to model low of metabolites into and out of the cell. For example, suppose that reaction R2 corresponds to the low of metabolite M1 into the cell, with v2 giving the low rate. This corresponds to a second column of S of the form (1, 0, . . . , 0).

The last reaction, Rn , corresponds to biomass creation, or cell growth, so the reaction rate vn is the cell growth rate. The last column of S gives the amounts of metabolites used or created per unit of cell growth rate.

Since our reactions include metabolites entering or leaving the cell, as well as those converted to biomass within the cell, we have conservation of the metabolites, which can be expressed as Sv = 0. In addition, we are given upper limits on some of the reaction rates, which we express as v 京 vmax where we set vj(max) = 1 if no upper limit on reaction j is known. The goal is to ind the maximum

possible cell growth (i.e., the largest possible value of vn ) consistent with the constraints

Sv = 0, v > 0, v 京 vmax.

The questions below pertain to the data found in Prob3 .*.

(a) Find the maximum possible cell growth rate G* , as well as the optimal Lagrange multipliers for the reaction rate limits. How sensitive is the maximum growth rate to the various reaction rate limits?

Hint. These documentation pages may come in handy:

. Matlab: http://cvxr.com/cvx/doc/basics.html#dual-variables

. Python: https://www.cvxpy.org/tutorial/advanced/index.html#dual-variables . Julia: https://www.juliaopt.org/Convex.jl/stable/advanced/#Dual-Variables-1

(b) For simplicity, we’ll assume that each reaction is controlled by an associated gene, i.e., gene Gi controls reaction Ri. Knocking out a set of genes associated with some reactions has the efect of setting the reaction rates (or equivalents the associated entries of vmax ) to zero, which of course reduces the maximum possible growth rate. If the maximum growth rate becomes small enough or zero, it is reasonable to guess that knocking out the set of genes will kill the cell. An essential gene is one that when knocked out reduces the maximum growth rate below a given threshold Gmin. Note that Gn is always an essential gene by this deinition. A synthetic lethal is a pair of non-essential genes that when knocked out reduces the maximum growth rate below the threshold. Find all essential genes and synthetic lethals for the given problem instance, using the threshold Gmin = 0.2G* .

4. Implement the infeasible start Newton method for solving the centering problem arising in the standard form LP,

minimize c」x - ： log xi

subject to Ax = b,

with variable x. The data are A 2 Rp ×n , with p < n, c 2 Rn , and b 2 Rp. You can assume that A has full row-rank. Note that this problem cannot be solved when it is infeasible or when it is unbounded below.

Your code should accept A, b, c, and an initial iterate x0 and return the primal optimal point x* , the primal dual optimal point * , and the number of Newton steps executed. The initial iterate x0 must satisfy x0 > 0 but it need not satisfy the equality constraints.

Use block elimination to compute the Netwon step, α = 0.1, β = 0.5 for the backtracking line- search parameters, and plot Ir(x, )I2 , the norm of the concatenated primal and dual residuals, versus the iteration number k for randomly generated A, b, c and initial conditions to verify that your implementation achieves quadratic convergence. In particular, for p = 80, n = 250, generate one set of random problem parameters (A,b, c) and 5 randomly initial conditions x0 for random seeds f0, 1, 2, 3, 4g, and plot the corresponding evolution of Ir(x, )I2 vs. number of iterations k. As stopping criterion, you can use Ir(x, )I2 10-6 (which means that the problem was solved) or some maximum number of iterations (say 50) was reached, which means it was not solved (likely because the problem is either infeasible or unbounded below).

To generate problem data (i.e., A, b, c, x0 ) that are feasible, you can irst generate A, then random positive vector q, and set b = Aq. You can be sure that the problem is not unbounded by making one row of A have all positive entries. You may also want to check that A is full rank.

Hint. When domf is not all of Rn , make sure that you irst multiply t by β until x+tΔx 2 domf before you start to check whether the backtracking stopping inequality f (x + tΔx) f (x) + αtΔf (x)」Δx holds.

5. A dynamical system (such as a chemical plant, or linearized model of a robotic actuator) is characterized by y = Gu + v, where y 2 Rp is the output, u 2 Rp is the input, and v 2 Rp is a disturbance signal. The matrix G 2 Rp ×p is called a system input-output matrix. The input signal u is found using a linear feedback control policy u = Ky where K 2 Rp ×p is the feedback gain matrix, which is what we need to design. From the equations above, we have

y = (I - GK)-1 v, u = K(I - GK)-1 v.

We will assume that (I - GK) is invertible.

The disturbance v is random satisfying Ev = 0, Evv」 = σ 2 I, where σ is known. The objective is

to minimize Pi=1;:::;pEyi(2), the sum of the mean square values of the output components, subject to

the constraint that Eui(2) 1, i = 1, . . . , p, i.e., each input component has a mean square value not

exceeding one. The variable to be chosen is the feedback gain matrix K 2 Rp ×p.

(a) Explain how to use convex (or quasi-convex) optimization to indan optimal feedback gain matrix. You can assume that the matrices arising in any change of variables you make are invertible; you do not need to worry about any special cases when they are not. You may also assume that G is invertible if you are stuck, but we will deduct a few points from these answers.

(b) Carry out your method for the problem instance with data

06(.)3 00(.)13 03(.)9 00(.)42

σ = 1, G = -0 3 -0 6 0 2 0 5 .

-0.(.)1 0.8(.) -0(.).2 0.(.)4

Give an optimal K and the associated optimal objective value.

6. In this problem we will consider the following basic portfolio optimization problem:

maximize μ」w - (λ/2)w」 Σw

subject to 1」w = 1,

with variable w 2 Rp representing the normalized portfolio (negative entries meaning short positions), and data μ (mean return), Σ 2 S (return covariance), and λ > 0 (the risk aversion parameter). The return covariance has the factor form Σ = FQF」 + D where F 2 Rp ×r (with rank r) is the factor loading matrix, Q 2 S is the factor covariance matrix, and D is a diagonal matrix with positive entries, called the idiosyncratic risk (since it describes the risk of each asset that is independent of the factors). This form for Σ is referred to as a‘r-factor risk model’. Some typical dimensions are p = 2500 (assets) and r = 30 (factors).

(a) What is the lop count (up to leading order) for computing the optimal portfolio if the low-rank plus diagonal structure of Σ is not exploited. You can assume that λ = 1 without loss of generality by absorbing it into Σ .

(b) Explain how to compute the optimal portfolio more efficiently, and give the lop count (up to leading order) for your method. You can assume that r << p. You do not have to give the best method; any method that has linear complexity in p is ine. Once again, you can assume that λ = 1.

Hint:. You may want to introduce a new variable y = F」w and work with the matrix

G = I 0(1) FI] 2 R(p+r) × (1+r)

treating it as dense, ignoring the little exploitable structure in it.

(c) Carry out your method from part (b) on ive randomly generated problem instances with di- mensions p = 2500 and r = 30 for random seeds f0, 1, 2, 3, 4g. For comparison (and as a check on your method), compute the optimal portfolio using the method of part (a) as well. give the (approximate) CPU time for each method, using tic and toc in Matlab/Julia, or by computing the diference of time.time() in Python.

Hints: After you generate D and Q randomly, add 0.1I to each to avoid any issues related to poor conditioning. Also, to be able to invert a block diagonal matrix efficiently, you’ll need to recast it as sparse. These functions/modules may come in handy: sparse in Matlab; scipy.sparse.diags,

scipy.sparse, scipy.sparse.linalg, scipy.sparse.linalg.factorized, and scipy.sparse.linalg.spsolve in Python; SparseArrays in Julia.

(d) Now suppose we want to compute the optimal portfolio for N values of the risk aversion parameter λ . Explain how to do this efficiently, and give the complexity in terms of N , p, and r. Compare the complexity of using the method of part (b) N times. Hint: Show that the optimal portfolio is an affine function of 1/λ .