Programming Interview
Question 1:
Read the following paper https://www.biorxiv.org/content/10.1101/2021.07.08.451443v1.abstract and write a report limited to 3 pages, including citations, figures etc. Reports beyond 3 pages will be rejected automatically.
Implement a simpler version of the method using the MNIST data set for regression on digit 0 and digit 7. Each bag consists of 100 images, with a fraction x of digit 0 and a fraction 1-x of digit 7. Then train a neural network for regression, using the network architecture specified in the given paper.
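The bag construction described above can be sketched as follows. This is only an illustration under stated assumptions: the function name make_bag, its parameters, and the choice to sample images with replacement are not specified in the question, and the two input sequences stand in for the MNIST images of digit 0 and digit 7, which are loaded elsewhere.

```python
import random


def make_bag(zeros, sevens, x, bag_size=100, rng=None):
    """Build one bag holding round(x * bag_size) items from `zeros`, the rest from `sevens`.

    Returns (bag, label), where label is the realised fraction of digit-0 items.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    rng = rng or random.Random()
    n_zero = round(x * bag_size)
    n_seven = bag_size - n_zero
    # Assumption: sample with replacement, so even a small image pool can fill a bag.
    bag = rng.choices(zeros, k=n_zero) + rng.choices(sevens, k=n_seven)
    rng.shuffle(bag)  # hide the ordering so position carries no information about x
    return bag, n_zero / bag_size
```

Calling make_bag(zero_images, seven_images, x=0.3) would return 100 items, 30 of them drawn from zero_images, together with the regression target 0.3. Passing a seeded random.Random instance as rng makes the bags reproducible.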
The report and code are graded based on:
1. Clarity
2. Demonstrated understanding of the biological and cancer domain knowledge
3. Demonstrated understanding of the machine learning technology
4. Generate results on the MNIST toy data set – graphs and plots to show that your code is working
5. Put a version of your source code on GitHub. The code is graded based on:
a. Good code design
b. Good coding habits
c. Correctness
Question 2: Enzyme Kinetics
Enzymes are catalysts that help convert molecules that we will call substrates into other molecules that we will call products. They themselves are not changed by the reaction. Within cells, enzymes are typically proteins. They can speed up biological reactions, sometimes by up to millions of times. They are also regulated by a very complex set of positive and negative feedback systems. Computational biologists are painstakingly mapping out this complex set of reactions. In this problem, we will model and simulate a simplified enzyme reaction.
An enzyme E converts the substrate S into the product P through a two-step process. First, E forms a complex with S to form an intermediate species ES in a reversible manner at the forward rate k1 and reverse rate k2. The intermediate ES then breaks down into the product P at a rate k3, thereby releasing E. Schematically, we write
E + S ⇌ ES → E + P
where k1 and k2 are the forward and reverse rates of the first step and k3 is the rate of the second step.
8.1. Using the law of mass action, write down four equations for the rates of change of the four species, E, S, ES, and P.
8.2. Write code to numerically solve these four equations using the fourth-order Runge-Kutta method. For this exercise, assume that the initial concentration of E is 1 µM, the initial concentration of S is 10 µM, and the initial concentrations of ES and P are both 0. The rate constants are: k1=100/µM/min, k2=600/min, k3=150/min.
8.3. We define the velocity, V, of the enzymatic reaction to be the rate of change of the product P. Plot the velocity V as a function of the concentration of the substrate S. You should find that, when the concentration of S is small, the velocity V increases approximately linearly. At large concentrations of S, however, the velocity V saturates to a maximum value, Vm. Find this value Vm from your plot.
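The three parts above can be sketched in plain Python as follows. This is one possible reading rather than a reference solution. The function names, the step sizes, and the decision to measure V as the largest value of k3·[ES] reached shortly after mixing, for a range of initial substrate concentrations, are all assumptions made for illustration. The mass-action equations appear as comments inside derivatives.

```python
K1 = 100.0  # binding rate, per uM per min
K2 = 600.0  # unbinding rate, per min
K3 = 150.0  # catalytic rate, per min


def derivatives(state):
    """Mass-action rates of change for (E, S, ES, P)."""
    e, s, es, p = state
    binding = K1 * e * s
    unbinding = K2 * es
    catalysis = K3 * es
    return (
        -binding + unbinding + catalysis,  # dE/dt
        -binding + unbinding,              # dS/dt
        binding - unbinding - catalysis,   # dES/dt
        catalysis,                         # dP/dt
    )


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(slope, h):
        return tuple(y + h * dy for y, dy in zip(state, slope))

    k_a = derivatives(state)
    k_b = derivatives(shifted(k_a, dt / 2))
    k_c = derivatives(shifted(k_b, dt / 2))
    k_d = derivatives(shifted(k_c, dt))
    return tuple(
        y + dt / 6 * (a + 2 * b + 2 * c + d)
        for y, a, b, c, d in zip(state, k_a, k_b, k_c, k_d)
    )


def simulate(state=(1.0, 10.0, 0.0, 0.0), dt=1e-4, t_end=0.5):
    """Return the list of states from t = 0 to t_end (minutes)."""
    trajectory = [state]
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
        trajectory.append(state)
    return trajectory


def peak_velocity(s0, e0=1.0):
    """Largest V = dP/dt = K3 * [ES] seen shortly after mixing E and S."""
    # Assumption: scale the step to the fast binding time scale so that
    # large s0 values stay inside the RK4 stability region.
    tau = 1.0 / (K1 * (s0 + e0) + K2 + K3)
    states = simulate((e0, s0, 0.0, 0.0), dt=0.02 * tau, t_end=40 * tau)
    return max(K3 * es for _, _, es, _ in states)
```

Evaluating peak_velocity over a logarithmic range of s0 values, for example from 0.1 µM to 10,000 µM, and plotting the result against s0 gives a curve of the kind described in 8.3. Because [ES] can never exceed the total enzyme concentration of 1 µM, V is bounded above by k3 × 1 µM = 150 µM/min, which is the value the plateau should approach.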
Question 3:
Read the paper below and write a report limited to 5 pages, including citations, figures etc. Reports beyond 5 pages will be rejected automatically.
1-s2.0-S1359644621004554-main.pdf
The report should contain whether you think doppelganger effects are unique to biomedical data, and how you think it can be avoided in the practice and development of machine learning models for health and medical science.
Extra points are awarded if you [1] find interesting examples in other data types, e.g. imaging, gene sequencing, metabonomics, [2] demonstrate a clear understanding of how these doppelganger effects emerge from a quantitative angle, and [3] propose interesting and useful ways of avoiding or checking for doppelganger effects.
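As a starting point for the "checking" part, one simple idea is to look for sample pairs that sit on opposite sides of a train/validation split yet are almost perfectly correlated. The sketch below assumes this reading of "doppelganger". The function names, the use of Pearson correlation, and the 0.95 threshold are illustrative choices, not something taken from the question or the paper.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # correlation is undefined for a constant sample; treat as no match
    return cov / sqrt(var_u * var_v)


def flag_cross_split_pairs(train, valid, threshold=0.95):
    """Return (train_index, valid_index, r) for every pair with r >= threshold."""
    flagged = []
    for i, t in enumerate(train):
        for j, v in enumerate(valid):
            r = pearson(t, v)
            if r >= threshold:
                flagged.append((i, j, r))
    return flagged
```

Each sample is a list of feature values, such as one expression profile. Flagged pairs are candidates for closer inspection, for instance by keeping both members on the same side of the split. A suitable threshold depends on the data and would need to be justified in the report.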
2022-02-25