-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
I optimality criterion #428
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Johannes P. Dürholt <johannespeter.duerholt@evonik.com>
Co-authored-by: Johannes P. Dürholt <johannespeter.duerholt@evonik.com>
Co-authored-by: Johannes P. Dürholt <johannespeter.duerholt@evonik.com>
Setz mal folgendes an:
1. Design mit drei Faktoren X1, X2, X3, alle -1 ... 1
2. Constraint 1: X1 + X2 <= 0.8
3. Constraint 2: X1 + X2 >= 0.2
4. Modell: Y ~ X1 + X2 + X3 + X1*X2 + X1*X3 + X2*X3 + X1^2 + X2^2 + X3^2
5. 16 Runs
Ergebnis jmp:
0,8454641457 -0,340784303 0
1 -0,8 -1
1 -0,261259178 1
-0,49600524 1 1
-0,8 1 0
1 -0,8 1
0,3972024633 0,0427975367 0
-0,2 1 0
1 -0,2 -1
-0,08 0,88 -1
-0,512263201 1 -1
-0,08 0,28 1
0,16 0,64 1
-0,08 0,28 -1
0,2386392856 0,2013607144 0
0,3420421924 0,1377730655 0
***@***.***
Dr. Kai Michael Exner
Senior Specialist Digitalization (Data Analytics)
Mobile: +49 174 3496381, Email: ***@***.***
Postal Address: BASF SE, RGQ/DA – A015, Carl-Bosch-Strasse 38, 67056 Ludwigshafen am Rhein, Germany
***@***.***
BASF SE, Registered Office: 67056 Ludwigshafen, Germany
Registration Court: Amtsgericht Ludwigshafen, Registration No.: HRB 6000
Chairman of the Supervisory Board: Kurt Bock
Board of Executive Directors:
Markus Kamieth, Chairman;
Dirk Elvermann, Michael Heinz, Anup Kothari, Stephan Kothrade, Katja Scharpwinkel
Information on data protection can be found here: https://www.basf.com/global/en/legal/data-protection-at-basf.htmll
From: Dominik Linzner ***@***.***>
Sent: Wednesday, August 28, 2024 1:41 PM
To: experimental-design/bofire ***@***.***>
Cc: Kai Michael Exner ***@***.***>; Review requested ***@***.***>
Subject: [EXT] Re: [experimental-design/bofire] I optimality criterion (PR #428)
Sie erhalten nicht oft eine E-Mail von ***@***.******@***.***>. Erfahren Sie, warum dies wichtig ist<https://aka.ms/LearnAboutSenderIdentification>
@dlinzner-bcs<https://github.com/dlinzner-bcs> requested your review on: #428<#428> I optimality criterion.
—
Reply to this email directly, view it on GitHub<#428 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BEJBJSU5BNDE4HYSC2N3G2TZTWZOJAVCNFSM6AAAAABMYW445CVHI2DSMVQWIX3LMV45UABCJFZXG5LFIV3GK3TUJZXXI2LGNFRWC5DJN5XDWMJUGA2DKMJWHAZTIMI>.
You are receiving this because your review was requested.Message ID: ***@***.******@***.***>>
|
Hi @KaiExner, Danke für das Beispiel. Der folgende Code generiert einen Versuchsplan dazu und erzeugt die Abbildung unten
Das in bofire generierte design sieht so aus:
Leider nicht ganz das gleiche. Kann man sich sicher sein, dass jmp tatsächlich ein globales Minimum findet? Weil die Lösung aus jmp irgendwie etwas unregelmäßig aussieht (ist z.B. nicht symmetrisch bzgl. Vertauschung von X1 und X2) und die bofire-Lösung im zweiten Bsp. (siehe unten) sehr nah an der exakten Lösung zu liegen scheint... Ich habe gemerkt, dass im Paper, das ich für die Beispiele in
|
Hallo Aaron,
Ob das jmp Design global I-optimal ist, ist sicherlich nicht gesichert.
Problem mit dem in BoFire generierten Design: X3 ist nur auf zwei Level eingestell, -1 und 1. Damit kann ich X3^2 nicht bestimmen!
Das ist also kein Design für Y ~ X1 + X2 + X3 + X1X2 + X1X3 + X2X3 + X1^2 + X2^2 + X3^2.
Grüße,
Kai
Dr. Kai Michael Exner
Senior Specialist Digitalization (Data Analytics)
Mobile: +49 174 3496381, Email: ***@***.***
Postal Address: BASF SE, RGQ/DA – A015, Carl-Bosch-Strasse 38, 67056 Ludwigshafen am Rhein, Germany
***@***.***
BASF SE, Registered Office: 67056 Ludwigshafen, Germany
Registration Court: Amtsgericht Ludwigshafen, Registration No.: HRB 6000
Chairman of the Supervisory Board: Kurt Bock
Board of Executive Directors:
Markus Kamieth, Chairman;
Dirk Elvermann, Michael Heinz, Anup Kothari, Stephan Kothrade, Katja Scharpwinkel
Information on data protection can be found here: https://www.basf.com/global/en/legal/data-protection-at-basf.htmll
From: Aaron Osburg ***@***.***>
Sent: Wednesday, August 28, 2024 11:59 PM
To: experimental-design/bofire ***@***.***>
Cc: Kai Michael Exner ***@***.***>; Mention ***@***.***>
Subject: [EXT] Re: [experimental-design/bofire] I optimality criterion (PR #428)
Hi @KaiExner<https://github.com/KaiExner>,
Danke für das Beispiel. Der folgende Code generiert in der Abbildung darunter
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter
from bofire.data_models.constraints.api import (
NonlinearEqualityConstraint,
NonlinearInequalityConstraint,
LinearEqualityConstraint,
LinearInequalityConstraint,
InterpointEqualityConstraint,
)
from bofire.data_models.domain.api import Domain
from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.strategies.doe.design import find_local_max_ipopt
from bofire.strategies.enum import OptimalityCriterionEnum
domain = Domain(
inputs = [
ContinuousInput(key="X1", bounds = (-1,1)),
ContinuousInput(key="X2", bounds = (-1,1)),
ContinuousInput(key="X3", bounds = (-1,1))
],
outputs = [ContinuousOutput(key="y")],
constraints = [
LinearInequalityConstraint(features=["X1","X2"], coefficients=[1,1], rhs=0.8),
LinearInequalityConstraint(features=["X1","X2"], coefficients=[-1,-1], rhs=-0.2)
]
)
i_optimal_design = find_local_max_ipopt(
domain,
"Y ~ X1 + X2 + X3 + X1*X2 + X1*X3 + X2*X3 + X1^2 + X2^2 + X3^2",
n_experiments=16,
ipopt_options={"disp":5},
objective=OptimalityCriterionEnum.I_OPTIMALITY
).to_numpy().T
jmp = np.array([[0.8454641457,-0.340784303,0],
[1,-0.8,-1],
[1,-0.261259178,1],
[-0.49600524,1,1],
[-0.8,1,0],
[1,-0.8,1],
[0.3972024633,0.0427975367,0],
[-0.2,1,0],
[1,-0.2,-1],
[-0.08,0.88,-1],
[-0.512263201,1,-1],
[-0.08, 0.28, 1],
[0.16, 0.64, 1],
[-0.08, 0.28,-1],
[0.2386392856,0.2013607144,0],
[0.3420421924,0.1377730655,0]])
fig = plt.figure(figsize=((10,10)))
ax = fig.add_subplot(111, projection='3d')
ax.view_init(45, 45)
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
ax.set_zlabel("$x_3$")
plt.rcParams["figure.figsize"] = (10,8)
#plot D-optimal solutions
ax.scatter(
xs=i_optimal_design[0],
ys=i_optimal_design[1],
zs=i_optimal_design[2],
marker="o",
s=40,
color="orange",
label="our solution, 16 points"
)
ax.scatter(
xs=jmp[:,0],
ys=jmp[:,1],
zs=jmp[:,2],
marker="o",
s=40,
color="red",
label="jmp solution, 16 points"
)
plt.legend()
image.png (view on web)<https://github.com/user-attachments/assets/38be1515-c899-4aa9-9af3-a14f32517fcf>
Das in bofire generierte design sieht so aus:
[[-0.2 , 1. , 1. ],
[ 0.092, 0.108, -1. ],
[ 1. , -0.2 , -1. ],
[-0.8 , 1. , -1. ],
[-0.2 , 1. , -1. ],
[ 1. , -0.8 , -1. ],
[-0.8 , 1. , -1. ],
[ 1. , -0.2 , 1. ],
[ 0.117, 0.083, 1. ],
[ 0.092, 0.108, -1. ],
[ 0.419, 0.381, -1. ],
[ 0.117, 0.083, 1. ],
[-0.8 , 1. , 1. ],
[ 1. , -0.8 , -1. ],
[ 0.299, 0.501, 1. ],
[ 1. , -0.8 , 1. ]]
Leider nicht ganz das gleiche. Kann man sich sicher sein, dass jmp tatsächlich ein globales Minimum findet? Weil die Lösung aus jmp irgendwie etwas unregelmäßig aussieht und die bofire-Lösung im zweiten Bsp. (siehe unten) sehr nah an der exakten Lösung zu liegen scheint...
Ich habe gemerkt, dass im Paper, das ich für die Beispiele in tutorials/doe/basic_examples.ipynb verwendet habe (siehe hier<https://www.sciencedirect.com/science/article/pii/S0169743917303106>), auch Beispiele für exakte I-optimale designs vorgegeben sind. Der folgende Code erzeugt die in der Abb. gezeigten Designs (grün: exaktes I-optimales Design, gelb: unser Design)
domain = Domain(
inputs = [
ContinuousInput(key="x1", bounds = (0,1)),
ContinuousInput(key="x2", bounds = (0.1, 1)),
ContinuousInput(key="x3", bounds = (0, 0.6))
],
outputs = [ContinuousOutput(key="y")],
constraints = [
LinearEqualityConstraint(features=["x1","x2","x3"], coefficients=[1,1,1], rhs=1),
LinearInequalityConstraint(features=["x1","x2"], coefficients=[5,4], rhs=3.9),
LinearInequalityConstraint(features=["x1","x2"], coefficients=[-20,5], rhs=-3)
]
)
i_optimal_design = find_local_max_ipopt(
domain,
"x1 + x2 + x3 + {x1**2} + {x2**2} + {x3**2} + {x1**3} + {x2**3} + {x3**3} + x1:x2 + x1:x3 + x2:x3 + x1:x2:x3",
n_experiments=12,
objective=OptimalityCriterionEnum.I_OPTIMALITY).to_numpy().T
i_opt = np.array([[0.7000, 0.1000, 0.2000],
[0.3000, 0.6000, 0.1000],
[0.2031, 0.1969, 0.6000],
[0.5899, 0.2376, 0.1725],
[0.4105, 0.4619, 0.1276],
[0.2720, 0.4882, 0.2398],
[0.2281, 0.3124, 0.4595],
[0.3492, 0.1000, 0.5508],
[0.5614, 0.1000, 0.3386],
[0.4621, 0.2342, 0.3037],
[0.3353, 0.2228, 0.4419],
[0.3782, 0.3618, 0.2600]]).T # values taken from paper
fig = plt.figure(figsize=((10,10)))
ax = fig.add_subplot(111, projection='3d')
ax.set_title("cubic model")
ax.view_init(45, 45)
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
ax.set_zlabel("$x_3$")
plt.rcParams["figure.figsize"] = (10,8)
#plot feasible polytope
ax.plot(
xs=[7/10, 3/10, 1/5, 3/10, 7/10],
ys=[1/10, 3/5, 1/5, 1/10, 1/10],
zs=[1/5, 1/10, 3/5, 3/5, 1/5],
linewidth=2
)
#plot I-optimal solution
ax.scatter(
xs=i_opt[0],
ys=i_opt[1],
zs=i_opt[2],
marker="o",
s=40,
color="darkgreen",
label="I-optimal design, 12 points"
)
ax.scatter(
xs=i_optimal_design[0],
ys=i_optimal_design[1],
zs=i_optimal_design[2],
marker="o",
s=40,
color="orange",
label="optimal_design solution, 12 points"
)
plt.legend()
image.png (view on web)<https://github.com/user-attachments/assets/5337f86c-2d87-42cb-a4e9-93cbdd44d83b>
—
Reply to this email directly, view it on GitHub<#428 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BEJBJSUAWP2BR4JSEWR7Y7LZTZB3ZAVCNFSM6AAAAABMYW445CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJWGMYTQNZZGA>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Hallo Kai, Das stimmt natürlich, ich habe auch gefunden, woran es lag. Ich habe den Ausdruck
übergeben, die Terme zweiter Ordnung müssen aber als {Xi**2} statt Xi^2 geschrieben werden, sonst werden sie von formulaic ignoriert. Mit dem richtigen Ausdruck
kommt folgendes Design heraus, sieht das besser aus? :)
Grüße |
Danke Aaron!
Hier der Vergleich der Designs:
***@***.***
Bedeutung:
I: average relative prediction variance over the full space -> [-1, 1] for all variables. Does not respect the constraints
Ix: average relative prediction variance evaluated over the design points only (kind of a bogus metric)
Ips: average relative prediction variance evaluated over the feasible space (here 1.8 mio evenly spaced grid points)
Relevante Größe ist Ips: hier ist das jmp-Design mit einer mittleren Vorhersagevarianz von 0.35 gegenüber dem BoFire-Design mit 0.59 grob 2x besser.
Grüße,
Kai
Dr. Kai Michael Exner
Senior Specialist Digitalization (Data Analytics)
Mobile: +49 174 3496381, Email: ***@***.***
Postal Address: BASF SE, RGQ/DA – A015, Carl-Bosch-Strasse 38, 67056 Ludwigshafen am Rhein, Germany
***@***.***
BASF SE, Registered Office: 67056 Ludwigshafen, Germany
Registration Court: Amtsgericht Ludwigshafen, Registration No.: HRB 6000
Chairman of the Supervisory Board: Kurt Bock
Board of Executive Directors:
Markus Kamieth, Chairman;
Dirk Elvermann, Michael Heinz, Anup Kothari, Stephan Kothrade, Katja Scharpwinkel
Information on data protection can be found here: https://www.basf.com/global/en/legal/data-protection-at-basf.htmll
From: Aaron Osburg ***@***.***>
Sent: Thursday, August 29, 2024 10:43 PM
To: experimental-design/bofire ***@***.***>
Cc: Kai Michael Exner ***@***.***>; Mention ***@***.***>
Subject: [EXT] Re: [experimental-design/bofire] I optimality criterion (PR #428)
Hallo Kai,
Das stimmt natürlich, ich habe auch gefunden, woran es lag. Ich habe den Ausdruck
"Y ~ X1 + X2 + X3 + X1X2 + X1X3 + X2*X3 + X1^2 + X2^2 + X3^2"
übergeben, die Terme zweiter Ordnung müssen aber als {Xi**2} statt Xi^2 geschrieben werden, sonst werden sie von formulaic ignoriert. Mit dem richtigen Ausdruck
"Y ~ X1 + X2 + X3 + X1X2 + X1X3 + X2*X3 + {X12} + {X22} + {X3**2}"
kommt folgendes Design heraus, sieht das besser aus? :)
image.png (view on web)<https://github.com/user-attachments/assets/6aae0551-ee69-4e64-9a69-1d267730e9a7>
[[ 1. , -0.8 , 1. ],
[ 0.27, -0.07, -1. ],
[ 0.87, -0.07, 1. ],
[ 1. , -0.8 , -1. ],
[ 0.29, 0.26, -1. ],
[ 0.24, -0.04, 1. ],
[-0.8 , 1. , 1. ],
[-0.2 , 1. , 0.02],
[ 1. , -0.2 , 0.02],
[-0.8 , 1. , -1. ],
[ 1. , -0.2 , -1. ],
[-0.03, 0.23, -0.01],
[-0.2 , 1. , -1. ],
[-0.03, 0.23, -0.01],
[ 1. , -0.67, 0.04],
[-0.27, 1. , 1. ]]
—
Reply to this email directly, view it on GitHub<#428 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BEJBJSXZFGG5BTFK2Z3LL4LZT6BWZAVCNFSM6AAAAABMYW445CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJYHE2DKOBVGM>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Hi @KaiExner und @jduerholt , danke für die Auswertung, dann passt etwas mit der Erstellung des Designs noch nich...
Für das BoFire-Design wurden nur 80 gridpoints und ein sehr großer Regularisierungsparameter von 1e-2 verwendet, ich konnte mir vorstellen, dass das zum schlechten Ergebnis führt. Das Design unten verwendet für die Optimierung das oben genannte Kriterium mit mehr gridpoints und den Regularisierungsparameter habe ich auf 1e-7 gesetzt. Das folgende Design kommt dabei heraus (, was im BoFire-Kriterium vergleichbar gut wie das jmp Design abschneidet).
@KaiExner falls du Zeit und Lust hast: Schneidet dieses Design auch in jmp deutlich besser ab als das vorherige? (sorry, habe selbst keinen Zugriff) Code für "I-Kriterium" mit 1.8e6 gridpoints
|
Hallo Aaron,
Hier meine Bewertung des jmp-Designs und des neuen Designs, das Du geschickt hast:
***@***.***
Das BoFire-Design kommt also minimal besser weg.
Auswertung in jmp:
jmp-Design: 0.346, BoFire V: 0.339 – also auch in jmp das neue Design etwas besser, Zahlenwerte leicht unterschiedlich.
Die Gründe der leichten Abweichungen würde ich in der Berechnung / Schätzung der mittleren Vorhersagevarianz vermuten.
Grüße,
Kai
Dr. Kai Michael Exner
Senior Specialist Digitalization (Data Analytics)
Mobile: +49 174 3496381, Email: ***@***.***
Postal Address: BASF SE, RGQ/DA – A015, Carl-Bosch-Strasse 38, 67056 Ludwigshafen am Rhein, Germany
***@***.***
BASF SE, Registered Office: 67056 Ludwigshafen, Germany
Registration Court: Amtsgericht Ludwigshafen, Registration No.: HRB 6000
Chairman of the Supervisory Board: Kurt Bock
Board of Executive Directors:
Markus Kamieth, Chairman;
Dirk Elvermann, Michael Heinz, Anup Kothari, Stephan Kothrade, Katja Scharpwinkel
Information on data protection can be found here: https://www.basf.com/global/en/legal/data-protection-at-basf.htmll
From: Aaron Osburg ***@***.***>
Sent: Sunday, September 1, 2024 12:13 AM
To: experimental-design/bofire ***@***.***>
Cc: Kai Michael Exner ***@***.***>; Mention ***@***.***>
Subject: [EXT] Re: [experimental-design/bofire] I optimality criterion (PR #428)
Hi @KaiExner<https://github.com/KaiExner> und @jduerholt<https://github.com/jduerholt> ,
danke für die Auswertung, dann passt etwas mit der Erstellung des Designs noch nich...
Ich habe mir die Sache noch einmal angesehen und versucht, ein ähnliches Kriterium wie das in jmp zu bauen. Dazu habe ich auch äquidistante gridpoints erstellt (anstelle der Bestimmung mittels SpaceFilling Objective) und diejenigen Punkte, die die Constraints nicht erfüllen, weggeschmissen (siehe Code am Ende der Nachricht). Das spacing ist so gewählt, dass auch 1.8e6 gridpoints übrig bleiben.
Ich komme dann auch zu dem Ergebnis, dass das bisherige Design aus bofire deutlich schlechter als das von jmp abschneidet (wobei sich die genauen Werte von denen aus deiner letzten Nachricht unterscheiden, äquivalent sind die Kriterien also immer noch nicht :( )
# delta = 1e-2, objective using 80 gridpoints, criterion using 1.8e6 gridpoints
print("ours:", criterion(i_optimal_design.T.flatten()), "jmp:", criterion(jmp.flatten()))
ours: 1.464475006770114 jmp: 0.3522515468462639
Für das BoFire-Design wurden nur 80 gridpoints und ein sehr großer Regularisierungsparameter von 1e-2 verwendet, ich konnte mir vorstellen, dass das zum schlechten Ergebnis führt. Das Design unten verwendet für die Optimierung das oben genannte Kriterium mit mehr gridpoints und den Regularisierungsparameter habe ich auf 1e-7 gesetzt. Das folgende Design kommt dabei heraus (, was im BoFire-Kriterium vergleichbar gut wie das jmp Design abschneidet).
# delta = 1e-7, objective using 1.8e6 gridpoints, criterion using 1.8e6 points
print("ours:", criterion(i_optimal_design.T.flatten()), "jmp:", criterion(jmp.flatten()))
print(np.round(i_optimal_design,2).T)
ours: 0.34273537557403694 jmp: 0.3522515468462639
[[-0.49 1. 0.02]
[ 1. -0.46 -0. ]
[-0.2 1. -0.58]
[-0.2 1. 1. ]
[ 0.1 0.1 -0.01]
[ 0.13 0.36 -1. ]
[ 1. -0.2 1. ]
[ 0.21 0.26 1. ]
[-0.8 1. -1. ]
[ 1. -0.8 -1. ]
[ 0.93 -0.13 -1. ]
[ 0.1 0.1 -0.01]
[ 0.19 0.29 0.05]
[ 1. -0.46 -0. ]
[ 1. -0.8 1. ]
[-0.8 1. 1. ]]
image.png (view on web)<https://github.com/user-attachments/assets/fedcb837-8b97-48e2-8ca1-80c785a4f63d>
@KaiExner<https://github.com/KaiExner> falls du Zeit und Lust hast: Schneidet dieses Design auch in jmp deutlich besser ab als das vorherige? (sorry, habe selbst keinen Zugriff)
@jduerholt<https://github.com/jduerholt> Es sieht also so aus, als bräuchte man sehr viele gridpoints, um ein gescheites Kriterium zu generieren. Dafür ist eine Optimierung mit SpaceFilling unpraktisch, weil es ewig rechnen würde. Andererseits ist die Alternative (d.h. grid erstellen und constraintverletzende Punkte rausschmeißen) glaub ich nicht so toll für Gleichheitsconstraints... Ich habe jetzt mal erlaubt, dass der Benutzer auch ein selbst erstelltes objective übergeben kann (und by default wird SpaceFilling verwendet), das find ich aber auch iwie unbefriedigend. Wie siehst du das?
Code für "I-Kriterium" mit 1.8e6 gridpoints
from bofire.strategies.doe.objective import IOptimality
import pandas as pd
from itertools import product
import torch
formula = Formula("X1 + X2 + X3 + X1*X2 + X1*X3 + X2*X3 + {X1**2} + {X2**2} + {X3**2}")
domain = Domain(
inputs = [
ContinuousInput(key="X1", bounds = (-1,1)),
ContinuousInput(key="X2", bounds = (-1,1)),
ContinuousInput(key="X3", bounds = (-1,1))
],
outputs = [ContinuousOutput(key="y")],
constraints = [
LinearInequalityConstraint(features=["X1","X2"], coefficients=[1,1], rhs=0.8),
LinearInequalityConstraint(features=["X1","X2"], coefficients=[-1,-1], rhs=-0.2)
]
)
criterion = IOptimality(
domain=domain,
model=formula,
n_experiments=16,
delta=1e-7,
transform_range=None,
n_space=80,
ipopt_options={"disp":0, "linear_solver":"ma57", "hsllib":"libcoinhsl.dylib", "tol":1e-15, "acceptable_tol":1e-15},
)
gridpoints = np.linspace(-1,1,200)
points = list(product(gridpoints, repeat=3))
points = np.array(points)
points = pd.DataFrame(points, columns=["X1", "X2", "X3"])
fulfilled = domain.constraints(experiments=points)
fulfilled = np.array(np.logical_and(fulfilled[0] <= 0, fulfilled[1] <= 0), dtype=bool)
points = points[fulfilled]
X = formula.get_model_matrix(points).to_numpy()
criterion.YtY = torch.from_numpy(X.T @ X) / X.shape[0]
criterion.YtY.requires_grad = False
print("Number of gridpoints fulfilling the constraints:", X.shape[0])
—
Reply to this email directly, view it on GitHub<#428 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BEJBJSTN435NPO34XR7WELLZUI5VFAVCNFSM6AAAAABMYW445CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMRTGA2TMMRWGM>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Hallo Kai, danke für das Feedback :) Dann scheint die Implementierung jetzt ja fürs erste in Ordnung zu sein (außer du willst, dass sie erst noch an mehr Problemen getestet wird). @jduerholt: wäre dann jetzt kein draft mehr. Wenn die Dinge für dich in Ordnung ausschauen gerne jederzeit mergen. Liebe Grüße |
Hallo Aaron,
Aus meiner Sicht erst einmal keine weiteren Tests nötig.
Danke für Deinen Einsatz!
Liebe Grüße,
Kai
Dr. Kai Michael Exner
Senior Specialist Digitalization (Data Analytics)
Mobile: +49 174 3496381, Email: ***@***.***
Postal Address: BASF SE, RGQ/DA – A015, Carl-Bosch-Strasse 38, 67056 Ludwigshafen am Rhein, Germany
***@***.***
BASF SE, Registered Office: 67056 Ludwigshafen, Germany
Registration Court: Amtsgericht Ludwigshafen, Registration No.: HRB 6000
Chairman of the Supervisory Board: Kurt Bock
Board of Executive Directors:
Markus Kamieth, Chairman;
Dirk Elvermann, Michael Heinz, Anup Kothari, Stephan Kothrade, Katja Scharpwinkel
Information on data protection can be found here: https://www.basf.com/global/en/legal/data-protection-at-basf.htmll
From: Aaron Osburg ***@***.***>
Sent: Sunday, September 1, 2024 11:43 PM
To: experimental-design/bofire ***@***.***>
Cc: Kai Michael Exner ***@***.***>; Mention ***@***.***>
Subject: [EXT] Re: [experimental-design/bofire] I optimality criterion (PR #428)
Hallo Kai,
danke für das Feedback :) Dann scheint die Implementierung jetzt ja fürs erste in Ordnung zu sein (außer du willst, dass sie erst noch an mehr Problemen getestet wird). @jduerholt<https://github.com/jduerholt>: wäre dann jetzt kein draft mehr. Wenn die Dinge für dich in Ordnung ausschauen gerne jederzeit mergen.
Liebe Grüße
Aaron
—
Reply to this email directly, view it on GitHub<#428 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BEJBJSVTBL2PBTVN3UIXW5DZUOC6BAVCNFSM6AAAAABMYW445CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMRTGUYDOMZQGY>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Osburg and @KaiExner for this! I understand the code works now but is computationally intensive due to the large number of grid points for integration. That is ok for now I think. Just an idea: Since our models are polynomials, we could also analytically integrate all the resulting expressions for the I criterion, right?
where F is the model matrix and f is our polynomial. So we would need a symbolic integration to get YtY, right? I do not have time at the moment to implement this at the moment - but I can have a look once that changes :)
Hi @dlinzner-bcs, the computationally intensive part is only where the IOptimality object is generated. A space-filling design Y with many (typically Memory can become an issue in higher dimensions as well even when the equidistant grid is used, bc i did not implement it smartly. For these cases i think we would have to try to subdivide the calculation of M into subtasks where not the full matrix Y is needed at any single time point. Cheers P.S.: I don't know if anyone uses it but in general also nonpolynomial terms are supported. If it is imaginable that someone wants to make use of this at some point we should try to keep the criterion compatible with nonpolynomial terms, or what do you think? If not we can surely use a different implement as well! |
Hi @Osburg, thank you very much. I will have a look in the next days. Best, Johannes |
Honestly, I do not know if I am the correct one to review :D @Osburg Do you check and fix the failing tests? I can then have a rough look again :D |
Hey @jduerholt! Yes, will take care of the tests. sry was very busy with work in the last couple of days :D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Osburg,
thank you very much. I let some comments especially regarding the code structure. I am also a bit under the impression, that the DoE module needs some kind of restructuring in total as we are just adding more features on top which makes it sometimes more messy.
Just look at my comments! It is not meant as criticism but more as a general observation.
Best,
Johannes
self.YtY = torch.from_numpy(X.T @ X) / n_space | ||
self.YtY.requires_grad = False | ||
|
||
print("Done!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need all this prints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No :D
model_type=model, rhs_only=True, domain=domain | ||
) | ||
X = model_formula.get_model_matrix(design).to_numpy() | ||
self.space_filling_design = design |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you use this attribute somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again no :D will remove it
|
||
model_formula = get_formula_from_string( | ||
model_type=model, rhs_only=True, domain=domain | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understood it correctly, until here, a lot of functionality is just recoding other stuff that to come up with a starting set. And this recoding is partially done to prevent, circular imports. What do you think about adding an additional keyword argument to init, which just takes the inital design as input. Then the user also has much more flexibility in controlling this critical part of the design. Furtheremore, it would make the code much cleaner. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the DoEStrategy, we could then add this part of coming up with an inital setup in case somebody wants to perform an I-optimal design. But here, the user would still have the option todo custom stuff.
What do you think about a tidy-up of the DoE functionality in general (after this PR ;))?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could start to revamp find_local_max_iptopt
, by changing the signature in a way that it just accepts the instantiated objective
class, which is holding the domain
instead also taking the 'domainas argument. The model type would then also be only part of the objective and not also of the
find_local_max_ipopt` signature. This way, we can make it all much cleaner and modular. What do you think?
HI @jduerholt, totally agree with your opinion that stuff is getting more and more messy and one (or me :D) should spend time to rewrite the doe part in a nicer way. Since you know more about overall bofire etc + probably you have the better ideas how a better version would look like: Would you like to discuss this in a zoom call or so including @dlinzner-bcs if he likes (or alternatively somewhere here on github, but this would take more time after all i guess :D)? Starting from this I could then take care of the reimplementation.
<-- These points are not to be discussed or answered now, it's just a bunch of open questions i would want to discuss before I start to rewrite stuff. cheers |
Moved to #485 |
This PR should close #337. It should implement the i-optimality criterion as described by @KaiExner as the class
IOptimality
.When constructing a new
IOptimality
object, the space of feasible points is filled using the space filling objective (the number of pointsn_space
can be set by the user and by default the number of points is set to be equal to the requested number of experiments in the design to be foundn_experiments
).From these points, the corresponding design matrix
Y
is generated according to the user-provided model which is then used to define the evaluation function of the criterion as tr((Y.T @ Y) / n_space @ inv(X.T @ X)) whereX
is the model matrix corresponding to some inputx
.@dlinzner-bcs @KaiExner do you think this is roughly what it supposed to look like?
This is still a draft since tests etc. are still missing.