
Add DenseKKTSystem structure #65

Merged: 3 commits into MadNLP:master on Sep 3, 2021

Conversation

@frapac (Collaborator) commented Aug 30, 2021

  • Slight revision of the AbstractKKTSystem interface
  • New constructor that allows building an AbstractKKTSystem directly from a NonLinearProgram
  • Add dedicated tests to ensure that DenseKKTSystem matches SparseKKTSystem exactly
  • Add Random to the test dependencies

@frapac frapac requested a review from sshin23 August 30, 2021 11:16
@codecov (bot) commented Aug 30, 2021

Codecov Report

Merging #65 (9a33164) into master (64657de) will increase coverage by 0.64%.
The diff coverage is 97.34%.


@@            Coverage Diff             @@
##           master      #65      +/-   ##
==========================================
+ Coverage   86.69%   87.33%   +0.64%     
==========================================
  Files          30       30              
  Lines        3141     3213      +72     
==========================================
+ Hits         2723     2806      +83     
+ Misses        418      407      -11     
Impacted Files | Coverage Δ
src/matrixtools.jl 100.00% <ø> (+17.39%) ⬆️
src/interiorpointsolver.jl 93.15% <96.92%> (+0.26%) ⬆️
src/kktsystem.jl 98.72% <97.11%> (-1.28%) ⬇️
src/nonlinearprogram.jl 82.60% <100.00%> (+62.60%) ⬆️
src/MadNLP.jl 50.00% <0.00%> (-12.50%) ⬇️
src/utils.jl 73.68% <0.00%> (-7.80%) ⬇️
src/Interfaces/MOI_interface.jl 84.37% <0.00%> (-0.60%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@frapac (Collaborator, Author) commented Aug 30, 2021

Solves #36

@sshin23 (Member) left a comment

Hey @frapac, what incredible work! It seems that the AbstractKKTSystem abstraction you built really pays off here. It is even quite surprising that the dense formulation just works with this small change. I only have a few minor comments. Could you run a benchmark just to make sure that we don't have an unexpected performance regression? I think running the benchmark with the --quick option would be enough here.

@@ -331,13 +332,26 @@ function Solver(nlp::NonlinearProgram;
     set_blas_num_threads(opt.blas_num_threads; permanent=true)

     @trace(logger,"Initializing variables.")
-    ind_ineq = findall(nlp.gl.!=nlp.gu)
-    ns = length(ind_ineq)
+    info_constraints = get_index_constraints(nlp; fixed_variable_treatment=opt.fixed_variable_treatment)
sshin23 (Member):

This makes the code way cleaner :) Maybe just rename it consistently as ind_cons?

frapac (Collaborator, Author):

Good idea!


function set_con_scale!(con_scale::AbstractVector, jac::SparseMatrixCOO, nlp_scaling_max_gradient)
    for i in 1:nnz(jac)
        row = jac.I[i]
sshin23 (Member):

Maybe @inbounds here

frapac (Collaborator, Author):

Good catch, indeed.
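
For reference, a minimal sketch of the loop with the suggested @inbounds applied. The loop body beyond the excerpt (accumulating the row-wise maximum absolute Jacobian entry, then clipping) is an assumption, not the exact MadNLP code:

# Sketch only: assumes jac.I / jac.V hold the COO row indices and values,
# and that the scaling tracks the max absolute Jacobian entry per row.
function set_con_scale!(con_scale::AbstractVector, jac::SparseMatrixCOO, nlp_scaling_max_gradient)
    @inbounds for i in 1:nnz(jac)
        row = jac.I[i]
        con_scale[row] = max(con_scale[row], abs(jac.V[i]))
    end
    # clip the scaling so no constraint gradient exceeds the threshold
    con_scale .= min.(1.0, nlp_scaling_max_gradient ./ con_scale)
    return con_scale
end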

    ind_fixed::Vector{Int}
    jacobian_scaling::VT
end

sshin23 (Member):

🤩


    # If the problem is unconstrained, then the KKT system is directly equal
    # to the Hessian (+ some regularization terms)
    aug_com = if (m == 0)
sshin23 (Member):

Interesting. Maybe there are things we could do less of (or go further on) in the NLP formulation, as well as in the algorithm, when the problem is unconstrained.

frapac (Collaborator, Author):

Indeed. We may want to use a different line-search in the unconstrained case, too. Maybe that's something we want to investigate in the future.
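
For context, a hedged sketch of what this branch suggests: when m == 0, the augmented matrix can simply alias the Hessian storage instead of allocating a larger matrix. The else branch and dimensions here are illustrative, not the PR's exact code:

aug_com = if (m == 0)
    # unconstrained: the KKT matrix is just the (regularized) Hessian storage
    hess
else
    # general case: allocate the full dense augmented matrix (assumed layout)
    MT(undef, n + ns + m, n + ns + m)
end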

    jac::MT
    pr_diag::VT
    du_diag::VT
    diag_hess::VT
sshin23 (Member):

Why do we separately need diag_hess, which we don't have in the sparse KKT?

frapac (Collaborator, Author):

In the sparse KKT, pr_diag and du_diag are defined implicitly in the KKT's COO matrix, by adding artificial duplicate entries to the sparsity pattern. That allows the regularization to be applied seamlessly.

In the dense KKT, things are a bit different, as we cannot store pr_diag and du_diag directly in the dense KKT matrix. Each time we update the regularization, we need to update the diagonal terms in the dense KKT matrix, as

for i in 1:n
    aug_com[i, i] = hess[i, i] + pr_diag[i]
end

In case aug_com === hess, we would add the pr_diag terms twice on the second regularization. That's why I thought it would be better to store the diagonal terms of the Hessian matrix once, and then do the regularization as

for i in 1:n
    aug_com[i, i] = diag_hess[i] + pr_diag[i]
end

But there might be a better solution!
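
A hedged sketch of the idea, with hypothetical names (not the PR's exact code): caching the Hessian diagonal once makes the update idempotent even when aug_com aliases hess:

function update_regularization!(aug_com::AbstractMatrix, diag_hess::AbstractVector,
                                pr_diag::AbstractVector)
    @inbounds for i in eachindex(diag_hess)
        # overwrite (not accumulate) the diagonal, so repeated calls stay correct
        aug_com[i, i] = diag_hess[i] + pr_diag[i]
    end
    return aug_com
end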

@@ -51,7 +50,6 @@ testset = [
     ()->MadNLP.Optimizer(
         inertia_correction_method=MadNLP.INERTIA_FREE,
         kkt_system=MadNLP.SPARSE_UNREDUCED_KKT_SYSTEM,
-        reduced_system=false,
sshin23 (Member):

oops. Great catch!

@sshin23 (Member) commented Aug 31, 2021

I'll add a benchmark for dense problems using the dense instances within CUTEst.

@sshin23 (Member) commented Aug 31, 2021

One observation: the current DenseKKTSystem assumes a different behavior of the NonlinearProgram functions; in particular, the Hessian and Jacobian callbacks should behave differently. Maybe it would be good to introduce a new abstract type AbstractNonlinearProgram, and separate SparseNonlinearProgram and DenseNonlinearProgram. This can be independent of the KKTSystem type, since some dense NLPs come in sparse NLP format, and vice versa is also possible.

Could you make those NonlinearProgram subtypes? I can work on the MOI/NLPModels interfaces in a separate PR.
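
A minimal sketch of the hierarchy proposed here; the type names come from the comment above, everything else is hypothetical:

abstract type AbstractNonlinearProgram end

# callbacks fill vectors of nonzeros following a fixed sparsity pattern
struct SparseNonlinearProgram <: AbstractNonlinearProgram end

# callbacks fill full matrices; no sparsity pattern required
struct DenseNonlinearProgram <: AbstractNonlinearProgram end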

@frapac (Collaborator, Author) commented Sep 1, 2021

You are raising an interesting question about the NonlinearProgram functions. Personally, I am in favor of keeping the current NonlinearProgram as is. The reasons are:

  • I think NonlinearProgram (model) and KKTSystem (solver) are two different abstractions. We may want to solve a sparse nonlinear problem in dense mode (but the reverse is rarely true, indeed).
  • In my opinion, specifying dense callbacks in a NonlinearProgram object should happen at the dense level, by adding new callbacks for the Jacobian and the Hessian that support dense inputs (see the sketch after this list):
        function eval_hess(hess::AbstractMatrix, x, l, sig)
            ...
        end
    (the current situation is not ideal though, as we are defining the callbacks as closures)
  • But I agree the current situation is not great either. When we use MadNLP in dense mode, we should at least check that the corresponding dense callbacks for the Jacobian and the Hessian are available in the NonlinearProgram model passed as input.
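
As referenced in the list above, a hedged sketch contrasting the two callback flavors; the signatures are illustrative, not MadNLP's exact API:

# sparse flavor: write the Hessian nonzeros (in COO ordering) into a vector
function eval_hess!(hess_vals::AbstractVector, x, l, sig)
    # ... fill hess_vals in place
end

# dense flavor: write the full Hessian of the Lagrangian into a matrix
function eval_hess!(hess::AbstractMatrix, x, l, sig)
    # ... fill hess in place
end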

@sshin23 (Member) commented Sep 1, 2021

Indeed. Totally agreed.

I think one way we can check the availability of a sparse (or dense) callback is to check length(methods(nlp.lag_hess!, (AbstractVector, Any, Any, Any))) != 0. Does Julia have a function like iscallable(f, [types])::Bool?

With regards to closures, I think we should discuss relying entirely on NLPModels and getting rid of NonlinearProgram. Then we could save the time spent supporting the MOI interface, which is quite complex. But of course, not in this PR :)
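
For reference, Julia's built-in hasmethod is essentially the iscallable being asked for here; a sketch of the availability check, with hypothetical accessor names:

# hasmethod(f, Tuple{...}) is true iff a matching method exists for f
has_dense_hessian(nlp)  = hasmethod(nlp.lag_hess!, Tuple{AbstractMatrix, Any, Any, Any})
has_sparse_hessian(nlp) = hasmethod(nlp.lag_hess!, Tuple{AbstractVector, Any, Any, Any})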


struct DenseKKTSystem{T, VT, MT} <: AbstractKKTSystem{T, MT}
    hess::MT
    jac::MT
sshin23 (Member):

Didn't realize this in my first review, but couldn't these be views, so that we don't need to transfer them in _build_dense_kkt_system?
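
A hedged sketch of the suggestion, assuming the usual dense KKT layout (Hessian in the top-left block, Jacobian in the bottom rows); the index ranges are illustrative:

# views alias blocks of aug_com, so writing into hess/jac updates the
# KKT matrix in place, with no copy needed in _build_dense_kkt_system
hess = view(aug_com, 1:n, 1:n)
jac  = view(aug_com, n+ns+1:n+ns+m, 1:n+ns)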

@sshin23 sshin23 merged commit 23b5328 into MadNLP:master Sep 3, 2021