Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cognitive versus Cyclomatic Complexity #924

Open
lv-janpieter opened this issue Feb 5, 2025 · 16 comments
Open

Cognitive versus Cyclomatic Complexity #924

lv-janpieter opened this issue Feb 5, 2025 · 16 comments

Comments

@lv-janpieter
Copy link

We have been starting to use LC for a few months now, and really appreciate it. And consider it now mandatory for our projects.

We think LC0010 is really interesting. We are convinced functions should be easy to understand. And having a metric to determine this is really handy. We want it to be part of the process for all of our developers.

However LC0010 is producing too much false positives. We know we can raise the treshold but that would grant too much coplexity for other type of functions. So I would like to make a few suggestions which in my opinion would produce better results. This tweaks could be included as additional settings depending on preference just like the treshold.

Wondering how you all feel about this.

Tweak 1:
View case statements as only 1 added complexity.
Case options are easy to read and should not raise the complexity.
I know this issue was raised before a time ago #566

Tweak 2:
Do not consider early exits (also known as guard clauses) as additional complexity.
We use early exit's extensively to avoid a lot of nesting. But this quickly raises cyclomatic complexity.
Would be nice to have a setting where you can say "ignore unnested if statements which are immediatly followed by an exit".
For example, we have a lot of functions styled like the one below (imagine this a function being on the sales line).
This function already has 8 points in score, but we consider this a well designed function. In reality we would only want the main section counted.

  procedure TestFunction()
   begin
       // guard clause section non idented if statement resulting in only exit, do not count
       // guard clause, don't count
       if Type <> Type::Item then
           exit;

       // guard clause, don't count
       if "Unit of Measure Code" = '' then
           exit;

       // guard clause, don't count
       if Status <> Status::Open then
           exit;
       // end guard clause section 

       // main section
       repeat
           if Quantity < 0 then
               HandleNegativeQuantity();
           if Quantity = 0 then
               HandleZeroQuantity();
           if Quantity > 0 then
               HandlePositiveQuantity();
        until  AllHandled();
       // end main section
   end;

This variant of the function however should definitly have full complexity count:

  procedure TestFunction()
  begin
       if Type = Type::Item then begin
           if "Unit of Measure Code" <> '' then begin
               if Status = Status::Open then begin
                   repeat
                        if Quantity < 0 then
                            HandleNegativeQuantity();
                        if Quantity = 0 then
                             HandleZeroQuantity();
                        if Quantity > 0 then
                            HandlePositiveQuantity();
                   until  AllHandled();
               end;
          end;
      end;
  end;

Thanks a lot!

@Arthurvdv
Copy link
Collaborator

Thank you for raising this issue! It really challenged me to think about how we can improve the approach to measuring code complexity.

I completely understand your remarks on the early exits and case-statements, where this significantly increases the cyclomatic complexity. The cyclomatic complexity is a established standard, which represents a quantitative measure of the number of linearly independent paths.

Interestingly, both of the examples you provided return the same Cyclomatic Complexity score of 8. If you would have to write automated test and want to archive a 100% coverage, the amount of test's are the same between the both examples.

This highlights an important point: Cyclomatic Complexity isn’t always the right metric when it comes to code readability. The good news is that I believe I have a another metric which is a better solution for your question.

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and understand.

A Cognitive Complexity score is assessed according to three basic rules:

  1. Ignore structures that allow multiple statements to be readably shorthanded into one
  2. Increment (add one) for each break in the linear flow of the code
  3. Increment when flow-breaking structures are nested

When applying this metric to your examples, the difference in complexity becomes much clearer:

Image

I’ll be diving deeper into this and plan to introduce Cognitive Complexity as a new rule in LinterCop in the coming weeks. If you have feedback that would been incredibly helpful shaping this direction, and I appreciate the insights you’ve shared.

Thanks again for starting this discussion, I’ll keep you updated on the progress.

@lv-janpieter
Copy link
Author

lv-janpieter commented Feb 10, 2025

Thanks for weighing in Arthur, that looks about right! Nesting should be punished a lot more.

I did read about the cognetive complexity in issue #566 but since this topic was closed I thought it has lost traction. Also I didn't really understand the metric, now I think I do, thank you for this clear example. And you are right cyclomatic complexity is an established metric that is why I proposed a setting, but it looks like cognetive complexity is even better. In that case there is no need for improved cyclomatic complexity.

In my opinion this would really be usefull and a fair metric, and can be made mandatory and possibly integrated into build pipelines.

Somehow functions always have a tendancy to grow, and we need to avoid raising thresholds or pragma warnings because the metric is not optimal.

Looking forward for the release.

@Arthurvdv
Copy link
Collaborator

Two new rules are available in the v0.34.1 Pre-Release version of the LinterCop:

It would be great if you could give it a try and I’d love to hear your feedback on it!

@lv-janpieter
Copy link
Author

I would like to try this, but somehow it does not pick up the pre-release.
When selecting 'installe specific version' the version numbers do not comply with the version numbers on github.
0.1.10 vs 0.34.1
Highest i can pick is 0.1.10

I'm sure it is me being stupid but what am I doing wrong?

    "linterCop.autoDownload": true,
    "linterCop.load-pre-releases": true,
    "linterCop.repositories": [
        {
            "url": "https://github.com/StefanMaron/BusinessCentral.LinterCop",
            "shortName": "BcLntr",
            "fileName": "BusinessCentral.LinterCop.dll"
        }

@someC0d3r
Copy link
Contributor

Would also love to the test the new rule but also struggling with getting the pre-release version:

Image

When setting "linterCop.load-pre-releases": true and execute command "LC: Download Linter Cop", the command is getting registered in output log (see screenshot), but no new version will be downloaded. Am I missing sth? :O

@Arthurvdv
Copy link
Collaborator

Image

Oops 😳

The Pre-release was missing assets, so the VS Code extension couldn't find an version to download. I'm not sure yet what went wrong, as some versions failed on the automated tests. I'm guessing I hit an thread safety issue, but I'm unsure how I can test this.

I have now created an updated v0.34.1 Pre-release with Assets(!), so I would expect that the VS Code extension will now work as expected.

@lv-janpieter
Copy link
Author

Got it now!

One of our major (very large) vertical:

  • has 259 cyclomatic warnings (treshold 8).
  • went down to 85 for cognetive (treshold 15)

And had a quick look 95% of these functions could use a diet.

Thank you for this effort!

@Arthurvdv
Copy link
Collaborator

Great to hear that the results are looking promising!

I’ll need to look into some last thing, like else and else if constructs, so the calculations can slightly change before it will be in the release version of the LinterCop.

@lv-janpieter
Copy link
Author

lv-janpieter commented Feb 11, 2025

Yeah I do see the main reason why we still have many high cognetive complexity (despite our best efforts ofcourse) because of the following guard clause pattern we still have a lot in our code. We do not use begin end anymore for single statement if conditions, but there is still a lot of legacy code programmed this way.

This counts as 1:

if (SalesLine.Type = SalesLine.Type::Item) then begin
    exit;
end;

This one is for free:

if (SalesLine.Type = SalesLine.Type::Item) then 
    exit;

This is probably by design and doesn't bother me as much, is easy to refactor.
It is however important to know these variants produce different results.

So I guess I will be busy for a while. :)

@lv-janpieter
Copy link
Author

lv-janpieter commented Feb 11, 2025

Also we may have to treat an error the same as an exit in my opinion.

this guard clause is for free:

if (SalesLine.Type = SalesLine.Type::Item) then 
    exit;

this error guard clause is punished with 1:

if (SalesLine.Type = SalesLine.Type::Item) then 
    Error('I cannot be an item!');

@Arthurvdv
Copy link
Collaborator

@lv-janpieter Thank you for trying it out and providing feedback. It's been really helpful while working on this new rule and trying to convert the information of the whitepaper to the constructs of the AL Language.

I've update the v0.34.1 Pre-release of the LinterCop today. Currently the rule will analyze the following;

Category Increment Nesting Level Nesting Penalty
if, ternary operator X X X
else if, else X X  
case X X X
for, foreach X X X
while, repeat X X X
sequences of binary logical operators X    

There's also something on recursion in the whitepaper, where I'm unsure if I'm able to implement this.

Cognitive Complexity adds a fundamental increment for each method in a recursion cycle, whether direct or indirect.

procedure selfRecursive()
begin
    this.selfRecursive();
end;

procedure indirectRecursiveA()
begin
    this.indirectRecursiveB();
end;

procedure indirectRecursiveB()
begin
    this.indirectRecursiveA();
end;

I'm doubting if it's feasible on the AL Language to analyze this. I'll have a look into this, but I'm thinking of Dependency Injection on Interfaces combined with Enums it could get extremely complicated. I could do the selfRecursive, but not sure if implementing only a part of the recursion is an good idea. Would be great to hear some idea's on this.

I wanted also to provide some feedback on your findings;

This is probably by design and doesn't bother me as much; it's easy to refactor.

You're right, this is indeed by design. The code analyzer can detect, in the second example without the begin/end block, that there's an exit clause inside the then-statement. However, when begin/end are used, it’s treated as a code block, and I would need to add an extra loop to analyze the block and identify the exit statement. I would’ve expected CodeCop AA0005 to raise a diagnostic on this. 🤔 For now I think it's best to leave it as it is, also to prevent the rule from becoming more complex.

Also, we may have to treat an error the same as an exit, in my opinion.

In the documentation I found on Guard Clauses, the context focuses on return statements, which translate to exit in AL. An Error translates more closely to a try/catch pattern, where the catch is counted as an increment. Additionally, this can get messy if you need to handle ErrorInfo or log telemetry alongside the error, something like;

if (SalesLine.Type = SalesLine.Type::Item) then begin
    EmitTelemetry('Wrong Sales Line Type', SalesLine);
    Error('I cannot be an item!');
end;

Personally I prefer to use the Guard Clauses to return an boolean to the calling method, to deligate the ErrorHandling back to the calling method.

local procedure IsValidRecord(var SalesHeader: Record "Sales Header"): Boolean
begin
    if SalesHeader.IsTemporary() then
        exit(false);

    if not (SalesHeader."Document Type" in [
        Enum::"Sales Document Type"::"Quote",
        Enum::"Sales Document Type"::"Order"])
    then
        exit(false);

    if SalesHeader."Order Date" = 0D then
        exit(false);

    exit(true);
end;

Would be great to gather more viewpoint on this.

@lv-janpieter
Copy link
Author

using begin/end
Indeed we ignore AA0005 rule for a few reasons, we do not think this always improves readability and we relied on txt2al conversion for quite a long time. Different topic. But for single line if statements we omit the begin/end nowadays. I don't really think it is necessary to take into account this scenario, the code can be easily refactored. However it is good to know for refactoring that you can reduce the metric by removing begin and end. I just did in our major vertical.

counting error
About the if ... then error() BC I disagree that this resembles a try / cacth. Also like in c# you can throw an exeption without catching this inside your own procedure. The calling procedure might catch the error but that would add up to the complexity of the calling function, not the function throwing the error.

AL does not know such thing as a try catch. Only a TryFunction might resemble something simular. Also in this case the complexity is added to the calling function and not the tryfunction itself. An Error will always halt the execution of a function, just like an exit.

You might argue that raising an Error is just calling another function. Just like raising a FieldError. And maybe some more functions i'm missing. A TestField might raise an error but it might continue. So I agree it is a bit of a gray area.

I don't know what the documentation sais about throwing an error without try / catch within a single statement if. That would be the simular case. I cannot find any documentation however how this situation is handled. I personally feel it would make sense not make it count. If however this is counted it would not be a big deal, most guard clauses are exits.

@lv-janpieter
Copy link
Author

Also just a wild idea. Some functions are difficult to refactor (or we are just unwilling to) and we might need to pragma the congnitive complexity. However when doing this everone is free to expand the function. And we don't want this. Would it be feasable to have a way of allowing a higher score for a specific function?

Maybe add a comment before the function:
// LC0090 increase threshold with 1

Or
// LC0090 change treshold to 20

With a system like this we are able to keep enforcing cognetive complexity and forbid pragma it, while in the code review we can review every raise in complexity and reject it if we don't agree with the justification of raising it.

@Arthurvdv
Copy link
Collaborator

Arthurvdv commented Feb 13, 2025

In the Cognitive Complexity whitepaper I find this on early exits:

Jumps to labels
goto adds a fundamental increment to Cognitive Complexity, as do break or continue to a label and other multi-level jumps such as the break or continue to a number found in some languages. But because an early return can often make code much clearer, no other jumps or early exits cause an increment.

I've read this multiple times, but still didn't fully comprehend it. I asked help of ChatGTP to break it down for me;

1. "goto adds a fundamental increment to Cognitive Complexity"

  • Explanation: The use of goto makes the control flow harder to follow, as it creates non-linear jumps in execution.
  • Impact on Complexity: +1 for every goto statement because it breaks the logical flow.

Example (if AL had goto)

procedure MyProcedure()
begin
    if SomeCondition then
        goto Label1;

    Label1:
    DoSomething();
end;

👆 This goto would add +1 to Cognitive Complexity.

2. "Break or continue to a label and other multi-level jumps"

* Explanation: These control structures skip multiple levels of execution, making it harder to follow.
* Impact on Complexity: +1 for each multi-level break/continue.

#### Example (if AL had break on repetitive statements)

procedure MyProcedure()
var
    i, j: Integer;
begin
    for i := 1 to 10 do begin
        for j := 1 to 10 do begin
            if i + j > 10 then
                break; // Multi-level jump → +1 complexity
        end;
    end;
end;

👆 The break skips multiple iterations and increases Cognitive Complexity by +1.

3. "But because an early return can often make code much clearer, no other jumps or early exits cause an increment"

Key phrase: "early return can often make code much clearer"
Explanation: Unlike goto, break, or continue, an early return (exit in AL) is considered good practice for readability.
Impact on Complexity: No complexity increment!

Example

procedure MyProcedure()
begin
    if not SomeCondition then
        exit; // Early return → No increase in complexity

    DoSomething();
end;

Conclusion

This still doesn't really give us any clarity on whether or not an error is early access, so I asked again ChatGPT to reason on this.

Should Error() be considered an early exit?

Arguments for treating Error() like exit (no complexity increase):
✅ It stops execution immediately, just like exit.
✅ It prevents unnecessary nesting, which improves readability.
✅ It doesn’t introduce extra paths that make the flow harder to follow.

Arguments for increasing Cognitive Complexity (+1):
❌ It is not a simple return, but an exception-throwing mechanism.
❌ Error handling (if any) could make reasoning about the flow harder.

The whitepaper explicitly states: "An early return can often make code much clearer, so no other jumps or early exits cause an increment."

🔹 Key phrase: "no other jumps or early exits cause an increment".
🔹 Since Error() is a form of early exit, it logically should not increase Cognitive Complexity.


Okay, what ChatGPT is explaining here is well substantiated and I think this lines up with your argumentation. Also I think I've found an C# equivalent of this here, where the example below also counts as a guard clause.

public void ProcessOrder(Order order)
{
    if (order == null)
        throw new ArgumentNullException(nameof(order), "Order cannot be null.");

    if (!order.IsPaid)
        throw new Exception("Order has not been paid for.");

    if (!order.IsInStock)
        throw new Exception("Order cannot be shipped as it is out of stock.");

    ShipOrder(order);
}

I'm going to give it some more thought in the next days, because I want to make a well-considered and correct decision on this, to to exclude the possibility that we would have to revise this in the future, which could result in a "breaking change" as the calculation of the metric then would change.

multi-level jump

With this detailed breakdown of the documentation I see I need to implement multi-level jumps and also include Quit(), Break(), Skip() for report and xmlport objects.

Currently the continue will always count as an Early return, where I need to handle multi-level jumps.

procedure MyProcedure()
begin
    foreach Number in Numbers do
        if Number > 10 then
            continue;    // +0 Early return
end;

procedure MyProcedure2()
begin
    foreach Number in Numbers do
        if Number > 10 then
            if not System.GuiAllowed() then
                continue;    // +1 multi-level jump
end;

@Arthurvdv Arthurvdv changed the title LC0010 cyclomatic complexity finetuning Cognitive versus Cyclomatic Complexity Feb 13, 2025
@lv-janpieter
Copy link
Author

lv-janpieter commented Feb 13, 2025

Wow indeed there are many more scenario's to consider. I did not realize. Well done. I agree is should be well thought through. Althoug officially lintercop is still in preview so users have to expect some surprises. But when it is integrated in build pipelines you don't want additional warnings pop up because of changed rules.

Exactly. It is equilavant to a throw error in c#. It stop's execution of the function (unless there is a try / catch, but in that case I guess the try catch is already counted). ChatGPT is much better in explaining than me.

Multi level jump
So what do you mean with the multi-level jump? The continue is only exiting the function because there is no coding behind the foreach? Normally the continue is a linear flow breaking call and should have a score unlike exit's. In this case because there is no code it is for debate, like the single statement begin/end it is avoidable by just using exit. I'm not sure if the absent of code behind the foreach should mean it doesn't add a score. When reading the code you still have to search for the end of the loop to find out there is no code. In this example, it is easy because the function and loop is small. In longer loops you may have to scroll down to draw this conclusion. So in my opinion this is harder to read then a straight exit.

Your link to the documentation doesn't work, but I found another explanation (targeted at java).
https://www.baeldung.com/java-cognitive-complexity
Chapter 3, continue is considered as a 'jump-to label' (breaking linear code) and this would add a score.

@Arthurvdv
Copy link
Collaborator

Arthurvdv commented Feb 15, 2025

Some more reading into jumps and labels, I've found this the cognitive complexity whitepaper is a bit vague/ambiguous, which helped me better understanding this.

Jumps to labels

goto adds a fundamental increment to Cognitive Complexity, as do break or continue to
a label
and other multi-level jumps such as the break or continue to a number found in
some languages. But because an early return can often make code much clearer, no other
jumps or early exits cause an increment.

B1. Increments

  • goto LABEL, break LABEL, continue LABEL, break NUMBER, continue
    NUMBER

Where the AL Language doesn't support break or continue to a number, makes that we can handle the continue the same as an exit. Folks, always verify the output you're getting from an AI generated response 😅

Should Error() be considered an early exit?

In the same post as I was referring to, I find;

https://www.sonarsource.com/docs/CognitiveComplexity.pdf

Appendix C has a code examples containing a break; (break without a label)
No increment in assigned/given

If I now understand the "..an early return can often make code much clearer, no other jumps or early exits cause an increment.." correct, it means that a "no other jumps" mean a jump without an label or number. By this logic a Error is an jump without a label/number and could be treated as a early exit.


I think I now have everything covered, and the rule is ready for a release candidate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants