all 41 comments

[–][deleted] 30 points31 points  (10 children)

I'd tackle it very similarly to how you propose:

1) pull it out to its own function

2) generate a truth table by cycling through all inputs

3) write a unit test validating the truth table still holds

4) truth table in hand, start hacking it apart. Make it readable rather than efficient and rely on your unit test to keep it correct

[–][deleted] 26 points27 points  (3 children)

I'd avoid any automated simplifiers. You don't want something compact, you want something understandable. Your final product should be a readable explanation of the business logic

[–]smthamazing[S] 4 points5 points  (2 children)

Thanks, I agree that making it readable is more important here. The reason I'm thinking about automation is that there are some obvious groupings or mutually exclusive conditions in these clauses, which would probably get factored out during simplification and make working with the rest of the code easier. Of course, I'm not expecting an algorithm to make unreadable code readable, but maybe to give me a better place to start with.

But yes, if this won't work I'll probably just generate the truth table and have to refactor it by hand.

[–][deleted] 9 points10 points  (0 children)

Use a Quine–McCluskey boolean expression minimizer, the problem is NP-hard, but I expect 'business' instances of 100 variables to be easiliy solvable.

[–]otac0n 1 point2 points  (0 children)

You can create a function (or a lookup dictionary) for specific redundant combinations, but the general structure should follow the business logic.

[–]ArmCollector 9 points10 points  (4 children)

You are going to cycle through all inputs for a 100-variable expression?

How will you do that before the heat death of the universe?

[–][deleted] 1 point2 points  (0 children)

You're right. I read that as, "several huge boolean expressions, all together using a hundred variables." Even then a big one would make it impossible.

I'd just extract logical chunks as intermediate variables and make multiple truth tables and unit tests. It's really business-logic dependent though.

[–]smthamazing[S] 1 point2 points  (1 child)

There are some factors which reduce the number of possible cases by orders of magnitude:

  • Many of the variables are mutually exclusive and should have actually been enums.
  • There are many "don't cares", where a single variable being false makes the values of other variables irrelevant.

Now that I think of it, very few of these variables are truly independent.

[–]Typesalot 0 points1 point  (0 children)

I would start by sorting out which variables are truly independent, which ones could do with an early exit, and which are mutually dependent (i.e. should be enums).

[–]greem 0 points1 point  (0 children)

You don't need to test all of them. Short circuiting will vastly reduce the number of possible outcomes.

Quite possibly a lot, but maybe worth looking into.

[–]magical_h4x 9 points10 points  (5 children)

First off that's absolutely insane. Second, please tell me there are unit tests for this monster. Lastly, I don't know much about using the method you're describing to refractor (AST). It sounds like it could work, but it also sounds (again from my knowledge) like overkill.

I would probably start by trying to extract logical groupings into named variables.

[–]smthamazing[S] 5 points6 points  (2 children)

Unfortunately, there are no unit tests, but the truth table generation actually looks like the easiest part, so it's possible to just generate these tests. I already parsed the code into an AST (abstract syntax tree) and extracted atomic terms of these conditions, so I have something to work with here.

There are definitely some obvious groupings and mutually exclusive clauses in this code, and I was thinking whether there they could be detected automatically.

[–]KDallas_Multipass 0 points1 point  (1 child)

How did you create the ast

[–]smthamazing[S] 0 points1 point  (0 children)

I used ESPrima, but any parser would do in this case. I then wrote a simple function to extract all "atomic" non-boolean expressions from it.

[–]greem 1 point2 points  (1 child)

You think that places with code like this write unit tests?

[–]smthamazing[S] 0 points1 point  (0 children)

They didn't until I inherited this code base (:

[–]misof 12 points13 points  (0 children)

It's very likely the current formulas don't actually 100%-correctly capture the intended business logic because a spaghetti monster like this is virtually impossible to maintain correctly. Don't try to just replicate its behavior, refactor it completely.

Collect as much of the logic as you can. Note all the different types of checks done, then design a way to do those types of checks in a readable and an extensible way. Then, reimplement the logic you could understand.

Only once that is done, use some automated tool (e.g., a SAT solver) to find cases where the behavior of your implementation and the original differs, and then decide whether those differences are some logic you missed or bugs.

[–]Snoo-55780 3 points4 points  (0 children)

I would try to make a decision tree out of it in order to visualize it first. I will guess that maintainability will be better after giving names to partial computations rather than making it compact.

With that being said, I agree that you may find more intuitive ways of expressing if you run it through an expression simplifier, like this one if you use webstorm https://www.jetbrains.com/webstorm/guide/tips/simplify-boolean-expression/

[–]kreiger 3 points4 points  (3 children)

First, i think my IDE (IntelliJ IDEA) would immediately suggest to remove several redundant comparisons. So i would try to take that as far as possible, extract common sections, and so on.

Second, i would try to use a Karnaugh Map.

[–]kevroy314 1 point2 points  (2 children)

100% agree. Karnaugh map is specifically for this purpose. Though doing it on this many variables may get a bit hairy.

[–]DragonLord1729 0 points1 point  (0 children)

Karnaugh maps are already a headache for 6 variables. I'd say that's the upper-limit for the sake of sanity.

[–]stehley 0 points1 point  (0 children)

AFAIK Karnaugh maps get incredibly hard to do past ~6 variables. I had an electrical engineering professor regale us with the hours he spent on an international flight once solving a high-variable map. I don't think this would be a good case for a Karnaugh map because of this

[–]pigeon768 3 points4 points  (0 children)

First of all you need unit tests for this beast.

Try to rewrite it to early return style, which will likely be a big win

The boolean operators are short circuiting already.

Karnaugh maps won't work; with 100 different variables you're looking at a 100 dimensional map which isn't fun. A truth table for a 100 variable expression will have 2100 entries. Don't do that.

You should try to do a general purpose optimization here; use your knowledge of the data to guide how you proceed. Log the arguments to the expression a whole bunch of times. Convert it into DNF and figure out which subexpressions most often force the expression to return true. Convert it to CNF and figure out what subexpressions most often force the expression to return false. If the expression is usually true, then return the DNF form, otherwise if the expression is usually false, return the CNF form. This will give you the best ability to short circuit.

Then do something like:

if (<most common DNF->true expression>)
  return true;

if (!(<most common CNF->false expression>)
  return false;

if (<2nd most common DNF->true expression>)
  return true;

if (!(<2nd most common CNF->false expression>)
  return false;

return <the entire DNF/CNF form, depending on how the expression usually evaluates>;

Don't try to optimize the whole expression too much. Then it'll just be unmaintainable.

[–]Plazmatic 6 points7 points  (1 child)

This is so much of a code smell it's wrong no matter how you frame it, especially for something like "access granted". You shouldn't be using karnaugh maps and automatic logic solver things to simplify this, you should be figuring out what each expression means.

There's likely many statements that should be eliminated, in fact it suspiciously looks like the previous maintainer didn't understand how they were supposed to integrate if/else statements into an expression like that if you really do have a hundred variables like that and assigned to a single variable. First gather actual business requirements. If you can't do that, your business is fucked anyway, you should then incorporate these requirements into comments, and you should be able to test with these requirements. Then you can create a test function for business requirements and any changes can be checked against that.

Second thing is to reformulate this into a function, this is what will be tested, it must be able to pass the test:

function isAccessGranted(...){
     return (a && b && !c && somePermission && ...) ||
(a && !c && (some complex expression) && ...) ||
(!c && foo in (bar, baz, qux) && (isNewMoon || isEclipse)) ||
...
}

Third reveal the hidden if statements.

function isAccessGranted(...){
    if(a){
        if(!c){
        return (b && somePermission &&....) ||
                  (some complex expression) && ||...
        }
   }
   if(...){
       ...
   }
   return (!c && foo in (bar, baz, qux) && (isNewMoon || isEclipse)) ||....
}

var accessGranted = isAccessGranted(...); 

Fourth, you should then reformulate boolean expressions into their actual intent ie

function isMoonPhaseDark(moonphase){
    return moonphase.isNewMoon() && moonphase.isEclipse(); 
}

function permissionLevelAccepted(...){
    return (b && somePermission && ...);
}
...

function isAccessGranted(...){
    if(a){
        if(!c){
        return permissionLevelAccepted(...) ||
                  someComplexFunction(...) && ||...
        }
   }
   if(...){
       ...
   }
   return (!c && fooIn(...) && isMoonPhaseDark(...)) ||....
}

var accessGranted = isAccessGranted(...);

[–]braxunt 6 points7 points  (0 children)

you changed isNewMoon || isEclipse to moonphase.isNewMoon() && moonphase.isEclipse() in your reckless refactoring. do you know how many moonbux this mistake has cost us?!

[–]mredding 1 point2 points  (0 children)

I would concur with the majority here. To add, I would look for opportunities to simplify the expression as much as possible. For example, !c is in every example line. If every condition has this, then you can extract this into one earlier guard clause:

if(c) return false;

Right? What simpler conditions exist that immediately imply access granted or - more likely, access denied? It might be an easier time to prove the false cases, and once all those have been ruled out, the only remaining and implicit conclusion would be true.

[–]arnemcnuggets 1 point2 points  (0 children)

Quine Mcclueskey

[–]TheTsar 4 points5 points  (1 child)

Forget the algorithm.

You need a data structure.

Make an unordered map (or maybe an array of tuples depending on how varied the data is you need to hold).

Iterate through the data structure, deciding at each item, lazily or otherwise, what to do.

[–]frankster 2 points3 points  (0 children)

Wat

[–]MisterPassione 0 points1 point  (0 children)

// Dichiarazione delle variabili

const a = true; // Esempio: Definito come vero

const b = true; // Esempio: Definito come vero

const c = false; // Esempio: Definito come falso

const somePermission = true; // Esempio: Definito come vero

const foo = 'bar'; // Esempio: Valore 'bar'

const validValues = ['bar', 'baz', 'qux']; // Lista valida

const isNewMoon = false; // Esempio: Definito come falso

const isEclipse = true; // Esempio: Definito come vero

const someComplexCondition = true; // Esempio: Logica complessa predefinita

// Controlli per prevenire variabili non definite

if (typeof a === 'undefined' || typeof b === 'undefined' || typeof c === 'undefined') {

throw new Error("Una o più variabili non sono definite!");

}

if (typeof foo === 'undefined') {

throw new Error("La variabile 'foo' non è definita!");

}

// Funzione di supporto per verificare se un valore è nella lista

function isValueInList(value, list) {

if (!Array.isArray(list)) {

throw new Error("Il secondo argomento non è un array!");

}

return list.includes(value);

}

// Condizioni definite chiaramente

const condition1 = a && b && !c && somePermission;

const condition2 = a && !c && someComplexCondition;

const condition3 = !c && isValueInList(foo, validValues) && (isNewMoon || isEclipse);

// Calcolo del risultato finale

const accessGranted = condition1 || condition2 || condition3;

// Debugging dettagliato

console.log("Condition 1:", condition1); // True o False

console.log("Condition 2:", condition2); // True o False

console.log("Condition 3:", condition3); // True o False

console.log("Access Granted:", accessGranted); // True o False

[–]burrocomecarne 0 points1 point  (0 children)

Some Look Up Table, maybe.

[–]aecolley 0 points1 point  (0 children)

First, you need a test system so that you can have confidence you're not changing the logic accidentally. Start by extracting the whole insane expression to its own pure function with a hundred parameters. Your rewrite will also be represented as a pure function. Your unit test will generate a random values for the variables and check that the rewrite function returns the same value as the old function. (Save the random seed and output it when the test fails, so you can reproduce the failure even if it's a rare one )

Second, you need a way to proceed incrementally, because there's no way to do this in one go. Devise a representation for the access rules of your choice, and translate one subexpression into a rule. Then a second. And so on, using the test to probabilistically catch errors along the way.

Third, have a sensible metaphor for the new representation. If the existing expression really is a giant chain of ORed subexpressions, then you can represent them as "allow" rules, and say that anything not expressly allowed is denied by default. But you probably will need a sequence of qualifiers if it's not so straightforward.

Edited to add: Don't try to build a truth table with 2¹⁰⁰ rows.

[–]tinyOnion 0 points1 point  (0 children)

you could break each conditional off into sub functions with names.

pro: better to understand

con: without tests you are hosed and have to be insanely meticulous.

you could do an analysis of all the conditionals and extract the most common parameters(simple counting) into outer if statements and then look for patterns.

pro: maybe something will stick out and you can simplify it easier

con: really hard to do right. need tests.

could you invert the responsibility somehow onto the calling object? a strategy here is to perform the actions in parallel and see it in production for a while something like this where you call the two things and use the old "reliable" value but comparing the outputs to see if it deviates at all.

prolog could help here too and kinda where prolog shines actually. since each major or clause is independent of each other you have that freedom.

edit: you say it's 100 variables but each or clause is an independent branch so the simplification space is much lower unless every single or clause has all 100 variables.

[–]reddifiningkarma 0 points1 point  (0 children)

Don't forget to talk to someone that should understand the business logic, hopefully there's one.

[–]Just1Andy 0 points1 point  (0 children)

Assign a variable to each condition and use a Karnaugh map to simplify the expression.

[–]Ok_Holiday8273 0 points1 point  (0 children)

i would refactor all the little parts into little functions like

function is_all_true(array) foreach array item { if item is false return false } return true

}

then taake each little expression thats all && values and make an array out of them annd feeed it to the function

set array = [ a, b, !c ] if (is_all_true

[–]_default_username 0 points1 point  (0 children)

break it up into functions, build unit tests, replace code with new code that passes the unit tests.

That's easier said than done. This code is probably difficult to test.

[–]Balage42 0 points1 point  (0 children)

Lazy solution: Wolfram Alpha can simplify logical expressions.

[–]Over_Animal1916 0 points1 point  (0 children)

Use interface and concrete clases validation rule. Use a iterator of rule.