-
Notifications
You must be signed in to change notification settings - Fork 0
University algorithm-oriented project that accurately detects complex networks of influential mafia families represented as vertices and edges of rare graphs. The algorithms are being reduced to the Satisfiability Problem using a boolean formula that will be evaluated by an SAT Solver, deciding whether the initial clause has been satisfied or not.
mirunamariafatu/Reduction-to-SAT-Problem
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
******************************************************************************* GIGEL AND THE MAFIA - Universiy Project Algorithm Analysis 2020-2021 Fatu Miruna-Maria 321CA ******************************************************************************* The main goal of this project is to conduct an investigation on the mafia society in the town, which aims to detect complex networks of influential mafia families, ending with the arrest of as few families as possible through which all its allied families will be discovered and abolished. In this respect, the problems encountered in this investigation process will be reduced to the SAT problem, querying and asking for help of an Ancient Oracle at every step in order to validate our proposed solution. Generally, the satisfiability (SAT) problem consists of determining if a Boolean formula, that will be later evaluated by the Ancient Oracle, is satisfiable or unsatisfiable using the conjunctive normal form - a conjuction of clauses, where every clause is a disjunction. -------- ~ TASK 1 - Planting the Spies ~ --------- The purpose of the first task of the project is to infiltrate a number of spies (k) in the networks of mafia families. In the interest of the mission to be accomplished and the spies not to be discovered, the families (nodes) that have unidirectional connections (edges) between them must necessarily have different spies. This property points out the similarity of the current problem with the famous problem of k-colorability by reducing to the SAT problem. Consider the following notations: N - number of families (nrFamilies) K - number of spies (nrSpies) M - number of relations (nrRel) • formulateOracleQuestion() This method builds the whole key plan to successfully distribute spies to families by creating valid clauses, as will be seen in the implementation logic: ▶ The logic of choosing variables: For each mafia family there will be a k number of numerical variables, where k - number of available spies. -> Total variables = N * K Demonstration: N -> 1 : N N = 1 var: 1 - family no.1 can be investigated by the spy no.1 var: 2 - family no.1 can be investigated by the spy no.2 ... var: K - family no.1 can be investigated by the spy no.K N = 2 var: K + 1 - family no.2 can be investigated by the spy no.1 ... var: (N-1) * K + K: family no.2 can be investigated by the spy no.K ... N = N var: N * (K-1) + 1: family no.N can be investigated by the spy no.1 ▶ The logic of the created clauses: In order to create the normal conjunctive form to be analyzed by the oracle, 3 types of clauses were designed: 1. Clauses stating that every family must have at least one assigned spy. 1 ∨ 2 ∨ 3 ∨ 4 -> Assuming that there are four spies available for ... the mission ... N * (K-1) + 1 ∨ 1 * (K-1) + 2 ∨ 1 * (K-1) + 3 1 * (K-1) + 4 2. Clauses requiring that each family should not have more than one assigned spy. 1 ∨ 2 ... -> Example 1 ∨ K 3. Clauses that establish that no family that has a direct connection with another one has the same assigned spy. ▶ Time Complexity: The first nested "for" instructions in this method, used for creating the first two boolean clauses have: O((K-1)*(K-2)/2 + K) * O(N) -> O(K^2 + K) * O(N) -> O(K^2)*O(N) --> O(K^2 * N) The second nested "for" instruction through which the third set of clause is created has: --> O(N * K * adjacent edges) ` ~ Final complexity: O (N*K^2) + O(N * K * adjacent edges) • readProblemData() The data in the input file is processes by constructing an unidirectional graph where the families represent the nodes, and the edges are represented by the relationships established between the families. The data extracted from the input is processed in O(M) complexity; • decipherOracleAnswer() Extract the oracle answer which consists of a list of negative or positive numerical variables, where the negative variables are marked as FALSE, while the positive ones are marked TRUE. - O(N*K) time complexity • writeAnswer() The method writes to the output file the number of the assigned spy to each family - O(N) complexity; -------- ~ TASK 2 - Investigating the extended families ~ --------- The mission of the second task is to investigate an extended family of a given dimension. To simplify this problem, families and the connections between them can be seen as a graph in which the extended family is a clique of size K. Consider the following notations: N - number of families (nrFamilies) K - number of spies (nrSpies) M - number of relations (nrRel) • formulateOracleQuestion() In the respect of implementing a process of reducing instances of the current problem to instances of the SAT Problem, there will be created CNF boolean clauses completing an entire final formula that will be satistiable only if the graph has a clique of size K. ▶ The logic of choosing the variables: A convinient way to start the implementation is by choosing a suitable set of numerical variables, as following: - for every 1 <= i <= k there will be a list of variables x_iv ,where i - index of the ith node in the clique v - current node Total number of variables = N * K Demonstration: N -> 1 : N N = 1 var: 1 - family no.1 might be part of the clique var: 2 - family no.1 might be part of the clique ... var: K - family no.1 might be part of the clique N = 2 var: K + 1 - family no.2 might be part of the clique ... var: (N-1) * K + K: family no.2 might be part of the clique ... N = N var: N * (K-1) + 1: family no.N might be part of the clique ▶ The logic of creating the clauses: The querying process of the Ancient Oracle involves defining three logical constraints that align with the bounds of the problem: 1. Clauses that implies the fact that there must be exactly one i_th vertex in the clique for each i between 1 and the given dimension K. x_1v ∨ x_2v ∨ x_3v ∨ x_4v , 1 <= i <= K 2. A set of clauses involving that no vertex can be at the same time the i_th and the j_th (where i != j) vertex of the clique. In other words, two different vertices can not be both the i_th vertex in the clique. ¬x_iv ∨ ¬x_jv , i != j ¬x_iv ∨ ¬x_iw , v != w 3. A list of clauses that emphasize the fact that if there in no edge between v and w, the nodes v and w can not belong to the same clique. ¬x_iv ∨ ¬x_jw , i != j, v != w ▶ Time Complexity: The first set of "for" instructions in which the first type of clauses are formed has a complexity of O(N*K). The second nested "for" intruction in this method contains two "for" loops each of them having N * K operations. Thus, the entire "for" construction will have O(N*K*N*K) which means a O(N^2*K^2) time complexity. Also, another way of computing the complexity of the instructions is by counting the number of created clauses due to the fact that the purpose of all "for" instrutions is to create suitable clauses for the CNF formula. Total time complexity: O(N*K) + O((N^2*K^2) • readProblemData() The data in the input file is processes by constructing an bidirectional graph using an adjacency matrix where the families represent the nodes, and the edges are represented by the relationships established between the families. The data extracted from the input is processed in O(M) complexity (read the data & create the adjacency matrix), while creating the variables takes up to O(N*K) complexity; • decipherOracleAnswer() Extract the oracle answer which consists of a list of negative or positive numerical variables, where the negative variables are marked as FALSE, while the positive ones are marked TRUE. - O(number of total variables) -> O(N*K) time complexity • writeAnswer() The method writes to the output file all families' index that make up the clique of size K. - O(N) complexity; -------- ~ TASK 3 - Mafia arrest ~ --------- In order to constitute an efficient investigation and the number of arrests to be minimal for which the mafia network will be abolished, the implementation of this problem will be reduced to two main ideas: 1. Creating a complementary graph of the complementary problem of this task 2. Reducing the current problem to a previous task - task2 in order to find out the maximal clique of the complementary graph of this problem The most important reason of reducing this problem to the SAT problem using a complementary graph is the fact that by successively querying the Oracle we can find the maximal clique of the complementary graph. Thus, the families (nodes) we are looking for are the ones that do not belong to the maximal clique because they represent the nodes that make up the minimal clique of the main graph which represent the families that should be arrested. Consider the following notations: N - number of families (nrFamilies) K - number of spies (nrSpies) M - number of relations (nrRel) ▶ Time complexity: By successively querying the Oracle and reducing the current problem to a previously solved task -task 2-, the time complexity should be: Main complexity: O(N * p) , p = how many times the Oracle has been queried (by successively using task2 functionalities) • readProblemData() The data in the input file is processes by constructing an bidirectional graph using an adjacency matrix where the families represent the nodes, and the edges are represented by the relationships established between the families. The data extracted from the input is processed in O(M) complexity (read the data & create the adjacency matrix). • reduceToTask2() Writes to the input file of task 2, creating the adjacency matrix of the complementary problem. -> O(N^2) time complexity • extractAnswerFromTask2() Extracting all processed data from the task2 output file. -> O(dimension of the maximal clique found using the complementary graph) • writeAnswer() The method writes to the output file all families' index that should be arrested. - O(N) complexity; -------- ~ BONUS - Mafia arrest 2nd stage ~ --------- This task closely follows the implementation logic of task 2 and 3, reducing the current problem to the SAT problem in a very similar way. Thus, the process of creating the variables and clauses follows the same algorithms as the ones previously described for task 2, considering some essential changes: 1. Approximating the value of the maximal clique size (K), reducing the current graph to its complementary problem 2. Using the Weigthed Partial Max-SAT input format, the created clauses can belong to two different categories: hard clauses or soft clauses Consider the following notations: N - number of families (nrFamilies) K - number of spies (nrSpies) M - number of relations (nrRel) ▶ The algorithm for approximating the value of maximum k can be demonstrated as follows: Considering that for each node in the complementary graph there will be added edges between them and all the remaining nodes in the graph, are obtained the following operations: - Adding N-1 edges -> there will be edges between the first node and all the remaining nodes -> clique dimension -> 2 - Adding N-2 edges -> two nodes in the graph will have connections with all the remaining nodes in the graph -> clique dimension -> 3 ....... - Adding N - k -> clique dimension -> k + 1 Therefore, by repeatedly subtracting the possible edges added to the graph from the total number of edges of the graph (complementaryEdges), the approximate value of the maximal clique dimension of the complementary graph will be determined. ▶ Determining the type of the clauses: The first set of clauses created will be the only set of soft clauses, each of them having a different weight than the previous one. By doing this, the Oracle will try to fulfill as many clauses as it can in order to find the maximal clique. Hard clauses are represented by the last two sets of clauses created in the formulateOracleQuestion() method so that they establish the minimum necessary conditions that must be met so that a clique is valid. These refer to clauses no.2 and no.3 of the clauses used for task 2 (more details are provided in the explanation of task 2). For further information, comments in the code should clarify any ambiguity.
About
University algorithm-oriented project that accurately detects complex networks of influential mafia families represented as vertices and edges of rare graphs. The algorithms are being reduced to the Satisfiability Problem using a boolean formula that will be evaluated by an SAT Solver, deciding whether the initial clause has been satisfied or not.
Topics
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published