RLFAP

Introduction
The french "Centre d'Électronique de l'Armement" (CELAR) has made available, in the framework of the european project EUCLID CALMA (Combinatorial Algorithms for Minitary Applications) s set of Radio Link Frequency Assignment benchmark problems (RLFAP) build from a real network, with simplified data. These benchmarks had been previously designed by the CELAR to assess several different Constraint Programming languages.

These benchmarks are extremely valuable as benchmarks for the CSP community and more largely for constraint programming:

• The constraints are all binary (involving no more than two variables), non linear and the variables have finite domains.
• These are real-world size problems, the larger instances having round one thousand variables and more than five thousand constraints. All these instances have been built from a unique real instance with 916 links and 5744 constraints in 11 connected components.
The aim of this page is to present some essential facts which will be useful for people trying to tackle these instances. I also tried to give some ideas of the purely practical aspect of these problems and more specifically of the origine of the criteria optimized. I'm definitely not a radio link assignment specialist: let the knowledgeable ones give me more to say.

From a purely algorithmic point of view, if you got better results than the "best results" presneted in this page, I would be deighted to get your results in order to update the page. Let me finally remind you that anybody presenting results on these instances should explicitely refer to the "Centre d'Electronique de l'Armement" in his paper.

People interested in Frquency Aassignemnt Problems should have a lokk to the FAP web site.

Problem description
The Radio Link frequency Assignment Problem consists in assigning frequencies to a set of radio links defined between pairs of sites in order to avoid interferences. Each radio link is represented by a variable whose domain is the set of all frequences that are available for this link. The essential constraints involve two variables F1 et F2:

|F1-F2|> k12

The two variables represent two radio links which are "close" one to the other. The constant k12 depends on the position of the two links and also on the physical environment. It is obtained using a mathematical model of electromagnetic waves propagation which is still very "rough". Therefore, the constants k12 are not necessarily correct (it is possible that the minimum difference in frequency between F1 and F2 that does not yield interferences is actually different from k12). In practice, k12 is often overestimated in order to effectively guarantee the absence of interference. For each pair of sites, two frequencies must be assigned: one for the communications from A to B, the other one for the communications from B to A. In the case of the CELAR instances, a technological constraint appears: the distance in frequency between the A->B link and the B->A link must be exactly equal to 238. In all CELAR instances, these pairs of links are represented as pairs of variables numbered 2k and 2k+1. The possibility of expressing constraints such as >|F1-F2|> k12 suffices to express the graph coloring problem and it is therefore clear that the RLFAP is NP-hard.

The criteria optimized
In order to facilitate the later addition of new radio links, one tries to build solutions that leave room for these new links. The pure satisfaction problem is therefore not crucial in itself (even if...). Two essential criteria are ususally considered for feasible instances:
• Minimization of the maximum frequency used: the frequencies above the last frequency used can be tried for new radio links.
• Minimization of the number of frequencies useds: the different frequencies unused can be tries for new radio links. Here, it is likely that the various frequencies available will be more distant one from the other and therefore it is more likely that a new radio link can be inserted without any modification of the previous setup. In practice, this criteria seems therefore more interesting than the previous one.
The first criteria, a Min-Max type criteria, is usually less expensive to optimize from an algorithmic view point. A third type of criteria is used in practice when the problem is unfeasible (inconsistent): Minimization of a weighted sum of the violated constraints. As we said before, in constraints such as |F1-F2|> k12, the constant k12 is usually overestimated and it is possible that such a constraint be violated without any real interference occurring in practice. The minimization of a weighted sum of the constraints violated corresponds to a minimization of costs induced to check the quality of the communication (and possibly a modification of the position of the antennas). The criteria used on unfeasible instances being different from the criteria used on feasible instances, this gives back some interest to the pure satisfaction problem which, in practice, will allow to select the optimization criteria to use.

When nwe links are added to an existing communication network, some variables are already assigned and it is either impossible to modify the value of these variables (hard constraints) or, again, one must minimize a weighted sum of the variables being modified. The size of the modification is not important. What is important is that the variable values has changed which means that somebody must apply this change. One can imagine that the cost associated with a variable is related to the cost of having the change done.

Where can I find the instances
Eleven CELAR instances are available. Forteen extra GRAPH instances have been independently generated. It is possible to download these instances here. These eleven instances are all built from a single network (some instances are closely related to each others).For each instance, a criteria to optimize on this instance is given. Considering these instances as benchmarks, we propose to forget this indication. In the sequel, we will distinguish eleven instances on one side and the various problems (criteria) that one can consider to solve on each of these instances.
Syntax of the files
Each instance is distributed in four files in a single directory.
• var.txt: gives the list of the variables,
• dom.txt: gives the domain's definition,
• ctr.txt: gives the list of all constraints,
• cst.txt: defines the criteria to be optimized a priori on this instance.
All the files are made of fixed size fields and use space as a seprator.
The var.txt file
This file describes all the variables in the instance. Each line corresponds to one variable using 4 fields from which only the two first fields are always present.
• Field 1: the variable number. ATTENTION, the numbers are not necessary in sequence. Some instances actually involve only a subet of the variable of the original problem.
• Field 2: the domain number for the variable (cf. dom.txt file).
• Field 3: the initial value of the variable (if any). This field is not always present. If present, it means that the link has laready been assigned a frequency.
• Field 4: mobility of the variable i.e., the index of the cost of modification of its value (cf. cst.txt). The mobility can be equal to 0, 1 2, 3 et 4. A mobility equal to 0 indicates that the variable value cannot be modified (hard constraint). The values from 1 to 4 corresponds to increasing penalties whose values are defined in the cst.txt file.

The dom.txt file.
This file describes the domains used by the variables of the problem. Each line describes one domain. The first line defines a dummy domain that results from the union of all the domains.
• Field 1: the domain number, used in the var.txt file .
• Field 2: the domain cardinality.
• Field 3 and next: the values of the domains, in increasing order.

The ctr.txt file.
This file describes the constraints of the instance. Each line defines a binary constraint.
• Field 1: the number of the first variable.
• Field 2: the number of the second variable.
• Field 3: the constraint type. It is defined by a single character: (D (difference), C (cosite), F (fixe), P (préfixé) ou L (far fields). This field is useless in practice and is only available here to identify the origin of the constraint.
• Field 4: the operator used in the constraint (see next field). It can be "=" or ">".
• Field 5: the constraint deviation. It defines the constant k12 already mentionned. Together, Fields 4 and 5 define the constraint. The Field 4 indicates the realtional operator that must be used to compare the absolute value of the difference of the values of the variables to the integer given in the field 5 (deviation). The semnatics of a constraint is therefore:

• |Field1 - Field2| Field4 Field5

• Field 6: The index of weighting of the constraint. As for the mobility of assigned variables, it is an index that varies from 0 to 4. Value 0 indicates a hard constraint. Values from 1 to 4 indicates increasing weights as given in the cst.txt field. If this field is absent from the file, the constraint must be considered as an hard constraint (index 0).

The cst.txt file
It defines the criteria to be optimized on the instance a priori. This file has no precise syntax. It gives a textual description of the criteria and may also contain the definition of 8 numbers a1,..., a4, b1,..., b4. When the instance is unfeasible, one must minimize:

a1*nc1+...+a4*nc4+b1*nv1+...+b4*nv4

Where nci is the number of constraints whose weighting index is equal to i and which have been violated and where nvi is the number of assigned variables whose mobility index is equal to i and which have been modified. When may note that, according to the instance considered, the criteria to optimize a priori can be:

• to simply minimize the number of freqeuncies used (instances 1, 2, and 3).
• to minimize the number of frequencies used if the problem is feasible or else to minimize a weighted sum of violated constraints defined by a1,..., a4, b1,..., b4 (instances 6 to 11).
• to minimize the largest frequency used if the problem is feasible or else to minimize a weighted sum of violated constraints defined by a1,..., a4, b1,..., b4 (instances 4 and 5).

Significant facts and best results on the instances
Fo rall instances, the dom.txt is the same one bu the number of the domain used for each variable in each instance may change from one instance to the other. The largest domain contains 44 values.
• Instances 1,2 et 3: on purpose, these are under-constrained instances obtained by extracting a sub-problem from the original problem. The instances 2 and 3 are connected and defined by a subgraph of the largest connected component of the original problem (that contains 680 variables).
• scen01: The instance involves 916 variables and 5548 constraints. All the constraints are hard. None of the variables has been assigned yet. The instance is feasible (consistent). The constraint graph has several connected components. The criteria associated with the instance consists in minimizing the number of frequencies used. The provenly optimal solution uses 16 frequencies. This instance is extremely easy for satisfaction. A simple "backtrack" algorithm, with no filtering suffices to find a solution. The optimization problem is much more difficult.
• scen02: The instance involves 200 variables and 1235 constraints. All the constraints are hard. None of the variables has been assigned yet. The instance is feasible (consistent). The constraint graph has only one connected components. The criteria associated with the instance consists in minimizing the number of frequencies used. The provenly optimal solution uses 14 frequencies. This instance is extremely easy for satisfaction. A simple "backtrack" algorithm, with no filtering suffices to find a solution. The optimization problem is more difficult.
• scen03: The instance involves 400 variables and 2760 constraints. All the constraints are hard. None of the variables has been assigned yet. The instance is feasible (consistent). The constraint graph has only one connected components. The criteria associated with the instance consists in minimizing the number of frequencies used. The provenly optimal solution uses 14 frequencies. This instance is extremely easy for satisfaction. A simple "backtrack" algorithm, with no filtering suffices to find a solution. The optimization problem is more difficult.
• Instances 4 and 5: there are more constrained but not yet overconstrained problems.
• scen04: The instance involves 680 variables and 3967 constraints. It is in fact the largest connected component of the original problem. 280 variables among the 600 have already been assigned, with no mobility (hard constraint). Some of the constraints of the original problem have been slightly modified in order to guarantee the existence of a solution given the pre-assigned values. All the constraints are hard. The instance is feasible (consistent). The constraint graph has only one connected components. The criteria associated with the instance consists in minimizing the number of frequencies used. The provenly optimal solution uses 46 frequencies. ATTENTION, there was an error in the data-set originally that indicated that the criteria to optimize on this problem was the minimum maximum frequency criteria. Some of the variables being already assigned, a pre-filtering by arc-consistency gives a lot of pruning and yields an instance that is extremely easy for satisfaction. A simple "backtrack" algorithm, with no filtering, then suffices to find a solution. The optimization problem is slightly more difficult.
• scen05: The instance involves 400 variables and 2598 constraints. The graph of the problem is a subgraph of the largest connected component of the original problem. All the constraints are hard. None of the variables has been assigned yet. The instance is feasible (consistent). The constraint graph has only one connected components. The criteria associated with the instance consists in minimizing the maximum frequency used. The provenly optimal solution uses the maximum frequency 792. A pre-filtering by arc-consistency gives a lot of pruning and yields an instance that is extremely easy for satisfaction and which contains a domain reduced to the single frequency 792. This frequency must therefore be used and is also the maximum frequency available. The optimization problem is therefore solved by satisfaction alone.
• Instances 6,7 and 8: these instances are overconstrained and unfeasible (inconsistent).
• scen06: The instance involves 200 variables and 1322 constraints. The graph of the problem is a subgraph of the largest connected component of the original problem. Some of the constraints of the original problem have been slightly modified in order to guarantee unfeasibility. All the constraints using the "=" operator are hard (frequency shift between one way communication and the other way). None of the variables has been assigned yet. The instance is unfeasible (inconsistent). The constraint graph has only one connected components. The criteria associated with the instance consists in minimizing a weighted sum of violated constraints using weights of 1, 10, 100 and 1000. The best solution known has a cost of 3389. A first lower bound of 3005 has been proved. The cost of 3389 has later been proved to be optimal (see this paper ). Unfeasiblity is directly proved using arc-consistency enforcing. The optimization problem is challenging. Recently, a direct proof of optimality for has been obtained by A. Koster using a dynamic programming algorithm (using a good tree decomposition).
• scen07: The instance involves 400 variables and 2865 constraints. The graph of the problem is a subgraph of the largest connected component of the original problem. Some of the constraints of the original problem have been slightly modified in order to guarantee unfeasibility. All the constraints using the "=" operator are hard (frequency shift between one way communication and the other way). None of the variables has been assigned yet. The instance is unfeasible (inconsistent). The constraint graph has only one connected components. The criteria associated with the instance consists in minimizing a weighted sum of violated constraints using weights of 1, 100, 10000 and 1000000. The best solution known has a cost of 343 592. No proof of optimality or non-trivial lower bound exists for this instance. Unfeasiblity is directly proved using arc-consistency enforcing. The optimization problem is challenging. A new lower bound of 300 000 has been exhibited by A. Koster.
• scen08: The instance involves 916 variables and 5744 constraints. It is very close to the original problem. Few constraints of the original problem have been slightly modified in order to guarantee unfeasibility. All the constraints using the "=" operator are hard (frequency shift between one way communication and the other way) all the other are soft. None of the variables has been assigned yet. The instance is unfeasible (inconsistent). The constraint graph has more than one connected components. The criteria associated with the instance consists in minimizing a weighted sum of violated constraints using weights of 1, 2, 3 and 4. The best solution known has a cost of 262. No proof of optimality or non-trivial lower bound exists for this instance. Unfeasiblity is directly proved using arc-consistency enforcing. The optimization problem is challenging. A new lower bound of 150 has been exhinited by A. Koster usinh integer programming.
• Instances 9 and 10: These 2 instances are essentially identical. Only the weighting changes. They both are overconstrained.
• scen09: The instance involves 680 variables and 4103 constraints obtained from the largest connected component of the original problem. All the constraints using the "=" operator are hard (frequency shift between one way communication and the other way) all the other are soft. 586 variables are already assigned among which 280 have no mobility (index 0, hard constraint).The instance is unfeasible (inconsistent). The constraint graph has one connected components. The criteria associated with the instance consists in minimizing a weighted sum of violated constraints using weights of 1, 10, 100 and 1000. The best solution known has a cost of 15571. No proof of optimality exists for this instance but lower bound of 14875 has been proven. ATTENTION, when this lower bound was computed, le file syntax was ambiguous ans apparently, the lower bound was computed using a data interpretation that overestimated the lower bound. The value you may find in rechnical report of 14969 is incorrect. The value 14969-94=14875 should be correct. Unfeasiblity is directly proved using arc-consistency enforcing. The optimization problem is challenging. However, instances 9 and 10 appear to be easier than 6, 7 and 8 (with the same criteria). The best known solution has been proved optimal by A. Koster.
• scen10: The instance involves 680 variables and 4103 constraints obtained from the largest connected component of the original problem. All the constraints using the "=" operator are hard (frequency shift between one way communication and the other way) all the other are soft. 586 variables are already assigned among which 280 have no mobility (index 0, hard constraint).The instance is unfeasible (inconsistent). The constraint graph has one connected components. The criteria associated with the instance consists in minimizing a weighted sum of violated constraints using weights of 1, 2, 100 and 1000 for constraints and 10, 100, 10000 and 100000 for variables. The best solution known has a cost of 31516. No proof of optimality exists for this instance but lower bound of 31204 has been proven. ATTENTION, when this lower bound was computed, le file syntax was ambiguous ans apparently, the lower bound was computed using a data interpretation that overestimated the lower bound. The value you may find in rechnical report of 14969 is incorrect. The value 32144-940=31204 should be correct. Unfeasiblity is directly proved using arc-consistency enforcing. The optimization problem is challenging. However, instances 9 and 10 appear to be easier than 6, 7 and 8 (with the same criteria). The best known solution has been proved optimal by A. Koster.
• Scénario 11: It is the largest connected component of the original problem with no modification. The instance involves 680 variables and 4103 constraints obtained from the largest connected component of the original problem. All the constraints using the "=" operator are hard (frequency shift between one way communication and the other way). No variable is already assigned..The instance is feasible (consistent). The constraint graph has one connected components. The criteria associated with the instance consists in minimizing the number of frequencies used if it is feasible (it is) or else in minimizing a weighted sum of violated constraints. The best solution known uses 22 frequencies and is optimal. This instance is the only instance that can be considered as significant as a benchmark for satisfaction. Some filtering (forward-checking) as well as a good variable ordering is needed. The optimization problem is much more difficult.

The different problems one can tackle
Initially, each CELAR instance has one associated criteria. It seems reasonnable to distinguish instances and problems solved on each instance. One can therefore rey to tackle the following problem on each instance:
• simple satisfaction
• minimizing the number of frequencies used,
• minimizing the maximum frequency,
• minimizing a weighted sum of violated constraints using several weightings: 1,2,3,4 or 1,10,100,1000 or 1,100,10000,1000000.

Several observations and tricks have been observed and used during the CALMA poject:

For all CELAR instances, the precise contents of the variables along with the specific nature of the constraints using the "=" operator makes it possible to divide the number of variables by two. This is possible because these constraints are actually BIJECTIVE (for any given value in a domain there is a single value which is compatible in the other domain). One can verify this by considering domain 0 (dummy) that results from the union of all possible domains: for each values x below 238 (i.e., from 16 to 170), only values x+238 (i.e., from 254 to 408) is compatible with it. Value 240 is compatible with 478 alone. Each value from 254 to 408 is only compatible with one value below (between 16 and 170). The same schema reproduces itself for values from 414 to 554, each being compatible with one value between 652 and 792 (and vice versa). It is therefore possible to replace each pair of variable numbered 2k, 2k+1 by a single variable whose domain is composed of pairs of compatible values. Naturally, the constraints that previously involved theses variables should be modified accordingly (and the "=" based constraint can removed): the constraints that involved variable 2k must now extract the first component of the pair while the constraints that involved variable 2k+1 must extract the second component of the pair. For example, if a pair of variables 2k, 2k+1 both with doamin 5 (142 170 240 380 408 478) is considered, it will be replaced by a single variable whose domain will also contain 6 values: ((142,380), (170,408), (240,478), (380,142), (408,170), (478,240)). This simplification process is included in the CELAR problem reader available in the LVCSP library (file input-csp-celar.lisp, loading with the flag :simplified). Note that this process may (and usually does) yield a problem with a non-simple graph (there may be several edges between a single pair of variables). It is always a good idea to merge these multiple constraints in a single one to get a better filtering.

For solving optimization problems where the aim is to minimize the maximum frequency used, the following approach has given excellent results: one uses arc-consistency filtering to localize the minimum value of the maximum frequency that does not lead to a wipe-out. This can be done in a dichotomic way (the domains have a maximum number of 48 values and one can locate the "arc-inconsistency" frontier in less than 6 AC filtering). This is nothing but enforcing arc-consistency in a fuzzy/possibilistic CSP where the following possibility distribution is used on domains: the lower the value, the more possible it is. Finally, a satisfaction algorithm is applied using the minimum frequency identified above (it did not yield a wipe-out). In most cases, this instance is deasible (consistent) et is also optimal since smaller values did generate a wipe-out. If it is unfeasible, a dichtomic process may be used again on the values that remain.

For solving optimization problems where the aim is to minimize the number of frequencies used, it gets more complicated. One can sophisticate a satisfaction algorithm using a branch and bound approach. The problem is to find a good lower bound on the minimum number of frequencies that must be used. In the CALMA project, this lower bound was provided by the chromatic number or more precisely the T-coloring number which has been used (in the simplified problem where equality constraints have been removed there are only |x-y|> k constraints which are always harder than a difference constraint). One can improve this lower bound by taking into account the "=" based constraints. Computing the chromatic number of a graph is NP-hard, but the CELAR graphs are not very dense and graph-rewriting techniques (cited in Kanesky et Cheeseman AAAI paper "Where the really problems are") gives excellent results. This technique has been adapated and used by A. Kolen in the framework of the CALMA project with nice results. For more information, one can have a look to this CALMA technical report. One must also use a good value ordering heuristics which is here induced by the criteria. One prefers a value that has already been used and which is also present in a maximum number of future domains.

For solving optimization problems where the aim is to minimize a weighted sum of violated constraints, CSP algorithms designed to solve valued CSP such as the Russian Doll Search algorithm have been used along with a graph partitioning process to prove optimality of the best solution found so far of scen06. For finding good solutions, the best results have been obtained by local search techniques (actually, a genetic algorithm that uses integer linear programming to compute a fitness has been used by A. Kolen to get this optimal solution and others best solutions known on these problems). Recently, A. Koster has been able to build very tight (often optimal) lower bounds for these problems using a low tree-width decomposition of the CSP graph. This work is reported in his PhD thesis. Scen07 and scen08 are the only instances from the original set of problems which are open.