Minimum Relevant Variables In Linear System

Minimum relevant variables in linear system (Min-RVLS) is a problem in mathematical optimization. Given a linear program, it is required to find a feasible solution in which the number of non-zero variables is as small as possible. The problem is known to be NP-hard and even hard to approximate.


Definition

A Min-RVLS problem is defined by:
* A binary relation ''R'', which is one of {=, ≥, >, ≠};
* An ''m''-by-''n'' matrix ''A'' (where ''m'' is the number of constraints and ''n'' the number of variables);
* An ''m''-by-1 vector ''b''.

The linear system is given by: ''A x R b''. It is assumed to be feasible (i.e., satisfied by at least one ''x''). Depending on ''R'', there are four different variants of this system: ''A x = b'', ''A x ≥ b'', ''A x > b'', ''A x ≠ b''. The goal is to find an ''n''-by-1 vector ''x'' that satisfies the system ''A x R b'' and, subject to that, contains as few nonzero elements as possible.
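The definition can be made concrete with a small brute-force solver for the equality variant Min-RVLS[=]. The sketch below (plain Python with exact rational arithmetic; the function names are illustrative, not from any library) tries candidate supports in order of increasing size and returns the first support over which ''A x = b'' is consistent, which is exactly the minimum number of nonzero variables:

```python
from fractions import Fraction
from itertools import combinations

def consistent(cols, rhs):
    """Check whether cols * y = rhs has a solution,
    via Gauss-Jordan elimination over exact rationals."""
    aug = [row + [r] for row, r in zip(cols, rhs)]
    m = len(aug)
    width = len(aug[0]) - 1
    rank = 0
    for c in range(width):
        piv = next((i for i in range(rank, m) if aug[i][c] != 0), None)
        if piv is None:
            continue
        aug[rank], aug[piv] = aug[piv], aug[rank]
        for i in range(m):
            if i != rank and aug[i][c] != 0:
                f = aug[i][c] / aug[rank][c]
                aug[i] = [a - f * p for a, p in zip(aug[i], aug[rank])]
        rank += 1
    # The system is inconsistent iff some row reads "0 = nonzero".
    return all(any(v != 0 for v in row[:-1]) or row[-1] == 0 for row in aug)

def min_rvls_eq(A, b):
    """Smallest number of nonzeros in any solution of A x = b
    (assumes the system is feasible)."""
    n = len(A[0])
    A = [[Fraction(v) for v in row] for row in A]
    b = [Fraction(v) for v in b]
    for k in range(n + 1):
        for support in combinations(range(n), k):
            # Restrict to the columns in `support`; if the restricted
            # system is consistent, some solution has support in this set.
            cols = [[row[j] for j in support] for row in A]
            if consistent(cols, b):
                return k, support
```

For example, with ''A'' = [[1, 1, 0], [0, 1, 1]] and ''b'' = [2, 2], the solution ''x'' = (0, 2, 0) uses a single variable, so the solver reports a minimum support of size 1. Checking supports in increasing size is what makes the first consistent support optimal, though the search is of course exponential in ''n''.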


Special case

The special case Min-RVLS[=] was presented by Garey and Johnson, who called it "minimum weight solution to linear equations". They proved that it is NP-hard, but did not consider approximations.


Applications

The Min-RVLS problem is important in machine learning and linear discriminant analysis. Given a set of positive and negative examples, it is required to minimize the number of features that are required to correctly classify them. The problem is known as the minimum feature set problem. An algorithm that approximates Min-RVLS within a factor of O(\log(m)) could substantially reduce the number of training samples required to attain a given accuracy level.

The shortest codeword problem in coding theory is the same problem as Min-RVLS[=] when the coefficients are in GF(2).
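To illustrate the GF(2) connection, here is a minimal sketch (plain Python; the names are illustrative) of Min-RVLS[=] over GF(2): given a parity-check matrix ''H'' and a right-hand side ''s'', it searches for a minimum-Hamming-weight ''x'' with ''H x = s'' (mod 2), which is the syndrome-decoding form of the shortest-codeword search:

```python
from itertools import combinations

def min_weight_gf2_solution(H, s):
    """Minimum-weight x with H x = s over GF(2), found by trying
    supports in order of increasing size (exponential; for illustration)."""
    m, n = len(H), len(H[0])
    for k in range(n + 1):
        for support in combinations(range(n), k):
            # x has ones exactly on `support`; check every parity equation mod 2.
            if all(sum(H[i][j] for j in support) % 2 == s[i] for i in range(m)):
                return k, support
    return None  # no solution of H x = s exists over GF(2)
```

With ''H'' = [[1, 1, 0], [0, 1, 1]] and ''s'' = [1, 0], the vector ''x'' = (1, 0, 0) satisfies both parity equations, so the minimum weight is 1. Over GF(2) the number of nonzero entries of ''x'' coincides with its Hamming weight, which is why the two problems are the same.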


Related problems

In minimum unsatisfied linear relations (Min-ULR), we are given a binary relation ''R'' and a linear system ''A x R b'', which is now assumed to be ''infeasible''. The goal is to find a vector ''x'' that violates as few relations as possible, while satisfying all the others.

Min-ULR[≠] is trivially solvable, since any system with real variables and a finite number of inequation constraints is feasible. As for the other three variants:
* Min-ULR[=,>,≥] are NP-hard even with homogeneous systems and bipolar coefficients (coefficients in {-1, 1}).
* The NP-complete problem minimum feedback arc set reduces to Min-ULR[≥] with exactly one 1 and one -1 in each constraint, and all right-hand sides equal to 1.
* Min-ULR[=,>,≥] are polynomial if the number of variables ''n'' is constant: they can be solved using an algorithm of Greer in time O(n\cdot m^n / 2^{n-1}).
* Min-ULR[=,>,≥] are linear if the number of constraints ''m'' is constant, since all 2^m subsystems can be checked in time ''O''(''n'').
* Min-ULR[≥] is polynomial in some special cases.
* Min-ULR[=,>,≥] can be approximated within ''n'' + 1 in polynomial time.
* Min-ULR[>,≥] are minimum-dominating-set-hard, even with homogeneous systems and ternary coefficients (in {-1, 0, 1}).
* Min-ULR[=] cannot be approximated within a factor of 2^{\log^{1-\varepsilon} n}, for any \varepsilon>0, unless NP is contained in DTIME(n^{\operatorname{polylog}(n)}).

In the complementary problem maximum feasible linear subsystem (Max-FLS), the goal is to find a maximum subset of the constraints that can be satisfied simultaneously.
* Max-FLS[≠] can be solved in polynomial time.
* Max-FLS[=] is NP-hard even with homogeneous systems and bipolar coefficients. With integer coefficients, it is hard to approximate within m^{\varepsilon}. With coefficients over GF[q] it is ''q''-approximable.
* Max-FLS[>] and Max-FLS[≥] are NP-hard even with homogeneous systems and bipolar coefficients. They are 2-approximable, but they cannot be approximated within any smaller constant factor.
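The complementary relationship between Min-ULR and Max-FLS can be seen in a small sketch for the equality variant (plain Python, exact rationals; the names are illustrative). It finds the largest subset of rows of ''A x = b'' that is simultaneously satisfiable; the remaining rows are then a smallest violated set for Min-ULR[=] on the same instance:

```python
from fractions import Fraction
from itertools import combinations

def solvable(A, b):
    """True iff A x = b has a solution, by elimination over exact rationals."""
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    if not aug:
        return True  # an empty subsystem is trivially satisfiable
    m, n = len(aug), len(aug[0]) - 1
    rank = 0
    for c in range(n):
        piv = next((i for i in range(rank, m) if aug[i][c] != 0), None)
        if piv is None:
            continue
        aug[rank], aug[piv] = aug[piv], aug[rank]
        for i in range(m):
            if i != rank and aug[i][c] != 0:
                f = aug[i][c] / aug[rank][c]
                aug[i] = [a - f * p for a, p in zip(aug[i], aug[rank])]
        rank += 1
    return all(any(v != 0 for v in row[:-1]) or row[-1] == 0 for row in aug)

def max_fls_eq(A, b):
    """Size of the largest simultaneously satisfiable subset of A x = b
    (brute force over row subsets, largest first; for illustration)."""
    m = len(A)
    for k in range(m, -1, -1):
        for rows in combinations(range(m), k):
            if solvable([A[i] for i in rows], [b[i] for i in rows]):
                return k, rows
```

For instance, the one-variable system x = 0, x = 0, x = 1 has a largest satisfiable subset of size 2, so its Min-ULR[=] optimum is 1 violated relation; in general the Min-ULR[=] value is ''m'' minus the Max-FLS[=] value, even though the two objectives behave very differently under approximation.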


Hardness of approximation

All four variants of Min-RVLS are hard to approximate. In particular, all four variants cannot be approximated within a factor of 2^{\log^{1-\varepsilon} n}, for any \varepsilon>0, unless NP is contained in DTIME(n^{\operatorname{polylog}(n)}). The hardness is proved by reductions:
* There is a reduction from Min-ULR[=] to Min-RVLS[=]. It also applies to Min-RVLS[≥] and Min-RVLS[>], since each equation can be replaced by two complementary inequalities.
* There is a reduction from minimum-dominating-set to Min-RVLS[≠].

On the other hand, there is a reduction from Min-RVLS[=] to Min-ULR[=]. It also applies to Min-ULR[≥] and Min-ULR[>], since each equation can be replaced by two complementary inequalities. Therefore, when ''R'' is in {=, >, ≥}, Min-ULR and Min-RVLS are equivalent in terms of approximation hardness.
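The equation-to-inequalities step used in these reductions is mechanical. A minimal sketch (plain Python; names are illustrative): each equality a·x = b_i is replaced by the pair a·x ≥ b_i and -a·x ≥ -b_i, so a vector satisfies the doubled "≥" system exactly when it satisfies the original "=" system, with the same set of nonzero variables:

```python
def eq_to_geq(A, b):
    """Replace each row a.x = b_i with a.x >= b_i and -a.x >= -b_i."""
    A2 = [row[:] for row in A] + [[-v for v in row] for row in A]
    b2 = list(b) + [-v for v in b]
    return A2, b2

def satisfies_geq(A, b, x):
    """Check A x >= b component-wise."""
    return all(sum(a * xi for a, xi in zip(row, x)) >= bi
               for row, bi in zip(A, b))

def satisfies_eq(A, b, x):
    """Check A x = b component-wise."""
    return all(sum(a * xi for a, xi in zip(row, x)) == bi
               for row, bi in zip(A, b))
```

Because the transformation leaves ''x'' untouched, it preserves the Min-RVLS objective (number of nonzeros) exactly, which is why hardness for the "=" variant carries over to the "≥" and, with a strict-inequality analogue, the ">" variants.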

