NP-hard problems aren't always hard

And aren't the kind of hard that cryptography and AI care about

There seems to be a persistent popular misunderstanding of NP-hard problems as problems that are always hard to solve: that if you have an instance of an NP-hard problem, and want the answer, you had better give up any hope of getting the answer quickly. But that isn't really the case. If P≠NP, the worst-case time for solving such problems will never be polynomial. But this tells us nothing about how common the worst cases are, which is crucially important in a lot of applications.


This misunderstanding seems to particularly crop up in relation to cryptography. Public-key cryptography is based on one-way functions that are easy to compute in one direction, but hard to reverse. For example, it's easy to multiply two large prime numbers to get their product, but hard to factor that product back into the two numbers. There seems to be a widespread belief that NP-hardness is the mathematical sauce behind one-way functions, but that isn't the case.

NP-completeness does have some similarity to this one-wayness: NP-complete problems are those where a solution can be verified in polynomial time, but not, unless P=NP, computed in polynomial time. So it seems that we might be able to use an NP-complete problem for the one-way function. It's easy, for example, to verify that a given set of variable assignments satisfies a boolean satisfiability problem (SAT), but hard to find a satisfying set of variable assignments. Unfortunately, it's not actually always hard: if you randomly generate SAT problems, a huge proportion of them are easy to solve. The worst-case assumptions of NP-completeness are fatal here, because the worst case for our attacker (trying to compute our function) is the best case for us, and analyzing the security of a cryptosystem in the best case isn't very interesting! We want to know how secure it is in the worst case, which means we need a one-way function that is always or at least almost always hard to reverse. NP-hard problems don't give us that.

The similarity is nonetheless evocative enough that cryptography experts in the 1970s did consider it a promising line of investigation. Diffie and Hellman's breakthrough 1976 paper [pdf] proposing Diffie–Hellman key exchange suggested that the development of NP-completeness theory might herald a new mathematical foundation for one-way functions, if it could be developed beyond worst-case analysis. Their own one-way function, though, depended on an apparently hard problem that isn't known to be NP-hard (the Diffie–Hellman problem). Despite some development in the 1980s of complexity theory that proved stronger hardness theorems, such as Leonid Levin's notion of average-case-complete problems (those which, unless P=NP, cannot be solved in polynomial time in the average case), the hard problems used as one-way functions in cryptography have generally not come out of NP-hardness or its descendents: integer factorization, the Diffie–Hellman problem, and the discrete logarithm problem are the three in most widespread use. These seem to be hard in somewhat similar ways, and in particular, almost all their instances seem to be hard. What lies behind this always-hardness is a bit of an open question, though. Study of one-way functions of this sort has diverged enough from NP-completeness that Christos Papadimitriou remarked in 1995, in an otherwise positive retrospective [pdf] on the influence of NP-completeness theory, that "it is now common knowledge among computer scientists that NP-completeness is largely irrelevant to public-key cryptography".


The flip-side of this makes NP-completeness of only limited interest in artificial intelligence as well (not irrelevant, but not speaking to the core issues, either). Much as NP-hard problems aren't necessarily hard enough for cryptography, they aren't necessarily too hard for AI, either. SAT, the very first problem to be proven NP-complete, is now frequently used in AI as an almost canonical example of a fast problem: if you can reduce your problem to SAT, you can take advantage of fast, off-the-shelf tools to solve it (SatPlan largely initiated this trend in 1992). This is good, because pretty much anything interesting in AI is NP-hard, so it would be unfortunate if we had to give up on all of it.

Much like in cryptography, NP-hardness doesn't seem to capture the right notion of hardness for AI. Current AI systems have little trouble solving some kinds of NP-hard problems (large scheduling and planning problems, for example), but are totally clueless on solving others (how to compose a symphony, say). So I can't really agree with folks like theory professor Scott Aaronson who suggest that the P=NP question is fundamentally tied to creativity or high-level intelligence. The problem is not that we have algorithms for human-level intelligence and creativity, but, alas, they run exponentially. The problem is that we don't actually know what the algorithms should look like! And there's little reason, imo, to believe that NP-completeness is the barrier. We don't need an algorithm that's guaranteed to write great symphonies in polynomial time for all instances of the symphony-composition problem; it would be more than enough to have a symphony-writing algorithm that composed symphonies as well as SAT-solvers solve SAT.

NP-completeness doesn't tell us why symphony composition is so much harder than SAT-solving, which is the more relevant kind of hardness. The answer seems to be a mixture of symphony-writing being a hard problem to formalize in the first place (certainly compared to SAT), combined with the fact that humans seem much better than machines at actually avoiding solving formally hard problems, deftly navigating solution spaces to find low-hanging fruit, skip dead ends, and sometimes even opportunistically change the initial problem and end up solving a different one. These kinds of heuristic and pattern-matching weaknesses of current AI compared to humans don't seem tied to any unique ability that humans possess to solve formally hard problems in polynomial time. Hence a replacement notion in AI circles of "AI-completeness": a problem that, if computers could solve it well, would imply that they could also solve any other problem requiring intelligence.

* * *

That isn't to say that NP-completeness is uninteresting. Understanding it seems closely connected to fundamental questions in theory of computation, and extensions of the theory (such as various distributional complexity classes) give us some practical information about expected problem complexity. But its practical implications don't, for the most part, include making crypto possible or making AI impossible.

Mark J. Nelson, 2010-08-16.
Comments welcome:
<Note index>