Heh heh... I've been in that situation before. It's like Cringely's observations on Donald Knuth. One can get so confident about the accuracy, brilliance, and simplicity of the design that one completely ignores flaws in the implementation. If one can do it, two can do it rather easily, too.
His blog does go on to support falling back on tests. Problems like this should be approached with more scientific tests. If X happens here, but not in "the original", then some approximation of X should be added as a test to the original. Even if all it shows is that case X is wrong to start with and the original is fine, the original should probably be modified to minimize the garbage-in-garbage-out result.
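Something like the following is what I have in mind, as a rough sketch only. The `average` function and the "empty input" case are hypothetical stand-ins for "the original" and for case X. Neither comes from the code being discussed.

```python
def average(values):
    """Hypothetical stand-in for "the original" routine.

    Case X (here, an empty input) is rejected explicitly so that bad
    input fails loudly instead of producing a garbage result.
    """
    if not values:
        raise ValueError("average() needs at least one value")
    return sum(values) / len(values)


def case_x_is_rejected():
    """Approximation of case X, replayed against the original."""
    try:
        average([])
    except ValueError:
        return True
    return False
```

Even if the check only confirms that X was bad input, it stays in the suite and documents what the original does with it.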
This assumes these two folks are the only two people working on it. Other
members of the team (other pairs) presumably won't buy into the same shared
illusion as the original pair. Or if they do, they'll at least demand some
justification :-) Additionally, the same pair might not keep their
rationalizations up to date, so when they look at the code two months from
now they'll be able to honestly say, "What were we thinking?"