This is the crux of one of the central problems in computer science. Some problems are too hard to solve in any reasonable amount of time. But their solutions are easy to check. Given that, computer scientists want to know: How complicated can a problem be while still having a solution that can be verified?

Turns out, the answer is: Almost unimaginably complicated.

In a paper released in April, two computer scientists dramatically increased the number of problems that fall into the hard-to-solve-but-easy-to-verify category. They describe a method that makes it possible to check answers to problems of almost incomprehensible complexity. “It seems insane,” said Thomas Vidick, a computer scientist at the California Institute of Technology who wasn’t involved in the new work.

The research applies to quantum computers — computers that perform calculations according to the nonintuitive rules of quantum mechanics. Quantum computers barely exist now but have the potential to revolutionize computing in the future.

The new work essentially gives us leverage over a powerful oracle — in this case, a quantum computer. Even if the oracle promises to tell you answers to problems that are far beyond your own ability to solve, there’s still a way to ensure the oracle is telling the truth.

When a problem is hard to solve but easy to verify, finding a solution takes a long time, but verifying that a given solution is correct does not.

For example, imagine someone hands you a graph — a collection of dots (vertices) connected by lines (edges). The person asks you if it’s possible to color the vertices of the graph using only three colors, such that no connected vertices have the same color.

This “three-color” problem is hard to solve. In general, the time it takes to find a three-coloring of a graph (or determine that none exists) increases exponentially as the size of the graph increases. If, say, finding a solution for a graph with 20 vertices takes 3^{20} nanoseconds — a few seconds total — a graph with 60 vertices would take on the order of 3^{60} nanoseconds, or about 100 times the age of the universe.

But let’s say someone claims to have three-colored a graph. It wouldn’t take long to check whether their claim is true. You’d just go through the vertices one by one, examining their connections. As the graph gets bigger, the time it takes to do this increases slowly, in what’s called polynomial time. As a result, a computer doesn’t take much longer to check a three-coloring of a graph with 60 vertices than it does to check a graph with 20 vertices.
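The check just described can be sketched in a few lines — a minimal illustration (the graph and colorings below are invented, not from the article) of why verification is fast: it is a single pass over the edges.

```python
# A minimal sketch of three-coloring verification. One pass over the
# edges suffices, so the check runs in polynomial time -- even though
# *finding* a valid coloring may take exponential time.

def is_valid_three_coloring(edges, coloring):
    """True if every vertex gets one of three colors and no edge
    joins two vertices of the same color."""
    if any(c not in (0, 1, 2) for c in coloring.values()):
        return False
    return all(coloring[u] != coloring[v] for u, v in edges)

# A five-vertex graph: a cycle plus one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
good = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1}
bad  = {0: 0, 1: 1, 2: 0, 3: 0, 4: 1}   # edge (0, 2) clashes

print(is_valid_three_coloring(edges, good))  # True
print(is_valid_three_coloring(edges, bad))   # False
```

Producing `good` in the first place is the hard half of the problem; checking it is the easy half.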

“It’s easy, given a proper three-coloring, to check that it works,” said John Wright, a physicist at the Massachusetts Institute of Technology who wrote the new paper along with Anand Natarajan of Caltech.

In the 1970s computer scientists defined a class of problems that are easy to verify, even if some are hard to solve. They called the class “NP,” for nondeterministic polynomial time. Since then, NP has been the most intensively studied class of problems in computer science. In particular, computer scientists would like to know how this class changes as you give the verifier new ways to check the truth of a solution.

Prior to Natarajan and Wright’s work, verification power had increased in two big leaps.

To understand the first leap, imagine that you’re colorblind. Someone places two blocks on the table in front of you and asks whether the blocks are the same or different colors. This is an impossible task for you. Moreover, you can’t verify someone else’s solution.

But you’re allowed to interrogate this person, whom we’ll call the prover. Let’s say the prover tells you that the two blocks are different colors. You designate one block as “Block A” and the other as “Block B.” Then you place the blocks behind your back and randomly switch which hand holds which block. Then you reveal the blocks and ask the prover to identify Block A.

If the blocks are different colors, this couldn’t be a simpler quiz. The prover will know that Block A is, say, the red block and will correctly identify it every single time.

But if the blocks are actually the same color — meaning the prover erred in saying that they were different colors — the prover can only guess which block is which. Because of this, it will only be possible for the prover to identify Block A 50 percent of the time. By repeatedly probing the prover about the solution, you will be able to verify whether it’s correct.
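The repeated-probing idea is easy to simulate. This toy sketch (round counts and names are mine, not from the protocol's formal statement) shows that an honest prover always survives, while a lying prover, forced to guess each round, is caught almost surely.

```python
import random

# Toy simulation of the colorblind-verifier protocol. An honest
# prover, who really sees two different colors, always identifies
# Block A; a prover who lied about same-colored blocks can only
# guess which block the verifier's secret shuffle left where.

def run_protocol(prover_is_honest, rounds, rng):
    """Return True if the prover survives every round unexposed."""
    for _ in range(rounds):
        answered_correctly = prover_is_honest or rng.random() < 0.5
        if not answered_correctly:
            return False          # caught: the claim is rejected
    return True

rng = random.Random(0)
print(run_protocol(True, 20, rng))            # honest prover: True
trials = 10_000
caught = sum(not run_protocol(False, 20, rng) for _ in range(trials))
print(caught / trials)   # a liar survives 20 rounds with odds 2**-20
```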

“The verifier can send the prover questions,” Wright said, “and maybe at the end of the conversation the verifier can become more convinced.”

In 1985 a trio of computer scientists proved that such interactive proofs can be used to verify solutions to problems that are more complicated than the problems in NP. Their work created a new class of problems called IP, for “interactive polynomial” time. The same method used to verify the coloring of two blocks can be used to verify solutions to much more complicated questions.

The second major advance took place in the same decade. It follows the logic of a police investigation. If you have two suspects you believe committed a crime, you’re not going to question them together. Instead, you’ll interrogate them in separate rooms and check each person’s answers against the other’s. By questioning them separately, you’ll be able to reveal more of the truth than if you had only one suspect to interrogate.

“It’s impossible for the two provers to form some sort of distributed, consistent story because they simply don’t know what answers the other is giving,” Wright said.

In 1988 four computer scientists proved that if you ask two computers to separately solve the same problem — and you interrogate them separately about their answers — you can verify a class of problems that’s even larger than IP: a class called MIP, for multi-prover interactive proofs.

With a multi-prover interactive approach, for example, it’s possible to verify three-colorings for a sequence of graphs that increase in size much faster than the graphs in NP. In NP, graph sizes increase at a linear rate — the number of vertices might grow from 1 to 2 to 3 to 4 and so on — so that the size of a graph is never hugely disproportionate to the amount of time needed to verify its three-coloring. But in MIP, the number of vertices in a graph grows exponentially — from 2^{1} to 2^{2} to 2^{3} to 2^{4} and so on.

As a result, the graphs are too big even to fit in the verifying computer’s memory, so it can’t check three-colorings by running through the list of vertices. But it’s still possible to verify a three-coloring by asking the two provers separate but related questions.

In MIP, the verifier has enough memory to run a program that allows it to determine whether two vertices in the graph are connected by an edge. The verifier can then ask each prover to state the color of one of the two connected vertices — and it can cross-reference the provers’ answers to make sure the three-coloring works.
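The cross-referencing step can be sketched as follows. This is a hypothetical simplification (the real MIP protocol also tests the provers against each other for consistency; the function names and graph are mine): sample an edge, ask each isolated prover for one endpoint's color, and reject if the colors ever coincide.

```python
import random

# Hypothetical sketch of the verifier's edge check: the two provers
# are modeled as functions the verifier can query separately.

def mip_edge_check(edges, prover1, prover2, rounds, rng):
    """Sample edges; ask each prover for one endpoint's color and
    reject if the two reported colors ever coincide."""
    for _ in range(rounds):
        u, v = rng.choice(edges)
        if prover1(u) == prover2(v):
            return False          # that edge is not properly colored
    return True

edges = [(0, 1), (1, 2), (2, 0)]               # a triangle
honest = {0: 'red', 1: 'green', 2: 'blue'}.get
cheat = {0: 'red', 1: 'red', 2: 'blue'}.get    # edge (0, 1) clashes

print(mip_edge_check(edges, honest, honest, 100, random.Random(1)))  # True
print(mip_edge_check(edges, cheat, cheat, 100, random.Random(1)))    # False
```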

The expansion of hard-to-solve-but-easy-to-verify problems from NP to IP to MIP involved classical computers. Quantum computers work very differently. For decades it’s been unclear how they change the picture — do they make it harder or easier to verify solutions?

The new work by Natarajan and Wright provides the answer.

Quantum computers perform calculations by manipulating quantum bits, or “qubits.” These have the strange property that they can be entangled with one another. When two qubits — or even large systems of qubits — are entangled, it means that their physical properties play off each other in a certain way.

In their new work, Natarajan and Wright consider a scenario involving two separate quantum computers that share entangled qubits.

This kind of setup would seem to work against verification. The power of a multi-prover interactive proof comes precisely from the fact that you can question two provers separately and cross-check their answers. If the provers’ answers are consistent, then it’s likely they’re correct. But two provers sharing an entangled state would seem to have more power to consistently assert incorrect answers.

And indeed, when the scenario of two entangled quantum computers was first put forward in 2003, computer scientists assumed entanglement would reduce verification power. “The obvious reaction of everyone, including me, is that now you’re giving more power to the provers,” Vidick said. “They can use entanglement to correlate their answers.”

Despite that initial pessimism, Vidick spent several years trying to prove the opposite. In 2012, he and Tsuyoshi Ito proved that it’s still possible to verify all the problems in MIP with entangled quantum computers.

Natarajan and Wright have now proved that the situation is even better than that: A wider class of problems can be verified with entanglement than without it. It’s possible to turn the connections between entangled quantum computers to the verifier’s advantage.

To see how, remember the procedure in MIP for verifying three-colorings of graphs whose sizes grow exponentially. The verifier doesn’t have enough memory to store the whole graph, but it does have enough memory to identify two connected vertices, and to ask the provers the colors of those vertices.

With the class of problems Natarajan and Wright consider — called NEEXP for nondeterministic doubly exponential time — the graph sizes grow even faster than they do in MIP. Graphs in NEEXP grow at a “doubly exponential” rate. Instead of increasing at a rate of powers of two — 2^{1}, 2^{2}, 2^{3}, 2^{4} and so on — the number of vertices in the graph increases at a rate of powers of powers of two — 2^{2^{1}}, 2^{2^{2}}, 2^{2^{3}}, 2^{2^{4}} and so on. As a result, the graphs quickly become so big that the verifier can’t even identify a single pair of connected vertices.
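A few lines of arithmetic show how fast doubly exponential growth outruns singly exponential growth: a graph with 2^{2^{n}} vertices needs 2^{n} bits just to write down one vertex's label.

```python
# Singly vs. doubly exponential growth: with 2**(2**n) vertices,
# labeling a single vertex already takes 2**n bits.
for n in range(1, 6):
    label_bits = 2 ** n
    vertices = 2 ** (2 ** n)
    print(n, label_bits, vertices)
# n = 5 already gives 4,294,967,296 vertices (32-bit labels).
```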

“To label a vertex would take 2^{n} bits, which is exponentially more bits than the verifier has in its working memory,” Natarajan said.

But Natarajan and Wright prove that it’s possible to verify a three-coloring of a doubly-exponential-size graph even without being able to identify which vertices to ask the provers about. This is because you can make the provers come up with the questions themselves.

The idea of asking computers to interrogate their own solutions sounds, to computer scientists, as advisable as asking suspects in a crime to interrogate themselves — surely a foolish proposition. Except Natarajan and Wright prove that it’s not. The reason is entanglement.

“Entangled states are a shared resource,” Wright said. “Our entire protocol is figuring out how to use this shared resource to generate connected questions.”

If the quantum computers are entangled, then their choices of vertices will be correlated, producing just the right set of questions to verify a three-coloring.

At the same time, the verifier doesn’t want the two quantum computers to be so intertwined that their answers to those questions are correlated (which would be the equivalent of two suspects in a crime coordinating their false alibis). Another strange quantum feature handles this concern. In quantum mechanics, the uncertainty principle prevents us from knowing a particle’s position and momentum simultaneously — if you measure one property, you destroy information about the other. The uncertainty principle strictly limits what you can know about any two “complementary” properties of a quantum system.

Natarajan and Wright take advantage of this in their work. To compute the color of a vertex, they have the two quantum computers make complementary measurements. Each computer computes the color of its own vertex, and in doing so, it destroys any information about the other’s vertex. In other words, entanglement allows the computers to generate correlated questions, but the uncertainty principle prevents them from colluding when answering them.

“You have to force the provers to forget, and that’s the main thing they do in their paper,” Vidick said. “They force the prover to erase information by making a measurement.”

Their work has almost existential implications. Before this new paper, there was a much lower limit on the amount of knowledge we could possess with complete confidence. If we were presented with an answer to a problem in NEEXP, we’d have no choice but to take it on faith. But Natarajan and Wright have burst past that limit, making it possible to verify answers to a far more expansive universe of computational problems.

And now that they have, it’s unclear where the limit of verification power lies.

“It could go much further,” said Lance Fortnow, a computer scientist at the Georgia Institute of Technology. “They leave open the possibility that you could take another step.”

Whose movie preferences are closest to yours: Adrienne’s, Brandon’s or Cora’s? And how far are your cinematic tastes from those of the other two? It might seem strange to ask “how far” here. That’s a question about distance, after all. What does distance mean when it comes to which movies you like? How would we measure it?

As strange as it may seem, companies like Netflix measure and make use of these kinds of distances every day. By watching what you watch and analyzing the data, they create measurements of your fondness for comedies, romances, documentaries and other kinds of movies, and use that information to imagine your position in the abstract space of movie preferences. Then, by identifying the people closest to you in this abstract space — your nearest neighbors, so to speak — Netflix can recommend new things for you to watch: namely, the things your neighbors like.

When it comes to these kinds of predictive engines, knowing who is closest to whom is crucial in understanding, classifying and analyzing complex data sets. And it all starts with some simple ideas developed in high school geometry.

Let’s start by thinking of Adrienne and Brandon as points in the plane.

In reality, movie preference space is much bigger and more complex (and real recommendation engines are much more sophisticated) than what we are imagining here, but this is a friendly place from which to start exploring the math.

To help see if we’re closer to *A* or *B*, let’s draw a perpendicular bisector — a line segment that is perpendicular to $latex \overline{AB} $ and cuts it in half. The perpendicular bisector is a very useful tool in Euclidean geometry. Anytime you need to cut something in half, there are perpendicular bisectors around.

In the plane, the perpendicular bisector of a line segment is a line, as seen below.

One important property of a perpendicular bisector is that its points are equidistant from the endpoints of the segment it bisects. This intimately connects perpendicular bisectors to isosceles triangles — triangles with two sides of the same length. Suppose a point *Q* lies on the perpendicular bisector of segment $latex \overline{AB} $, like this.

Because it lies on the perpendicular bisector of segment $latex \overline{AB} $, point *Q* is equidistant to *A* and *B*. That is, *QA* = *QB*. If *Q* represents Quentin, then Quentin is just as close to Adrienne as to Brandon.

We can prove *QA* = *QB* using basic properties of either triangles or reflections, but we can also demonstrate it by folding. If you draw this diagram on a sheet of paper and fold along the perpendicular bisector, you’ll see *A* line up with *B*. And since *Q* lies on the crease, the segment $latex \overline{QA} $ will overlap with $latex \overline{QB} $. This means they must be the same length. This makes $latex \Delta AQB $ an isosceles triangle. In fact, the perpendicular bisector of $latex \overline{AB} $ is the set of all the points that make an isosceles triangle with $latex \overline{AB} $ as its base. (There is one situation we should be slightly worried about. Can you spot it?)

Since every point on the perpendicular bisector of segment $latex \overline{AB} $ is equidistant to *A* and *B*, we have another powerful way to think about the perpendicular bisector: as a dividing line.

Let’s put Yvette in our plane, at some point *Y* that is not on the perpendicular bisector of $latex \overline{AB} $.

We know immediately that *Y* is not equidistant to *A* and *B*, but we can say more. We can geometrically show that Yvette is closer to Brandon than to Adrienne. Imagine segments $latex \overline{YA} $ and $latex \overline{YB} $.

We can show that *YB* < *YA* by adding a little something to our diagram. Let *I* be the intersection of $latex \overline{YA} $ and the perpendicular bisector, and draw a new segment $latex \overline{IB} $.

Since *I* is on the perpendicular bisector of $latex \overline{AB} $, we know *I* is equidistant to *A* and *B*. This means *IA* = *IB*. Using some basic facts about line segments, we see that

*YA* = *YI* + *IA* = *YI* + *IB*

But in a triangle, any one side must be shorter than the other two sides put together. This fundamental fact, called the “triangle inequality,” basically says the shortest path between two points is the straight line segment that connects them. Applying the triangle inequality in $latex \Delta YIB $ gives us

*YB* < *YI* + *IB*

And since *YI* + *IA* = *YI* + *IB* = *YA*, we see that *YB* < *YA*.

Notice that Yvette is on Brandon’s side of $latex \overline{AB} $’s perpendicular bisector. That tells us that Yvette is closer to Brandon than to Adrienne. Similarly, everything on Adrienne’s side of $latex \overline{AB} $’s perpendicular bisector is closer to Adrienne than to Brandon. So the perpendicular bisector of $latex \overline{AB} $ divides the plane into two sets: things that are closer to *A* and things that are closer to *B*. If you want to know whom you are closer to, you just need to know which side of the perpendicular bisector you are on.
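In code, the whom-are-you-closer-to test boils down to a squared-distance comparison (equivalently, asking which side of the perpendicular bisector a point lies on). A small sketch with made-up coordinates:

```python
# Squared distances avoid square roots; the comparison is the same.

def closer_to(p, a, b):
    """Return 'A', 'B', or 'tie' for point p relative to a and b."""
    da = (p[0] - a[0])**2 + (p[1] - a[1])**2
    db = (p[0] - b[0])**2 + (p[1] - b[1])**2
    return 'A' if da < db else 'B' if db < da else 'tie'

A, B = (0, 0), (4, 0)           # bisector is the vertical line x = 2
print(closer_to((1, 3), A, B))   # A
print(closer_to((3, -1), A, B))  # B
print(closer_to((2, 5), A, B))   # tie -- on the bisector itself
```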

This strategy works with more than two points. Let’s add Cora.

Here’s the perpendicular bisector of $latex \overline{AB} $.

We know this divides the plane into two regions: things that are closer to *A*, and things that are closer to *B*. Notice that since *C* lies on *A*’s side, *C* is closer to *A* than *B*. Now let’s construct the perpendicular bisector of line segment $latex \overline{AC} $.

This also divides the plane into two regions: things that are closer to *A* and things that are closer to *C*. Notice that *B* is closer to *C* than to *A*, since it lies on the *C* side of $latex \overline{AC} $’s perpendicular bisector.

Several remarkable things happen when we construct the remaining perpendicular bisector of $latex \overline{BC} $.

First, all three perpendicular bisectors intersect at a single point. This “concurrency” of the perpendicular bisectors of $latex \overline{AB} $, $latex \overline{BC} $ and $latex \overline{AC} $ is almost magical and is one of the loveliest results in elementary geometry. And it’s not hard to see why it happens. Let’s go back to when we had just the two perpendicular bisectors.

Let’s call the intersection of these two perpendicular bisectors point *O*. Because it’s the point of intersection, *O* lies on both perpendicular bisectors. Since it’s on the perpendicular bisector of $latex \overline{AB} $, we know that *OA* = *OB*. But *O* also lies on the perpendicular bisector of $latex \overline{AC} $, which means *OA* = *OC*.

But if *OA* = *OB* and *OA* = *OC*, then *OB* = *OC*. This means *O* must lie on the perpendicular bisector of $latex \overline{BC} $, and all three perpendicular bisectors pass through a single point!
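The concurrency argument can be checked numerically: compute the intersection point (the circumcenter) and confirm it is equidistant from all three vertices. The formula below is the standard determinant formula; the triangle's coordinates are invented.

```python
def circumcenter(a, b, c):
    """Intersection of a triangle's perpendicular bisectors,
    via the standard determinant formula."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2 * (ax*(by - cy) + bx*(cy - ay) + cx*(ay - by))
    ux = ((ax**2 + ay**2)*(by - cy) + (bx**2 + by**2)*(cy - ay)
          + (cx**2 + cy**2)*(ay - by)) / d
    uy = ((ax**2 + ay**2)*(cx - bx) + (bx**2 + by**2)*(ax - cx)
          + (cx**2 + cy**2)*(bx - ax)) / d
    return ux, uy

dist = lambda p, q: ((p[0]-q[0])**2 + (p[1]-q[1])**2) ** 0.5

A, B, C = (0, 0), (4, 0), (0, 6)
O = circumcenter(A, B, C)
print(O)                                   # (2.0, 3.0)
print(dist(O, A), dist(O, B), dist(O, C))  # all three are equal
```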

This remarkable concurrency isn’t the only remarkable thing happening here. Let’s take a look at the regions created by these three intersecting lines.

Consider the region in red. These points are on the *C* side of the perpendicular bisector of $latex \overline{BC} $, so they are closer to *C* than to *B*. But they are also on the *C* side of the perpendicular bisector of $latex \overline{AC} $, which means they are closer to *C* than to *A*. Thus, every point in this region is closer to *C* than to either *A* or *B*.

Similar reasoning allows us to fill in the rest of the diagram like this.

Clearly, the red points are all those closest to *C*, the yellow points are those closest to *A*, and the blue points are those closest to *B*. Using perpendicular bisectors, we have created a “Voronoi diagram”: a partitioning of the plane into regions of points that have the same “nearest neighbor” among the points *A*, *B* and *C*.
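A brute-force version of this partition: each query point is simply assigned the label of its nearest site. The sites and query points here are made up for illustration.

```python
# Nearest-neighbor classification: the computational face of a
# Voronoi diagram. Squared distances suffice for comparison.

def nearest(p, sites):
    """Return the label of the site nearest to p (Euclidean)."""
    return min(sites, key=lambda s: (p[0] - sites[s][0])**2
                                    + (p[1] - sites[s][1])**2)

sites = {'A': (0, 4), 'B': (4, 0), 'C': (0, -4)}
print(nearest((1, 3), sites))    # A
print(nearest((3, 1), sites))    # B
print(nearest((0, -1), sites))   # C
```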

In this simplistic imagining of movie preference space, here I am at point *P*: closer to Adrienne’s superheroes than Brandon’s toy stories, but heading toward Cora and her horror shows.

Since I’m in the yellow region, this partition tells me that Adrienne is my nearest neighbor: If I’m looking for a new movie to enjoy, trying out something she likes would be a good bet. And, in a movie preference space with millions of viewers, there will be even nearer neighbors whose preferences will better match mine than Adrienne’s will. As you watch movies and get located in this space, your nearest neighbor can recommend movies for you, too.

This geometric analysis works for any number of points in any number of dimensions. The only thing that really changes is the nature of the perpendicular bisector itself. In the plane, a perpendicular bisector is a line, but in three-dimensional space, a perpendicular bisector is a plane. And in 10-dimensional space, a perpendicular bisector is whatever splits 10-dimensional space in half (a nine-dimensional hyperplane).

While the approach may stay the same, computations become more difficult as the dataset gets more complex. Having more dimensions means more coordinates for each data point, which means more complicated distance calculations. And since you have to find the perpendicular bisector for each pair of points, the number you must find grows very quickly as the number of points increases: With three points, there are three perpendicular bisectors, but with 100 points, there are nearly 5,000.
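That count is one perpendicular bisector per unordered pair of points, which can be checked directly:

```python
from math import comb

# One bisector per pair of points: "n choose 2" grows quadratically.
print(comb(3, 2))     # 3 bisectors for three points
print(comb(100, 2))   # 4950 -- the "nearly 5,000" for 100 points
```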

In a space of millions of points with thousands of coordinates each, computing all the necessary distances can become computationally infeasible. To deal with this complexity, mathematicians and computer scientists have developed algorithms to partition space and find nearest neighbors that are much more efficient than simply finding all the perpendicular bisectors. This often involves estimating nearest neighbors rather than locating them exactly, making the classic trade in mathematics of accuracy for simplicity.

Things get even more complicated when we start using different notions of distance. We’ve been using the standard “Euclidean” definition of distance, the one that relies on straight lines and the Pythagorean theorem. But different datasets require different measures of distance, and different measures of distance give rise to different, and more complex, partitions of space.

For example, here’s what a Voronoi diagram can look like using what’s known as “taxicab distance.”

In a world measured by “taxicab distance,” the shortest path between two points isn’t the one the crow flies; it’s the one the taxi drives, where only horizontal and vertical streets are available. This notion of distance can be useful in real-life applications like urban planning — deciding how hospitals should be spaced out depends more on driving distance than on flying distance.
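The two notions of distance are easy to compare in code (the points are invented for illustration):

```python
def taxicab(p, q):
    """L1 ('taxicab') distance: horizontal plus vertical travel."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def euclidean(p, q):
    """Straight-line distance, via the Pythagorean theorem."""
    return ((p[0] - q[0])**2 + (p[1] - q[1])**2) ** 0.5

P, Q = (0, 0), (3, 4)
print(euclidean(P, Q))   # 5.0 -- as the crow flies
print(taxicab(P, Q))     # 7   -- as the taxi drives
```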

“Taxicab distance” also shows up in abstract contexts, as do other kinds of distance, like “edit distance” (a measure of how different two strings of words or symbols are) and graph distance (the length of the shortest path between two nodes of a graph). The ability to model distance in different ways in different spaces is a source of great mathematical power, but it also has the potential to make each kind of partitioning problem complicated in its own unique way.
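Edit distance is concrete enough to sketch: the classic dynamic-programming (Levenshtein) computation, shown here in a memory-saving rolling-row form.

```python
def edit_distance(s, t):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning s into t."""
    prev = list(range(len(t) + 1))       # distances from "" to t[:j]
    for i, cs in enumerate(s, 1):
        curr = [i]                       # distance from s[:i] to ""
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,              # delete from s
                            curr[j - 1] + 1,          # insert into s
                            prev[j - 1] + (cs != ct)))  # substitute
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # 3
```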

This is partly why mathematicians were so excited and surprised by the recent discovery of a general-purpose method for finding nearest neighbors in complex data sets. It even surprised the researchers who discovered it: They originally set out to prove that it was impossible.

In a high-dimensional movie preference space, researchers can create multiple notions of distance. They might look at differences in how users rate movies, how long they watch them, what previews they click on, and more. The data are endless and can be combined in endless ways. The same is true of modeling patients in the space of health statistics, where predictive engines make educated guesses about what illness might afflict you next, and of how match-making services pair up users who are just the right distance apart in relationship space (close, but probably not too close).

But even across these vastly different spaces, there are general techniques for finding nearest neighbors. And whether it’s measured in movie preferences, medical symptoms or relationship qualities, the distances to our nearest neighbors can tell us a lot about ourselves.

- Above, the perpendicular bisector of $latex \overline{AB} $ was described as “the set of all points that make an isosceles triangle with $latex \overline{AB} $ as its base,” but it was noted that there was one situation that we should be slightly worried about. What situation is that?
- Any two distinct points have a unique perpendicular bisector. What is the maximum number of perpendicular bisectors found among 100 distinct points? Could it be possible to have fewer than this maximum number?
- In the Cartesian plane, how many lattice points (points with integer coordinates) are exactly 10 units away from the origin, using the “taxicab” distance?
- The perpendicular bisector of $latex \overline{AB} $ is the set of all points equidistant from *A* and *B*. In the plane, this makes a line. What does the set of all points twice as far from *A* as from *B* make in the plane?

Click for Answer 1: The midpoint of $latex \overline{AB} $ is on the perpendicular bisector, but it doesn’t really make a triangle with $latex \overline{AB} $ as its base, as the points are collinear. Or maybe it makes a degenerate triangle!

Click for Answer 2: This is equal to the number of pairs of points among 100 total points, which is $latex _{100}C_2 $ = $latex \begin{pmatrix} 100 \\ 2 \end{pmatrix} $ = $latex \frac{100!}{98!2!} $ = $latex \frac{100⋅99}{2}$ = 4,950. And yes, some perpendicular bisectors could overlap, if two different pairs of points were collinear and shared the same midpoint.

Click for Answer 3: The points are (10,0), (9,1), (8,2), …, (1,9), then (0,10), (-1,9), …, (-9,1), then (-10,0), …, (-1, -9), then (0,-10), …, (9, -1). Four sets of 10 points, so 40 points.
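The count can be confirmed by brute force over a bounded grid:

```python
# Lattice points at taxicab distance exactly 10 from the origin:
# |x| + |y| == 10 within the bounding square.
points = [(x, y) for x in range(-10, 11) for y in range(-10, 11)
          if abs(x) + abs(y) == 10]
print(len(points))  # 40
```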

Click for Answer 4: A circle. Proof is left to the reader.

*Corrected on May 22, 2019: Due to a typographical error, an earlier version of this column gave the incorrect reason for point B being closer to point C in one of the diagrams. Also, because of a change in the color scheme of a different diagram, the column previously referred incorrectly to "green" points that actually appeared as blue points. Both errors have been corrected.*

Not only are some viruses split into multiple segments that infect host cells separately, but as researchers in France have now discovered, those fractured viruses can flourish with their genomes scattered like puzzle pieces across a multitude of host cells. Something — presumably, the diffusion of molecules among the infected cells — allows complete viral particles to replicate, self-assemble and infect anew.

“You can get all of the necessary gene products together to produce new viruses in a cell that doesn’t actually have all the gene segments in it,” explained Christopher Brooke, a virologist at the University of Illinois at Urbana-Champaign.

“A classical view in virology assumes that the viral replication cycle occurs within individual cells,” said Anne Sicard, the lead author of the new study and a plant pathologist at the French National Institute for Agricultural Research (Institut National de la Recherche Agronomique, or INRA) in Montpellier. But in the case of this “multipartite” virus that she and her colleagues examined, “it seems that this is not true. The segments infect cells independently and accumulate independently in the plant host cells.” She added, “It really shows that the virus doesn’t work at a single-cell level, but at a multicellular level.”

Multipartite viruses have been known for over half a century, ever since researchers realized that a virus could be composed of two or more independent pieces, all of them vital for infection. One piece might be necessary for making essential viral enzymes, for instance, while the other would be needed to make the capsule in which the viral particles (or virions) are packaged and transported to other cells.

But being multipartite carries considerable risks. Parts of the genome can easily be lost or left behind, dooming the rest by breaking the cycle of infection. Because the segments are frequently found in different proportions — some may be common, while others are rare — the rare ones can be lost especially easily.

Scientists have therefore wondered about multipartite viruses since they discovered them. “Why on earth would a virus do this? Why would you separate your genome? What are the advantages to having these segments that are packaged separately?” asked Mark P. Zwart, an evolutionary virologist at the Netherlands Institute of Ecology.

To explore these questions, theoreticians developed models to predict the circumstances under which this multipartite lifestyle would evolve from a more typical viral ancestor, all built on the assumption that the full set of viral segments had to coinfect one cell. But the results were perplexing. A study from 2012 concluded that, whatever the benefits of multipartition might be, the disadvantages were so great that a virus with more than four segments should be impossible. Yet some multipartite viruses, like the faba bean necrotic stunt virus (FBNSV), were known to have as many as eight segments, each carried in a different particle. Theoretically, it couldn’t have evolved. What could explain its existence?

“We thought that the way we conceptualize these viruses must be wrong,” said Stéphane Blanc, a plant virologist at INRA and the senior author of the new study. They decided to verify the key assumption that all the segments must be together within a cell for the infection to work. “It was not done before because it was so evident that they have to be together that no one actually tested it,” he said.

What they found when they scrutinized FBNSV infections blew them away. By tagging two viral segments at a time with different colored fluorescent probes, the team could see that the full complement of viral segments was absent from the vast majority of individual host plant cells they examined. Furthermore, the researchers showed that a protein required for viral replication was present in cells that did not have the genome segment coding for it.

From this, they inferred that the virus particles must be sharing gene products — either messenger RNA molecules or proteins — among cells, so that each particle could replicate and package itself into a capsule to spread. How exactly those necessary components are shared across plant cells isn’t fully understood, but Blanc and his team are looking into it. The answer may involve the plasmodesmata, networks of microscopic canals that extend through plant cell walls and allow adjacent cells to share other proteins.

This new understanding explains how a multipartite virus can sustain infections within a plant, but it opens up new mysteries about how one spreads. FBNSV, for instance, depends on aphids that eat faba bean plants to transmit it. But the little insects must collectively capture all eight segments of FBNSV and introduce them to the same plant to successfully pass on the infection. Presumably, a large proportion of infection events never succeed because the aphids pick up only a subset of the eight particles.

The discovery “alleviates the problem at the within-host level because the particles don’t have to reach all the cells together, but we still have a problem for between-host transmission,” Blanc said.

Why a virus would benefit from a multipartite lifestyle is also up for debate. One idea, Blanc says, is that partitioning the genome allows each segment to vary in frequency as a quick-and-dirty way to regulate gene expression, because a given gene’s level of activity can depend on the number of copies of it in a cell. Each time the virus infects a new host, the frequency of the segments changes — which could let the virus test what level of gene expression works best in a new cellular environment.

Eric Freundt, a virologist at the University of Tampa, speculates that if a host plant’s innate defenses destroy only those cells that express certain viral proteins, then distributing the genes for the proteins into different particles might guarantee that the virus goes undetected in some cells. Another possibility, Freundt suggests, is that distribution tiptoes around the “unfolded protein response,” which can kill cells when a virus overwhelms them by attempting to produce all of its proteins at once. By distributing its genome across many plant cells, the virus may avoid overwhelming the machinery of any single cell.

Still, Blanc and Freundt are quick to acknowledge that these are just hypotheses. “The reason for their evolution is still quite a mystery,” Blanc said.

Zwart points out that most ideas about the advantages of multipartition are really about genome segmentation, not partition of the virus into different infective units. Separating the genome into segments allows different viruses to recombine various advantageous forms of their genes easily.

Arvind Varsani, a virologist at Arizona State University, agrees. “From a modularity perspective, you could see the advantages of multicomponent viruses where each module can be independent,” he said. In a plant with multiple coinfecting viruses, “you can gain elements much more quickly, and adapt to the environment by mixing and matching.”

Proof of the strength of this strategy can be found in the influenza virus, a master of reassortment. The flu genome also has eight segments, although those segments are packaged together in one viral capsule. That allows it to reap the benefits of segmentation without paying all the costs of multipartition.

But the flu may be more similar to the FBNSV than first meets the eye. Brooke discovered that, depending on the strain, only a tiny fraction (1-10 percent) of flu virus particles contain functional copies of all eight genome segments. “The vast majority of flu particles are these incomplete, or what we call semi-infectious, particles that are incapable on their own of initiating productive replication,” he explained. “This was a surprise because it suggests that this virus, which is super successful and highly transmissible, is largely existing as these particles that cannot replicate on their own.”

He predicted, “Asking how viruses operate as populations rather than individual virion particles is going to end up being important for lots of different viral systems.”

Part of that importance is conceptual: It may be too limiting to think of the DNA inside any individual viral capsule as defining its genome. Instead, it might be better to imagine a viral genome as the suite of genes represented across an entire viral population. Theoreticians are already delving into the implications.

Zwart expects that new theoretical models will come out soon to explore these insights, such as ways of framing the evolution of these viruses in terms of multiple levels of natural selection. Within an individual host plant, local forces of natural selection will allow the virus to balance the production rates of its segments successfully. But when the virus moves on to a new plant, it must be able to adapt to that new host environment as well, so it needs to retain some versatility. A higher level of selection may therefore sometimes temper the local level and rebalance the ratio of segments more evenly.

“There is such a richness in all those dynamics,” Zwart said. “It’s really fascinating.”

The story of chaos is usually told like this: Using the LGP-30, Lorenz made paradigm-wrecking discoveries. In 1961, having programmed a set of equations into the computer that would simulate future weather, he found that tiny differences in starting values could lead to drastically different outcomes. This sensitivity to initial conditions, later popularized as the butterfly effect, made predicting the far future a fool’s errand. But Lorenz also found that these unpredictable outcomes weren’t quite random, either. When visualized in a certain way, they seemed to prowl around a shape called a strange attractor.

About a decade later, chaos theory started to catch on in scientific circles. Scientists soon encountered other unpredictable natural systems that looked random even though they weren’t: the rings of Saturn, blooms of marine algae, Earth’s magnetic field, the number of salmon in a fishery. Then chaos went mainstream with the publication of James Gleick’s *Chaos: Making a New Science* in 1987. Before long, Jeff Goldblum, playing the chaos theorist Ian Malcolm, was pausing, stammering and charming his way through lines about the unpredictability of nature in *Jurassic Park*.

All told, it’s a neat narrative. Lorenz, “the father of chaos,” started a scientific revolution on the LGP-30. It is quite literally a textbook case for how the numerical experiments that modern science has come to rely on — in fields ranging from climate science to ecology to astrophysics — can uncover hidden truths about nature.

But in fact, Lorenz was not the one running the machine. There’s another story, one that has gone untold for half a century. A year and a half ago, an MIT scientist happened across a name he had never heard before and started to investigate. The trail he ended up following took him into the MIT archives, through the stacks of the Library of Congress, and across three states and five decades to find information about the women who, today, would have been listed as co-authors on that seminal paper. And that material, shared with *Quanta*, provides a fuller, fairer account of the birth of chaos.

In the fall of 2017, the geophysicist Daniel Rothman, co-director of MIT’s Lorenz Center, was preparing for an upcoming symposium. The meeting would honor Lorenz, who died in 2008, so Rothman revisited Lorenz’s epochal paper, a masterwork on chaos titled “Deterministic Nonperiodic Flow.” Published in 1963, it has since attracted thousands of citations, and Rothman, having taught this foundational material to class after class, knew it like an old friend. But this time he saw something he hadn’t noticed before. In the paper’s acknowledgments, Lorenz had written, “Special thanks are due to Miss Ellen Fetter for handling the many numerical computations.”

“Jesus … *who is Ellen Fetter*?” Rothman recalls thinking at the time. “It’s one of the most important papers in computational physics and, more broadly, in computational science,” he said. And yet he couldn’t find anything about this woman. “Of all the volumes that have been written about Lorenz, the great discovery — nothing.”

With further online searches, however, Rothman found a wedding announcement from 1963. Ellen Fetter had married John Gille, a physicist, and changed her name. A colleague of Rothman’s then remembered that a graduate student named Sarah Gille had studied at MIT in the 1990s in the very same department as Lorenz and Rothman. Rothman reached out to her, and it turned out that Sarah Gille, now a physical oceanographer at the University of California, San Diego, was Ellen and John’s daughter. Through this connection, Rothman was able to get Ellen Gille, née Fetter, on the phone. And that’s when he learned another name, the name of the woman who had preceded Fetter in the job of programming Lorenz’s first meetings with chaos: Margaret Hamilton.

When Margaret Hamilton arrived at MIT in the summer of 1959, with a freshly minted math degree from Earlham College, Lorenz had only recently bought and taught himself to use the LGP-30. Hamilton had no prior training in programming either. Then again, neither did anyone else at the time. “He loved that computer,” Hamilton said. “And he made me feel the same way about it.”

For Hamilton, these were formative years. She recalls being out at a party at three or four a.m., realizing that the LGP-30 wasn’t set to produce results by the next morning, and rushing over with a few friends to start it up. Another time, frustrated by all the things that had to be done to make another run after fixing an error, she devised a way to bypass the computer’s clunky debugging process. To Lorenz’s delight, Hamilton would take the paper tape that fed the machine, roll it out the length of the hallway, and edit the binary code with a sharp pencil. “I’d poke holes for ones, and I’d cover up with Scotch tape the others,” she said. “He just got a kick out of it.”

There were desks in the computer room, but because of the noise, Lorenz, his secretary, his programmer and his graduate students all shared the other office. The plan was to use the desk computer, then a total novelty, to test competing strategies of weather prediction in a way you couldn’t do with pencil and paper.

First, though, Lorenz’s team had to do the equivalent of catching the Earth’s atmosphere in a jar. Lorenz idealized the atmosphere in 12 equations that described the motion of gas in a rotating, stratified fluid. Then the team coded them in.

Sometimes the “weather” inside this simulation would simply repeat like clockwork. But Lorenz found a more interesting and more realistic set of solutions that generated weather that wasn’t periodic. The team set up the computer to slowly print out a graph of how one or two variables — say, the latitude of the strongest westerly winds — changed over time. They would gather around to watch this imaginary weather, even placing little bets on what the program would do next.

And then one day it did something really strange. This time they had set up the printer not to make a graph, but simply to print out time stamps and the values of a few variables at each time. As Lorenz later recalled, they had re-run a previous weather simulation with what they thought were the same starting values, reading off the earlier numbers from the previous printout. But those weren’t actually the same numbers. The computer was keeping track of numbers to six decimal places, but the printer, to save space on the page, had rounded them to only the first three decimal places.

After the second run started, Lorenz went to get coffee. The new numbers that emerged from the LGP-30 while he was gone looked at first like the ones from the previous run. This new run had started in a very similar place, after all. But the errors grew exponentially. After about two months of imaginary weather, the two runs looked nothing alike. This system was still deterministic, with no random chance intruding between one moment and the next. Even so, its hair-trigger sensitivity to initial conditions made it unpredictable.
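A minimal sketch can reproduce the effect. Lorenz’s 1961 program is long gone, so this is an illustration rather than a reconstruction: it integrates the three-equation convection system he later borrowed from Saltzman (with the now-standard parameter values sigma = 10, rho = 28, beta = 8/3), once from a six-decimal starting point and once from the same point rounded to three decimals, mimicking the printout.

```python
# Illustration only: Lorenz's actual 1961 model had 12 equations; these
# are the three convection equations with assumed standard parameters.

def lorenz_step(state, dt=0.001, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the three convection equations by one Euler step."""
    x, y, z = state
    return (x + dt * sigma * (y - x),
            y + dt * (x * (rho - z) - y),
            z + dt * (x * y - beta * z))

def run(state, steps):
    for _ in range(steps):
        state = lorenz_step(state)
    return state

full = (1.000123, 1.000123, 1.000123)       # six decimal places
rounded = tuple(round(v, 3) for v in full)  # what the printout kept

a, b = full, rounded
gaps = []
for _ in range(25):                  # 25 units of simulated time
    a, b = run(a, 1000), run(b, 1000)
    gaps.append(abs(a[0] - b[0]))

max_gap = max(gaps)
# Early on the two runs agree closely; later they look nothing alike.
print(gaps[0], max_gap)
```

After the first simulated time unit the two runs still nearly coincide; some twenty units later they have diverged completely, even though nothing random ever intervened.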

This meant that in chaotic systems the smallest fluctuations get amplified. Weather predictions fail once they reach some point in the future because we can never measure the initial state of the atmosphere precisely enough. Or, as Lorenz would later present the idea, even a seagull flapping its wings might eventually make a big difference to the weather. (In 1972, the seagull was deposed when a conference organizer, unable to check back about what Lorenz wanted to call an upcoming talk, wrote his own title that switched the metaphor to a butterfly.)

Many accounts, including the one in Gleick’s book, date the discovery of this butterfly effect to 1961, with the paper following in 1963. But in November 1960, Lorenz described it during the Q&A session following a talk he gave at a conference on numerical weather prediction in Tokyo. After his talk, a question came from a member of the audience: “Did you change the initial condition just slightly and see how much different results were?”

“As a matter of fact, we tried out that once with the same equation to see what could happen,” Lorenz said. He then started to explain the unexpected result, which he wouldn’t publish for three more years. “He just gives it all away,” Rothman said now. But no one at the time registered it enough to scoop him.

In the summer of 1961, Hamilton moved on to another project, but not before training her replacement. Two years after Hamilton first stepped on campus, Ellen Fetter showed up at MIT in much the same fashion: a recent graduate of Mount Holyoke with a degree in math, seeking any sort of math-related job in the Boston area, eager and able to learn. She interviewed with a woman who ran the LGP-30 in the nuclear engineering department, who recommended her to Hamilton, who hired her.

Once Fetter arrived in Building 24, Lorenz gave her a manual and a set of programming problems to practice, and before long she was up to speed. “He carried a lot in his head,” she said. “He would come in with maybe one yellow sheet of paper, a legal piece of paper in his pocket, pull it out, and say, ‘Let’s try this.’”

The project had progressed meanwhile. The 12 equations produced fickle weather, but even so, that weather seemed to prefer a narrow set of possibilities among all possible states, forming a mysterious cluster which Lorenz wanted to visualize. Finding that difficult, he narrowed his focus even further. From a colleague named Barry Saltzman, he borrowed just three equations that would describe an even simpler nonperiodic system, a beaker of water heated from below and cooled from above.

Here, again, the LGP-30 chugged its way into chaos. Lorenz identified three properties of the system corresponding roughly to how fast convection was happening in the idealized beaker, how the temperature varied from side to side, and how the temperature varied from top to bottom. The computer tracked these properties moment by moment.

The properties could also be represented as a point in space. Lorenz and Fetter plotted the motion of this point. They found that over time, the point would trace out a butterfly-shaped fractal structure now called the Lorenz attractor. The trajectory of the point — of the system — would never retrace its own path. And as before, two systems setting out from two minutely different starting points would soon be on totally different tracks. But just as profoundly, wherever you started the system, it would still head over to the attractor and start doing chaotic laps around it.

The attractor and the system’s sensitivity to initial conditions would eventually be recognized as foundations of chaos theory. Both were published in the landmark 1963 paper. But for a while only meteorologists noticed the result. Meanwhile, Fetter married John Gille and moved with him when he went to Florida State University and then to Colorado. They stayed in touch with Lorenz and saw him at social events. But she didn’t realize how famous he had become.

Still, the notion of small differences leading to drastically different outcomes stayed in the back of her mind. She remembered the seagull, flapping its wings. “I always had this image that stepping off the curb one way or the other could change the course of any field,” she said.

After leaving Lorenz’s group, Hamilton embarked on a different path, achieving a level of fame that rivals or even exceeds that of her first coding mentor. At MIT’s Instrumentation Laboratory, starting in 1965, she headed the onboard flight software team for the Apollo project.

Her code held up when the stakes were life and death — even when a mis-flipped switch triggered alarms that interrupted the astronaut’s displays right as Apollo 11 approached the surface of the moon. Mission Control had to make a quick choice: land or abort. But trusting the software’s ability to recognize errors, prioritize important tasks, and recover, the astronauts kept going.

Hamilton, who popularized the term “software engineering,” later led the team that wrote the software for Skylab, the first U.S. space station. She founded her own company in Cambridge in 1976, and in recent years her legacy has been celebrated again and again. She won NASA’s Exceptional Space Act Award in 2003 and received the Presidential Medal of Freedom in 2016. In 2017 she garnered arguably the greatest honor of all: a Margaret Hamilton Lego minifigure.

Fetter, for her part, continued to program at Florida State after leaving Lorenz’s group at MIT. After a few years, she left her job to raise her children. In the 1970s, she took computer science classes at the University of Colorado, toying with the idea of returning to programming, but she eventually took a tax preparation job instead. By the 1980s, the demographics of programming had shifted. “After I sort of got put off by a couple of job interviews, I said forget it,” she said. “They went with young, techy guys.”

Chaos only reentered her life through her daughter, Sarah. As an undergraduate at Yale in the 1980s, Sarah Gille sat in on a class about scientific programming. The case they studied? Lorenz’s discoveries on the LGP-30. Later, Sarah studied physical oceanography as a graduate student at MIT, joining the same overarching department as both Lorenz and Rothman, who had arrived a few years earlier. “One of my office mates in the general exam, the qualifying exam for doing research at MIT, was asked: How would you explain chaos theory to your mother?” she said. “I was like, whew, glad I didn’t get that question.”

Today, chaos theory is part of the scientific repertoire. In a study published just last month, researchers concluded that no amount of improvement in data gathering or in the science of weather forecasting will allow meteorologists to produce useful forecasts that stretch more than 15 days out. (Lorenz had suggested a similar two-week cap to weather forecasts in the mid-1960s.)

But the many retellings of chaos’s birth say little to nothing about how Hamilton and Ellen Gille wrote the specific programs that revealed the signatures of chaos. “This is an all-too-common story in the histories of science and technology,” wrote Jennifer Light, the department head for MIT’s Science, Technology and Society program, in an email to *Quanta.* To an extent, we can chalk up that omission to the tendency of storytellers to focus on solitary geniuses. But it also stems from tensions that remain unresolved today.

First, coders in general have seen their contributions to science minimized from the beginning. “It was seen as rote,” said Mar Hicks, a historian at the Illinois Institute of Technology. “The fact that it was associated with machines actually gave it less status, rather than more.” But beyond that, and contributing to it, many programmers in this era were women.

In addition to Hamilton and the woman who coded in MIT’s nuclear engineering department, Ellen Gille recalls a woman on an LGP-30 doing meteorology next door to Lorenz’s group. Another woman followed Gille in the job of programming for Lorenz. An analysis of official U.S. labor statistics shows that in 1960, women held 27 percent of computing and math-related jobs.

The percentage has been stuck there for a half-century. In the mid-1980s, the fraction of women pursuing bachelor’s degrees in programming even started to decline. Experts have argued over why. One idea holds that early personal computers were marketed preferentially to boys and men. Then when kids went to college, introductory classes assumed a detailed knowledge of computers going in, which alienated young women who didn’t grow up with a machine at home. Today, women programmers describe a self-perpetuating cycle where white and Asian male managers hire people who look like all the other programmers they know. Outright harassment also remains a problem.

Hamilton and Gille, however, still speak of Lorenz’s humility and mentorship in glowing terms. Before later chroniclers left them out, Lorenz thanked them in the literature in the same way he thanked Saltzman, who provided the equations Lorenz used to find his attractor. This was common at the time. Gille recalls that in all her scientific programming work, only once did someone include her as a co-author after she contributed computational work to a paper; she said she was “stunned” because of how unusual that was.

Since then, the standard for giving credit has shifted. “If you went up and down the floors of this building and told the story to my colleagues, every one of them would say that if this were going on today … they’d be a co-author!” Rothman said. “Automatically, they’d be a co-author.”

Computation in science has become even more indispensable, of course. For recent breakthroughs like the first image of a black hole, the hard part was not figuring out which equations described the system, but how to leverage computers to understand the data.

Today, many programmers leave science not because their role isn’t appreciated, but because coding is better compensated in industry, said Alyssa Goodman, an astronomer at Harvard University and an expert in computing and data science. “In the 1960s, there was no such thing as a data scientist, there was no such thing as Netflix or Google or whoever, that was going to suck in these people and really, really value them,” she said.

Still, for coder-scientists in academic systems that measure success by paper citations, things haven’t changed all that much. “If you are a software developer who may never write a paper, you may be essential,” Goodman said. “But you’re not going to be counted that way.”

One type of recursive sentence uses a grammatical structure called “center-embedding.” An example is the sentence, “The dog the man the maid married owned died.” You can make this somewhat easier to understand by inserting the linking words: “The dog *that* the man *whom* the maid married owned died.” Most people can understand who did what here, but it’s already getting tough. Now consider the Yale football cheer “Bulldogs bulldogs bulldogs fight fight fight.” The Rutgers University cognitive philosopher Jerry Fodor once pointed out that this is a grammatically correct triple center-embedded sentence. Your challenge is to try to understand how this cheer works as a real sentence. To make it more specific, imagine that the first set of bulldogs is red, the second brown and the third white. Try to answer the following questions:

- Whom do the red bulldogs fight?
- What color bulldogs do the brown bulldogs fight?
- Which bulldogs fight the brown bulldogs?
- What color bulldogs do the white bulldogs fight?

This puzzle can make your brain feel like the bruised and battered bulldogs so wonderfully rendered in Dan Page’s illustration. The best way to deal with recursion while minimizing brain strain is as follows:

- Start as simply as possible.
- Build on the recursion one element at a time, looking for a pattern.
- Once you find the pattern, let the pattern do the work.

Let’s apply these techniques to the fighting bulldogs. The simplest possible sentence, which you get by removing all the center-embedding, is: Bulldogs fight.

Whom do they fight? This is not specified — *fight* is an intransitive verb here, a verb without an object. So the meaning conveyed here is that these bulldogs fight generally or among themselves.

Following our color code, these were the red bulldogs, so the red bulldogs fight generally.

Now let’s add the center-embedding of the second set of bulldogs: Bulldogs bulldogs fight fight. The sentence, adding in the colors, becomes: The (red) bulldogs, whom the (brown) bulldogs fight, fight.

The verb *fight* is transitive here and does have an object. The brown bulldogs fight the red bulldogs specifically. The red bulldogs, as far as we know, don’t change their character. They continue to fight generally.

Okay, let’s add the third set: Bulldogs bulldogs bulldogs fight fight fight. With the colors, we have this sentence: The (red) bulldogs, whom the (brown) bulldogs, whom the (white) bulldogs fight, fight, fight.

The verb *fight* is again transitive here, and its object is the brown bulldogs. The white bulldogs fight the brown bulldogs specifically. Again, the description of the first two sets of bulldogs is not changed in any way.

The answers to the questions are:

- The red bulldogs fight **generally**.
- The brown bulldogs fight the **red** bulldogs.
- The **white** bulldogs fight the brown bulldogs.
- The white bulldogs fight the **brown** bulldogs.

(You may object that since the brown bulldogs fight the red bulldogs, the red bulldogs must fight them too. But this inference is not necessarily true: The red bulldogs might just keep fighting generally or among themselves, without paying any attention to the brown bulldogs, for all we know.)

As you can see, we arrive at this pattern: Group one fights generally, group two fights group one, group three fights group two, and so on. Thus, we can construct a recursive center-embedded sentence consisting of any number of bulldog groups and very easily determine who fights whom. A hypothetical group four would fight group three, group five would fight group four, and so on. Finding the pattern enables you to correctly answer the questions by offloading the work to the pattern or algorithm, without having to tax your brain’s short-term memory.
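Once the pattern is in hand, it can even be written down as a tiny program (a hypothetical helper, not part of the original puzzle column):

```python
def who_fights_whom(groups):
    """Apply the center-embedding pattern: the first group fights
    generally; every later group fights the group just before it."""
    targets = {groups[0]: "generally"}
    for attacker, victim in zip(groups[1:], groups):
        targets[attacker] = victim
    return targets

# The three groups from the cheer, in order of embedding:
print(who_fights_whom(["red", "brown", "white"]))
# -> {'red': 'generally', 'brown': 'red', 'white': 'brown'}
```

Adding a hypothetical fourth or fifth group to the list extends the answer automatically, which is exactly the point of letting the pattern do the work.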

This is similar to how mathematicians deal with higher dimensions. No one can visualize the fourth dimension and beyond in the way we can visualize the first three. Not even Einstein could visualize higher dimensions. You have to find patterns or algorithms that generalize to the higher dimensions. Then you can accurately answer questions about them.

Several readers got the right answers, including boymeetswool, plouf, Nqabutho and Danielle Schaper. I enjoyed boymeetswool’s diagram and explanation of the sentence tree — it is perfectly correct. Danielle Schaper also produced some very nice diagrams based on the idea that subsets of bulldogs of each color are being referred to. This is one way to interpret the sentence, but it involves an additional assumption that is not specified explicitly. The most parsimonious approach is to assume that all the bulldogs of a given color do the same thing.

Our second problem is briefly summarized below. For the full description, see the original puzzle column.

On a faraway planet with perfectly logical beings, there was a legislative assembly consisting of 100 members who met every day. Some among them were pathological liars, or “pathos.” A patho was perceived as having a long nose by everybody else, including other pathos. However, a patho was completely unaware that he or she had a long nose and couldn’t hear about it from anyone because of a taboo that prevented others from pointing out or saying anything about anybody else’s long nose. Any patho who logically inferred that he or she was a patho was compelled to resign before the end of the business day. One day an alien leader not bound by the taboo addressed the assembly and said, “I perceive that at least one of you has a long nose.” Many days later, about half of the legislators resigned en masse.

Question 1 (parallel universe situation): What could have caused this strange pattern of resignations?

Let us apply the recursion-solving rules stated above and start small. Suppose that there was only one person in the assembly who had a long nose. Then the alien leader’s statement would allow that one person to infer that he or she was the one with the long nose and cause him or her to resign. Let’s say this happens on day one (more on this choice below).

If there were two people with long noses (let’s say a male, *A*, and a female, *B*), each of them would see one person with a long nose. Legislator *A* infers that *B* would see nobody with a long nose if he, *A*, doesn’t have one, or she would see one person if he, *A*, does. In the first case, *B* would be compelled to resign on day one. But that wouldn’t happen because she does see *A*’s long nose, and by an identical reasoning process, she waits to see if he resigns on day one, which would mean that she, *B*, did not have a long nose. Thus, neither *A* nor *B* resigns on day one. On day two, both *A* and *B* realize that they have long noses and resign.

If there were three people with long noses, each would see two pathos, and they would have to wait two days to see if both resign on day two, which would mean they themselves were not pathos. When that doesn’t happen, then they would all be forced to resign on day three.

The pattern is already becoming clear. If *n* people in the assembly are pathos, then:

- The mass resignations will happen on day *n*.
- Everyone’s resignation day is one more than the number of pathos they see. Pathos see *n-1* pathos, so their resignation day is *n*, while all nonpathos see *n* pathos, so their resignation day (if others don’t resign first) is *n+1*.

All of the pathos are hoping that the mass resignations happen on day *n–1*, but when that doesn’t happen, they realize they are pathos themselves and resign the next day. Nonpathos see *n* people with long noses, and they are hoping that the resignations happen on day *n*, which indeed happens, confirming that they are not pathos.

So if about half of the legislators in the parallel universe resigned, then about half of them must have been pathos to begin with.
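The induction is easy to check with a short simulation, under the parallel-universe conditions (everyone meets everyone every day, and day one is the first full day after the announcement). The helper below is illustrative, not from the puzzle column:

```python
def mass_resignation(is_patho):
    """Each member sees every long nose except (possibly) their own.
    A member who sees k long noses knows there are k or k+1 pathos;
    if nobody has resigned by the end of day k, they must be the
    (k+1)-th patho and must resign on day k+1."""
    total = sum(is_patho)
    seen = [total - 1 if p else total for p in is_patho]
    day = 0
    while True:
        day += 1
        resigning = [i for i, k in enumerate(seen) if day == k + 1]
        if resigning:
            return day, resigning

# 100 legislators, about half of them pathos:
day, who = mass_resignation([True] * 48 + [False] * 52)
print(day, len(who))   # the 48 pathos all resign together on day 48
```

The nonpathos, who see 48 long noses, were braced to resign on day 49; the resignations on day 48 arrive just in time to spare them.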

We know that the pathos’ days as legislators are numbered, but do we number them starting from zero or one? This point came up in the long discussion between plouf and Sunil Nandella.

We decided above that we should call the day the lone patho resigns “day one.” If we are thinking in terms of days after the alien leader’s visit, then the alien leader must have addressed the assembly after the day’s work was done (say, in an after-dinner speech), so a lone patho would have to resign the next day. On the other hand, if the alien leader addressed the assembly in a morning gathering, then a lone patho would be compelled to resign the same day, which would be day zero. That choice changes every resignation day by 1.

This is the basis of the very common off-by-one error (OBOE or OB1) that causes bugs in computer software. Whether you use the index 1 or 0 for the first element of an array of numbers or the first step of a recursion is an arbitrary choice that has to be made at the outset and applied consistently. Some computer languages force you to use one or the other (usually 0), whereas others allow you to choose, using a declaration such as “Option Base 0” or “Option Base 1.” The first convention is commonly used by software engineers; the second is the one that all of us use naturally when counting, and it’s the one we used here.
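The two conventions can be made concrete in a few lines (the helper names here are hypothetical, chosen only to contrast the numbering schemes):

```python
# Hypothetical helpers contrasting the two day-numbering conventions:
# the same events happen either way; only the labels shift by one.

def resignation_day_base1(n_pathos):
    """Evening announcement: a lone patho resigns the next day (day 1),
    so n pathos resign together on day n."""
    return n_pathos

def resignation_day_base0(n_pathos):
    """Morning announcement: a lone patho resigns the same day (day 0),
    shifting every resignation day down by one."""
    return n_pathos - 1

# The classic off-by-one trap: both numbers describe the same morning.
print(resignation_day_base1(48), resignation_day_base0(48))  # -> 48 47
```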

The above parallel universe scenario can take place only if all the legislators meet all the others every day and no one is ever absent. The *Quanta* universe scenario (questions 2 through 6) considers what happens in a messier situation where people may be absent temporarily (lost at sea and later found), absent indefinitely (slip into a coma) or where a nonpatho can change into a patho. Specifically, the puzzle stated that:

a.) On the 35th day, a nonpatho changes into a patho irreversibly.

b.) On the 43rd day, three legislators, including one patho, were lost at sea. On the 46th evening, the lost patho was found and rejoined the assembly the next day.

c.) On the 45th day, a legislator who was a patho slipped into a coma.

d.) On the 49th day, there was a mass resignation of a large number of legislators.

Before we answer the remaining questions, let us consider how the absence of a patho from the assembly (as happens in items b and c) should be handled: Should the remaining pathos take the absence into account and resign one day earlier, or should they stick to the original plan? Let’s examine a simple situation.

Assume that there are three pathos in the group. After day one, one of them, *C,* goes missing, so that, as above, only a male, *A,* and a female, *B,* remain. Should they resign on day two now, because only two of them are actually present (modified plan), or should they follow the original plan and resign on day three?

On day two, legislator *A* will reason as follows: “Either I have a long nose, or I do not. If I do not, then the only long-nosed person *B* would have seen is *C,* and she can assume that the alien leader was referring to *C* as the one with the long nose. So there is nothing to compel her to resign today. If I do have a long nose, then *B* will reason in the same way about me that I am reasoning about her. So I cannot be certain whether I have a long nose, and neither can she.” Hence neither *A* nor *B* can be fully certain of being a patho, and so neither can logically resign on day two. The modified plan does not work.

On the other hand, everything works if they adhere to the original plan and wait until day three, imagining that *C* is watching everything from some virtual place and would have followed the plan too if he were present (even if he is actually dead, for all they know). All three pathos had originally seen two people with long noses, and nobody resigned on day two. So they must resign on day three.

What about the addition of a newly created patho, as happens in item a above? In this case, from the point of view of the person who has become a patho, nothing has changed, because she never knew whether she was a patho anyway: She still sees *n* pathos, just as she always did. The original pathos, who previously saw *n-1* long noses, now also see *n,* and the nonpathos see *n+1*. This calls for everyone except the affected person to shift their resignation plan one day later than it originally was: The pathos’ resignation day is now *n+1*, and the nonpathos’ is *n+2*. You can see that this situation is indistinguishable from having *n+1* pathos from the beginning, provided that all the legislators except the changed one know about the change, which is true in this instance.

So the principle for absences is: Stick to the original plan, and the resignation day remains the same. For additions such as the universally observable change (except to the affected person) of a nonpatho into a patho, add 1 to the resignation day.
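The principle can be checked with a small simulation in which every legislator mechanically follows the plan “if I see *m* long noses, I resign on day *m* + 1, unless someone resigns first.” This is an illustrative sketch under our base-one convention; the function name and encoding are my own, not part of the puzzle:

```python
def first_resignations(long_nose):
    """Simulate the resignation plan with the base day counted as day one.

    long_nose: list of booleans, True if that legislator is a patho.
    Every legislator sees all noses but their own and plans to resign
    on day (long noses seen) + 1; the simulation stops at the first
    day on which anyone resigns, since that reveals everyone's status.
    """
    seen = [sum(long_nose) - long_nose[i] for i in range(len(long_nose))]
    day = 0
    while True:
        day += 1
        resigners = [i for i, m in enumerate(seen) if m + 1 == day]
        if resigners:
            return day, resigners

# Three pathos among five legislators: exactly the pathos resign, on day three.
day, who = first_resignations([True, False, True, False, True])
assert day == 3 and who == [0, 2, 4]

# An observed nonpatho-to-patho change raises every other legislator's
# count by 1, which is indistinguishable from one more patho from the
# start and pushes the resignation one day later:
day2, _ = first_resignations([True, True, True, True])
assert day2 == 4
```

Absences are invisible to the simulation because the plan depends only on the count each legislator originally saw, which is exactly why sticking to the original plan leaves the resignation day unchanged.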

With this in mind, let’s answer the remaining questions:

2. Given all the events described, how many pathos were in the assembly originally and how many resigned on the 49th day?

There were 48 pathos originally, and 48 resigned on the 49th day (with the base day understood as day one). The 48 pathos would all have resigned on day 48, but the addition of the new patho, visible to everyone except the affected legislator, forced them to move the resignation day to day 49. The comatose person was the missing 49th patho. (If the base day was understood as day zero, add 1 to all the answers.)

3. Was the originally truthful legislator who became a patho on the 35th day among those who resigned?

Yes. He had always seen 48 other pathos, so his resignation day was always 49.

4. The legislator who had slipped into a coma recovered and returned to the assembly on the 50th day. He was briefed about everything that had happened. Did he resign?

Yes. When he returned, he asked for the names of the people who had resigned. He saw the name of the legislator who had changed to a patho on the 35th day. He went down the list, desperately scouring it for a name that he knew to have been that of a nonpatho as of the 44th day. He didn’t find any. He resigned.

5. What would have happened if the legislator who had been lost at sea had been found a day later?

Nothing. All the resignations would still have happened on the 49th day. Absences do not affect the set plan.

6. Why was the Truon leader’s visit necessary for the mass resignations to occur? After all, all of the legislators always knew that at least one of them had a long nose.

This question holds the key to this puzzle. This is what is known as a “common knowledge” problem. Common knowledge, in logic, is defined as knowledge of some truth, *p*, in a group, such that everybody in the group knows *p*, everybody knows that everybody knows *p*, everybody knows that everybody knows that everybody knows *p*, and so on ad infinitum. It is this common, infinitely recursive web of knowledge that the alien leader’s words impart to the legislators.

This web of recursive knowledge remains intact in the parallel universe because the legislators are present every day and are able to carry the recursive reasoning forward one step at a time. Thus, all the pathos are able to reach the same conclusion on the same day. When absences and additions occur, the original plan may need to be modified, and this can be done logically, but only among the original legislators who were present at the declaration and therefore knew that everyone knew what they knew. The alien leader’s declaration also gives a common base for counting the days, starting on the day the declaration was made and the web of common knowledge was constructed. As the days pass, this web is recursively unraveled from the outside in, until resignation day arrives and everyone can logically infer their true nature.

The *Quanta* prize for this puzzle goes to plouf, who answered both puzzles correctly. Plouf’s answers for the second puzzle counted the base day as zero, so they are off by 1 from the ones given here, but they are correct for the assumed base day.

I hope you enjoyed this brain twister. If you need some time to untwist, you have a few weeks before we return next month with more *Insights*.