IMC, Author at International Maths Challenge

Five ways ancient India changed the world – with maths

Posted on April 24, 2022 (updated April 22, 2025) by IMC

It should come as no surprise that the first recorded use of the number zero, recently discovered to be made as early as the 3rd or 4th century, happened in India. Mathematics on the Indian subcontinent has a rich history going back over 3,000 years and thrived for centuries before similar advances were made in Europe, with its influence meanwhile spreading to China and the Middle East.

As well as giving us the concept of zero, Indian mathematicians made seminal contributions to the study of trigonometry, algebra, arithmetic and negative numbers among other areas. Perhaps most significantly, the decimal system that we still employ worldwide today was first seen in India.

The number system

As far back as 1200 BC, mathematical knowledge was being written down as part of a large body of knowledge known as the Vedas. In these texts, numbers were commonly expressed as combinations of powers of ten. For example, 365 might be expressed as three hundreds (3×10²), six tens (6×10¹) and five units (5×10⁰), though each power of ten was represented with a name rather than a set of symbols. It is reasonable to believe that this representation using powers of ten played a crucial role in the development of the decimal-place value system in India.
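As a small illustration of the powers-of-ten representation described above, the following sketch splits a whole number into its digit and exponent pairs. The function name `powers_of_ten` is made up for this example and is not taken from any historical source.

```python
def powers_of_ten(n):
    """Split a non-negative integer into (digit, exponent) pairs, highest power first."""
    digits = str(n)
    top = len(digits) - 1
    return [(int(d), top - i) for i, d in enumerate(digits)]


terms = powers_of_ten(365)
print(terms)  # [(3, 2), (6, 1), (5, 0)]

# Recombining the pairs gives back the original number.
print(sum(d * 10**e for d, e in terms))  # 365
```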

Brahmi numerals. Wikimedia

From the third century BC, we also have written evidence of the Brahmi numerals, the precursors to the modern, Indian or Hindu-Arabic numeral system that most of the world uses today. Once zero was introduced, almost all of the mathematical mechanics would be in place to enable ancient Indians to study higher mathematics.

The concept of zero

Zero itself has a much longer history. The recently dated first recorded zeros, in what is known as the Bakhshali manuscript, were simple placeholders – a tool to distinguish 100 from 10. Similar marks had already been seen in the Babylonian and Mayan cultures in the early centuries AD and arguably in Sumerian mathematics as early as 3000-2000 BC.

But only in India did the placeholder symbol for nothing progress to become a number in its own right. The advent of the concept of zero allowed numbers to be written efficiently and reliably. In turn, this allowed for effective record-keeping that meant important financial calculations could be checked retroactively, ensuring the honest actions of all involved. Zero was a significant step on the route to the democratisation of mathematics.

These accessible mechanical tools for working with mathematical concepts, in combination with a strong and open scholastic and scientific culture, meant that, by around 600 AD, all the ingredients were in place for an explosion of mathematical discoveries in India. In comparison, these sorts of tools were not popularised in the West until the early 13th century, through Fibonacci’s book Liber Abaci.

Solutions of quadratic equations

In the seventh century, the first written rules for working with zero were formalised in the Brahmasphuta Siddhanta. In his seminal text, the astronomer Brahmagupta introduced rules for solving quadratic equations (so beloved of secondary school mathematics students) and for computing square roots.

Rules for negative numbers

Brahmagupta also demonstrated rules for working with negative numbers. He referred to positive numbers as fortunes and negative numbers as debts. He wrote down rules that have been interpreted by translators as: “A fortune subtracted from zero is a debt,” and “a debt subtracted from zero is a fortune”.

This latter statement is the same as the rule we learn in school, that if you subtract a negative number, it is the same as adding a positive number. Brahmagupta also knew that “The product of a debt and a fortune is a debt” – a positive number multiplied by a negative is a negative.
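The three rules quoted above can be checked with ordinary signed arithmetic. This is only a sketch in modern notation; the value 7 is an arbitrary choice for illustration.

```python
fortune = 7   # a positive number (arbitrary example value)
debt = -7     # the matching negative number

# "A fortune subtracted from zero is a debt."
print(0 - fortune == debt)   # True

# "A debt subtracted from zero is a fortune."
print(0 - debt == fortune)   # True

# "The product of a debt and a fortune is a debt."
print(fortune * debt < 0)    # True
```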

For the most part, European mathematicians were reluctant to accept negative numbers as meaningful. Many took the view that negative numbers were absurd. They reasoned that numbers were developed for counting and questioned what you could count with negative numbers. Indian and Chinese mathematicians recognised early on that one answer to this question was debts.

For example, in a primitive farming context, if one farmer owes another farmer 7 cows, then effectively the first farmer has -7 cows. If the first farmer goes out to buy some animals to repay his debt, he has to buy 7 cows and give them to the second farmer in order to bring his cow tally back to 0. From then on, every cow he buys goes to his positive total.

Basis for calculus

This reluctance to adopt negative numbers, and indeed zero, held European mathematics back for many years. Gottfried Wilhelm Leibniz was one of the first Europeans to use zero and the negatives in a systematic way in his development of calculus in the late 17th century. Calculus is used to measure rates of change and is important in almost every branch of science, notably underpinning many key discoveries in modern physics.

Leibniz: Beaten to it by 500 years.

But Indian mathematician Bhāskara had already discovered many of Leibniz’s ideas over 500 years earlier. Bhāskara also made major contributions to algebra, arithmetic, geometry and trigonometry. He provided many results, for example on the solutions of certain “Diophantine” equations, that would not be rediscovered in Europe for centuries.

The Kerala school of astronomy and mathematics, founded by Madhava of Sangamagrama in the 1300s, was responsible for many firsts in mathematics, including the use of mathematical induction and some early calculus-related results. Although no systematic rules for calculus were developed by the Kerala school, its proponents first conceived of many of the results that would later be repeated in Europe including Taylor series expansions, infinitesimals and differentiation.
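One well-known series of this kind, usually attributed to Madhava and later rediscovered by Leibniz, is π/4 = 1 − 1/3 + 1/5 − 1/7 + … The sketch below sums its first few terms to show how slowly it converges. The function name `madhava_pi` is chosen for this example only, and the code is a modern restatement rather than the Kerala school's own method.

```python
import math


def madhava_pi(terms):
    """Approximate pi from the first `terms` terms of pi/4 = 1 - 1/3 + 1/5 - ..."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))


# The error shrinks roughly in proportion to 1/terms.
for n in (10, 1000, 100000):
    approx = madhava_pi(n)
    print(n, approx, abs(math.pi - approx))
```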

The leap, made in India, that transformed zero from a simple placeholder to a number in its own right indicates the mathematically enlightened culture that was flourishing on the subcontinent at a time when Europe was stuck in the dark ages. Although its reputation suffers from the Eurocentric bias, the subcontinent has a strong mathematical heritage, which it continues into the 21st century by providing key players at the forefront of every branch of mathematics.

For more insights like this, visit our website at www.international-maths-challenge.com.
Credit of the article given to Christian Yates

How statistical thinking should shape the courtroom

Posted on April 21, 2022 (updated April 22, 2025) by IMC

The probabilistic revolution first kicked off in the 1600s, when gamblers realized that estimating the likelihood of an event could give them an edge in games of chance.

Today, statistics has become the dominant way to communicate scientific findings. But courts can be hesitant to incorporate statistical evidence into decisions. Indeed, they have historically been antagonistic toward probabilities and are loath to be swindled by slippery statistics.

However, as an educator of statistics who has consulted in a variety of contexts and has served as expert witness to the U.S. District Court in Montana, I find that both my experience and my review of the evidence suggest that courts increasingly feature statistical thinking – whether or not it is identified as such.

Society needs to prioritize educating juries in the language of statistics. Otherwise, juries will be forever at the mercy of convincing, yet potentially invalid, testimony. Courtroom decisions should be based on facts and probabilities, not manipulation by a skilled prosecutor or defense attorney.

Thinking statistically

Probabilities changed the way human beings thought about outcomes. They are a useful tool for expressing our uncertainty about events in the world.

Will it rain today? It will or it will not, that much is certain. But probability allows us to express our ignorance about whether it will rain and quantify the degree to which we are uncertain. Stating “it will probably rain today” constituted a very innovative and different way of thinking.

Probabilities play a role in our daily lives, in decisions from whether to take an umbrella to work to whether to purchase flood insurance. We can consider “statistical thinking” to be any situation where probabilities are involved.

To some extent, humans are intuitive statisticians. For instance, research suggests we can revise a belief in the light of new evidence as prescribed by a statistical theorem, if the probabilities are given in a relatively intuitive rather than abstract fashion.

Statistical reasoning pervades many of the conclusions we draw regarding scientific phenomena. Even physics has had to acknowledge the reality of probabilities. So, if the courts use scientific findings as evidence, probabilities should naturally make their way into courtroom decisions.

Evaluating the evidence

If juries do not understand the nature of statistical conclusions, then they will be tempted to believe that scientific evidence is conclusive and deterministic, rather than probabilistic. For example, probabilities show us that cigarette smoking does not necessarily lead to cancer. Rather, extensive nicotine addiction likely leads to cancer.

Heads or tails? armydre2008/flickr, CC BY

Evidence can only fit a theory probabilistically. If we flip a coin 10 times and get 10 heads in a row, that suggests the coin may not be fair, but does not “prove” that it is biased.
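To put a number on the coin example, the sketch below computes the chance of ten heads in a row for a fair coin and, for comparison, for a hypothetical coin that lands heads 90 percent of the time. The 90 percent figure is an assumption made up for illustration; it does not come from the article.

```python
from fractions import Fraction


def prob_all_heads(flips, p_heads=Fraction(1, 2)):
    """Probability that every one of `flips` independent tosses lands heads."""
    return p_heads ** flips


fair = prob_all_heads(10)
print(fair)         # 1/1024
print(float(fair))  # 0.0009765625

# Hypothetical biased coin: ten heads is far less surprising here,
# but a fair coin can still produce the same run by chance.
biased = prob_all_heads(10, Fraction(9, 10))
print(float(biased))
```

Ten heads is about 350 times more likely under the biased coin than under the fair one, which is strong evidence but still not proof.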

Consider the analysis of DNA found at the crime scene. Is the DNA that of the accused? Maybe. Not definitively. A statistician might say, “The probability of this degree of DNA match occurring by chance is extremely small. The match may be due to chance, but since this probability is so small, we may conclude that it likely did not occur by chance, and use it as evidence against the accused.”

Of course, human judgment is fickle. Until jurors are trained to make rational decisions based on facts and probabilities, they will continue to be easily swayed by convincing litigators.

In the 1995 trial of OJ Simpson, for example, the bloody gloves found at the crime scene constituted powerful evidence against the accused. The samples obtained were extremely likely to belong to the defendant.

A statistically educated jury would not fall for Johnnie Cochran’s classic defense: “If it does not fit, you must acquit.” They would know in advance that no evidence, whatever the kind, fits a theory perfectly.

Cochran’s statement was, statistically speaking, utter nonsense. Of course no model fits perfectly, but which is the more probable model? That’s the task jurors ultimately face, even if they often perceive it as a “guilt” versus “no guilt” decision.

Whenever courts work with DNA matches, they must incorporate acceptable risk and error. But if such uncertainty can be quantified accurately, then it can serve as an aid in decision-making.

Statistical thinking indeed plays a role in the decision between guilt and innocence in a criminal trial. When a jury renders a “guilty” verdict, there is always the chance that the accused is not guilty, but that the many circumstances of the case simply lined up against him or her to lead the jury to a guilty verdict. In other words, the probability of the observed evidence under the assumption of innocence is so low that the evidence likely occurred under a more probable “narrative” – that of guilt.

But, when we make such a decision, we do so with a risk of error. This could be quite devastating to a defendant falsely put to death when all along he or she was innocent. For example, when researchers applied DNA testing to death row inmates in Illinois, they found that the results exonerated several inmates.

Errors in probability-based decisions can indeed be costly. Without a grasp of how virtually all decisions are based on probabilistic thinking, no jury can be expected to adequately assess any evidence in a rational way.

Base rates

Courts also struggle with whether and how to use base rates, another type of statistical tool.

A base rate is the probability of some characteristic being present in the population. For instance, say an individual takes a diagnostic test for a disease, such as HIV. The probability that she has the disease would be higher if she were sampled from a high-risk group – for example, if she shares needles to support a drug addiction, or engages in promiscuous sex with risky partners.
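The effect of a base rate on a test result can be sketched with Bayes' theorem. All the numbers below (base rates of 0.1 percent and 10 percent, a 99 percent detection rate and a 1 percent false-positive rate) are hypothetical values chosen for illustration, not figures for any real test.

```python
def posterior(base_rate, sensitivity, false_positive_rate):
    """P(condition | positive test), computed with Bayes' theorem."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Same hypothetical test, two hypothetical populations.
low_risk = posterior(base_rate=0.001, sensitivity=0.99, false_positive_rate=0.01)
high_risk = posterior(base_rate=0.10, sensitivity=0.99, false_positive_rate=0.01)

print(round(low_risk, 3))   # 0.09
print(round(high_risk, 3))  # 0.917
```

Under these assumed numbers, an identical positive result means roughly a 9 percent chance of disease in the low-risk group and roughly a 92 percent chance in the high-risk group.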

Courts often ignore base rate information. In Stephens v. State in 1989, the Wyoming Supreme Court heard testimony that “80 to 85 percent of child sexual abuse is committed by a close relative of the child.” They ultimately dismissed this, concluding that it was difficult to understand how statistical information would help reach a decision in an individual case.

In another case, a justice of the Minnesota Supreme Court proclaimed that she was “at a loss to understand” how base rates could help predict whether a particular person posed a danger to the public.

Part of the problem is that this information can appear biased against the accused. For instance, consider again the defendant accused of child sexual abuse. The probability that he is guilty might be evaluated in light of the fact that most perpetrators of abuse are relatives of or closely related to the family. This could be interpreted as biasing the evaluation against the accused. However, the courts have considered base rates in employment discrimination cases, an area where perhaps this information seems more naturally relevant (for example, Hazelwood School District v. United States).

If the courts are willing to use base rate information in discrimination cases, they should be encouraged to consider them in other cases as well, even if they seem less intuitive.

Learning to think statistically

Courts should make it a priority to instruct juries on how to interpret probabilistic evidence, so that they are not at the mercy of a convincing, yet potentially misleading, prosecutor or expert witness.

For example, juries might learn elementary statistics through coin-flipping lessons. This could help them, at minimum, find a way to think about the usual “beyond a reasonable doubt” instruction in a criminal trial.

When the assumption of innocence is rejected in favor of guilt, one does so with a risk of being wrong. How much risk is a jury willing to tolerate? Five percent? One percent? Surely such risk must also depend on the severity of the proposed punishment. Every decision is an exercise in risk and cost benefit analysis.

Until juries learn elements of statistical thinking, they are likely to continue making verdict decisions without the appropriate framework in mind. Probabilities have taken over the world, and this fact needs to be recognized by the courts.

Credit of the article given to Daniel J. Denis

Thinking about How and Why We Prove

Posted on April 5, 2022 (updated April 22, 2025) by IMC

Credit: Evelyn Lamb

Stacking oranges leads to computer-assisted mathematics. But does it feel like mathematics?

Earlier this month, I attended the Joint Mathematics Meetings in Seattle. One of the reasons I enjoy going to the JMM is that I can get a feel for what is going on in parts of mathematics that I’m not terribly familiar with. This year, I attended two talks in a session called “mathematical information in the digital age” that got me thinking about what mathematicians do.

First, a confession: I went to the session because I like oranges. The first talk was by Thomas Hales, who is probably best known for his proof of the Kepler conjecture. In short, the conjecture says that the way grocers stack oranges is indeed the most efficient way to do it. The proof was a long case-by-case exhaustion, and Hales was not satisfied with a referee report that said the referee was 99% sure the proof was correct. So he did what any* mathematician would do: he took more than a decade to write and verify a formal computer proof of the result. I attended the talk because I figured there’s a small chance that any talk that mentions the Kepler conjecture might have oranges for the audience.

Hales’ talk was called simply “Formal Proofs.” These are not proofs that are written using stuffy language, with every single step written out, but proofs that can be input into a computer and verified all the way down to the foundations of mathematics, whichever foundations one chooses.

Hales began his talk with some examples of less-than-formal proofs, starting with a passage from William Thurston in which he used the phrase “subdivide and jiggle,” clearly not a rigorous way to describe mathematics. (Incidentally, Thurston also did mathematics with oranges. He would ask students to peel oranges to better understand 2- and 3-dimensional geometry.)

Although I never met Thurston, I am one of his many mathematical descendants. His approach to mathematics, particularly his emphasis on intuition and imagination, has permeated the culture in my extended mathematical family and has had a great deal of influence on how I think about mathematics. That is why it was so refreshing for me to go to a talk where intuition wasn’t a primary focus.

Hales was certainly not insinuating that Thurston was a bad mathematician. Thurston was only the first mathematician he used as an example of less-than-rigorously stated mathematics. A few slides later he mentioned the Bourbaki book on set theory. Yes, even that paragon of formal mathematics, sucked dry of every drop of intuition, is not really full of formal proofs.

Hales’ talk was a nice overview of the formal proof programs out there, some mathematical results that have been proved formally (including some that were already known), and a nice introduction to where the field is going. I’m particularly interested in learning more about the QED manifesto and FABSTRACTS, a service that would formalize the abstracts of mathematical papers, a much more tractable goal than formalizing an entire paper.

The most amusing moment of the talk, at least to me, was a question from someone in the audience about the possibility of using a formal proof assistant to verify Mochizuki’s proof of the abc conjecture. Hales replied that with the current technology, you do need to understand the proof as you enter it, so there aren’t many people who can do it. The logical response: why doesn’t Mochizuki do it himself? Let’s just say I’m not holding my breath.

The second talk I attended in the session was Michael Shulman’s, called “From the nLab to the HoTT book.” He talked about both the nLab, a wiki for category theory, and the writing of the Homotopy Type Theory “research textbook,” a 600-page tome put together during an IAS semester about homotopy type theory, an alternative to set theory as a foundational system for mathematics. The theme of Shulman’s talk was “one size does not fit all,” either in the way people collaborate (contrasting the wiki and the textbook) or even in the foundations of mathematics (type theory versus set theory).

I don’t know if it was intended, but I thought Shulman’s talk was an interesting counterpoint to Hales’, most relevantly to me in the way it answered one of the questions Hales posed: why don’t more mathematicians use proof assistants? Beyond the fact that proof assistants are currently too unwieldy for many of us, Shulman’s answer was that we do mathematics for understanding, not just truth. He said what I was thinking during Hales’ talk, which was that to many mathematicians, using a formal proof assistant does not “feel like” mathematics. I am not claiming moral high ground here. It is actually something of a surprise to me that the prospect of being able to find and verify new truths more quickly is not more important to me.

You never know what you’re going to get when you wander into a talk that is well outside your mathematical comfort zone. In my case, I didn’t end up with any oranges, but I got some interesting new-to-me perspectives about how and why we prove.

*almost no

Credit of the article given to Evelyn Lamb

There’s a mathematical formula for choosing the fastest queue

Posted on March 24, 2022 (updated April 22, 2025) by IMC

It seems obvious. You arrive at the checkouts and see one queue is much longer than the other, so you join the shorter one. But, before long, the people in the bigger line zoom past you and you’ve barely moved towards the exit.

When it comes to queuing, the intuitive choice is often not the fastest one. Why do queues feel like they slow down as soon as you join them? And is there a way to decide beforehand which line is really the best one to join? Mathematicians have been studying these questions for years. So can they help us spend less time waiting in line?

The intuitive strategy seems to be to join the shortest queue. After all, a short queue could indicate it has an efficient server, and a long queue could imply it has an inexperienced server or customers who need a lot of time. But generally this isn’t true.

Without the right information, it could even be disadvantageous to join the shortest queue. For example, if the short queue at the supermarket has two very full trolleys and the long queue has four relatively empty baskets, many people would actually join the longer queue. If the servers are equally efficient, the important quantity here is the total number of items in the queue, not the number of customers. But if the trolleys weren’t very full but the hand baskets were, it wouldn’t be so easy to estimate and the choice wouldn’t be so clear.

This simple example introduces the concept of service time distribution. This is a random variable that measures how long it will take a customer to be served. It contains information about the average (mean) service time and about the standard deviation from the mean, which represents how the service time fluctuates depending on how long different customers need.

The other important variable is how often customers join the queue (the arrival rate). This depends on the average amount of time that passes between two consecutive customers entering the shop. The more people that arrive to use a service at a specific time, the longer the queues will be.

Never mind the queue, I picked the wrong shop. Shutterstock

Depending on what these variables are, the shortest queue might be the best one to join – or it might not. For example, in a fish and chip shop you might have two servers both taking orders and accepting money. Then it is most often better to join the shortest queue since the time the servers’ tasks take doesn’t vary much.

Unfortunately, in practice, it’s hard to know exactly what the relevant variables are when you enter a shop. So you can still only guess what the fastest queue to join will be, or rely on tricks of human psychology, such as joining the leftmost queue because most right-handed people automatically turn right.

Did you get it right?

Once you’re in the queue, you’ll want to know whether you made the right choice. For example, is your server the fastest? It is easy to observe the actual queue length and you can try to compare it to the average. This is directly related to the mean and standard deviation of the service time via something called the Pollaczek-Khinchine formula, first established in 1930. This also uses the mean inter-arrival time between customers.
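For readers who want to see the formula in action, here is a minimal sketch of the Pollaczek-Khinchine result in its standard textbook form. It assumes a single server and customers arriving at random (Poisson arrivals), and it counts the customer currently being served as part of the queue. The function name and the example numbers are illustrative assumptions.

```python
def pk_mean_customers(mean_interarrival, mean_service, std_service):
    """Average number of customers in a single-server queue (waiting plus in service)."""
    arrival_rate = 1 / mean_interarrival
    rho = arrival_rate * mean_service  # fraction of time the server is busy
    if not 0 <= rho < 1:
        raise ValueError("the server must be busy less than 100% of the time")
    variability = (arrival_rate * std_service) ** 2
    return rho + (rho**2 + variability) / (2 * (1 - rho))


# Hypothetical shop: a customer arrives every 2 minutes on average,
# and service takes 1 minute on average.
print(pk_mean_customers(2.0, 1.0, std_service=0.0))  # 0.75, every service identical
print(pk_mean_customers(2.0, 1.0, std_service=1.0))  # 1.0, highly variable service
```

With the same average service time, more variable service produces a longer average queue, which is why the standard deviation matters as well as the mean.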

Unfortunately, if you try to measure the time the first person in the queue takes to get served, you’ll likely end up feeling like you chose the wrong line. This is known as Feller’s paradox or the inspection paradox. Technically, this isn’t an actual logical paradox but it does go against our intuition. If you start measuring the time between customers when you join a queue, it is more likely that the first customer you see will take longer than average to be served. This will make you feel like you were unlucky and chose the wrong queue.

The inspection paradox works like this: suppose a bank offers two services. One service takes either zero or five minutes, with equal probability. The other service takes either ten or 20 minutes, again with equal probability. It is equally likely for a customer to choose either service and so the bank’s average service time is 8.75 minutes.

If you join the queue when a customer is in the middle of being served then their service can’t take zero minutes. They must be using either the five, ten or 20 minute service. This pushes the time that customer will take to be served to more than 11 minutes on average, more than the true average of 8.75 minutes. In fact, two out of three times you encounter the same situation, the customer will want either the 10 or 20 minute service. This will make it seem like the line is moving more slowly than it should, all because a customer is already there and you have extra information.
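The bank example can be worked through exactly. The sketch below reproduces the 8.75 minute average and the "more than 11 minutes" figure obtained by ruling out the zero-minute service. As an additional assumption not made in the text, it also shows the length-biased average, which applies if you arrive at a random moment and are therefore more likely to land in the middle of a long service.

```python
from fractions import Fraction

# Each of the four service times is equally likely.
service_times = {0: Fraction(1, 4), 5: Fraction(1, 4),
                 10: Fraction(1, 4), 20: Fraction(1, 4)}


def mean(dist):
    """Expected value of a {time: probability} distribution."""
    return sum(t * p for t, p in dist.items())


def condition_nonzero(dist):
    """Rule out zero-length services and rescale the remaining probabilities."""
    total = sum(p for t, p in dist.items() if t > 0)
    return {t: p / total for t, p in dist.items() if t > 0}


def length_biased(dist):
    """Weight each service by its duration (assumes arrival at a random instant)."""
    m = mean(dist)
    return {t: t * p / m for t, p in dist.items() if t > 0}


print(mean(service_times))                     # 35/4, i.e. 8.75 minutes
print(mean(condition_nonzero(service_times)))  # 35/3, about 11.67 minutes
print(mean(length_biased(service_times)))      # 15
```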

So while you can use maths to try to determine the fastest queue, in the absence of accurate data – and for your own peace of mind – you’re often better just taking a gamble and not looking at the other options once you’ve made your mind up.

Credit of the article given to Enrico Scalas, Nicos Georgiou

Why do we need to know about prime numbers with millions of digits?

Posted on March 13, 2022 (updated April 22, 2025) by IMC

SeventyFour via Shutterstock

Prime numbers are more than just numbers that can only be divided by themselves and one. They are a mathematical mystery, the secrets of which mathematicians have been trying to uncover ever since Euclid proved that they have no end.

An ongoing project – the Great Internet Mersenne Prime Search – which aims to discover more and more primes of a particularly rare kind, has recently resulted in the discovery of the largest prime number known to date. Stretching to 23,249,425 digits, it is so large that it would easily fill 9,000 book pages. By comparison, the number of atoms in the entire observable universe is estimated to have no more than 100 digits.

The number, simply written as 2⁷⁷²³²⁹¹⁷-1 (two to the power of 77,232,917, minus one) was found by a volunteer who had dedicated 14 years of computing time to the endeavour.
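The digit count can be checked without ever writing the number out. Since a power of two never ends in zero, subtracting one from it cannot shorten it, so 2^p − 1 has the same number of digits as 2^p. A minimal sketch, with a function name chosen for this example:

```python
import math


def mersenne_digits(p):
    """Number of decimal digits in 2**p - 1, computed from logarithms."""
    return math.floor(p * math.log10(2)) + 1


print(mersenne_digits(77232917))  # 23249425
```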

You may be wondering, if the number stretches to more than 23m digits, why we need to know about it? Surely the most important numbers are the ones that we can use to quantify our world? That’s not the case. We need to know about the properties of different numbers so that we can not only keep developing the technology we rely on, but also keep it secure.

Secrecy with prime numbers

One of the most widely used applications of prime numbers in computing is the RSA encryption system. In 1978, Ron Rivest, Adi Shamir and Leonard Adleman combined some simple, known facts about numbers to create RSA. The system they developed allows for the secure transmission of information – such as credit card numbers – online.

The first ingredients required for the algorithm are two large prime numbers. The larger the numbers, the safer the encryption. The counting numbers one, two, three, four, and so on – also called the natural numbers – are, obviously, extremely useful here. But the prime numbers are the building blocks of all natural numbers and so even more important.

Take the number 70 for example. Division shows that it is the product of two and 35. Further, 35 is the product of five and seven. So 70 is the product of three smaller numbers: two, five, and seven. This is the end of the road for 70, since none of these can be further broken down. We have found the primal components that make up 70, giving its prime factorisation.
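The same breakdown can be automated by trial division, repeatedly dividing out the smallest factor that works. This is a simple sketch that is fine for small numbers like 70, and hopelessly slow for the numbers with hundreds of digits used in encryption.

```python
def prime_factors(n):
    """Return the prime factorisation of n (n >= 2) as a sorted list."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        # Divide out this factor as many times as it goes in.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


print(prime_factors(70))  # [2, 5, 7]
```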

Multiplying two numbers, even if very large, is perhaps tedious but a straightforward task. Finding prime factorisation, on the other hand, is extremely hard, and that is precisely what the RSA system takes advantage of.

Suppose that Alice and Bob wish to communicate secretly over the internet. They require an encryption system. If they first meet in person, they can devise a method for encryption and decryption that only they will know, but if the initial communication is online, they need to first openly communicate the encryption system itself – a risky business.

However, if Alice chooses two large prime numbers, computes their product, and communicates this openly, finding out what her original prime numbers were will be a very difficult task, as only she knows the factors.

So Alice communicates her product to Bob, keeping her factors secret. Bob uses the product to encrypt his message to Alice, which can only be decrypted using the factors that she knows. If Eve is eavesdropping, she cannot decipher Bob’s message unless she acquires Alice’s factors, which were never communicated. If Eve tries to break the product down into its prime factors – even using the fastest supercomputer – no known algorithm exists that can accomplish that before the sun explodes.
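The exchange between Alice and Bob can be sketched with toy numbers. The primes 61 and 53 and the message 65 are illustrative choices that are far too small to be secure, and the public exponent `e` is a detail of textbook RSA that the description above leaves out. Real systems also add padding and use primes hundreds of digits long. The three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
# Alice's secret primes (toy values, for illustration only).
p, q = 61, 53

n = p * q                # the product Alice publishes: 3233
phi = (p - 1) * (q - 1)  # computable only by someone who knows p and q
e = 17                   # public exponent, published alongside n
d = pow(e, -1, phi)      # Alice's private exponent, derived from the factors

message = 65                 # Bob's message, encoded as a number below n
cipher = pow(message, e, n)  # Bob encrypts using only the public values
plain = pow(cipher, d, n)    # Alice decrypts using her private exponent

print(n, d)    # 3233 2753
print(cipher)  # 2790
print(plain)   # 65
```

Eve sees `n`, `e` and `cipher`, but recovering `d` requires `phi`, and so the factors of `n`.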

The primal quest

Large prime numbers are used prominently in other cryptosystems too. The faster computers get, the larger the numbers they can crack. For modern applications, prime numbers measuring hundreds of digits suffice. These numbers are minuscule in comparison to the giant recently discovered. In fact, the new prime is so large that – at present – no conceivable technological advancement in computing speed could lead to a need to use it for cryptographic safety. It is even likely that the risks posed by the looming quantum computers wouldn’t need such monster numbers to be made safe.

It is neither safer cryptosystems nor improving computers that drove the latest Mersenne discovery, however. It is mathematicians’ need to uncover the jewels inside the chest labelled “prime numbers” that fuels the ongoing quest. This is a primal desire that starts with counting one, two, three, and drives us to the frontiers of research. The fact that online commerce has been revolutionised is almost an accident.

The celebrated British mathematician Godfrey Harold Hardy said: “Pure mathematics is on the whole distinctly more useful than applied. For what is useful above all is technique, and mathematical technique is taught mainly through pure mathematics”. Whether or not huge prime numbers, such as the 50th known Mersenne prime with its millions of digits, will ever be found useful is, at least to Hardy, an irrelevant question. The merit of knowing these numbers lies in quenching the human race’s intellectual thirst that started with Euclid’s proof of the infinitude of primes and still goes on today.

Credit of the article given to Ittay Weiss

The Unforgiving Math That Stops Epidemics

Posted on March 7, 2022 (updated April 22, 2025) by IMC

Credit: Peter Dazeley Getty Images

Not getting a flu shot could endanger more than just one’s own health, herd immunity calculations show

As the annual flu season approaches, medical professionals are again encouraging people to get flu shots. Perhaps you are among those who rationalize skipping the shot on the grounds that “I never get the flu” or “if I get sick, I get sick” or “I’m healthy, so I’ll get over it.” What you might not realize is that these vaccination campaigns for flu and other diseases are about much more than your health. They’re about achieving a collective resistance to disease that goes beyond individual well-being—and that is governed by mathematical principles unforgiving of unwise individual choices.

When talking about vaccination and disease control, health authorities often invoke “herd immunity.” This term refers to the level of immunity in a population that’s needed to prevent an outbreak from happening. Low levels of herd immunity are often associated with epidemics, such as the measles outbreak in 2014-2015 that was traced to exposures at Disneyland in California. A study investigating cases from that outbreak demonstrated that measles vaccination rates in the exposed population may have been as low as 50 percent. This number was far below the threshold needed for herd immunity to measles, and it put the population at risk of disease.

The necessary level of immunity in the population isn’t the same for every disease. For measles, a very high level of immunity needs to be maintained to prevent its transmission because the measles virus is possibly the most contagious known organism. If people infected with measles enter a population with no existing immunity to it, they will on average each infect 12 to 18 others. Each of those infections will in turn cause 12 to 18 more, and so on until the number of individuals who are susceptible to the virus but haven’t caught it yet is down to almost zero. The number of people infected by each contagious individual in a fully susceptible population is known as the “basic reproduction number” of a particular microbe (abbreviated R0), and it varies widely among germs. The calculated R0 of the West African Ebola outbreak was found to be around 2 in a 2014 publication, similar to the R0 computed for the 1918 influenza pandemic based on historical data.

If the Ebola virus’s R0 sounds surprisingly low to you, that’s probably because you have been misled by the often hysterical reporting about the disease. The reality is that the virus is highly infectious only in the late stages of the disease, when people are extremely ill with it. The ones most likely to be infected by an Ebola patient are caregivers, doctors, nurses and burial workers—because they are the ones most likely to be present when the patients are “hottest” and most likely to transmit the disease. The scenario of an infectious Ebola patient boarding an aircraft and passing on the disease to other passengers is extremely unlikely because an infectious patient would be too sick to fly. In fact, we know of cases of travelers who were incubating Ebola virus while flying, and they produced no secondary cases during those flights.

Note that the R0 isn’t related to how severe an infection is, but to how efficiently it spreads. Ebola killed about 40 percent of those infected in West Africa, while the 1918 influenza epidemic had a case-fatality rate of about 2.5 percent, yet both had an R0 of about 2. By comparison, polio and smallpox historically spread to about 5 to 7 people each, which puts them in the same range as modern-day HIV and pertussis (whooping cough, caused by the bacterium Bordetella pertussis).

Determining the R0 of a particular microbe is a matter of more than academic interest. If you know how many secondary cases to expect from each infected person, you can figure out the level of herd immunity needed in the population to keep the microbe from spreading. This is calculated by taking the reciprocal of R0 and subtracting it from 1. For measles, with an R0 of 12 to 18, you need somewhere between about 92 percent (1 – 1/12) and 94 percent (1 – 1/18) of the population to have effective immunity to keep the virus from spreading. For flu, it’s much lower: only around 50 percent. And yet we rarely attain even that level of immunity with vaccination.
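The reciprocal rule described above is easy to check numerically. What follows is a minimal Python sketch rather than an epidemiological model; the function name `herd_immunity_threshold` is invented for illustration, and the R0 values are the ones quoted in this article.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Fraction of the population that must be immune: 1 - 1/R0."""
    if r0 <= 1:
        # With R0 at or below 1 each case causes at most one new case,
        # so this simple rule asks for no additional immunity.
        return 0.0
    return 1.0 - 1.0 / r0


# R0 values quoted in the article (illustrative inputs, not new data)
examples = [("measles, low estimate", 12), ("measles, high estimate", 18), ("R0 of 2", 2)]

for name, r0 in examples:
    print(f"{name}: R0 = {r0}, threshold = {herd_immunity_threshold(r0):.1%}")
```

An R0 of 2 gives exactly one half, which matches the figure of around 50 percent quoted for flu.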

Once we understand the concept of R0, so much about patterns of infectious disease makes sense. It explains, for example, why there are childhood diseases—infections that people usually encounter when young, and against which they often acquire lifelong immunity after the infections resolve. These infections include measles, mumps, rubella and (prior to its eradication) smallpox—all of which periodically swept through urban populations in the centuries prior to vaccination, usually affecting children.

Do these viruses have some unusual affinity for children? Before vaccination, did they just go away after each outbreak and only return to cities at approximately five- to 10-year intervals? Not usually. After a large outbreak, viruses linger in the population, but the level of herd immunity is high because most susceptible individuals have been infected and (if they survived) developed immunity. Consequently, the viruses spread slowly: In practice, their R0 is just slightly above 1. This is known as the “effective reproduction number”—the rate at which the microbe is actually transmitted in a population that includes both susceptible and non-susceptible individuals (in other words, a population where some immunity already exists). Meanwhile, new susceptible children are born into the population. Within a few years, the population of young children who have never been exposed to the disease dilutes the herd immunity in the population to a level below what’s needed to keep outbreaks from occurring. The virus can then spread more rapidly, resulting in another epidemic.
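The link between existing immunity and the effective reproduction number can be sketched in a few lines. This assumes the simplest textbook relationship, in which the effective number equals R0 multiplied by the fraction of the population still susceptible; that uniform-mixing assumption is not spelled out in the article, and the R0 of 15 is just an illustrative midpoint of the 12 to 18 range quoted for measles.

```python
def effective_reproduction_number(r0: float, immune_fraction: float) -> float:
    """Effective reproduction number under the assumption R_eff = R0 * (1 - immune fraction)."""
    if not 0.0 <= immune_fraction <= 1.0:
        raise ValueError("immune_fraction must be between 0 and 1")
    return r0 * (1.0 - immune_fraction)


R0_MEASLES = 15  # illustrative midpoint of the 12-18 range, an assumption

for immune in (0.50, 0.90, 0.93, 0.95):
    r_eff = effective_reproduction_number(R0_MEASLES, immune)
    trend = "outbreak can grow" if r_eff > 1 else "outbreak shrinks"
    print(f"immune = {immune:.0%}: effective R = {r_eff:.2f} ({trend})")
```

In this sketch, as newborn children lower the immune fraction, the effective number climbs back above 1 and another epidemic becomes possible.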

An understanding of the basic reproduction number also explains why diseases spread so rapidly in new populations: Because those hosts lack any immunity to the infection, the microbe can achieve its maximum R0. This is why diseases from invading Europeans spread so rapidly and widely among indigenous populations in the Americas and Hawaii during their first encounters. Having never been exposed to these microbes before, the non-European populations had no immunity to slow their spread.

If we further understand what constellation of factors contributes to an infection’s R0, we can begin to develop interventions to interrupt the transmission. One aspect of the R0 is the average number and frequency of contacts that an infected individual has with others susceptible to the infection. Outbreaks happen more frequently in large urban areas because individuals living in crowded cities have more opportunities to spread the infection: They are simply in contact with more people and have a higher likelihood of encountering someone who lacks immunity. To break this chain of transmission during an epidemic, health authorities can use interventions such as isolation (keeping infected individuals away from others) or even quarantine (keeping individuals who have been exposed to infectious individuals—but are not yet sick themselves—away from others).

Other factors that can affect the R0 involve both the host and the microbe. When an infected person has contact with someone who is susceptible, what is the likelihood that the microbe will be transmitted? Frequently, hosts can reduce the probability of transmission through their behaviors: by covering coughs or sneezes for diseases transmitted through the air, by washing their contaminated hands frequently, and by using condoms to contain the spread of sexually transmitted diseases.

These behavioral changes are important, but we know they’re far from perfect and not particularly efficient in the overall scheme of things. Take hand-washing, for example. We’ve known of its importance in preventing the spread of disease for 150 years. Yet studies have shown that hand-washing compliance even by health care professionals is astoundingly low — less than half of doctors and nurses wash their hands when they’re supposed to while caring for patients. It’s exceedingly difficult to get people to change their behavior, which is why public health campaigns built around convincing people to behave differently can sometimes be less effective than vaccination campaigns.

How long a person can actively spread the infection is another factor in the R0. Most infections can be transmitted for only a few days or weeks. Adults with influenza can spread the virus for about a week, for example. Some microbes can linger in the body and be transmitted for months or years. HIV is most infectious in the early stages when concentrations of the virus in the blood are very high, but even after those levels subside, the virus can be transmitted to new partners for many years. Interventions such as drug treatments can decrease the transmissibility of some of these organisms.

The microbes’ properties are also important. While hosts can purposely protect themselves, microbes don’t choose their traits. But over time, evolution can shape them in a manner that increases their chances of transmission, such as by enabling measles to linger longer in the air and allowing smallpox to survive longer in the environment.

By bringing together all these variables (size and dynamics of the host population, levels of immunity in the population, presence of interventions, microbial properties, and more), we can map and predict the spread of infections in a population using mathematical models. Sometimes these models can overestimate the spread of infection, as was the case with the models for the Ebola outbreak in 2014. One model predicted up to 1.4 million cases of Ebola by January 2015; in reality, the outbreak ended in 2016 with only 28,616 cases. On the other hand, models used to predict the transmission of cholera during an outbreak in Yemen have been more accurate.

The difference between the two? By the time the Ebola model was published, interventions to help control the outbreak were already under way. Campaigns had begun to raise awareness of how the virus was transmitted, and international aid had arrived, bringing in money, personnel and supplies to contain the epidemic. These interventions decreased the Ebola virus R0 primarily by isolating the infected and instituting safe burial practices, which reduced the number of susceptible contacts each case had. Shipments of gowns, gloves and soap that health care workers could use to protect themselves while treating patients reduced the chance that the virus would be transmitted. Eventually, those changes meant that the effective R0 fell below 1—and the epidemic ended. (Unfortunately, comparable levels of aid and interventions to stop cholera in Yemen have not been forthcoming.)

Catch-up vaccinations and the use of isolation and quarantine also likely helped to end the Disneyland measles epidemic, as well as a slightly earlier measles epidemic in Ohio. Knowing the factors that contribute to these outbreaks can aid us in stopping epidemics in their early stages. But to prevent them from happening in the first place, a population with a high level of immunity is, mathematically, our best bet for keeping disease at bay.


Credit of the article given to Tara C. Smith & Quanta Magazine

How to avoid a sucker bet – with a little help from maths

Posted on February 25, 2022 (updated April 22, 2025) by IMC

Sitting in a bar, you start chatting to a man who issues you a challenge. He hands you five red and two black cards. After shuffling, you lay them on the bar, face down. He bets you that you cannot turn over three red cards. And to help you, he explains the odds.

When you draw the first card, the odds are 5-2 (five red cards, two black cards) in favour of picking a red card. The second draw is 4-2 (or 2-1) and the third draw is 3-2. Each time you draw a card the odds appear to be in your favour, in that you have more chance of drawing a red card than a black card. So, do you accept the bet?

If you answered yes, perhaps it’s time for you to go over your maths. It’s a foolish bet. The odds given above are only for a perfect draw. The real odds of you being able to carry out this feat are actually 5-2 against you. That is, for every seven times you play, you’ll lose five times.
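The true odds can be confirmed with exact fractions. This is a small sketch, with the helper name `prob_all_red` chosen here for illustration; it computes the chance of the whole feat both by counting combinations and by multiplying the three step-by-step chances.

```python
from fractions import Fraction
from math import comb


def prob_all_red(red: int, black: int, draws: int) -> Fraction:
    """Probability that `draws` cards taken without replacement are all red."""
    return Fraction(comb(red, draws), comb(red + black, draws))


# Step by step: 5 reds out of 7, then 4 out of 6, then 3 out of 5.
step_by_step = Fraction(5, 7) * Fraction(4, 6) * Fraction(3, 5)

p_win = prob_all_red(5, 2, 3)
assert p_win == step_by_step

print("P(win)  =", p_win)        # 2/7
print("P(lose) =", 1 - p_win)    # 5/7, i.e. odds of 5-2 against
```

Each individual draw favours red, but the three chances multiply, and the product is well below one half.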

Odds against you

This type of bet is often called a proposition bet, which is defined as a wager on something that seems like a good idea, but for which the odds are actually against you, often very much against you, perhaps even making it impossible for you to win.

Let’s assume that you took the bet and, almost inevitably, lost money. But this is just for fun, right? So your new “friend” suggests a way that you can get your money back. He takes two more red cards and hands them to you, so you now have seven red cards and two black cards. You shuffle the nine cards and lay them out, face down, in a three by three grid. He bets you even money that you can’t pick out a straight line (vertical, horizontal or diagonal) that has only red cards.

Nine Card Hustle. Graham Kendall created image

Intuitively, this might sound like a better bet, and the odds are indeed evens if the two black cards are next to each other in a corner (see image). In total there are eight lines to choose from: four contain only red cards and four contain a black card. But few layouts are that kind to you.

If one black card is in the centre and the other is in a corner, only two of the eight lines are all red, so the odds are 6-2 (or 3-1) against you winning. Half of all possible layouts give you three winning lines and five losing lines. Counted over all 36 positions the two black cards can occupy, a line chosen blind is all red only five times in 12, which makes the bet 7-5 against you. Hardly an even-chance bet.
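Figures for a bet like this are easy to get wrong by hand, so an exhaustive count is a useful check. The sketch below assumes the two black cards are equally likely to land in any of the 36 possible pairs of positions and that the player picks one of the eight lines without any information; the cell numbering (0 to 8, row by row) and the helper name `winning_lines` are choices made here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Cells are numbered 0..8, row by row, in the three-by-three grid.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]


def winning_lines(black_cells) -> int:
    """Number of lines that contain no black card for a given layout."""
    black = set(black_cells)
    return sum(1 for line in LINES if not black & set(line))


layouts = list(combinations(range(9), 2))          # every position of the two black cards
per_layout = {layout: winning_lines(layout) for layout in layouts}

total_wins = sum(per_layout.values())
p_win = Fraction(total_wins, len(layouts) * len(LINES))

print("layouts:", len(layouts))
print("layouts by number of winning lines:", dict(Counter(per_layout.values())))
print("P(win) =", p_win)
```

Under these assumptions the count gives a winning chance of 5/12, so the even-money offer favours the hustler.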

Have another go

Try to evaluate the odds for this proposition bet.

You shuffle a pack of cards and cut it into three piles. You are offered even money that one of the cards on top of the piles will be a picture card (a jack, queen or king). That is, if a picture card shows up, you lose. Do you think this is a good bet?

One way of reasoning is that there are only 12 losing cards against 40 winning cards, so the odds look better than evens. But this is the wrong way of looking at it. It is really what’s known as a combinatorics problem. We should also realise that we are just choosing three cards at random.

There are 22,100 ways of choosing three cards from a 52 card deck. Of these, 12,220 will contain at least one picture card – so you lose – meaning that 9,880 will not contain a picture card – when you win. If you translate this to odds, you will lose roughly five times out of every nine times you play (about 5-4 against you). The even chance bet you have been offered is not the good value that you thought it was and you will lose money if you play a few times.
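The counts above can be reproduced with the standard binomial coefficient. This is a short sketch that treats the three top cards as a random selection of three cards from the deck, as the article does.

```python
from fractions import Fraction
from math import comb

total = comb(52, 3)             # all ways to choose three cards
no_picture = comb(40, 3)        # all three cards come from the 40 non-picture cards
with_picture = total - no_picture

p_lose = Fraction(with_picture, total)

print("total selections:     ", total)
print("at least one picture: ", with_picture)
print("no picture card:      ", no_picture)
print("P(lose) =", p_lose, "which is about", round(float(p_lose), 3))
```

The exact losing chance is a little over 55 percent, close to the five-in-nine figure.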

A Final Example

We can all agree that you have a 50/50 chance of guessing heads or tails in a coin toss. But if you toss the coin ten times, would you expect to see five heads and five tails? If you were offered odds of 2-1 to try this, would you take the bet? You’d be a sucker if you did.

Five heads and five tails will occur more often than any other combination, but there are many other ways that ten flips of a coin can land. Of the 1,024 equally likely sequences, only 252 contain exactly five heads, so the bet is roughly 3-1 against you.
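The coin-toss bet can be checked the same way. The sketch below counts the sequences with exactly five heads and, assuming a one-unit stake that pays two units on a win, works out the average result per bet; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

flips = 10
ways_five_heads = comb(flips, 5)            # sequences with exactly five heads
all_sequences = 2 ** flips                  # every equally likely sequence

p_win = Fraction(ways_five_heads, all_sequences)
odds_against = (1 - p_win) / p_win          # losing chances per winning chance

# Offered odds of 2-1: win 2 units with probability p_win, lose 1 unit otherwise.
expected_value = 2 * p_win - (1 - p_win)

print("P(exactly five heads) =", p_win)
print("odds against, approximately", round(float(odds_against), 2), "to 1")
print("average result per unit staked:", round(float(expected_value), 3))
```

A negative average result means the 2-1 payout does not compensate for how rarely the exact five-five split occurs.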

Another name for a proposition bet is the “sucker” bet, and there is no surprise who the sucker is. But don’t feel too bad. We are all generally very poor at evaluating true odds. A famous example is the Monty Hall Problem. Even mathematicians could not agree on the right answer to this seemingly simple problem.

We have focused on bets where it is difficult, especially when under the pressure of deciding whether to bet or not, to calculate the true odds. But there are many other proposition bets that do not rely on calculating odds. And there are many other sucker bets, with probably the most famous being Three-card Monte.

If faced with this type of bet, what is the best thing you can do? I’d suggest you simply walk away.

Credit of the article given to Graham Kendall

The Mathematics of Cake Cutting

Posted on February 6, 2022 (updated April 22, 2025) by IMC

Credit: OLENA SHMAHALO Quanta Magazine

Computer scientists have come up with an algorithm that can fairly divide a cake among any number of people

Two young computer scientists have figured out how to fairly divide cake among any number of people, setting to rest a problem mathematicians have struggled with for decades. Their work has startled many researchers who believed that such a fair-division protocol was probably impossible.

Cake-cutting is a metaphor for a wide range of real-world problems that involve dividing some continuous object, whether it’s cake or, say, a tract of land, among people who value its features differently—one person yearning for chocolate frosting, for example, while another has his eye on the buttercream flowers. People have known at least since biblical times that there’s a way to divide such an object between two people so that neither person envies the other: one person cuts the cake into two slices that she values equally, and the other person gets to choose her favorite slice. In the book of Genesis, Abraham (then known as Abram) and Lot used this “I cut, you choose” procedure to divide land, with Abraham deciding where to divide and Lot choosing between Jordan and Canaan.
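The two-person “I cut, you choose” procedure can be sketched in code. This is a toy model under stated assumptions, not anything taken from the research described here: the cake is the interval from 0 to 1, and each person's taste is a hypothetical piecewise-constant density over equal-width segments. All function names are invented for illustration.

```python
from fractions import Fraction


def value(density, a, b):
    """Value of the piece [a, b] under a piecewise-constant density on [0, 1]."""
    k = len(density)
    total = Fraction(0)
    for i, d in enumerate(density):
        lo, hi = Fraction(i, k), Fraction(i + 1, k)
        overlap = min(b, hi) - max(a, lo)
        if overlap > 0:
            total += d * overlap
    return total


def halving_cut(density):
    """Point x at which the cutter values [0, x] and [x, 1] equally."""
    k = len(density)
    half = value(density, Fraction(0), Fraction(1)) / 2
    acc = Fraction(0)
    for i, d in enumerate(density):
        segment_value = Fraction(d, k)
        if d > 0 and acc + segment_value >= half:
            return Fraction(i, k) + (half - acc) / d
        acc += segment_value
    return Fraction(1)


def cut_and_choose(cutter, chooser):
    """The cutter halves the cake by their own measure; the chooser takes the piece they prefer."""
    x = halving_cut(cutter)
    left, right = (Fraction(0), x), (x, Fraction(1))
    if value(chooser, *left) >= value(chooser, *right):
        return {"cut": x, "chooser": left, "cutter": right}
    return {"cut": x, "chooser": right, "cutter": left}


cutter_density = [3, 1, 0, 0]     # hypothetical: cares mostly about the first quarter
chooser_density = [1, 1, 1, 1]    # hypothetical: values every part equally

result = cut_and_choose(cutter_density, chooser_density)
print("cut at", result["cut"])
print("cutter gets", result["cutter"], "chooser gets", result["chooser"])
```

Neither person envies the other in this sketch: the cutter considers the two pieces equal, and the chooser picked the piece they value at least as much as the other.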

Around 1960, mathematicians devised an algorithm that can produce a similarly “envy-free” cake division for three players. But until now, the best they had come up with for more than three players was a procedure created in 1995 by political scientist Steven Brams of New York University and mathematician Alan Taylor of Union College in Schenectady, New York. That procedure is guaranteed to produce an envy-free division, but it is “unbounded,” meaning that it might need to run for a million steps, or a billion, or any large number, depending on the players’ cake preferences.

Brams and Taylor’s algorithm was hailed as a breakthrough at the time, but “the fact that it wasn’t bounded I think was a huge shortcoming,” said Ariel Procaccia, a computer scientist at Carnegie Mellon University and one of the creators of Spliddit, a free online tool that provides fair-division algorithms for tasks like dividing chores or rent among roommates.

Over the past 50 years, many mathematicians and computer scientists, including Procaccia, had convinced themselves that there was probably no bounded, envy-free algorithm for dividing cake among n players.

“This is the very problem that got me into the subject of fair division,” said Walter Stromquist, a mathematics professor at Bryn Mawr College in Pennsylvania who proved some of the seminal results on cake cutting in 1980. “I have thought all my life that I would come back to it when I had time and prove that this particular extension of the result was impossible.”

But in April, two computer scientists defied expectations by posting a paper online describing an envy-free cake-cutting algorithm whose running time depends only on the number of players, not on their individual preferences. One of the pair—27-year-old Simon Mackenzie, a postdoctoral researcher at Carnegie Mellon—will present the pair’s findings on Oct. 10 at the 57th annual IEEE Symposium on Foundations of Computer Science, one of computer science’s pre-eminent conferences.

The algorithm is extraordinarily complex: Dividing a cake among n players can require as many as n^n^n^n^n^n steps and a roughly equivalent number of cuts. Even for just a handful of players, this number is greater than the number of atoms in the universe. But the researchers already have ideas for making the algorithm much simpler and faster, said the other half of the team, Haris Aziz, a 35-year-old computer scientist at the University of New South Wales and Data61, a data research group in Australia.

For the people who study the theory of fair division, this is “definitely the biggest result in decades,” Procaccia said.

Pieces of Cake

Aziz and Mackenzie’s new algorithm builds on an elegant procedure that mathematicians John Selfridge and John Conway independently came up with around 1960 for dividing a cake among three people.

If Alice, Bob and Charlie want to share a cake, the algorithm starts by having Charlie cut the cake into three slices that are equally valuable from his perspective. Alice and Bob are each asked to point to their favorite slices, and if they like different slices, we’re done—they each take their favorite, Charlie takes the remaining slice, and everyone goes home happy.

If Alice and Bob have the same favorite, then Bob is asked to trim a little cake off that slice so that what remains is equal in value to his second-favorite slice; the trimmed bit is set aside for later. Now Alice gets to choose her favorite piece from among the three slices, and then Bob gets to choose, with the requirement that if Alice didn’t choose the trimmed slice, he must take it. Charlie gets the third slice.

At this stage, none of the players envy each other. Alice is happy since she got to choose first; Bob is happy since he got one of his two equally preferred top choices; and Charlie is happy because he got one of his three original pieces, all of which are equal in his eyes.

But there’s still the trimmed bit to be divided. What makes it possible to divide this bit without creating still more trimmings, and getting into an infinite cycle of trimming and choosing, is the fact that Charlie is more than merely satisfied with the cake he has gotten so far; he would not feel cheated even if the player with the trimmed slice gets all the cake that’s waiting to be allocated, since the trimmed slice plus the trimming equals one of the original slices. Aziz and Mackenzie describe this relationship by saying that Charlie “dominates” the player who got the trimmed slice.

Now if, for example, Alice was the one who got the trimmed slice, the algorithm proceeds as follows: Bob cuts the trimmings into three pieces that he values equally, and then first Alice gets to choose a piece, then Charlie, then Bob. Everyone is happy: Alice because she got to choose first, Charlie because he gets a slice he likes better than Bob’s (and he didn’t care how much Alice took), and Bob because the three slices are equal in his view.

Brams and Taylor used the notion of domination (without calling it that) in designing their 1995 algorithm, but they couldn’t push the idea far enough to get a bounded algorithm. For the next 20 years, neither could anyone else. “I don’t think it’s for lack of trying,” Procaccia said.

Rookie Cake-Cutters

When Aziz and Mackenzie decided to tackle the problem a couple of years ago, they were comparative newcomers to the cake-cutting problem. “We did not have as much background as people who have been intensely working on it would have,” Aziz said. “Although that is mostly a disadvantage, in this case it was somewhat of an advantage, because we were not thinking in the same way.”

Aziz and Mackenzie wet their feet by studying the three-player problem from scratch, and their analysis eventually led them to find a bounded envy-free algorithm for the four-player case, which they posted online last year.

They couldn’t immediately see how to extend their algorithm to more than four players, but they dived feverishly into the problem. “After we submitted our paper for the four-agent case, we were really keen that we should try it before someone much more experienced, much more clever would generalize it to the n-agent case,” Aziz said. After about a year, their efforts succeeded.

As with the Selfridge-Conway algorithm, Aziz and Mackenzie’s complicated protocol repeatedly asks individual players to cut cake into n equal pieces, then asks other players to make trims and choose pieces of cake. But the algorithm also carries out other steps, such as periodically exchanging portions of players’ cake stashes in a carefully controlled way, with an eye toward increasing the number of domination relationships between players.

These domination relationships allow Aziz and Mackenzie to reduce the complexity of the problem: If, for example, three players dominate all the others, those three can be sent away with their slices of cake—they’ll be happy no matter who gets the remaining trimmings. Now there are fewer players to worry about, and after a bounded number of such steps, everyone has been satisfied and all the cake given out.

“Seeing, in retrospect, how complicated the algorithm is, it’s not surprising that it took a long time before somebody found one,” Procaccia said. But Aziz and Mackenzie already think that they can simplify their algorithm considerably, to one that doesn’t need the cake exchanges and takes fewer than n^n^n steps. They are currently writing up these new results, Aziz said.

Even a simpler such algorithm would be unlikely to have practical implications, Brams cautioned, since the cake portions that players receive would typically include many tiny crumbs from different parts of the cake—not a feasible approach if you’re dividing something like a tract of land.

But for mathematicians and computer scientists who study cake cutting, the new result “resets the subject,” Stromquist said.

Now that researchers know it’s possible to fairly divide cake in a bounded number of steps, the next goal, Procaccia said, is to understand the huge gulf between Aziz and Mackenzie’s upper bound and the existing lower bound on the number of cuts needed to divide a cake.

Procaccia had previously proved that an envy-free cake-cutting algorithm will require at least about n² steps—but that bound is minuscule compared to n^n^n^n^n^n or even n^n^n.

Researchers now have to figure out how to close this gap, Aziz said. “I think there can be progress in both directions.”


Credit of the article given to Erica Klarreich & Quanta Magazine

A newly discovered prime number makes its debut

Posted on January 16, 2022 (updated April 22, 2025) by IMC

The distribution of prime numbers from 1 to 76,800, from left to right and top to bottom. A black pixel means that the number is prime, while a white pixel means that it is not.

On December 26, 2017, J. Pace, G. Woltman, S. Kurowski, A. Blosser, and their co-authors announced the discovery of a new prime number: 2⁷⁷²³²⁹¹⁷-1. It’s an excellent opportunity to take a small tour through the wonderful world of prime numbers to see how this result was achieved and why it is so interesting.

A prime number is a whole number greater than 1 that is divisible only by itself and the number 1; in other words, it has no other divisors. Some speak of prime numbers as the atoms of the mathematical universe, others as precious stones.

US stamp featuring the prime number 2¹¹²¹³-1. Author provided, CC BY

It is to Euclid that we owe the first two fundamental results about prime numbers:

Any whole number greater than 1 can be written as a product of prime factors in exactly one way, apart from the order of the factors.
They are infinite in number. The demonstration of this result is regarded as the first proof by contradiction: Suppose there is only a finite number of prime numbers, so they are all smaller than an integer n. Any integer greater than n would therefore be divisible by a prime number less than n. However, the number (2 * 3 * … * n) + 1 is not divisible by any integer from 2 to n since the remainder of the division is always 1 – a contradiction of the preceding sentence.

Eratosthenes, who lived from 276 to 194 BC, proposed a process that allows us to find all prime numbers less than a given natural number N. The process starts from a table of the integers from 2 to N and eliminates every number that is a multiple of a smaller number in the table. Once all the multiples have been deleted, the integers that remain are not multiples of any smaller integer, and so are prime numbers. The search for efficient primality tests is still an active research topic; the Lucas-Lehmer test, for example, is used to check Mersenne numbers.
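The elimination process attributed to Eratosthenes translates directly into code. The following is a standard textbook sketch; the function name and the choice to include N itself in the table are decisions made here for illustration.

```python
def sieve_of_eratosthenes(n: int) -> list[int]:
    """Return all primes up to and including n by crossing out multiples."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(sieve_of_eratosthenes(30))
print("number of primes up to 257:", len(sieve_of_eratosthenes(257)))
```

Starting the crossing-out at p * p is a common shortcut and does not change which numbers survive.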

After the Greek era, there was a long dark period that lasted until the end of the 16th century and the arrival of French theologian and mathematician Marin Mersenne (1588-1648). He was an advocate of Catholic orthodoxy, yet also believed that religion must welcome any updated truth. He was a Cartesian and translator of Galileo.

Mersenne was looking for a formula that would generate all the prime numbers. In particular, he studied the numbers M_p = 2^p-1, where p is prime. These numbers are now called Mersenne numbers, or Mersenne primes when they are themselves prime. In 1644 he wrote that M_p is prime for p = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257, and composite (in other words, non-prime) for the other 44 prime values of p up to 257. This claim actually contains five errors: M61, M89 and M107 are prime, while M67 and M257 are not.
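Mersenne's 1644 list can be re-examined with the Lucas-Lehmer test, the standard primality test for numbers of the form 2^p - 1. The sketch below is illustrative only: the helper names are invented here, and the trial-division routine is used just to pick out the prime exponents up to 257.

```python
def is_small_prime(n: int) -> bool:
    """Trial division; adequate for the small exponents used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def lucas_lehmer(p: int) -> bool:
    """True if 2**p - 1 is prime. The exponent p must itself be prime."""
    if p == 2:
        return True                     # 2**2 - 1 = 3, handled separately
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


prime_exponents = [p for p in range(2, 258) if is_small_prime(p)]
actual = [p for p in prime_exponents if lucas_lehmer(p)]

mersenne_1644 = [2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257]
missed = sorted(set(actual) - set(mersenne_1644))
wrongly_listed = sorted(set(mersenne_1644) - set(actual))

print("exponents giving Mersenne primes:", actual)
print("missed by Mersenne:", missed)
print("wrongly listed by Mersenne:", wrongly_listed)
```

Python's arbitrary-precision integers make the test practical for exponents of this size; the record-setting exponents need specialised software.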

The new prime number discovered at the very end of 2017 corresponds to M77232917. It has 23,249,425 digits – almost a million digits more than the previous record-holding prime. If the number were written out in a document in the font Times New Roman with a point size of 10 and standard page margins, it would fill 3,845 pages.
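The digit count can be obtained without writing the number out. The sketch below relies on the fact that 2^p - 1 has the same number of decimal digits as 2^p, because a power of two is never a power of ten; the function name is illustrative, and floating-point logarithms are assumed to be accurate enough for exponents of this size.

```python
import math


def mersenne_digit_count(p: int) -> int:
    """Number of decimal digits of 2**p - 1, computed from a logarithm."""
    return math.floor(p * math.log10(2)) + 1


new_record = mersenne_digit_count(77_232_917)
previous_record = mersenne_digit_count(74_207_281)   # exponent of the previous record holder

print("digits in M77232917:", new_record)
print("digits gained over the previous record:", new_record - previous_record)
```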

The official date of discovery of a prime number is the day that a person takes note of the result. This is in keeping with tradition: M4253 is reputed never to have been the largest known prime because in 1961 the American mathematician Alexander Hurwitz read a printer output from the end forward, and found M4423 a few seconds before seeing M4253. The previous record-holding Mersenne prime also had a complicated history: the computer reported the result to the server on September 17, 2015, but a bug blocked the email. The prime number remained unnoticed until January 7, 2016.

Quantum cryptography

We often refer to the use of prime numbers in cryptography, but primes of this size are too big to be really useful there. (There is hope that quantum cryptography will change things.) Historically, the search for Mersenne primes has been used as a test for computer hardware. In 2016, the Prime95 community discovered a flaw in Intel’s Skylake CPU, which is used in many PCs. The new prime number was found as part of the Great Internet Mersenne Prime Search (GIMPS) project.

2⁷⁷²³²⁹¹⁷-1 is the 50th Mersenne prime and if the challenge to discover the 51st tempts you, the verification program is available to all – and there’s even a $3,000 prize.

Credit of the article given to Avner Bar-Hen

The Mathematics of (Hacking) Passwords

Posted on January 8, 2022 (updated April 22, 2025) by IMC

Credit: Gaetan Charbonneau Getty Images

The science and art of password setting and cracking continues to evolve, as does the war between password users and abusers

At one time or another, we have all been frustrated by trying to set a password, only to have it rejected as too weak. We are also told to change our choices regularly. Obviously such measures add safety, but how exactly?

I will explain the mathematical rationale for some standard advice, including clarifying why six characters are not enough for a good password and why you should never use only lowercase letters. I will also explain how hackers can uncover passwords even when stolen data sets lack them.

Choose#W!sely@*

Here is the logic behind setting hack-resistant passwords. When you are asked to create a password of a certain length and combination of elements, your choice will fit into the realm of all unique options that conform to that rule—into the “space” of possibilities. For example, if you were told to use six lowercase letters—such as, afzjxd, auntie, secret, wwwwww—the space would contain 26⁶, or 308,915,776, possibilities. In other words, there are 26 possible choices for the first letter, 26 possible choices for the second, and so forth. These choices are independent: you do not have to use different letters, so the size of the password space is the product of the possibilities, or 26 x 26 x 26 x 26 x 26 x 26 = 26⁶.

If you are told to select a 12-character password that can include uppercase and lowercase letters, the 10 digits and 10 symbols (say, !, @, #, $, %, ^, &, ?, / and +), you would have 72 possibilities for each of the 12 characters of the password. The size of the possibility space would then be 72¹² (19,408,409,961,765,342,806,016, or close to 19 x 10²¹).

That is more than 62 trillion times the size of the first space. A computer running through all the possibilities for your 12-character password one by one would take 62 trillion times longer. If your computer spent a second visiting the six-character space, it would have to devote two million years to examining each of the passwords in the 12-character space. The multitude of possibilities makes it impractical for a hacker to carry out a plan of attack that might have been feasible for the six-character space.
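The comparison between the two spaces can be reproduced in a few lines. This is a sketch of the arithmetic only; the one-second baseline for the six-letter space comes from the text, and the year length of 365.25 days is an assumption made here.

```python
small_space = 26 ** 6        # six lowercase letters
large_space = 72 ** 12       # twelve characters from a 72-symbol alphabet

ratio = large_space / small_space

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed year length

# If the small space takes one second, the large one takes `ratio` seconds.
years_for_large_space = ratio / SECONDS_PER_YEAR

print(f"six-letter space:    {small_space:,}")
print(f"12-character space:  {large_space:,}")
print(f"ratio:               {ratio:.3e}")
print(f"years at that speed: {years_for_large_space:,.0f}")
```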

Calculating the size of these spaces by computer usually involves counting the number of binary digits in the number of possibilities, N. That count is given by the formula 1 + integer(log₂(N)). In the formula, the value of log₂(N) is a real number with many decimal places, such as log₂(26⁶) = 28.202638…. The “integer” in the formula indicates that the decimal portion of that log value is omitted, rounding down to a whole number, as in integer(28.202638…) = 28. For the example of six lowercase letters above, the computation results in 29 bits; for the more complex, 12-character example, it is 75 bits. (Mathematicians refer to the possibility spaces as having entropy of 29 and 75 bits, respectively.) The French National Cybersecurity Agency (ANSSI) recommends spaces having a minimum of 100 bits when it comes to passwords or secret keys for encryption systems that absolutely must be secure. Encryption involves representing data in a way that ensures it cannot be retrieved unless a recipient has a secret code-breaking key. In fact, the agency recommends a possibility space of 128 bits to guarantee security for several years. It considers spaces of fewer than 64 bits to be very small (very weak); 64 to 80 bits to be small; and 80 to 100 bits to be medium (moderately strong).
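The bit count described above can be computed exactly with integer arithmetic. In this sketch the function name `entropy_bits` is invented for illustration; it uses Python's built-in `int.bit_length`, which for a positive integer N equals 1 + integer(log₂(N)), and compares it with the logarithm form of the formula.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> int:
    """Number of binary digits needed to write the size of the password space."""
    return (alphabet_size ** length).bit_length()


def entropy_bits_via_log(alphabet_size: int, length: int) -> int:
    """Same quantity from the formula 1 + integer(log2(N)), using floating point."""
    return 1 + math.floor(math.log2(alphabet_size ** length))


for alphabet_size, length in [(26, 6), (72, 12)]:
    exact = entropy_bits(alphabet_size, length)
    approx = entropy_bits_via_log(alphabet_size, length)
    print(f"{length} characters from {alphabet_size} symbols: {exact} bits (log formula: {approx})")
```

The integer version is the safer choice in practice, since floating-point logarithms can round the wrong way when N is an exact power of two or extremely large.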

Moore’s law (which says that the computer-processing power available at a certain price doubles roughly every two years) explains why a relatively weak password will not suffice for long-term use: over time computers using brute force can find passwords faster. Although the pace of Moore’s law appears to be decreasing, it is wise to take it into account for passwords that you hope will remain secure for a long time.

For a truly strong password as defined by ANSSI, you would need, say, a sequence of 16 characters, each taken from a set of 200 characters. This would make a 123-bit space, which would render the password close to impossible to memorize. Therefore, system designers are generally less demanding and accept low- or medium-strength passwords. They insist on long ones only when the passwords are automatically generated by the system, and users do not have to remember them.
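For passwords that the system generates and the user never has to memorize, a sketch along the following lines is one possibility. It assumes the 72-character alphabet from the earlier example (letters, digits and the 10 listed symbols) rather than a 200-character set, and a length of 20, which puts the space above 120 bits. The function name and defaults are illustrative. Python's standard `secrets` module is used because it draws from a cryptographically secure random source.

```python
import secrets
import string

# Assumed alphabet: 52 letters, 10 digits and the 10 symbols listed earlier.
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&?/+"

def generate_password(length: int = 20, alphabet: str = ALPHABET) -> str:
    """Pick each character independently and uniformly at random."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

password = generate_password()
space_bits = (len(ALPHABET) ** len(password)).bit_length()
print(password, f"({space_bits}-bit space)")
```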

There are other ways to guard against password cracking. The simplest is well known and used by credit cards: after three unsuccessful attempts, access is blocked. Alternative ideas have also been suggested, such as doubling the waiting time after each successive failed attempt but allowing the system to reset after a long period, such as 24 hours. These methods, however, are ineffective when an attacker is able to access the system without being detected or if the system cannot be configured to interrupt and disable failed attempts.
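The doubling scheme with a 24-hour reset could be sketched as follows. This is a minimal illustration, not a description of any real system: the class name, the one-second starting delay and the cap on the delay are all assumptions, and timestamps are passed in explicitly so the logic is easy to follow.

```python
from dataclasses import dataclass

RESET_AFTER = 24 * 3600   # seconds; the 24-hour reset mentioned in the text
BASE_DELAY = 1.0          # seconds; assumed delay after the first failure

@dataclass
class LoginThrottle:
    failures: int = 0
    last_failure: float = 0.0

    def wait_required(self, now: float) -> float:
        """Seconds the user must still wait before the next attempt."""
        if self.failures == 0 or now - self.last_failure >= RESET_AFTER:
            return 0.0
        # The delay doubles with each failure; it is capped at the reset period.
        delay = min(RESET_AFTER, BASE_DELAY * 2 ** min(self.failures - 1, 32))
        return max(0.0, self.last_failure + delay - now)

    def record_failure(self, now: float) -> None:
        if self.failures and now - self.last_failure >= RESET_AFTER:
            self.failures = 0   # long quiet period: start counting afresh
        self.failures += 1
        self.last_failure = now

    def record_success(self) -> None:
        self.failures = 0

throttle = LoginThrottle()
for attempt_time in (0.0, 1.0, 3.0):
    throttle.record_failure(attempt_time)
    print(throttle.wait_required(attempt_time))   # 1.0, then 2.0, then 4.0
```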

How Long Does It Take to Search All Possible Passwords?

For a password to be difficult to crack, it should be chosen randomly from a large set, or “space,” of possibilities. The size, T, of the possibility space is based on the length, A, of the list of valid characters in the password and the number of characters, N, in the password.

The size of this space (T = A^N) may vary considerably.

Each of the following examples specifies values of A, N, T and the number of hours, D, that hackers would have to spend to try every permutation of characters one by one. X is the number of years that will have to pass before the space can be checked in less than one hour, assuming that Moore’s law (the doubling of computing capacity every two years) remains valid. I also assume that in 2019, a computer can explore a billion possibilities per second. I represent this set of assumptions with the following three relationships and consider five possibilities based on values of A and N:

Relationships

T = A^N
D = T/(10⁹ × 3,600)
X = 2 log₂[T/(10⁹ × 3,600)]

Results

_________________________________

If A = 26 and N = 6, then T = 308,915,776
D = 0.0000858 computing hour
X = 0; it is already possible to crack all passwords in the space in under an hour

_________________________________

If A = 26 and N = 12, then T = 9.5 × 10¹⁶
D = 26,508 computing hours
X = 29 years before passwords can be cracked in under an hour

_________________________________

If A = 100 and N = 10, then T = 10²⁰
D = 27,777,777 computing hours
X = 49 years before passwords can be cracked in under an hour

_________________________________

If A = 100 and N = 15, then T = 10³⁰
D = 2.7 × 10¹⁷ computing hours
X = 115 years before passwords can be cracked in under an hour

_________________________________

If A = 200 and N = 20, then T = 1.05 × 10⁴⁶
D = 2.9 × 10³³ computing hours
X = 222 years before passwords can be cracked in under an hour
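The three relationships can be evaluated for the five cases with a short Python sketch. The function and constant names are illustrative. Two assumptions are made about how the figures are rounded: X is truncated to whole years, and a negative result (a space that can already be searched in under an hour) is reported as 0.

```python
import math

RATE = 10 ** 9            # guesses per second, the 2019 assumption in the text
SECONDS_PER_HOUR = 3600

def search_cost(a: int, n: int):
    """Return (T, D, X) for an alphabet of size a and a password of length n."""
    t = a ** n                               # T = A^N
    d = t / (RATE * SECONDS_PER_HOUR)        # D = T / (10^9 x 3,600)
    x = max(0, math.floor(2 * math.log2(d))) # X = 2 log2(D), truncated, never negative
    return t, d, x

for a, n in [(26, 6), (26, 12), (100, 10), (100, 15), (200, 20)]:
    t, d, x = search_cost(a, n)
    print(f"A={a:<3} N={n:<2} T={t:.3g}  D={d:.3g} hours  X={x} years")
```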

Weaponizing Dictionaries and Other Hacker Tricks

Quite often an attacker succeeds in obtaining encrypted passwords or password “fingerprints” (which I will discuss more fully later) from a system. If the hack has not been detected, the interloper may have days or even weeks to attempt to derive the actual passwords.

To understand the subtle processes exploited in such cases, take another look at the possibility space. When I spoke earlier of bit size and password space (or entropy), I implicitly assumed that the user consistently chooses passwords at random. But typically the choice is not random: people tend to select a password they can remember (locomotive) rather than an arbitrary string of characters (xdichqewax).

This practice poses a serious problem for security because it makes passwords vulnerable to so-called dictionary attacks. Lists of commonly used passwords have been collected and classified according to how frequently they are used. Attackers attempt to crack passwords by going through these lists systematically. This method works remarkably well because, in the absence of specific constraints, people naturally choose simple words, surnames, first names and short sentences, which considerably limits the possibilities. In other words, the nonrandom selection of passwords essentially reduces possibility space, which decreases the average number of attempts needed to uncover a password.
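To make the idea concrete, here is a toy Python sketch of a dictionary attack against a stolen fingerprint. Everything in it is an assumption made for illustration: the four-entry word list stands in for a real frequency-ordered list, and SHA-256 stands in for whatever fingerprint function the attacked system uses.

```python
import hashlib
from typing import List, Optional, Tuple

def fingerprint(password: str) -> str:
    """Stand-in fingerprint function (SHA-256, assumed for this example)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def dictionary_attack(target: str, wordlist: List[str]) -> Tuple[Optional[str], int]:
    """Try candidates in list order; return (match or None, attempts made)."""
    for attempts, candidate in enumerate(wordlist, start=1):
        if fingerprint(candidate) == target:
            return candidate, attempts
    return None, len(wordlist)

# Tiny hypothetical list, most common entries first.
COMMON = ["123456", "password", "iloveyou", "locomotive"]

print(dictionary_attack(fingerprint("locomotive"), COMMON))   # found on attempt 4
print(dictionary_attack(fingerprint("xdichqewax"), COMMON))   # not in the list
```

A memorable word falls after a handful of guesses, whereas the random string is not in the list at all and leaves the attacker with the full possibility space to search.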

If you use password or iloveyou, you are not as clever as you thought! Of course, lists differ according to the country where they are collected and the Web sites involved; they also vary over time.

For four-digit passwords (for example, the PIN code of SIM cards on smartphones), the results are even less imaginative. In 2013, based on a collection of 3.4 million passwords each containing four digits, the DataGenetics Web site reported that the most commonly used four-digit sequence (representing 11 percent of choices) was 1234, followed by 1111 (6 percent) and 0000 (2 percent). The least-used four-digit password was 8068. It appeared only 25 times among the 3.4 million four-digit sequences in the database, which is much less than the 340 uses that would have occurred if each four-digit combination had been used with the same frequency. Careful, though: this ranking may no longer be true now that the result has been published. The 20 most frequently used four-digit sequences, in order, are: 1234; 1111; 0000; 1212; 7777; 1004; 2000; 4444; 2222; 6969; 9999; 3333; 5555; 6666; 1122; 1313; 8888; 4321; 2001; 1010.
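The effect of this skew on an attacker's odds can be worked out from the figures above. The following Python sketch uses only the rounded percentages quoted in the text, so the results are approximate; the variable names are illustrative.

```python
TOTAL_PINS = 3_400_000     # size of the collection described in the text
POSSIBLE = 10 ** 4         # number of distinct four-digit sequences

# Expected count per sequence if every sequence were equally popular.
uniform_count = TOTAL_PINS / POSSIBLE

# Rounded shares quoted in the text for the three most common sequences.
top_shares = {"1234": 0.11, "1111": 0.06, "0000": 0.02}

# Chance of success within three guesses, with and without the skew.
three_guess_real = sum(top_shares.values())
three_guess_uniform = 3 / POSSIBLE

print(f"expected uses per PIN if uniform: {uniform_count:.0f}")
print(f"success in 3 guesses (observed shares): {three_guess_real:.0%}")
print(f"success in 3 guesses (uniform choice): {three_guess_uniform:.2%}")
```

With three tries, the usual limit before a card or SIM is blocked, an attacker who starts with the most popular sequences succeeds roughly one time in five instead of three times in 10,000.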

Even without a password dictionary, using differences in frequency of letter use (or double letters) in a language makes it possible to plan an effective attack. Some attack methods also take into account that, to facilitate memorization, people may choose passwords that have a certain structure, such as A1=B2=C3, AwX2AwX2 or O0o.lli. (which I used for a long time), or that are derived by combining several simple strings, such as password123 or johnABC0000. Exploiting such regularities makes it possible for hackers to speed up detection.

Advice for Web Sites

Web sites, too, follow various rules of thumb. The National Institute of Standards and Technology recently published a notice recommending the use of dictionaries to filter users’ password choices.
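A dictionary filter of this kind might look like the following Python sketch. The block list here is a tiny stand-in built from examples mentioned in this article; a real site would load a much larger list, and the function name and the case-insensitive comparison are assumptions made for the example.

```python
# Tiny stand-in for a real list of commonly used passwords.
COMMON_PASSWORDS = frozenset({"password", "iloveyou", "1234", "1111", "0000"})

def is_acceptable(candidate: str, blocklist: frozenset = COMMON_PASSWORDS) -> bool:
    """Reject a proposed password that appears in the block list (ignoring case)."""
    return candidate.lower() not in blocklist

print(is_acceptable("Password"))     # False
print(is_acceptable("xdichqewax"))   # True
```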

Among the rules that a good Web server designer absolutely must adhere to is this: do not store plaintext lists of usernames and passwords on the computer used to operate the Web site.

The reason is obvious: hackers could access the computer containing this list, either because the site is poorly protected or because the system or processor contains a serious flaw unknown to anyone except the attackers (a so-called zero-day flaw), who can exploit it.

One alternative is to encrypt the passwords on the server: use a secret code that transforms them via an encryption key into what will appear to be random character sequences to anyone who does not possess the decryption key. This method works, but it has two disadvantages. First, it requires decrypting the stored password every time to compare it with the user’s entry, which is inconvenient. Second, and more seriously, the decryption necessary for this comparison requires storing the decryption key in the Web site computer’s memory. This key may therefore be detected by an attacker, which brings us back to the original problem.

A better way to store passwords is through what are called hash functions that produce “fingerprints.” For any data in a file, symbolized as F, a hash function generates a fingerprint. (The process is also called condensing or hashing.) The fingerprint, h(F), is a fairly short word associated with F but produced in such a way that, in practice, it is impossible to deduce F from h(F). Hash functions are said to be one-way: getting from F to h(F) is easy; getting from h(F) to F is practically impossible. In addition, the hash functions used have the characteristic that even if it is possible for two data inputs, F and F’, to have the same fingerprint (known as a collision), in practice for a given F, it is almost impossible to find an F’ with a fingerprint identical to that of F.
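The behavior of a fingerprint can be observed with Python's standard `hashlib` module. SHA-256 is used here purely as an example of a hash function; the text does not say which function a given site uses, and the name `h` simply mirrors the notation above.

```python
import hashlib

def h(data: bytes) -> str:
    """Fingerprint of the data, written as hexadecimal digits."""
    return hashlib.sha256(data).hexdigest()

short_input = h(b"locomotive")
near_input = h(b"locomotivf")           # differs from the first input by one letter
long_input = h(b"locomotive" * 1000)    # a much longer input

print(short_input)
print(near_input)
print(len(short_input), len(long_input))   # 64 64
```

The fingerprint has the same length whatever the size of the input, the same input always yields the same fingerprint, and changing a single letter produces a fingerprint that looks unrelated to the first.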

Using such hash functions allows passwords to be securely stored on a computer. Instead of storing the list of paired usernames and passwords, the server stores only the list of username/fingerprint pairs.

When a user wishes to connect, the server will read the individual’s password, compute the fingerprint and determine whether it matches the fingerprint stored for that username. That maneuver frustrates hackers because even if they have managed to access the list, they will be unable to derive the users’ passwords, inasmuch as it is practically impossible to go from fingerprint to password. Nor can they generate another password with an identical fingerprint to fool the server because it is practically impossible to create collisions.
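The storage and login check just described could be sketched in Python as follows. This is a simplified illustration of the scheme exactly as the text presents it, not a recommendation for production use: the class and method names are invented, and a single SHA-256 pass is an assumption standing in for the site's actual fingerprint function.

```python
import hashlib
import hmac

def fingerprint(password: str) -> str:
    """Stand-in fingerprint function (SHA-256, assumed for this example)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

class CredentialStore:
    """Keeps username/fingerprint pairs; the passwords themselves are never stored."""

    def __init__(self) -> None:
        self._fingerprints = {}

    def register(self, username: str, password: str) -> None:
        self._fingerprints[username] = fingerprint(password)

    def verify(self, username: str, password: str) -> bool:
        stored = self._fingerprints.get(username)
        if stored is None:
            return False
        # compare_digest compares the two strings in constant time.
        return hmac.compare_digest(stored, fingerprint(password))

store = CredentialStore()
store.register("alice", "xdichqewax")          # hypothetical user
print(store.verify("alice", "xdichqewax"))     # True
print(store.verify("alice", "locomotive"))     # False
```

Someone who copies the contents of the store obtains only the fingerprints, which is the situation the paragraph above describes.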


Credit of the article given to Jean-Paul Delahaye