An artificial intelligence that can turn mathematical concepts written in English into a formal proving language for computers could make problems easier for other AIs to solve.
Maths can be difficult for a computer to understand
An artificial intelligence can translate maths problems written in plain English into formal code, making them easier for computers to solve, a crucial step towards building a machine capable of discovering new maths.
Computers have been used to verify mathematical proofs for some time, but they can only do so if the problems have been prepared in a specially designed proving language, rather than in the mix of mathematical notation and written text used by mathematicians. This translation process, known as formalisation, can take years of work for just a single proof, so only a small fraction of mathematical knowledge has been formalised and then proved by a machine.
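To give a sense of what formalisation involves, here is a minimal, hypothetical sketch of a competition-style statement alongside a formal counterpart. The study used the Isabelle proof assistant; the example below uses Lean, a comparable system, purely as an illustration, and assumes a recent Lean 4 toolchain. It is not one of the problems from the study.

```lean
-- Illustrative only: a made-up, toy statement, not taken from the study.
-- Informal version: "If x is a natural number and x + 3 = 7, then x = 4."
theorem x_is_four (x : Nat) (h : x + 3 = 7) : x = 4 := by
  -- `omega` is Lean 4's built-in decision procedure for linear arithmetic
  -- over the integers and natural numbers, so it closes this goal directly.
  omega
```

Writing statements like this by hand is easy for toy examples, but a research-level proof involves a long chain of such definitions and lemmas, which is why formalising even one proof can take years.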
Yuhuai Wu at Google and his colleagues used a neural network called Codex, created by AI research company OpenAI. It has been trained on large amounts of text and programming data from the web and can be used by programmers to generate workable code.
Proving languages share similarities with programming languages, so the team decided to see whether Codex could formalise a bank of 12,500 secondary school maths competition problems. It was able to translate a quarter of the problems into a format compatible with Isabelle, a formal proof assistant. Many of the unsuccessful translations were the result of the system not understanding certain mathematical concepts, says Wu. “If you show the model an example that explains that concept, the model can then quickly pick it up.”
To test the effectiveness of this auto-formalisation process, the team then applied Codex to a benchmark set of problems, known as MiniF2F, that had already been formalised by humans. Codex generated its own formal versions of these problems, and the team used a theorem-proving AI to try to solve both versions.
The auto-formalised problems improved the prover’s success rate from 29 per cent to 35 per cent, suggesting that Codex was better at formalising these problems than the humans were.
It is a modest improvement, but Wu says the team’s work is only a proof of concept. “If the goal is to train a machine that is capable of doing the same level of mathematics as the best humans, then auto-formalisation seems to be a very crucial path towards it,” says Wu.
Improving the success rate further would allow AIs to compete with human mathematicians, says team member Albert Jiang at the University of Cambridge. “If we get to 100 per cent, we will definitely be creating an artificial intelligence agent that’s able to win an International Maths Olympiad gold medal,” he says, referring to the top prize in a leading maths competition.
While the immediate goal is to improve the auto-formalisation models and automated proving machines, there could be larger implications. Eventually, says Wu, the models could uncover areas of mathematics currently unknown to humans.
The capacity for reasoning in such a machine could also make it well-suited for verification tasks in a wide range of fields. “You can verify whether a piece of software is doing exactly what you asked it to do, or you can verify hardware chips, so it has applications in financial trading algorithms and hardware design,” says Jiang.
It is an exciting development for using machines to find new mathematics, says Yang-Hui He at the London Institute for Mathematical Sciences, but the real challenge will be in using the model on mathematical research, much of which is written in LaTeX, a typesetting system. “We only use LaTeX because it types nicely, but it’s a natural language in some sense, it has its own rules,” says He.
Users can define their own functions and symbols in LaTeX that might appear in only a single mathematical paper, which could be tricky to handle for a neural network that has only been trained on plain text, says He.
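As a hypothetical illustration of the kind of one-off notation He describes, a single paper might define shorthand macros like the ones below, with meanings explained nowhere else, which a model trained only on ordinary prose and standard notation would never have encountered. The macro names here are invented for illustration.

```latex
% Illustrative only: made-up macros of the sort a single paper might define.
\newcommand{\modspace}[1]{\mathcal{M}_{#1}}           % author-specific shorthand for a space
\newcommand{\bracket}[2]{\langle #1 \mid #2 \rangle}  % custom pairing notation
% Used in the text as, say, $\modspace{g}$ or $\bracket{u}{v}$,
% with their meanings defined only within that one paper.
```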
*Credit for article given to Alex Wilkins*