Hidden meanings: Using artificial intelligence to translate ancient texts

The ancient world is full of mysteries.

Who built the monolithic and megalithic structures found all over the world? Why did they build them? How did they build them? What technology did they use?

And perhaps most importantly from the point of view of answering all the other questions: Where are the texts that the builders produced?

We assume that if the ancients were capable of building structures that modern humans cannot replicate even now with the latest technology, they must have been a literate civilization which recorded and stored information. 

But where is it?

These are among the many questions that have preoccupied archaeologists and historians for more than a century.

A huge amount of progress has been made as a result of the dedicated pursuit of the answers. It has spawned a multibillion-dollar global tourism industry and some relatively well-funded academic projects. A lot of museums and films can also be said to be somewhat beholden to this obsession with the ancient past.

But in terms of definitively answering those big questions, progress has been rather slow and painstaking.

The Rosetta Stone

It would, of course, help if more artifacts like the Rosetta Stone (pictured above) were discovered.

The Rosetta Stone, created in 196 BC and discovered in 1799, is a dark stone slab inscribed with the same text in three scripts: Egyptian hieroglyphs, Demotic (a more everyday Egyptian script), and ancient Greek.

This stone enabled people studying ancient cultures to finally understand the Egyptian hieroglyphics which cover acres of surface area on pyramids and temples in the country.

The presumption is made that the three statements on the Rosetta Stone are direct and literal translations of each other, but since academics have been studying it for a long time, we can probably safely make that presumption.

Other ancient languages, however, are proving more evasive. The Indus Valley civilization, which is said to be one of the oldest ever discovered, used a script that is defying almost all attempts at translation because it has no established relationship with any other language on Earth, although it is pictorial in part.

The Sumerian language is more amenable to translation because some Sumerian people appear to have been bilingual, also speaking a contemporary language called Akkadian.

Translation work has so far been undertaken by humans, but soon, artificial intelligence systems will, inevitably, be used to not only speed up the process, but also improve accuracy – and perhaps identify similarities and patterns across many languages humans may not have the time or ability to interpret.

The greatest whodunnit in history

Taken together, the global effort to learn about human history has become the greatest whodunnit in history.

Optical character recognition has been around for some time and is typically used to scan a physical text document and create digital representations of the letters and words it contains.

OCR technology has become accurate enough on printed text that the old job of “copy typist” has pretty much disappeared.

Unless, of course, the typist has to copy a handwritten document or, worse still, translate text from one language to another.

Automated translation algorithms have made much progress, and services such as Google Translate can save a lot of time, although it’s advisable to check and edit the translations they produce.

Where such translation software may help humanity to make discoveries of earth-shattering proportions is if and when it is applied to ancient texts.

Almost all cultures which have ancient religions talk of “sky gods” and wars fought in the skies. Until recently, these stories were generally thought to be myths and allegory. But as new information is gathered, researchers are connecting the dots and the picture that is emerging is quite amazing.

Picture courtesy of Online Star Register

Aligning with the stars

Some people may be astonished when they first learn that the layout of the pyramid complex at Giza, Egypt (pictured above) is said to mirror the constellation of Orion.

The largest pyramid aligns with the largest of the three stars in Orion’s Belt. The smaller two align with the other two.

This correlation was said to have first been spotted in modern times by Robert Bauval, in 1983, and it may have opened up the field of study to a whole new perspective, one which integrates the ancient structures with not only this planet but also many other celestial bodies very directly.

Until recently, the pyramids were thought to be about 5,000 years old. But the last time the three stars of Orion’s Belt aligned perfectly with the three pyramids at Giza was around 10,000 years ago, which suggests that these structures may be much, much older than previously thought.

Hieroglyphs may shed some light on this, but given that some people believe hieroglyphs are the work of later civilizations, no one yet knows what information will emerge.

Computer games giant Ubisoft last year partnered with Google on an interesting project which it announced at the time of the launch of its Assassin’s Creed Origins game.

Ubisoft unveiled what it called “The Hieroglyphics Initiative”, a machine learning-based research project which uses Google’s TensorFlow technology.

The company said it would simplify the decipherment of hieroglyphics, and made the project open source.

“By making the Hieroglyphics Initiative an open source project, we aim to create a new connection between two things that we love at Ubisoft – history and technology,” said Pierre Miazga, Hieroglyphics Initiative Project co-ordinator at Ubisoft.

Perrine Poiron, an Egyptologist based at Sorbonne University, in France, said: “The Hieroglyphics Initiative not only has the potential to save us time as Egyptologists, it could unlock the magic of hieroglyphics for a new audience.”

Image recognition is the specific technology that is being developed to translate the hieroglyphs, and this is categorized in the machine learning branch of AI.

The voice of Sumer

Meanwhile, or at least within 5,000 to 10,000 years or so of the Egyptian hieroglyphs, millions of clay tablets were inscribed with strange scripts.

The text on a large proportion of these tablets is said to be in “cuneiform”, which features wedge-shaped impressions on the soft clay, making them look like a whole load of tiny golf course flags.

Cuneiform could probably be said to be an abstract form, whereas earlier forms of Sumerian texts were figurative in that they used pictures, much like hieroglyphs.

This figurative form of language or writing is called “logographic”, and is said to be part of the ancient Mesopotamian culture, sometimes referred to as “the birthplace of writing”, although this is probably debatable given the ongoing discoveries being made all around the world.

In any case, the translation of ancient texts into modern languages is a tortuous process for even the most knowledgeable of academics, very few of whom specialize in this field.

A new initiative that may help these long-suffering human translators of Sumerian clay tablets is being developed by a group of academics which includes Émilie Pagé-Perron, Maria Sukhareva, Ilya Khait, and Christian Chiarcos.

Their paper, Machine Translation and Automated Analysis of the Sumerian Language, presents a newly funded research project which will use natural language processing – a recognized term in AI – to “create an information retrieval system for Sumerian”.

The academics say the project is in response to the need to translate large numbers of administrative texts that are only available in transcription, in order to make them accessible to a wider audience.
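To give a flavor of what an "information retrieval system" over transcribed texts involves, here is a minimal sketch of an inverted index in Python. It is not the project's actual design: the tablet ids and the transliteration-style tokens are placeholders invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Toy transcribed lines, invented for illustration; not real tablet texts.
TOY_LINES = {
    "tablet_A": "1 udu lugal",
    "tablet_B": "2 udu e2",
    "tablet_C": "1 gud lugal",
}

def build_index(lines):
    """Map each token to the set of tablet ids whose text contains it."""
    index = defaultdict(set)
    for tablet_id, text in lines.items():
        for token in text.split():
            index[token].add(tablet_id)
    return index

def search(index, *tokens):
    """Return the sorted ids of tablets that contain every query token."""
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)

index = build_index(TOY_LINES)
print(search(index, "udu"))           # tablets mentioning the token "udu"
print(search(index, "udu", "lugal"))  # tablets mentioning both tokens
```

A real system would also need to normalize spelling variants and handle damaged or missing signs, which this sketch ignores.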

Only around 10 percent of all the discovered Sumerian texts have so far been translated, so the AI system being built by the academics has a lot of work to do.

What we can read into this

The 10 percent that has been translated consists of royal texts – written by and for the royal and ruling families in contemporary Sumerian culture.

But they reveal some deep insights into not only the ancient world but also into space, which we are only now beginning to understand. More on this later.

And although the 90 percent of Sumerian tablets that have yet to be translated are thought to be more mundane in nature compared with what has been already deciphered, being as they are legal and administrative, it’s very likely that they, too, will feed the fascination that many millions of people share about ancient cultures.

The AI system of translation could obviously be applied to other ancient languages in time. But there may be some – like the Indus Valley language – that remain stubborn because they share no words or terms with other languages, which may close the door on further understanding until more artifacts and information are unearthed, perhaps literally, by the archaeologists.

Even with the door closed against them, some academics are using artificial intelligence to try to understand the ancient Indus Valley language.

In their paper, Entropic Evidence for Linguistic Structure in the Indus Script, a group of academics is using artificial intelligence methods to try to find patterns in the language of the ancient Indus Valley inhabitants.

The academics are Rajesh Rao, Nisha Yadav, Mayank Vahia, Hrishikesh Joglekar, R. Adhikari, and Iravatham Mahadevan.
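The paper's title points to entropy as the measuring stick. One standard quantity of that kind is conditional entropy: how uncertain the next sign is once the current sign is known. Rigidly ordered sequences score low, random ones score high, and natural languages tend to fall in between. The sketch below is a simple count-based estimate on made-up strings, not the authors' corpus or their estimator, and the function name is an assumption for illustration.

```python
import math
from collections import Counter

def conditional_entropy(seq):
    """Estimate H(next | current) in bits from adjacent pairs in seq."""
    pairs = list(zip(seq, seq[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)
    total = len(pairs)
    h = 0.0
    for (a, b), n in pair_counts.items():
        p_pair = n / total               # P(current = a, next = b)
        p_b_given_a = n / first_counts[a]  # P(next = b | current = a)
        h -= p_pair * math.log2(p_b_given_a)
    return h

# Made-up sign sequences: one rigidly alternating, one where each sign
# is followed by either sign equally often.
print(conditional_entropy("ABABABABAB"))
print(conditional_entropy("AABBA"))
```

With a real sign corpus the counts are sparse, so a practical analysis would need smoothing and far more data than a toy string provides.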

Interestingly, some researchers have speculated that the reason why the ancient Indus Valley civilization disappeared relatively suddenly is that a nuclear war occurred thousands of years ago, pointing to what they say is evidence of radiation poisoning found in corpses at sites such as the Harappan region.

Descriptions of nuclear weapons and their aftermath – the way they affect humans – can also be found in Hindu religious texts, which are written in Sanskrit, a language we do understand.

Is it a bird? Is it a plane? 

While most people may be tempted to dismiss the ancient nuclear war hypothesis immediately, there are new discoveries being made that may explain what the Hindu religious texts were referring to.

While an actual nuclear war in the modern sense of the term is difficult to believe, what is undeniable is that these stories of terrible, destructive wars fought in the skies and featuring massive explosions also appear in other ancient texts which have no known direct connection with Hinduism – one example is the ancient Sumerian texts.

Without going into too much detail, in the Sumerian version, it’s interesting that a deity called Enki is mentioned. Interesting because Enki is phonetically similar to Encke, the name modern scientists use for a comet associated with a meteor shower called the Taurids.

The Taurids is a meteor shower that looks like it emanates from the constellation Taurus.

Some people – including myself – have probably never figured out why star constellations are said to resemble animal and human figures, except in the case of the Horsehead Nebula. But anyway, the constellation Taurus is said to represent the bull, which was a revered animal in ancient times, and still is in India.

The comet Encke is on an elliptical orbit of the sun – which takes about three-and-a-half years – and its path crosses the Earth’s orbital path (see image above).

Furthermore, the orbital path of Encke is strewn with debris, which astronomers generally attribute to the break-up of a much larger comet on that path.

Whatever it was that created the debris, the meteor showers associated with it can be seen with the naked eye from Earth twice a year.

In other words, Earth crosses the Taurid orbital path twice a year and that is when the debris – in the form of small meteors – hits our planet. It’s possible that these were seen by the ancients and that is what they wrote about, and a particularly bad meteor shower could have destroyed their civilizations.
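For a rough sense of the size of Encke's orbit, Kepler's third law relates the period of a body orbiting the sun to the average size of its orbit: with the period T in years and the semi-major axis a in astronomical units (AU, the Earth-sun distance), a³ = T². The sketch below applies that formula to the "about three-and-a-half years" quoted above and to 3.3 years, which is closer to the catalogued period of the comet. The function name is an assumption for illustration.

```python
def semi_major_axis_au(period_years):
    """Kepler's third law for a body orbiting the sun: a^3 = T^2,
    with a in astronomical units and T in years."""
    if period_years <= 0:
        raise ValueError("period must be positive")
    return period_years ** (2.0 / 3.0)

# Earth (1 year) as a sanity check, then two estimates of Encke's period.
for period in (1.0, 3.3, 3.5):
    print(f"T = {period} yr -> a = {semi_major_axis_au(period):.2f} AU")
```

Either figure gives a semi-major axis a little over 2 AU. Because the orbit is a stretched ellipse rather than a circle, the comet can still come close enough to the sun for its path to cross the Earth's.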

What is to say that the pyramids were not gigantic weapons created to shoot death rays to destroy particularly large meteors or other large cosmic objects?

Given that evidence is emerging to suggest that the Great Pyramid of Giza and other pyramids were actually designed not as tombs but as structures to concentrate and direct energy, and were interlinked somehow across the world, it’s possible.

Who knows? The people who built them obviously did. But they seem to have taken their plans with them, or hidden them somewhere that has kept them from being found for thousands of years.

If they are ever found, those will be the only user manuals that many people will ever read.

