Comments on the New Scientist article

A feature article about our research has appeared in the special X-mas issue of New Scientist.

We appreciate that the author of the article tried to get to the bottom of our results. Nonetheless, there are some points that we must comment on.

First, about 37, which even made its way into the headline. It’s a mystery to us why it catches most of the attention and is even ascribed some “special meaning” (not only by this author). 37 is certainly not an answer to life and everything; it has a much humbler role in the code. Once you try to systematize codons in a position-dependent manner, you get a structure that we call the ideogram, which reveals a hierarchy of symmetries, both at the level of individual nucleotides and at the level of triplets. Alternatively, when you systematize codons in a position-independent manner using all non-arbitrary divisions of the code, you find that in each of these divisions the code reveals exact equalities of nucleon sums (in essence, another type of symmetry: quantitative symmetry). Only then do you find that those nucleon sums systematically reveal distinctive notation in the positional decimal system, which, in turn, is related to the criterion of divisibility by 37 (which exists in the decimal system). That’s all; there is no special meaning. Why the main 90% of the result is missed by many commentators, while 37 is emphasized, is something of a head-scratcher to us. All of the patterns would be no less impressive if they revealed distinctive notation in, say, the septenary system (where it is related to the criterion of divisibility by 19), or even if they did not reveal distinctive notation in any numeral system at all.
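For readers who want to check the arithmetic behind the remark about 37 and 19, here is a minimal Python sketch. It assumes that “distinctive notation” refers to three-digit numbers made of identical digits (111, 222, …) and to cyclic rotations of three-digit numbers; that reading, and the helper names `to_digits`, `from_digits`, `repdigits` and `rotations_divisible`, are illustrative assumptions and not taken from the paper. The sketch only verifies the divisibility facts themselves: 999 = 27 × 37 in base 10, and 666 in base 7 equals 342 = 18 × 19.

```python
def to_digits(n, base):
    """Digits of a non-negative integer n in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % base)
        n //= base
    return digits[::-1]


def from_digits(digits, base):
    """Inverse of to_digits: rebuild the integer from its digit list."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


def repdigits(base, width=3):
    """All numbers written with `width` identical non-zero digits in `base`."""
    return [from_digits([d] * width, base) for d in range(1, base)]


def rotations_divisible(n, base, divisor, width=3):
    """True if every cyclic rotation of n's zero-padded digits is divisible by divisor."""
    digits = to_digits(n, base)
    digits = [0] * (width - len(digits)) + digits
    return all(
        from_digits(digits[i:] + digits[:i], base) % divisor == 0
        for i in range(width)
    )


# Decimal: 111, 222, ..., 999 are all multiples of 37.
decimal_ok = all(r % 37 == 0 for r in repdigits(10))
# Septenary: 111, 222, ..., 666 (base 7) are all multiples of 19.
septenary_ok = all(r % 19 == 0 for r in repdigits(7))

print(decimal_ok, septenary_ok)
print(rotations_divisible(37, 10, 37))  # checks 037, 370, 703
print(rotations_divisible(19, 7, 19))   # checks 025, 250, 502 in base 7
```

The rotation property holds because 10³ − 1 = 999 is a multiple of 37 and 7³ − 1 = 342 is a multiple of 19, so the same kind of criterion exists in both numeral systems.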

Also, after reading the article one might get the impression that we argue the patterns are artificial simply because they are non-random. However, non-randomness alone is by no means a sign of artificiality. Unfortunately, the author did not mention any of our arguments for why we actually consider the signal to be artificial. You might take a look at these arguments here.

Billions of years ago, the planet was barren and lifeless. But then, at some distant and unknowable moment, it was seeded with what Makukov calls an “intelligent-like signal”…

Seeding individual planets is utterly inefficient compared to seeding collapsing clouds that produce star clusters (our Solar System, like the majority of others, is also believed to have formed in a cluster). Besides, saying that it was seeded “with a signal” might be misleading: it was seeded with microbial cells which carried an intelligent signal.

The idea goes back to 1973, when Francis Crick published a paper in the planetary sciences journal Icarus

In fact, the idea of directed panspermia goes back even further, to Carl Sagan and Iosif Shklovskii in 1966, and to J.B.S. Haldane in 1954.

Unlike genomic DNA, the code is stable. Genomes mutate over time, but the code is passed down the generations without alteration…

Actually, the genetic code has undergone minor alterations in some lineages of simple organisms and organelles. There are about a dozen known variations of the code. But that does not affect the major point, since the vast majority of organisms still use the code inherited from the last universal common ancestor (which, in the case of directed panspermia, corresponds to the original seeds).

Their arguments are often dense and impenetrable, filled with complex mathematical formulae

No, no, no. Actually, it’s all very easy to grasp, and we do not use any formulae to retrieve the signal (let alone complex ones). If it were so complex, it would make little sense for us to argue that it has an intentional origin: why insert an intentional signal if it is extremely hard to retrieve?

“It was clear right away that the code has a non-random structure,” says Makukov.

To clarify: as soon as the code was cracked (in the 1960s), it was clear right away that it has a non-random structure. It is not that everyone thought the code was random until recently. E.g., any researcher in the field knew that the code has a block structure and is highly robust to misreadings.

In 1996, mathematician Olga Zhaksybayeva of the al-Farabi Kazakh National University calculated that the probability of it occurring by chance is 3.09 × 10^-32.

Any biologist will probably object to this number, since in that paper the probability for Rumer’s pattern was calculated outside any biological context. If one assumes that, as a requirement of biological efficiency, the code should have a block structure, this probability increases, though, of course, not to 1.

All in all, the Kazakhs have identified nine patterns in the code

Nine? :) Is the entire ideogram, with all the symmetries in all of its strings, counted here as “one pattern”?

If you think that all sounds a bit like The Da Vinci Code for DNA, you’re not alone. “It’s flat out numerology,” says Myers, who also notes the similarity to the pseudoscience of intelligent design

We commented on PZ Myers’ “numerology” here. Regardless of the validity of our results, it is surprising that one of the most ardent ID critics has confused our approach with ID (we never use ID methods such as irreducible complexity, statistical analysis out of evolutionary context, etc.).

Davies is also quite forgiving. “If you crunch numbers long enough, you’ll find patterns in almost anything,” he says. “It was very clear to me at the outset that what this boils down to is an assessment: what is the probability that you might find something like this by chance?”

This is the default standard in all empirical sciences: when one observes some patterns, expected or not, the first thing to do is to perform a statistical analysis to make sure that they are not a statistical fluke.

To that, Makukov and shCherbak have an answer: about 10^-13, or 1 in 10 trillion

This is only an upper estimate, since there were simplifications in the statistical analysis.

As to what – or who – planted the message, Makukov stresses that he doesn’t know. 

Does anyone expect “little green men” here as an answer? No way – it’s pink fluffy fairies! ;)

Posted in Uncategorized | Leave a comment

The summary of the reddit Science AMA

It was a nice experience to have an AMA session at /r/science.

We are glad to see that most contributors are able to balance carefully between skepticism and open-mindedness. It was no surprise to us to see a certain portion of pseudoskeptics who invariably invoke allegories, all of which boil down to the same mantra: “this is nonsense just because this is nonsense”. But it was a surprise to us to learn that absolutely any topic in the Science AMA series gets 83-93% upvotes, regardless of how many points a thread gets. In other words, whatever the subject is, there are always 7-17% of people who dislike it. So this seems to be the norm, probably some kind of mass effect in social psychology.

Below are the links to a few threads discussing typical objections:

“If you wish to claim that any order you detect comes from artificial sources, you first have to eliminate order that comes from known sources.” (A very long but instructive discussion with an anonymous structural biologist. Many thanks to him for this discussion.)

The onus is on you to prove that ideas of notations and zero are incompatible with evolution

If you take 12 pages of explain the internal structure of a half line of text, you aren’t doing math, you’re doing numerology

It is a “god of the gaps” approach

All of that just pushes the question of origins further back in time


Why the genetic code is not universal

A fresh post by Matthew Cobb at Jerry Coyne’s blog explains in lay terms why the genetic code is not universal, which might be helpful to non-experts. Keep in mind that the explanation is simplified, and only one presumed mechanism responsible for non-universality (codon capture) is mentioned, while a few more distinct mechanisms have been proposed (e.g., ambiguous intermediate). It is also intriguing to learn that Cobb has just finished a whole book about the genetic code, titled “Life’s Greatest Secret: The Story of the Race to Crack the Genetic Code”.

Note that what Cobb describes happened during evolution after the Last Universal Common Ancestor (LUCA, the first cells on Earth from which all terrestrial life evolved), so all of that is fully compatible with directed panspermia and with the message in the universal genetic code.


Expanding the genetic code

The whole August issue of ChemBioChem is dedicated to expanding the genetic code with non-canonical amino acids.


NYT on Genomic SETI: DNA vs. genetic code

There is an article in The New York Times, written by Dennis Overbye in 2007, that discusses genomic SETI. Overbye was inspired by Japanese researchers who shortly before that had managed to embed the famous formula “E=mc^2” and the year of its publication, “1905”, into a bacterial genome (though, in fact, embedding non-biological information into genomes was performed as early as 1986 and 1988, to name a few). Overbye hypothesizes that some kind of message might already reside in the DNA of terrestrial organisms, if life on Earth was seeded by a preceding galactic civilization, as proposed by Francis Crick and Leslie Orgel (though, again, Crick and Orgel proposed that Earth was seeded with living cells, not with DNA, as Overbye writes; seeding DNA makes no sense, since it will simply decay outside the cellular context).

Overbye comes to the conclusion that there are two disadvantages to genomic SETI. The first one is that DNA mutates:

The sad truth is, as others will tell you, this is a bit like writing love letters in the sand. “I don’t buy it,” said Seth Shostak, an astronomer at the SETI Institute in Mountain View, Calif., pointing out that DNA is famously mutable. “Just ask Chuck Darwin,” he added in an e-mail message.

Apart from the fact that it is unclear what Darwin has to do with this (he had no idea about DNA, not to mention its mutability and role in heredity), this seems reasonable. Even though genomic DNA is now considered to be the most reliable information storage medium, one that can safely keep information for tens or hundreds of thousands of years, it is still not as durable as needed for messaging in directed panspermia, where an intact message must be replicated with DNA for billions of years until intelligent beings evolve. This does not imply that it is definitely impossible to use DNA as billion-year storage, but no one knows how to do that.

The second problem is more subtle:

Gill Bejerano, a geneticist at the University of California, … pointed out that the problem with raising this question is that people who look will see messages in the genome even if they aren’t there — the way people have claimed in recent years to have found secret codes in the Bible.

This also seems reasonable, because genomes are diverse and huge (in fact, even bacterial genomes are comparable in capacity to the Bible). But why should this count against approaching genomes with the hypothesis of genomic SETI? After all, the problem mentioned pertains to any research. All researchers tend to see patterns in data that seem to confirm their hypotheses, and there are many examples of false positives in any research field. This is the way science works: normally you do not expect an analysis or experiment to immediately yield an unambiguous answer. Why require that from genomic SETI? The problem with the Bible code is not that its result is ambiguous; it is that this approach has no reasonable scientific hypothesis behind it to suggest why we should expect any hidden messages in the Bible in the first place.

In any event, both problems essentially dissolve if one turns from genomes to the genetic code. First, unlike DNA, the genetic code does not mutate, or at least not nearly as rapidly as DNA. Certainly, there are known minor variations of the code in simple organisms and organelles, but the vast majority of terrestrial organisms still use the same version, which has been unchanged for billions of years. Second, unlike genomes, which are huge and diverse, the genetic code is small and universal. You simply don’t have the freedom here to see a lot of messages that are not actually there. This is not to say that you cannot get false positives here, but they are much less numerous, and it is easier to ultimately decide whether a positive result is more likely to be false or true.


Attacking a straw man

The “straw man” is a type of argumentation in which someone distorts the original statement of an opponent, or even attributes a false statement to him, and then debunks it. There is a nice example of straw man argumentation among some bloggers concerning our Icarus paper. Here is one of them. He writes:

[The paper] rests on a false comparison of two options:
1. Created by random chance
2. Created by space aliens
This is set up so that if the first is unlikely, the second “must” be right.
The setting is rigged because these two aren’t all the possibilities. There is at least one more:
3. Created by a non-random natural process (e.g. evolved)
To declare any one the ‘preferred’ choice they’d have to investigate all three possibilities, then compare what was found. But they don’t: they only look at the first then declare the second as the ‘winner’ without ever looking at the third.

The straw man here is that, in fact, we tested precisely options 1 and 3. To quote Appendix B from the paper:

We tested both versions of the null hypothesis (“the patterns are due to chance alone” and “the patterns are due to chance coupled with presumable evolutionary pathways”).

Then we go on to describe how we take certain evolutionary pathways into account. Even if the paper is read perfunctorily, it is impossible to miss such a big chunk of text. So we are forced to conclude that the straw man here is based not on inadvertence, but either on deliberate distortion or on mere unfamiliarity with the notion of evolution. Non-random evolution does not imply a deterministic process, as perhaps those guys believe. It implies a stochastic process acted upon by non-random forces (represented in our statistical test by corresponding filters applied to randomly generated codes).
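To illustrate the general idea of “filters applied to randomly generated codes”, here is a toy Python sketch of a Monte Carlo test with rejection sampling. It is emphatically not the test from the paper: the filter (`toy_filter`), the pattern (`toy_pattern`), the numeric encoding of amino acids as integers 0 to 20, and all function names are hypothetical stand-ins chosen only to show the structure of such a test. Random codes are generated, those that fail the filter are discarded, and the fraction of surviving codes that show the pattern estimates its probability under the null hypothesis “chance coupled with the assumed constraint”.

```python
import random

N_CODONS = 64    # number of triplets in the code table
N_SYMBOLS = 21   # 20 amino acids plus a stop signal (encoded here as 0..20)


def random_code(rng):
    """Assign a random symbol to each of the 64 codon positions."""
    return [rng.randrange(N_SYMBOLS) for _ in range(N_CODONS)]


def toy_filter(code):
    """Hypothetical constraint: every symbol is encoded by at least one codon."""
    return len(set(code)) == N_SYMBOLS


def toy_pattern(code):
    """Hypothetical pattern: the two halves of the table have equal symbol sums."""
    half = N_CODONS // 2
    return sum(code[:half]) == sum(code[half:])


def estimate_p(trials, seed=0, use_filter=True):
    """Fraction of accepted random codes that show the toy pattern.

    With use_filter=False this is the "chance alone" null hypothesis;
    with use_filter=True it is "chance coupled with the assumed constraint".
    trials must be a positive integer.
    """
    rng = random.Random(seed)
    accepted = 0
    hits = 0
    while accepted < trials:
        code = random_code(rng)
        if use_filter and not toy_filter(code):
            continue  # rejected by the filter; draw another code
        accepted += 1
        hits += toy_pattern(code)
    return hits / accepted


if __name__ == "__main__":
    print("chance alone:   ", estimate_p(20000, seed=1, use_filter=False))
    print("chance + filter:", estimate_p(20000, seed=1, use_filter=True))
```

The point of the design is that the filter changes the population of codes against which the pattern is judged, so a pattern can be tested under both versions of the null hypothesis with the same machinery.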


Another second genetic code

Recently a paper was published in Science which reported that in the human genome some codons specifying amino acids also serve as landmarks for the binding of certain proteins that control gene activity (the authors dubbed such codons “duons”). Well, it is just another report on the overlap of the genetic code with yet another code (a regulatory code, in this case). But the funny thing is that both the press release and the news outlets are writing that “scientists have discovered a second code hiding within DNA”. Yes, Edward. A second code. Well, the sixth second code, to be exact :)

The following excerpt from a 2011 paper by Edward Trifonov makes it clear:

According to the media sympathetic to science and enthusiastic about sensational discoveries, the “Second Genetic Code” as it was called by New York Times was discovered by Ya-Ming Hou and Paul Schimmel and published in Nature in 1988. It was about recognition of tRNAs by respective aminoacyl-tRNA synthetases. Thirteen years later New Scientist announced the second Second Genetic Code, discovered by Jenuwein and Allis and published in Science. This time it was about histone modifications. Five years later, New York Times, again, reported about “a second code in DNA in addition to the genetic code”. This was already the third Second Genetic Code, discovered by Segal et al, suggesting now nucleosome positioning rules. One, surely, would raise eyebrows having learned that there is also the fourth Second Genetic Code — on interaction specificities between proteins and DNA, and the fifth Second Genetic Code, the name given by Nature magazine to the set of rules governing gene splicing. Bewildered reader, naturally, would say “I’m done with seconds, can I have a third?”
