Transactions on Engineering and Computing Sciences - Vol. 11, No. 2

Publication Date: April 25, 2023

DOI:10.14738/tecs.112.14499.

Diamant, E. (2023). Data-Driven Bioinformatics is a Popular but Wrong and Misleading Attempt to Assess the Information-Handling

Abilities of Natural Biological Systems. Transactions on Engineering and Computing Sciences, 11(2). 90-92.

Services for Science and Education – United Kingdom

Data-Driven Bioinformatics is a Popular but Wrong and

Misleading Attempt to Assess the Information-Handling

Abilities of Natural Biological Systems

Emanuel Diamant

Independent Research Engineer

Kiryat Ono, Israel

It is generally agreed and accepted that Information is an indispensable part of our modern

everyday life. But that does not mean that a generally agreed and accepted understanding of

what information is or may be does exist in the general public or in the specific scientific

communities. Norbert Wiener’s dictum “Information is information, not matter or energy” is

not a definition, but it expresses quite well the existing ambiguity and unease about the

matters under discussion.

The founding fathers of the Theory of Information (Shannon, Kolmogorov, Fisher, Chaitin,

Rényi) were also silent on the issue, devoting themselves only to the “information measure”

problem. The newly invented notion of “information measure” has served the purposes of

reliable message exchange over a communication channel very well. But in all other cases,

it led to a long-lasting improper mixing and merging between the notion of “information” and

the notion of “information measure”, which, in turn, made the relations between the notion of

“information” and notions of “data”, “knowledge”, and “semantics”, blurred, intuitive and

undefined.

In such circumstances, I have decided to develop my own definition of Information and the

corollaries that follow from this definition. The effort has taken some time, but I have

published my results on several occasions (over a span of almost 15 years), and today they

read as follows:

“Information is a linguistic description of structures observable in a given data set”.

I will spend the rest of my time discussing some consequences of my definition that, it seems

to me, deserve more attention and concern.

In a data set, the data elements are not distributed randomly; owing to the similarity of

their physical parameters, they are naturally grouped into clusters or bundles. (In an

electronic image, for example, these could be groups of pixels of the same color or brightness.)

I propose to call these clusters primary or physical data structures.
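
The pixel example can be made concrete with a minimal sketch. It assumes that "similarity of physical parameters" means exactly equal brightness and that neighbouring pixels are 4-connected; both are simplifying assumptions, and the function name `primary_structures` and the toy image are hypothetical, introduced here only for illustration.

```python
from collections import deque


def primary_structures(image):
    """Group 4-connected pixels of equal brightness into clusters.

    Returns a list of (brightness, [(row, col), ...]) tuples in the order
    the clusters are first met during a row-by-row scan.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            value = image[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            # Breadth-first flood fill over neighbours with the same value.
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] == value:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            clusters.append((value, pixels))
    return clusters


# Toy 4x4 grey-level image; the values are invented for illustration.
IMAGE = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [5, 5, 5, 9],
    [5, 5, 5, 0],
]

CLUSTERS = primary_structures(IMAGE)
for value, pixels in CLUSTERS:
    print(f"brightness {value}: {len(pixels)} pixels")
```

The grouping depends only on the pixel values themselves, which is the sense in which such clusters are physical: any observer running the same rule on the same data obtains the same clusters.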

In the eyes of an external observer, these primary data structures are organized into larger and

more complex agglomerations, which I propose to call secondary data structures.

These secondary structures reflect the observer's view on the grouping of the primary data

structures, and so they can be called meaningful or semantic data structures.

While the formation of primary (physical) data structures is determined by the objective

(natural, physical) properties of the data, the subsequent formation of secondary (semantic)

data structures is a subjective process governed by the conventions and habits of the observer

(or a mutual agreement in an observers’ group).
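
The contrast between objective and subjective grouping can be illustrated with a small sketch. It assumes that the primary structures have already been extracted and summarized by a few physical parameters, and it models an observer's convention as a plain function that names each structure. The data, the two conventions, and all identifiers are hypothetical.

```python
# Primary structures as a physical grouping step might report them
# (identifiers and values are invented for illustration).
PRIMARY = [
    {"id": "A", "brightness": 0, "size": 4},
    {"id": "B", "brightness": 9, "size": 5},
    {"id": "C", "brightness": 5, "size": 6},
    {"id": "D", "brightness": 0, "size": 1},
]


def secondary_structures(primary, convention):
    """Group primary structures under the names an observer's convention assigns."""
    groups = {}
    for structure in primary:
        groups.setdefault(convention(structure), []).append(structure["id"])
    return groups


def observer_by_tone(structure):
    # Hypothetical convention: this observer cares about dark versus bright.
    return "dark region" if structure["brightness"] < 5 else "bright region"


def observer_by_extent(structure):
    # Hypothetical convention: this observer cares about specks versus patches.
    return "speck" if structure["size"] < 2 else "patch"


BY_TONE = secondary_structures(PRIMARY, observer_by_tone)
BY_EXTENT = secondary_structures(PRIMARY, observer_by_extent)
print(BY_TONE)
print(BY_EXTENT)
```

The same four primary structures fall into different secondary structures under the two conventions, and the names of the groups come from the observer rather than from the data.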

As stated above, the description of the structures observed in the data set should be called

"Information". In this regard, it is necessary to distinguish between two types of information:

physical information and semantic information.

Both are language descriptions; however, physical information can be described using a variety

of languages (recall that mathematics is also a language), and semantic information can be

described only using the observer’s natural language.

It must be said that, as follows from the definition of information given above, physical

and semantic pieces of information form a single composite notion of information. Not a dual

notion, where both forms coexist in a superposition of quantum states, but a composition of

two sequential states with different levels of structural organization (where physical

information is at the lowest level of this organization).

Information processing is carried out in a hierarchical fashion, where the semantic information

of a lower level is transferred to the next higher level, where it becomes part of a structure of a

higher complexity. The process of such a new arrangement is carried out according to

subjective rules retained in a prototypical (referential) structure stored in the observer's

(system’s) memory.

Thus, in an information processing system, memory is the conserved information retained from

a previous attempt at semantic information processing.

An important consequence of the statement that “Information is a linguistic description” is

that information descriptions always materialize as a set of words, a fragment of text, or a

narrative. In this regard, an important note should be made: in biological systems, these text

sequences are written with nucleotide letters and amino acid signs. That turns biological

information into a physical entity, into a "thing", with its weight, length, and other physical

properties. For the purposes of our discussion, this is an extremely important remark, because

when we talk about information processing in biological systems we must take into account the

physical attributes of the processed information packages. That means a persistent demand

for physical space in which to arrange information flow, for local cell spaces in which to deposit

and handle information (memory), and for other space-demanding requirements.
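
The claim that a nucleotide text has weight and length can be checked with simple arithmetic. The sketch below assumes double-stranded B-form DNA and uses approximate textbook values (about 0.34 nm of helix per base pair and about 650 Da per base pair); these constants, the function name `physical_attributes`, and the example string are assumptions made for illustration and do not come from the paper.

```python
import re

RISE_NM_PER_BASE_PAIR = 0.34       # approximate helical rise of B-form DNA
MASS_DA_PER_BASE_PAIR = 650.0      # approximate average for double-stranded DNA
GRAMS_PER_DALTON = 1.66053907e-24  # unified atomic mass unit in grams


def physical_attributes(sequence):
    """Return the letter count, length (nm) and mass (g) of a DNA text."""
    text = sequence.upper()
    if not re.fullmatch(r"[ACGT]+", text):
        raise ValueError("expected a non-empty string over the letters A, C, G, T")
    n = len(text)
    return {
        "letters": n,
        "length_nm": n * RISE_NM_PER_BASE_PAIR,
        "mass_g": n * MASS_DA_PER_BASE_PAIR * GRAMS_PER_DALTON,
    }


# An arbitrary 21-letter string, chosen only to exercise the function.
EXAMPLE = physical_attributes("ATGGCCATTGTAATGGGCCGC")
print(EXAMPLE)
```

Both length and mass grow linearly with the number of letters, so every additional letter of such a text claims additional physical space.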

What follows from this is: Information processing is text processing! Not data processing, to

which we are all accustomed, but an as yet unknown text-processing paradigm. Lotfi Zadeh

proposed to call it Computing with Words, but in the past 50 years it has not taken off.

By clarifying and disentangling the physical and semantic information processing paths, we

immediately place in the right light the true abilities of the new sciences and technologies

developed to extract information and knowledge from the huge data volumes produced by

contemporary biological sciences and their experiments. All of them are declared to be

data-driven undertakings. In light of our findings, that means that only physical information can be

derived as the result of their efforts, not the desired semantic information, which (as you now

know) is the real goal of contemporary information handling and processing.

Only when equipped with the new knowledge about “What is information” can we take

part in deciphering and explaining real biological experiments and trials, for example,

experiments on transferring memory from one group of songbirds or snails to another group.

Such mechanical memory transfer from one group of living beings to another group can be

accomplished only if the memory is a physical thing, a material entity, and not an imagined

engram of spiking neurons, as is usually believed today.

Only when equipped with the new knowledge about “What is information” can we

understand the attempts to create artificial memory storage units similar to biological storage

units, with an unprecedented storage density (information volume versus the physical volume

of the nucleotides involved).

Brain information processing, brain cognitive disorders, and brain inflammations can

likewise be understood and successfully treated only if information-processing modeling rules

inspired by our latest findings are brought to the knowledge of the neurological

community. But that is not in the scope of this conference.

References

[1] Paulien Hogeweg, The Roots of Bioinformatics in Theoretical Biology, PLoS Computational Biology,

https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1002021

[2] Emanuel Diamant, Advances in Bioinformatics and Computational Biology: Don’t take them too seriously

anyway, https://arxiv.org/abs/1505.04785

[3] Emanuel Diamant, Does Wu’s Philosophy Define What Is Information? Proceedings 2017, 1(3), 88;

http://www.mdpi.com/2504-3900/1/3/88

[4] Emanuel Diamant, Shannon's Definition of Information is Obsolete and Inadequate. it is Time to

Embrace Kolmogorov’s Insights on the Matter, https://vixra.org/abs/1712.0494

[5] Emanuel Diamant, The brain is processing information, not data: Does anybody knows about that?

https://vixra.org/abs/1712.0493

[6] Emanuel Diamant, Unveiling the mystery of visual information processing in human brain, Brain

Research 1225(10):171-178, August 2008, https://www.researchgate.net/publication/222567891

[7] Diamant, E. Designing Artificial Cognitive Architectures: Brain Inspired or Biologically Inspired?

https://www.researchgate.net/publication/329582475