Malicious code written into DNA infects the computer that reads it


A team of biologists and security researchers has successfully infected a computer with a malicious program coded into a strand of DNA.

It may sound like science fiction, but rest assured that it is real. The possibilities this project suggests are equally fascinating and terrifying to contemplate. That said, there is no need to worry about this particular threat vector any time soon.

The University of Washington's multidisciplinary team did not set out to make outlandish headlines, though they ended up doing just that. Having found elementary vulnerabilities in open-source software used in labs around the world, they were mainly concerned that the security infrastructure around DNA transcription and analysis was inadequate. Given the nature of the data usually being handled, that could become a serious problem.

Any competent attacker would most likely go after such a system with the usual malware and remote access tools. Still, the discriminating security professional wants to stay ahead of the game.

Professor Tadayoshi Kohno, who has a history of pursuing unusual attack vectors for embedded and niche electronics like pacemakers, said, “One of the big things we try to do in the computer security community is to avoid a situation where we say, ‘Oh shoot, adversaries are here and knocking on our door and we’re not prepared.’”

From left, Lee Organick, Karl Koscher, and Peter Ney from the UW’s Molecular Information Systems Lab and the Security and Privacy Research Lab prepare the DNA exploit for sequencing

Luis Ceze, one of the study’s co-authors, added, “As these molecular and electronic worlds get closer together, there are potential interactions that we haven’t really had to contemplate before.”

Accordingly, they made the leap that plenty of sci-fi writers have made in the past, and that we are currently exploring via tools like CRISPR: DNA is basically life’s file system. The analysis program reads a DNA strand’s bases (adenine, thymine, guanine, and cytosine: the A, T, G, and C we all know) and turns them into binary data. But what if those nucleotides were encoding binary data in the first place? After all, it’s been done before, right down the hall.

Here comes the crazy science
Here’s how it works. The transcription application reads the raw data coming from the transcription process and sorts through it, searching for patterns and converting the base sequences it finds into binary code.

Asked for more technical detail, co-author Karl Koscher replied, “The conversion from ASCII As, Ts, Gs, and Cs into a stream of bits is done in a fixed-size buffer that assumes a reasonable maximum read length.”

That makes it ripe for a basic buffer overflow attack, in which a program ends up executing arbitrary code because its input falls outside expected parameters. The researchers cheated a bit by introducing this particular vulnerability into the software themselves, but they pointed out that similar ones are present elsewhere, just not as conveniently placed for purposes of demonstration.
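The missing safeguard here is a length check before the read is packed into the fixed-size buffer. The sketch below only illustrates that check; it is not the researchers' code. The capacity `MAX_READ_BASES` and the helper names are hypothetical, and Python raises an error rather than overwriting memory, so it cannot reproduce the overflow itself.

```python
# Hypothetical "reasonable maximum read length"; the article gives no real value.
MAX_READ_BASES = 64
# Each base takes two bits, so four bases fit in one byte.
BUFFER_BYTES = MAX_READ_BASES // 4


def required_bytes(num_bases: int) -> int:
    """Bytes needed to hold num_bases bases at two bits each (rounded up)."""
    return (num_bases + 3) // 4


def fits_in_buffer(read: str, buffer_bytes: int = BUFFER_BYTES) -> bool:
    """Return True only if the packed read would fit in the fixed buffer."""
    return required_bytes(len(read)) <= buffer_bytes
```

Under these assumed sizes, a read at the expected maximum passes the check, while a 176-base read would need 44 bytes and should be rejected before any conversion happens.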

They built the exploit itself after developing a way to include executable code in the base sequence. Ironically, it’s inaccurate to call it a virus, though it’s closer to a “real” virus than perhaps any malicious code ever written.

“The exploit was 176 bases long. The compression program translates each base into two bits, which are packed together, resulting in a 44 byte exploit when translated,” Koscher wrote.

Koscher confirmed that, given that there are four bases, it makes sense to have each represent a binary pair. (For curious minds: A=00, C=01, G=10, T=11.)
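The arithmetic behind the 44-byte figure can be checked with a short sketch of the two-bit packing. The base-to-bits mapping is the one quoted above. The bit order within each byte (first base in the most significant bits) and the function name `pack_bases` are assumptions made for illustration, not details confirmed by the researchers.

```python
# Two-bit code per base, as given in the article.
BASE_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack_bases(seq: str) -> bytes:
    """Pack a base string into bytes, four bases per byte.

    Assumptions: the first base of each group lands in the most significant
    bits, and the sequence length is a multiple of four.
    """
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # Shift previous bases left and append this base's two bits.
            byte = (byte << 2) | BASE_BITS[base]  # KeyError on non-ACGT input
        out.append(byte)
    return bytes(out)
```

With this mapping, the four bases `ACGT` pack into the single byte `00011011`, and any 176-base sequence packs into 176 × 2 / 8 = 44 bytes.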

He continued, “Most of these bytes are used to encode an ASCII shell command. Four bytes are used to make the conversion function return to the system() function in the C standard library, which executes shell commands, and four more bytes were used to tell system() where the command is in memory.”


Essentially, as soon as the code in the DNA is converted from ACGTs to 00011011s, it escapes the program and executes commands on the system, which is a sufficient demonstration that the threat vector exists. And if you wanted to do more than break out of the app, there’s plenty of room for more code.

Lee Organick, a research scientist who worked on the project, said, “At 176 bases, the DNA strand comprising the exploit is by almost any biological standard, very small.”

Biopunk future confirmed

In pursuance of every science journalist’s prime directive, which is to take interesting news and turn it into an existential threat to humanity, TechCrunch put a few more questions to the team.

“CONCEIVABLY, could such a payload be delivered via, for example, a doctored blood sample or even directly from a person’s body? One can imagine a person whose DNA is essentially deadly to poorly secured computers,” they asked.

“A doctored biological sample could indeed be used as a vector for malicious DNA to get processed downstream after sequencing and be executed,” Organick wrote.

“However, getting the malicious DNA strand from a doctored sample into the sequencer is very difficult with many technical challenges,” he continued. “Even if you were successfully able to get it into the sequencer for sequencing, it might not be in any usable shape (it might be too fragmented to be read usefully, for example).”


It is not quite the biopunk apocalypse one might envision, but the researchers do want people to think along these lines, at least as potential avenues of attack.

“We do want scientists thinking about this so they can hold the DNA analysis software they write to the appropriate security standards so that this never makes sense to become a potential attack vector in the first place,” said Organick.

“I would treat any input as untrusted and potentially able to compromise these applications. It would be wise to run these applications with some sort of isolation (in containers, VMs, etc.) to contain the damage an exploit could do. Many of these applications are also run as publicly-available cloud services, and I would make isolating these instances a high priority,” Koscher added.

The likelihood of an attack like this actually being pulled off is minuscule, but it’s a symbolic milestone in the increasing overlap between the digital and the biological.

The findings and process (PDF) will be presented by the researchers at the USENIX Security conference in Vancouver next week.


