One of the challenges of working with ancient DNA samples is that damage accumulates over time, breaking the double helix into ever-smaller fragments. In the samples we have worked with, these fragments scatter and mix with contaminants, which makes reconstructing a genome technically difficult.
But a dramatic paper published Thursday shows that the scattering isn’t inevitable. Damage does break DNA into smaller and smaller fragments over time, but if those fragments are trapped in the right kind of material, they stay right where they are, essentially preserving some key features of ancient chromosomes even as the underlying DNA decays. Now, researchers have used that to piece together the chromosome structure of mammoths, with implications for how the mammals regulated some key genes.
DNA meets Hi-C
The backbone of the DNA double helix is made up of alternating sugars and phosphates, chemically linked together (the bases of DNA are chemically linked to these sugars). Damage from radiation, for example, can break these chemical bonds, with fragmentation increasing over time. When samples reach the age of something like a Neanderthal, few fragments are longer than 100 base pairs. Because chromosomes are millions of base pairs long, it was thought that this would inevitably destroy their structure, as many of the fragments would simply diffuse away.
But that will only be true if the medium they’re in allows for diffusion. And some scientists suspected that permafrost, which preserves the tissue of some now-extinct Arctic animals, might block that diffusion. So they set out to test this using mammoth tissues, obtained from a sample called YakInf that’s about 50,000 years old.
The challenge is that the molecular techniques we use to study chromosomes take place in liquid solutions, where fragments would simply drift apart anyway. So the team turned to an approach called Hi-C, which specifically preserves information about which bits of DNA were close together. It does this by exposing chromosomes to a chemical that links together all the bits of DNA that are physically close to each other. So even if those bits are fragments, they will still be stuck together by the time they get to a liquid solution.
A pair of enzymes are then used to convert these linked molecules into a single piece of DNA, which is then sequenced. This data, which contains sequence information from two different parts of the genome, then tells us that those parts were once close together in a cell.
Interpretation of Hi-C
By itself, a single piece of data like this isn’t that interesting; two pieces of genome can end up next to each other at random. But when you have millions of pieces of data like this, you can start to build a map of how the genome is structured.
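To make the idea of a map concrete, here is a minimal sketch in Python of how linked pairs could be tallied. It assumes each Hi-C read has already been resolved into two genome positions; the bin size, the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # hypothetical bin width, in base pairs


def bin_of(chrom, pos, bin_size=BIN_SIZE):
    """Map a genome position to a coarse (chromosome, bin index) label."""
    return (chrom, pos // bin_size)


def build_contact_map(pairs, bin_size=BIN_SIZE):
    """Count how often each pair of bins was linked together."""
    contacts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in pairs:
        a = bin_of(chrom_a, pos_a, bin_size)
        b = bin_of(chrom_b, pos_b, bin_size)
        # Sort the two bins so that (a, b) and (b, a) share one counter.
        contacts[tuple(sorted((a, b)))] += 1
    return contacts


# Toy data: each entry is one sequenced molecule joining two positions.
pairs = [
    (("chr12", 1_200_000), ("chr12", 1_900_000)),
    (("chr12", 1_950_000), ("chr12", 1_100_000)),
    (("chr12", 1_500_000), ("chr12", 7_300_000)),
    (("chr12", 4_000_000), ("chr3", 250_000)),
]
contact_map = build_contact_map(pairs)
for bins, count in sorted(contact_map.items()):
    print(bins, count)
```

With only four pairs the map is nearly empty; the structure only becomes visible once millions of pairs have been counted.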
There are two basic rules that govern the pattern of interactions that we expect to see. The first is that interactions within a chromosome are more common than interactions between two chromosomes. The second is that, within a chromosome, parts that are physically closer together on the molecule are more likely to interact than parts that are farther apart.
So if you look at a specific segment of, say, chromosome 12, most of the sites that Hi-C shows it interacting with will also be on chromosome 12. And the frequency of interactions will increase as you move to sequences that are closer and closer to the segment that you’re interested in.
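Both rules can be checked directly against a list of linked pairs. The sketch below is illustrative only: the function names and the toy pairs are invented for this example, and real data would involve millions of pairs rather than eight.

```python
from collections import Counter


def cis_fraction(pairs):
    """Fraction of pairs whose two ends lie on the same chromosome."""
    same = sum(1 for (chrom_a, _), (chrom_b, _) in pairs if chrom_a == chrom_b)
    return same / len(pairs)


def contacts_by_distance(pairs, bin_size):
    """Count same-chromosome pairs, grouped by separation in units of bin_size."""
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in pairs:
        if chrom_a == chrom_b:
            counts[abs(pos_a - pos_b) // bin_size] += 1
    return dict(sorted(counts.items()))


# Hypothetical toy data: six pairs within chr12, two between chromosomes.
pairs = [
    (("chr12", 100_000), ("chr12", 150_000)),
    (("chr12", 400_000), ("chr12", 900_000)),
    (("chr12", 2_000_000), ("chr12", 2_300_000)),
    (("chr12", 1_000_000), ("chr12", 2_500_000)),
    (("chr12", 500_000), ("chr12", 1_700_000)),
    (("chr12", 200_000), ("chr12", 5_900_000)),
    (("chr12", 3_000_000), ("chr7", 800_000)),
    (("chr3", 100_000), ("chr12", 100_000)),
]

print(cis_fraction(pairs))                     # share of within-chromosome pairs
print(contacts_by_distance(pairs, 1_000_000))  # counts per megabase of separation
```

In this toy set, most pairs stay within one chromosome, and the counts fall off as the separation grows, which is the pattern the two rules predict.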
So the expected pattern alone lets you use Hi-C to help reconstruct a chromosome even if you start with nothing but fragments. But the exceptions to the expected pattern also tell us things about biology. For example, genes that are active tend to be on loops of DNA, with the two ends of the loop held together by proteins; the same is true for inactive genes. Interactions within these loops tend to be more frequent than interactions between them, which subtly changes the frequency with which two fragments are linked together during Hi-C.
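One common way to surface such exceptions is to divide each observed count by the average count seen at the same separation, so that values above 1 mark pairs that interact more than distance alone would predict. The sketch below illustrates that general idea and is not the analysis used in the paper; the matrix is a made-up contact map for five equal-sized bins on one chromosome, with bins 1 and 3 standing in for two sites within a single hypothetical loop.

```python
def expected_by_distance(matrix):
    """Average contact count for each separation (each matrix diagonal)."""
    n = len(matrix)
    expected = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    return expected


def observed_over_expected(matrix):
    """Ratio of each observed count to the average at that separation."""
    expected = expected_by_distance(matrix)
    n = len(matrix)
    return [
        [
            matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical symmetric contact counts; (1, 3) is elevated for its distance.
matrix = [
    [20, 10, 4, 2, 1],
    [10, 20, 10, 12, 2],
    [4, 10, 20, 10, 4],
    [2, 12, 10, 20, 10],
    [1, 2, 4, 10, 20],
]

ratio = observed_over_expected(matrix)
print(round(ratio[1][3], 2))  # enriched pair
print(round(ratio[0][2], 2))  # same separation, no enrichment
```

The two printed pairs are the same distance apart, so any difference between their ratios reflects something other than distance, such as a loop.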