46:44
Fitness, Health, Telomeres
02/21/2024
Season 1
Episode 23

Causal Epigenetic Age Uncouples Damage and Adaptation


Show Notes

Machine learning models that use DNA markers can estimate the age of biological samples. However, understanding why these markers change with age is challenging because it’s hard to prove that these changes cause aging-related traits.

In this week’s Everything Epigenetics podcast, I speak with Kejun Ying, who uses large datasets to find specific DNA markers that directly influence aging traits.

We explore his recently published study, which found causal CpGs that speed up aging and others that protect against it.

Kejun and colleagues created two new models, DamAge and AdaptAge, to measure harmful and beneficial changes related to aging. DamAge, which indicates negative aging effects, is linked to several health risks, including higher chances of dying. AdaptAge, on the other hand, shows positive aging adaptations.

Interestingly, only the negative changes seen in DamAge can be reversed by a process that makes aged cells young again.

The research findings provide a detailed understanding of the DNA markers that truly affect lifespan and overall health as we age. This helps us develop more accurate aging biomarkers and evaluate treatments aimed at reversing aging, improving longevity, and understanding events that speed up the aging process.

In this podcast you’ll learn about:

– Kejun’s unique journey into the aging field
– One of the biggest weaknesses of the epigenetic clocks (separating causation versus correlation)
– Mendelian randomization
– Causal inference
– Why causality matters for aging biomarkers
– Why it is important to separate deleterious and protective changes in aging
– DamAge (causal aging clock based on damaging sites)
– AdaptAge (causal aging clock based on protective sites)
– The applications of DamAge and AdaptAge
– ClockBase: a comprehensive platform for biological age profiling in human and mouse
– The application of ClockBase
– Data privacy when using ClockBase

Transcript:

hannah_went (00:01.509)
Welcome to the Everything Epigenetics podcast, Albert. I’m glad you’re here today. Thank you so much for being here.

albert_kejun_ying (00:09.55)
Thank you so much for having me here.

hannah_went (00:12.009)
Yeah, I’d love to start. I know I gave our audience just a really brief introduction on you and where you are now. I just want to know more about you though. So what’s your background? How’d you get here? Did you have more of a traditional or non-traditional path in terms of, you know, finding out that you’re interested in the subject that we’re going to be talking about today?

albert_kejun_ying (00:34.31)
Okay, so actually my journey to the field of aging is quite interesting. Throughout my childhood, I always wanted to be a scientist. But at that time, I was actually thinking about being a physicist, a theoretical physicist. But very soon I realized that many physicists, well, no offense, but they died before their theories were proven to be true or wrong.

This is because it’s a hard problem and human lifespan is too short. And at that moment I realized that aging is probably a bigger problem to be addressed. And so that’s actually the main reason I studied biology during my undergrad at Sun Yat-sen University in China. And actually I was super lucky. In my first year of college I was able to join a research lab.

aging-related traits. So they studied the telomere. And also, throughout my undergraduate, I was actually super lucky to be exposed to various different subfields of aging. So in the second year, I was a visiting student at UC Berkeley, and at that time I was in

albert_kejun_ying (02:04.47)
And after that, I was at the Buck Institute. So the Buck Institute is an aging research institute. I was in Judith Campisi’s lab. She studied cellular senescence. And one year after that, I went to the UW, so the University of Washington. And there, I joined Matt Kaeberlein’s lab, and I was working on a project related to the repurposing of acarbose

a brain mitochondrial disease. So for all these experiences, I’m super grateful. They allowed me to get exposed to the aging field even before my graduate study. And they also basically let me know what I wanted to study during my PhD. So it actually boiled down to basically two very basic questions.

albert_kejun_ying (03:03.97)
second is, how do we quantify the aging process? And this is the main reason I joined Vadim Gladyshev’s lab.

hannah_went (03:13.689)
Yeah, well, you have a lot of amazing experience. I didn’t realize, I guess, how much experience you had. It’s like almost you touched like half of the hallmarks of aging, right? But because I think that’s important. I’ve always been interested in science since I was a little girl, but it’s kind of like, it’s so broad, you don’t know where to start. You really kind of just have to dive in and get experience at all these different labs, know what you like to learn about and understand. So that’s really great.

albert_kejun_ying (03:23.931)
Yeah.

albert_kejun_ying (03:35.19)
Yeah, and secondly…

hannah_went (03:43.669)
close to all of that before your graduate studies now in Vadim’s lab at Harvard. So I love what you’re studying. And of course, that’s what we’re going to be mainly talking about today is this new paper that you have and talk about how that relates to epigenetics. So Albert, one of the biggest weaknesses of these epigenetic clock predictors that are now available is separating the causation versus correlation of course, as you know.

you speak to this problem and maybe before we introduce your paper, introduce our audience to the issue with correlation versus causation.

albert_kejun_ying (04:25.97)
So, almost all the predictive models, they are based on spurious correlation. So basically they’re built because your predictor somehow has a correlation with the trait you want to quantify. So this is actually not a big problem if you’re just trying to predict the number itself. So for example, for clocks, if we just want to predict the age of that person,

albert_kejun_ying (04:55.93)
Because the correlation structure is usually stable in the population, your predictor will be reliable for predicting the age. But the problem is that we build the clock not for predicting age, right? Because we know your age, why do I want to predict that? So what we really want to do is use the clock to measure age acceleration. So basically we want to say this person looks older than average.

can be fine as long as the person you are measuring is represented in the population of the training dataset of your clock. So basically, for example, for a methylation clock, this person has a very similar correlation structure of DNA methylation and age compared to your training dataset. But the problem is when you apply your clock

usual intervention. For example, if you have, say, cellular reprogramming, which directly impacts your epigenetic landscape, or you have some weird drug or some different intervention that was never seen in the training datasets, it will be problematic, because the correlation structure may be disrupted by this intervention. So I think if a more is

albert_kejun_ying (06:25.49)
example is that there’s a very strong correlation between the chocolate consumption rate in a country and its number of Nobel Prize laureates. So it’s actually a 2012 Lancet paper. So basically, you can consider this a kind of clock model. So basically, you can use chocolate consumption

hannah_went (06:36.689)
Ha ha ha!

albert_kejun_ying (06:54.95)
there’s no problem because there’s an underlying correlation structure. So basically my assumption is that the general economics is related to the chocolate consumption rate and also separately related to the Nobel Prize laureates. And as long as this correlation structure holds, this model will be accurate. So if you add a new country here, it’s still not a problem. But the problem begins to manifest when you start to do interventions.

them, one of the countries, they say, okay, based on this model, we want to increase our number of Nobel Prize winners. Let’s encourage people to eat more chocolate. And then, based on this model, you will see this country will have a super high predicted value for the number of Nobel Prize laureates, but it will probably have no effect because

hannah_went (07:35.254)
Ha ha ha.

albert_kejun_ying (07:55.47)
But instead, if we build a model, even a less predictive one, using what we call a causal biomarker, or causal predictor, so say we use a predictor that really has a causal effect on the outcome trait. So for the number of Nobel Prize laureates, we can use something like the number of universities in the country or the quality of the publications from this country. And if we use this predictor, it will be robust

any intervention. So for example, if the country says, okay, we encourage people to use more universities to increase the quality of the publications, it will really have an effect on your Nobel Prize laureate number. So the model will stay robust under different interventions. That’s the whole point.
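Editor’s illustration: the chocolate and Nobel Prize argument above can be sketched as a small simulation. This is a toy model, not anything from the paper. It assumes a hidden "wealth" variable that drives both chocolate consumption and the number of universities, and it assumes only universities affect prizes. All names, coefficients, and noise levels are made up for illustration.

```python
import random


def simulate(n, seed, chocolate_shift=0.0, university_shift=0.0):
    """Toy world: wealth drives chocolate and universities; only universities drive prizes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        wealth = rng.gauss(0, 1)  # hidden confounder (assumed)
        chocolate = wealth + rng.gauss(0, 0.5) + chocolate_shift  # no effect on prizes
        universities = wealth + rng.gauss(0, 0.5) + university_shift
        prizes = 2.0 * universities + rng.gauss(0, 0.5)  # assumed true causal effect: 2.0
        rows.append((chocolate, universities, prizes))
    return rows


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def mean_prizes(rows):
    return sum(row[2] for row in rows) / len(rows)


def compare(n=5000, seed=1):
    """Predicted vs. simulated change in prizes after raising each predictor by one unit."""
    base = simulate(n, seed)
    chocolate, universities, prizes = zip(*base)
    return {
        "chocolate_predicted": slope(chocolate, prizes),
        "chocolate_actual": mean_prizes(simulate(n, seed, chocolate_shift=1.0)) - mean_prizes(base),
        "university_predicted": slope(universities, prizes),
        "university_actual": mean_prizes(simulate(n, seed, university_shift=1.0)) - mean_prizes(base),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.2f}")
```

In this toy world the chocolate-based model predicts a sizeable gain in prizes from eating more chocolate, while the simulated gain is zero. The university-based model’s prediction matches the simulated gain, which is the sense in which a causal predictor stays robust under intervention.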

hannah_went (08:47.449)
I think that the first part of what you were saying there, and you can correct me if I’m wrong, but the first generation clocks, those are trying to predict your chronological age, right? So correlation causation, maybe a little bit more negligible there. Whereas the second generation clocks that are being created are including underlying biological aging phenotypes, but we need to know if those CpG methylation markers that they’re using within those algorithms are purely correlation or causation

So that’s really I think what you’re chipping away at and is going to be hopefully what we’re able to understand more is why is this actually happening? You know, what is aging? I think everyone is trying to understand that question. And I love the example that you gave there. I’ve heard a lot of, you know, people talk actually a couple people from from Harvard.

albert_kejun_ying (09:18.1)
Yeah.

hannah_went (09:47.229)
you know, divorce rates and something really silly. And I think that’s a good way to put it so people can understand and relate to it more. So of course, our main focus is going to be just that, you know, you have your paper titled, Causal Epigenetic Age Uncouples Damage and Adaptation. So you built the first ever causal aging biomarkers that could separate actual damage and adaptation using the

albert_kejun_ying (09:49.135)
Thank you.

hannah_went (10:17.249)
very, very new subject to the field. And I’m sure to my listeners as well. So, you know, you, you took this approach by using more of this, what we call this Mendelian randomization. So would you want to introduce our audience to that and maybe how you, you went about that?

albert_kejun_ying (10:34.81)
Sure. So before I get into Mendelian randomization, let’s first start with causal inference. So what do we mean when we say an exposure or a risk factor is causal to the outcome or the disease or care? So basically, in the traditional way, the gold standard to establish

albert_kejun_ying (11:04.79)
control trial. So basically it’s what all the drug companies are doing right now. So basically you have a group of people, you randomly assign them into two arms, and in one arm you give the intervention, and in the other arm you give placebo or do nothing, and then the difference between these two arms in the outcome trait is the causal effect of that intervention. So the key part here, so why

randomized control trial works, the key part is actually the randomization step. So the main reason is that with observational data, basically say you see a correlation between one phenotype and another phenotype, it doesn’t mean that the first phenotype causes the second phenotype. It could be just correlation. So in terms of methylation, for example, smoking can change a lot of the DNA methylation

albert_kejun_ying (12:05.01)
your lifespan. So usually you can see some of the CpG sites associated with lifespan or with age, but it’s actually because smoking has an effect on both the predictor, or the exposure or the risk factor, and the outcome, the lifespan. So for things that have an effect on both risk factor and outcome, we call them confounding factors or confounders. So the randomization step

example, go back again to this smoking-related methylation site. So in a cohort, without any trial or without any intervention, you cannot control the methylation value of that specific CpG site. So basically, you will see people who smoke will have a higher methylation value, and people who don’t smoke have a lower methylation value at that CpG site.

albert_kejun_ying (13:04.81)
of that methylation site and smoking itself. But if you really want to study the causal effect of that CpG site, you will need to do an intervention. So basically you will need to have some genetic intervention or have some drug to specifically target that CpG site. And the key part is that you need to first randomize your cohort. So you can separate the methylation of that specific site and the smoking.

purely look at the effect of the CpG site. But as you may realize, it’s not realistic to perform a clinical trial or controlled trial on DNA methylation because there is just too huge a number of DNA methylation sites across the genome. It’s basically impossible to do that. And Mendelian randomization is another approach that takes advantage of genetic information.

albert_kejun_ying (14:04.77)
randomized control trial. So basically, in Mendelian randomization, instead of looking at the DNA methylation itself, we are looking at the genetic variant. So basically, the mutation that comes from your ancestor that’s located near that specific CpG site. And in the meantime, we have another dataset that shows that this mutation actually has an effect on the methylation level.

So how I think of it is, it’s kind of like a CRISPR screen, but instead of doing CRISPR yourself, you allow nature to introduce different kinds of genetic variants, and they affect the DNA methylation level to different extents. So you use these genetic variants as instruments. And why it works is that when those genetic variants pass from parents to the offspring, they are actually randomly allocated.

So if one parent has one genetic variant in their allele, it will pass to the offspring with a 50% chance. So this step is actually mimicking the randomization step in the randomized controlled trial. So that’s basically how Mendelian randomization works.
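Editor’s illustration: the instrument idea described above can be sketched with a simulated cohort and the Wald ratio, a standard single-instrument Mendelian randomization estimator. This is a simplified one-sample toy, not the epigenome-wide pipeline used in the study. The effect sizes, the "smoking" confounder, and all function names are assumptions made up for the example.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate_cohort(n=50000, true_effect=-0.5, seed=0):
    """Simulate genotype, methylation, and lifespan with smoking as a confounder."""
    rng = random.Random(seed)
    genotype, methylation, lifespan = [], [], []
    for _ in range(n):
        # Each of two alleles is inherited with a 50% chance, independent of lifestyle.
        g = int(rng.random() < 0.5) + int(rng.random() < 0.5)
        smoking = rng.gauss(0, 1)  # confounder: raises methylation, shortens lifespan
        m = 0.4 * g + 0.8 * smoking + rng.gauss(0, 1)
        y = true_effect * m - 1.0 * smoking + rng.gauss(0, 1)
        genotype.append(g)
        methylation.append(m)
        lifespan.append(y)
    return genotype, methylation, lifespan


def naive_slope(methylation, lifespan):
    """Observational regression slope; biased here because smoking affects both variables."""
    return covariance(methylation, lifespan) / covariance(methylation, methylation)


def wald_ratio(genotype, methylation, lifespan):
    """Variant-outcome association divided by variant-exposure association."""
    return covariance(genotype, lifespan) / covariance(genotype, methylation)


if __name__ == "__main__":
    g, m, y = simulate_cohort()
    print(f"naive estimate: {naive_slope(m, y):.2f}")
    print(f"Wald ratio:     {wald_ratio(g, m, y):.2f}")
```

Because the simulated variant is allocated at random and acts on lifespan only through methylation, the Wald ratio lands near the assumed true effect, while the naive slope is pulled away from it by the confounder.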

hannah_went (15:24.03)
Hmm.

hannah_went (15:28.149)
Yeah, so you’re taking, let me see if I can get this right. You’re taking more of the genetic variant that may be the underlying reason as to why the methylation may be the way it is.

albert_kejun_ying (15:41.55)
Yes, so we use that as an instrument for the methylation itself.

hannah_went (15:42.876)
For

hannah_went (15:47.189)
Perfect. Yeah, so you’re taking all of this very large scale data and leveraging it. So you’re taking the large scale genetic based data and you’re performing the epigenome wide Mendelian randomization to identify the methylation sites or those CpG sites for causal age related traits. Is that a good statement?

albert_kejun_ying (16:09.83)
Yes, that’s correct. Yeah.

hannah_went (16:11.609)
Okay, perfect. So, so you’re, you’re looking at these CpG sites, um, and some of these may be deleterious. Some of them may be protective changes in aging. So you’re, you’re trying to categorize these and why is it important to separate, um, you know, what’s causing maybe harm and damage versus what’s actually helping. Um, what are we able to know from that?

albert_kejun_ying (16:34.35)
So this is actually probably a myth in the aging field. People tend to believe that if something changes with your age, this thing is bad. So somehow people kind of believe that. But this is actually, well, it was never proven to be true or false, but this is not very reasonable because the human body is a very complex system.

albert_kejun_ying (17:04.45)
when some damage happens during aging. Your body can, there’s some program in your gene regulatory network or some genetic program that can act against this change or this damage. So this is something we call the response to the damage, or the adaptation. So basically, if you look at the change of, say, gene expression during aging,

would be really doing this damage to your body and part of the change could be protective. So there’s actually a paper from our collaborator years ago, so basically they were able to show that the genes that have differential expression during disease are actually mostly caused by the disease instead of causing the disease. So basically all the gene

albert_kejun_ying (18:04.53)
are mostly the body’s response to disease, not the cause of the disease. And also when you do an intervention, you really want to target those changes that are really damaging your body. But you definitely don’t want to target a change that’s adaptive. I can give you

hannah_went (18:09.269)
Mm-hmm.

albert_kejun_ying (18:34.59)
function. So they will probably wear glasses. So if you just look at the correlation, people will have weaker vision and people also wear glasses. So there are two age-related changes. And you want to have an intervention. So okay, let’s target the easy thing, let’s remove people’s glasses. So this is actually an example from Vadim. So I just do it. But I think it’s really, really good. So

albert_kejun_ying (19:04.81)
You can remove the adaptive change once you cure the damaging change, but you don’t want to remove the adaptation first. So I think that’s the key reason why it’s important to understand which one could potentially be damage and which one could potentially be adaptation.

hannah_went (19:23.089)
Yeah, that’s a good example. I like that one. I’m gonna use that moving forward and we’ll use that clip to explain to viewers. And that scares me. My eyesight, I just went to the eye doctor a couple of weeks ago. I feel like it’s already starting to decline and I would like to think I’m still young chronologically. I don’t wear glasses or contacts yet. So yes, it’s all about looking at, okay, is this actually a response from the disease

albert_kejun_ying (19:40.071)
Bye.

hannah_went (19:56.751)
the disease because if we’re looking at the response of a disease thinking it’s the cause, then we could be totally leading ourselves in the wrong direction, right? Yeah. So, again, I think your work is super important in this area and we’re going to keep understanding how we can be ready to run randomized controlled trials for the epigenetic methylation, know

to move in different directions. So in this paper, you actually developed that framework that I’ve been talking about into these epigenetic clock models. And you constructed, is it just DamAge? Is it D-A-M, DamAge? DamAge? Okay, I love it. I love it. So you constructed DamAge and then AdaptAge that actually measure these age-related damaging and adaptive changes, respectively.

albert_kejun_ying (20:39.331)
Is it DamAge? Yes.

hannah_went (20:53.069)
about what DamAge is and what AdaptAge is and maybe the outcome and what those mean.

albert_kejun_ying (20:57.69)
Yeah, but before that I want to first clarify that during the whole process, we call it causal CpG in the preprint, but it’s actually, basically it’s not proven to be causal. It just fits the causal model in the Mendelian randomization analysis. So they’re actually putative causal CpG sites. But it would be very wordy if every time we said putative causal CpG site,

hannah_went (21:15.079)
Mm.

albert_kejun_ying (21:27.83)
we just say causal, but keep in mind it’s not really causal. And also, for causal sites, we are not talking about sites that have a direct effect on your aging or lifespan. They go through some pathways. So for example, some CpG sites may increase, for example, the probability of smoking or drinking alcohol, something like that. So some of them may go through the behavioral path

it goes through some more molecular mechanism. So it’s still unclear at this stage. And for the damage and adaptation, this is because, as I described before, we want to separate the potential damage and adaptation. And since we have Mendelian randomization, we can actually estimate the effect of each CpG site on the outcome trait. So basically on lifespan.

Since we have this causal effect and we have the direction of the causal effect, we can compare this direction to the direction of the age-related change. So for example, say you have a CpG site where higher methylation will extend lifespan, as the causal effect. But you see this CpG methylation level actually also increases during aging. This is what we call a protective increase or adaptive increase.

albert_kejun_ying (22:58.13)
increase during your age. So it’s totally good. Like glasses. So glasses in general improve people’s vision, and people put on glasses as they age. So it’s a good thing. We don’t want to stop that. And vice versa. If you have a CpG site that decreases the lifespan and you also see the CpG methylation increase during

albert_kejun_ying (23:27.91)
and increasing, so we want to target that. And by separating these two groups of CpG sites, we can build two clocks separately, one we call AdaptAge, one we call DamAge. So DamAge is built purely using the sites we identify as potentially damaging sites. And vice versa, AdaptAge only uses the potential adaptation sites.
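Editor’s illustration: the rule just described, comparing the direction of a site’s putative causal effect on lifespan with the direction of its age-related change, can be sketched as a sign check. This is a minimal reading of the spoken explanation, not the authors’ implementation. The `CpG` class, the field names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CpG:
    site_id: str                 # hypothetical identifier
    effect_on_lifespan: float    # putative causal effect of higher methylation on lifespan
    change_with_age: float       # direction of methylation change as age increases


def classify(cpg: CpG) -> str:
    """Label a site by whether its age-related change moves toward or away from longer life."""
    product = cpg.effect_on_lifespan * cpg.change_with_age
    if product > 0:
        return "adaptive"    # methylation drifts in the lifespan-extending direction
    if product < 0:
        return "damaging"    # methylation drifts in the lifespan-shortening direction
    return "undetermined"    # no detectable effect or no change with age


if __name__ == "__main__":
    examples = [
        CpG("site_A", effect_on_lifespan=0.3, change_with_age=0.02),    # protective, increases
        CpG("site_B", effect_on_lifespan=-0.3, change_with_age=0.02),   # harmful, increases
        CpG("site_C", effect_on_lifespan=0.3, change_with_age=-0.02),   # protective, decreases
    ]
    for cpg in examples:
        print(cpg.site_id, classify(cpg))
```

Under this reading, sites labeled damaging would feed a DamAge-style clock and sites labeled adaptive would feed an AdaptAge-style clock. How the actual clocks weight and select sites is not covered in the conversation.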

believe these two clocks can separately capture the damage and adaptation. So one example we found is that we applied our clocks to some human cohort and looked at the association between the age acceleration of these two clocks and mortality. So we found that DamAge actually has a positive correlation with the mortality risk. So basically if you look older if

albert_kejun_ying (24:27.71)
will have higher mortality risk. But AdaptAge actually has a super weak, even negative, association with mortality. So basically, if people look older on these adaptation sites, it will eventually not affect their mortality. It will even have a slightly negative effect. So basically they

albert_kejun_ying (24:57.73)
main thing. And we also applied them to the iPS data. So the reprogramming. So it shows that only the DamAge is reversed during the iPS reprogramming, but AdaptAge is not reversed. It’s even slightly increased. So basically we believe that these two groups of CpG sites, they have a very

albert_kejun_ying (25:27.911)
or the mortality related research trait.

hannah_went (25:33.089)
Sure. So yeah, the, the AdaptAge, that’s going to be the, the good sites, the good CpGs that we want to look at, whereas the DamAge is going to be the bad. So we need to separate the two to be able to see how these respond to different things or interventions. And like you just mentioned, one of the interesting findings I thought when I was reading your paper, um, that the DamAge is going to be reversed during reprogramming and the AdaptAge is actually going to be, be increased. So I like that you mentioned that there. Um,

albert_kejun_ying (25:44.734)
Thank you.

hannah_went (26:03.069)
applications of both of these clocks are. And do they come out as just an age? Is that what you’re getting once you’re looking at that? Or what’s the actual outcome that you’re receiving?

albert_kejun_ying (26:14.51)
also the outcome is just age

hannah_went (26:17.569)
is an age. Okay. Perfect. And what other applications do you think this would fit in? Are people, are these ready? Do you think for clinical interventional trials or what do you think that these are going to be used for in the future?

albert_kejun_ying (26:32.83)
Yeah, I definitely don’t think this clock will replace all other clocks. It’s just another new clock. But I think what’s important is that here, we emphasize the importance of basically building a causality-informed model. And we believe that it will be more robust when you’re testing new interventions. So basically, if you use a generic clock,

age is reversed but you actually don’t know what that means. So it could be your adaptation is reversed or it could be your damage is reversed. So you really want to have an intervention that’s targeting the damage but not the adaptation. But with AdaptAge and DamAge you will have at least some ability to separate these two things and when you’re testing your

hannah_went (27:18.231)
Yes.

albert_kejun_ying (27:33.19)
adjust that page. And we also show that our two clocks are quite sensitive to short-term treatment. So there are some examples, for example, a cigarette condensate treatment or a very short-term nutrient treatment, where the clock is able to show an effect at a very early stage. So basically I think the ultimate

albert_kejun_ying (28:02.91)
to discover novel interventions or novel risk factors that could accelerate your biological age. So I think the goal is the same for DamAge and AdaptAge, just you have a better

hannah_went (28:23.849)
And it makes a lot of sense. You know, I’ve said this several times on the show before. I spend most of my time digging through the interventional trials that are available and a lot of the research and informing healthcare providers on the data that’s available and what looks good. But again, there’s not a lot of definitive answers, right? It’s a lot of hypotheses still or kind of, yeah, we’re just learning. It’s so new. You know, there’s a lot that we can learn from these.

hannah_went (28:53.829)
true definitively would be to go back on every single interventional trial that’s been created or everything that’s been published and run these new algorithms on them that are available. And the data sets are there, but it’s just if people are willing to share those data sets, if it’s, you know, especially if it’s a privately funded study, but I think there’s only more that we could learn from that and more insight that we could gain. So I really hope to see these be used in the future and talked about more just so we can learn more about how they can be used

and how they react to certain interventions. Is there any other work then, Albert, that you’re doing with the Mendelian randomization at this time?

albert_kejun_ying (29:34.71)
Yeah, so we’re actually also looking at the effect of the… So here we look at the DNA methylation, but we also look at basically all other phenotypes. So not molecular phenotypes, just other lifestyle and disease factors, how they affect lifespan, how they affect clock measurements, using Mendelian randomization. So, yeah, this is almost done actually.

hannah_went (29:59.069)
But you need to send it to me when it’s available. And then we’ll have to chat later about that. I’m going to switch subjects completely. I kind of had our little agenda planned out. And then I remember receiving an email from you that you released a pre-print a couple of weeks ago that I am so excited to share with my audience. It is called that the pre-print is named

albert_kejun_ying (30:01.63)
Yeah. Yeah, yeah, sure.

albert_kejun_ying (30:16.234)
Yeah.

hannah_went (30:29.109)
in human and mouse. Albert, what is this? Tell everyone what you did, what you created, and we’ll go deeper into it as well.

albert_kejun_ying (30:38.61)
Yeah, it’s actually exactly what you mentioned before. So if you remember, before, you said there are many studies that have been done, and it is really good. They have data available. It would be good if you could easily apply the clocks on them. So ClockBase is exactly doing this. So we collected all the data from Gene Expression Omnibus. So it’s a huge database,

all the researchers upload their experimental data to this platform. So they have thousands of datasets, and each dataset contains a very different experimental design. So some people are testing some new drugs, some people are testing interventions. And most of them don’t study aging. Some of them study cancer, study immune disease, but they have methylation data and gene expression data. So since we have the methylation clock

and now transcriptomic clock, we can actually apply these clocks to these datasets and look at whether there is any intervention that people previously did that could potentially decrease or accelerate at least the aging clock. So ClockBase is a huge database that contains, I think now it has like 300,000 samples from GEO.

albert_kejun_ying (32:08.47)
methylation age based on existing clocks. And basically, it allows people even without a bioinformatics background, so even without the knowledge of how to build or how to apply a clock, to easily browse the methylation age of each sample through just a few clicks. And they can also do some statistical tests we have on the platform. So basically they can compare whether

would decelerate that aging process.

hannah_went (32:41.589)
Yeah, I think that’s the beauty of it. I’ve been on there and I was playing around with it a little bit and it’s very, very easy to use. So if you’re curious, just check it out and I’ll make sure to include everything if you’re wanting to play around. I haven’t been able to take a deep dive, so I’m setting aside some time to do that this weekend actually. But how did you build it? Did you think, and I guess, where did the idea come from? Was it that there is a need and you wanted to make this more publicly available?

albert_kejun_ying (33:12.35)
So yeah, so this website actually derived from a very small prototype. Before, it was called pseudo clock. So it was a tiny hackathon project I did actually with Electra. It was back at the end of 2021. And at that time, the website was not as mature as now.

albert_kejun_ying (33:41.33)
and then it will calculate the biological, the epigenetic age for you. So that’s it. We did not connect to any database. But very soon after that, we realized that we actually can do that, right? Because as a bioinformatician, I think every time I have some idea, I want to test it. First I look at a paper and see, oh, here’s GEO data. People did this experiment, we have data.

and test it. Every time you do it, it’s actually repetitive. So you just do the same thing every time you want to study something. And it becomes tedious very soon. So I realized, oh, how about I just download all of them and we just process all of them. And next time I want to use it, I just search them. And after that, I was like, okay, why don’t I just publish this? Maybe there are some domain experts

intervention better than me and they can actually look at the intervention they’re interested in and see how it’s affecting the epigenetic age. So this is actually the whole purpose of this project.

hannah_went (34:55.889)
Yeah, I think it’s a great way to build a community and bring people together, right? I’m sure many people have been reaching out to you already about different indications that they’ve been looking at and using your program for. So you identified this issue, you realized you were repeating yourself and kind of pulling up these databases and then you were like, well, why don’t I just make something like this and make it available to the public? So I think it’s such a great useful tool and something that we’ll be able to grow over time.

albert_kejun_ying (34:59.754)
Yeah.

albert_kejun_ying (35:18.456)
Yeah.

hannah_went (35:26.568)
Um, what types of clocks are on there, or what can people expect when they log on? And is it clockbase.com, or what is the link to go there? Is it .org? clockbase.org. If any of you all are interested, you can check that out and go there. But yeah, when people open it up, what is that journey like? What can they expect?

albert_kejun_ying (35:36.091)
.org.

albert_kejun_ying (35:39.866)
Yes.

albert_kejun_ying (35:47.65)
So basically, I think the first page is a UMAP plot. So basically it’s a 3D plot where we have each sample embedded in a lower dimensional space. So it’s a, you know, 3D plot where each point is a sample from GEO. So we have basically over 100,000 samples there.

albert_kejun_ying (36:17.73)
based on the methylation clock you selected, you can actually see the age of that sample, what the sample is, and which dataset it comes from. And then, after that, the next step would be to select the dataset you want to analyze. So basically there’s a search box, and you can search whatever intervention you are interested in.

albert_kejun_ying (36:48.07)
Yeah, and then there will just be a table where we pre-computed the clock for each sample. And we also have some metadata for the sample. And you can actually use this information to do some statistical tests on our website. So now we can do some correlation analysis or some t-tests to compare the groups. So basically the output is a pretty figure.

albert_kejun_ying (37:17.67)
and it will be ready for publication. So it’s really a streamlined process.
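As an illustration of the group comparison described above, here is a minimal sketch, assuming hypothetical precomputed clock values for a control group and a treated group. It computes Welch’s t statistic using only the Python standard library. This is not ClockBase’s actual code; the function name `welch_t` and the sample values are made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n)
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical epigenetic ages (years) predicted by one clock
control = [52.1, 49.8, 55.3, 51.0, 53.6]
treated = [45.2, 44.1, 47.9, 43.5, 46.0]

t_stat, dof = welch_t(control, treated)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A positive t here means the control group has the higher mean predicted age. Turning t and df into a p-value needs the t distribution, which a statistics package would normally supply.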

hannah_went (37:21.233)
Wow.

Yeah, I’m excited to do a deeper dive. So clockbase.org for anyone who’s interested, very, very new. And I’m sure it’s going to be a great asset that’s going to grow over time and just keep building and keep adding. One of the questions I frequently get when it comes to this epigenetic data is about privacy. So I’ve seen, I think, some people ask you about this on Twitter or in other threads. What would you say to people who

really want to use this application of yours, but maybe are a little bit worried about data privacy when using ClockBase?

albert_kejun_ying (38:02.13)
We actually do not store the data that the user uploads. We don’t even have this functionality, so don’t worry. The data will just be deleted after you close your session.

hannah_went (38:18.169)
Good, there you have it. If anyone’s worried, they’re not even tracking it, so it shouldn’t be a point of emphasis at all. Well, what is the future for ClockBase? What more do you think it can turn into, or do you plan on building on top of that as more information is available?

albert_kejun_ying (38:21.371)
Yeah.

albert_kejun_ying (38:35.55)
Yeah, so since we have all this information, all these interventions and all these samples with their epigenetic age available, the next step would be basically to systematically identify the interventions, or certain categories of intervention, that have the most potential to reverse or increase the epigenetic age. So for example, as an example in the preprint we put, we actually

albert_kejun_ying (39:05.45)
The biggest problem of the GEO data is that the metadata is not structured. People store their meta information in different formats. For example, you have a gender column, and someone says, okay, it’s sex, it’s gender. And in the column, they say female, male, or F, M. Someone even put zero and one. Nobody knows what that means. So I think this is really a barrier that we are figuring out how

to overcome. And after we can standardize the annotation of all the data, we can actually perform a systematic screening of all the experiments that have been done before, to see which experiments can potentially increase or decrease epigenetic age. So one example we found in our preprint is zebularine. It’s a cancer drug.
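The metadata problem described above (a column named sex or gender, with values like female, male, F, M, or zero and one) can be sketched as follows. This is a minimal illustration, assuming sample metadata arrives as a plain dictionary. The names `normalize_sex` and `extract_sex` and the accepted spellings are hypothetical, not ClockBase’s actual pipeline.

```python
SEX_KEYS = {"sex", "gender"}          # column names treated as equivalent
FEMALE_VALUES = {"female", "f", "woman"}
MALE_VALUES = {"male", "m", "man"}


def normalize_sex(raw):
    """Map a free-text value to 'female', 'male', or None if unclear."""
    if raw is None:
        return None
    value = str(raw).strip().lower()
    if value in FEMALE_VALUES:
        return "female"
    if value in MALE_VALUES:
        return "male"
    # Numeric codes such as 0/1 are ambiguous without a codebook,
    # so they are left unresolved rather than guessed.
    return None


def extract_sex(metadata):
    """Find a sex/gender field in a sample's metadata and normalize it."""
    for key, value in metadata.items():
        if key.strip().lower() in SEX_KEYS:
            return normalize_sex(value)
    return None


samples = [
    {"Gender": "F", "tissue": "blood"},
    {"sex": "Male "},
    {"sex": 0},
    {"tissue": "liver"},
]
print([extract_sex(s) for s in samples])
```

Leaving the zero and one codes unresolved is a deliberate choice in this sketch: guessing a coding convention would silently mislabel samples.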

albert_kejun_ying (40:05.91)
I know nothing about that drug, I just found it in the database. And this drug actually shows one of the strongest effects we observe in the whole GEO database, one of the strongest.

hannah_went (40:19.469)
What is it called again?

albert_kejun_ying (40:20.65)
Zebularine. It’s a cancer drug. So it’s consistently reversing the epigenetic age based on almost all the clocks, not almost, basically all the clocks we tested, and it reverses it by decades, but in a cell line. So we don’t know if this holds in humans, but it potentially could be something or nothing. But I think it’s just demonstrating this potential. So zebularine is just one example

hannah_went (40:24.063)
Yeah.

hannah_went (40:38.776)
Yeah.

albert_kejun_ying (40:51.373)
And domain experts could potentially find a lot of different things of the same kind in this data.

hannah_went (40:57.309)
I mean, I just have so many questions when you say, oh, it reversed it by, you know, almost a decade or whatever the exact number is. It’s like, there can be so many papers probably created off of that one finding and insights. So ClockBase is really, hopefully, like you said, a standard annotation people can use, almost like a standard language where people can come together and understand how all of this fits into the epigenetic world and space as it relates to more of those aging clocks. Are you pulling

this data from public databases, or are people able to submit data to you and you’re able to put it into ClockBase?

albert_kejun_ying (41:35.39)
Both are fine. So currently we only show the data that are publicly available. And people can also upload their data to get the result themselves. We don’t save them. But maybe, I’m not sure if it’s good or not, but maybe in the future we can develop a function so that people can choose whether they want to post their data in the database.

hannah_went (41:38.495)
Okay.

hannah_went (42:05.609)
Yeah, definitely. Well, I’m excited for ClockBase. I’m just glad we had the chance to talk about it. It’s so new, it’s so fresh. And I think it’ll be a very helpful tool for everyone wanting to know more in this space. And we’re getting to the end here, Albert. What are you doing now? You have so many great projects. What are you really trying to focus on in the short term, maybe the next year or so?

albert_kejun_ying (42:31.29)
So right now I’m in my fourth year. So actually I’m planning to graduate next year. So right now I’m planning to wrap up everything. And yeah.

hannah_went (42:40.489)
Yeah. What about after graduation? Any plans yet or thoughts?

albert_kejun_ying (42:47.411)
I want to be a postdoc. I want to stay in academia, so being a scientist is a childhood dream. So I want to postdoc somewhere and keep doing aging research from different angles.

hannah_went (42:49.911)
Okay.

Mm-hmm.

hannah_went (43:01.289)
Yeah. Well, I can’t wait to follow your work as you grow and follow your dreams. I think that’s so motivating. Um, my very last question for you, this is a curveball. You may know it if you’ve listened to any of the episodes before, but Albert, what kind of animal would you be if you could choose any one in the world, and why?

albert_kejun_ying (43:16.771)
Mm-hmm.

hannah_went (43:26.69)
Ha ha.

albert_kejun_ying (43:27.311)
Maybe snake? Yeah, because I used to keep a pet snake, so I think it’s super perfect as a creature, and they don’t need to eat for weeks, and sometimes I forget to eat, so yeah. Yeah.

hannah_went (43:29.609)
Oh, a snake! No one said that before. Why a snake?

hannah_went (43:45.529)
Well, there’s caloric restriction, right? You can just call it that. There you go. Well, cool. No one’s said snake before, so that was super unique. And, um, Albert, we’ve come to the end of this amazing podcast interview. For people who are listening and want to find you, where are they able to do that?

albert_kejun_ying (44:03.29)
I think currently the best place would be Twitter, so I’m quite active on Twitter. My Twitter account is just my name, Kejun Ying, K-E-J-U-N-Y-I-N-G. So it’s very easy to find.

hannah_went (44:17.569)
Perfect. And I’ll post that, that way people can stay connected. Well, thanks everyone for listening and joining us at the Everything Epigenetics podcast. Remember you have control over your epigenetics, so tune in next time to learn more. Thanks so much, Albert. I appreciate your time.

albert_kejun_ying (44:32.37)
Thank you so much.

About this Guest Expert

Kejun Ying

Kejun Ying is a 4th-year Ph.D. student at Harvard Medical School in the Gladyshev lab. His research focuses on understanding the causes of aging and developing ML-based aging biomarkers to facilitate the discovery of novel anti-aging interventions.
