GEO Data Mining, Part 3 (GEO data mining tutorial)

Published 2023-03-24 17:32:04 by a guest author. Category: GO.
Enrichment analysis

(I) GO enrichment analysis (enrichment on the differentially expressed genes)

Input data

# Packages used below (skip if already loaded earlier in the workflow)
library(clusterProfiler)
library(enrichplot)
library(org.Hs.eg.db)
library(ggplot2)   # for facet_grid()

# (1) Input data
gene_up = deg$ENTREZID[deg$change == 'up']
gene_down = deg$ENTREZID[deg$change == 'down']
gene_diff = c(gene_up, gene_down)   # all differentially expressed genes
# (2) Enrichment
# The steps below are slow, so they are skipped if the result file already exists
f = paste0(gse_number, "_GO.Rdata")
if(!file.exists(f)){
  ego <- enrichGO(gene = gene_diff,
                  OrgDb = org.Hs.eg.db,
                  ont = "ALL",
                  readable = TRUE)
  ego_BP <- enrichGO(gene = gene_diff,
                     OrgDb = org.Hs.eg.db,
                     ont = "BP",
                     readable = TRUE)
  # ont: one of the "BP", "MF", and "CC" subontologies, or "ALL" for all three
  save(ego, ego_BP, file = f)
}
load(f)   # restore ego and ego_BP, including when the enrichment was skipped
# (3) Visualization
# Bar plot
barplot(ego)
barplot(ego, split = "ONTOLOGY", font.size = 10,
        showCategory = 5) +
  facet_grid(ONTOLOGY ~ ., space = "free_y", scales = "free_y")

# Dot plot
dotplot(ego)

dotplot(ego, split = "ONTOLOGY", font.size = 10,
        showCategory = 5) +
  facet_grid(ONTOLOGY ~ ., space = "free_y", scales = "free_y")
# (4) Show the genes shared by the top terms; zoom in to read the plot

# gl sets the node colors in the plot below
gl = deg$logFC
names(gl) = deg$ENTREZID
# Gene-Concept Network; zoom in to read it
cnetplot(ego,
         #layout = "star",
         color.params = list(foldChange = gl),
         showCategory = 3)
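
For orientation, the over-representation test behind enrichGO is a one-sided hypergeometric test: it asks whether the submitted genes hit a GO term more often than a random draw of the same size from the background would. A minimal base-R sketch with made-up counts (N, M, n, and k are illustrative assumptions, not values from this dataset):

```r
# Illustrative counts only; none of these come from the GEO data
N <- 20000  # annotated genes in the background
M <- 200    # background genes annotated to the GO term
n <- 500    # differential genes submitted for enrichment
k <- 15     # differential genes annotated to the GO term

# Overlap expected by chance if the gene list were a random draw
expected <- n * M / N

# P(X >= k): chance of seeing at least k term genes in a random draw of n
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

expected
p_value
```

With these numbers only 5 overlapping genes would be expected by chance, so an overlap of 15 gives a small p-value. The real analysis runs this test once per term and then corrects for multiple testing.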

(II) KEGG enrichment analysis

Up-regulated, down-regulated, differential, and all genes

# (1) Input data
gene_up = deg[deg$change == 'up', 'ENTREZID']
gene_down = deg[deg$change == 'down', 'ENTREZID']
gene_diff = c(gene_up, gene_down)
# (2) Enrichment for the up-regulated, down-regulated, and all differential genes
f2 = paste0(gse_number, "_KEGG.Rdata")
if(!file.exists(f2)){
  kk.up <- enrichKEGG(gene     = gene_up,
                      organism = 'hsa')
  kk.down <- enrichKEGG(gene     = gene_down,
                        organism = 'hsa')
  kk.diff <- enrichKEGG(gene     = gene_diff,
                        organism = 'hsa')
  save(kk.diff, kk.down, kk.up, file = f2)
}

load(f2)
# (3) Was anything enriched? See https://mp.weixin.qq.com/s/NglawJgVgrMJ0QfD-YRBQg
table(kk.diff@result$p.adjust < 0.05)
table(kk.up@result$p.adjust < 0.05)
table(kk.down@result$p.adjust < 0.05)
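
The tables in step (3) count pathways by adjusted p-value, which enrichKEGG computes with the Benjamini-Hochberg method by default (pAdjustMethod = "BH"). A small base-R sketch, using made-up p-values rather than real results, of how the adjustment can leave fewer pathways under 0.05 than the raw p-values do:

```r
# Made-up raw p-values for seven pathways (illustrative only)
p <- c(0.001, 0.01, 0.02, 0.03, 0.04, 0.2, 0.5)

# Benjamini-Hochberg adjustment, the same method name enrichKEGG uses by default
p_adj <- p.adjust(p, method = "BH")

sum(p < 0.05)      # 5 pathways pass on the raw p-value
sum(p_adj < 0.05)  # 3 pathways pass after adjustment
```

The gap between the two counts is why an analysis can show hits on pvalue but none on p.adjust.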
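# (4) Two-directional plot
# All enrichment plots and tables use p.adjust by default; if nothing is enriched,
# falling back to the raw p value is acceptable as long as the text says so
library(dplyr)   # %>%, filter(), mutate(); skip if already loaded
down_kegg <- kk.down@result %>%
  filter(pvalue < 0.05) %>%   # keep rows with a raw p value below 0.05
  mutate(group = -1)          # add a column marking the direction

up_kegg <- kk.up@result %>%
  filter(pvalue < 0.05) %>%
  mutate(group = 1)

source("kegg_plot_function.R")
g_kegg <- kegg_plot(up_kegg, down_kegg)
g_kegg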
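
kegg_plot() comes from kegg_plot_function.R, which is not reproduced in this post. As a rough idea of what such a helper might do, here is a hypothetical stand-in (kegg_plot_sketch, its top_n argument, and the colors are assumptions, not the author's code). It draws up- and down-regulated pathways as bars pointing in opposite directions and relies only on the Description, pvalue, and group columns:

```r
library(ggplot2)

# Hypothetical stand-in for kegg_plot(); NOT the function in kegg_plot_function.R
kegg_plot_sketch <- function(up_kegg, down_kegg, top_n = 10) {
  # Keep the top_n most significant pathways from each direction
  top <- function(d) head(d[order(d$pvalue), , drop = FALSE], top_n)
  dat <- rbind(top(up_kegg), top(down_kegg))
  # Signed score: positive for up (group = 1), negative for down (group = -1)
  dat$score <- -log10(dat$pvalue) * dat$group
  dat$direction <- ifelse(dat$group == 1, "up", "down")
  # Sort pathways by score so the bars appear in order along the axis
  dat <- dat[order(dat$score), , drop = FALSE]
  dat$Description <- factor(dat$Description, levels = unique(dat$Description))
  ggplot(dat, aes(x = Description, y = score, fill = direction)) +
    geom_col() +
    coord_flip() +
    scale_fill_manual(values = c(down = "steelblue", up = "firebrick")) +
    labs(x = NULL, y = "signed -log10(pvalue)") +
    theme_bw()
}

# Toy input with the three columns the sketch relies on (made-up values)
up_toy <- data.frame(Description = c("Pathway A", "Pathway B"),
                     pvalue = c(0.001, 0.02), group = 1)
down_toy <- data.frame(Description = "Pathway C", pvalue = 0.004, group = -1)
g_toy <- kegg_plot_sketch(up_toy, down_toy)
g_toy
```

With the real data, the call would be kegg_plot_sketch(up_kegg, down_kegg); the output will not match the original function's styling.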
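
What comes after the standard workflow
Problem data and analysis of common errors

Complex data and its analysis

1. Multi-group data: example GSE474

2. Joint analysis of multiple datasets: for example GSE83521 and GSE89143

Batch effects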