如何将KEGG路径保存为Rmarkdown中的变量进行自我报告？

2023-02-08 Warsame 原文

我试图以R标记显示KEGG路径，但它们都保存为png。我可以在报告中显示它们，但我必须添加每个文件的名称，这使得如果代码发生更改，更新文件变得更加困难。有人知道我如何保存由以下代码生成的png文件，以便每当代码或数据发生更改时，我都可以让rmarkdown文件更新图像吗？我已经添加了代码和找到它的网站以及下面的数据。

https://rpubs.com/barryus/class15

标题：“kegg”输出：html_documentdate:“2023-02-07”

knitr::opts_chunk$set(echo = TRUE)

#link to source code and data: https://rpubs.com/barryus/class15

#BiocManager::install("DESeq2")
library(DESeq2)
#BiocManager::install("AnnotationDbi")
library("AnnotationDbi")
#BiocManager::install("org.Hs.eg.db")
library("org.Hs.eg.db")
#BiocManager::install("pathview")
library(pathview)
#BiocManager::install("gage")
library(gage)
#BiocManager::install("gageData")
library(gageData)


#import source data
metaFile <- "GSE37704_metadata.csv"
countFile <- "GSE37704_featurecounts.csv"

# Import metadata and take a peak
colData = read.csv(metaFile, row.names=1)
head(colData)

# Import countdata
countData = read.csv(countFile, row.names=1)
head(countData)

# Note we need to remove the odd first $length col
#   need the countData and colData files to match up so we will need to
#   remove that odd first column in countData namely contData$length.
countData <- as.matrix(countData[,-1])
head(countData)

# Filter count data where you have 0 read count across all samples.
countData = countData[rowSums(countData)>1, ]
head(countData)

#setup the DESeqDataSet object required for the DESeq() function
tdds = DESeqDataSetFromMatrix(countData=countData,
                              colData=colData,
                              design=~condition)

#run the DESeq pipeline
tdds = DESeq(tdds)
tdds

#get results for the HoxA1 knockdown versus control siRNA
#   labeled these as “hoxa1_kd” and “control_sirna” in our original colData metaFile input to DESeq
tres = results(tdds, contrast=c("condition", "hoxa1_kd", "control_sirna"))

#check names
resultsNames(tdds)

#reorder these results by p-value and call summary() on the results object
#   gives you a sense of how many genes are up or down-regulated at the default FDR of 0.1
#   FDR: "false-discovery rate" is the fraction of positives that are false positives at a given p-value threshold
tres = tres[order(tres$pvalue),]
summary(tres)

#Since we mapped and counted against the Ensembl annotation, our results only have information about Ensembl gene IDs
#   However, our pathway analysis downstream will use KEGG pathways
#   genes in KEGG pathways are annotated with Entrez gene IDs
#   need to add Entrez IDs
columns(org.Hs.eg.db)

tres$symbol = mapIds(org.Hs.eg.db,
                     keys=row.names(tres),
                     column="SYMBOL",
                     keytype="ENSEMBL",
                     multiVals="first")

tres$entrez = mapIds(org.Hs.eg.db,
                     keys=row.names(tres),
                     column="ENTREZID",
                     keytype="ENSEMBL",
                     multiVals="first")

tres$name =   mapIds(org.Hs.eg.db,
                     keys=row.names(tres),
                     column="GENENAME",
                     keytype="ENSEMBL",
                     multiVals="first")

head(tres, 10)

#use the gage package for pathway analysis.
#Once we have a list of enriched pathways
#    use the pathview package to draw pathway diagrams
#    shading the molecules in the pathway by their degree of up/down-regulation.

# gageData package has pre-compiled databases mapping genes to KEGG pathways and GO terms for common organisms
#     kegg.sets.hs is a named list of 229 elements.
#     Each element is a character vector of member gene Entrez IDs for a single KEGG pathway. (See also go.sets.hs)
#     The sigmet.idx.hs is an index of numbers of signaling and metabolic pathways in kegg.set.gs
#     KEGG pathway include other types of pathway definitions which may be undesirable in pathway analysis
#         i.e: “Global Map” and “Human Diseases”

#setup the KEGG data-sets we need.
data(kegg.sets.hs)
data(sigmet.idx.hs)

kegg.sets.hs = kegg.sets.hs[sigmet.idx.hs]
head(kegg.sets.hs, 3)

#main gage() function requires a named vector of fold changes, where the names of the values are the Entrez gene IDs.
tfoldchanges = tres$log2FoldChange
names(tfoldchanges) = tres$entrez
head(tfoldchanges)

# run the pathway analysis
#     might want to try changing the value of same.dir.
#     This value determines whether to test for
#         a. changes in a gene set toward a single direction (all genes up or down regulated)
#         b. changes towards both directions simultaneously (i.e. any genes in the pathway dysregulated).
#     we’re using same.dir=TRUE, which will give us separate lists for pathways that are upregulated-
#     versus pathways that are down-regulated. Let’s look at the first few results from each.

#Get the results for the pathway analysis
keggtres = gage(tfoldchanges, gsets=kegg.sets.hs, same.dir=TRUE)

#look at the result object
#   It is a list with three elements (“greater”, “less” and “stats”).
attributes(keggtres)

#keggtres is a list object
class(keggtres) #[1] "list"

#use the dollar syntax to access a named element, e.g.
head(keggtres$greater)
head(keggtres$less)


#look at both up (greater), down (less), and statistics by calling head() with the lapply() function.
#   both keggtres$greater and keggtres$less are data matrices with gene sets as rows sorted by p-value.
lapply(keggtres, head)

#process the results to pull out the top 5 upregulated pathways
#   further process that just to get the IDs
#   use these KEGG pathway IDs downstream for plotting

## Sanity check displaying all pathways data
tpathways = data.frame(id=rownames(keggtres$greater), keggtres$greater)
head(tpathways)

#use the pathview() function to make a pathway plot with our result shown in color.
#   manually supply a pathway.id (namely the first part of the "hsa04110 Cell cycle")
#   this is visible from the print out above.
# code blw downloads the patway figure data from KEGG and adds our results to it

pathview(gene.data=tfoldchanges, pathway.id="hsa04110") #svrl gns r perturbed (the coloured ones)

# A different PDF based output of the same data
#there are df arrows on this one
pathview(gene.data=tfoldchanges, pathway.id="hsa04110", kegg.native=FALSE)

#process our results a bit more to automagicaly pull out the top 5 upregulated pathways
#   then further process that just to get the IDs needed by the pathview() function
#   We’ll use these KEGG pathway IDs for plotting below.

## Focus on top 5 upregulated pathways here for demo purposes only
keggtrespathways <- rownames(keggtres$greater)[1:5]

# Extract the IDs part of each string
keggtresids = substr(keggtrespathways, start=1, stop=8)
keggtresids

#pass these IDs in keggtresids to the pathview() function to draw plots for all the top 5 pathways.
pathview(gene.data=tfoldchanges, pathway.id=keggtresids, species="hsa")

图像

#may need to change image name 
knitr::include_graphics("hsa04060.pathview.png", error = FALSE)

如何将KEGG路径保存为Rmarkdown中的变量进行自我报告？的更多相关文章

JDBCTM 指南：入门3 - DriverManager

内容:3-DriverManager3.1概述DriverManager类是JDBC的管理层，作用于用户和驱动程序之间。另外，DriverManager类也处理诸如驱动程序登录时间限制及登录和跟踪消息的显示等事务。但多数情况下，让DriverManager类管理建立连接的细节为上策。它创建该类的实例，然后在加载该实例时DriverManager类进行注册。这也是引入特定驱动程序的方法，因为一旦DriverManager类被初始化，它将不再检查jdbc.drivers属性列表。
PHP的宝库目录--PEAR

请跟我来，使用PEAR标准编写你的PHP程序吧，你的程序将会拥有更大的活力，你的程序和代码将会很方便地和其他高手的代码融合在一起，PEAR就象CPAN对于PERL一样，会让PHP产生更高的能量。PHPDoc也是一个PEAR的应用程序，更详细的介绍你可以去http://www.phpdoc.de/查看。
PHP 中 var_export、print_r、var_dump 调试中的区别

这篇文章主要介绍了PHP 中 var_export、print_r、var_dump 调试中的区别,非常不错，具有一定的参考借鉴价值，需要的朋友可以参考下
PHP中的print_r 与 var_dump 输出数组

下面小编就为大家带来一篇PHP中的print_r 与 var_dump 输出数组。小编觉得挺不错的，现在就分享给大家，也给大家做个参考
没有适用于&#39；inner_join&#39；应用于“类”的对象；字符“；？

我是一个初学者，最近才开始尝试在R中编程。我现在使用一个简单的pokemon数据集，只是为了练习，我想尝试用一个变量建立一个直方图。但为此，我试图将两个不同的列：Type.1和Type.2。两者都有相同的变量，但当我尝试inner_join时，我得到以下消息：类型_统一<；-inner_join没有适用于“inner_join”类对象的方法我在尝试left_join时也犯了同样的错误我该怎么做才能将这两列合并为一列？我只是想将两列统一为一个变量，以应用直方图
给定一个列名，从R中数据帧的底行提取最后一个非空值

对于以下数据集df，我希望给出列名，返回以提取并返回最近一天的非空值：这意味着我们可以实现类似的功能，但不需要传递特定的日期值：我可以问一下如何实现它吗？
计算进行转换时同一客户端id进行交互的次数

对于每次转换，我需要计算client_id进行交互或转换的次数，但只计算自上次转换以来的交互次数。
R中的时间趋势分析：露天包装

我正在尝试对空气污染数据进行趋势分析。我的数据格式如下：我尝试过使用openair::TheilSen，但它不起作用。我实际上需要5张图，但我无法将它们分组。
使用ggplot表示R列栏中的数据集

我在R中有一个名为df的数据帧NoYesdont_know111412310如何使用ggplot在柱状图中表示它们？
通过R下载gadm shapefile时出错

我正在尝试使用gadm为挪威下载一个shapefile（来自geodata包）但是，我遇到了以下错误：文件（con，“r”）中出错：无法打开与'的连接https://geodata.ucdavis.edu/gadm/gadm4.1.txt'此外：警告消息：在文件中（con，“r”）：URL'https://geodata.ucdavis.edu/gadm/gadm4.1.txt'：状态为'HTTP

随机推荐

如何扩展ATmega324PB微控制器的以下宏寄存器？

我目前正在学习嵌入式，我有以下练习：展开以下宏寄存器：如果有人解决了这个问题，我将不胜感激，以便将来参考
Python将ONNX运行时设置为返回张量而不是numpy数组

在python中，我正在加载预定义的模型：然后我加载一些数据并运行它：到目前为止，它仍在正常工作，但我希望它默认返回Tensor列表，而不是numpy数组。我对ONNX和PyTorch都是新手，我觉得这是我在这里缺少的基本内容。这将使转换中的一些开销相同。
在macOS上的终端中使用Shell查找文件中的单词

我有一个文本文件，其中有一行：我需要找到ID并将其提取到变量中。我想出了一个RexEx模式：但它似乎对我尝试过的任何东西都不起作用：grep、sed——不管怎样。我的一个尝试是：我为这样一个看似愚蠢的问题感到抱歉，但我在互联网上找不到任何东西：我在SO和SE上读了几十个类似的问题，并在谷歌上搜索了几个教程，但仍然无法找到答案。欢迎提供任何指导！
react-chartjs-2甜甜圈图中只有标题未更新

我正在使用react-chartjs-2在我的网站中实现甜甜圈图。下面是我用来呈现图表的代码。我将甜甜圈图的详细信息从父组件传递到子组件，所有道具都正确传递。当我在beforeDraw函数外部记录props.title时，它会记录正确的值，但当我在beforeDraw函数内部记录props.title时，它将记录标题的前一个值，从而呈现标题的前值。我在这里做错了什么？
如何在tkinter中使用Python生成器函数？

生成器函数承诺使某些代码更易于编写。但我并不总是知道如何使用它们。假设我有一个斐波那契生成器函数fib()，我想要一个显示第一个结果的tkinter应用程序。当我点击“下一步”按钮时，它会显示第二个数字，依此类推。我如何构建应用程序来实现这一点？我可能需要在线程中运行生成器。但如何将其连接回GUI？
如何为每次提交将存储库历史记录拆分为一行？

我正在尝试获取存储库的历史记录，但结果仅以单行文本的形式返回给我。
尝试在颤振项目上初始化Firebase时出错

当尝试在我的颤振项目上初始化firebase时，我收到了这个错误有人知道我能做什么吗？应用程序分级Gradle插件Gradle项目颤振相关性我已经将firebase设置为Google文档已经在另一个模拟器上尝试过，已经尝试过创建一个全新的模拟器，已经在不同的设备上尝试过了，已经尝试了特定版本的firebase，已经尝试添加但没有任何效果，已经在youtube上看到了关于它的每一个视频，该应用程序在android和iOS两个平台上都抛出了这个错误
在unix中基于当前日期添加新列

我试图在unix中基于时间戳列在最后一个单元格中添加一个状态列。我不确定如何继续。
麦克斯·蒙特利。我一直得到UncaughtReferenceError：当我在终端中写入node-v时，节点未定义

如果这是您应该知道的，请确认：我已将所有shell更改为默认为zsh。当我在终端中写入node-v时，我一直收到“UncaughtReferenceError:nodeisnotdefined”。但它显示节点已安装。我是个新手，在这方面经验不足。
如何在前端单击按钮时调用后端中的函数？

那么如何在后端添加一个新的端点，点击按钮调用这个函数。