This post was last edited by snow on 2013-2-21 23:28
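Here is how I solved it.
First, analyzing the problem:
After seeing your error I installed the ape package myself. read.dna ran fine, but clustal failed with exactly the same error as yours, and even the package's own example failed. That could hardly be right, so I assumed something was not set up correctly.
Look closely at the error message:
the following file could not be opened: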
C:\Users\ADMINI~1\AppData\Local\Temp\RtmpCuvK5E/input_clustal.aln
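This means the .aln file does not exist.
So I opened an Explorer window and typed C:\Users\ADMINI~1\AppData\Local\Temp\RtmpCuvK5E into the address bar. The folder contained only input_clustal.fas and no .aln file. The file really was missing, so no wonder the program reported an error.
My reasoning was: if this temporary folder receives input_clustal.fas, then in principle it should also receive the .aln file. Since it was not produced, perhaps some required component was missing.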
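The folder check above can also be done from inside R instead of Explorer. This is only a minimal sketch, assuming the same R session is still open (the temporary directory is per session); the variable names are illustrative.

```r
# List whatever the current R session has written to its temporary directory.
d <- tempdir()
print(list.files(d))

# clustal() reads its result from this file, so check whether it was produced.
outf <- file.path(d, "input_clustal.aln")
aln_exists <- file.exists(outf)
print(aln_exists)
```

If aln_exists is FALSE right after a failed clustal() call, the external aligner never wrote its output.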
The only thing left was to read the source code.
Typing the function name clustal at the R prompt prints its source:
    > clustal
    function (x, pw.gapopen = 10, pw.gapext = 0.1, gapopen = 10,
        gapext = 0.2, exec = NULL, MoreArgs = "", quiet = TRUE, original.ordering = TRUE)
    {
        os <- Sys.info()[1]
        if (is.null(exec)) {
            if (os == "Linux")
                exec <- "clustalw"
            if (os == "Darwin")
                exec <- "clustalw2"
            if (os == "Windows")
                shortPathName("C:/Program Files/ClustalW2/clustalw2.exe")
        }
        if (missing(x)) {
            system(paste(exec, "-help"))
            return(invisible(NULL))
        }
        d <- tempdir()
        inf <- paste(d, "input_clustal.fas", sep = "/")
        outf <- paste(d, "input_clustal.aln", sep = "/")
        write.dna(x, inf, "fasta")
        prefix <- c("-INFILE", "-PWGAPOPEN", "-PWGAPEXT", "-GAPOPEN",
            "-GAPEXT")
        suffix <- c(inf, pw.gapopen, pw.gapext, gapopen, gapext)
        opts <- paste(prefix, suffix, sep = "=", collapse = " ")
        opts <- paste(opts, MoreArgs)
        system(paste(exec, opts), ignore.stdout = quiet)
        res <- read.dna(outf, "clustal")
        if (original.ordering)
            res <- res[labels(x), ]
        res
    }
    <environment: namespace:ape>
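Reading it closely, two lines stand out:
    inf <- paste(d, "input_clustal.fas", sep = "/")
    outf <- paste(d, "input_clustal.aln", sep = "/")
These two lines build a matching pair of file paths, so in principle the .fas and the .aln files should both appear. The second one was never written, which means part of the process did not run.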
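To see what the function actually hands to the operating system, the command string can be rebuilt by hand without running anything. This is only a sketch that copies the paste() logic from the listing above; the executable name and the gap penalties are filled in with the defaults purely for illustration. It also shows one thing worth noting about the listing as printed in this post: the Windows branch calls shortPathName() without assigning the result to exec, so exec may still be NULL when the command is built.

```r
# Rebuild the command string the way the listing does, without executing it.
exec <- "clustalw2"   # assumed executable name, for illustration only
inf  <- file.path(tempdir(), "input_clustal.fas")

prefix <- c("-INFILE", "-PWGAPOPEN", "-PWGAPEXT", "-GAPOPEN", "-GAPEXT")
suffix <- c(inf, 10, 0.1, 10, 0.2)   # default gap penalties from the signature
opts <- paste(prefix, suffix, sep = "=", collapse = " ")

cmd <- paste(exec, opts)
cat(cmd, "\n")

# If exec is still NULL, the resulting string carries no program name at all.
broken <- paste(NULL, opts)
cat(broken, "\n")
```

Either way, no aligner runs, no .aln file is written, and read.dna() is the first call to complain.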
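Looking further for the cause, I found this line in the function:
    shortPathName("C:/Program Files/ClustalW2/clustalw2.exe")
It means the function relies on an .exe file at C:/Program Files/ClustalW2/clustalw2.exe. Then it all made sense: ClustalW does multiple sequence alignment, and on Windows it is a separate program. If you have not installed ClustalW2 yourself, that file cannot be there. I looked in Program Files on the C: drive and it was indeed missing.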
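The same check can be made from R before calling clustal at all. A minimal sketch, assuming the default install location shown in the listing; change the path if ClustalW2 lives elsewhere on your machine.

```r
# Path hard-coded in the listing for Windows; adjust if installed elsewhere.
exe <- "C:/Program Files/ClustalW2/clustalw2.exe"

# TRUE only if the file is really there.
exe_found <- file.exists(exe)
print(exe_found)

# Sys.which() searches the PATH instead; it gives "" when nothing is found.
on_path <- unname(Sys.which("clustalw2"))
print(on_path)
```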
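So I had to install it myself. A Google search turned up the ClustalW2 download location, which is hosted at EBI:
ftp://ftp.ebi.ac.uk/pub/software/clustalw2/2.1
Download clustalw-2.1-win.msi from that directory and install it, clicking Next all the way through.
After installation, C:/Program Files contains a new ClustalW2 folder with a ClustalW2.exe file inside.
Now the tool is in place.
So it turns out that the ape package calls ClustalW2 to do the actual alignment.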
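Finally, change the function call. To be safe, pass the full path (find the directory where ClustalW2.exe is installed and copy the path). The modified call is:
    clustal(x,pw.gapopen=10,pw.gapext=0.1,gapopen=10,gapext=0.2,exec=shortPathName("C:/Program Files/ClustalW2/clustalw2.exe"),MoreArgs="",quiet=TRUE)
The key change is replacing the default exec=NULL with the full path, that is, passing the shortPathName result explicitly: exec=shortPathName("C:/Program Files/ClustalW2/clustalw2.exe")
Then I ran it, the result came out, and everything was OK!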
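To avoid running into the confusing "cannot open file" error again, the path can be checked before clustal is called. The wrapper below is hypothetical and not part of ape; the name clustal_checked and its message text are my own. It is a sketch under the assumption that ape is installed and that exec points at the ClustalW2 executable.

```r
# Hypothetical helper (not part of ape): stop early with a clear message
# when the ClustalW2 executable is missing, instead of failing in read.dna().
clustal_checked <- function(x, exec, ...) {
  if (!file.exists(exec)) {
    stop("ClustalW2 executable not found at: ", exec, call. = FALSE)
  }
  # shortPathName() exists only on Windows builds of R; it removes the
  # space in "Program Files" so the command line is not split there.
  if (.Platform$OS.type == "windows") {
    exec <- shortPathName(exec)
  }
  ape::clustal(x, exec = exec, ...)
}
```

Usage would look like clustal_checked(x, exec = "C:/Program Files/ClustalW2/clustalw2.exe"); any further arguments are passed through to clustal unchanged.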
Below are the example, the test code, and the results.
Complete test program:
    cat("> No305",
    "NTTCGAAAAACACACCCACTACTAAAANTTATCAGTCACT",
    "> No304",
    "ATTCGAAAAACACACCCACTACTAAAAATTATCAACCACT",
    "> No306",
    "ATTCGAAAAACACACCCACTACTAAAAATTATCAATCACT",
    file = "exdna.txt", sep = "\n")
    x<-read.dna("exdna.txt",format="fasta",skip=0,nlines=0,comment.char="#",as.character=FALSE)
    clustal(x,pw.gapopen=10,pw.gapext=0.1,gapopen=10,gapext=0.2,exec=shortPathName("C:/Program Files/ClustalW2/clustalw2.exe"),MoreArgs="",quiet=TRUE)
Output:
    > cat("> No305",
    + "NTTCGAAAAACACACCCACTACTAAAANTTATCAGTCACT",
    + "> No304",
    + "ATTCGAAAAACACACCCACTACTAAAAATTATCAACCACT",
    + "> No306",
    + "ATTCGAAAAACACACCCACTACTAAAAATTATCAATCACT",
    + file = "exdna.txt", sep = "\n")
    > x<-read.dna("exdna.txt",format="fasta",skip=0,nlines=0,comment.char="#",as.character=FALSE)
    > clustal(x,pw.gapopen=10,pw.gapext=0.1,gapopen=10,gapext=0.2,exec=shortPathName("C:/Program Files/ClustalW2/clustalw2.exe"),MoreArgs="",quiet=TRUE)
    CLUSTAL 2.1 Multiple Sequence Alignments
    Sequence format is Pearson
    Sequence 1: No305 40 bp
    Sequence 2: No304 40 bp
    Sequence 3: No306 40 bp
    Start of Pairwise alignments
    Aligning...
    Sequences (1:2) Aligned. Score: 90
    Sequences (1:3) Aligned. Score: 92
    Sequences (2:3) Aligned. Score: 97
    Guide tree file created: [C:\Users\ADMINI~1\AppData\Local\Temp\RtmpCuvK5E/input_clustal.dnd]
    There are 2 groups
    Start of Multiple Alignment
    Aligning...
    Group 1: Sequences: 2 Score:750
    Group 2: Sequences: 3 Score:742
    Alignment Score 776
    CLUSTAL-Alignment file created [C:\Users\ADMINI~1\AppData\Local\Temp\RtmpCuvK5E/input_clustal.aln]
    3 DNA sequences in binary format stored in a matrix.
    All sequences of same length: 40
    Labels: No305 No304 No306
    Base composition:
    a c g t
    0.458 0.288 0.034 0.220