Hello Anurag,
I would like to thank you for this software and for your previous post on integrating SequenceServer with HPC. I have a question related to the implementation.
I replaced the temp locations in blast.rb and replaced the BLAST binaries with SGE wrappers, and jobs now get submitted to the cluster. However, because the job result, which is in BLAST Archive format, is written to a file with a different name, SequenceServer is not able to generate the XML. Can you help us resolve this issue?
On Wednesday, 18 March 2015 12:39:36 UTC+5:30, Anurag Priyam wrote:
Thanks!
Based on user input, SequenceServer constructs a command (just as you would without SequenceServer, e.g. blastp -query foo.fa -db "bar.fa baz.fa"), which is then executed in the shell with due security considerations. Output, in BLAST Archive format (-outfmt 11), is redirected to a file. We then obtain XML output from the archive file using blast_formatter (again, output is redirected to a file). We parse the XML and generate HTML ourselves. The same archive file is used to generate the XML and tabular reports for download.
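As a rough illustration of that flow (not SequenceServer's actual code), the two steps could be reproduced by hand as below. The function name search_then_format and the file names are made up for the example; the BLAST+ options used (-outfmt 11 for the archive, blast_formatter -archive, -outfmt 5 for XML, -outfmt 6 for tabular) are standard.

```shell
#!/usr/bin/env sh
# Sketch only: search once, then format the saved archive as often as needed.
# search_then_format and the output file names are hypothetical.

search_then_format() {
    query=$1     # FASTA file holding the query sequences
    db=$2        # one or more database paths, separated by spaces
    prefix=$3    # basename for the files written below

    # Step 1: run the search and redirect the BLAST archive to a file.
    blastp -query "$query" -db "$db" -outfmt 11 > "$prefix.asn" || return 1

    # Step 2: derive reports from the archive without re-running the search.
    blast_formatter -archive "$prefix.asn" -outfmt 5 > "$prefix.xml" || return 1
    blast_formatter -archive "$prefix.asn" -outfmt 6 > "$prefix.tsv"
}
```

Calling search_then_format foo.fa "bar.fa baz.fa" result would leave result.asn, result.xml and result.tsv next to each other. The point for anyone wrapping BLAST is that whatever replaces the binary must still deliver the archive on standard output, because the caller is the one doing the redirection.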
We used pipes in the very early days of SequenceServer (when we were just starting out) but soon felt that pipes were unreliable. So not anymore. Query sequences are written to a file and passed to blast using -query option instead of piping from stdin. Output is written to a file which is subsequently read instead of reading from a pipe.
For antgenomes.org, which is hosted on a thin server but runs BLAST on a 48 core fat machine (designated node on QMUL’s HPC cluster), we simply replace BLAST+ binaries with a shim that executes BLAST on the fat machine via ssh:
#!/usr/bin/env sh
ssh /path/to/blastn "$@"
The same scheme can be used to queue jobs if the queuing system allows waiting on a job id. I guess the corresponding shim would look something like:
#!/usr/bin/env sh
job_id=
qsub -N $job_id /path/to/blastn "$@"
qsub -hold_jid $job_id
(or use -sync option maybe)
If waiting on job id is not allowed in the conventional UNIX sense, it will not work because SequenceServer processes requests synchronously. That bit is due to change soon though.
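For the -sync route, a shim might look roughly like the sketch below. It is untested against a real cluster and rests on several assumptions: Grid Engine's qsub with its -sync, -b, -cwd, -o and -e options, a filesystem shared between the web server and the compute nodes, and a made-up SHIM_TMPDIR variable for where scratch files go. Because a queued job writes its standard output to a file chosen by the scheduler rather than to the shim's own stdout, the shim names that file itself and then replays it, so the caller's redirection still receives the BLAST archive.

```shell
#!/usr/bin/env sh
# Hypothetical shim body; run_blast_on_cluster and SHIM_TMPDIR are
# illustrative names, not part of SequenceServer or Grid Engine.

run_blast_on_cluster() {
    real_blast=$1; shift    # path to the real BLAST binary on the nodes
    out=$(mktemp "${SHIM_TMPDIR:-$PWD}/blast.out.XXXXXX") || return 1
    err=$(mktemp "${SHIM_TMPDIR:-$PWD}/blast.err.XXXXXX") || return 1

    # -sync y makes qsub block until the job has finished; -b y submits
    # the binary directly rather than a job script; -o and -e send the
    # job's stdout and stderr to files the shim knows about.
    # qsub's own "job submitted" chatter is discarded.
    qsub -sync y -b y -cwd -o "$out" -e "$err" "$real_blast" "$@" > /dev/null
    rc=$?

    # Replay the job's streams as if BLAST had run locally.
    cat "$out"
    cat "$err" >&2
    rm -f "$out" "$err"
    return "$rc"
}

# A real shim would end with something like:
#   run_blast_on_cluster /path/to/blastn "$@"
```

Arguments containing spaces, such as a -db value listing several databases, may need extra quoting to survive the trip through the scheduler, so that part is worth checking on your own setup.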
I hope this helps. Please let us know if you took the above suggestions to integrate SequenceServer into an HPC system. We will be happy to help along the way.
– Priyam