Parallel computing with SequenceServer

Dear SequenceServer Community,

We recently installed SequenceServer on our server and tested its performance.
I was a little surprised to see that the online version was more than 5 times faster using the same parameters.
I suspect the online version uses massively parallel computing to achieve this.

We therefore checked the CPU usage on our server, which only uses one core per query.
Is SequenceServer also able to use more than one core / CPU for a single query?

Best Regards
Matthias Enders

Hi Matthias,

we installed SequenceServer at our Server recently and tested the
performance.
I was a little bit surprised to see that online Version was more than 5
times faster, using the same parameters.
I suspect the online version uses massive parallel computing to achieve
this.

What "online Version" are you talking about? NCBI's web-blast service [1]?

Therefore we checked the CPU-Usage of our server, which only uses one core
per Query.

Did you compile BLAST+ on your server or did you use pre-compiled
binaries from BLAST+'s download page? Pre-compiled binaries lock you
to one thread of execution (one CPU core).

Is SequenceServer also able to use more than one Cores / CPU´s on one query?

BLAST+ can leverage multiple CPU cores through multi-threading if you
compile it yourself on your server. You can tell SS to invoke BLAST+
in a multi-threaded mode by setting the num_threads parameter in
~/.sequenceserver.conf. You could experiment with different values of
num_threads parameter and see which offers the best performance boost.
If you are feeling lazy, two times the number of CPU cores should be a
safe value.
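
A rough way to compare num_threads values is to time the same search at several thread counts directly on the command line, outside SequenceServer. The sketch below assumes a BLAST+ program such as blastn is on the PATH. QUERY, DB, BLAST_CMD and the helper name time_blast are placeholders made up for illustration, not anything from this thread.

```shell
# Time one BLAST+ run with a given thread count and print "threads seconds".
# QUERY and DB must point to your own FASTA file and BLAST database (assumed names).
# BLAST_CMD defaults to blastn; set it to blastp, tblastn, etc. as needed.
time_blast() {
    threads=$1
    start=$(date +%s)
    "${BLAST_CMD:-blastn}" -query "$QUERY" -db "$DB" \
        -num_threads "$threads" -out /dev/null
    end=$(date +%s)
    echo "$threads $((end - start))"
}

# Example, after setting QUERY and DB:
# for n in 1 2 4 8; do time_blast "$n"; done
```

date +%s only has one-second resolution, so this is only meaningful for searches that run for at least several seconds.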

[1]: http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&BLAST_PROGRAMS=blastp&PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on&LINK_LOC=blasthome

Hi,

I am the maintainer of the server he is talking about.

Hi Matthias,

we installed SequenceServer at our Server recently and tested the
performance.
I was a little bit surprised to see that online Version was more than 5
times faster, using the same parameters.
I suspect the online version uses massive parallel computing to achieve
this.

What “online Version” are you talking about? NCBI’s web-blast service [1]?

No, we are talking about http://www.antgenomes.org/blast

Therefore we checked the CPU-Usage of our server, which only uses one core
per Query.

Did you compile BLAST+ on your server or did you use pre-compiled
binaries from BLAST+'s download page? Pre-compiled binaries lock you
to one thread of execution (one CPU core).

Is SequenceServer also able to use more than one Cores / CPU´s on one query?

BLAST+ can leverage multiple CPU cores through multi-threading if you
compile it yourself on your server. You can tell SS to invoke BLAST+
in a multi-threaded mode by setting the num_threads parameter in
~/.sequenceserver.conf. You could experiment with different values of
num_threads parameter and see which offers the best performance boost.
If you are feeling lazy, two times the number cpu-cores should be a
safe value.

Thanks for this hint!
I’ve tried to compile it myself with:

./configure --with-mt
make

The build finished successfully!

I’ve also set num_threads to 8 (4 cores are present).

When I send a query through the web interface (configured to run under Apache with Passenger), BLAST+ uses only one core, at 100%.
What's wrong? How can I test the multi-core support?

BLAST+ can leverage multiple CPU cores through multi-threading if you
compile it yourself on your server. You can tell SS to invoke BLAST+
in a multi-threaded mode by setting the num_threads parameter in
~/.sequenceserver.conf. You could experiment with different values of

[...]

When I send a query over the WebInterface (configured to work with Apache
over Passenger), blast+ only uses one core with 100% use.
Whats wrong? How can I test the multicore support?

How do you verify that only one core is being used? What operating
system are you running on your server? Depending on the platform and
the tool, a 100% CPU utilization could mean all cores are maxed out.

It is possible that a BLAST query on antgenomes.org (Fourmidable) runs
five times faster because it runs on a more powerful server.

Hi and thanks for your answer.

I am checking the system using htop. It shows 100% usage for one thread out of 4 (the machine is configured with 2 sockets and 2 cores per socket on an ESXi cluster).

So the query is not running on multiple threads, right?

Regards,
R. Kluth
via SGSII

Ok, I just verified. This is a bug and will be fixed in the next
release. Thanks a lot for bringing this up Robin :).

On a related note, some pre-compiled binaries seem to support the
num_threads option. Previously, pre-compiled BLAST+ would flat out
refuse to run with the num_threads option.

For reference, SequenceServer spawns another process to run BLAST
searches. The PID of the BLAST job is not the same as that of the server.
This should be kept in mind when passing a PID to the ps or top commands.
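
Since the BLAST job has its own PID, it can be easier to pick it out of the process list by name than to start from the server's PID. The sketch below is one way to do that, assuming Linux with the procps version of ps. The helper name only_blast is made up for illustration.

```shell
# Keep the header line plus any row whose command name (last column)
# looks like a BLAST+ search program: blastn, blastp, blastx, tblastn, tblastx.
only_blast() {
    awk 'NR == 1 || $NF ~ /^t?blast[npx]/'
}

# Example on Linux while a search is running.
# NLWP is the number of threads in each process, PPID is the parent's PID.
# ps -eo pid,ppid,nlwp,pcpu,comm | only_blast
```

An NLWP value well above 1 together with a %CPU value above 100 would suggest that the search really is running multi-threaded.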

Oh, okay :)

Please let us know on this list about any news. I'll try the new package as soon as you release it.

Thanks a lot for your help!

Regards,
R. Kluth
via SGSII

Hi Guys,

Sorry for my late answer, I was away on a short holiday!

I have to make a small correction:

We were actually talking about NCBI's web BLAST ( http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&BLAST_PROGRAMS=blastp&PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on&LINK_LOC=blasthome )
and NOT about the antgenomes.org server.

You suggested that the server may be more powerful, but our query runs even faster on a desktop PC using BLAST+ without SequenceServer, which was the main reason for writing to this group.

If I understand your message correctly, you will fix this bug in the next version? Can you give us a rough idea of when that might be?

Thanks a lot!
Matthias & Robin

How about Friday? College makes it quite difficult to estimate
development time, people. So take it with a grain of salt, eh.

Dear Anurag,

We expected something like “perhaps in two months”, so Friday with a “grain of salt” is way better!

Thanks for your efforts!

Hehe :). I would blame the college for slow releases too.