
Forcing Assembly Optimizations


Trakfast11


i was wondering if anybody has heard any info on being able to choose which optimization to force. right now my 2400+ and 1700+ are working on the same wu (p547_BBA5_ext), but with different optimizations: the 2400+ is using SSE and the 1700+ is using 3DNow. i won't go into the details just yet, going to catch my ZZzzz, but the 2400+ is completing each percentage 51 seconds faster than the 1700+, which is only clocked 150MHz lower (2 minutes 45 seconds at 2551MHz with SSE vs 3 minutes 36 seconds at 2401MHz with 3DNow; 165/216 works out to about a 24% decrease in the time it takes to complete 1 percent of the wu). previous tests with my 2400+ going from 2400MHz to 2500MHz did not show that big of a jump in the time per percentage, so i'm wondering how much SSE and 3DNow affect completion times, and why the two machines choose different optimizations.
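For anyone checking the arithmetic, here is a minimal sketch of the comparison using only the two per-percent times quoted above (3 min 36 s and 2 min 45 s). The function name `percent_decrease` is made up for illustration.

```python
def percent_decrease(slow_seconds: float, fast_seconds: float) -> float:
    """Relative reduction in time, as a percentage of the slower time."""
    return (slow_seconds - fast_seconds) / slow_seconds * 100.0


slow = 3 * 60 + 36  # 216 s per percent
fast = 2 * 60 + 45  # 165 s per percent

print(f"{slow - fast} s faster per percent")           # 51 s faster per percent
print(f"{percent_decrease(slow, fast):.1f}% less time")  # 23.6% less time
```

Note the clock speed difference (2551MHz vs 2401MHz) is only about 6%, so clock speed alone would not account for a gap this size.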

 

when i use PC Wizard eXPerience Edition (which is older than the current one out) and benchmark my [email protected] i get the following.

 

<<< Processor Benchmark >>>

> Addition/Multiplication (SSE) : 3064.11 MFLOPS

> Division (SSE) : 1094.94 MFLOPS

> Square Root (SSE) : 840.11 MFLOPS

> Addition/Multiplication (3DNow!) : 3765.28 MFLOPS

> Division (3DNow!) : 493.89 MFLOPS

> Square Root (3DNow!) : 341.4 MFLOPS

> Whetstone : 873 KWPS

>> General Information

Processor : AMD Athlon

Edited by Trakfast11


i did some looking around on the folding forums and it seems the -advmethods flag has something to do with it. try taking it off if you currently use it, or adding it if you don't. with the flag mine uses SSE and takes about 15 min per frame; without it i waited about 30 mins and it never completed a frame, and it didn't say anything about either SSE or 3DNow.


cool, thx for the post sykocus.

 

oh hey! i'm about to see the differences between 3DNow and SSE optimizations. my 2.4GHz cpu was using 3DNow optimizations on the wu (571) until i stopped it and added -forceasm to the command line. when it started back up it said Extra SSE boost (every time i've noticed, it said 3DNow on this comp until just after i added -forceasm). here are the times it got on this wu with each optimization. both results were obtained under the same conditions: only 2 apps were running while getting my results, MBM and the OCC webpage :D . i didn't touch the comp while i was letting it run until i actually went through the folders to open the .txt file. everything was the same except for the assembly optimization.

 

with 3DNow.

 

[00:17:54] Folding@home Gromacs Core

[00:17:54] Version 1.48 (May 7, 2003)

[00:17:54]

[00:17:54] Preparing to commence simulation

[00:17:54] - Looking at optimizations...

[00:17:54] - Created dyn

[00:17:54] - Files status OK

[00:17:54] - Go method

[00:17:55] - Expanded 542199 -> 3588921 (decompressed 661.9 percent)

[00:17:55] - Starting from initial work packet

[00:17:55]

[00:17:55] Project: 571 (Run 11, Clone 188, Gen 10)

[00:17:55]

[00:17:55] Assembly optimizations on if available.

[00:17:55] Entering M.D.

[00:18:28] Protein: p571_L939_K12M_nat

[00:18:28]

[00:18:28] Writing local files

[00:18:28] Extra 3DNow boost OK.

[00:18:32] Writing local files

[00:18:32] Completed 0 out of 500000 steps (0)

[00:28:00] Writing local files

[00:28:00] Completed 5000 out of 500000 steps (1)

[00:37:26] Writing local files

[00:37:26] Completed 10000 out of 500000 steps (2)

[00:46:50] Writing local files

[00:46:50] Completed 15000 out of 500000 steps (3)

[00:56:16] Writing local files

[00:56:16] Completed 20000 out of 500000 steps (4)

[01:05:40] Writing local files

[01:05:40] Completed 25000 out of 500000 steps (5)

 

 

about 9 minutes and 26 seconds per frame

 

 

 

 

 

now with SSE.

 

Arguments: -service -forceasm

 

[01:11:29] - Ask before connecting: No

[01:11:29] - User name: Trakfast11 (Team 12772)

[01:11:29] - User ID = 2D5AE2125FD4538

[01:11:29] - Machine ID: 2

[01:11:29]

[01:11:29] Loaded queue successfully.

[01:11:29] + Benchmarking ...

[01:11:31]

[01:11:31] + Processing work unit

[01:11:31] Core required: FahCore_78.exe

[01:11:31] Core found.

[01:11:31] Working on Unit 05 [July 24 01:11:31]

[01:11:31] + Working ...

[01:11:31]

[01:11:31] *------------------------------*

[01:11:31] Folding@home Gromacs Core

[01:11:31] Version 1.48 (May 7, 2003)

[01:11:31]

[01:11:31] Preparing to commence simulation

[01:11:31] - Ensuring status. Please wait.

[01:11:48] - Assembly optimizations manually forced on.

[01:11:48] - Not checking prior termination.

[01:11:48] - Go method

[01:11:49] - Expanded 542199 -> 3588921 (decompressed 661.9 percent)

[01:11:49]

[01:11:49] Project: 571 (Run 11, Clone 188, Gen 10)

[01:11:49]

[01:11:49] Assembly optimizations on if available.

[01:11:49] Entering M.D.

[01:12:36] (Starting from checkpoint)

[01:12:36] Protein: p571_L939_K12M_nat

[01:12:36]

[01:12:36] Writing local files

[01:12:36] Completed 25000 out of 500000 steps (5)

[01:12:36] Extra SSE boost OK.

[01:19:57] Writing local files

[01:19:57] Completed 30000 out of 500000 steps (6)

[01:27:18] Writing local files

[01:27:18] Completed 35000 out of 500000 steps (7)

[01:34:39] Writing local files

[01:34:39] Completed 40000 out of 500000 steps (8)

[01:42:01] Writing local files

[01:42:01] Completed 45000 out of 500000 steps (9)

 

 

finished a frame every 7 minutes and 21 seconds.
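If you want to work out the per-frame time from your own log instead of by hand, here is a rough sketch. It assumes the log lines look like the `Completed N out of M steps` lines pasted above; the function names and the regular expression are my own, not anything from the client.

```python
import re

# Matches lines such as: [00:28:00] Completed 5000 out of 500000 steps (1)
COMPLETED = re.compile(r"\[(\d\d):(\d\d):(\d\d)\] Completed \d+ out of \d+ steps")


def frame_times(log_text: str) -> list[int]:
    """Seconds between consecutive 'Completed' lines."""
    stamps = []
    for line in log_text.splitlines():
        m = COMPLETED.search(line)
        if m:
            h, mi, s = (int(x) for x in m.groups())
            stamps.append(h * 3600 + mi * 60 + s)
    # The log shows only the time of day, so wrap around midnight.
    return [(b - a) % 86400 for a, b in zip(stamps, stamps[1:])]


def mean_frame_time(log_text: str) -> float:
    deltas = frame_times(log_text)
    return sum(deltas) / len(deltas)


def fmt(seconds: float) -> str:
    m, s = divmod(round(seconds), 60)
    return f"{m}:{s:02d}"


THREEDNOW_LOG = """\
[00:18:32] Completed 0 out of 500000 steps (0)
[00:28:00] Completed 5000 out of 500000 steps (1)
[00:37:26] Completed 10000 out of 500000 steps (2)
[00:46:50] Completed 15000 out of 500000 steps (3)
[00:56:16] Completed 20000 out of 500000 steps (4)
[01:05:40] Completed 25000 out of 500000 steps (5)
"""

SSE_LOG = """\
[01:12:36] Completed 25000 out of 500000 steps (5)
[01:19:57] Completed 30000 out of 500000 steps (6)
[01:27:18] Completed 35000 out of 500000 steps (7)
[01:34:39] Completed 40000 out of 500000 steps (8)
[01:42:01] Completed 45000 out of 500000 steps (9)
"""

print("3DNow:", fmt(mean_frame_time(THREEDNOW_LOG)), "per frame")  # 3DNow: 9:26 per frame
print("SSE:  ", fmt(mean_frame_time(SSE_LOG)), "per frame")        # SSE:   7:21 per frame
```

The timestamps are copied from the two logs above; the "Writing local files" lines are left out because the pattern ignores them anyway.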

 

will try to be brief-ish: sorry for length.

in short (for this particular wu) 3DNow took about 2 minutes more to complete a percentage than its SSE counterpart. so using SSE over 3DNow (for this wu) would give you about 200 extra minutes (100 percent x 2 minutes) to use for processing another wu (in other words, the wu finishes about 3.3 hours sooner).
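The estimate can be sketched as a quick calculation. It assumes 100 frames per wu (one per percent, as in the logs above) and uses the measured averages of 9:26 and 7:21 alongside the rounded 2 minute figure; `wu_savings_minutes` is just an illustrative name.

```python
def wu_savings_minutes(slow_frame_s: float, fast_frame_s: float, frames: int = 100) -> float:
    """Minutes saved over a whole work unit, given per-frame times in seconds."""
    return (slow_frame_s - fast_frame_s) * frames / 60


measured = wu_savings_minutes(9 * 60 + 26, 7 * 60 + 21)  # 125 s per frame
rounded = wu_savings_minutes(120, 0)                     # "about 2 minutes" per frame

print(f"measured: {measured:.0f} min ({measured / 60:.1f} h)")  # measured: 208 min (3.5 h)
print(f"rounded:  {rounded:.0f} min ({rounded / 60:.1f} h)")    # rounded:  200 min (3.3 h)
```

So the rounded figure slightly understates the gain; with the measured frame times the difference is closer to three and a half hours on this wu.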

 

this is with T-Breds (Bartons should yield similar results). i do not know how these optimizations would affect intel's P4 or other processors. if anyone else happens to stop and change optimizations somewhere in the middle of a wu (with both running under the same conditions, preferably with no other apps running) i'd be really interested in seeing your results.

 

 

looks like this partially explains why i'm getting a greater amount of points per day on average for the last few days.

Edited by Trakfast11


  • 2 weeks later...

Explain this to me like I'm a complete idiot.

 

Step by step it for me.

 

Right now I'm using the GUI, and I have no idea about any 3DNow or SSE or whatever you're talking about, so lead me through it.

 

Optimized = good.


you may want to download the command-line (text based) folding program instead of the GUI (graphical user interface; in this case it's the one with the color picture of proteins folding). i'm not sure how, or if, those arguments can be added while using the GUI. you can download the text-only console here. and here is a quote that should help after you download and get the console version running. the green parts are parts that i edited to be more specific than my original post. i just used the original post as a base.

you'll want to add -forceasm to the end of your command line. if you made FAH a service, add -forceasm in there. if you don't have a service running (if you don't know whether you have set up a service, then you don't have a FAH service running; you may have folding@home set up and running, but not as a service), make a shortcut to your FAH console. to do this, right-click the file named FAH3Console and choose "Create Shortcut". next, right-click the shortcut you just created and select "Properties". now you should see a line that says Target, which points to where your FAH3Console.exe file is located. go to the Target line and add -forceasm to the end of it. it should look something like this:

 

Target: D:\FAH\FAH3Console.exe -forceasm

 

 

apply the changes, close the Properties window, and then double-click your shortcut (this is what you do if you are using the command-line interface). there are more details on the different interfaces in this forum (also info on setting up a service).

 

hope this helps clear things up a bit
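To confirm which optimization the core actually picked after restarting, you can look for the "Extra ... boost OK." line in the log. A small sketch of that check follows; the pattern is based only on the two log lines shown earlier in this thread, and `last_boost` is a made-up helper name.

```python
import re

# Based on the lines "Extra 3DNow boost OK." and "Extra SSE boost OK." seen in the logs above.
BOOST = re.compile(r"Extra (SSE|3DNow) boost OK")


def last_boost(log_text: str):
    """Return 'SSE' or '3DNow' for the most recent boost line, or None if there is none."""
    found = BOOST.findall(log_text)
    return found[-1] if found else None


sample = """\
[00:18:28] Extra 3DNow boost OK.
[01:11:48] - Assembly optimizations manually forced on.
[01:12:36] Extra SSE boost OK.
"""

print(last_boost(sample))  # SSE
```

To run it against a real log, read the file into a string first and pass it in; the log file's name and location depend on your install, so that part is left out here.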

Edited by Trakfast11


I've had SSE Boost since I started folding, and it takes me about 15 minutes or more to get one step!!!

Dude which WU is that for?

I'm only taking 18mins approx. for a step (GUI) for project 663

 

 

[06:42:36] Finished a frame (55)

[06:59:54] Finished a frame (56)

[07:17:10] Finished a frame (57)

[07:34:28] Finished a frame (58)

[07:51:44] Finished a frame (59)

[08:09:01] Finished a frame (60)

[08:26:18] Finished a frame (61)

[08:44:34] Finished a frame (62)

[09:01:53] Finished a frame (63)

[09:19:10] Finished a frame (64)

[09:36:28] Finished a frame (65)

[09:53:45] Finished a frame (66)

[10:11:01] Finished a frame (67)

[10:28:19] Finished a frame (68)

 

Well you get the drift...........

 

I'm only using a K62 Chomper (pigdog) :lol:

That wouldn't be for every WU, some are more complex than others.


Alright, so I guess my next question is - how do I make F@H a service? I searched the forums, and saw the little linked thing to how to make it a service, but unfortunately the download links to the .zips are dead.

Edited by Montol


Heh, yeah, I figured out the What on my own. I just need the How now. Maybe I'll Google it

 

Edit: I found something on Google (Folding at home service) and there's a good guide on Overclockers Australia to follow. The version of FireDaemon they use in it is a little outdated, but the new ones are similar.

 

So, now I'm running it as a service, and instead of 13-15 minutes for one unit, I'm getting as low as 7 minutes. Booooyah

Edited by Montol
