
Bottlenecks Re-examined: Lightmark, an OpenGL Benchmark


Thraxz


Purpose:

A bottleneck is the point at which one component of a system slows down the rest; a similar concept is the rate-limiting step in a chain of chemical reactions. In computers, and in gaming specifically, they appear frequently. Despite their common occurrence, most people seem to have little or no knowledge of how exactly they work and affect system performance. As such, bottlenecks are worth studying and characterizing more clearly, so that everyone can get a firm grasp of them and reflect that understanding in their choice of hardware.

 

Experiment:

The experiment was run on the new Lightmark graphics benchmark with an eVGA 8800GTX at 650/1800 clocks, a Q6600 processor at 3.0GHz, and DDR2 RAM at 1000MHz with 5-5-5-12 timings, all on a DFI Infinity 975X/G motherboard.

 

The procedure was to set all quality fields in the drivers to their highest-quality settings and then take two sets of data points. The first set was collected over a progressive series of benchmark runs at standard 4:3 resolutions from 640x480 through 1600x1200 at 0xAA/0xANISO. The second set was collected at the same resolutions, except at 16xQ CSAA/16x ANISO, also set in the drivers.

 

Results:

Resolution    FPS 0x/0x    FPS 16x/16x
640x480       257.3        208.3
800x600       248.3        190.0
1024x768      239.7        172.0
1280x1024     235.3        123.5
1600x1200     219.5        91.3

 

Analysis:

Plotting the percentage pixel increase over the baseline (640x480), you can clearly see that the pixel load increases exponentially with resolution. Since the graphical output (FPS) of a card working as fast as it can is inversely related to the graphical load, the graphical output will be decreasing exponentially. The CPU load is not as dependent on the raw pixel load, since the GPU was made specifically to do that job; however, the CPU still receives a SMALL amount of extra load, as it must process and direct more information to the GPU for graphical processing. From this we can conclude that the CPU's exponential curve will be significantly shallower than the GPU's.

This situation is ripe for what are considered bottlenecks. At low resolutions the graphical load will be light, allowing the GPU to output extremely high FPS. The CPU's load will also lighten, but very little, allowing it to output only a few more FPS. Since the machine can only output as many FPS as the SLOWEST COMPONENT of the system, you'll have what is called a bottleneck: the CPU will not be able to output as many FPS as the GPU, so the observed FPS output of the system will only be as much as the CPU can deliver.
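As a rough illustration of that "slowest component wins" mechanism, here is a minimal Python sketch. All constants (the base FPS values and the CPU's load sensitivity) are made-up assumptions for illustration, not measured values:

```python
# Minimal sketch of the bottleneck mechanism: observed FPS is capped by the
# slowest component, i.e. min(CPU-limited output, GPU-limited output).
# All constants here are illustrative assumptions, not measured values.

RESOLUTIONS = [(640, 480), (800, 600), (1024, 768), (1280, 1024), (1600, 1200)]
BASE_PIXELS = 640 * 480

def gpu_fps(width, height, base_fps=900.0):
    """GPU output is inversely proportional to raw pixel load."""
    load = (width * height) / BASE_PIXELS
    return base_fps / load

def cpu_fps(width, height, base_fps=260.0, sensitivity=0.1):
    """CPU output falls only slightly with pixel load (it mostly feeds the GPU)."""
    load = (width * height) / BASE_PIXELS
    return base_fps / (1 + sensitivity * (load - 1))

for w, h in RESOLUTIONS:
    observed = min(cpu_fps(w, h), gpu_fps(w, h))
    limiter = "CPU" if cpu_fps(w, h) < gpu_fps(w, h) else "GPU"
    print(f"{w}x{h}: {observed:6.1f} FPS ({limiter} bound)")
```

Run it and the made-up system is CPU bound at low resolutions and flips to GPU bound once the pixel load grows large enough, which is exactly the crossover behavior discussed below.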

 

[Attached graph: FPS vs. resolution for the 0x/0x and 16x/16x data sets with exponential trendlines]

 

In the first graph, there is very strong agreement between the exponential trendlines and the actual data sets; the 0x/0x fit is nearly perfect, with an R value of ~1. The single smooth curve is consistent with a CPU bottleneck, as there is no obvious intersection of outputs where a steeper graphical-output curve would overtake the CPU curve as resolution increased. It should be noted that the 1600x1200 0x/0x value agrees slightly less with the exponential curve than the rest of the data points in the set. With that point truncated from the data set, the R value of the exponential trendline increases even further toward 1, indicating a better fit. This could be caused by the GPU just beginning to intersect and become the limiting component. Another test at a higher resolution would show this with more certainty, but there was no monitor of that resolution to test on.
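For anyone who wants to reproduce the trendline fit, here is a sketch in Python using numpy. The fit form y = a*exp(b*x) is an assumption matching a spreadsheet-style exponential trendline, and the R^2 computed here is the standard coefficient of determination:

```python
# Fit an exponential trendline y = a * exp(b * x) to the 0x/0x data
# (x = pixel count) and report the goodness of fit.
import numpy as np

pixels = np.array([640*480, 800*600, 1024*768, 1280*1024, 1600*1200], float)
fps_00 = np.array([257.3, 248.3, 239.7, 235.3, 219.5])

# Linear fit in log space: ln(y) = ln(a) + b*x
b, ln_a = np.polyfit(pixels, np.log(fps_00), 1)
predicted = np.exp(ln_a + b * pixels)

# Coefficient of determination (R^2)
ss_res = np.sum((fps_00 - predicted) ** 2)
ss_tot = np.sum((fps_00 - np.mean(fps_00)) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"y = {np.exp(ln_a):.1f} * exp({b:.3e} * x), R^2 = {r_squared:.4f}")
```

Dropping the last element of both arrays before fitting reproduces the "truncated" fit described above.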

 

The 16x/16x set also has a high R value, indicating a good fit; however, its R value is significantly lower than the 0x/0x set's. Upon closer visual inspection, there is clearly the expected intersection between the CPU output curve and the GPU output curve at the 1024x768 resolution. At 1280x1024 and 1600x1200 there is an obviously steeper rate of FPS decline than at 1024x768 or below. To verify that this is not an anomaly, I made two separate curves from the single 16x/16x data set: one covering 640x480 through 1024x768 and another covering 1024x768 through 1600x1200.
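That split can be sketched the same way, again assuming the exponential fit form; the segment boundaries come straight from the data above:

```python
# Split the 16x/16x data at 1024x768 and fit each segment separately.
import numpy as np

def fit_exp(x, y):
    """Return (a, b) for y = a * exp(b*x) via a linear fit in log space."""
    b, ln_a = np.polyfit(x, np.log(y), 1)
    return np.exp(ln_a), b

pixels = np.array([640*480, 800*600, 1024*768, 1280*1024, 1600*1200], float)
fps_16 = np.array([208.3, 190.0, 172.0, 123.5, 91.3])

a_lo, b_lo = fit_exp(pixels[:3], fps_16[:3])   # 640x480..1024x768 (shallow, CPU bound)
a_hi, b_hi = fit_exp(pixels[2:], fps_16[2:])   # 1024x768..1600x1200 (steep, GPU bound)
print(f"low-res segment:  decay rate b = {b_lo:.3e}")
print(f"high-res segment: decay rate b = {b_hi:.3e}")
```

A noticeably more negative decay rate on the high-resolution segment is what the "steeper curve" claim amounts to numerically.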

 

[Attached graph: the 16x/16x data set split into a shallow low-resolution curve and a steep high-resolution curve]

 

The shallow, low-resolution curve is clearly very close to the fully CPU-bound 0x/0x curve, indicating that it is indeed a CPU-bound section of the 16x/16x data set. It should be noted that the curve IS placed approximately 50FPS lower than its CPU-bound counterpart, 0x/0x, indicating that either AA or ANISO DOES have a negative impact on CPU output.

 

[Attached graph: the steep high-resolution portion of the 16x/16x curve]

 

The second, steeper curve created from the 16x/16x data set still needs to be shown to be actually GPU bound, and not some mysterious depression of CPU output. This will be done by comparing the observed data points to the theoretically predicted values. Because only two points of this curve are unequivocally decreasing more steeply, analysis will be done on them: the 1280x1024 and 1600x1200 resolutions.

 

Theoretically, 1600x1200 is approximately a 46% heavier pixel load on the GPU than 1280x1024 (1,920,000 vs. 1,310,720 pixels). From that, we can figure that the system output, if GPU bound, should differ by relatively close to that value. Taking the 1280x1024 data point and dividing it by the 1600x1200 data point (since output is inversely proportional to load), we should see a ~46% difference; we actually see 123.5/91.3, about a ~35% difference in performance, clearly supporting the claim that this is a largely GPU-bound result!
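The arithmetic, spelled out as a quick sanity check:

```python
# Worked numbers from the paragraph above: theoretical pixel-load increase
# vs. observed FPS difference between 1280x1024 and 1600x1200 at 16x/16x.
px_1280 = 1280 * 1024          # 1,310,720 pixels
px_1600 = 1600 * 1200          # 1,920,000 pixels

theoretical = px_1600 / px_1280          # ~1.46 -> ~46% heavier pixel load
observed    = 123.5 / 91.3               # ~1.35 -> ~35% FPS difference

print(f"theoretical load increase: {theoretical - 1:.0%}")
print(f"observed FPS difference:   {observed - 1:.0%}")
```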

 

Conclusions:

 

It has become clear that both the CPU and GPU output curves decrease exponentially under exponentially increasing graphical load. It has also been clarified that the GPU curve is steeper, beginning at a much higher FPS but sloping down quickly and finishing at a much lower FPS than the CPU curve. It is upon this common mechanism that virtually all bottlenecking occurs. To those who fear CPU bottlenecking so much: there is such a thing as a GPU bottleneck as well. The only way to avoid both is to ride the inflection point just between being CPU bound and GPU bound, which can be adjusted by increasing the graphical load through added detail. With this particular application and hardware, that point is at 1024x768 with 16xAA/16x ANISO or, as best as we can tell, at 1600x1200 with 0xAA/0xANISO. Here is a hastily thrown together graph of how the GPU/CPU curves intersect.

 

[Attached graph: a rough sketch of how the CPU and GPU output curves intersect]
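If you'd rather locate that crossover numerically than read it off a graph, it falls out of the two exponential fits directly. The constants below are purely illustrative stand-ins, not the fitted values from this article:

```python
# Solve for the pixel load where the CPU and GPU output curves intersect,
# given two hypothetical fits of the form y = a * exp(b * x).
import math

# CPU-limited output: moderate base, shallow decay; GPU-limited: high base, steep decay
a_cpu, b_cpu = 260.0, -2.0e-7
a_gpu, b_gpu = 900.0, -1.0e-6

# a_cpu*exp(b_cpu*x) = a_gpu*exp(b_gpu*x)  =>  x = ln(a_gpu/a_cpu) / (b_cpu - b_gpu)
x_cross = math.log(a_gpu / a_cpu) / (b_cpu - b_gpu)
fps_cross = a_cpu * math.exp(b_cpu * x_cross)
print(f"crossover at ~{x_cross:,.0f} pixels ({fps_cross:.0f} FPS)")
```

Below that pixel count the system is CPU bound; above it, GPU bound.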

 

It is also worth keeping in mind that your LCD monitor, should it go over the venerable 1280x1024 resolution, only has a 60Hz refresh rate. Meaning: your monitor is REALLY bottlenecking all performance above 60FPS, REGARDLESS of graphical load or CPU/GPU balancing.


So this would mean an AMD CPU... say model [email protected] and an 8800GTS 320 at 588/920... the CPU would feed enough information to the card to keep its output over 60FPS during your test with AA and AF off? Thereby decreasing the chances of it dipping below the 60FPS rate? Is that right?

 

What I'm trying to determine is how powerful a CPU (AMD please, cuz I'm not familiar with Intel CPUs) it would take to make the bottlenecking a moot point [because of the 60FPS LCD cap] when, for instance, playing a game with AA and AF in "application control mode", which is usually around AFx2-4 and AAx4-8?

 

Nice job on the research, Thraxz, thanks for your time and effort!


A 3.0GHz AMD is the gaming equivalent of a 2.4GHz Intel. I suggest that an 8800GTS will NOT be CPU bottlenecked at 16x/16x at any resolution above 1024x768. If it were, it would be insignificant in the face of the 60Hz monitor refresh rate.


Well done, Thraxz!

 

It's refreshing to see some data vs. opinion on a subject like this. The amount of passion and discussion expended on GPUs and FPS always astounds me, and it's mostly all opinion.


Yea, very nice. Some people do use the word bottleneck incorrectly. They recommend a Core 2 over the AMD because the Core 2s are just faster overall, and the newer games need the faster speeds/cores.

 

The CPU isn't bottlenecking the graphics performance. When I run Crysis it uses both my cores at 100%, so 200% total. My friend has a quad, and it runs all 4 of his at 75%, which means 300%. So I'm losing 100% worth of that performance.


What I'm trying to determine is how powerful a CPU (AMD please, cuz I'm not familiar with Intel CPUs) it would take to make the bottlenecking a moot point [because of the 60FPS LCD cap] when, for instance, playing a game with AA and AF in "application control mode", which is usually around AFx2-4 and AAx4-8?

 

Theoretically, the 3.0GHz AMD will only be 80% the speed of a 3.0GHz Intel. Given the CPU performance curves found in my article, that puts the 16x/16x CPU performance at ~150FPS at 16x12; assuming the AMD would be 80% of that, you'd STILL be at ~120FPS.

