graysky

x264 video encoding benchmark


the readme is blank, i'm telling you that website put up bad files or something, but anyway it seems to be somehow working for other people.

No it's not...I just looked at it.

 

  x264 Benchmark

by graysky

 

WHAT IS THIS TEST?

 

Simply put, it is just a measure of how fast your machine can encode a short video clip to a high quality x264 video file. What's x264 you ask? It's more or less the next generation xvid/divx for many people. I think it's ideal for a benchmark because the application (x264.exe) reports a fairly accurate speed figure (frames per second) for each pass of the video encode and it also uses multicore processors very efficiently.

 

You'll notice that the whole thing is pretty simplistic since I have no programming skills to speak of. The test basically consists of the needed executables, the video file (donated by Adrian over at TechARP), the avisynth script, and the DGIndex project file, all driven by a batch file that'll kick off the x264 encode and write a "results.txt" that you can upload along with your machine specs for comparison purposes.

 

The video content really doesn't matter since the clip is a DVD formatted, progressive, MPEG-2 stream, 720x480, 23.976 fps, etc. The important thing is that all those participating in the test use the same clip, the same version of x264, the same version of avisynth etc. so the results can be meaningful. That's the whole reason I put this together.

 

INITIAL SETUP

 

MAKE SURE that you unrared the contents of "x264-Benchmark.rar" to c:\work2 or else nothing will work!

 

If you don't have AviSynth 2.5.7 on your machine, you'll need it for the benchmark to run. You can get it from this Link. After you have installed it, copy "DGDecode.dll" from the "Initial setup" directory to your C:\Program Files\AviSynth 2.5\plugins directory.

 

RUNNING THE TEST

 

Simply double-click the shortcut named "x264 Benchmark" and it will prompt you to name your results file. I usually just type in my CPU settings. For example, when running at factory settings, I'll use "9x266", or when I'm overclocking, I'll use the multiplier and FSB from the overclock settings.

 

It really doesn't matter what you use; it's just to help you remember which set of results corresponds to which overclock. After you hit <ENTER> the benchmark will start.

 

It will run through a 2-pass encode of a small 720x480 video clip a total of five times and then write a file containing the results to the c:\work2 directory.

 

Please do not use your machine while the test runs to allow for an accurate result.

 

You can test your machine at different overclocked levels, but it is a good idea to also test it @ its stock level as a baseline so you can compare the overclocked results back to it.

 

REPORTING RESULTS

 

Simply post the contents of results.txt (or whatever you ended up naming it) to the main thread along with some hardware details, which you can get from CPU-Z:

 

* Your processor model

* Your multiplier and FSB settings for the test

* Your motherboard's chipset

* Your memory timings (just the first 4) and the speed at which you're running your memory

* Your operating system

 

Example from the screenshots: Q6600, 9x333, P965, 4-4-4-10 @ 333 MHz, XP Pro SP2

 

Since the output is merely a text file, you can recognize the potential for people to cheat by simply making-up their own results, or skewing the real data. I don't know of a way to eliminate this (as I said, I really have no experience programming). All I can say about this is please don't. No one will think you're cool because you have faster numbers!

 

Finally, I don't plan on entering everyone's results into the 'official' table; it would make the table of results too massive and difficult to read. This benchmark really wasn't meant to be a diary or scoreboard for people. I think it can be a cool way to compare your results to a few reference machines. My hope is to populate the table with results from several processors and new chips (such as the AMD Phenom and the Intel Penryn when they are released).

 

Enjoy and thanks for participating!

UNNECESSARY INFORMATION

 

All the info below isn't required for the test. It's here if you're curious about the video file, x264 encoding, more specifics about this benchmark etc.

 

Q : What is avisynth and why did I have to install it?

 

Avisynth is a frameserver. It allows the DVD video (MPEG-2 format) to be fed to x264 as if it were a standard avi file. x264 cannot read MPEG-2 videos without it.

 

Q: What does 480p mean? Is it like 720p or 1080i?

 

Well, technically, it's 480p24. The 480 means there are 480 lines of vertical resolution (480 horizontal scan lines). The p means the video frames are progressive (i.e. not interlaced), and the 24 means there are 24 frames per second of video (23.976 fps to be exact). 480p24 is the standard for most SDTV (standard definition TV) movies on DVD. The horizontal resolution of a 480p formatted DVD is usually 720, giving a 720x480 picture, but it can be other values as well.

 

Yes, it's like 720p, only instead of 720 lines of vertical resolution there are 480. It's also like 1080i, only instead of 1080 lines there are 480, and the frames are progressive rather than interlaced. Also, as you may or may not know, 720p and 1080i (or 1080p) are both HDTV (high definition) formats. As I mentioned above, 480p (and 480i) are SDTV (standard definition) formats.

 

Roughly to scale, here are the differences in resolution between the various SD and HDTV formats:

 

If you think of them like your digital camera (i.e. in mega pixels or MPixels):

 

720x480 = 0.35 MPixels

1280x720 = 0.92 MPixels

1920x1080 = 2.07 MPixels
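
The figures above are just width times height divided by one million. Here is a quick sketch in Python to check them; the function name `megapixels` is my own and is only there for illustration.

```python
def megapixels(width, height):
    """Return the pixel count of a single frame in millions of pixels."""
    return width * height / 1_000_000


# The three resolutions compared above.
for width, height in [(720, 480), (1280, 720), (1920, 1080)]:
    print(f"{width}x{height} = {megapixels(width, height):.2f} MPixels")
```

Put another way, a 1080 frame carries about six times as many pixels as the 480p clip used in this benchmark.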

 

Q: What's the difference between progressive and interlaced frames?

 

If you've ever made a cartoon flipbook in a notepad, you've made a progressive frame movie in a sense. Progressive frames are just pictures set to display in a series. If you stop at any given frame, you'll see 100 % of the image as if it was a photograph. These photos are displayed in sequential order 1, 2, 3, etc.

 

Interlaced on the other hand is more complicated. Oversimplified, an interlaced frame is split into two fields: one holding the odd numbered lines and one holding the even numbered lines. The fields are displayed one after the other, at twice the frame rate. (3:2 pulldown, covered in the next question, is a related but separate technique that spreads 24 fps film across these fields.) For more on these concepts, see the following wikipedia articles: here and here.

 

Q: What's 3:2 Pulldown?

 

This is getting more difficult to explain without pictures, so I'll point you to the following links: here and here.
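
In lieu of pictures, here is a rough sketch of the idea in Python. It is a simplification: it ignores which field is the top or bottom one, and the function name `pulldown_32` is made up for this example. Each film frame is repeated as 3 fields, then 2, then 3, and so on, and every two consecutive fields are paired into one video frame.

```python
def pulldown_32(film_frames):
    """Spread film frames over fields in a 3, 2, 3, 2, ... pattern,
    then pair consecutive fields into interlaced video frames."""
    fields = []
    for index, frame in enumerate(film_frames):
        repeat = 3 if index % 2 == 0 else 2
        fields.extend([frame] * repeat)
    # Two consecutive fields make up one video frame; a leftover
    # single field at the end is dropped in this simplified sketch.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


video_frames = pulldown_32(["A", "B", "C", "D"])
print(video_frames)
# [('A', 'A'), ('A', 'B'), ('B', 'C'), ('C', 'C'), ('D', 'D')]
```

Four film frames become five video frames, which is how 24 fps film ends up at roughly 30 fps video. Two of the five video frames mix fields from different film frames, and that is where the combing you sometimes see on telecined material comes from.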

 

Q: Why does it encode two passes per file?

 

You can encode video in a single pass, but two passes will typically give a higher quality result. The first pass scans through the entire clip analyzing it so that the 2nd pass can use more bits on particular scenes and less on other scenes thus giving a higher quality (and more efficient) result.

 

In a single pass encode, the encoder doesn't have any idea what's coming up next in the video and is forced to guess based on what it's currently seeing. By the way, other video formats such as xvid, divx, and MPEG-2 (the format DVD movies use) also take advantage of multi-pass encodes.
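
To make the idea concrete, here is a toy sketch in Python. This is NOT how x264's rate control actually works; it only illustrates the principle that a first pass measures how demanding each scene is so a second pass can split a fixed bit budget accordingly. The function name and the complexity scores are made up.

```python
def allocate_bits(scene_complexity, total_bits):
    """Split total_bits across scenes in proportion to the complexity
    measured for each scene (toy model, not x264's real algorithm)."""
    total_complexity = sum(scene_complexity)
    return [total_bits * c / total_complexity for c in scene_complexity]


# Pretend the first pass scored three scenes: a static shot,
# an action scene, and something in between.
first_pass_scores = [1.0, 3.0, 2.0]
print(allocate_bits(first_pass_scores, 600_000))
```

The action scene gets three times the bits of the static shot while the total stays the same, which is the "more bits on particular scenes and less on other scenes" behavior described above.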

 

Q: Why does the first pass encode faster than the 2nd pass?

 

The first pass is just a scanning pass. This analysis can occur much faster than the actual encode (the 2nd pass) can. No doubt, you see this reflected in your benchmark results. On my machine, the 1st pass occurs roughly 4x faster than the 2nd pass.

 

Q : Why doesn't the first pass use 100 % of my multi-core processor?

 

As mentioned above, the 1st pass simply isn't as CPU-intensive as the 2nd pass. It's not uncommon to see less than 100 % usage on the first pass.

 

Q: What does all the info displayed in the beginning of the test mean?

 

Starting at the top, the Qf is an outdated measure of video quality that has been replaced by other metrics.

 

You can calculate it from the following formula:

  Qf = b / (v x h x f)

 

Where v is the vertical resolution, h is the horizontal resolution, f is the frame rate, and b is the bit rate. So it's really a measure of bits per pixel-frame. Even though there is much more than Qf that dictates video quality, it does still give you a rough idea. Typical values for Qf are between 0.10 and 0.25. The lower the number, the fewer bits/pixel-frame you have. Usually, people say that a Qf between 0.20 and 0.24 gives a "high quality" x264 video.
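
As a worked example, here is the Qf for this benchmark's own settings, taking Qf as bits per pixel-frame, i.e. the bit rate divided by (width x height x frame rate). The function name is my own, and I am assuming the 1823 passed to --bitrate means 1823 x 1000 bits per second.

```python
def quality_factor(bitrate_kbps, width, height, fps):
    """Bits per pixel-frame: bit rate / (width * height * frame rate).

    Assumes 1 kb/s = 1000 bits per second.
    """
    bits_per_second = bitrate_kbps * 1000
    return bits_per_second / (width * height * fps)


# Benchmark clip: 720x480 at 23.976 fps, encoded at 1823 kb/s.
print(round(quality_factor(1823, 720, 480, 23.976), 2))  # 0.22
```

That lands inside the 0.20 to 0.24 range people usually call "high quality".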

 

The rest of the paragraph reflects the x264 settings used in the commandline for the 2nd pass encode. Some of these are just settings controlling what gets displayed to the screen (such as --progress and --quiet); I'll provide a short description for the settings that affect the video quality. Just as an FYI, you can get the full help on x264 if you simply run it with the --fullhelp option. I literally copy/pasted most of this from the contextual help file provided with meGUI and some online guides which I referenced when I used their content:

 

x264 --pass 2 --progress --quiet --bitrate 1823 --stats "1.stats" --ref 5 --mixed-refs --no-fast-pskip --bframes 3 --b-pyramid --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse all --8x8dct --threads auto --thread-input --me hex --sar 427:360 --no-dct-decimate --no-psnr --no-ssim --output "C:\work2\run1-480p.mp4" "C:\work2\test\test-480p.avs" 2>&1 | tee run1pass2.log

 

ref 5 - Uses 5 reference frames. This setting controls how many previous frames can be referenced by a P- or B-frame. The greater the number, the higher the quality up to a point, but also the slower the encode. It's generally accepted that using more than 5 reference frames isn't practical: beyond 5 you hit diminishing returns, since the increase in quality is marginal compared to the increase in encoding time.

 

mixed-refs - Allows each 8x8 or 16x8 partition in a macroblock to independently select a reference frame, as opposed to only one reference frame per macroblock.

 

no-fast-pskip - Disables the "fast-pskip" option. Leaving fast pskip enabled can speed up the encode, but can lead to visual artifacts in flat scenes and gradients.

 

bframes 3 - Uses a maximum of 3 consecutive B-frames. Bi-directional Predictive Frames are highly compressed as they only store the data that has changed from the previous frame or that is different from the next frame. B-frames generally have lower quality than I- or P-frames but can increase the overall quality of the video by storing data very efficiently.

 

b-pyramid - Allows B-frames to be used as references for other B-frames, increasing compression efficiency when 2 or more B-frames are used.

 

b-rdo - Enables RDO (Rate Distortion Optimization) mode decision for B-frames. Improved motion estimation for B-frames at the cost of encoding speed.

 

bime - Enables Bi-Directional Motion Estimation. An additional option that searches for both forward and backward motion vectors when encoding a bi-directional B-frame, improving quality.

 

weightb - Enables B-frame Weighted Prediction. Uses "brightness" weighting of B-frames, which improves fades and color gradients (e.g. the sky in the background).

 

direct auto - Sets B-Frame Mode to auto (options are none/auto/spatial/temporal). This setting determines how motion vectors for B-frames are derived. Spatial uses neighboring blocks in the same frame which may result in higher PSNR. Temporal makes use of neighboring frames which may be perceived as higher quality. Auto selects direct mode per frame.

 

subme 6 - Sets Subpixel motion estimation and partition decision to level 6 (options are 1-7). Controls the decision quality of motion estimation. Lower values will make faster and less accurate decisions. Higher values will improve quality but will slow encoding speed. Every guide/document I have ever read recommends a setting NO lower than 5. 6 is widely accepted as a good setting for high quality videos.

 

trellis 1 - Tells the encoder when to use Trellis Quantization (options 0/1/2 where 0 is off, 1 applies it only on the final encode of each macroblock, and 2 applies it on all mode decisions). A setting of 2 drops the 2nd pass encode speed by about 50 %.

 

analyse all, 8x8dct - Macroblock analysis size options. A setting of all means to analyze all macroblock size options. The 8x8dct option is also required when encoding to high profiles.

 

threads auto - Sets the number of threads that x264 will use to 1.5 times your number of processor cores. Therefore, a dual core chip would use 3 threads; a quad core chip would use 6 threads, etc.
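
The rule of thumb above can be written out in a couple of lines of Python. The function name is hypothetical, and rounding down for odd core counts is my assumption; the text above only gives the dual and quad core cases.

```python
def auto_threads(cores):
    """Thread count per the rule above: 1.5 times the number of cores.

    Rounding down for odd core counts is an assumption.
    """
    return cores * 3 // 2


for cores in (2, 4):
    print(f"{cores} cores -> {auto_threads(cores)} threads")
```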

 

thread-input - Gives avisynth its own thread.

 

me hex - Motion Estimation Mode (options are diamond/hexagon/uneven-multi-hexagon/exhaustive). Controls the method used to search for motion vectors. The hex option searches horizontally, vertically, then diagonally. Uneven-multi-hex does the same thing except it does so via hexagons of varying sizes thus giving a higher quality result. Exhaustive search is not recommended for normal use as it is exceptionally slow and does not provide significant quality increases.

 

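sar 427:360 - Sample Aspect Ratio, i.e. the shape of each pixel. Sets the SAR to 427/360, which stretches the 720 pixel wide picture to 854 pixels on playback (720 x 427/360 = 854) and so gives the correct 16:9 ratio in the final video file (854/480 is about 1.78). Note that this is the width-to-height ratio of a single pixel, not the display ratio of the whole picture (16:9, 4:3, 37:20 for anamorphic wide screen, etc.).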
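
Here is the arithmetic behind that setting as a small Python sketch, using the 427:360 value from the command line above. The function name `display_size` is made up for illustration.

```python
from fractions import Fraction


def display_size(width, height, sar):
    """Return (display width, display aspect ratio) for a stored frame
    of width x height whose pixels have the given sample aspect ratio."""
    display_width = width * sar
    return display_width, display_width / height


display_width, aspect = display_size(720, 480, Fraction(427, 360))
print(display_width)            # 854
print(round(float(aspect), 3))  # 1.779
print(round(16 / 9, 3))         # 1.778
```

So the stored 720x480 frame plays back at 854x480, which is within a pixel or so of a true 16:9 picture.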

 

no-dct-decimate - Disables DCT-decimation. The following is taken from DeathTheSheep's x264 guide: Decimation


Looks like an interesting benchmark. Maybe you could use the Nullsoft Scriptable Install System (NSIS), a free installation packaging tool, to make the install process simpler. It's not exactly hard, but a "one-click" install would get you a lot more users imo

 

E.g.

 

Install to C:\Work2, or "C:\Program Files\x264 Benchmark Tool" might be better

Offer optional tick/check box for AviSynth 2.5.7

Add DGDecode.dll to AviSynth plugins directory (automatically, or with manual dialog "Browse to your AviSynth plugins folder")

Create shortcuts in a folder in the Start Menu

 

If you extract your rar package to C:\Work2, the files end up at C:\Work2\work2, because of the embedded folder...

Also, maybe you should ask people to state whether their RAM is running in dual or single channel, as you may have results from dual-channel capable motherboards running only a single stick. A small percentage yeah, but it could easily happen... maybe also 32-bit / 64-bit for the OS

 

Anyway, here are my results

 

Opteron 170 (Denmark = Toledo)

9 x 306 = 2754 MHz

NF4 (DFI NF4 Ultra-D/SLI-D)

3-4-4-8-1T @ 250.4 MHz (Dual Channel)

XP Pro SP2 x86

 

---------- RUN1PASS1.LOG
encoded 1749 frames, 68.55 fps, 1850.89 kb/s

---------- RUN2PASS1.LOG
encoded 1749 frames, 68.84 fps, 1850.89 kb/s

---------- RUN3PASS1.LOG
encoded 1749 frames, 69.01 fps, 1850.89 kb/s

---------- RUN4PASS1.LOG
encoded 1749 frames, 68.84 fps, 1850.89 kb/s

---------- RUN5PASS1.LOG
encoded 1749 frames, 68.71 fps, 1850.89 kb/s

---------- RUN1PASS2.LOG
encoded 1749 frames, 16.59 fps, 1826.38 kb/s

---------- RUN2PASS2.LOG
encoded 1749 frames, 16.56 fps, 1826.38 kb/s

---------- RUN3PASS2.LOG
encoded 1749 frames, 16.59 fps, 1826.37 kb/s

---------- RUN4PASS2.LOG
encoded 1749 frames, 16.54 fps, 1826.26 kb/s

---------- RUN5PASS2.LOG
encoded 1749 frames, 16.58 fps, 1826.38 kb/s

480p_results_9x306.txt
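
If you want a single number per pass rather than five, here is a minimal Python sketch that averages the fps figures from a results file laid out like the one above. The function name and the sample text are only for illustration (the sample reuses the first two runs of each pass from this post), and the parsing assumes the "RUNnPASSn.LOG" and "encoded ... fps" line format shown here.

```python
import re
from collections import defaultdict
from statistics import mean


def average_fps(results_text):
    """Return {pass number: mean fps} from the text of a results file."""
    fps_by_pass = defaultdict(list)
    current_pass = None
    for line in results_text.splitlines():
        header = re.search(r"RUN\d+PASS(\d+)\.LOG", line)
        if header:
            current_pass = int(header.group(1))
            continue
        encoded = re.search(r"([\d.]+) fps", line)
        if encoded and current_pass is not None:
            fps_by_pass[current_pass].append(float(encoded.group(1)))
    return {p: mean(values) for p, values in fps_by_pass.items()}


sample = """\
---------- RUN1PASS1.LOG
encoded 1749 frames, 68.55 fps, 1850.89 kb/s
---------- RUN2PASS1.LOG
encoded 1749 frames, 68.84 fps, 1850.89 kb/s
---------- RUN1PASS2.LOG
encoded 1749 frames, 16.59 fps, 1826.38 kb/s
---------- RUN2PASS2.LOG
encoded 1749 frames, 16.56 fps, 1826.38 kb/s
"""

for pass_number, fps in sorted(average_fps(sample).items()):
    print(f"pass {pass_number}: {fps:.2f} fps average")
```

To run it on a real file, read the file's contents into a string and pass that to `average_fps` instead of `sample`.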


not a total moron :P

 

I thought they were different for a while... like the Opteron production line was somehow different to the A64 line... and so the cores, although identically spec'd, had some sort of difference. Maybe they are, maybe a Toledo core has severed interconnects for the 2-way, 4-way and 8-way versions... maybe the A64 does too...

 

But it doesn't really matter, Toledo and Denmark are interchangeable, in the same way that Venus and San Diego are (as long as you are talking about the same number of cores, e.g. a single core Toledo is a bit of a misnomer, it's a Venus/San Diego with the other core disabled)


Thanks for the result, hardnrg. I toyed with the idea of using ISS or NIS but I wanted people to see that the benchmark, although based on a batch file, wasn't a virus/dangerous etc. I think making people manually copy a file is as transparent as I can be!


Oh, no, I wasn't talking about security issues... I'm talking about laziness... Most people won't bother with your benchmark because you need to download it, manually extract it to a specific folder, download and install AviSynth, move/copy a .dll file, run the batch file, type a filename, and then attach or paste the results file here...

 

That's a lot of steps for basically no reason apart from helping a random person build a performance statistics database with no clear purpose.

 

I think creating a simple GUI front-end executable and a one-click installer will help you get much greater results. If you can't create a GUI, then just the installer would help a *lot*.


Jeez five runs seems a bit much! I think the average of two would be fine. Cool idea though, this is what I would consider a real-world benchmark. I don't try to compute the longest calculation of Pi, but I do actually use this encoder.

Jeez five runs seems a bit much! I think the average of two would be fine. Cool idea though, this is what I would consider a real-world benchmark. I don't try to compute the longest calculation of Pi, but I do actually use this encoder.

 

It's real-world since most people use x264 to encode videos :) You planning to give it a whirl on your system?


As of 20-Sep-2007, we have data on over 100 Intel-based systems and on over 40 AMD-based systems. There are a few trends I picked-up on while browsing through the database. I put them into a single table and color coded them to make them easier to see. If you see a trend I missed, lemme know and I'll add it to the table.

 

Request: we don't have a single example of a machine that has both WinXP and WinVista on it. If you have a dual-boot setup, it would be cool to see the difference the O/S makes. Another missing trend is a 32-bit O/S vs. the same O/S that's 64-bit.

 

On to the table:

 

resultstrendsml9.gif

 

Yellow: Nearly 1:1 increase by adding an additional processor to a dual-chip MB

Orange: Some operating systems seem to handle x264 more efficiently than others

Red: Insignificant gain by upping the DRAM speed by 50 %

Blue: For the most part, these chips scale in a pretty linear fashion

Green: Tighter/looser memory timings have a pretty insignificant effect

Purple: Keeping the same over-all clock speed using a different combo of multiplier and FSB can give pretty insignificant gains

 

Again, I only gave this a once-over look; please point out any trends you see that I missed and also don't forget about the O/S request!

 

Thanks again to all who contributed!

Edited by graysky


C2D E6700

10 x 350 = 3500MHz

MSI P6n SLI Plat (MS-7350)

5-5-5-15 @ 350 MHz (Single Channel)

Vista U 32bit.

 

---------- RUN1PASS1.LOG
encoded 1749 frames, 91.46 fps, 1850.89 kb/s

---------- RUN2PASS1.LOG
encoded 1749 frames, 88.45 fps, 1850.89 kb/s

---------- RUN3PASS1.LOG
encoded 1749 frames, 85.82 fps, 1850.89 kb/s

---------- RUN4PASS1.LOG
encoded 1749 frames, 87.34 fps, 1850.89 kb/s

---------- RUN5PASS1.LOG
encoded 1749 frames, 85.38 fps, 1850.89 kb/s

---------- RUN1PASS2.LOG
encoded 1749 frames, 23.68 fps, 1826.37 kb/s

---------- RUN2PASS2.LOG
encoded 1749 frames, 23.31 fps, 1826.37 kb/s

---------- RUN3PASS2.LOG
encoded 1749 frames, 22.49 fps, 1826.37 kb/s

---------- RUN4PASS2.LOG
encoded 1749 frames, 22.95 fps, 1826.37 kb/s

---------- RUN5PASS2.LOG
encoded 1749 frames, 23.35 fps, 1826.37 kb/s

Edited by M3NF

Request: we don't have a single example of a machine that has both WinXP and WinVista on it. If you have a dual-boot setup, it would be cool to see the difference the O/S makes. Another missing trend is a 32-bit O/S vs. the same O/S that's 64-bit.

If I can get my XP install to boot again (I sorta borked it up) I'll try it this weekend.


CPU: 400x8 (3.2GHz) - e2140
Chipset: Intel P35 (abit IP35-E)
Memory: 4-4-4-12 @ 400MHz (Dual-channel DDR2 [800MHz effective])
OS: Windows Vista Ultimate x86

---------- RUN1PASS1.LOG
encoded 1749 frames, 87.11 fps, 1850.89 kb/s

---------- RUN2PASS1.LOG
encoded 1749 frames, 87.86 fps, 1850.89 kb/s

---------- RUN3PASS1.LOG
encoded 1749 frames, 87.79 fps, 1850.89 kb/s

---------- RUN4PASS1.LOG
encoded 1749 frames, 87.80 fps, 1850.89 kb/s

---------- RUN5PASS1.LOG
encoded 1749 frames, 88.77 fps, 1850.89 kb/s

---------- RUN1PASS2.LOG
encoded 1749 frames, 20.93 fps, 1826.38 kb/s

---------- RUN2PASS2.LOG
encoded 1749 frames, 21.31 fps, 1826.37 kb/s

---------- RUN3PASS2.LOG
encoded 1749 frames, 21.28 fps, 1826.37 kb/s

---------- RUN4PASS2.LOG
encoded 1749 frames, 21.19 fps, 1826.37 kb/s

---------- RUN5PASS2.LOG
encoded 1749 frames, 21.26 fps, 1826.37 kb/s

 

I'm actually pretty surprised by this chip (it's up there with M3NF's E6700 @ 3.5GHz???). Every 33MHz I add on seems to yield another FPS. I think I can break 100FPS if I'm willing to risk a meltdown :D
