
Parallel Programming Compiler Suite - PathScale
  EKOPath Compiler Suite

Benchmark Results

PathScale EKOPath Compiler Suite is the highest-performance 64-bit compiler for Linux on AMD Opteron processors. The benchmark comparisons below make our performance advantage plain to see!

Click a link to view the comparison figures.

AM2: 36% faster
HIMENO BMT: 20% faster
POLYHEDRON: 33% faster
QUANTUM MONTE CARLO: 73% faster
SPEC CPU2000: 26% faster
SPEC OMP2001: World Record Results at 2P & 4P
STREAM: 16% faster

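Results obtained with your own source code will of course vary, but we believe the PathScale C, C++, and Fortran compilers will bring immediate benefits to AMD Opteron and Athlon64 users!

Do you have benchmark results from the PathScale EKOPath compilers that you would like to share with other users? Don't forget to tell the PathScale Benchmark Bounty Program.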

  AM2 atmosphere model code

This code is used at the University of Utah's Meteorology Department for climate research. It consists of several closely coupled modules, is parallelized with MPI, and is written with Fortran 95 constructs.

Results for this benchmark were run independently at the University of Utah and published with their permission.

 

(Times in seconds; lower is better)

                         1 CPU        2 CPU        4 CPU
PathScale v1.2           368.89 sec   201.88 sec   99.11 sec
PGI v5.2                 483.45 sec   253.38 sec   135.53 sec
% Faster for PathScale   31.1%        25.5%        36.7%
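
The "% Faster" row can be reproduced from the timings. The sketch below assumes the figure is the ratio of the competitor's time to the PathScale time, minus one. The page does not state the formula, but this reading matches the published percentages. The function name `percent_faster` is illustrative only.

```python
# Wall-clock times in seconds, copied from the AM2 table above.
pathscale = {"1 CPU": 368.89, "2 CPU": 201.88, "4 CPU": 99.11}
pgi = {"1 CPU": 483.45, "2 CPU": 253.38, "4 CPU": 135.53}


def percent_faster(baseline_time, candidate_time):
    """Percent by which the candidate beats the baseline (timings, lower is better).

    Assumed formula: (baseline / candidate - 1) * 100.
    """
    return (baseline_time / candidate_time - 1.0) * 100.0


# One rounded percentage per CPU count.
advantage = {
    cfg: round(percent_faster(pgi[cfg], pathscale[cfg]), 1) for cfg in pathscale
}

for cfg, pct in advantage.items():
    print(f"{cfg}: PathScale {pct}% faster")
```

For throughput figures such as MFLOPS, where higher is better, the ratio would be inverted (candidate rate divided by baseline rate).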


  HimenoBMT "The Performance Evaluation"

The PathScale EKOPath Compiler Suite produces excellent single-CPU and 4-CPU OpenMP results on the popular Himeno benchmark:
http://w3cic.riken.go.jp/E/HPC_e/HimenoBMT_e/index_e.html

Serial results on Opteron 2.2 GHz, PC3200

 

MFLOPS                         F77     F90     C
PathScale EKOPath 2.0          1584    1189    267
64-bit Commercial Compiler     1419    1125    141
GNU compilers (3.4.3 & g95)    1002    588     213

EKOPath Advantage              F77     F90     C
Commercial 64-bit compiler     +12%    +6%     +89%
GNU compilers (3.4.3 & g95)    +58%    +102%   +25%


4-thread OpenMP Results on 4-CPU (Microway) 2.2 GHz Opteron, PC3200 server

 

                                              4-thread MFLOPS   EKOPath Advantage
Original Himeno F77 OpenMP code
  EKOPath 2.0                                 1696
  Commercial 64-bit compiler                  1691              +16%
PathScale-modified* Himeno F77 OpenMP code
  EKOPath 2.0                                 5155
  Commercial 64-bit compiler                  4309              +20%

* System, compiler flag, and source code details


SPEC® and the benchmark names SPECfp® and SPECint® are registered trademarks of the Standard Performance Evaluation Corporation. AMD, AMD Opteron, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Linux is a registered trademark of Linus Torvalds. All other trademarks and company names mentioned are the property of their respective owners.


  POLYHEDRON

The PathScale EKOPath Compiler Suite produces the fastest and most accurate results for AMD Opteron systems for the Polyhedron 2004 Fortran 90 and Fortran 77 benchmarks.
PathScale is the only compiler vendor with no red squares for 64-bit results (meaning that on no code is 64-bit EKOPath 50% or more slower than the fastest compiler on a benchmark code).

Here are some comparisons obtained from those web pages:

Polyhedron 2004 F77 Benchmarks 64-bit Compiler Comparisons

 

                          Geometric mean time in seconds   PathScale % Faster
PathScale EKOPath 2.1     19.16
Commercial Compiler A     25.54                            +33%
Commercial Compiler B     22.75                            +19%
Commercial Compiler C     29.26                            +53%

Polyhedron 2004 F90 Benchmarks 64-bit Compiler Comparisons

                          Geometric mean time in seconds   PathScale % Faster
PathScale EKOPath 2.1     22.76
Commercial Compiler A     26.25                            +15%
Commercial Compiler B     28.41                            +25%
Commercial Compiler C     35.30                            +55%
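
The tables above summarize each compiler with a geometric mean of per-benchmark run times. As a rough illustration of how such a summary is formed, the sketch below computes a geometric mean from a list of timings. The per-benchmark times and the names `compiler_x` and `compiler_y` are hypothetical; the individual Polyhedron timings are not reproduced on this page. The last line applies the same ratio to two published means from the F90 table.

```python
import math


def geometric_mean(times):
    """Geometric mean via the mean of logarithms (all times must be positive)."""
    if not times or any(t <= 0 for t in times):
        raise ValueError("need a non-empty list of positive timings")
    return math.exp(sum(math.log(t) for t in times) / len(times))


# Hypothetical per-benchmark run times in seconds (NOT Polyhedron data).
compiler_x = [10.0, 40.0, 20.0]
compiler_y = [20.0, 40.0, 40.0]

gm_x = geometric_mean(compiler_x)
gm_y = geometric_mean(compiler_y)
print(f"geometric means: {gm_x:.2f} s vs {gm_y:.2f} s")
print(f"compiler X is {(gm_y / gm_x - 1) * 100:.0f}% faster on this made-up data")

# Published F90 means from the table above: Compiler A vs. EKOPath 2.1.
f90_advantage = round((26.25 / 22.76 - 1) * 100)
print(f"F90, Compiler A vs. EKOPath 2.1: +{f90_advantage}%")
```

Unlike an arithmetic mean, the geometric mean keeps one long-running benchmark from dominating the summary, which is presumably why it is used for suites of unrelated codes.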

EKOPath 2.1 64-bit optimization flags:
F77: -O3 -LNO:fu=9 -OPT:div_split:fast_math:fast_sqrt -IPA:plimit=3500
F90: -Ofast -OPT:fast_math=on -WOPT:if_conv=off -LNO:fu=9:full_unroll_size=7000

  QUANTUM MONTE CARLO

Monte Carlo methods are extremely important in computational physics and related applied fields, and have many diverse applications. PathScale compilers do particularly well in Monte Carlo codes.

Results for this benchmark were run independently at Los Alamos National Laboratory and published with their permission.

 

                 Time (lower is better)   PathScale % Faster
PathScale v1.0   78.08 sec
PGI v5.1         135.80 sec               73.93%
GCC v3.4.0       111.01 sec               42.20%

Compiler settings for Quantum Monte Carlo C++ application
PathScale v1.0 : pathCC -64 -ansiE -Ofast -ffast-math
PGI v5.1 : pgCC -Kieee -fastsse -O3 -Minline=levels:10 -Msafeptr=global -Mvect=sse -Mvect=assoc -Mvect=cachesize:1048576 -Mvect=prefetch
GCC v3.4.0 : g++ -O3 -ffast-math -mtune=opteron -mfpmath=sse,387 -mieee-fp -m64

For more information on this benchmark go to: http://sourceforge.net/projects/qmcbeaver .

  SPEC® CPU2000

The PathScale EKOPath Compiler Suite enables the highest performance results for both integer and floating point SPEC CPU2000 speed benchmarks for any AMD64-based Linux® system. The best evidence for this is that from October 2004 through August 26, 2005, on AMD processors and Linux operating systems, 186 CPU2000 results were published at www.spec.org using PathScale compilers and none with other compilers.

Since no results with competitive compilers have been published recently on the SPEC web site, we ran our own comparison against a competitive compiler, using the latest version of each, with the following results:

 

                              SPECint®2000   SPECfp®2000
PathScale EKOPath™ v2.2.1     1598           1984
PGI® Workstation 6.0-5        1269           1779
% Faster for PathScale        +26%           +12%

Benchmarks were run on a 2.2 GHz 1-CPU system with DDR400/PC3200 memory. Full details on the compiler flags and configuration used are available here. If anyone can provide us with improved base or peak optimization flags for the competitive compiler, we will be happy to use them and update these results.

Results Published by Our Partners Using the PathScale EKOPath Compiler Suite -- Including Dual-Core Opteron Results

Recently, AMD and HP have chosen to submit SPEC CPU2000 results for dual-core Opteron (for example, Opteron Models 275 and 875) systems with PathScale EKOPath Compilers. Also, IBM, HP, Fujitsu-Siemens, Sun, and AMD continue to choose the PathScale EKOPath Compiler Suite to get the highest level of performance from their AMD64-based Linux® systems.

AMD
SPECint2000
SPECfp2000
SPECint_rate2000 (4 CPU, dual core)
SPECfp_rate2000 (4 CPU, dual core)
SPECint_rate2000 (2 CPU, dual core)
SPECfp_rate2000 (2 CPU, dual core)

HP
SPECint2000
SPECfp2000
SPECint_rate2000 (4 CPU, dual core)
SPECint_rate2000 (4 CPU, single core)
SPECint_rate2000 (2CPU)
SPECfp_rate2000 (4 CPU, dual core)
SPECfp_rate2000 (4 CPU, single core)
SPECfp_rate2000 (2CPU)

SUN
SPECint2000
SPECfp2000
SPECint_rate2000 (4CPU)
SPECint_rate2000 (2CPU)
SPECfp_rate2000 (4CPU)
SPECfp_rate2000 (2CPU)

IBM
SPECint2000
SPECfp2000
SPECint_rate2000
SPECfp_rate2000

Fujitsu
SPECint2000
SPECfp2000
SPECint_rate2000 (2CPU)
SPECfp_rate2000 (2CPU)

  SPEC® OMP2001

"Sun tested 2-way Sun Fire V20z and 4-way Sun Fire V40z servers using multiple SPEC benchmarks, including the SPEC® ompM2001 suite of OpenMP® benchmarks. The PathScale EKOPath Compiler Suite helped Sun's AMD® Opteron® processor-based servers set world records for SPEC ompM2001 on two-processor and four-processor systems. The Sun/PathScale two-processor results were 29 percent faster (footnote 1) than previous-best Linux ompM2001 benchmarks using non-PathScale Fortran and C compilers. This 29 percent advantage, enabled in large part by PathScale compilers, far exceeds the eight percent faster clock rate of the newer Sun systems."

SUN

SPECompM2001 (2 thread, 2 core)
SPECompM2001 (4 thread, 4 core)

Footnote 1: About the SPEC OMPM2001 Results Reported Above

Two-Processor Results: The Sun V40z server with PathScale EKOPath Compiler Suite and 2.6 GHz AMD Opteron CPUs achieved a result of 6486 on a system with two cores, two chips and two threads. This comparison is based on the best performing two-processor Linux servers currently shipping, including previous results with a competitor's compiler on a 2.4 GHz Sun Java Workstation W2100z system [SPECompM2001 5085, two cores, two chips, two threads].

Four-Processor Results: The Sun V40z server with PathScale EKOPath Compiler Suite and 2.6 GHz AMD Opteron CPUs achieved a result of 11223 on a system with four cores, four chips and four threads. This comparison is based on the best performing four-processor Linux servers currently shipping, including previous results on a 2.4 GHz Sun V40z system with a non-PathScale compiler [SPECompM2001 8694, four cores, four chips, four threads].

  STREAM

The PathScale EKOPath Compiler Suite produces the highest single-CPU and OpenMP parallel STREAM results for any system powered by AMD CPUs.

OpenMP

Machine ID (higher is better)         ncpus   COPY    SCALE   ADD     TRIAD
AMD_Opteron_848 (EKOPath 2.0)         4       15378   15845   15618   15921
PathScale EKOPath 2.2                 4       16872   16932   16543   16545
% Faster for PathScale 2.2 vs. 2.0            +8%     +7%     +6%     +4%


Single CPU

Machine ID (higher is better)                   ncpus   COPY    SCALE   ADD     TRIAD
ASUS_SK8N_Opteron248 (EKOPath 2.0)              1       4811    4782    4685    4682
ASUS_SK8N_Opteron248 (Comm'l 64bit Compiler)    1       4304    4251    4497    4458
PathScale EKOPath 2.2                           1       4902    4871    4979    4987
% Faster for PathScale                                  +14%    +15%    +11%    +12%

The above results are for STREAM benchmarks run on Opteron 248 (2.2 GHz) machines with DDR400 memory and are posted at http://www.cs.virginia.edu/stream/standard/Bandwidth.html . Results for PathScale EKOPath and for our competitor are identified as 'ASUS_SK8N_Opteron248' and 'ASUS_SK8N_Opteron248 (1 CPU)'. Click on the data link at the right of those lines for more details on the submission.

* Results with 'PathScale EKOPath 2.1' are on the same system as the 2.0 results on the STREAM web site and use the following optimization flags:
OpenMP: pathf90/pathcc -O3 -CG:use_prefetchnta -LNO:prefetch_ahead=4 -mp
Serial: pathf90/pathcc -O3 -CG:use_prefetchnta
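
For readers unfamiliar with the four STREAM columns, the sketch below shows what each kernel computes and how a bandwidth figure is derived from the bytes moved. It is a minimal illustration in pure Python, not the STREAM benchmark itself: the array length `N`, the scalar `Q`, and the timing harness are assumptions, and Python lists are not contiguous arrays of doubles, so the printed rates are not comparable to the tables above.

```python
import time

N = 200_000          # hypothetical array length; real STREAM runs use far larger arrays
Q = 3.0              # scalar used by SCALE and TRIAD
BYTES_PER_WORD = 8   # size of a double-precision value

a = [1.0] * N
b = [2.0] * N
c = [0.0] * N


def copy():
    c[:] = a                                   # c = a


def scale():
    b[:] = [Q * x for x in c]                  # b = q * c


def add():
    c[:] = [x + y for x, y in zip(a, b)]       # c = a + b


def triad():
    a[:] = [y + Q * z for y, z in zip(b, c)]   # a = b + q * c


# COPY and SCALE touch two arrays per element; ADD and TRIAD touch three.
kernels = [("COPY", copy, 2), ("SCALE", scale, 2), ("ADD", add, 3), ("TRIAD", triad, 3)]

rates = {}
for name, kernel, arrays_touched in kernels:
    start = time.perf_counter()
    kernel()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    rates[name] = arrays_touched * BYTES_PER_WORD * N / elapsed / 1e6
    print(f"{name:6s} {rates[name]:10.1f} MB/s")
```

Because each kernel does almost no arithmetic per byte moved, the result is dominated by memory bandwidth, which is why prefetch-related flags such as those listed above matter for this benchmark.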

