Intel Xeon Processor E5-2600 V4 and V3 Product Family CPUs

Detailed Specifications of the Intel Xeon E5-2600v4 "Broadwell-EP" Processors

This article provides in-depth discussion and analysis of the 14nm Xeon E5-2600v4 series processors (formerly codenamed "Broadwell-EP"). "Broadwell" processors replace the previous 22nm "Haswell" microarchitecture and are available for sale as of March 31, 2016. For an introduction, read our blog post Intel Xeon E5-2600 v4 "Broadwell" Processor Review. Note: these have since been superseded by the Intel Xeon Processor Scalable Family CPUs.

Important changes available in E5-2600v4 "Broadwell-EP" include:

  • Up to 22 processor cores per socket (with options for 4-, 6-, 8-, 10-, 12-, 14-, 16-, 18-, and 20-cores)
  • Support for DDR4 memory speeds up to 2400MHz
  • Floating Point Instruction performance improvements:
    • Faster floating point multiplier completes operations in 3 cycles (down from 5 cycles)
    • 1024 Radix divider for reduced latency
    • Split Scalar divides for increased parallelism/bandwidth
    • Faster vector Gather
    • As introduced with Haswell, Broadwell continues to support AVX2 and FMA3 instructions for significant speedups of floating-point multiplication and addition operations
  • Extract more parallelism in scheduling micro-operations:
    • Reduced instruction latencies on ADC, CMOV and PCLMULQDQ
    • Larger out-of-order scheduler, with 64 entries (up from 60 entries)
    • Improved address prediction for branches and returns, with an expanded 10-way Branch Prediction Unit Target Array (up from 8-way)
  • Improved performance on large data sets:
    • Larger L2 Translation Lookaside Buffer (TLB), with 1.5k entries (up from 1K entries)
    • A new L2 TLB for 1GB pages (with 16 entries)
    • Addition of a second TLB page miss handler for parallel page walks
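
To put the TLB changes in the list above into perspective, the following is a rough sketch of "TLB reach": the amount of memory a fully populated TLB can map without a page walk. It assumes that "1.5k" means 1536 entries and "1K" means 1024 entries, and that the large L2 TLB is holding standard 4 KiB pages. The function name is hypothetical.

```python
# Illustrative arithmetic only: how much memory each L2 TLB can map ("TLB reach").
# Assumptions: "1.5k" means 1536 entries, "1K" means 1024 entries, and the
# large L2 TLB is holding standard 4 KiB pages.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_reach_bytes(entries: int, page_bytes: int) -> int:
    """Return the bytes of address space covered by a fully populated TLB."""
    return entries * page_bytes


haswell_reach = tlb_reach_bytes(1024, 4 * KIB)
broadwell_reach = tlb_reach_bytes(1536, 4 * KIB)
huge_page_reach = tlb_reach_bytes(16, 1 * GIB)

print(f"Haswell L2 TLB, 4 KiB pages:   {haswell_reach / MIB:.0f} MiB")
print(f"Broadwell L2 TLB, 4 KiB pages: {broadwell_reach / MIB:.0f} MiB")
print(f"Broadwell 1 GB-page L2 TLB:    {huge_page_reach / GIB:.0f} GiB")
```

Under these assumptions the reach with small pages is only a few MiB, which suggests why applications with very large working sets may benefit from 1GB pages.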

With a product this complex, it's very difficult to cover every aspect of the design. Here, we concentrate primarily on the performance of the processors for HPC applications.

Exceptional Computational Performance

The Xeon E5-2600v4 processors provide the highest performance available to date in a socketed CPU. Many of the higher-end models provide well over 500 GFLOPS (more than half a TFLOPS). Much of this performance is made possible through the use of AVX2 with FMA3 instructions. The plot below compares the peak performance of these CPUs with and without FMA instructions:

Plot of Xeon E5-2600v4 Theoretical Peak Performance (GFLOPS)

The colored bars indicate performance using only AVX instructions; the grey bars indicate theoretical peak performance when using AVX with FMA. Note that only a small set of codes will be capable of issuing almost exclusively FMA instructions (e.g., LINPACK). Most applications will issue a variety of instructions, which will result in lower than peak FLOPS. Expect the achieved performance for well-parallelized & optimized applications to fall between the grey and colored bars.
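
As a rough illustration of where the two sets of bars come from, the sketch below estimates theoretical peak double-precision GFLOPS for a single socket. It assumes peak = cores × AVX base clock × FLOPs per clock per core, with 8 FLOPs/clock for AVX alone and 16 FLOPs/clock for AVX with FMA (the doubling described in the feature summary of this article). The E5-2699v4 figures come from the specifications table; the function name is hypothetical, and the plot itself may have been generated with a different method.

```python
# Sketch: theoretical peak double-precision GFLOPS for one socket.
# Assumption: peak = cores x AVX base clock (GHz) x FLOPs per clock per core,
# with 8 FLOPs/clock for AVX alone and 16 FLOPs/clock for AVX with FMA.

FLOPS_PER_CLOCK_AVX = 8
FLOPS_PER_CLOCK_FMA = 16


def peak_gflops(cores: int, avx_ghz: float, flops_per_clock: int) -> float:
    """Return theoretical peak GFLOPS for a single CPU socket."""
    return cores * avx_ghz * flops_per_clock


# E5-2699v4: 22 cores at a 1.80 GHz AVX base frequency (from the table).
avx_only = peak_gflops(22, 1.80, FLOPS_PER_CLOCK_AVX)
with_fma = peak_gflops(22, 1.80, FLOPS_PER_CLOCK_FMA)
print(f"AVX only: {avx_only:.1f} GFLOPS, AVX with FMA: {with_fma:.1f} GFLOPS")
```

With these inputs the FMA figure lands above 600 GFLOPS, consistent with the "well over 500 GFLOPS" claim for the higher-end models.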

Intel Xeon E5-2600v4 Series Specifications

The tabs below compare the features and specifications of the new model line. Intel has divided the CPUs into several groups:

  • Standard: cost-effective CPUs with moderate performance
  • Advanced: CPUs offering the highest performance for most applications
  • High Core Count: ideal for highly multi-threaded applications; CPUs providing the highest number of processor cores (sometimes sacrificing clock frequency in favor of core count)
  • Frequency Optimized: ideal for non-parallel/single-threaded applications; CPUs with the highest clock speeds (sacrificing number of cores in order to provide the highest frequencies)

Although these processors introduce significant performance increases, technical readers will see that many of the changes are incremental: increased core counts, improved DDR memory speed, etc. However, processor clock speeds/frequencies have not seen significant improvements.

In fact, in some cases the CPU frequency has been lowered from the previous models. Processor frequency and Turbo Boost behavior have changed fairly significantly in the last two CPU releases ("Haswell" and "Broadwell"). Those metrics are discussed in further detail in the next section.

CPU Cores

Chart of Xeon E5-2600v4 Number of CPU Cores

Memory Speed

Chart of Xeon E5-2600v4 Memory Performance

L3 Cache

Chart of Xeon E5-2600v4 CPU L3 Cache Size

QPI

Chart of Xeon E5-2600v4 QPI Performance

TDP

Chart of Xeon E5-2600v4 CPU Wattage (TDP)

Specifications Table

Model AVX Frequency AVX Turbo Boost Core Count Memory Speed L3 Cache QPI Speed TDP (Watts)
E5-2699v4 1.80 GHz 3.60 GHz 22 2400 MHz 55 MB 9.6 GT/s 145W
E5-2698v4 20 50 MB 135W
E5-2697Av4 2.20 GHz 3.10 GHz 16 40 MB 145W
E5-2697v4 2.00 GHz 3.60 GHz 18 45 MB
E5-2695v4 1.70 GHz 3.10 GHz 120W
E5-2683v4 3.00 GHz 16 40 MB
E5-2690v4 2.10 GHz 3.50 GHz 14 35 MB 135W
E5-2680v4 1.90 GHz 3.30 GHz 120W
E5-2660v4 1.70 GHz 3.20 GHz 105W
E5-2650v4 1.80 GHz 2.80 GHz 12 30 MB
E5-2640v4 2.00 GHz 3.40 GHz 10 2133 MHz 25 MB 8 GT/s 90W
E5-2630v4 1.80 GHz 3.10 GHz 85W
E5-2620v4 3.00 GHz 8 20 MB
E5-2687Wv4 2.60 GHz 3.40 GHz 12 2400 MHz 30 MB 9.6 GT/s 160W
E5-2667v4 3.50 GHz 8 25 MB 135W
E5-2643v4 2.80 GHz 3.60 GHz 6 20 MB
E5-2637v4 3.20 GHz 4 15 MB
E5-2623v4 2.20 GHz 3.20 GHz 2133 MHz 10 MB 8 GT/s 85W

HPC groups do not typically choose Intel's "Basic" and "Low Power" models – those SKUs are not shown.

Clock Speeds & Turbo Boost in Xeon E5-2600v4 series "Broadwell" processors

With each new processor line, Intel introduces new architecture optimizations. The design of the "Broadwell" architecture acknowledges that highly-parallel/vectorized applications place the highest load on the processor cores (requiring more power and thus generating more heat). While a CPU core is executing intensive vector tasks (AVX instructions), the clock speed may be reduced to keep the processor within its power limits (TDP).

In effect, this may result in the processor running at a lower frequency than the "base" clock speed advertised for each model. For that reason, each "Broadwell" processor is assigned two "base" frequencies:

  1. AVX mode: due to the higher power requirements of AVX instructions, clock speeds may be somewhat lower while executing AVX instructions *
  2. Non-AVX mode: while not executing AVX instructions, the processor will operate at what would traditionally be considered the "stock" frequency

* a CPU core will return to Non-AVX mode 1 millisecond after AVX instructions complete

It is worth noting that these modes are isolated to each core. Within a given CPU, some cores may be operating in AVX mode while others are operating in Non-AVX mode. In the previous generation, AVX instructions running on a single core would cause all cores to run in AVX mode.
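
The per-core behavior described above can be pictured with a small toy model. This is only a sketch of the stated rules, not a description of how the hardware implements them: it assumes time is tracked in milliseconds and that a core drops back to Non-AVX mode exactly 1 ms after its last AVX instruction completes. The class and method names are hypothetical.

```python
# Toy model of the per-core AVX / Non-AVX mode switch described above.
# Assumption: time is tracked in milliseconds and a core drops back to
# Non-AVX mode exactly 1 ms after its last AVX instruction completes.
# Class and method names are hypothetical, for illustration only.

AVX_HOLD_MS = 1.0


class Core:
    def __init__(self) -> None:
        self.last_avx_done_ms = None  # no AVX work seen yet

    def finish_avx(self, now_ms: float) -> None:
        """Record that an AVX instruction completed at time now_ms."""
        self.last_avx_done_ms = now_ms

    def mode(self, now_ms: float) -> str:
        """Return the mode this core would be in at time now_ms."""
        if self.last_avx_done_ms is None:
            return "Non-AVX"
        if now_ms - self.last_avx_done_ms < AVX_HOLD_MS:
            return "AVX"
        return "Non-AVX"


cores = [Core() for _ in range(4)]
cores[0].finish_avx(now_ms=10.0)  # only core 0 ran AVX code

modes_soon = [c.mode(10.5) for c in cores]   # 0.5 ms after the AVX work
modes_later = [c.mode(11.5) for c in cores]  # 1.5 ms after the AVX work
print(modes_soon)
print(modes_later)
```

In this model only the core that ran AVX code reports AVX mode, and it returns to Non-AVX mode once the 1 ms window has passed, while the other cores are never affected.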

AVX and Non-AVX Turbo Boost

Just as in previous architectures, "Broadwell" CPUs include the Turbo Boost feature which allows each processor core to operate well above the "base" clock speed during most operations. The precise clock speed increase depends upon the number & intensity of tasks running on each CPU. However, Turbo Boost speed increases also depend upon the types of instructions (AVX vs. Non-AVX).

The two plots below show that processor clock speeds can be categorized as:

  1. All cores on the CPU actively running Non-AVX instructions
  2. All cores on the CPU actively running AVX instructions
  3. A single active core running Non-AVX instructions (all other cores on the CPU must be idle)
  4. A single active core running AVX instructions (all other cores on the CPU must be idle)

Clock Speeds for All-Core Operation

Diagram of Xeon E5-2600v4 CPU Frequency Speeds (comparing AVX and Non-AVX Instructions) running on all CPU cores

Clock Speeds for Single-Core Performance

Diagram of Xeon E5-2600v4 CPU Frequency Speeds (comparing AVX and Non-AVX Instructions) running on a single core

Note that despite the clear rules stated above, each value is still a range of clock speeds. Because workloads are so diverse, Intel is unable to guarantee one specific clock speed for AVX or Non-AVX instructions. Users are guaranteed that cores will run within a specific range, but each application will have to be benchmarked to determine which frequencies a CPU will operate at.
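
One simple way to observe the frequencies an application actually reaches is to sample the per-core clock speeds while it runs. The sketch below assumes an x86 Linux system whose /proc/cpuinfo reports "cpu MHz" lines; the function names are hypothetical and the sample text is made up for demonstration, not measured.

```python
# Sketch: sample per-core clock speeds on Linux while a benchmark runs.
# Assumption: an x86 Linux system whose /proc/cpuinfo reports "cpu MHz" lines.
import re
from pathlib import Path

CPU_MHZ_LINE = re.compile(r"^cpu MHz\s*:\s*([0-9.]+)", re.MULTILINE)


def parse_core_mhz(cpuinfo_text: str) -> list[float]:
    """Return the reported clock (MHz) of each logical CPU, in file order."""
    return [float(mhz) for mhz in CPU_MHZ_LINE.findall(cpuinfo_text)]


def read_core_mhz(path: str = "/proc/cpuinfo") -> list[float]:
    """Read and parse the live values; call repeatedly during a run."""
    return parse_core_mhz(Path(path).read_text())


# Parsing a made-up two-CPU excerpt (values are illustrative, not measured):
sample = "processor\t: 0\ncpu MHz\t\t: 2200.000\nprocessor\t: 1\ncpu MHz\t\t: 1800.000\n"
print(parse_core_mhz(sample))
```

Calling `read_core_mhz()` at intervals during a benchmark gives a rough picture of where within the guaranteed range each core settles; dedicated tools will generally give more precise readings.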

When examining the differences between AVX and Non-AVX instructions, notice that Non-AVX instructions typically result in no more than a 100MHz to 200MHz increase in the highest clock speed. However, AVX instructions may cause clock speeds to drop by 300MHz to 400MHz if they are particularly intensive.

Recall that AVX2 introduces support for both integer and floating-point instructions, which means any compute-intensive application will be using such instructions (if it has been properly designed and compiled). HPC users should expect their processors to be running in AVX mode most of the time.

Top Clock Speeds for Specific Core Counts

When workloads leave some CPU cores idle, the Xeon E5-2600v4 processors are able to use that headroom to increase the clock speed of the cores which are performing work. Just as with other Turbo Boost scenarios, the precise speed increase will depend upon the CPU model. It will also depend upon how many CPU cores are active.

We advise users to consider how many CPU cores their application is able to saturate. The tabs below detail the peak Turbo Boost frequencies for each CPU model, sorted by the number of active cores:

One to 2 cores

Chart of Xeon E5-2600v4 CPU Frequency for single-core and dual-core applications

3

Chart of Xeon E5-2600v4 CPU Frequency for triple-core applications

4

Chart of Xeon E5-2600v4 CPU Frequency for quad-core applications

5

Chart of Xeon E5-2600v4 CPU Frequency for 5-core applications

6

Chart of Xeon E5-2600v4 CPU Frequency for 6-core applications

7

Chart of Xeon E5-2600v4 CPU Frequency for 7-core applications

8

Chart of Xeon E5-2600v4 CPU Frequency for 8-core applications

9

Chart of Xeon E5-2600v4 CPU Frequency for 9-core applications

10

Chart of Xeon E5-2600v4 CPU Frequency for 10-core applications

11+ cores

Chart of Xeon E5-2600v4 CPU Frequency for applications using 11 or more cores

All of the above plots show CPU frequencies for applications utilizing AVX instructions. The colored bars indicate the worst-case scenario – CPUs will run at least this fast. The grey bars indicate the expected clock speeds for most workloads.

Cost-Effectiveness and Power Efficiency of Xeon E5-2600v4 CPUs

The "Broadwell-EP" processors have nearly the same price structure and power requirements as earlier Xeon E5-2600 products, so their cost-effectiveness and power-efficiency should be quite attractive to HPC users. Savvy readers may find the following facts useful:

  • HPC applications run best on the Advanced CPU models; they typically do not scale well on the High-Core-Count models.
  • The High-Core-Count models are more common in Enterprise and Finance – these carry higher prices than other E5-2600 models.
  • The following graphs depict the cost-effectiveness and power-efficiency of only the CPU itself. In many cases, HPC users will find that once they've taken the full platform and cluster design into account, the cost-effectiveness of an Advanced CPU may be higher than these plots demonstrate.
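
For readers who want to reproduce a CPU-only efficiency comparison of this kind, the sketch below computes peak GFLOPS per watt of TDP. It is an illustration under stated assumptions rather than the method behind the plots: peak GFLOPS is taken as cores × AVX base clock × 16 FLOPs/clock (AVX with FMA), and TDP stands in for actual power draw. Only models whose core count, AVX frequency, and TDP are explicitly listed in the specifications table are used; the function name is hypothetical.

```python
# Sketch: CPU-only power efficiency as peak GFLOPS per watt of TDP.
# Assumptions: peak GFLOPS = cores x AVX base GHz x 16 FLOPs/clock (AVX with FMA),
# and TDP stands in for actual power draw. Rows are taken from the
# specifications table; only models with explicitly listed values are used.

FLOPS_PER_CLOCK_FMA = 16

# model: (cores, AVX base GHz, TDP watts)
cpus = {
    "E5-2699v4": (22, 1.80, 145),
    "E5-2690v4": (14, 2.10, 135),
    "E5-2640v4": (10, 2.00, 90),
}


def gflops_per_watt(cores: int, avx_ghz: float, tdp_watts: int) -> float:
    """Return theoretical peak GFLOPS divided by TDP for one socket."""
    return cores * avx_ghz * FLOPS_PER_CLOCK_FMA / tdp_watts


efficiency = {model: gflops_per_watt(*spec) for model, spec in cpus.items()}
for model, value in sorted(efficiency.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {value:.2f} GFLOPS/W")
```

The same pattern applies to cost-effectiveness by dividing by a price instead of TDP; prices are omitted here because none are listed in the text.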

Performance vs. Price

Chart of Xeon E5-2600v4 Cost-Effectiveness (performance vs. price)

Performance vs. Power

Chart of Xeon E5-2600v4 Power-Efficiency

Processor Prices

Chart of Xeon E5-2600v4 CPU Prices

Summary of features in Xeon E5-2600v4 "Broadwell-EP" processors

In addition to the capabilities mentioned at the top of this article, these processors include many of the successful features from earlier Xeon designs. The list below provides a summary of relevant technology features:

  • Up to 22 processor cores per socket (with options for 4-, 6-, 8-, 10-, 12-, 14-, 16-, 18-, and 20-cores)
  • Support for Quad-channel ECC DDR4 memory speeds up to 2400MHz
  • Direct PCI-Express (generation 3.0) connections between each CPU and peripheral devices such as network adapters, GPUs and coprocessors (40 PCI-E lanes per socket)
  • Floating Point Instruction performance improvements:
    • Faster floating point multiplier completes operations in 3 cycles (down from 5 cycles)
    • 1024 Radix divider for reduced latency
    • Split Scalar divides for increased parallelism/bandwidth
    • Faster vector Gather
  • As introduced with "Haswell", "Broadwell" continues to support Advanced Vector Extensions (AVX 2.0):
    • effectively double the throughput of integer and floating-point operations with math units expanded from 128-bits to 256-bits
    • introduce Fused Multiply Add (FMA3) instructions which allow a multiply and an accumulate instruction to be completed in a single cycle (effectively doubling the FLOPS/clock from 8 to 16 for each core of a CPU)
    • add support for additional instructions, including Gather and vector shift
    • F16C 16-bit Floating-Point conversion instructions accelerate data conversion between 16-bit and 32-bit floating point formats
  • Turbo Boost technology improves performance under peak loads by increasing processor clock speeds. With version 2.0 (introduced in "Sandy Bridge"), clock speeds are boosted more frequently, to higher speeds and for longer periods of time. With "Haswell" and "Broadwell", top clock speeds depend upon the type of instructions (AVX vs. Non-AVX).
  • Extract more parallelism in scheduling micro-operations:
    • Reduced instruction latencies on ADC, CMOV and PCLMULQDQ
    • Larger out-of-order scheduler, with 64 entries (up from 60 entries)
    • Introduction of the ADCX and ADOX instructions to speed up cryptography
    • Improved address prediction for branches and returns, with an expanded 10-way Branch Prediction Unit Target Array (up from 8-way)
  • Improved performance on large data sets:
    • Larger L2 Translation Lookaside Buffer (TLB), with 1.5k entries (up from 1K entries)
    • A new L2 TLB for 1GB pages (with 16 entries)
    • Addition of a second TLB page miss handler for parallel page walks
  • Dual Quick Path Interconnect (QPI) links between processor sockets improve communication speeds for multi-threaded applications
  • Intel Data Direct I/O Technology increases performance and reduces latency by allowing Intel ethernet controllers and adapters to talk directly with the processor cache
  • Transactional Synchronization Extensions (TSX) improve the parallelism of multi-threaded applications with synchronization locks
  • Introduction of the RDSEED instruction for high-quality, non-deterministic, random seed values
  • Advanced Encryption Standard New Instructions (AES-NI) accelerate encryption and decryption for fast, affordable data protection and security
  • 32-bit & 64-bit Intel Virtualization Technology (VT/VT-x) for Directed I/O (VT-d) and Connectivity (VT-c) deliver faster performance for core virtualization processes and provide built-in hardware support for I/O virtualization.
  • Intel APIC Virtualization (APICv) provides increased virtualization performance
  • Hyper-Threading technology allows two threads to "share" a processor core for improved resource usage. Although useful for some workloads, it is not recommended for HPC applications.
  • Improved energy efficiency with Per Core P-States and independent uncore frequency control
  • Hardware Controlled Power Management for more rapid and efficient decisions on optimal P- and C-State operating point
  • DDR4 CRC provides better memory reliability and data integrity by detecting memory bus faults during write
  • ECRC for PCI-Express provides optional data integrity protection for systems using PCI-Express switches or bridges
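
To check which of the instruction-set features in the list above a given Linux host advertises, a short script along the following lines may help. It is a sketch that assumes the flag names follow Linux /proc/cpuinfo conventions; the mapping is illustrative and not exhaustive, the function names are hypothetical, and the sample line is made up for demonstration.

```python
# Sketch: check which of the features above a Linux host advertises.
# Assumption: flag names follow Linux /proc/cpuinfo conventions; the mapping
# below is illustrative and not exhaustive.

FEATURE_FLAGS = {
    "AVX2": "avx2",
    "FMA3": "fma",
    "F16C": "f16c",
    "ADCX/ADOX": "adx",
    "RDSEED": "rdseed",
    "AES-NI": "aes",
    "TSX (RTM)": "rtm",
    "1GB pages": "pdpe1gb",
}


def parse_flags(cpuinfo_text: str) -> set[str]:
    """Return the flag set from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def supported_features(cpuinfo_text: str) -> dict[str, bool]:
    """Map each feature name to whether its flag appears in the text."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in FEATURE_FLAGS.items()}


# Made-up excerpt for demonstration; on a real host, pass the contents of
# /proc/cpuinfo, e.g. open("/proc/cpuinfo").read().
sample = "flags\t\t: fpu aes avx2 fma f16c rdseed adx\n"
report = supported_features(sample)
print(report)
```

A missing flag does not necessarily mean the silicon lacks the feature, since firmware or microcode settings can hide it from the operating system.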


Source: https://www.microway.com/knowledge-center-articles/detailed-specifications-of-the-intel-xeon-e5-2600v4-broadwell-ep-processors/
