Press Release

Tachyum Demonstrates Instruction Profiling Unit on Prodigy FPGA

LAS VEGAS, June 11, 2024 – Tachyum® today announced that it has enabled an Instruction Profiling Unit (IPU), a low-overhead way to collect an execution profile of non-instrumented code, on its Prodigy® Universal Processor. The IPU is used by hyperscalers to profile applications running in production and to recompile the code for better performance.

This latest enhancement is part of the company’s focus on refining and optimizing the performance of the Tachyum Software Distribution Package upon its beta release. Recompiling applications with the results collected by the IPU can provide a 5-15% performance gain, depending on the application, which would result in a significant financial benefit to users.

In February, Tachyum added a Performance Monitoring Unit (PMU) to the emulation system, giving customers and partners the ability to address bottlenecks and better optimize Prodigy performance for all applications and workloads. The PMU’s wide range of performance counters – supported by both the software C-model and the FPGA – facilitates both system debugging and performance tuning. The addition of the IPU allows the collected profiles to be used for Profile Directed Optimizations (PDO) and is important for Just-In-Time (JIT) compilers that optimize hotspots.

“IPU is essential for large-scale operators and is now readily available as part of our FPGA,” said Dr. Radoslav Danilak, founder and CEO of Tachyum. “We believe this technology will also be used by smaller data center operators. This is important for meeting our goals of Prodigy supplying industry-leading performance at significantly lower power and lower cost to organizations of all sizes looking to supercharge their workloads while supporting the greatest breadth of applications.”

As a Universal Processor offering industry-leading performance for all workloads, Prodigy-powered data center servers can seamlessly and dynamically switch between computational domains (such as AI/ML, HPC, and cloud) with a single homogeneous architecture. By eliminating the need for expensive dedicated AI hardware and dramatically increasing server utilization, Prodigy reduces CAPEX and OPEX significantly while delivering unprecedented data center performance, power, and economics. Prodigy integrates 192 high-performance custom-designed 64-bit compute cores to deliver up to 4.5x the performance of the highest-performing x86 processors for cloud workloads, up to 3x that of the highest-performing GPU for HPC, and 6x for AI applications.

A video demonstrating IPU running on a Prodigy FPGA prototype is available for viewing below.

Follow Tachyum

https://twitter.com/tachyum

https://www.linkedin.com/company/tachyum

https://www.facebook.com/Tachyum/

About Tachyum

Tachyum is transforming the economics of AI, HPC, public and private cloud workloads with Prodigy, the world’s first Universal Processor. Prodigy unifies the functionality of a CPU, a GPU, and a TPU in a single processor architecture to deliver industry-leading performance, cost and power efficiency for both specialty and general-purpose computing. As global data center emissions continue to contribute to a changing climate, with projections that they will consume 10 percent of the world’s electricity by 2030, the ultra-low power Prodigy is positioned to help balance the world’s appetite for computing at a lower environmental cost. Tachyum has received a major purchase order from a US company to build a large-scale system that can deliver more than 50 exaflops performance, which will exponentially exceed the computational capabilities of the fastest inference or generative AI supercomputers available anywhere in the world today. When complete in 2026, the Prodigy-powered system will deliver a 25x multiplier vs. the world’s fastest conventional supercomputer – built just this year – and will achieve AI capabilities 25,000x larger than models for ChatGPT4. Tachyum has offices in the United States and Slovakia. For more information, visit https://www.tachyum.com/.