TMCnet News

Aviz and TensorWave Collaborate to Enhance GPU Services with Advanced RoCE-based AI Fabrics
[May 15, 2024]


Today marks a significant milestone as Aviz Networks, a leader in AI networking solutions, announces its collaboration with TensorWave, a pioneering provider of GPU as a Service. The collaboration is focused on intelligent networks powering AI by implementing RoCE (RDMA over Converged Ethernet)-based AI fabrics to optimize GPU as a service offerings. The implementation features Aviz's Open Network Enterprise Suite (ONES) multi-vendor SONiC solution.

By deploying Aviz Networks' technology, TensorWave will enhance its GPU as a service offering, which uses AMD MI300 accelerators, to efficiently meet the growing demands of the market as companies seek to leverage GenAI, LLMs and machine learning for their businesses. Aviz Networks is renowned for its Networking 3.0 product suite, which has been at the forefront of AI-driven network operations and management solutions.

TensorWave has quickly established itself as a leader in providing GPU as a service, and this collaboration will only amplify its capabilities. The integration of Aviz technology will enable the management and operations of multi-vendor RoCE-based AI fabrics, crucial for handling diverse GPUs, DPUs, and high radix switches. ONES's unique capabilities include RoCE orchestration, real-time visibility, and threshold-based anomaly detection, making it the only vendor-agnostic AI fabric controller on the market.

Advanced RoCE Orchestration and Real-Time Anomaly Detection

ONES stands out with its advanced RoCE orchestration capabilities, meticulously managing buffer settings, Priority Flow Control (PFC), and Explicit Congestion Notification (ECN). This detailed orchestration ensures that data travels through network fabrics efficiently, which is crucial for AI and machine learning applications requiring real-time processing.
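As a rough illustration of the two mechanisms named above, the following Python sketch models RED-style ECN marking and PFC pause/resume hysteresis on a single queue. It is not taken from ONES or SONiC: the function names, the parameters `k_min`, `k_max`, `p_max`, `xoff`, and `xon`, and every threshold value are hypothetical and chosen only for illustration.

```python
def ecn_mark_probability(queue_bytes: int, k_min: int, k_max: int, p_max: float) -> float:
    """Probability of ECN-marking a packet for a given queue depth.

    RED-style curve: no marking below k_min, a linear ramp up to p_max
    at k_max, and marking of every packet at or above k_max.
    All thresholds are hypothetical illustration values.
    """
    if k_min >= k_max:
        raise ValueError("k_min must be smaller than k_max")
    if queue_bytes <= k_min:
        return 0.0
    if queue_bytes >= k_max:
        return 1.0
    return p_max * (queue_bytes - k_min) / (k_max - k_min)


def pfc_paused(buffer_bytes: int, xoff: int, xon: int, currently_paused: bool) -> bool:
    """Decide whether a priority should be paused (PFC) with hysteresis.

    Pause once the buffer reaches xoff, resume once it drains to xon,
    and keep the previous state in between.
    """
    if xon >= xoff:
        raise ValueError("xon must be smaller than xoff")
    if buffer_bytes >= xoff:
        return True
    if buffer_bytes <= xon:
        return False
    return currently_paused
```

In this model, ECN is expected to act first by signaling senders to slow down as the queue grows, with PFC pausing traffic only as a last resort before the buffer overflows, which is why the relationship between these thresholds and the buffer settings matters.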



Furthermore, ONES enhances network reliability through its real-time visibility and anomaly detection capabilities. It continuously monitors the network to identify and respond to anomalies, such as packet loss, thus preventing potential disruptions before they can impact AI/ML workload completion times.
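Threshold-based detection of packet loss can be pictured with a minimal Python sketch like the one below. It is an assumption-laden illustration, not the ONES implementation: the `CounterSample` type, the port names, and the 1% threshold are all hypothetical, and counter resets or wraparound are not handled.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One poll of a port's cumulative counters (hypothetical structure)."""
    port: str
    tx_packets: int
    dropped_packets: int


def loss_ratio(prev: CounterSample, curr: CounterSample) -> float:
    """Fraction of packets dropped between two polls of the same port."""
    sent = curr.tx_packets - prev.tx_packets
    dropped = curr.dropped_packets - prev.dropped_packets
    total = sent + dropped
    # No traffic (or a counter reset) in the interval: report no loss.
    return dropped / total if total > 0 else 0.0


def detect_anomalies(prev_samples, curr_samples, threshold):
    """Return (port, loss_ratio) pairs whose loss exceeds the threshold."""
    prev_by_port = {s.port: s for s in prev_samples}
    alerts = []
    for curr in curr_samples:
        prev = prev_by_port.get(curr.port)
        if prev is None:
            continue  # first time this port is seen; nothing to compare
        ratio = loss_ratio(prev, curr)
        if ratio > threshold:
            alerts.append((curr.port, ratio))
    return alerts


# Example polls: Ethernet4 drops 100 of 1000 packets in the interval.
previous = [CounterSample("Ethernet0", 1000, 0), CounterSample("Ethernet4", 1000, 0)]
current = [CounterSample("Ethernet0", 2000, 0), CounterSample("Ethernet4", 1900, 100)]
alerts = detect_anomalies(previous, current, threshold=0.01)
```

Here only `Ethernet4` would be flagged, since its loss ratio over the interval exceeds the hypothetical 1% threshold while `Ethernet0` shows no drops.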

Future-Proof Features and Network Copilot™ Integration


The existing feature set of ONES, which includes support for high-density platform configurations, is particularly beneficial for environments managed by TensorWave. The integration of Network Copilot™ with ONES extends these capabilities further, providing intelligent guidance and automated management functions, simplifying the complex tasks of network configuration and maintenance.

Vishal Shukla, CEO of Aviz Networks, stated, "Our technology is crucial in enabling TensorWave to provide state-of-the-art GPU services, representing a major advancement in our mission to simplify and elevate Networks for AI, and AI for Networks."

Darrick Horton, CEO of TensorWave, also commented on the collaboration: "Working with Aviz Networks will significantly boost our capabilities to deliver superior GPU as a service to our customers. Aviz's expertise in RoCE support and AI fabric design will be instrumental in scaling our services to meet growing demand."

This collaboration not only underscores Aviz Networks' commitment to delivering innovative solutions but also strengthens TensorWave's market position. With these technologies, Aviz and TensorWave are setting new standards in the industry, ensuring their customers have access to state-of-the-art GPU resources optimized for the most demanding applications.

Aviz Networks champions Networks for AI and AI for Networks, focusing on the next generation of networking with its Networking 3.0 stack. This data-centric, vendor-agnostic framework embraces multiple ASICs, switches, NOS, clouds, LLMs, and AI and security applications. Designed to integrate with existing networks seamlessly, Aviz's solutions empower users to navigate the multi-vendor ecosystem without constraints, emphasizing choice, control, and cost-effectiveness. Aviz Networks was launched in 2021 and is supported by industry leaders such as Moment Ventures, Accton, Cisco Investments, and Wistron, as well as notable angel investors.

TensorWave is a next-gen cloud computing platform for AI workloads and beyond. Its upcoming deployment ushers in the next wave of AI compute, leveraging the AMD Instinct MI300X accelerator at scale. TensorWave is optimized for large-scale AI training, fine-tuning, and inference workloads. Visit www.tensorwave.com to learn more and try it today.

