NFS over RDMA - A foundation for uncompressed 8K
By: Gregory Shiff

We are in a new golden era of content creation. The explosion of streaming services has brought an unprecedented volume of new and amazing media. Production, post-production, visual effects, animation, finishing: everyone is booked solid with work. Expectations for this content are higher than ever, with new, technically challenging formats becoming the norm rather than the exception. Even in 2021, working with native 8K video or high frame rate 4K video (60 frames per second or more) is no joke.

Storage and workstation performance can be huge bottlenecks during post. These bottlenecks can be particularly problematic for “hero” seats that work with uncompressed media in real time. Remote Direct Memory Access (RDMA) technology improves storage and workstation performance simultaneously for systems handling the most demanding content. This article examines RDMA for NFS storage traffic over Ethernet.

Why NFS? Linux is the operating system of choice for media professionals working with applications that support the most challenging media. Even if applications have Windows or macOS variants, the Linux version is the one used at the truly high end. The native way for a Linux computer to access network storage is NFS, and in particular NFS over TCP.

RDMA is a protocol that allows a client system to copy data from a storage server’s memory directly into that client’s memory, bypassing many of the buffering layers inherent to TCP. This direct communication improves storage throughput and reduces latency in moving data between server and client. It also reduces CPU load on the client and storage server.

When the client workstations, network, and storage all support NFS over RDMA (NFSoRDMA), performance can be massively boosted simply by changing how the network storage is mounted. The performance gains of RDMA are impressive. All other things being equal, RDMA can deliver up to twice the throughput of TCP, with a similar drop in workstation CPU utilization.
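To make the mount step a little more concrete, here is a minimal sketch that assembles a Linux mount command requesting the RDMA transport. The `proto=rdma` and `port=20049` options are standard Linux NFS client mount options, but the server name, export path, mount point, and NFS version below are placeholders I made up for illustration. Check your storage vendor's documentation for the values your environment actually needs.

```python
import shlex


def nfs_rdma_mount_command(server, export, mount_point, nfs_version="3", port=20049):
    """Build a Linux mount command that asks the NFS client to use RDMA.

    proto=rdma selects the RDMA transport; 20049 is the port registered
    for NFS over RDMA. The NFS version is an assumption, not a requirement.
    """
    options = f"vers={nfs_version},proto=rdma,port={port}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mount_point]


# Hypothetical server, export, and mount point, for illustration only.
command = nfs_rdma_mount_command("storage.example.com", "/ifs/media", "/mnt/media")
print(shlex.join(command))
```

The sketch only prints the command rather than running it, since mounting requires root privileges and an RDMA-capable network adapter on the client.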

Let’s take a look at some real-world examples in media creation. First up, 8K uncompressed. Uncompressed video puts less strain on the workstation (no real-time decompression), but file sizes and bandwidth requirements are huge. In the testing for this article, an 8K DPX image sequence was put on Dell PowerScale network storage. As an image sequence, each frame of video is a separate file. At 8K resolution, each file is approximately 190 MB. Sustaining playback at 24 frames per second requires roughly 4.5 GB/s. Long story short, the image sequence would not play with the storage mounted over TCP. Mounting the exact same storage using RDMA was a night and day difference: 8K video at 24 frames per second over the network!
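The bandwidth figure is simple arithmetic: frame size multiplied by frame rate. A quick sketch, assuming decimal units (1 GB = 1000 MB) and a helper name of my own choosing:

```python
def required_throughput_gb_per_s(frame_size_mb, frames_per_second):
    """Sustained read bandwidth needed for real-time playback, in GB/s.

    Assumes decimal units (1 GB = 1000 MB) and one file per frame.
    """
    return frame_size_mb * frames_per_second / 1000


# 8K DPX frames of roughly 190 MB each, played back at 24 frames per second.
print(required_throughput_gb_per_s(190, 24))  # 4.56
```

That works out to about 4.56 GB/s, in line with the figure quoted above, and it has to be sustained for as long as playback runs.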

Now let’s take a look at workstation performance. Uncompressed 8K video is unwieldy to store or work with, and the number of facilities truly working in uncompressed 8K is small. 6K PIZ compressed OpenEXR is a more common format. OpenEXR is another image sequence format (file per frame), and PIZ compression is lossless, retaining full image fidelity. The PIZ compressed image sequence had frames between 80 MB and 110 MB each. Sustaining playback at 24 frames per second required around 2.7 GB/s. This bandwidth is less than uncompressed 8K but still substantial. The real challenge is that the workstation also needs to decompress each frame. Playback dropped frames with the network storage mounted using TCP: the combined CPU cost of reading each 6K frame from network storage and decoding it was too much. RDMA was key for this kind of playback. Remounting the storage using RDMA enabled smooth playback of this OpenEXR 6K PIZ image sequence over the network.

I also tested other common video formats: Sony XAVC and Apple ProRes 422HQ at full 4K DCI resolution and 59.94 frames per second. The application I used for playback reports dropped frames separately for the video disk, the GPU, and the broadcast output. Whether the file system was mounted using TCP or RDMA, the video disk never dropped a frame. The storage was plenty fast, as were the Nvidia RTX GPUs. With the file system mounted using TCP, however, the broadcast output dropped thousands of frames because the workstation could not keep up. RDMA was a different story: smooth broadcast output and essentially no dropped frames. In this case, it was all about the CPU cycles freed up by RDMA.
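To put the CPU pressure in perspective, it helps to think in terms of a per-frame time budget: the workstation has one frame interval to read, decode, and output each frame. The sketch below is plain arithmetic on the frame rates used in this article. The helper name is my own and is not taken from any playback application.

```python
def frame_budget_ms(frames_per_second):
    """Time available to read, decode, and output one frame, in milliseconds."""
    return 1000.0 / frames_per_second


# Frame rates used in the tests described in this article.
for fps in (24, 59.94):
    print(f"{fps} fps -> {frame_budget_ms(fps):.2f} ms per frame")
# 24 fps -> 41.67 ms per frame
# 59.94 fps -> 16.68 ms per frame
```

At 59.94 frames per second the budget is well under 20 ms per frame, so any CPU time spent on network protocol handling comes straight out of what is left for decoding and output.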

That was a lot of information, so let me put it plainly: NFS over RDMA will play a vital role for creative companies working with uncompressed 8K or high frame rate 4K video.

Gregory Shiff is Principal Solutions Architect, Media & Entertainment at Dell Technologies