NVMe-oPF: Designing Efficient Priority Schemes for NVMe-over-Fabrics with Multi-Tenancy Support

Darren Ng; Andrew Lin; Arjun Kashyap; Guanpeng Li; Xiaoyi Lu

doi:10.1109/IPDPS57955.2024.00052

Conference proceeding

2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp.519-531

05/27/2024

Abstract

Resource disaggregation is prevalent in datacenters since it provides high resource utilization when compared to servers dedicated to either compute, memory, or storage. NVMe-over-Fabrics (NVMe-oF) is the standardized protocol for accessing disaggregated network storage. Currently, the NVMe-oF specification lacks semantics to prioritize I/O requests based on different application needs. Since applications have varying goals - latency-sensitive or throughput-critical I/O - we need to design efficient schemes to allow applications to specify the type of performance they wish to achieve. To this end, we propose a new NVMe-over-Priority-Fabrics (NVMe-oPF) protocol with multi-tenancy support that allows applications to specify whether to optimize for latency or throughput. NVMe-oPF proposes coalescing request completions, lock-free optimization, zero-copy queues, out-of-order request completion handling, and window size optimization for the specific I/O patterns, queue depths, and I/O sizes that yield the best performance. Our NVMe-oPF-10Gbps can achieve up to 2.94X improvement in throughput and reduces tail latency by up to 32.1% for highly concurrent multi-tenant read workloads when compared to the state-of-the-art userspace NVMe-oF runtime design in Intel Storage Performance Development Kit (SPDK). For write workloads with 100Gbps, NVMe-oPF achieves a 32.6% increase in throughput while maintaining low latency compared to SPDK. We also bring performance benefits to the application level with HDF5 by increasing write workload throughput by 25.2% in larger-scale experiments.

Semantics

Disaggregated Storage

Multi-Tenancy

Nonvolatile memory

NVMe-over-Fabric (NVMe-oF)

Out of order

Priority Scheme

Protocols

Runtime

Tail

Throughput

Details

Title: NVMe-oPF: Designing Efficient Priority Schemes for NVMe-over-Fabrics with Multi-Tenancy Support
Creators: Darren Ng - University of California, Merced
Andrew Lin - University of California, Merced
Arjun Kashyap - University of California, Merced
Guanpeng Li - University of Iowa, Computer Science
Xiaoyi Lu - University of California, Merced
Resource Type: Conference proceeding
Publication Details: 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp.519-531
Publisher: IEEE
DOI: 10.1109/IPDPS57955.2024.00052
ISSN: 1530-2075
eISSN: 1530-2075
Grant note: NSF: 2321123, 2340982; DOE: DE-SC0024207
We are grateful to our anonymous reviewers for their invaluable feedback on the paper. We would like to thank our lab mates, with special appreciation to Weicong Chen and Yuke Li for their valuable input. We gratefully acknowledge the computing resources provided on Chameleon Cloud and CloudLab which provide us with our testbed for all our experiments. This work was supported in part by NSF research grants OAC #2321123 and #2340982 and a DOE research grant DE-SC0024207.
Language: English
Date published: 05/27/2024
Academic Unit: Computer Science
Record Identifier: 9984658353902771
