CEC Theses and Dissertations


The Design of a Logarithmic File Data Allocation Algorithm for Extent Based File Systems

Date of Award


Document Type


Degree Name

Doctor of Philosophy (PhD)


Graduate School of Computer and Information Sciences


Gregory Simco

Committee Member

Lee J. Leitner

Committee Member

Junping Sun


I/O has become the major bottleneck in application performance as processor speed has skyrocketed over the last few years, leaving storage hardware and file systems struggling to keep pace. This study presented a new methodology to evaluate workload-dependent file system performance. Applying the new methodology, the benchmarks conducted on the VxFS file system served as a baseline for the design of a new logarithmic file data allocation algorithm for extent-based file systems. Most of the I/O performance evaluations conducted today have reached a steady state in which new developments are evolutionary, not revolutionary. The performance model that was introduced in this thesis is revolutionary, as it produces an application characterization vector as well as a system characterization vector, which together allow researchers to conduct an in-depth analysis of a complete system environment.

The constructed performance hierarchy incorporates performance dependencies at all levels of a system, which allows comparing, evaluating, and analyzing the system at any level of abstraction. Combining the performance characterization vectors yields a single, application-specific metric that allows the analyst to conduct a sensitivity study. The characterization vectors and the resulting single metric satisfy the principal demands of I/O performance analysis, in particular systems evaluation, comparison, and optimization.

The main argument made throughout this study is that systems performance has to be measured in the context of a particular application and a particular workload. The performance evaluation methodology presented in this thesis enables such measurements. Utilizing the methodology, the in-depth file system performance analysis conducted on different hardware and software configurations revealed that the new approach taken in this study was successful, as it was possible to rank the test systems in their proper order of relative performance. The methodology made it possible to normalize the results and to quantify performance in terms of performance paths that included an application, an operating system, and the underlying hardware subsystems.

The file system benchmarks conducted on the VxFS and UFS file systems disclosed the strengths and weaknesses of an extent-based and a block-based file system design, respectively. In both architectures, fragmentation substantially impacts I/O performance as the file systems age. The performance measurements outlined the correlation between internal and external fragmentation, and made a strong case for a much enhanced file data allocation algorithm for extent-based file systems. The second part of this research introduced a new file data allocation algorithm for extent-based file systems that is unique in its approach in that it questioned established boundaries and redistributed existing responsibilities. The new allocation algorithm fulfilled the major requirement of increasing I/O performance by lowering internal fragmentation without significantly increasing the metadata overhead. The presented analytical data model, as well as the actual simulation of the new file data allocation algorithm, demonstrated the great potential of the new design.
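The trade-off between internal fragmentation and metadata overhead mentioned above can be illustrated with a minimal, generic sketch. It is not the allocation algorithm proposed in the thesis; the function names, the file size, and the extent sizes are hypothetical values chosen only for illustration. For a file stored in whole fixed-size extents, larger extents mean fewer extents to track (less metadata) but more allocated-yet-unused space in the last extent.

```python
def extent_count(file_size: int, extent_size: int) -> int:
    """Number of whole extents needed to hold file_size bytes (hypothetical helper)."""
    if file_size < 0 or extent_size <= 0:
        raise ValueError("file_size must be >= 0 and extent_size must be > 0")
    return -(-file_size // extent_size)  # ceiling division


def internal_fragmentation(file_size: int, extent_size: int) -> int:
    """Bytes allocated but unused when the file is stored in whole extents."""
    return extent_count(file_size, extent_size) * extent_size - file_size


if __name__ == "__main__":
    file_size = 10_000  # bytes, assumed example value
    for extent_size in (1_024, 8_192, 65_536):  # assumed extent sizes
        print(
            extent_size,
            extent_count(file_size, extent_size),
            internal_fragmentation(file_size, extent_size),
        )
```

In this toy setting, moving from 1,024-byte to 65,536-byte extents reduces the number of extents from 10 to 1 but raises the wasted space from 240 to 55,536 bytes, which is the tension an improved allocation policy has to balance.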

The study concluded with the recommendation for a new I/O model that is fundamentally different from all the existing ones. The vision is that a completely redesigned I/O model is necessary to provide adequate I/O performance. To be efficient, the new I/O model will have to work around the expensive memory copy between user and kernel address space. Moving the data in and out of virtual address space is pure overhead, and if the subsystem that is moving the data does not have any virtual memory technique implemented to avoid the data copy, performance is limited to approximately 1/4 of the memory speed. The next-generation I/O model envisioned in this study focuses on eliminating all unnecessary overhead in the I/O path, allowing the hardware to operate at its full potential.
