CCE Theses and Dissertations

Campus Access Only

All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.

Database Streaming Compression on Memory-Limited Machines

Damon F. Bruccoleri, Nova Southeastern University

Date of Award

2018

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science (CISD)

Department

College of Engineering and Computing

Advisor

Junping Sun

Committee Member

Wei Li

Committee Member

Jamie Raigoza

Keywords

Compression, FPGA Compression, High Frequency Trading, Huffman, Minimum delay, Streaming Compression

Abstract

Dynamic Huffman compression algorithms operate on data streams with a bounded symbol list. With these algorithms, the complete list of symbols must be contained in main memory or secondary storage. A streaming transaction database in horizontal format can have a very large item list. A Huffman tree with many nodes taxes both the primary memory of the processing hardware and the processing time needed to maintain the tree dynamically. This research investigated Huffman compression of a streaming transaction database with a very large symbol list, where each item in the transaction database schema’s item list is a symbol to compress. The constraint of a large symbol list is, in this research, equivalent to the constraint of a memory-limited machine. A large symbol set results when each item in a large database item list is a symbol to compress in a database stream. In addition, database streams may have a temporal component spanning months or years. Finally, the horizontal format is the format best suited to a streaming transaction database because the transaction IDs are not known beforehand. This research prototypes an algorithm that compresses a transaction database stream. The memory-limited dynamic Huffman algorithm has several advantages. Dynamic Huffman algorithms are single-pass algorithms, and in many instances, such as with streaming databases, a second pass over the data is not possible. Previous dynamic Huffman algorithms are not memory-limited: their memory requirement is O(n), where n is the number of distinct item IDs, so memory must grow to fit the n items. The improvement of the new memory-limited dynamic Huffman algorithm is that it has an O(k) asymptotic memory requirement, where k is the maximum number of nodes in the Huffman tree, k < n, and k is a user-chosen constant. The new memory-limited dynamic Huffman algorithm compresses horizontally encoded transaction databases that do not contain long runs of 0’s or 1’s.
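The O(k) memory bound described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's algorithm, which the abstract does not specify. It assumes one common way to bound memory: track at most k symbols, code unseen items through an escape symbol, and evict the least frequent tracked item when the table is full. The names `ESCAPE`, `BoundedSymbolTable`, and `code_lengths` are hypothetical. Here k bounds the number of leaves, so the tree has at most 2k - 1 nodes, which is still O(k).

```python
import heapq
from itertools import count

ESCAPE = object()  # hypothetical marker for "item not currently tracked"


class BoundedSymbolTable:
    """Frequency table holding at most k symbols (including ESCAPE)."""

    def __init__(self, k):
        if k < 2:
            raise ValueError("k must allow ESCAPE plus at least one item")
        self.k = k
        self.counts = {ESCAPE: 1}

    def observe(self, item):
        """Return the symbol that would be coded for item, then update counts."""
        if item in self.counts:
            self.counts[item] += 1
            return item
        # Untracked item: a real coder would emit ESCAPE followed by the raw ID.
        self.counts[ESCAPE] += 1
        if len(self.counts) >= self.k:
            # Assumed eviction policy: drop the least frequent tracked item.
            victim = min((s for s in self.counts if s is not ESCAPE),
                         key=self.counts.get)
            del self.counts[victim]
        self.counts[item] = 1
        return ESCAPE


def code_lengths(counts):
    """Huffman code length per symbol, built from the current counts."""
    if len(counts) == 1:
        return {s: 1 for s in counts}
    tie = count()  # unique tie-breaker so heap entries never compare lists
    heap = [(c, next(tie), [s]) for s, c in counts.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in counts}
    while len(heap) > 1:
        c1, _, group1 = heapq.heappop(heap)
        c2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:
            lengths[s] += 1  # each merge adds one bit to every member's code
        heapq.heappush(heap, (c1 + c2, next(tie), group1 + group2))
    return lengths
```

This sketch rebuilds the code lengths from scratch for clarity. A dynamic Huffman coder would instead update the tree incrementally after each symbol, which is what makes single-pass streaming practical.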

NSUWorks Citation

Damon F. Bruccoleri. 2018. Database Streaming Compression on Memory-Limited Machines. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, College of Engineering and Computing. (1031)
https://nsuworks.nova.edu/gscis_etd/1031.
