On the efficient implementation and parameter selection for TDA Mapper
Details
- Title: Subtitle
- On the efficient implementation and parameter selection for TDA Mapper
- Creators
- Ethan A. Rooke
- Contributors
- Isabel K Darcy (Advisor); Colleen C Mitchell (Committee Member); James Traer (Committee Member); Charles D Frohman (Committee Member)
- Resource Type
- Dissertation
- Degree Awarded
- Doctor of Philosophy (PhD), University of Iowa
- Degree in
- Mathematics
- Date degree season
- Summer 2023
- Publisher
- University of Iowa
- DOI
- 10.25820/etd.006912
- Number of pages
- xiv, 80 pages
- Copyright
- Copyright 2023 Ethan A. Rooke
- Language
- English
- Date submitted
- 07/25/2023
- Description illustrations
- color illustrations
- Description bibliographic
- Includes bibliographical references (pages 78-80).
- Public Abstract (ETD)
In 1982, John Naisbitt wrote, “We are drowning in information but starved for knowledge.” Since then, our capacity to store data as a species has only grown, roughly doubling per capita every three years. This data, if properly utilized, has the capacity to meaningfully improve our lives. For instance, genomic research such as the Human Genome Project has led to revolutions in the study of genetic diseases. This work, however, would not be possible without large datasets. Thus it is imperative that we find ways to deal with this torrent of data. Luckily, mathematicians are no strangers to complicated objects: many fields of mathematics have techniques for simplifying these objects while maintaining critical aspects of their structure. One field in particular, topology, focuses on simplifications which preserve the “shape” of the data. The obvious question presents itself: “Can we utilize these techniques to analyze our data?” Unfortunately, this isn’t straightforward, because topology, like many branches of mathematics, assumes that its objects are continuous, which is often not the case with real-world data. Topological data analysis is a field which tries to bring these topological tools from the ivory tower of mathematics down into the trenches of data analytics. Mapper is one such algorithm, aiming to bring a construction known as the Reeb graph to discrete data. In a nutshell, Mapper decomposes a data set into overlapping chunks. It then analyzes and displays how these chunks connect together. This results in a low-dimensional visualization that allows for human inspection. Mapper has already found success in differentiating breast cancer strains, analyzing disease trajectories, and even detecting financial fraud. This thesis is focused on making Mapper more accessible. To use Mapper, one must choose how to decompose the data. Different choices here lead to radically different outcomes. As it stands, there is no good understanding of how to make these choices.
This thesis provides an introduction to the common choices practitioners make, tools to analyze how robust various features are to different choices of parameters, and visualization strategies to help understand Mapper’s output.
- Academic Unit
- Mathematics
- Record Identifier
- 9984454540102771