(cum laude: ranked among the top 5% at TU/e)
Algorithms for Context-Aware Trajectory Analysis (PDF
Spatio-temporal trajectories are a ubiquitous source for analyzing patterns of mobility of people and animals. The major challenges when working with these trajectories are to get clean, complete data and to visualize and the vast amount of data that is available. To be able to tackle these challenges in meaningful ways, we leverage context: the factors, internal or external, that influence movement of people and animals. Using this context in algorithms provides us with a way to infer more information. For instance, what are reasonable trajectories if we know that the moving entity has physical limits? In this thesis, we provide novel algorithms and approaches that take into account the spatial environment, namely the road-network, for traces of vehicles and considers the real-world physical context of trajectories.
Parameterized Algorithms for Finding Large Sparse Subgraphs: Kernelization and Beyond (PDF
The study of parameterized problems shows that in many cases large instances of hard problems can still be solved efficiently under the condition that the parameter is small. The parameter measures, in a sense, the complexity of the instance. Kernelization is the first preprocessing framework that makes use of this. A kernelization algorithm guarantees that, in polynomial time, large problem instances can be reduced to a size only dependent on the parameter. A more general notion is Turing kernelization, where a large problem instance is reduced to potentially multiple small instances. In this thesis we investigate these preprocessing techniques for a number of problems revolving around finding large subgraphs that have a certain property. We also argue for a new perspective on preprocessing, where a size reduction is not the ultimate goal, and instead the focus is on finding a part of the solution.
Reliable Geometric Spanners (PDF
The quality of geometric networks like road networks is often measured based on distances: a good network should provide relatively short routes between the nodes of the network. In this thesis we study the robustness of an important class of geometric networks, namely geometric spanners. The vertex set of a geometric graph is a set of points in the plane (or some higher-dimensional Euclidean space) and the edges are straight line segments, weighted with the Euclidean distance between their endpoints. For some $t > 1$, such a network is a $t$-spanner if the length of the shortest path between any two vertices is at most $t$ times their Euclidean distance. The factor $t$ is called the dilation (or stretch factor, spanning ratio) of the network. Since the spanning ratio of the complete graph is always one, the challenge is to construct spanners with constant dilation that have as few edges as possible. There are several constructions of linear size spanners, which are optimal. Naturally, more edges are needed for spanners that can resist to failures. We give several constructions of spanners that can survive massive failures, while having only a few edges.
Algorithms for Coherent Rectangular Visualizations (PDF
To gain insight into the large amounts of data that chronicle our life, work and world, we often turn to visualizations to use our powerful perceptual system for (parts of) the analysis. For a visualization to be effective, it has to be coherent: relations between data items are visually represented, and visible relations are present in the data. A coherent visualization thus prevents false patterns from emerging in the visualization, and preserves patterns that exist in the data. In this thesis we focus on rectangular visualizations, that is, visualizations that use rectangles as their fundamental building blocks. We describe several new algorithms that transform different types of data into coherent rectangular visualizations.
Stability of Geometric Algorithms (PDF
A large amount of data that is collected nowadays is time-varying: air traffic, stock prices and weather are all examples of every-day data that changes over time. To aid humans in their decision making, this data has to be analyzed quickly and communicated effectively. Algorithms and visualizations together play an important part in both these tasks. For time-varying data it is especially important to not only communicate data at one point in time, but also to show the evolution of the data over time. To preserve temporal patterns in visualizations, it is important that they are stable: small changes in the data should result in small changes in the visualization. In this thesis we introduce theoretical tools for analyzing the stability of algorithms. More specifically, we propose a framework that allows us to better understand the trade-offs between stability and traditional criteria for evaluating algorithms, such as solution quality and running time. We apply the framework to a variety of problems in computational geometry. Furthermore, we develop stable methods to generate a visual summary, a static visualization method that shows changes in time-varying data.
Tight Parameterized Preprocessing Bounds: Sparsification via Low-Degree Polynomials (PDF
Many of the problems we are interested in solving algorithmically are known to be NP-hard and can (probably) not be solved efficiently. In this thesis we will study preprocessing algorithms for such problems. In particular, we will be interested in polynomial-time preprocessing algorithms that guarantee that the preprocessed instance is equivalent to the original one, and that its size is bounded by a function of some parameter of the original instance. Such preprocessing algorithms are known as kernels, and the bound on the size of the resulting instance is the size of the kernel. Naturally, one can ask what the best-possible size is for a certain problem, with a certain parameter. In this thesis, we mostly focus on finding preprocessing algorithms for the very general constraint satisfaction problem. Given a number of constraints over a set of variables, the problem asks to set values to the variables that satisfy all constraints. We obtain kernel upper and lower bounds, depending on the type of constraints that occur in an instance.
Algorithms for Visualization in Digital Humanities (PDF
This dissertation encompasses the most important results of the interdisciplinary research project CatVis, which is a collaborative effort between computer scientists, data scientists, and philosophers. We describe a number of methodological and philosophical challenges that arose within CatVis. After that we present GlamMap, a visualization tool for large, multi-variate georeferenced Humanities data sets. Inspired by GlamMap, we present a fully dynamic kinetic data structure that maintains a set of $n$ disjoint growing squares. This leads to an $O(n\,\alpha(n) \log^5 n)$ time algorithm to solve the agglomerative clustering problem. Although our algorithm is asymptotically much faster than previous approaches, from a practical point of view, it does not perform better for $n \leq 10^6$. We present a new agglomerative clustering algorithm which works efficiently in practice. Other results contained in the dissertation are a novel type of low distortion radial embedding and a study of plane supports of hypergraphs.
Algorithms for River Network Analysis (PDF
Braided rivers and estuaries are examples of river networks: systems consisting of intertwined channels and islands, that can behave in complex and unpredictable ways. The study of river networks is important because they play a crucial part in the natural environment. Therefore, river networks form one of the key research interests in geomorphology, but so far their analysis has been carried out mostly by hand. In this thesis, we take an algorithmic approach to the study of river networks. We employ techniques from topology on a digital elevation model of the river to automatically construct a graph representing significantly different channels of water in the river network, and we apply this method to the analysis of two real-world rivers. Furthermore, we study methods to track the evolution of river network over time, and a method to create more compact visualizations of a network.
ETH-tight algorithms for geometric network problems (PDF
IPA Dissertation Award 2019
EATCS Distinguished Dissertation Award 2020
The thesis establishes new algorithmic and lower bound techniques for geometric network problems. Geometric networks arise in many applications. One such application is in sensor networks, which are often modeled as so-called unit disk graphs; these are graphs whose nodes corresponds to disks in the plane and where two nodes are connected if their disks intersect. They have several generalizations: to higher dimensions (called unit ball graphs), and to intersections of other ball-like objects, called fat objects and similarly sized fat objects. The first part of the thesis presents an algorithmic framework for solving NP-hard problems in geometric intersection graphs of similarly sized fat objects. The studied problems are classics of graph theory, such as Independent Set, Dominating Set, Steiner Tree, Hamiltonian Cycle. The second part of the thesis is about lower bounds on the running times of algorithms in geometric intersection graphs.
Dynamic range and frequency assignment problems (PDF
More and more communication is nowadays performed using wireless networks: mobile phone telecommunication, robots or other devices that communicate via Bluetooth, and so on. The optimization of such wireless communication systems gives rise to many challenging algorithmic questions. Most problems studied in this thesis are inspired by this type of questions. The first question we study is inspired by the problem of assigning frequencies to base stations in a mobile phone network. In the mobile phone telecommunication case, the locations of the base stations are given, together with the range within which they can serve customers. Since typically the ranges overlap, if a customer stands within the range of two different base stations that use the same frequency, then interference occurs and the customer cannot communicate with either of the two base stations. On the other end of the spectrum, if all base stations use a different frequency, the frequency range needed is large and hence costly or even not available at all. We therefore want to assign a frequency to each base station such that in any location within the range of at least one device, it is possible to communicate with at least one base station without suffering from interference. The second problem is inspired by the problem of minimizing power consumption in wireless communication networks. Here one is given a collection of devices that can send information to other devices within their transmission range. A device can then communicate with another one via other devices, that is the devices form a communication network and a device can send a message to another one if there is a path from the former to the later device. To ensure connectivity between devices, it is obviously possible to have each device transmit to each other device by assigning large ranges. However this is not desirable as the power cost needed for such ranges can be too high, forcing, for instance, football robots to carry heavy batteries that would slow them down. On the other hand, if the ranges are too small, two robots might not be able to communicate with one another. We hence want an assignment of ranges that maintain a certain connectivity while minimising the energy cost.
Faster Algorithms for Geometric Clustering and Competitive Facility-Location Problems (PDF
In this thesis we study algorithmic problems related to clustering and competitive facility location of point sets in 2- and higher-dimensional space. In the geometric clustering problems we consider, we want to partition a given set of points in the plane into a number of clusters, so that a certain cost function is minimized. This cost function could be the sum of radii of the clusters or some other geometric measure. One of the contributions of the thesis, is an algorithm to compute an approximately optimal clustering that works for a large class of cost functions, which we call regular functions. Next we focus on a specific cost function, namely the sum of the perimeters of the clusters. We present the first sub-quadratic algorithm to compute an optimal clustering for this cost function, for the case where the number of clusters is two. We also present an efficient approximation algorithm that is specifically optimized for this cost function and this number of clusters. Our final contribution on geometric clustering problems is the first polynomial-time algorithm to compute an optimal clustering (for the sum-of-perimeters as cost function) where the number of clusters is part of the input.
Continuous Similarity Measures for Curves and Surfaces (PDF
IPA Dissertation Award 2018
Distinguishing objects is a crucial task for survival. Animals survive by distinguishing predators from prey, and poisonous plants from edible ones. As such, objects come in different types, and to recognize the type of an object, one needs a description of it. In theory, one would best describe it by listing all objects of that type. However, there tend to be many objects of a given type, and sometimes, objects of the same type may not have been discovered. A more practical way to describe a type would be to provide a collection of objects of that type, complemented by a collection of objects not of that type. For an object of a given type, we expect that it is similar to the objects of the same type, and dissimilar from other objects. To test whether an object is of a given type, one can compare it to other objects that do and do not fit the description of that type. The similarity of objects can be measured in many different ways, and different applications tend to have different similarity measures that are suitable. In this thesis, we investigate computational aspects of various similarity measures between curves, as well as surfaces.
Trajectory Analysis: Bridging Algorithms and Visualization (PDF
In this thesis, we explore how combining algorithms and visualization can enhance the analysis of movement data. Thus, we aim to fill the methodological gap between algorithms and visualization by integrating computations, their context, and their visual representations more closely. Filling this gap will help movement analysts to externalize their cognition by integrating algorithmic means, visual means, and their domain knowledge into a holistic tooling. The contributions of this thesis make it possible for movement analysts to benefit from the exchange of ideas and concepts between algorithms and visualization.
Geographic Graph Construction and Visualization (PDF
A geographic graph is a way to represent information about the network of connections between geographic locations. As our small world has grown ever more interconnected, it has become an important way to represent data from a wide variety of sources. In this thesis we consider methods for the efficient construction of a geographic graph from a set of locations. The graphs created by our algorithms are constructed to have various useful properties and applications to real world problems. Furthermore, we propose and evaluate methods to visualize existing geographic graphs. These visualizations help us to explore the structure of geographic graphs as well as communicate their properties to a wide audience.
Data Structures for Analyzing Geometric Data (PDF
The amount of data that are being generated, collected, and stored is growing rapidly and a large body of the data are geometric in nature. In many applications, in particular those involving geographic data, the data needs to be stored in such a way that relevant parts of the data can be retrieved quickly. More precisely, one often wants to store the data in such a way that certain queries about the data can be answered quickly. Thus the art of designing efficient geometric data structures has always been a core topic within computational geometry. Unfortunately, traditional data structures are far from being able to get involved in many modern applications, as they are neither able to store complex objects nor to support queries that involve complicated analysis tasks. Inspired by the two downsides of traditional data structures, in this thesis we design and analyze a number of new efficient data structures that go beyond traditional setting.
Arthur van Goethem
Algorithms for Curved Schematization (PDF
Schematization is the process of creating a simplified and compact representation of the input data. It can structure and simplify information, reduce the illusion of accuracy, and help create a striking and powerful message. In this thesis we look at possibilities for automated schematization using curves. We present different algorithms that may be used to schematize territorial outlines or networks. Algorithms are presented that allow the use of different curves, the introduction of new vertices, and schematization across junction points. Furthermore we also discuss linear cartograms and provide an alternative model where distortion is restricted to the (road) network. An efficient algorithm is described to compute a possible placement of all edges. Lastly we explore the limits of schematization. Traditional schematization considers shape schematization as a schematization of the boundary of a shape. We present a radically different type of geometric abstraction - the stenomap. Stenomaps represent two-dimensional input shapes by smoothly curving linear glyphs.
Cartographic Modelling for Automated Map Generation (PDF
For many applications, maps are the most effective visualisation of geographic phenomena. Automated map generation, once thought to be impossible, has celebrated great successes in recent years. These successes were concentrated on large-scale topographic map production and roadmap-style web-maps. Many more specialised map types, especially from the realm of thematic maps, remain challenges both interesting and relevant. The previous efforts were especially successful when the crucial design specifications were well understood and properly formalised. When these preconditions were met, algorithmic and technical refinement could take over. In this thesis we investigate several areas of cartographic design specifications that have so far been mostly neglected. We concentrate on cartographic modelling for map labelling and medium-scale map generation. We put a special focus on automated schematisation for chorematic diagrams.
Similarity Measures and Algorithms for Cartographic Schematization (PDF
A schematic map focuses on displaying specific information rather than full geographic detail. Such a map uses few geometric elements to convey a geographic shape. The geometry often adheres to a certain style, such as the octilinear style which only permits horizontal, vertical and 45-degree diagonal lines. For a map to be effective, geographic shapes such as countries and provinces must still be recognizable. In this thesis we investigate similarity measures and algorithms in the context of automated cartographic schematization, the process of creating schematic maps. Similarity measures are required to quantify resemblance between schematic and geographic shapes. We improve an existing similarity measure and develop new algorithms to compute schematic representations that resemble the geographic shapes.
Kinetic Data Structures in the Black-Box Model (PDF
In recent years there has been a growing interest in maintaining geometric data structures as the input moves over time. The straightforward approach is to recompute the structure from scratch whenever the input changes, but this can be wasteful if the changes are small. To avoid unnecessary computations, Kinetic Data Structures were introduced by Basch, Guibas and Hershberger in 1997. In this framework we assume that the trajectories of the objects are known and we can compute when to change the data structure. This leads to an event-driven approach for which several efficiency guarantees can be proven. Unfortunately, there are many situations when the trajectories are unknown. For these applications we introduce the black-box model to model the motions of the input and design efficient algorithms to maintain several basic data structures within this model.
Pushing and Pulling: Computing push plans for disk-shaped robots and dynamic labelings for moving points (PDF
This thesis tackles two problems that seem unrelated at the surface. The first is from robotics: a disk-shaped robot needs to move a disk-shaped object, too heavy for it to pick up. To this end, we compute a path that will make it push the object to the desired location, avoiding obstacles along the way. The second problem is related to air-traffic control and interactive maps. A set of points on the screen, representing air planes or cities, is to be assigned textual labels. As the air planes move, or the user's view onto the cities is panned, rotated, or zoomed, the labels need to be moved smoothly, as if pulled by the points. By exploiting the symmetry between pushing and pulling, we can re-use some of the algorithmic techniques used to solve the one problem in solving the other.
Algorithms for Cartographic Visualization (PDF
Maps play an important role in our society. We use maps to find our way from one place to another, or to locate points of interest, like hotels, restaurants, and tourist attractions. But maps not only help us navigate through our environment. They are effective tools for communicating various types of information and help people to make decisions in, for example, spatial planning and politics. Recent technological advances and the changing needs of map users require maps to be created on-demand. Hence we need methods for the automated construction of maps. In this thesis we consider the automated construction of thematic maps, in particular schematic maps, cartograms, necklace maps, and flow maps. We study the underlying algorithmic problems and we present efficient algorithms to construct these maps.
Amirali Khosravi Dehkordi
Optimal Geometric Data Structures (PDF
Spatial data structures form a core ingredient of many geometric algorithms, both in theory and in practice. Many of these data structures, especially the ones used in practice are based on partitioning the underlying space (examples are binary space partitions and decompositions of polygons) or partitioning the set of objects (examples are bounding-volume hierarchies). The efficiency of such data structures—and, hence, of the algorithms that use them—depends on certain characteristics of them such as their size or their crossing number. Much research has been done on the problem of finding such data structures whose characteristics are good in the worst case. In this thesis, we studied the problem from a different point of view, namely instance-optimality. In particular, we considered the following question: given a class of geometric partitioning structures—like BSPs, simplicial partitions, polygon decompositions, …—and a cost function—like size or crossing number—can we design an algorithm that computes a structure whose cost is optimal or close to optimal for any input instance (rather than only worst-case optimal).
Analysis of Flow and Visibility on Triangulated Terrains (PDF
Landscapes and their morphology have been widely studied for predicting physical phenomena, such as floods or erosion, but also for planning human activities effectively, such as building prominent fortifications and watchtowers. Nowadays, the study of terrains is done in a computer-based environment; terrains are modelled by digital representations, and algorithms are used to simulate physical processes like water flow and to compute attributes like visibility from certain locations. One of the most popular digital terrain representations are the so-called Triangulated Irregular Networks (TINs), that is, piecewise linear surfaces that consist of triangles. In the current thesis we examine several complexity issues related with the computation of flow and visibility structures on triangulated terrains. Among other results, we provide novel algorithms for extracting information on the drainage attributes of such terrains efficiently. We also study the influence of noise in the computation of drainage and visibility structures on triangulated terrains.
Randomized Approximation Algorithms: Facility Location, Phylogenetic Networks, Nash Equilibria (PDF
For many problems of practical interest it seems to be impossible to compute their optimal solution in polynomial time. A $c$-approximation algorithm is a polynomial-time algorithm that produces a solution which may not be optimal, but is never more than a factor $c$ off from the optimum. The factor $c$ is then called the approximation ratio. A randomized $c$-approximation algorithm is an approximation algorithm which makes decisions based on the outcome of random events (say, coin tosses) so that its approximation ratio is a random variable with expected value $c$. In this thesis, randomized approximation algorithms are given for a number of combinatorial optimization problems, including metric uncapacitated facility location, construction of phylogenetic networks, computation of Nash equilibria, and a generalization of set cover. The algorithms can be derandomized by the method of conditional expectation to give guaranteed approximation ratios that often improve on the previously best known.
Drawing Graphs for Cartographic Applications (PDF
Graphs are ubiquitous in computer science, and visualizing them in a good way has become the topic of the research area known as graph drawing. It combines techniques from graph theory, algorithms, (computational) geometry, and (computational) topology, and has applications to cartography, VLSI design, and information visualization, among others. This thesis studies three graph drawing problems. The first problem is the construction of rectilinear cartograms, where each region on a map is deformed into a rectilinear polygon whose area represents some value associated with the area (say, the population of a country). The resulting cartogram should resemble the original map as closely as possible in terms of shapes and adjacencies of the regions, while at the same time faithfully representing the value associated with each region. The second problem concerns cased drawings of graphs, where edge crossings are drawn in a way that suggests that one edge goes underneath the other. The third problem is to decide for certain curves and surfaces whether they could occur as the boundary of some shape.
Algorithms for Fat Objects: Decompositions and Applications (PDF
Traditionally, the analysis of the running time of an algorithm is a worst-case analysis. Because of this, the bound on the running time is determined by the most difficult inputs, even when such inputs are non-existent in the application at hand. This is true in all areas of algorithms research, but it is particularly problematic for geometric algorithms. The reason is that there can be a large variation in shape and spatial distribution of geometric data, and unfavorable properties of the input—extremely skinny objects for example—can increase the running time dramatically. In this thesis we design and analyze several novel geometric algorithms under the assumption that the input objects are fat, that is, they do not contain extremely skinny parts. In particular, we show how to decompose fat objects into simpler pieces (such as triangles or tetrahedra), and we give algorithms for ray shooting, computing depth orders, and hidden-surface removal for fat objects. All our algorithms are significantly more efficient than the best known solutions for arbitrary (non-fat) objects.
A Theoretical and Experimental Study of Geometric Networks (PDF
A geometric network on a set $P$ of $n$ points in $d$-dimensional space is an undirected graph $G=(P,E)$ with vertex set $P$ whose edges are straight-line segments connecting pairs of points in $P$. Geometric networks can model many real-life networks: road networks, telecommunication networks, and so on. When designing a network for a given point set $P$, it is often important to ensure a fast connection between every pair of points. For this it would be ideal to have a direct connection between every pair—the network would then be a complete graph—but in most applications this is unacceptable due to the high costs. This leads to the concepts of spanners: networks with a small number of edges where there is a relatively short path between any pair of points. In this thesis we obtain several new results on geometric spanners. For example, we develop novel algorithms for constructing spanners that are fault-tolerant and for adding edges to an existing spanner to improve its performance, and we experimentally investigate the performance of different spanner constructions.
Mohammad Ali Abam
New Data Structures and Algorithms for Mobile Data (PDF
Motion is ubiquitous in the physical world and due to recent advances in sensing and tracking technology, motion data is becoming more and more available in a variety of areas: air-traffic control, mobile communication, geographic information systems, and so on. It is not surprising, therefore, that in these applications areas of computer science it is necessary to store, analyze, and create or manipulate motion data. As a result, algorithms and data structures for moving objects have become an important area of study within the field of computational geometry. This thesis presents a number of new and fundamental results in this exciting area of research.
Multifunctional Geometric Data Structures (PDF
Many computational problems can be stated in terms of geometric queries on a set of objects, such as: ‘Which objects in the set are contained within this region?’. This query is an example of a range searching query. If a set of objects is often queried then it is beneficial to build a data structure on these objects such that the queries can be answered faster. This can be a very specific data structure, that only answers one specific type of geometric query. However, it can also be a general data structure, that answers many different queries and stores many types of objects. The latter type of data structures we call multifunctional geometric data structures and these are the topic of this thesis.