Robust Visual-Aided Autonomous Takeoff, Tracking, and Landing of a Small UAV on a Moving Landing Platform for Life-Long Operation


Robot cooperation is key in Search and Rescue (SaR) tasks. An Unmanned Aerial Vehicle (UAV) in cooperation with an Unmanned Ground Vehicle (UGV) can provide valuable insight into the area after a disaster. In this paper, we present an autonomous system that enables a UAV to take off autonomously from a moving landing platform, locate it using visual cues, follow it, and robustly land on it. The system relies on a finite state machine, which together with a novel re-localization module allows the system to operate robustly for extended periods of time and to recover from potential failed landing maneuvers. Both the system as a whole and the re-localization module in particular have been tested extensively in a simulated environment (Gazebo). We also present a qualitative evaluation of the system on real robotic platforms, demonstrating that it can be deployed outside simulation as well. For the benefit of the community, we make our software open source.
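
The interplay between the finite state machine and the re-localization module can be illustrated with a small sketch. This is not the released implementation: the state names, the event names, and the transition table below are assumptions made for this example, chosen only to show how losing sight of the platform at any stage could route the system through a recovery state instead of aborting.

```python
from enum import Enum, auto


class State(Enum):
    LANDED = auto()
    TAKING_OFF = auto()
    TRACKING = auto()
    LANDING = auto()
    RELOCALIZING = auto()


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.LANDED, "takeoff_requested"): State.TAKING_OFF,
    (State.TAKING_OFF, "platform_detected"): State.TRACKING,
    (State.TAKING_OFF, "platform_lost"): State.RELOCALIZING,
    (State.TRACKING, "land_requested"): State.LANDING,
    (State.TRACKING, "platform_lost"): State.RELOCALIZING,
    (State.LANDING, "touchdown"): State.LANDED,
    (State.LANDING, "platform_lost"): State.RELOCALIZING,
    (State.RELOCALIZING, "platform_detected"): State.TRACKING,
}


def step(state, event):
    """Return the next state; events with no matching entry leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.LANDED):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

In this sketch a failed landing (a "platform_lost" event while in LANDING) leads to RELOCALIZING, and a later "platform_detected" event brings the machine back to TRACKING so that the landing can be retried.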

Our work has been accepted for the Special Issue "Multi-Robot Systems: Challenges, Trends and Applications" of Applied Sciences, an open-access journal published by MDPI. Check it out here.

paper | publication website

SemanticDepth: Fusing Semantic Segmentation and Monocular Depth Estimation for Enabling Autonomous Driving in Roads without Lane Lines


Typically, lane departure warning systems rely on lane lines being present on the road. However, in many scenarios, e.g., secondary roads or some streets in cities, lane lines are either not present or not sufficiently well signaled. In this work, we present a vision-based method to locate a vehicle within the road when no lane lines are present using only RGB images as input. To this end, we propose to fuse together the outputs of a semantic segmentation and a monocular depth estimation architecture to reconstruct locally a semantic 3D point cloud of the viewed scene. We only retain points belonging to the road and, additionally, to any kind of fences or walls that might be present right at the sides of the road. We then compute the width of the road at a certain point on the planned trajectory and, additionally, what we denote as the fence-to-fence distance. Our system is suited to any kind of motoring scenario and is especially useful when lane lines are not present on the road or do not signal the path correctly. The additional fence-to-fence distance computation is complementary to the road’s width estimation. We quantitatively test our method on a set of images featuring streets of the city of Munich that contain a road-fence structure, so as to compare our two proposed variants, namely the road’s width and the fence-to-fence distance computation. In addition, we also validate our system qualitatively on the Stuttgart sequence of the publicly available Cityscapes dataset, where no fences or walls are present at the sides of the road, thus demonstrating that our system can be deployed in a standard city-like environment. For the benefit of the community, we make our software open source.
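
The fusion step can be sketched in a few lines: every pixel that the segmentation labels as road is back-projected into 3D using its estimated depth, and the road width is then measured at a chosen distance ahead. The sketch below is a simplified illustration, not the released code. It assumes a pinhole camera with intrinsics fx, fy, cx, cy, a hypothetical class id ROAD, and a naive width estimate (the lateral extent of the road points near the query depth).

```python
ROAD = 1  # hypothetical class id for "road" in the segmentation output


def backproject(depth, labels, fx, fy, cx, cy, keep=(ROAD,)):
    """Back-project pixels whose label is in `keep` into camera-frame 3D points.

    `depth` and `labels` are row-major 2D lists of equal shape.
    """
    points = []
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if label in keep and z > 0:
                x = (u - cx) * z / fx  # lateral offset
                y = (v - cy) * z / fy  # vertical offset
                points.append((x, y, z))
    return points


def road_width(points, z_query, tol=0.5):
    """Lateral extent of the points whose depth lies within `tol` of `z_query`.

    Returns None when fewer than two points fall inside the depth slice.
    """
    xs = [x for x, _, z in points if abs(z - z_query) <= tol]
    if len(xs) < 2:
        return None
    return max(xs) - min(xs)
```

The same two functions could be reused for the fence-to-fence distance by passing the (equally hypothetical) class ids for fences and walls through the `keep` argument.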

code | thesis

VoxFlowNet: Learning Scene Flow in Point Clouds via Voxel Grids


During my Guided Research under the supervision of Prof. Matthias Niessner, I worked on the task of dense scene flow estimation from pairs of unstructured rigid point clouds. We first partition the scene into voxels, compute a feature vector for every voxel, and then pass these voxel features through a volumetric encoder-decoder architecture. We evaluate our approach on a synthetic dataset and obtain competitive results compared to other recent approaches.
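
The first two steps, partitioning into voxels and computing one feature vector per voxel, can be sketched as below. This is only an illustration under stated assumptions: the function names are made up for this example, and the hand-crafted feature (centroid plus point count) is a stand-in for whatever per-voxel features the actual network uses.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group 3D points by the integer index of the voxel that contains them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        voxels[key].append((x, y, z))
    return voxels


def voxel_features(voxels):
    """Toy per-voxel feature: centroid of the points plus the point count."""
    features = {}
    for key, pts in voxels.items():
        n = len(pts)
        cx = sum(p[0] for p in pts) / n
        cy = sum(p[1] for p in pts) / n
        cz = sum(p[2] for p in pts) / n
        features[key] = (cx, cy, cz, float(n))
    return features
```

The resulting sparse dictionary of features would then have to be scattered into a dense grid before a volumetric encoder-decoder could consume it.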


code | report | presentation

Photometric Bundle Adjustment


As part of the practical course Vision-based Navigation at the Computer Vision group of TUM, I implemented Photometric Bundle Adjustment as a refinement step after feature-based Structure-from-Motion.

Additionally, I implemented the addition of new points to the map using a discrete search along the epipolar line, maximizing the cross-correlation between the patch of pixels around the reference point and the patch around each candidate point. This way, we obtain a denser map than that obtained with feature-based techniques.
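
A discrete epipolar search of this kind can be sketched as follows. The sketch is a simplified illustration rather than the course implementation: it assumes the candidate pixel positions along the epipolar line have already been computed, represents images as row-major 2D lists of intensities, scores each candidate with zero-mean normalized cross-correlation, and keeps the best-scoring one. All function names are hypothetical.

```python
import math


def patch(image, u, v, r):
    """Flatten the (2r+1) x (2r+1) patch centred at column u, row v."""
    return [image[y][x] for y in range(v - r, v + r + 1) for x in range(u - r, u + r + 1)]


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length patches."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0.0:
        return 0.0  # a constant patch carries no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def search_epipolar(ref_img, ref_uv, tgt_img, candidates, r=1):
    """Return the candidate (u, v) with the highest NCC score, and that score.

    `candidates` are precomputed pixel positions along the epipolar line;
    `ref_uv` is assumed to lie at least r pixels inside `ref_img`.
    Candidates closer than r pixels to the target image border are skipped.
    """
    height, width = len(tgt_img), len(tgt_img[0])
    ref_patch = patch(ref_img, ref_uv[0], ref_uv[1], r)
    best_uv, best_score = None, -2.0
    for u, v in candidates:
        if not (r <= u < width - r and r <= v < height - r):
            continue
        score = ncc(ref_patch, patch(tgt_img, u, v, r))
        if score > best_score:
            best_uv, best_score = (u, v), score
    return best_uv, best_score
```

In practice one would also reject matches whose best score falls below a threshold, so that ambiguous or textureless regions do not add unreliable points to the map.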

code | report

Sudoku AR


Augmented Reality application to solve Sudokus with a hand-held camera, developed as the final project for the course Introduction to Augmented Reality (IN2018).

Out of the whole class, my team and I were the only group willing to put in enough work to make it to the Demo Day of TUM's Games Engineering department. You can check out our slides to learn how we solve Sudokus using AR.

code | slides