Patent Office, US Patent No. 12340531, publication date 2025/6/24
A method for the extraction of information about a staircase includes scanning the staircase with a 3D scanning device to obtain a point cloud of the surface of the staircase; calculating the estimated normal vector of at least a part of the points of the point cloud; filtering the at least part of the points based on their normal vectors by ignoring points which do not have a normal vector with a substantially vertically upward direction; ordering the remaining points into sets of points based on their vertical elevation so that each set of points represents a step; determining a straight edge line of each set, which edge line represents the respective step; and storing a numerical representation of each determined edge line in a digital memory.
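The claimed steps map naturally onto a small point-cloud pipeline. Below is a minimal sketch (NumPy) under assumed parameters: an angular tolerance for "substantially vertically upward" normals and a fixed nominal riser height for the elevation binning; a real implementation would estimate these from the data and fit the tread's front edge rather than a least-squares line through all tread points.

```python
import numpy as np

def extract_step_edges(points, normals, up_tol_deg=15.0, step_height=0.17):
    """Sketch of the claimed method: keep points with near-vertical normals,
    bin them by elevation into steps, and fit a 2D line per step."""
    up = np.array([0.0, 0.0, 1.0])
    # keep points whose normal is within up_tol_deg of the vertical direction
    cosang = normals @ up / np.linalg.norm(normals, axis=1)
    tread = points[cosang > np.cos(np.radians(up_tol_deg))]
    # group tread points into steps by vertical elevation (assumed riser height)
    bins = np.round((tread[:, 2] - tread[:, 2].min()) / step_height).astype(int)
    edges = []
    for b in np.unique(bins):
        s = tread[bins == b]
        # least-squares line y = a*x + c standing in for the step's edge line
        a, c = np.polyfit(s[:, 0], s[:, 1], 1)
        edges.append((a, c, s[:, 2].mean()))  # slope, intercept, elevation
    return edges
```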
11th International Conference on Geographical Information Systems Theory, Applications and Management (GISTAM 2025), 2025
Robotic platforms have transformed pipe inspection from routine checks into an automated, data-driven process. Such robotic systems often integrate computer vision technology to collect and analyze inspection data in an automated and efficient way, and offer additional capabilities such as 3D reconstruction of pipes and precise measurement of deformations (e.g., dents, buckling). This work presents an initial case study of a robotic inspection system equipped with LiDAR and camera sensors capable of performing automatic pipeline inspections. This proof-of-concept study is dedicated to the 3D reconstruction of the pipeline using LiDAR data collected during inspections. Reconstruction accuracy is evaluated by computing the RMSE for pipe surface reconstruction and the deviation from the reference diameter of a single pipe in a controlled laboratory setting. Reconstruction results reach an accuracy better than 2 cm in terms of RMSE and a precision better than 0.5 cm in pipe diameter estimation. The current implementation is limited to the inspection of matte, non-reflective pipes. Still, it offers a straightforward and scalable solution for various industrial sectors. Future work will incorporate camera data to integrate color mapping into the 3D reconstruction model and to detect potential defects and deformations in a pipe.
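The two reported metrics can be illustrated with a toy evaluation. The sketch below assumes the pipe is already aligned with the x axis and that the axis position is taken as the centroid of the cross-section; the actual study presumably fits the axis from the LiDAR data.

```python
import numpy as np

def pipe_metrics(points, ref_diameter):
    """Illustrative evaluation: radial residuals against the reference
    radius give a surface RMSE, and twice the mean measured radius gives
    the estimated diameter (points are Nx3, pipe axis assumed along x)."""
    center = points[:, 1:].mean(axis=0)               # axis position in y-z
    radii = np.linalg.norm(points[:, 1:] - center, axis=1)
    est_diameter = 2.0 * radii.mean()
    rmse = np.sqrt(np.mean((radii - ref_diameter / 2.0) ** 2))
    return est_diameter, rmse
```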
IGARSS 2024 – 2024 IEEE International Geoscience and Remote Sensing Symposium, 2024
Remote sensing methods employing Unmanned Aerial Vehicles (UAVs) equipped with hyperspectral sensors facilitate data acquisition, allowing for a precise analysis of plant health and growth. These methods are characterized by their speed and non-destructiveness. Hyperspectral imagery has great potential for estimating the Leaf Area Index (LAI), which is crucial for measuring vegetation density. In this work, we introduce a novel multi-purpose dataset from a vineyard field in the Korinthos region, Southern Greece, where UAV images were acquired with a Specim AFX-10 camera, with 224 spectral bands, as well as with an RGB camera (Phantom 4 Pro DJI); the images were georeferenced using Ground Control Points (GCPs). An experimental analysis of the hyperspectral dataset was conducted to examine the canopy growth rate on three different days, aiming to distinguish the differences between fertilized and non-fertilized vines. The dataset is available at https://github.com/aelsaer/uavine
Sensors 2024, 24, 1083.
Vehicle exterior inspection is a critical operation for identifying defects and ensuring the overall safety and integrity of vehicles. Visual inspection of moving objects, such as vehicles within dynamic environments abounding with reflections, presents significant challenges, especially when time and accuracy are of paramount importance. Conventional exterior inspections of vehicles require substantial labor, which is both costly and prone to errors. Recent advancements in deep learning have reduced labor by enabling the use of segmentation algorithms for defect detection and description based on simple RGB camera acquisitions. Nonetheless, these processes struggle with issues of image orientation, leading to difficulties in accurately differentiating between detected defects. This results in numerous false positives and additional labor effort. Estimating image poses enables precise localization of vehicle damages within a unified 3D reference system, following initial detections in the 2D imagery. A primary challenge in this field is the extraction of distinctive features and the establishment of accurate correspondences between them, a task that typical image matching techniques struggle to address for highly reflective moving objects. In this study, we introduce an innovative end-to-end pipeline tailored for efficient image matching and stitching, specifically addressing the challenges posed by moving objects in static uncalibrated camera setups. Extracting features from moving objects with strong reflections presents significant difficulties, beyond the capabilities of current image matching algorithms. To tackle this, we introduce a novel filtering scheme that can be applied to every image matching process, provided that the input features are sufficient. A critical aspect of this module involves the exclusion of points located in the background, effectively distinguishing them from points that pertain to the vehicle itself.
This is essential for accurate feature extraction and subsequent analysis. Finally, we generate a high-quality image mosaic by employing a series of sequential stereo-rectified pairs.
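The background-exclusion step can be pictured as mask-based match filtering. The sketch below is a deliberately simplified, hypothetical version of it (the paper's filtering scheme is more involved): it merely drops correspondences whose keypoints fall outside a binary vehicle mask in either image.

```python
import numpy as np

def filter_vehicle_matches(kpts_a, kpts_b, mask_a, mask_b):
    """Keep only correspondences whose keypoints lie inside a binary
    vehicle mask in both images. Keypoints are Nx2 (x, y) pixel
    coordinates; masks are HxW boolean arrays (True = vehicle)."""
    xa, ya = kpts_a[:, 0].astype(int), kpts_a[:, 1].astype(int)
    xb, yb = kpts_b[:, 0].astype(int), kpts_b[:, 1].astype(int)
    keep = mask_a[ya, xa] & mask_b[yb, xb]   # note row = y, column = x
    return kpts_a[keep], kpts_b[keep]
```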
arXiv:2312.01148, 2023
As capturing devices become common, 3D scans of interior spaces are acquired on a daily basis. Through scene comparison over time, information about objects in the scene and their changes is inferred. This information is important for robots and AR/VR devices in order to operate in an immersive virtual experience. We thus propose an unsupervised object discovery method that identifies added, moved, or removed objects without any prior knowledge of what objects exist in the scene. We model this problem as a combination of a 3D change detection and a 2D segmentation task. Our algorithm leverages generic 2D segmentation masks to refine an initial but “incomplete” set of 3D change detections. The initial changes, acquired through render-and-compare, likely correspond to movable objects. The incomplete detections are refined through graph optimization, distilling the information of the 2D segmentation masks in the 3D space. Experiments on the 3RScan dataset show that our method outperforms competitive baselines, achieving state-of-the-art results. Our code will become available at https://github.com/katadam/ObjectChangeDetection
Heritage 2023, 6, 2701-2715.
Current Multi-View Stereo (MVS) algorithms are tools for high-quality 3D model reconstruction, strongly depending on image spatial resolution. In this context, the combination of image Super-Resolution (SR) with image-based 3D reconstruction is turning into an interesting research topic in photogrammetry, around which, however, only a few works have been reported so far in the literature. Here, a thorough study is carried out on various state-of-the-art image SR techniques to evaluate the suitability of such an approach in terms of its inclusion in the 3D reconstruction process. Deep-learning techniques are tested here on a UAV image dataset, while the MVS task is then performed via the Agisoft Metashape photogrammetric tool. The data under experimentation are oblique cultural heritage imagery. According to the results, point clouds from low-resolution images present quality inferior to those from upsampled high-resolution ones. The SR techniques HAT and DRLN outperform bicubic interpolation, yielding high precision/recall scores for the differences of the reconstructed 3D point clouds from the reference surface. The current study indicates that spatial image resolution increased by SR techniques may indeed be advantageous for state-of-the-art photogrammetric 3D reconstruction.
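Precision/recall scores for a reconstructed point cloud against a reference surface are commonly computed from point-to-surface distances at a tolerance τ. The sketch below shows that standard protocol (assumed here; not necessarily the paper's exact variant), taking the two distance arrays as precomputed.

```python
import numpy as np

def precision_recall(dists_rec_to_ref, dists_ref_to_rec, tau):
    """precision = fraction of reconstructed points within tau of the
    reference; recall = fraction of reference points within tau of the
    reconstruction; F1 combines the two."""
    precision = np.mean(dists_rec_to_ref <= tau)
    recall = np.mean(dists_ref_to_rec <= tau)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```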
Sensors 2022, 22, 5576
In this contribution, we present a simple and intuitive approach for estimating the exterior (geometrical) calibration of a Lidar instrument with respect to a camera, as well as their synchronization shift (temporal calibration) during data acquisition. For the geometrical calibration, the 3D rigid transformation of the camera system was estimated with respect to the Lidar frame on the basis of established 2D-to-3D point correspondences. The 2D points were automatically extracted on images by exploiting an AprilTag fiducial marker, while the detection of the corresponding Lidar points was carried out by estimating the center of a custom-made retroreflective target. Both the AprilTag and the Lidar reflective target were attached to a planar board (calibration object) following an easy-to-implement set-up, which yielded high accuracy in the determination of the center of the calibration target. After the geometrical calibration procedure, the temporal calibration was carried out by matching the position of the AprilTag to that of the corresponding Lidar target (after being projected onto the image frame) during the recording of a steadily moving calibration target. Our calibration framework is provided as open-source software implemented on the ROS platform. We applied our method to the calibration of a four-camera mobile mapping system (MMS) with respect to an integrated Velodyne Lidar sensor and evaluated it against a state-of-the-art chessboard-based method. Although ours is a single-camera-to-Lidar calibration approach, the consecutive calibration of all four cameras with respect to the Lidar sensor yielded highly accurate results, which were exploited in a multi-camera texturing scheme of city point clouds.
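The temporal-calibration idea, matching the AprilTag track against the projected Lidar-target track over time, amounts to a 1-D time-shift search. A minimal sketch (assuming both tracks are already resampled to a common interval dt and reduced to one coordinate; the real system works with full 2-D image positions):

```python
import numpy as np

def estimate_time_shift(track_cam, track_lidar, dt):
    """Pick the integer lag that minimizes the mean squared disagreement
    between the camera track and the (projected) Lidar track, and return
    it as a time offset in seconds."""
    n = len(track_cam)
    best_lag, best_err = 0, np.inf
    for lag in range(-(n // 2), n // 2 + 1):
        a = track_cam[max(0, lag):n + min(0, lag)]
        b = track_lidar[max(0, -lag):n - max(0, lag)]
        err = np.mean((a - b) ** 2)
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag * dt
```

A sub-sample offset could then be refined by interpolating around the best integer lag.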
2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP)
High-resolution (HR) satellite images can provide detailed information about land use/land cover. Often, it is necessary that the sensor's inherent spatial resolution be increased through algorithmic processing of the acquired image data. Machine-learning and, in particular, deep-learning-based super-resolution (SR) techniques are an effective tool for increasing the spatial resolution of images. In the current work, Sentinel-2 images are super-resolved to a spatial resolution of 2.5 m/pixel by means of deep-learning-based SR techniques. The area of study is Zakynthos island in Greece. A novel index called the Normalized Carotenoid Reflectance Index (NCRI) is proposed for the assessment of land cover by olive trees.
Frontiers in Marine Science 2022
Submarine hydrothermal systems along active volcanic ridges and arcs are highly dynamic, responding to both oceanographic (e.g., currents, tides) and deep-seated geological forcing (e.g., magma eruption, seismicity, hydrothermalism, and crustal deformation). In particular, volcanic and hydrothermal activity may also pose profoundly negative societal impacts (tsunamis, the release of climate-relevant gases and toxic metal(loid)s). These risks are particularly significant in shallow (<1000 m) coastal environments, as demonstrated by the January 2022 submarine paroxysmal eruption of the Hunga Tonga-Hunga Ha’apai Volcano that destroyed part of the island, and the October 2011 submarine eruption of El Hierro (Canary Islands) that caused vigorous upwelling, floating lava bombs, and natural seawater acidification. Such hazards may be posed by the Kolumbo submarine volcano, part of the subduction-related Hellenic Volcanic Arc at the intersection of the Eurasian and African tectonic plates. Kolumbo, 7 km NE of Santorini and part of Santorini’s volcanic complex, hosts an active hydrothermal vent field (HVF) on its crater floor (~500 m b.s.l.), which degasses boiling CO2-dominated fluids at high temperatures (~265°C) with a clear mantle signature. Kolumbo’s HVF hosts actively forming seafloor massive sulfide deposits with high contents of potentially toxic, volatile metal(loid)s (As, Sb, Pb, Ag, Hg, and Tl). The proximity to the highly populated tourist areas of Santorini poses significant risks. However, we have limited knowledge of the potential impacts of this type of magmatic and hydrothermal activity, including those from magmatic gases and seismicity. To better evaluate such risks, the activity of the submarine system must be continuously monitored with multidisciplinary and high-resolution instrumentation as part of an in-situ observatory, supported by discrete sampling and measurements.
This paper is a design study describing a new long-term seafloor observatory to be installed within the Kolumbo volcano, including cutting-edge and innovative marine technology that integrates hyperspectral imaging, temperature sensors, a radiation spectrometer, fluid/gas samplers, and pressure gauges. These instruments will be integrated into a hazard-monitoring platform aimed at identifying the precursors of potentially disastrous explosive volcanic eruptions, earthquakes, landslides of the hydrothermally weakened volcanic edifice, and the release of potentially toxic elements into the water column.
Proceedings of the 25th Pan-Hellenic Conference on Informatics 2021
In the present work, deep-learning-based super-resolution (SR) is applied to Sentinel-2 images of Zakynthos island, Greece, with the intention of detecting stress levels in supercentenarian olive trees due to water deficiency. The aim of this study is to monitor the stress in supercentenarian olive trees over time and across seasons. Specifically, the Carotenoid Reflectance Index 2 (CRI2) is calculated utilizing the Sentinel-2 bands B2 and B5. CRI2 maps at 10 m and at 2.5 m spatial resolution are generated. In fact, the images of band B2, with an original spatial resolution of 10 m, are super-resolved to 2.5 m. The images of band B5 are super-resolved from 20 m first to 10 m and then to 2.5 m. Deep-learning-based SR techniques, namely DSen2 and RakSRGAN, are utilized for enhancing the spatial resolution to 10 m and 2.5 m. The following five seasons are considered: autumn 2019, spring 2019, spring 2020, summer 2019 and summer 2020. In the future, comparisons with field measurements could better assess the effectiveness of the proposed methodology in recognizing stress levels in very old olive trees.
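CRI2 is defined as a difference of reciprocal reflectances, CRI2 = 1/R510 − 1/R700; per the abstract, Sentinel-2 B2 (centred near 490 nm) and B5 (near 705 nm) serve as proxies for those wavelengths. A minimal sketch of the per-pixel computation:

```python
import numpy as np

def cri2(b2, b5, eps=1e-6):
    """CRI2 = 1/R510 - 1/R700, with Sentinel-2 B2 (~490 nm) and B5
    (~705 nm) standing in for the 510 nm and 700 nm reflectances.
    Inputs are reflectance arrays in (0, 1]; eps guards against
    division by zero in masked or shadowed pixels."""
    return 1.0 / np.maximum(b2, eps) - 1.0 / np.maximum(b5, eps)
```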
Pattern Recognition. ICPR International Workshops and Challenges (pp.462-476), 2021
Monitoring construction sites from space using high-resolution (HR) imagery enables remote tracking instead of physically traveling to a site. Thus, valuable resources are saved, while recording the progression of a construction site anytime and anywhere in the world becomes feasible. In the present work, Sentinel-2 (S2) images at 10 m are spatially super-resolved by a factor of 4 by means of deep learning. Initially, the very-deep super-resolution (VDSR) network is trained with matching pairs of S2 and SPOT-7 images at a 2.5 m target resolution. Then, the trained VDSR network, named SPOT7-VDSR, becomes able to increase the resolution of S2 images which are completely unknown to the net. Additionally, the VDSR net technique and bicubic interpolation are applied to increase the resolution of S2. Numerical and visual comparisons are carried out on the area of interest, Karditsa, Greece. The current study of super-resolving S2 images is novel in the literature and can prove very useful in application cases where only S2 images are available and not the corresponding higher-resolution SPOT-7 ones. During the present super-resolution (SR) experiments, the proposed net SPOT7-VDSR outperforms the VDSR net by up to 8.24 dB in peak signal-to-noise ratio (PSNR) and bicubic interpolation by up to 16.9% in structural similarity index (SSIM).
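The PSNR figures quoted above follow the standard definition over the mean squared error; a minimal sketch for float images (SSIM needs windowed statistics and is best taken from an image library such as scikit-image):

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between a reference and a test
    image, both float arrays scaled to [0, data_range]."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)
```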
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2020
In this contribution, we propose a versatile image-based methodology for 3D reconstructing underwater scenes with high fidelity and integrating them into a virtual reality environment. Typically, underwater images suffer from colour degradation (blueish images) due to the propagation of light through water, which is a more absorbing medium than air, as well as the scattering of light on suspended particles. Other factors, such as artificial lights, also diminish the quality of images and, thus, the quality of the image-based 3D reconstruction. Moreover, degraded images have a direct impact on the user's perception of the virtual environment, due to geometric and visual degenerations. Here, it is argued that these effects can be mitigated by image pre-processing algorithms and specialized filters. The impact of different filtering techniques on the images is evaluated, in order to eliminate colour degradation and mismatches in the image sequences. The methodology in this work consists of five sequential pre-processing steps: saturation enhancement; haze reduction and Rayleigh distribution adaptation, to de-haze the images; global histogram matching, to minimize differences among images of the dataset; and image sharpening, to strengthen the edges of the scene. The 3D reconstruction of the models is based on open-source structure-from-motion software. The models are optimized for virtual reality through mesh simplification, physically-based-rendering texture map baking, and levels of detail. The results of the proposed methodology are qualitatively evaluated on image datasets captured on the seabed of Santorini island in Greece, using an ROV platform.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2020
In this work, we present the development of a prototype mobile mapping platform with a modular design and architecture that can be suitably modified to address both outdoor and indoor environments effectively. Our system is built on the Robot Operating System (ROS) and utilizes multiple sensors to capture images, point clouds and 3D motion trajectories. These include synchronized cameras with wide-angle lenses, a lidar sensor, a GPS/IMU unit and a tracking optical sensor. We report on the individual components of the platform, its architecture, the integration and calibration of its components, and the fusion of all recorded data, and we provide initial 3D reconstruction results. The processing algorithms are based on existing implementations of SLAM (Simultaneous Localisation and Mapping) methods combined with SfM (Structure-from-Motion) for optimal estimation of orientations and 3D point clouds. The scope of this work, which is part of an ongoing H2020 programme, is to digitize the physical world, collect relevant spatial data and make digital copies available to experts and the public for a wide range of needs: remote access and viewing, processing, design, use in VR etc.
arXiv preprint arXiv:2007.15417, 2020
In this work, the very deep super-resolution (VDSR) method is applied for improving the spatial resolution of remotely sensed (RS) images by a scale factor of 4. The VDSR net is re-trained with Sentinel-2 images and with drone aero orthophoto images, thus becoming RS-VDSR and Aero-VDSR, respectively. A novel loss function, the Var-norm estimator, is proposed in the regression layer of the convolutional neural network during re-training and prediction. According to numerical and visual comparisons, the proposed nets RS-VDSR and Aero-VDSR outperform VDSR during prediction on RS images. RS-VDSR outperforms VDSR by up to 3.16 dB in terms of PSNR on Sentinel-2 images.
arXiv preprint arXiv:2007.08791, 2020
Deep learning techniques are applied to increase the spatial resolution of Sentinel-2 satellite imagery depicting the Amynteo lignite mine in Ptolemaida, Greece. Resolution enhancement by factors 2 and 4, as well as by factors 2 and 6, using the Very-Deep Super-Resolution (VDSR) and DSen2 networks, respectively, provides fairly good results on the Amynteo lignite mine images.
Proc., 99th Annual Meeting of the Transportation Research Board. Washington DC: Transportation Research Board, 2020
This paper investigates the interaction between vehicle dynamics parameters and road geometry during the passing process. The methodology is based on a realistic representation of the passing task with respect to the roadway’s posted speed and the ability of the passing (examined) vehicle to perform such maneuvers. For the passing distance outputs, an existing vehicle dynamics model was utilized; to assess the model’s accuracy, instrumented field measurements were performed. Since the analytical model is computationally demanding, statistical models were developed, in line with the German rural road design guidelines, to determine passing sight distances (PSDs) by arranging combinations of four critical vehicle-roadway parameters: vehicle horsepower rates, variations between the passed vehicle’s speed and the roadway’s posted speed, peak friction supply coefficients, and grade values. The analysis revealed that the difference between the speed of the passed vehicle and the posted speed value, as well as certain interactions of the assessed parameters, excessively impacts PSD, especially for values below 20 km/h. The lognormal modelling approach for predicting PSDs was found efficient and may be useful to researchers and practitioners aiming to evaluate the interaction of the utilized road-vehicle parameters in determining PSDs as well as passing zones. Although more advanced communication between vehicles, or between vehicles and the road environment, seems a prerequisite for enabling integrated guidance during passing maneuvers, the present research constitutes an opening paradigm of how the passing process can be standardized and therefore deployed in advanced driver assistance systems (ADAS).
Remote Sens. 2020, 12, 2002.
Generating Digital Elevation Models (DEM) from satellite imagery or other data sources constitutes an essential tool for a plethora of applications and disciplines, ranging from 3D flight planning and simulation, autonomous driving and satellite navigation, such as GPS, to modeling water flow, precision farming and forestry. The task of extracting this 3D geometry from a given surface hitherto requires a combination of appropriately collected corresponding samples and/or specialized equipment, as inferring the elevation from single image data is out of reach for contemporary approaches. On the other hand, Artificial Intelligence (AI) and Machine Learning (ML) algorithms have experienced unprecedented growth in recent years as they can extrapolate rules in a data-driven manner and retrieve convoluted, nonlinear one-to-one mappings, such as an approximate mapping from satellite imagery to DEMs. Therefore, we propose an end-to-end Deep Learning (DL) approach to construct this mapping and to generate an absolute or relative point cloud estimation of a DEM given a single RGB satellite (Sentinel-2 imagery in this work) or drone image. The model has been readily extended to incorporate available information from the non-visible electromagnetic spectrum. Unlike existing methods, we only exploit one image for the production of the elevation data, rendering our approach less restrictive and constrained, but suboptimal compared to them at the same time. Moreover, recent advances in software and hardware allow us to make the inference and the generation extremely fast, even on moderate hardware. We deploy Conditional Generative Adversarial networks (CGAN), which are the state-of-the-art approach to image-to-image translation. We expect our work to serve as a springboard for further development in this field and to foster the integration of such methods in the process of generating, updating and analyzing DEMs.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2019
3D semantic segmentation is the joint task of partitioning a point cloud into semantically consistent 3D regions and assigning them to a semantic class/label. While the traditional approaches for 3D semantic segmentation typically rely only on structural information of the objects (i.e. object geometry and shape), in recent years many techniques combining both visual and geometric features have emerged, taking advantage of the progress in SfM/MVS algorithms that reconstruct point clouds from multiple overlapping images. Our work describes a hybrid methodology for 3D semantic segmentation, relying on both 2D and 3D space and aiming at exploring whether image selection is critical as regards the accuracy of 3D semantic segmentation of point clouds. Experimental results are demonstrated on a free online dataset depicting city blocks around Paris. The experimental procedure not only validates that hybrid features (geometric and visual) can achieve a more accurate semantic segmentation, but also demonstrates the importance of the most appropriate view for the 2D feature extraction.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2019
Structure-from-Motion (SfM) pipelines, now widely available and highly popular among non-expert users, particularly in the context of UAV photogrammetry, have further renewed the interest in the issue of automatic camera calibration. The well-documented requirements for robust self-calibration cannot always be met, e.g. due to restrictions in time and cost, absence of ground control and image tilt, terrain morphology, unsuitable flight configuration etc.; hence, camera pre-calibration is frequently recommended. In this context, users often resort to flexible, user-friendly tools for camera calibration based on 2D coded patterns (primarily ordinary chessboards). Yet, the physical size of such patterns poses obvious limitations. This paper discusses the alternative of extending the size of the calibration object by using multiple unordered coplanar chessboards, which might accommodate much larger imaging distances. This is done initially by a detailed simulation to show that, in terms of geometry, this could be a viable alternative to single patterns. A first algorithmic implementation is then laid out, and results from real multi-pattern configurations, both ordered and unordered, are successfully compared. However, aspects of the proposed approach need to be further studied for its reliable practical employment.
Digital Presentation and Preservation of Cultural and Scientific Heritage 2019
MindSpaces provides solutions for creating functionally and emotionally appealing architectural designs in urban spaces. Social media services, physiological sensing devices and video cameras provide data from sensing environments. State-of-the-art technology including VR, 3D design tools, emotion extraction, visual behaviour analysis, and textual analysis will be incorporated into the MindSpaces platform for analysing data and adapting the design of spaces.
International Conference on Maritime Safety and Operations 2016
Lately, 3D modelling via photogrammetry/3D computer vision and laser scanning is becoming a standard in processes of industrial inspection and spatio-temporal recording. In this contribution, a combination of both methods is presented for the precise 3D geometric and visual documentation of the internal compartments of a cargo ship. Visual inspection is an integral part of Condition and Class surveys, with the results comprising the surveyors’ opinion, documented by a set of pictures indicating areas of interest. Although this approach provides the most essential information, communicating the results may be difficult, since isolated images cannot provide the necessary context. Here we exploit image data and a terrestrial scanner in order to provide high-fidelity 3D models. The presented workflow combines a custom SfM and stereo-matching algorithm with commercial and open-source tools. Apart from accurate results, the workflow is required to be structured in an algorithmic way, to enable realization by automated means (robotic platforms). The reconstructed 3D model is presented along with some concluding remarks.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2015
The indirect estimation of leaf area index (LAI) at large spatial scales is crucial for several environmental and agricultural applications. To this end, in this paper, we compare and evaluate LAI estimation in vineyards from different UAV imaging datasets. In particular, canopy levels were estimated from (i) hyperspectral data, (ii) 2D RGB orthophotomosaics and (iii) 3D crop surface models. The computed canopy levels were used to establish relationships with the measured LAI (ground truth) from several vines in Nemea, Greece. The overall evaluation indicated that the estimated canopy levels were correlated (r2 > 73%) with the in-situ, ground-truth LAI measurements. As expected, the lowest correlations were derived from the greenness levels calculated from the 2D RGB orthomosaics. The highest correlation rates were established with the hyperspectral canopy greenness and the 3D canopy surface models. For the latter, the accurate detection of canopy, soil and other materials in between the vine rows is required. All approaches tend to overestimate LAI in cases of sparse, weak, or unhealthy plants and canopy.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2015
The automated and cost-effective detection of buildings in ultra-high-spatial-resolution imagery is of major importance for various engineering and smart-city applications. To this end, in this paper, a model-based building detection technique has been developed, able to extract and reconstruct buildings from UAV aerial imagery and low-cost imaging sensors. In particular, the developed approach computes, through advanced structure from motion, bundle adjustment and dense image matching, a DSM and a true orthomosaic from the numerous GoPro images, which are characterised by significant geometric distortions and a fish-eye effect. An unsupervised multi-region graph-cut segmentation and a rule-based classification deliver the initial multi-class classification map. The DTM is then calculated based on an inpainting and mathematical morphology process. A data fusion process between the buildings detected from the DSM/DTM and the classification map feeds a grammar-based building reconstruction, and the scene buildings are extracted and reconstructed. Preliminary experimental results appear quite promising, with the quantitative evaluation indicating object-level detection rates of 88% for correctness and above 75% for completeness.
Videometrics, Range Imaging, and Applications XIII 2015
Although multiple-view matching provides certain significant advantages regarding accuracy, occlusion handling and radiometric fidelity, stereo-matching remains indispensable for a variety of applications; these involve cases when image acquisition requires fixed geometry, a limited number of images, or speed. Such instances include robotics, autonomous navigation, reconstruction from a limited number of aerial/satellite images, industrial inspection and augmented reality through smart-phones. As a consequence, stereo-matching is a continuously evolving research field with a growing variety of applicable scenarios. In this work a novel multi-purpose cost for stereo-matching is proposed, based on census transformation on image gradients and evaluated within a local matching scheme. It is demonstrated that when the census transformation is applied on gradients, the invariance of the cost function to (non-linear) changes in illumination is significantly strengthened. The calculated cost values are aggregated through adaptive support regions, based both on cross-skeletons and on basic rectangular windows. The matching algorithm is tuned for the parameters in each case. The described matching cost has been evaluated on the Middlebury stereo-vision 2006 datasets, which include changes in illumination and exposure. The tests verify that the census transformation on image gradients indeed results in a more robust cost function, regardless of aggregation strategy.
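The core idea, applying the census transform to gradients instead of intensities, can be sketched compactly. Matching costs between two such codes would then be Hamming distances; the window radius and the gradient operator below are assumptions, not the paper's exact choices.

```python
import numpy as np

def census(img, r=1):
    """Census transform: each pixel becomes a bit string encoding which
    neighbours in a (2r+1)x(2r+1) window are smaller than the centre."""
    h, w = img.shape
    out = np.zeros((h - 2 * r, w - 2 * r), dtype=np.uint64)
    c = img[r:h - r, r:w - r]
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            out = (out << np.uint64(1)) | (img[r + dy:h - r + dy, r + dx:w - r + dx] < c)
    return out

def census_on_gradients(img):
    """Census applied to the horizontal gradient image rather than to raw
    intensities, strengthening invariance to non-linear illumination
    changes (a monotonic intensity shift leaves the codes unchanged)."""
    gx = np.gradient(img.astype(np.float64), axis=1)
    return census(gx)
```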
Videometrics, Range Imaging, and Applications XIII 2015
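The census-on-gradients cost described above can be sketched as follows. This is a minimal Python/NumPy illustration: the window size, the gradient operator and the synthetic image pair are assumptions for demonstration, not the paper's tuned configuration.

```python
import numpy as np

def census(patch):
    # Bit vector comparing every pixel of the window to the central pixel.
    c = patch[patch.shape[0] // 2, patch.shape[1] // 2]
    return (patch < c).ravel()

def matching_cost(grad_left, grad_right, x, y, d, win=5):
    # Hamming distance between census strings of corresponding windows,
    # computed on gradient images rather than raw intensities.
    r = win // 2
    pl = grad_left[y - r:y + r + 1, x - r:x + r + 1]
    pr = grad_right[y - r:y + r + 1, x - d - r:x - d + r + 1]
    return int(np.count_nonzero(census(pl) != census(pr)))

# Synthetic pair: the right image is the left shifted by a disparity of 2.
rng = np.random.default_rng(0)
left = rng.random((40, 40))
right = np.roll(left, -2, axis=1)
gl = np.gradient(left, axis=1)   # horizontal gradients, computed once
gr = np.gradient(right, axis=1)
print(matching_cost(gl, gr, x=20, y=20, d=2))   # 0 at the true disparity
```

Because the census comparison only encodes the sign of local differences, any monotonic (including non-linear) change of illumination applied to one image leaves the bit strings unchanged.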
A fundamental step in the generation of visually detailed 3D city models is the acquisition of high-fidelity 3D data. Typical approaches employ DSM representations usually derived from Lidar (Light Detection and Ranging) airborne scanning or image-based procedures. In this contribution, we focus on the fusion of data from both these methods in order to enhance or complete them. Particularly, we combine an existing Lidar and orthomosaic dataset (used as reference) with a new aerial image acquisition (including both vertical and oblique imagery) of higher resolution, which was carried out in the area of Kallithea, in Athens, Greece. In a preliminary step, a digital orthophoto and a DSM are generated from the aerial images in an arbitrary reference system, by employing a Structure from Motion and dense stereo matching framework. The image-to-Lidar registration is performed by 2D feature (SIFT and SURF) extraction and matching between the two orthophotos. The established point correspondences are assigned 3D coordinates through interpolation on the reference Lidar surface, are then backprojected onto the aerial images, and finally matched with 2D image features located in the vicinity of the backprojected 3D points. Consequently, these points serve as Ground Control Points with appropriate weights for final orientation and calibration of the images through a bundle adjustment solution. By these means, the aerial imagery, which is optimally aligned to the reference dataset, can be used for the generation of an enhanced and more accurately textured 3D city model.
2015
Relative orientation in a stereo pair (establishing 3D epipolar geometry) is generally described as a rigid body transformation, with one arbitrary translation component, between two formed bundles of rays. In the uncalibrated case, however, only the 2D projective pencils of epipolar lines can be established from simple image point homologies. These may be related to each other in infinite variations of perspective positions in space, each defining different camera geometries and relative orientation of image bundles. It is of interest in photogrammetry to also approach the 3D image configurations embedded in 2D epipolar geometry in a Euclidean (rather than a projective-algebraic) framework. This contribution attempts such an approach initially in 2D to propose a parameterization of epipolar geometry; when fixing some of the parameters, the remaining ones correspond to a ‘circular locus’ for the second epipole. Every point on this circle is related to a specific direction on the plane representing the intersection line of image planes. Each of these points defines, in turn, a circle as locus of the epipole in space (to accommodate all possible angles of intersection of the image planes). It is further seen that knowledge of the lines joining the epipoles with the respective principal points suffices for establishing the relative position of image planes and the direction of the base line in model space; knowledge of the actual position of the principal points allows full relative orientation and camera calibration of central perspective cameras. Issues of critical configuration are also addressed. Possible future tasks include study of different a priori knowledge as well as the case of the image triplet.
ISPRS Journal of Photogrammetry and Remote Sensing 2014
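To make the role of the epipoles concrete: in the uncalibrated case they can be recovered directly from a fundamental matrix as its null-vectors. A small illustrative sketch (the matrix below is synthetic, constructed so that the epipoles are known in advance):

```python
import numpy as np

def skew(v):
    # Cross-product matrix: skew(v) @ x == np.cross(v, x)
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0.0]])

def epipoles(F):
    # The right epipole satisfies F e = 0 and the left one e'^T F = 0;
    # both are obtained from the SVD of the rank-2 matrix F.
    U, _, Vt = np.linalg.svd(F)
    e, ep = Vt[-1], U[:, -1]
    return e / e[2], ep / ep[2]   # normalise homogeneous coordinates

# Synthetic F with known epipoles: F = [e']_x M  =>  left epipole e',
# right epipole M^{-1} e' (up to scale).
e_prime = np.array([3.0, 4.0, 1.0])
M = np.diag([1.0, 2.0, 3.0])
F = skew(e_prime) @ M
e, ep = epipoles(F)
print(e, ep)    # ~[9, 6, 1] and ~[3, 4, 1]
```

Dividing by the third homogeneous coordinate removes the sign ambiguity of the SVD null-vectors.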
Defining pixel correspondences among images is a fundamental process in fully automating image-based 3D reconstruction. In this contribution, we show that an adaptive local stereo-method of high computational efficiency may provide accurate 3D reconstructions under various scenarios, or even outperform global optimizations. We demonstrate that census matching cost on image gradients is more robust, and we exponentially combine it with the absolute difference in colour and in principal image derivatives. An aggregated cost volume is computed by linearly expanded cross skeleton support regions. A novel consideration is the smoothing of the cost volume via a modified 3D Gaussian kernel, which is geometrically constrained; this offers 3D support to cost computation in order to relax the inherent assumption of “fronto-parallelism” in local methods. The above steps are integrated into a hierarchical scheme, which exploits adaptive windows. Hence, failures around surface discontinuities, typical in hierarchical matching, are addressed. Extensive results are presented for datasets from popular benchmarks as well as for aerial and high-resolution close-range images.
International Symposium on Visual Computing 2013
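The exponential combination of the individual costs mentioned above can be illustrated as follows; this is a sketch, and the lambda normalisation constants are placeholders, not the tuned values used in the paper.

```python
import numpy as np

def combined_cost(c_census, c_colour, c_deriv, lam=(30.0, 10.0, 10.0)):
    # Each raw cost is mapped into [0, 1) via 1 - exp(-c / lambda), so a
    # large (outlier) value of any single term saturates instead of
    # dominating; the three robustified terms are then summed.
    costs = np.array([c_census, c_colour, c_deriv], dtype=float)
    return float(np.sum(1.0 - np.exp(-costs / np.array(lam))))

print(combined_cost(0, 0, 0))     # 0.0 for a perfect match
print(combined_cost(1e6, 2, 1))   # an extreme census term saturates near 1
```

The truncation behaviour is the point of the exponential mapping: one corrupted component cannot push the total cost beyond its own unit ceiling.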
Cultural and creative industries constitute a large range of economic activities. Towards this expansion we need to state the inclusion of ICT technologies, as such of 3D reconstruction methods. However, precise 3D reconstruction under a computationally affordable manner is a research challenge. One way to precisely reconstruct a cultural object is through the use of photogrammetry with the main goal of finding the correspondences between two or more images to reconstruct 3D surfaces. A cultural object is often surrounded by visual background data that should be excluded to improve 3D reconstruction accuracy. Background conditions dynamically change, especially if the object is captured under outdoor conditions, where many occlusions occur and the shadows effects are not negligible. In this paper, we propose a combine image segmentation and matching method to yield an affordable 3D reconstruction of cultural objects. Image segmentation is performed on the use of active contours while image matching through novel multi-cost criteria optimization functions. Experimental results on real-life ancient column capitals indicate the efficiency of the proposed scheme both in terms of performance efficiency and cost.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2013
As a rule, image-based documentation of cultural heritage relies today on ordinary digital cameras and commercial software. As such projects often involve researchers not familiar with photogrammetry, the question of camera calibration is important. Freely available open-source user-friendly software for automatic camera calibration, often based on simple 2D chess-board patterns, are an answer to the demand for simplicity and automation. However, such tools cannot respond to all requirements met in cultural heritage conservation regarding possible imaging distances and focal lengths. Here we investigate the practical possibility of camera calibration from unknown planar objects, i.e. any planar surface with adequate texture; we have focused on the example of urban walls covered with graffiti. Images are connected pair-wise with inter-image homographies, which are estimated automatically through a RANSAC-based approach after extracting and matching interest points with the SIFT operator. All valid points are identified on all images on which they appear. Provided that the image set includes a “fronto-parallel” view, inter-image homographies with this image are regarded as emulations of image-to-world homographies and allow computing initial estimates for the interior and exterior orientation elements. Following this initialization step, the estimates are introduced into a final self-calibrating bundle adjustment. Measures are taken to discard unsuitable images and verify object planarity. Results from practical experimentation indicate that this method may produce satisfactory results. The authors intend to incorporate the described approach into their freely available user-friendly software tool, which relies on chess-boards, to assist non-experts in their projects with image-based approaches.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2013
In recent years, a demand for 3D models of various scales and precisions has been growing for a wide range of applications; among them, cultural heritage recording is a particularly important and challenging field. We outline an automatic 3D reconstruction pipeline, mainly focusing on dense stereo-matching which relies on a hierarchical, local optimization scheme. Our matching framework consists of a combination of robust cost measures, extracted via an intuitive cost aggregation support area and set within a coarse-to-fine strategy. The cost function is formulated by combining three individual costs: a cost computed on an extended census transformation of the images; the absolute difference cost, taking into account information from colour channels; and a cost based on the principal image derivatives. An efficient adaptive method of aggregating matching cost for each pixel is then applied, relying on linearly expanded cross skeleton support regions. Aggregated cost is smoothed via a 3D Gaussian function. Finally, a simple “winner-takes-all” approach extracts the disparity value with minimum cost. This keeps algorithmic complexity and system computational requirements acceptably low for high resolution images (or real-time applications), when compared to complex matching functions of global formulations. The stereo algorithm adopts a hierarchical scheme to accommodate high-resolution images and complex scenes. In a last step, a robust post-processing work-flow is applied to enhance the disparity map and, consequently, the geometric quality of the reconstructed scene. Successful results from our implementation, which combines pre-existing algorithms and novel considerations, are presented and evaluated on the Middlebury platform.
European Conference on Computer Vision 2012
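The final winner-takes-all step over an aggregated cost volume amounts to a per-pixel argmin; a minimal sketch (the array layout is an assumption):

```python
import numpy as np

def winner_takes_all(cost_volume):
    # cost_volume[d, y, x] holds the aggregated matching cost of assigning
    # disparity d to pixel (x, y); the disparity map is the per-pixel argmin.
    return np.argmin(cost_volume, axis=0)

# Toy volume: make disparity 3 the cheapest hypothesis everywhere.
cv = np.ones((16, 8, 8))
cv[3] = 0.1
disparity = winner_takes_all(cv)
print(disparity[0, 0])   # 3
```

This selection is what keeps local methods cheap: no optimisation problem is solved, so all the quality must come from the cost computation and aggregation stages described above.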
Falls have been reported as the leading cause of injury-related visits to emergency departments and the primary etiology of accidental deaths in the elderly. The system presented in this article addresses the fall detection problem through visual cues. The proposed methodology utilizes a fast, real-time background subtraction algorithm, based on motion information in the scene and capable of operating properly under dynamically changing visual conditions, in order to detect the foreground object; at the same time, it exploits measures of 3D space, obtained through automatic camera calibration, to increase the robustness of the fall detection algorithm, which is based on semi-supervised learning. The above system uses a single monocular camera and is characterized by minimal computational cost and memory requirements that make it suitable for real-time large-scale implementations.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2012
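A minimal sketch of the kind of adaptive background subtraction described above (a running-average model; the learning rate and threshold are illustrative assumptions, and the actual system's motion-based algorithm is more elaborate):

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    # Running average: the model slowly adapts to gradual illumination
    # changes, while fast-moving foreground barely contaminates it.
    return (1.0 - alpha) * bg + alpha * frame

def foreground_mask(bg, frame, thresh=25.0):
    # Pixels deviating strongly from the background model are foreground.
    return np.abs(frame.astype(float) - bg) > thresh

bg = np.full((60, 80), 100.0)    # learned empty-scene background
frame = bg.copy()
frame[20:40, 30:50] = 200.0      # a bright moving object (e.g. a person)
mask = foreground_mask(bg, frame)
print(int(mask.sum()))           # 400 pixels flagged as foreground
```

The detected silhouette would then be combined with calibrated 3D measures (e.g. real-world height of the blob) to decide whether a fall occurred.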
Defining pixel correspondences in stereo-pairs is a fundamental process in automated image-based 3D reconstruction. In this contribution we report on an approach for dense matching, based on local optimization. The approach represents a fusion of state-of-the-art algorithms and novel considerations, which mainly involve improvements in the cost computation and aggregation processes. The matching cost which has been implemented here combines the absolute difference of image colour values with a census transformation applied directly on the image gradients of all colour channels. Besides, a new cost volume is computed by aggregating over cross-window support regions with a linearly defined threshold on cross-window expansion. Aggregated costs are, then, refined using a scan-line optimization technique, and the disparity map is estimated using a “winner-takes-all” selection. Occlusions and mismatches are also handled using existing schemes. The proposed algorithm is tested on a standard stereo-matching data-set with promising results. The future tasks mainly include research on refinement of the disparity map and development of a self-adaptive approach for weighting the contribution of different matching cost components.
Geoinformatics FCE CTU 2011
Recently, one of the central issues in the fields of Photogrammetry, Computer Vision, Computer Graphics and Image Processing is the development of tools for the automatic reconstruction of complex 3D objects. Among various approaches, one of the most promising is Structured Light 3D scanning (SL), which combines automation and high accuracy with low cost, given the steady decrease in price of cameras and projectors. SL relies on the projection of different light patterns, by means of a video projector, on 3D object surfaces, which are recorded by one or more digital cameras. Automatic pattern identification on images allows reconstructing the shape of recorded 3D objects via triangulation of the optical rays corresponding to projector and camera pixels. Models draped with realistic phototexture may thus also be generated, reproducing both geometry and appearance of the 3D world. In this context, subject of our research is a synthesis of the state-of-the-art as well as the development of novel algorithms, in order to implement a 3D scanning system consisting, at this stage, of one consumer digital camera (DSLR) and a video projector. In the following, the main principles of structured light scanning and the algorithms implemented in our system are presented, and results are given to demonstrate the potential of such a system. Since this work is part of an ongoing research project, future tasks are also discussed.
XXII CIPA Symposium on Digital Documentation, Interpretation & Presentation of Cultural Heritage, Kyoto, Japan 2009
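The triangulation of a camera ray with the corresponding projector ray can be sketched as the midpoint of their common perpendicular, a standard least-squares two-ray intersection. This is illustrative, not the system's exact implementation; the rays below are synthetic.

```python
import numpy as np

def triangulate_rays(o1, d1, o2, d2):
    # Rays p = o + t * d (directions need not be unit). Solve the 2x2
    # normal equations for the closest point on each ray and return the
    # midpoint of the common perpendicular segment.
    b = o2 - o1
    a11, a12, a22 = d1 @ d1, d1 @ d2, d2 @ d2
    det = a11 * a22 - a12 ** 2          # ~0 only for (near-)parallel rays
    t1 = (a22 * (b @ d1) - a12 * (b @ d2)) / det
    t2 = (a12 * (b @ d1) - a11 * (b @ d2)) / det
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))

# Camera at the origin, projector offset along X, both rays aimed at the
# same surface point (1, 1, 5): the triangulation recovers it exactly.
p = np.array([1.0, 1.0, 5.0])
o_cam, o_proj = np.zeros(3), np.array([0.8, 0.0, 0.0])
x = triangulate_rays(o_cam, p - o_cam, o_proj, p - o_proj)
print(x)   # ~[1. 1. 5.]
```

With noisy pattern decoding the two rays no longer intersect, which is exactly why the midpoint formulation (rather than an exact intersection) is used.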
Recently, the authors presented a fully automatic camera calibration algorithm. The open-source software FAUCCAL (Fully Automatic Camera Calibration) is now freely available on the Internet. Input is simply different images of standard chess-board patterns; the software then proceeds automatically to produce calibration results and statistical data for camera parameters selected by the user. With this software, feature points are first extracted with a Harris operator, among which valid pattern nodes are separated and subsequently ordered in columns and rows in correspondence with pattern nodes. Initial values for all unknown parameters are estimated automatically. Based on the established point correspondences and initial values, a final bundle adjustment allows fully recovering–without use of any external information–the camera geometry parameters selected by the user. Besides camera constant, principal point location and radial-symmetric lens distortion polynomial, these may include decentering lens distortion, aspect ratio and skewness. Graphical output is also provided. The FAUCCAL site provides users with the source code in Matlab, detailed documentation of the software (including tips), links to bibliographical references and an image test dataset with results. It is believed that our software will prove useful to everyone, and particularly non-photogrammetrists, involved in the field of cultural heritage documentation. The authors welcome any questions but also suggestions, comments and criticism which will help improve this toolbox.
Proc. 22nd CIPA Symposium, October, 2009
XXI International Society of Photogrammetry and Remote Sensing Congress 2008
The paper focuses on a description of the techniques, both photogrammetric and geodetic, used for the data acquisition and processing concerning the project “Development of Geographic Information Systems at the Acropolis of Athens”. Aiming at the development of a Geographic Information System which will incorporate large-scale orthophotomosaics of the walls, an orthophotomosaic of the top view of the site, as well as a dense textured 3D surface model of the walls along with the rock, the project is divided into three basic tasks: the geodetic one, involving field measurements for the generation of a polygonometric network and terrestrial laser scanning of the walls along with the Erechtheion monument; the photogrammetric one, involving image acquisition, orientation, DSM generation and orthorectification; and finally the development of the GIS. This contribution particularly underlines the methodologies applied, simultaneously highlighting the potential of combining photogrammetry with state-of-the-art geodetic techniques (laser scanning) for an accurate 3D modeling of cultural heritage sites.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2008
This paper presents the project “Development of Geographic Information Systems at the Acropolis of Athens”, financed by the European Union and the Government of Greece. The Acropolis of Athens is one of the major archaeological sites world-wide included in the UNESCO World Heritage list. The project started in June 2007 and will finish at the end of 2008. The paper presents the motivation for the project and its aims, giving a description of the deliverables and the specifications, as well as the project difficulties. Furthermore, we present the techniques used, both photogrammetric and geodetic, for data acquisition and processing. The project is divided into three basic tasks: the geodetic one, involving field measurements for the generation of a polygonometric network and terrestrial laser scanning of the walls, the Acropolis rock and also the Erechtheion monument; the photogrammetric one, involving image acquisition, orientation, DSM generation and orthorectification; and finally the development of a GIS database and applications. This contribution underlines particularly the potential of combining different technologies (especially digital imaging and laser scanning) for an accurate 3D modeling of cultural heritage sites. Preliminary results are reported.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2008
In this contribution, we report on the image-based modeling (IBM) results of the Erechtheion monument, located inside the Acropolis of Athens, Greece. The work is part of the project “Development of Geographic Information Systems at the Acropolis of Athens”. An aim of the project is the 3D documentation of the whole Acropolis, one of the major archaeological sites world-wide, included in the UNESCO World Heritage list. The largest part of the monument was digitised with laser scanning, while the main objective of IBM was to model difficult-to-access areas not covered by the scanner, but also to allow comparison with laser scanning for scientific investigations. For the 3D modeling, as the Erechtheion contains some typical architectural elements (like columns, flat walls, etc.), some manual measurements were necessary. On the other hand, for some detailed areas automated approaches for dense surface reconstruction are applied. For these parts we compared the image matching results with the surfaces coming from a laser scanner.
Proceedings of the 2nd ISPRS International Workshop 3D-ARCH 2007
Texture-mapping in close-range photogrammetry focuses mostly on the generation of large-scale projections of 3D surfaces, the most common instance being that of orthoimaging. Today, photogrammetry increasingly uses terrestrial laser scanning as basis for generation of 3D models. For a full exploitation of fully 3D data, however, typical shortcomings of conventional orthorectification software (inability to handle both surface self-occlusions and image occlusions; single-image texturing) must be addressed. Here, the authors elaborate on their approach for the automated generation of orthoimages and perspective views, based on fully 3D models from laser scanning and multi-image texture interpolation. The problem of occlusion is solved by first identifying all surface points visible in the direction of projection; texture is then interpolated through blending using all images which actually view each particular surface point. Texture outliers from individual images are automatically filtered out with a statistical test. Yet, further means for excluding outlying colour values are needed. Rather than using the depth maps of source images to identify possible occlusion borders, these borders are automatically extracted directly on the orthoprojection of each image. Their back-projection on the corresponding image, suitably processed with a morphological operator (dilation), defines ‘risk buffers’ on each image and suppresses their participation in colour interpolation. Combined with a statistical test, this procedure has proved beneficial for the quality of the results. Practical tests based on image sets with considerable variations in image scale have indicated that such new features of the algorithm facilitate the cooperation of laser scanning with photogrammetry for the automatic multi-view synthesis of textured projections.
ISPRS journal of photogrammetry and remote sensing 2007
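The 'risk buffer' idea, i.e. dilating back-projected occlusion borders so that pixels near them are excluded from colour interpolation, can be illustrated with a minimal binary dilation. This is a toy stand-in for the morphological operator actually used; note that np.roll wraps at the array border, which a real implementation would handle explicitly.

```python
import numpy as np

def dilate(mask, r=1):
    # Binary dilation with a (2r+1) x (2r+1) square structuring element:
    # every occlusion-border pixel grows into a surrounding 'risk buffer'.
    out = np.zeros_like(mask)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out |= np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return out

border = np.zeros((7, 7), dtype=bool)
border[3, 3] = True                  # a single occlusion-border pixel
buffer_ = dilate(border, r=1)        # ...becomes a 3x3 exclusion zone
print(int(buffer_.sum()))            # 9
```

Pixels of a source image falling inside such a buffer are suppressed during colour interpolation, since texture there is most likely contaminated by registration and modeling errors.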
Camera calibration is a fundamental task in photogrammetry and computer vision. This paper presents an approach for the automatic estimation of interior orientation from images with three vanishing points of orthogonal directions. Extraction of image line segments and their clustering into groups corresponding to three dominant vanishing points are performed without any human interaction. Camera parameters (camera constant, location of principal point, two coefficients of radial lens distortion) and the vanishing points are estimated in a one-step adjustment of all participating image points. The approach may function in a single-image mode, but is also capable of handling input from independent images (i.e. images not necessarily of the same object) with three and/or two vanishing points in a common solution. The reported experimental tests indicate that, within certain limits, results from single images compare satisfactorily with those from multi-image bundle adjustment.
Photogrammetric Engineering & Remote Sensing 2007
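For three finite vanishing points of mutually orthogonal directions, the closed-form part of such a calibration is classical: the principal point is the orthocenter of the vanishing-point triangle, and the camera constant follows from c^2 = -(v1 - p) . (v2 - p). A sketch on synthetic data (no lens distortion, square pixels; the one-step adjustment of the paper refines such initial values):

```python
import numpy as np

def calibrate_from_vps(v1, v2, v3):
    # Principal point p = orthocenter of the triangle (v1, v2, v3):
    # the altitude through v1 is perpendicular to the side v3 - v2, etc.
    v1, v2, v3 = (np.asarray(v, dtype=float) for v in (v1, v2, v3))
    A = np.array([v3 - v2, v3 - v1])
    b = np.array([(v3 - v2) @ v1, (v3 - v1) @ v2])
    p = np.linalg.solve(A, b)
    c = np.sqrt(-(v1 - p) @ (v2 - p))   # camera constant
    return p, c

def rot_x(a):
    ca, sa = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])

def rot_y(a):
    ca, sa = np.cos(a), np.sin(a)
    return np.array([[ca, 0, sa], [0, 1, 0], [-sa, 0, ca]])

# Synthetic camera: c = 1000, principal point (320, 240), generic rotation.
R = rot_x(0.5) @ rot_y(0.4)
K = np.array([[1000, 0, 320], [0, 1000, 240], [0, 0, 1.0]])
M = K @ R                                 # columns = homogeneous VPs
vps = [M[:2, i] / M[2, i] for i in range(3)]
p, c = calibrate_from_vps(*vps)
print(p, c)   # ~[320. 240.] ~1000.0
```

The orthonormality of the rotation columns is what makes (v1 - p) . (v2 - p) = -c^2 hold for every orthogonal pair of vanishing points.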
Conventional orthorectification software cannot handle surface occlusions and image visibility. The approach presented here synthesizes related work in photogrammetry and computer graphics/vision to automatically produce orthographic and perspective views based on fully 3D surface data (supplied by laser scanning). Surface occlusions in the direction of projection are detected to create the depth map of the new image. This information allows identifying, by visibility checking through back-projection of surface triangles, all source images which are entitled to contribute color to each pixel of the novel image. Weighted texture blending allows regulating the local radiometric contribution of each source image involved, while outlying color values are automatically discarded with a basic statistical test. Experimental results from a close-range project indicate that this fusion of laser scanning with multi-view photogrammetry could indeed combine geometric accuracy with high visual quality and speed. A discussion of intended improvements of the algorithm is also included.
XXI CIPA International Symposium 2007
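The weighted blending with statistical rejection of outlying colour values can be sketched as follows (the z-score threshold and equal weights are illustrative assumptions, not the paper's exact test):

```python
import numpy as np

def blend(colors, weights, z=1.5):
    # colors: one candidate RGB value per source image that views the
    # surface point; weights regulate each image's radiometric contribution.
    colors = np.asarray(colors, dtype=float)
    w = np.asarray(weights, dtype=float)
    mu = colors.mean(axis=0)
    sd = colors.std(axis=0) + 1e-9                       # avoid div by zero
    keep = (np.abs(colors - mu) <= z * sd).all(axis=1)   # z-score test
    return np.average(colors[keep], axis=0, weights=w[keep])

# Four consistent observations and one occlusion-related colour outlier:
candidates = [[100, 100, 100]] * 4 + [[250, 250, 250]]
print(blend(candidates, weights=[1, 1, 1, 1, 1]))   # [100. 100. 100.]
```

The rejected value is exactly the kind of outlier that registration or modeling errors near occlusion borders introduce, which is why the statistical test precedes the weighted average.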
Terrestrial laser scanning has now become a standard tool for 3D surface modeling. In order to exploit such fully 3D data in texture-mapping or for the creation of large-scale ‘true orthos’, suitable software is needed, particularly to allow handling surface self-occlusions and image occlusions, as well as multi-image texture interpolation. The authors have presented such an automatic approach for creating orthoimages and perspective views, based on fully 3D models from laser scanning. All surface points visible in the direction of projection are first identified, and then texture from all images which view each particular surface point is blended. Means for automatically eliminating colour outliers from individual images, especially near image occlusion borders, are also provided. In this contribution, the algorithm is evaluated using image sets with large variations in image scale and unconventional imaging configurations. The presented results indicate that this approach, which involves a cooperation of photogrammetry with laser scanning for the automatic multi-view synthesis of textured projections, performs quite satisfactorily also under demanding circumstances.
Proceedings of the FIG-ISPRS-ICA International Symposium on Modern Technologies, Education and Professional Practice in Geodesy and Related Fields, Sofia 2006
The current implementation of an educational package for basic photogrammetric operations is outlined. The context of this open-source software (MPT), developed in Matlab, is primarily the introductory course of Photogrammetry. Thus, the scope here is not to show students ‘how to do it’ but rather to clarify ‘what is actually being done’ in every step. In this sense, the stress lies mainly on basic photogrammetric adjustments. Students can work with one or two images at a time and perform monoscopic measurements of image points, lines or polylines. Exterior orientation is handled in a variety of ways: space resection with or without camera calibration (with or without estimation of radial lens distortion); linear and non-linear DLT approach (again with or without lens distortion); relative orientation and absolute orientation. Detailed results are presented, including standard error of the adjustment, residuals, covariance matrix of estimated parameters, correlations. Individual observations may be optionally excluded to study their effect on the adjustment. For fully oriented stereo pairs, 3D reconstruction is then possible. The 3D plot may be observed in the program’s 3D viewer and exported in DXF format. Besides, 2D projective transformation is also possible, allowing rectification of vector data or resampling of digital images. Other features (e.g. image enhancement tools or self-calibrating bundle adjustment) are already implemented and will soon be incorporated.
International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences 2006
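The 2D projective transformation used for rectification can be estimated linearly from four or more point correspondences; a minimal direct-linear-transform sketch (illustrative only, without the adjustment statistics MPT reports):

```python
import numpy as np

def estimate_projective(src, dst):
    # Each correspondence (x, y) -> (u, v) yields two homogeneous
    # equations; the 3x3 matrix is the null-space vector of A.
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 3)

def apply_projective(H, pts):
    p = np.c_[pts, np.ones(len(pts))] @ H.T
    return p[:, :2] / p[:, 2:]          # dehomogenise

H_true = np.array([[1.0, 0.2, 3.0], [0.1, 1.1, -2.0], [1e-3, 2e-3, 1.0]])
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1.5]], dtype=float)
dst = apply_projective(H_true, src)
H = estimate_projective(src, dst)
print(np.allclose(apply_projective(H, src), dst))   # True
```

Because the transformation is defined only up to scale, the recovered matrix equals the true one up to a common factor, which the dehomogenisation cancels.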
A novel approach is presented for automatic camera calibration from single images with three finite vanishing points in mutually orthogonal directions (or of more independent images having two and/or three such vanishing points). Assuming ‘natural camera’, estimation of the three basic elements of interior orientation (camera constant, principal point location), along with the two coefficients of radial-symmetric lens distortion, is possible without any user interaction. First, image edges are extracted with sub-pixel accuracy, linked to segments and subjected to least-squares line-fitting. Next, these line segments are clustered into dominant space directions. In the vanishing point detection technique proposed here, the contribution of each image segment is calculated via a voting scheme, which involves the slope uncertainty of fitted lines to allow a unified treatment of long and short segments. After checking potential vanishing points against certain geometric criteria, the triplet having the highest score indicates the three dominant vanishing points. Coming to camera calibration, a main issue here is the simultaneous adjustment of image point observations for vanishing point estimation, radial distortion compensation and recovery of interior orientation in one single step. Thus, line-fitting from vanishing points along with estimation of lens distortion is combined with constraints relating vanishing points to camera parameters. Here, the principal point may be considered as the zero point of distortion and participate in both sets of equations as a common unknown. If a redundancy in vanishing points exists – e.g. when more independent images from the same camera with three, or even two, vanishing points are at hand and are to be combined for camera calibration – such a unified adjustment is undoubtedly advantageous. 
After the initial adjustment, the points of all segments are corrected for lens distortion to allow linking of collinear segments to longer entities, and the process is repeated. Data from automatic single-image calibrations are reported and evaluated against multi-image bundle adjustment with satisfactory results. Finally, further interesting topics of study are indicated.
Proceedings of International Symposium on Modern Technologies, Education and Professional Practice in Geodesy and Related Fields 2005
Video sequences of road and traffic scenes are currently used for various purposes, such as studies of the traffic character of freeways. The task here is to automatically estimate vehicle speed from video sequences, acquired with a downward tilted camera from a bridge. Assuming that the studied road segment is planar and straight, the vanishing point in the road direction is extracted automatically by exploiting lane demarcations. Thus, the projective distortion of the road surface can be removed allowing affine rectification. Consequently, given one known ground distance along the road axis, 1D measurement of vehicle position in the correctly scaled road direction is possible. Vehicles are automatically detected and tracked along frames. First, the background image (the empty road) is created from several frames by an iterative per channel exclusion of outlying colour values based on thresholding. Next, the subtraction of the background image from the current frame is binarized, and morphological filters are employed for vehicle clustering. At the lowest part of vehicle clusters a window is defined for normalised cross-correlation among frames to allow vehicle tracking. The reference data for vehicle speed came from rigorous 2D projective transformation based on control points (which had been previously evaluated against GPS measurements). Compared to these, our automatic approach gave a very satisfactory estimated accuracy in vehicle speed of about ±3 km/h.
CIPA 2005 XX International Symposium 2005
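Once per-frame positions along the correctly scaled road axis are available, the speed estimate itself is elementary; a sketch (the frame rate and track below are hypothetical values):

```python
def speed_kmh(positions_m, fps):
    # positions_m: vehicle position (metres along the road axis) per frame,
    # measured on the affinely rectified road after fixing the scale with
    # one known ground distance. Returns average speed over the track.
    elapsed_s = (len(positions_m) - 1) / fps
    return (positions_m[-1] - positions_m[0]) / elapsed_s * 3.6

# A vehicle covering 10 m in 25 frame intervals at 25 fps -> 36 km/h.
track = [i / 2.5 for i in range(26)]
print(speed_kmh(track, fps=25))   # 36.0
```

Averaging over the whole tracked interval, rather than differencing consecutive frames, damps the per-frame localisation noise of the tracking window.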
Orthophotography – and photo-textured 3D surface models, in general – are most important photogrammetric products in heritage conservation. However, it is now common knowledge that conventional orthorectification software accepts only surface descriptions obtained via 2D triangulation and cannot handle the question of image visibility. Ignoring multiple surface elevations and image occlusions of the complex surface shapes, typically met in conservation tasks, results in products visually and geometrically distorted. Tiresome human intervention in the surface modeling and image orthorectification stages might partly remedy this shortcoming. For surface modeling, however, laser scanners allow now collection of numerous accurate surface points and creation of 3D meshes. The authors present their approach for an automated production of correct orthoimages (and perspective views), given a multiple image coverage with known calibration/orientation data and fully 3D surface representations derived through laser scanning. The developed algorithm initially detects surface occlusions in the direction of projection. Next, all available imagery is utilised to establish a colour value for each pixel of the new image. After back-projecting (using the bundle adjustment data) all surface triangles onto all initial images to establish visibilities, texture ‘blending’ is performed. Suitable weighting controls the local radiometric contribution of each participating source image, while outlying colour values (due mainly to registration and modeling errors) are automatically filtered with a simple statistical test. The generation of a depth map for each original image provides a means to further restrict the effects of orientation and modeling errors on texturing, mainly by checking closeness to occlusion borders. This ‘topological’ information may also allow establishing suitable image windows for colour interpolation. 
Practical tests of the implemented algorithm, using images with multiple overlap and two 3D models, indicate that this fusion of laser scanning and photogrammetry is indeed capable of automatically synthesizing novel views from multiple images. The developed approach, combining geometric accuracy and visual quality with speed, appears to be a realistic approach in heritage conservation. Further necessary elaborations are also outlined.
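The weighted blending with statistical rejection of outlying colour values can be sketched for a single output pixel as below. The paper's actual weighting scheme and rejection test are not reproduced here; the median-distance rule, the threshold `k` and the function name are illustrative assumptions:

```python
import numpy as np

def blend_colours(samples, weights, k=1.5):
    """Blend candidate colours gathered from several source images
    for one output pixel.
    samples: (M, C) colour values from the M images seeing the point;
    weights: (M,) local radiometric weights (e.g. viewing geometry).
    With at least three candidates, values whose distance from the
    per-channel median exceeds k * sigma of all distances are
    rejected before the weighted average."""
    samples = np.asarray(samples, float)
    weights = np.asarray(weights, float)
    keep = np.ones(len(samples), dtype=bool)
    if len(samples) >= 3:
        med = np.median(samples, axis=0)
        dist = np.linalg.norm(samples - med, axis=1)
        keep = dist <= k * dist.std()
        if not keep.any():            # degenerate case: keep everything
            keep[:] = True
    return np.average(samples[keep], axis=0, weights=weights[keep])
```

A grossly deviating sample (e.g. colour bleeding across an occlusion border) is discarded, and the remaining values are averaged with their weights.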
CIPA International Workshop on “Vision Techniques Applied to the Rehabilitation of City Centers”, Lisbon, 2004
The basic photogrammetric deliverable in heritage conservation is orthophotography (and other suitable raster projections) – closely followed today by a growing demand for photo-textured 3D surface models. The fundamental limitation of conventional photogrammetric software is twofold: it can handle neither fully 3D surface descriptions nor the question of image visibility. As a consequence, software which ignores both surface and image occlusions is clearly inadequate for the complex surface topography encountered, as a rule, in conservation or restoration tasks; geometric accuracy and good visual quality are then possible only at the cost of tiresome human interaction, especially in the phase of surface modeling. However, laser scanning and powerful modeling/editing software today allow fast and accurate collection of vast numbers of surface points and the creation of reliable 3D meshes. Close-range photogrammetry is obviously expected to extend its horizon by taking full advantage of this new possibility. Here an approach is presented for the automated generation of orthoimages and perspective views from fully 3D surface descriptions derived from laser scanning. Initially, the algorithm detects surface occlusions for the novel view. Next – in contrast to conventional photogrammetric software, which requires an operator to define individual original images as the source for image content – all available images participate in a view-independent texturing of the new image. Thus, following a bundle adjustment, all surface triangles are back-projected onto all initial images to establish visibilities. Texture “blending” is realised via an appropriate weighting scheme, which regulates the local radiometric contribution of each original image involved. Finally, a basic statistical test allows outlying colour values to be filtered out automatically. At its present stage of implementation, the algorithm has been tested on the example of a Byzantine church in Athens.
It is concluded that the presented combination of laser scanning with photogrammetry – resulting in geometric accuracy, high visual quality and speed – is capable of automatically creating novel views from several images, and hence provides a realistic, practicable approach in heritage conservation. Finally, several elaborations of the approach are also suggested.
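The occlusion detection that precedes texturing can be conveyed with a simple z-buffer: project the surface geometry into the view and keep, per pixel, only the element nearest the camera. The paper works with surface triangles; this per-point sketch, with hypothetical names and a scalar depth tolerance, only illustrates the principle:

```python
import numpy as np

def visible_points(points, P, image_size, eps=1e-3):
    """Z-buffer visibility test for 3D points under a 3x4 projection
    matrix P. Returns a boolean mask: True where the point is the
    nearest surface element seen by its pixel."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    proj = pts_h @ P.T                       # homogeneous image coords
    depth = proj[:, 2]
    uv = np.rint(proj[:, :2] / depth[:, None]).astype(int)
    inside = ((uv[:, 0] >= 0) & (uv[:, 0] < image_size[1]) &
              (uv[:, 1] >= 0) & (uv[:, 1] < image_size[0]))
    zbuf = np.full(image_size, np.inf)       # nearest depth per pixel
    for (u, v), d, ok in zip(uv, depth, inside):
        if ok and d < zbuf[v, u]:
            zbuf[v, u] = d
    visible = np.zeros(len(points), dtype=bool)
    for i, ((u, v), d, ok) in enumerate(zip(uv, depth, inside)):
        visible[i] = ok and d <= zbuf[v, u] + eps
    return visible
```

A point hidden behind a nearer one at the same pixel is flagged invisible and contributes no texture from that image.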
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2004
Photo-textured 3D surface models, and orthophotography in particular, are most important photogrammetric products, notably in heritage conservation. However, conventional software typically uses surface descriptions obtained via 2D triangulation; additionally, it cannot handle image visibility. Ignoring multiple elevations and image occlusions is clearly too restrictive for a complex surface shape. Geometric accuracy and visual quality are then possible only with tedious human interaction during surface modeling but also orthoprojection. Yet, laser scanning allows today fast collection of accurate, dense surface point clouds and creation of 3D meshes. Close-range photogrammetry is obviously expected to take full advantage of this. The authors present their approach for an automated production of orthoimages from fully 3D surface representations derived from laser scanning. In a first step, the algorithm detects surface occlusions for the novel view. While common photogrammetric software needs operator-defined patches on individual original images as the source for image content, here all available images are combined for ‘viewer-independent’ texturing of the new image. To this end, bundle adjustment data allow all surface triangles to be back-projected onto all initial images to establish visibilities. Texture blending is performed with suitable weighting, which controls the local radiometric contribution of each original image involved. Given more than two values, a statistical test allows to automatically exclude outlying colour data. The implemented algorithm was tested at the example of a Byzantine church in Athens to indicate that this coupling of laser scanning with photogrammetry is capable to automatically create novel views from several images, while combining geometric accuracy and visual quality with speed. Finally, future tasks and further elaborations are outlined.
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2004
Single image calibration is a fundamental task in photogrammetry and computer vision. It is known that the camera constant and principal point can be recovered using exclusively the vanishing points of three orthogonal directions. Yet, three reliable and well-distributed vanishing points are not always available. On the other hand, two vanishing points basically allow only estimation of the camera constant (assuming a known principal point location). Here, a camera calibration approach is presented which exploits the existence of only two vanishing points on several independent images. Using the relation between two vanishing points of orthogonal directions and the camera parameters, the algorithm relies on direct geometric reasoning regarding the loci of the projection centres in the image system (actually a geometric interpretation of the constraint imposed by two orthogonal vanishing points on the ‘image of the absolute conic’). Introducing point measurements on two sets of converging image lines as observations, the interior orientation parameters (including radial lens distortion) are estimated from a minimum of three images. Recovery of the image aspect ratio is possible, too, at the expense of an additional image. Apart from line directions in space, full camera calibration is here independent of any exterior metric information (known points, lengths, length ratios etc.). Besides, since the sole requirement is two vanishing points of orthogonal directions on several images, the imaged scenes may simply be planar. Furthermore, calibration with images of 2D objects and/or ‘weak perspectives’ of 3D objects is expected to be more precise than single image approaches using 3D objects. Finally, no feature correspondences among views are required here; hence, images of totally different objects can be used. In this sense, one may still refer to a ‘single-image’ approach.
The implemented algorithm has been successfully evaluated with simulated and real data, and its results have been compared to photogrammetric bundle adjustment and plane-based calibration.
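The basic relation exploited above can be written down directly: for vanishing points v1, v2 of two orthogonal space directions and principal point p, orthogonality imposes (v1 − p) · (v2 − p) = −c². A minimal sketch of this single-pair closed form follows (the paper's full multi-image adjustment with radial distortion is not reproduced; the function name is illustrative):

```python
import math

def camera_constant(v1, v2, principal_point=(0.0, 0.0)):
    """Camera constant from two vanishing points of orthogonal
    directions, given a known principal point (x0, y0):
        (v1 - p) . (v2 - p) + c**2 = 0  =>  c = sqrt(-(v1-p).(v2-p))
    The dot product must be negative for a valid configuration."""
    x0, y0 = principal_point
    dot = (v1[0] - x0) * (v2[0] - x0) + (v1[1] - y0) * (v2[1] - y0)
    if dot >= 0:
        raise ValueError("vanishing points inconsistent with orthogonality")
    return math.sqrt(-dot)
```

For example, the orthogonal directions (1, 0, 1) and (−1, 0, 1) imaged with c = 1000 give vanishing points (1000, 0) and (−1000, 0), from which c is recovered exactly.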
Proceedings of the XIX CIPA International Symposium 2003
Single image techniques may be very useful for heritage documentation purposes, not only in the particular instances of damaged or destroyed objects but also as an auxiliary means for a basic metric reconstruction. In the general case, single images have unknown interior orientation, thus posing the fundamental question of camera calibration (as in several cases no ground control is available). To this end, the known – or assumed – geometry of imaged man-made objects may be exploited. Recovery of the three main elements of interior orientation, together with the image attitude, requires the existence on the image of lines in three known non-coplanar directions, typically orthogonal to each other (from these lines, radial lens distortion might also be estimated). Several approaches have been reported for the exploitation of this basic image geometry; however, the expected accuracy has not been adequately investigated. In this contribution, three alternative algorithms are presented, based on the direct use of the three basic image vanishing points; on the use of image line parameters; and on the direct use of image point observations. The integration of radial distortion into the algorithms is also presented. The reported results are evaluated, and promising conclusions are drawn regarding the performance and limitations of such camera calibration methods, as compared to self-calibrating bundle adjustment techniques based on control points.
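For the three-vanishing-point case, the closed-form geometry behind the first of the algorithms is classical: the principal point is the orthocenter of the triangle formed by the three vanishing points of mutually orthogonal directions, and the camera constant then follows from any pair via c² = −(vi − p) · (vj − p). A sketch under these textbook relations (names and the linear solve are illustrative; the paper's algorithms additionally treat radial distortion and line/point observations):

```python
import numpy as np

def calibrate_three_vps(v1, v2, v3):
    """Interior orientation from three vanishing points of mutually
    orthogonal directions: principal point p = orthocenter of the VP
    triangle; camera constant c from c**2 = -(v1 - p).(v2 - p)."""
    v1, v2, v3 = (np.asarray(v, float) for v in (v1, v2, v3))
    # Orthocenter: intersect the altitude through v1 (perpendicular
    # to edge v2v3) with the altitude through v2 (perp. to v1v3):
    #   (v3 - v2) . p = (v3 - v2) . v1
    #   (v3 - v1) . p = (v3 - v1) . v2
    A = np.array([v3 - v2, v3 - v1])
    b = np.array([A[0] @ v1, A[1] @ v2])
    p = np.linalg.solve(A, b)
    c = np.sqrt(-np.dot(v1 - p, v2 - p))
    return p, c
```

The degenerate cases (a vanishing point at or near infinity, i.e. a direction nearly parallel to the image plane) make the solve ill-conditioned, which is one reason the accuracy study above matters.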
International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 2002
Video sequences of road and traffic scenes are now used for various purposes. The framework of this research on the metric potential of single uncalibrated images is road mapping and studies of the traffic character of freeways. In the first case, an approach has been developed to extract lane width in straight road segments by exploiting sequences from a forward-looking camera. Apart from an initial reference width, necessary for calibrating the camera height, no intrinsic or extrinsic calibration is required if frontal image acquisition is assumed. This approach, making use of the vanishing point of the road, gave an accuracy better than 5 cm in lane width. The second technique concerns the measurement of vehicle speed, given the time interval between frames of a stationary camera tilted downwards. Here, too, the vanishing point in the direction of the road is used, with the vanishing point of the orthogonal direction assumed at infinity. Given one known ground distance along the road axis, the projective distortion of the ground plane is removed, allowing an affine rectification and, thus, 1D measurement in the correctly scaled road direction. This approach, evaluated against a rigorous 2D-2D projective transformation and GPS measurements, has given a satisfactory estimated accuracy in vehicle speed of about ±3 km/h.