43-Issue 2
Browsing 43-Issue 2 by Title (showing 1-20 of 54)
Item: 3D Reconstruction and Semantic Modeling of Eyelashes (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Kerbiriou, Glenn; Avril, Quentin; Marchal, Maud; Bermano, Amit H.; Kalogerakis, Evangelos
High-fidelity digital human modeling has become crucial in various applications, including gaming, visual effects and virtual reality. Despite the significant impact of eyelashes on facial aesthetics, their reconstruction and modeling have been largely unexplored. In this paper, we introduce the first data-driven generative model of eyelashes based on semantic features. This model is derived from real data by introducing a new 3D eyelash reconstruction method based on multi-view images. The reconstructed data is made available, constituting the first dataset of 3D eyelashes ever published. Through an innovative extraction process, we determine the features of any set of eyelashes and present detailed descriptive statistics of human eyelash shapes. The proposed eyelash model, which exclusively relies on semantic parameters, effectively captures the appearance of a set of eyelashes. Results show that the proposed model enables interactive, intuitive and realistic eyelash modeling for non-experts, enriching avatar creation and synthetic data generation pipelines.

Item: Advancing Front Surface Mapping (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Livesu, Marco; Bermano, Amit H.; Kalogerakis, Evangelos
We present Advancing Front Mapping (AFM), a novel algorithm for the computation of injective maps to simple planar domains. AFM is inspired by the advancing front meshing paradigm, which is here revisited to operate on two embeddings at once, becoming a tool for compatible mesh generation. AFM extends the capabilities of existing robust approaches, supporting a broader set of embeddings (star-shaped polygons) with a direct approach, without resorting to intermediate constructions. Our method relies only on two topological operators (split and flip) and on the computation of segment intersections, thus permitting the computation of a valid embedding without solving any numerical problem. AFM is therefore easy to implement, debug and deploy. This article is mainly focused on the presentation of the compatible advancing front idea and on the demonstration that the algorithm provably converges to an injective map. We also complement our theoretical analysis with an extensive practical validation, executing more than one billion advancing front moves on 36K mapping tasks.
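The segment-intersection test that AFM builds on can be made concrete with the standard orientation-based predicate; the sketch below is generic computational-geometry code, not code from the paper.

```python
def orient(a, b, c):
    """Twice the signed area of triangle (a, b, c); > 0 if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_intersect(p1, p2, q1, q2):
    """True if the closed segments p1p2 and q1q2 share at least one point."""
    d1 = orient(q1, q2, p1)
    d2 = orient(q1, q2, p2)
    d3 = orient(p1, p2, q1)
    d4 = orient(p1, p2, q2)
    if ((d1 > 0) != (d2 > 0)) and ((d3 > 0) != (d4 > 0)):
        return True  # proper crossing

    def on_segment(a, b, c):  # c collinear with a-b and inside its bounding box
        return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0]) and
                min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))

    return ((d1 == 0 and on_segment(q1, q2, p1)) or
            (d2 == 0 and on_segment(q1, q2, p2)) or
            (d3 == 0 and on_segment(p1, p2, q1)) or
            (d4 == 0 and on_segment(p1, p2, q2)))
```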
Item: BallMerge: High-quality Fast Surface Reconstruction via Voronoi Balls (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Parakkat, Amal Dev; Ohrhallinger, Stefan; Eisemann, Elmar; Memari, Pooran; Bermano, Amit H.; Kalogerakis, Evangelos
We introduce a Delaunay-based algorithm for reconstructing the underlying surface of a given set of unstructured points in 3D. The implementation is very simple, and it is designed to work in a parameter-free manner. The solution builds upon the fact that in the continuous case, a closed surface separates the set of maximal empty balls (medial balls) into an interior and an exterior. Based on discrete input samples, our reconstructed surface consists of the interface between Voronoi balls, which approximate the interior and exterior medial balls. An initial set of Voronoi balls is iteratively processed, merging Voronoi-ball pairs if they fulfil an overlapping error criterion. Our complete open-source reconstruction pipeline performs up to two quick linear-time passes on the Delaunay complex to output the surface, making it an order of magnitude faster than the state of the art while being competitive in memory usage and often superior in quality. We propose two variants (local and global), which are carefully designed to target two different reconstruction scenarios: watertight surfaces from accurate or noisy samples, and real-world scanned data sets exhibiting noise, outliers, and large areas of missing data. The results of the global variant are, by definition, watertight and therefore suitable for numerical analysis and various applications (e.g., 3D printing). Compared to classical Delaunay-based reconstruction techniques, our method is highly stable and robust to noise and outliers, as evidenced by various experiments, including on real-world data with challenges such as scan shadows, outliers, and noise, even without additional preprocessing.
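The ball-merging step above can be pictured with a small sketch; the overlap criterion and the enclosing-ball merge below are illustrative stand-ins, not BallMerge's actual definitions.

```python
import numpy as np

def overlap_error(c1, r1, c2, r2):
    """Relative overlap of two balls (centers c1, c2; radii r1, r2).
    0 means the balls merely touch; larger values mean deeper interpenetration."""
    d = np.linalg.norm(np.asarray(c1, float) - np.asarray(c2, float))
    return (r1 + r2 - d) / min(r1, r2)

def try_merge(ball_a, ball_b, threshold=0.5):
    """Merge two Voronoi balls into their smallest enclosing ball when they
    overlap enough; the threshold is a hypothetical tuning value."""
    (c1, r1), (c2, r2) = ball_a, ball_b
    if overlap_error(c1, r1, c2, r2) < threshold:
        return None
    c1, c2 = np.asarray(c1, float), np.asarray(c2, float)
    d = np.linalg.norm(c2 - c1)
    if d + min(r1, r2) <= max(r1, r2):        # one ball already contains the other
        return (c1, r1) if r1 >= r2 else (c2, r2)
    radius = 0.5 * (d + r1 + r2)              # smallest ball enclosing both
    center = c1 + (radius - r1) * (c2 - c1) / d
    return center, radius
```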
Item: CharacterMixer: Rig-Aware Interpolation of 3D Characters (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Zhan, Xiao; Fu, Rao; Ritchie, Daniel; Bermano, Amit H.; Kalogerakis, Evangelos
We present CharacterMixer, a system for blending two rigged 3D characters with different mesh and skeleton topologies while maintaining a rig throughout interpolation. CharacterMixer also enables interpolation during motion for such characters, a novel feature. Interpolation is an important shape editing operation, but prior methods have limitations when applied to rigged characters: they either ignore the rig (making interpolated characters no longer posable) or use a fixed rig and mesh topology. To handle different mesh topologies, CharacterMixer uses a signed distance field (SDF) representation of character shapes, with one SDF per bone. To handle different skeleton topologies, it computes a hierarchical correspondence between source and target character skeletons and interpolates the SDFs of corresponding bones. This correspondence also allows the creation of a single "unified skeleton" for posing and animating interpolated characters. We show that CharacterMixer produces qualitatively better interpolation results than two state-of-the-art methods while preserving a rig throughout interpolation. Project page: https://seanxzhan.github.io/projects/CharacterMixer.

Item: Cinematographic Camera Diffusion Model (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Jiang, Hongda; Wang, Xi; Christie, Marc; Liu, Libin; Chen, Baoquan; Bermano, Amit H.; Kalogerakis, Evangelos
Designing effective camera trajectories in virtual 3D environments is a challenging task even for experienced animators. Despite an elaborate film grammar, forged through years of experience, that enables the specification of camera motions through cinematographic properties (framing, shot sizes, angles, motions), there are endless possibilities in deciding how to place and move cameras with characters. Dealing with these possibilities is part of the complexity of the problem. While numerous techniques have been proposed in the literature (optimization-based solving, encoding of empirical rules, learning from real examples, ...), the results either lack variety or ease of control. In this paper, we propose a cinematographic camera diffusion model that uses a transformer-based architecture to handle temporality and exploits the stochasticity of diffusion models to generate diverse, high-quality trajectories conditioned on high-level textual descriptions. We extend this model by integrating keyframing constraints and the ability to blend naturally between motions using latent interpolation, so as to augment the designers' degree of control. We demonstrate the strengths of this text-to-camera-motion approach through qualitative and quantitative experiments and gather feedback from professional artists. The code and data are available at https://github.com/jianghd1996/Camera-control.
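As background on the sampling machinery behind such a text-conditioned camera diffusion model, a DDPM-style sampler iterates the standard reverse (denoising) update below; this is general diffusion-model background, not an equation reproduced from the paper.

\[
\mathbf{x}_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(\mathbf{x}_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t, c)\right) + \sigma_t\,\mathbf{z}, \qquad \mathbf{z}\sim\mathcal{N}(\mathbf{0},\mathbf{I}),
\]

where \(\mathbf{x}_t\) is the noisy camera trajectory at step \(t\), \(c\) is the text-derived conditioning, \(\alpha_t\) and \(\bar{\alpha}_t\) follow the noise schedule, and \(\boldsymbol{\epsilon}_\theta\) is the learned (here transformer-based) noise predictor.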
Item: Computational Smocking through Fabric-Thread Interaction (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Zhou, Ningfeng; Ren, Jing; Sorkine-Hornung, Olga; Bermano, Amit H.; Kalogerakis, Evangelos
We formalize Italian smocking, an intricate embroidery technique that gathers flat fabric into pleats along meandering lines of stitches, resulting in pleats that fold and gather where the stitching veers. In contrast to English smocking, characterized by colorful stitches decorating uniformly shaped pleats, and Canadian smocking, which uses localized knots to form voluminous pleats, Italian smocking permits the fabric to move freely along the stitched threads following curved paths, resulting in complex and unpredictable pleats with highly diverse, irregular structures, achieved simply by pulling on the threads. We introduce a novel method for digital previewing of Italian smocking results, given the thread stitching path as input. Our method uses a coarse-grained mass-spring system to simulate the interaction between the threads and the fabric. This configuration guides the fine-level fabric deformation through an adaptation of the state-of-the-art simulator C-IPC [LKJ21]. Our method models the general problem of fabric-thread interaction and can be readily adapted to preview Canadian smocking as well. We compare our results to baseline approaches and physical fabrications to demonstrate the accuracy of our method.
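The coarse mass-spring coupling mentioned above follows the usual textbook form; the sketch below is a generic spring-force integrator, not the paper's fabric-thread solver.

```python
import numpy as np

def spring_forces(x, v, edges, rest_len, k=500.0, damping=2.0):
    """Accumulate Hookean spring forces with damping along the spring direction.
    x, v: (N, 3) particle positions and velocities; edges: list of (i, j) pairs."""
    f = np.zeros_like(x)
    for (i, j), L0 in zip(edges, rest_len):
        d = x[j] - x[i]
        length = np.linalg.norm(d)
        if length < 1e-12:
            continue
        direction = d / length
        fs = k * (length - L0) * direction                     # Hooke's law
        fd = damping * np.dot(v[j] - v[i], direction) * direction
        f[i] += fs + fd
        f[j] -= fs + fd
    return f

def step(x, v, edges, rest_len, masses, dt=1e-3, gravity=(0.0, -9.81, 0.0)):
    """One symplectic Euler step of the coarse mass-spring system.
    masses: (N,) array of particle masses."""
    f = spring_forces(x, v, edges, rest_len) + masses[:, None] * np.asarray(gravity)
    v = v + dt * f / masses[:, None]
    x = x + dt * v
    return x, v
```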
Item: DivaTrack: Diverse Bodies and Motions from Acceleration-Enhanced 3-Point Trackers (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Yang, Dongseok; Kang, Jiho; Ma, Lingni; Greer, Joseph; Ye, Yuting; Lee, Sung-Hee; Bermano, Amit H.; Kalogerakis, Evangelos
Full-body avatar presence is important for immersive social and environmental interactions in digital reality. However, current devices only provide three six-degrees-of-freedom (DOF) poses, from the headset and two controllers (i.e., three-point trackers). Because it is a highly under-constrained problem, inferring full-body pose from these inputs is challenging, especially when supporting the full range of body proportions and use cases represented by the general population. In this paper, we propose a deep learning framework, DivaTrack, which outperforms existing methods when applied to diverse body sizes and activities. We augment the sparse three-point inputs with linear accelerations from Inertial Measurement Units (IMUs) to improve foot contact prediction. We then condition the otherwise ambiguous lower-body pose on the predictions of foot contact and upper-body pose in a two-stage model. We further stabilize the inferred full-body pose in a wide range of configurations by learning to blend predictions that are computed in two reference frames, each of which is designed for different types of motions. We demonstrate the effectiveness of our design on a large dataset that captures 22 subjects performing challenging locomotion for three-point tracking, including lunges, hula-hooping, and sitting. As shown in a live demo using a Meta VR headset and Xsens IMUs, our method runs in real time while accurately tracking a user's motion as they perform a diverse set of movements.

Item: An Empirically Derived Adjustable Model for Particle Size Distributions in Advection Fog (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Kolárová, Monika; Lachiver, Loïc; Wilkie, Alexander; Bermano, Amit H.; Kalogerakis, Evangelos
Realistically modelled atmospheric phenomena are a long-standing research topic in rendering. While significant progress has been made in modelling clear skies and clouds, fog has often been simplified as a medium that is homogeneous throughout, or as a simple density gradient. However, these approximations neglect the characteristic variations real advection fog shows throughout its vertical span, and they do not provide the particle distribution data needed for accurate rendering. Based on data from the meteorological literature, we developed an analytical model that yields the distribution of particle size as a function of altitude within an advection fog layer. The thickness of the fog layer is an additional input parameter, so that fog layers of varying thickness can be realistically represented. We also demonstrate that, based on Mie scattering, one can easily integrate this model into a Monte Carlo renderer. Our model is the first non-trivial volumetric model for advection fog that is based on real measurement data and that contains all the components needed for inclusion in a modern renderer. The model is provided as an open-source component and can serve as a reference for rendering problems that involve fog layers.

Item: Enhancing Image Quality Prediction with Self-supervised Visual Masking (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Çogalan, Ugur; Bemana, Mojtaba; Seidel, Hans-Peter; Myszkowski, Karol; Bermano, Amit H.; Kalogerakis, Evangelos
Full-reference image quality metrics (FR-IQMs) aim to measure the visual differences between a pair of reference and distorted images, with the goal of accurately predicting human judgments. However, existing FR-IQMs, including traditional ones like PSNR and SSIM and even perceptual ones such as HDR-VDP, LPIPS, and DISTS, still fall short in capturing the complexities and nuances of human perception. In this work, rather than devising a novel IQM model, we seek to improve upon the perceptual quality of existing FR-IQM methods. We achieve this by considering visual masking, an important characteristic of the human visual system that changes its sensitivity to distortions as a function of local image content. Specifically, for a given FR-IQM metric, we propose to predict a visual masking model that modulates reference and distorted images in a way that penalizes visual errors based on their visibility. Since ground-truth visual masks are difficult to obtain, we demonstrate how they can be derived in a self-supervised manner solely based on mean opinion scores (MOS) collected from an FR-IQM dataset. Our approach results in enhanced FR-IQM metrics that are more in line with human judgments, both visually and quantitatively.
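How a predicted visibility mask can wrap an existing full-reference metric is sketched below; the mask predictor and base metric are placeholders for illustration, not the learned model from the paper.

```python
import numpy as np

def mse_metric(a, b):
    """Placeholder FR-IQM: plain mean squared error."""
    return float(np.mean((a - b) ** 2))

def masked_metric(reference, distorted, predict_mask, base_metric=mse_metric):
    """Modulate both images by a per-pixel visibility mask in [0, 1] before
    scoring them, so errors in strongly masked regions are penalized less."""
    mask = predict_mask(reference, distorted)   # e.g. output of a small network
    return base_metric(mask * reference, mask * distorted)

# Toy usage with a hypothetical contrast-based mask (not the learned one):
def toy_mask(ref, dist):
    local_contrast = np.abs(ref - ref.mean())
    return 1.0 / (1.0 + local_contrast)         # busier regions mask more

ref = np.random.rand(64, 64)
dist = ref + 0.05 * np.random.randn(64, 64)
print(masked_metric(ref, dist, toy_mask))
```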
Item: Enhancing Spatiotemporal Resampling with a Novel MIS Weight (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Pan, Xingyue; Zhang, Jiaxuan; Huang, Jiancong; Liu, Ligang; Bermano, Amit H.; Kalogerakis, Evangelos
In real-time rendering, optimizing the sampling of large-scale candidates is crucial. The spatiotemporal reservoir resampling (ReSTIR) method provides an effective approach for handling large candidate sets, while the Generalized Resampled Importance Sampling (GRIS) theory provides a general framework for resampling algorithms. However, we have observed that when the generalized multiple importance sampling (MIS) weight from previous work is used during spatiotemporal reuse, variance gradually amplifies when the candidate domains differ significantly. To address this issue, we propose a new MIS weight suitable for resampling that blends samples from different sampling domains, ensuring convergence of the results as the proportion of non-canonical samples increases. Additionally, we apply this weight to temporal resampling to reduce noise caused by scene changes or jitter. Our method effectively reduces energy loss in the biased version of ReSTIR DI while incurring no additional overhead, and it also suppresses artifacts caused by a high proportion of temporal samples. As a result, our approach leads to lower variance in the sampling results.
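For context on the quantity being redesigned here, one common prior-work choice in GRIS-style resampling is the generalized balance heuristic (shown below as background; it is not the new weight proposed in the paper):

\[
m_i(x) \;=\; \frac{\hat{p}_i(x)}{\sum_{j=1}^{M} \hat{p}_j(x)},
\]

where the sum runs over the \(M\) domains (reservoirs) participating in the reuse and \(\hat{p}_j\) is the target function of domain \(j\) evaluated at the candidate \(x\). The paper's contribution is a replacement for this weight that keeps variance under control as the share of non-canonical samples grows.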
Item: Estimating Cloth Simulation Parameters From Tag Information and Cusick Drape Test (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Ju, Eunjung; Kim, Kwang-yun; Yoon, Sungjin; Shim, Eungjune; Kang, Gyoo-Chul; Chang, Phil Sik; Choi, Myung Geol; Bermano, Amit H.; Kalogerakis, Evangelos
In recent years, the fashion apparel industry has been increasingly employing virtual simulations for the development of new products. The first step in virtual garment simulation involves identifying the optimal simulation parameters that accurately reproduce the drape properties of the actual fabric. Recent techniques advocate for a data-driven approach, estimating parameters from the outcomes of a Cusick drape test. Such methods deviate from the standard Cusick drape test by introducing high-cost tools, which reduces practicality. Our research presents a more practical model, utilizing 2D silhouette images from the ISO-standardized Cusick drape test. Notably, while past models have shown limitations in estimating stretching parameters, our novel approach leverages the fabric's tag information, including fabric type and fiber composition. Our proposed model functions as a cascaded system: first, it estimates stretching parameters from the tag information; then it considers the estimated stretching parameters alongside the fabric sample's Cusick drape test results to determine bending parameters. We validated our model against existing methods and applied it in practical scenarios, showing promising outcomes.

Item: EUROGRAPHICS 2024: CGF 43-2 Frontmatter (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Bermano, Amit H.; Kalogerakis, Evangelos

Item: FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Tatsukawa, Yuki; Shen, I-Chao; Qi, Anran; Koyama, Yuki; Igarashi, Takeo; Shamir, Ariel; Bermano, Amit H.; Kalogerakis, Evangelos
Acquiring the desired font for various design tasks can be challenging and requires professional typographic knowledge. While previous font retrieval or generation works have alleviated some of these difficulties, they often lack support for multiple languages and semantic attributes beyond the training data domains. To solve this problem, we present FontCLIP, a model that connects the semantic understanding of a large vision-language model with typographical knowledge. We integrate typography-specific knowledge into the comprehensive vision-language knowledge of a pretrained CLIP model through a novel finetuning approach. We propose to use a compound descriptive prompt that encapsulates adaptively sampled attributes from a font attribute dataset focusing on Roman alphabet characters. FontCLIP's semantic typographic latent space demonstrates two unprecedented generalization abilities. First, FontCLIP generalizes to different languages, including Chinese, Japanese, and Korean (CJK), capturing the typographical features of fonts across languages even though it was only finetuned on fonts of Roman characters. Second, FontCLIP can recognize semantic attributes that are not present in the training data. FontCLIP's dual-modality and generalization abilities enable multilingual and cross-lingual font retrieval and letter shape optimization, reducing the burden of obtaining desired fonts.
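A compound descriptive prompt of the kind described above might be assembled as follows; the attribute vocabulary, sentence template, and encoder call are illustrative assumptions, not FontCLIP's actual pipeline.

```python
import random

# Hypothetical attribute vocabulary; FontCLIP samples attributes from a font-attribute dataset.
ATTRIBUTES = ["bold", "delicate", "playful", "formal", "angular", "calligraphic"]

def compound_prompt(attribute_scores, k=3):
    """Build a single descriptive sentence from the k strongest attributes."""
    top = sorted(attribute_scores, key=attribute_scores.get, reverse=True)[:k]
    return "The font is " + ", ".join(top[:-1]) + " and " + top[-1] + "."

scores = {a: random.random() for a in ATTRIBUTES}
prompt = compound_prompt(scores)
print(prompt)  # e.g. "The font is playful, angular and bold."

# The prompt would then be fed to a pretrained CLIP-style text encoder, e.g.:
#   tokens = open_clip.tokenize([prompt])
#   text_embedding = model.encode_text(tokens)
```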
Item: Freeform Shape Fabrication by Kerfing Stiff Materials (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Speetzen, Nils; Kobbelt, Leif; Bermano, Amit H.; Kalogerakis, Evangelos
Fast, flexible, and cost-efficient production of 3D models from 2D material sheets is a key component in digital fabrication and prototyping. In order to achieve high-quality approximations of freeform shapes, a common set of methods aims to produce bendable 2D cutouts that are then assembled. So far, bent surfaces have been achieved automatically by computing developable patches of the input surface, e.g. in the context of papercraft. For stiff materials such as medium-density fibreboard (MDF) or plywood, the 2D cutouts require the application of additional cutting patterns ("kerfing") to make them bendable. Such kerf patterns are commonly constructed with considerable user input, e.g. in architectural design. We propose a fully automatic method that produces kerfed cutouts suitable for the assembly of freeform shapes from stiff material sheets. By exploring the degrees of freedom that emerge from the choice of bending directions, the creation of box joints at the patch boundaries, and the application of kerf cuts with adaptive density, our method achieves a high-quality approximation of the input.

Item: GANtlitz: Ultra High Resolution Generative Model for Multi-Modal Face Textures (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Gruber, Aurel; Collins, Edo; Meka, Abhimitra; Mueller, Franziska; Sarkar, Kripasindhu; Orts-Escolano, Sergio; Prasso, Luca; Busch, Jay; Gross, Markus; Beeler, Thabo; Bermano, Amit H.; Kalogerakis, Evangelos
High-resolution texture maps are essential to render photoreal digital humans for visual effects or to generate data for machine learning. The acquisition of high-resolution assets at scale is cumbersome: it involves enrolling a large number of human subjects, using expensive multi-view camera setups, and significant manual artistic effort to align the textures. To alleviate these problems, we introduce GANtlitz (a play on the German noun Antlitz, meaning face), a generative model that can synthesize multi-modal ultra-high-resolution face appearance maps for novel identities. Our method solves three distinct challenges: 1) the unavailability of the very large data corpus generally required for training generative models, 2) the memory and computational limitations of training a GAN at ultra-high resolutions, and 3) consistency of appearance features such as skin color, pores and wrinkles in high-resolution textures across different modalities. We introduce dual-style blocks, an extension of the style blocks of the StyleGAN2 architecture, which improve multi-modal synthesis. Our patch-based architecture is trained only on image patches obtained from a small set of face textures (<100) and yet allows us to generate seamless appearance maps of novel identities at 6k×4k resolution. Extensive qualitative and quantitative evaluations and baseline comparisons show the efficacy of our proposed system.

Item: GLS-PIA: n-Dimensional Spherical B-Spline Curve Fitting based on Geodesic Least Square with Adaptive Knot Placement (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Zhao, Yuming; Wu, Zhongke; Wang, Xingce; Bermano, Amit H.; Kalogerakis, Evangelos
Due to the widespread applications of curves on n-dimensional spheres, fitting curves on n-dimensional spheres has received increasing attention in recent years. However, due to the non-Euclidean nature of spheres, curve fitting methods on n-dimensional spheres often struggle to balance fitting accuracy and curve fairness. In this paper, we propose a new fitting framework, GLS-PIA, for parameterized point sets on n-dimensional spheres to address this challenge, and we provide a proof for the method. First, we propose a progressive iterative approximation method based on geodesic least squares, which can directly optimize the geodesic least-squares loss on the n-sphere, improving the accuracy of the fitting. Additionally, we use an error allocation method based on contribution coefficients to ensure the fairness of the fitting curve. Second, we propose an adaptive knot placement method based on geodesic difference to estimate a more reasonable distribution of control points in the parameter domain, placing more control points in areas with greater detail. This enables B-spline curves to capture more details with a limited number of control points. Experimental results demonstrate that our framework achieves outstanding performance, especially in handling imbalanced data points. (In this paper, "sphere" refers to the n-sphere, n ≥ 2, unless otherwise specified.)
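The geodesic quantities underlying such a fit are straightforward to state; the sketch below shows the geodesic distance and least-squares residual on the unit n-sphere (generic formulas, not the paper's full GLS-PIA iteration).

```python
import numpy as np

def geodesic_distance(p, q):
    """Great-circle distance between unit vectors p and q on the n-sphere."""
    return float(np.arccos(np.clip(np.dot(p, q), -1.0, 1.0)))

def geodesic_ls_loss(curve_points, data_points):
    """Sum of squared geodesic distances between curve samples and data points.
    curve_points[i] is the curve evaluated at the parameter of data_points[i]."""
    return sum(geodesic_distance(c, d) ** 2
               for c, d in zip(curve_points, data_points))

# Toy usage on the ordinary 2-sphere in R^3:
p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])
print(geodesic_distance(p, q))   # pi / 2
```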
Item: HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Dudai, Chen; Alper, Morris; Bezalel, Hana; Hanocka, Rana; Lang, Itai; Averbuch-Elor, Hadar; Bermano, Amit H.; Kalogerakis, Evangelos
Internet image collections containing photos captured by crowds of photographers show promise for enabling digital exploration of large-scale tourist landmarks. However, prior works focus primarily on geometric reconstruction and visualization, neglecting the key role of language in providing a semantic interface for navigation and fine-grained understanding. In more constrained 3D domains, recent methods have leveraged modern vision-and-language models as a strong prior of 2D visual semantics. While these models display an excellent understanding of broad visual semantics, they struggle with unconstrained photo collections depicting such tourist landmarks, as they lack expert knowledge of the architectural domain and fail to exploit the geometric consistency of images capturing multiple views of such scenes. In this work, we present a localization system that connects neural representations of scenes depicting large-scale landmarks with text describing a semantic region within the scene, by harnessing the power of SOTA vision-and-language models with adaptations for understanding landmark scene semantics. To bolster such models with fine-grained knowledge, we leverage large-scale Internet data containing images of similar landmarks along with weakly related textual information. Our approach is built upon the premise that images physically grounded in space can provide a powerful supervision signal for localizing new concepts, whose semantics may be unlocked from Internet textual metadata with large language models. We use correspondences between views of scenes to bootstrap spatial understanding of these semantics, providing guidance for 3D-compatible segmentation that ultimately lifts to a volumetric scene representation. To evaluate our method, we present a new benchmark dataset containing large-scale scenes with ground-truth segmentations for multiple semantic concepts. Our results show that HaLo-NeRF can accurately localize a variety of semantic concepts related to architectural landmarks, surpassing the results of other 3D models as well as strong 2D segmentation baselines. Our code and data are publicly available at https://tau-vailab.github.io/HaLo-NeRF/.

Item: Hierarchical Co-generation of Parcels and Streets in Urban Modeling (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Chen, Zebin; Song, Peng; Ortner, F. Peter; Bermano, Amit H.; Kalogerakis, Evangelos
We present a computational framework for modeling land parcels and streets. In the real world, parcels and streets are highly coupled with each other, since a street network connects all the parcels in a certain area. However, existing works model parcels and streets separately to simplify the problem, resulting in urban layouts with irregular parcels and/or suboptimal streets. In this paper, we propose a hierarchical approach to co-generate parcels and streets from a user-specified polygonal land shape, guided by a set of fundamental urban design requirements. At each hierarchical level, new parcels are generated based on binary splitting of existing parcels, and new streets are subsequently generated by leveraging efficient graph search tools to ensure that each new parcel has street access. Finally, we optimize the geometry of the generated parcels and streets to further improve their geometric quality. Our computational framework outputs an urban layout with a desired number of regular parcels that are reachable via a connected street network, and it allows users to control the modeling process both locally and globally. Quantitative comparisons with state-of-the-art approaches show that our framework is able to generate parcels and streets that are superior in some aspects.

Item: The Impulse Particle-In-Cell Method (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Sancho, Sergio; Tang, Jingwei; Batty, Christopher; Azevedo, Vinicius C.; Bermano, Amit H.; Kalogerakis, Evangelos
An ongoing challenge in fluid animation is the faithful preservation of vortical details, which impacts the visual depiction of flows. We propose the Impulse Particle-In-Cell (IPIC) method, a novel extension of the popular Affine Particle-In-Cell (APIC) method that makes use of the impulse gauge formulation of the fluid equations. Our approach performs coupled advection-stretching during particle-based advection to better preserve circulation and vortical details. The associated algorithmic changes are simple and straightforward to implement, and our results demonstrate that the proposed method is able to achieve more energetic and visually appealing smoke and liquid flows than APIC.

Item: Interactive Exploration of Vivid Material Iridescence based on Bragg Mirrors (The Eurographics Association and John Wiley & Sons Ltd., 2024)
Fourneau, Gary; Pacanowski, Romain; Barla, Pascal; Bermano, Amit H.; Kalogerakis, Evangelos
Many animals, plants and gems exhibit iridescent material appearance in nature. These effects are due to specific geometric structures at scales comparable to visible wavelengths, yielding so-called structural colors. The most vivid examples are due to photonic crystals, where the same structure is repeated in one, two or three dimensions, augmenting the magnitude and complexity of interference effects. In this paper, we study the appearance of 1D photonic crystals (repetitive pairs of thin films), also called Bragg mirrors. Previous work has considered the effect of multiple thin films using the classical transfer matrix approach, which increases in complexity as the number of repetitions increases. Our first contribution is to introduce a more efficient closed-form formula [Yeh88] for Bragg mirror reflectance to the graphics community, as well as an approximation that lends itself to efficient spectral integration for RGB rendering. We then explore the appearance of stacks made of rough Bragg layers. Here our contribution is to show that they may lead to ballistic transmission, significantly speeding up position-free rendering and leading to an efficient single-reflection BRDF model.
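For readers curious about the classical transfer-matrix baseline mentioned in the last entry, a compact normal-incidence sketch for a Bragg stack follows; this is standard thin-film optics, independent of the paper's closed-form result.

```python
import numpy as np

def layer_matrix(n, d, wavelength):
    """Characteristic matrix of one dielectric thin film at normal incidence.
    n: refractive index, d: thickness, wavelength: vacuum wavelength (same units as d)."""
    delta = 2.0 * np.pi * n * d / wavelength        # phase thickness of the layer
    return np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                     [1j * n * np.sin(delta), np.cos(delta)]])

def bragg_reflectance(n_hi, d_hi, n_lo, d_lo, pairs, wavelength,
                      n_in=1.0, n_sub=1.5):
    """Reflectance of a (high/low)^pairs Bragg mirror on a substrate."""
    M = np.eye(2, dtype=complex)
    for _ in range(pairs):
        M = M @ layer_matrix(n_hi, d_hi, wavelength) @ layer_matrix(n_lo, d_lo, wavelength)
    B, C = M @ np.array([1.0, n_sub])               # stack characteristic admittances
    r = (n_in * B - C) / (n_in * B + C)
    return float(np.abs(r) ** 2)

# Toy quarter-wave stack tuned to 550 nm (thicknesses in nm):
print(bragg_reflectance(n_hi=2.3, d_hi=550 / (4 * 2.3),
                        n_lo=1.38, d_lo=550 / (4 * 1.38),
                        pairs=8, wavelength=550.0))
```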