Browsing by Author "Noh, Junyong"
Now showing 1 - 9 of 9
Item: Deep Learning-Based Unsupervised Human Facial Retargeting (The Eurographics Association and John Wiley & Sons Ltd., 2021)
Kim, Seonghyeon; Jung, Sunjin; Seo, Kwanggyoon; Ribera, Roger Blanco i; Noh, Junyong; Zhang, Fang-Lue and Eisemann, Elmar and Singh, Karan
Traditional approaches to retargeting existing facial blendshape animations to other characters rely heavily on manually paired data, including corresponding anchors, expressions, or semantic parametrizations, to preserve the characteristics of the original performance. In this paper, inspired by recent developments in face swapping and reenactment, we propose a novel unsupervised learning method that reformulates the retargeting of 3D facial blendshape-based animations in the image domain. The expressions of a source model are transferred to a target model via the rendered images of the source animation. For this purpose, a reenactment network is trained with the rendered images of various expressions created by the source and target models in a shared latent space. The use of a shared latent space enables an automatic cross-mapping, obviating the need for manual pairing. Next, a blendshape prediction network is used to extract the blendshape weights from the translated image to complete the retargeting of the animation onto a 3D target model. Our method allows for fully unsupervised retargeting of facial expressions between models of different configurations and, once trained, is suitable for automatic real-time applications.

Item: A Drone Video Clip Dataset and its Applications in Automated Cinematography (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Ashtari, Amirsaman; Jung, Raehyuk; Li, Mingxiao; Noh, Junyong; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
Drones have become popular video capturing tools. Drone videos in the wild are first captured and then edited by humans to contain aesthetically pleasing camera motions and scenes. Edited drone videos therefore carry extremely useful information for cinematography and for applications such as camera path planning to capture aesthetically pleasing shots. To design intelligent camera path planners, learning drone camera motions from these edited videos is essential. However, this first requires filtering drone clips and extracting their camera motions from edited videos, which commonly contain both drone and non-drone content. Moreover, existing video search engines return the whole edited video as a semantic search result and cannot return only the drone clips inside an edited video. To address this problem, we propose the first approach that can automatically retrieve drone clips from an unlabeled video collection using high-level search queries, such as "drone clips captured outdoors in daytime from rural places". The retrieved clips also contain camera motions, camera views, and a 3D reconstruction of the scene, which can help develop intelligent camera path planners. To train our approach, we needed numerous examples of edited drone videos. To this end, we introduce the first large-scale dataset composed of edited drone videos. This dataset is also used for training and validating our drone video filtering algorithm. Both quantitative and qualitative evaluations have confirmed the validity of our method.
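The retrieval step in the drone dataset entry above boils down to scoring the shots of an edited video against a few high-level attributes and keeping those that satisfy a query. The snippet below is a minimal, hypothetical sketch of that filtering stage only; the Clip fields, attribute names, and fixed threshold are illustrative assumptions, and in the actual method the scores would come from classifiers trained on the dataset.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    """One shot of an edited video with hypothetical per-attribute scores."""
    start: float   # shot start time in seconds
    end: float     # shot end time in seconds
    scores: dict   # e.g. {"drone": 0.9, "outdoor": 0.8, "daytime": 0.7, "rural": 0.6}

def retrieve(clips, query, threshold=0.5):
    """Keep shots whose score passes the threshold for every queried attribute."""
    return [c for c in clips
            if all(c.scores.get(attr, 0.0) >= threshold for attr in query)]

clips = [
    Clip(0.0, 8.5,  {"drone": 0.95, "outdoor": 0.9, "daytime": 0.8, "rural": 0.7}),
    Clip(8.5, 12.0, {"drone": 0.10, "outdoor": 0.2, "daytime": 0.9, "rural": 0.1}),
]
# Query: "drone clips captured outdoors in daytime from rural places"
print(retrieve(clips, ["drone", "outdoor", "daytime", "rural"]))  # keeps only the first clip
```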
Item: Generating Texture for 3D Human Avatar from a Single Image using Sampling and Refinement Networks (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Cha, Sihun; Seo, Kwanggyoon; Ashtari, Amirsaman; Noh, Junyong; Myszkowski, Karol; Niessner, Matthias
There has been significant progress in generating an animatable 3D human avatar from a single image. However, recovering the texture of the 3D human avatar from a single image has been relatively less addressed. Because the generated 3D human avatar reveals the occluded texture of the given image as it moves, it is critical to synthesize the occluded texture pattern that is unseen in the source image. To generate a plausible texture map for a 3D human avatar, the occluded texture pattern needs to be synthesized with respect to the visible texture of the given image. Moreover, the generated texture should align with the surface of the target 3D mesh. In this paper, we propose a texture synthesis method for a 3D human avatar that incorporates geometry information. The proposed method consists of two convolutional networks for the sampling and refinement processes. The sampler network fills in the occluded regions of the source image and aligns the texture with the surface of the target 3D mesh using the geometry information. The sampled texture is further refined and adjusted by the refiner network. To maintain the clear details of the given image, both the sampled and refined textures are blended to produce the final texture map. To effectively guide the sampler network toward its goal, we designed a curriculum learning scheme that starts from a simple sampling task and gradually progresses to the task in which alignment needs to be considered. We conducted experiments to show that our method outperforms previous methods qualitatively and quantitatively.

Item: Online Avatar Motion Adaptation to Morphologically-similar Spaces (The Eurographics Association and John Wiley & Sons Ltd., 2023)
Choi, Soojin; Hong, Seokpyo; Cho, Kyungmin; Kim, Chaelin; Noh, Junyong; Myszkowski, Karol; Niessner, Matthias
In avatar-mediated telepresence systems, a similar environment is assumed for the involved spaces so that the avatar in a remote space can imitate the user's motion with the proper semantic intention performed in the local space. For example, the user touching a desk should be reproduced by the avatar in the remote space to correctly convey the intended meaning. It is unlikely, however, that the two physical spaces are exactly the same in terms of the size of the room or the locations of the placed objects. Therefore, naively mapping the user's joint motion to the avatar will not create semantically correct motion of the avatar in relation to the remote environment. Existing studies have addressed the problem of retargeting human motions to an avatar for telepresence applications. Few studies, however, have focused on retargeting continuous full-body motions such as locomotion and object interaction in a unified manner. In this paper, we propose a novel motion adaptation method that generates the full-body motion of a human-like avatar on the fly in the remote space. The proposed method handles locomotion and object interaction motions, as well as smooth transitions between them according to the given user actions, under the condition of a bijective environment mapping between morphologically similar spaces. Our experiments show the effectiveness of the proposed method in generating plausible and semantically correct full-body motions of an avatar in room-scale spaces.
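The bijective environment mapping in the motion adaptation entry above can be pictured, in its simplest form, as re-expressing a contact point in the normalized coordinates of a corresponding surface. The sketch below assumes axis-aligned rectangular surfaces and hypothetical names (Surface, map_contact); the paper's method adapts continuous full-body motion and transitions, not just isolated contact points.

```python
import numpy as np

class Surface:
    """Axis-aligned rectangular surface: corner (x, y), extent (w, d), height z."""
    def __init__(self, corner, extent, height):
        self.corner = np.asarray(corner, float)
        self.extent = np.asarray(extent, float)
        self.height = float(height)

    def to_uv(self, point):
        # normalized (u, v) coordinates of a 3D contact point on this surface
        return (np.asarray(point, float)[:2] - self.corner) / self.extent

    def from_uv(self, uv):
        xy = self.corner + np.asarray(uv, float) * self.extent
        return np.array([xy[0], xy[1], self.height])

def map_contact(point, local_surface, remote_surface):
    """Re-express a local-space contact point on the corresponding remote surface."""
    return remote_surface.from_uv(local_surface.to_uv(point))

local_desk = Surface(corner=[0.0, 0.0], extent=[1.2, 0.6], height=0.70)
remote_desk = Surface(corner=[2.0, 1.0], extent=[1.0, 0.5], height=0.75)
print(map_contact([0.6, 0.3, 0.70], local_desk, remote_desk))  # -> [2.5, 1.25, 0.75]
```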
Item: Real-time Content Projection onto a Tunnel from a Moving Subway Train (The Eurographics Association, 2021)
Kim, Jaedong; Eom, Haegwang; Kim, Jihwan; Kim, Younghui; Noh, Junyong; Lee, Sung-Hee and Zollmann, Stefanie and Okabe, Makoto and Wünsche, Burkhard
In this study, we present the first working system that can project content onto a tunnel wall from a moving subway train so that passengers can enjoy the display of digital content through a train window. To effectively estimate the position of the train in a tunnel, we propose counting sleepers, which are installed at regular intervals along the railway, using a distance sensor. The tunnel profile is constructed from point clouds captured by a depth camera installed next to the projector. The tunnel profile is used to identify projectable sections that do not contain too much interference from possible occluders. It is also used to retrieve the depth at a specific location so that properly warped content can be projected for viewing by passengers through the window while the train is moving at runtime. Here, we show that the proposed system can operate on an actual train.

Item: Real-Time Human Shadow Removal in a Front Projection System (© 2019 The Eurographics Association and John Wiley & Sons Ltd., 2019)
Kim, Jaedong; Seo, Hyunggoog; Cha, Seunghoon; Noh, Junyong; Chen, Min and Benes, Bedrich
When a person is located between a display and an operating projector, a shadow is cast on the display. The shadow may eliminate important visual information and therefore adversely affect the viewing experience. There have been various attempts to remove the human shadow cast on a projection display by using multiple projectors. While previous approaches successfully remove the shadow region when a person moves moderately or stands stationary in front of the display, an afterimage effect remains because the limb motion of the person is not taken into account. We propose a new real-time approach to removing the shadow cast by a person who dynamically interacts with the display, making limb motions in a front projection system. The proposed method utilizes a human skeleton obtained from a depth camera to track the posture of the person as it changes over time. A model consisting of spheres and conical frustums is constructed from the skeleton information to represent the volumetric information of the tracked person. Our method precisely estimates the shadow region by projecting this volumetric model onto the display. In addition, intensity masks built from a distance field help suppress the afterimage of the shadow that appears when the person moves abruptly. They also help blend the overlapping images projected from different projectors into one smoothly combined display. The experimental results verify that our approach effectively removes the shadow of a person in a front projection environment and is fast enough to achieve real-time performance.
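One core step in the shadow removal entry above is projecting the volumetric body model from the projector's viewpoint to estimate the shadowed region. The following is a minimal geometric sketch for spheres only: a display sample is marked as shadowed if the segment from the projector to that sample passes through any body sphere. The conical frustums, intensity masks, and all names here are omissions or assumptions relative to the paper.

```python
import numpy as np

def shadow_mask(projector_pos, spheres, display_points):
    """Mark display samples whose projector-to-sample segment intersects a body sphere.

    projector_pos: (3,) projector position; spheres: list of (center, radius)
    built from skeleton joints; display_points: (N, 3) sample points on the wall.
    """
    p = np.asarray(projector_pos, float)
    x = np.asarray(display_points, float)
    mask = np.zeros(len(x), dtype=bool)
    d = x - p                                          # segment directions
    dd = np.einsum("ij,ij->i", d, d)                   # squared segment lengths
    for center, radius in spheres:
        c = np.asarray(center, float)
        t = np.clip(np.einsum("ij,j->i", d, c - p) / dd, 0.0, 1.0)
        closest = p + t[:, None] * d                   # closest point on each segment
        mask |= np.linalg.norm(closest - c, axis=1) <= radius
    return mask

# Wall at z = 3 m, one "head" sphere between projector and wall.
xs, ys = np.meshgrid(np.linspace(-2, 2, 160), np.linspace(0, 2, 90))
wall = np.stack([xs.ravel(), ys.ravel(), np.full(xs.size, 3.0)], axis=1)
m = shadow_mask([0.0, 1.0, 0.0], [([0.0, 1.5, 1.5], 0.12)], wall)
print(m.sum(), "of", m.size, "samples shadowed")
```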
Item: Recurrent Motion Refiner for Locomotion Stitching (© 2023 Eurographics - The European Association for Computer Graphics and John Wiley & Sons Ltd., 2023)
Kim, Haemin; Cho, Kyungmin; Hong, Seokhyeon; Noh, Junyong; Hauser, Helwig and Alliez, Pierre
Stitching different character motions is one of the most commonly used techniques, as it allows the user to make new animations that fit a given purpose from pieces of motion. However, current motion stitching methods often produce unnatural motion with foot-sliding artefacts, depending on the performance of the interpolation. In this paper, we propose a novel motion stitching technique based on a recurrent motion refiner (RMR) that connects discontinuous locomotions into a single natural locomotion. Our model receives different locomotions as input, in which the root of the last pose of the previous motion and that of the first pose of the next motion are aligned. During runtime, the model slides through the sequence, editing frames window by window to output a smoothly connected animation. Our model consists of a two-layer recurrent network placed between a simple encoder and decoder. To train this network, we created a sufficient number of paired data with a newly designed data generation process. This process employs a K-nearest neighbour search that explores a predefined motion database to create the inputs corresponding to the ground truth. Once trained, the suggested model can connect locomotion sequences of various lengths into a single natural locomotion.

Item: StylePortraitVideo: Editing Portrait Videos with Expression Optimization (The Eurographics Association and John Wiley & Sons Ltd., 2022)
Seo, Kwanggyoon; Oh, Seoung Wug; Lu, Jingwan; Lee, Joon-Young; Kim, Seonghyeon; Noh, Junyong; Umetani, Nobuyuki; Wojtan, Chris; Vouga, Etienne
High-quality portrait image editing has been made easier by recent advances in GANs (e.g., StyleGAN) and GAN inversion methods that project images onto a pre-trained GAN's latent space. However, it is hard to extend existing image editing methods to videos and produce temporally coherent and natural-looking results. We find challenges in reproducing diverse video frames and preserving the natural motion after editing. In this work, we propose solutions for these challenges. First, we propose a video adaptation method that enables the generator to reconstruct the original input identity, unusual poses, and expressions in the video. Second, we propose an expression dynamics optimization that tweaks the latent codes to maintain the meaningful motion in the original video. Based on these methods, we build a StyleGAN-based high-quality portrait video editing system that can edit videos in the wild in a temporally coherent way at up to 4K resolution.
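The goal of the expression dynamics optimization in the StylePortraitVideo entry above, preserving the original motion after a latent edit, can be caricatured with plain arrays: apply the edit to one reference frame and re-impose the original frame-to-frame latent offsets. This is only an illustration of that goal under stated assumptions; the paper optimizes the latent codes rather than copying offsets, and the names below are hypothetical.

```python
import numpy as np

def edit_preserving_dynamics(latents, direction, strength):
    """Apply a semantic edit to per-frame latents while keeping frame-to-frame motion.

    latents: (T, D) inverted latent codes, one per video frame (hypothetical input);
    direction: (D,) edit direction; strength: scalar edit magnitude.
    """
    w = np.asarray(latents, float)
    edited_ref = w[0] + strength * np.asarray(direction, float)  # edit the first frame
    offsets = w - w[0]                                           # original per-frame motion
    return edited_ref[None, :] + offsets                         # re-impose the motion

# Toy example: 4 frames, 8-dimensional latents, a random edit direction.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))
edited = edit_preserving_dynamics(w, direction=rng.normal(size=8), strength=2.0)
print(np.allclose(np.diff(edited, axis=0), np.diff(w, axis=0)))  # True: dynamics kept
```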
Item: Synthesizing Character Animation with Smoothly Decomposed Motion Layers (© 2020 Eurographics - The European Association for Computer Graphics and John Wiley & Sons Ltd, 2020)
Eom, Haegwang; Choi, Byungkuk; Cho, Kyungmin; Jung, Sunjin; Hong, Seokpyo; Noh, Junyong; Benes, Bedrich and Hauser, Helwig
The processing of captured motion is an essential task in the synthesis of high-quality character animation. The motion decomposition techniques investigated in prior work extract meaningful motion primitives that help to facilitate this process. Carefully selected motion primitives can play a major role in various motion-synthesis tasks, such as interpolation, blending, warping, editing, or the generation of new motions. Unfortunately, for a complex character motion, finding generic motion primitives by decomposition is an intractable problem due to the compound nature of the behaviours of such characters. Additionally, decomposed motion primitives tend to be too limited for the chosen model to cover a broad range of motion-synthesis tasks. To address these challenges, we propose a generative motion decomposition framework in which the decomposed motion primitives are applicable to a wide range of motion-synthesis tasks. Technically, the input motion is smoothly decomposed into three motion layers: base-level motion, a layer with controllable motion displacements, and a layer with high-frequency residuals. The final motion can be synthesized simply by changing a single user parameter linked to the layer of controllable motion displacements or by imposing suitable temporal correspondences on the decomposition framework. Our experiments show that this decomposition provides a great deal of flexibility in several motion-synthesis scenarios: denoising, style modulation, upsampling, and time warping.
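The three-layer decomposition in the entry above has a simple frequency-split analogy: two low-pass filters separate a motion signal into a base layer, a scalable displacement layer, and a high-frequency residual. The sketch below (assuming scipy is available, with hypothetical names and filter widths) illustrates that analogy only; it is not the paper's learned generative decomposition.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d  # assumption: scipy is available

def decompose(motion, sigma_base=15.0, sigma_mid=3.0):
    """Split a (T, D) motion signal into base, displacement, and residual layers."""
    base = gaussian_filter1d(motion, sigma_base, axis=0)   # low-frequency base motion
    mid = gaussian_filter1d(motion, sigma_mid, axis=0)
    displacement = mid - base                              # controllable mid band
    residual = motion - mid                                # high-frequency detail
    return base, displacement, residual

def resynthesize(base, displacement, residual, scale=1.0):
    """Recombine the layers; `scale` modulates only the displacement layer."""
    return base + scale * displacement + residual

# Toy example: exaggerate the mid-frequency layer of a noisy 1-DoF motion curve.
t = np.linspace(0.0, 10.0, 300)[:, None]
motion = np.sin(t) + 0.2 * np.sin(8.0 * t) + 0.02 * np.random.default_rng(0).normal(size=t.shape)
layers = decompose(motion)
exaggerated = resynthesize(*layers, scale=1.5)
print(motion.shape, exaggerated.shape)
```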