Poster Sessions
Session 1
6DoF Dynamic Small Marker Tracking with Low-Cost, Unsynchronized, Dynamic Consumer Cameras For Augmented Reality Training
Nicholas Rewkowski, Andrei State, Henry Fuchs
Surgeons improve their skills through repetition of training tasks in order to operate on living patients, ideally receiving timely, useful, and objective performance feedback. However, objective performance measurement is currently difficult without 3D visualization, with effective surgical training apparatus being extremely expensive or limited in accessibility. This is problematic for medical students, especially in situations such as the COVID-19 pandemic in which they are needed by the community but have few ways of practicing without lab access. In this work, we propose and prototype a system for augmented reality (AR) visualization of laparoscopic training tasks using cheap and widely-compatible borescopes, which can track small objects typical of surgical training. We use forward kinematics for calibration and multi-threading to attempt synchronization in order to increase compatibility with consumer applications, resulting in an effective AR simulation with low-cost devices and consumer software, while also providing dynamic camera and marker tracking. We test the system with a typical peg transfer task on the HoloLens 1 and MagicLeap One.
A Conceptual Model for Data Collection and Analysis for AR-based Remote Collaboration Evaluation
Bernardo Marques, António J. Teixeira, Samuel Silva, João Alves, Paulo Dias, Beatriz Sousa Santos
A significant effort has been devoted to the creation of the enabling technology and to the proposal of novel methods to support remote collaboration using Augmented Reality (AR), given the novelty of the field. As the field progresses to focus on the nuances of supporting collaboration, and with the growing number of prototypes mediated by AR, the characterization and evaluation of the collaborative process becomes an essential, but difficult endeavor. Evaluation is particularly challenging in this multifaceted context involving many aspects that may influence the way collaboration occurs. Therefore, it is essential to have holistic evaluation strategies that monitor the use and performance of the proposed solutions regarding the team, its members, and the technology, allowing adequate characterization and reporting of collaborative processes. As a contribution, we propose a conceptual model for multi-user data collection and analysis that monitors several collaboration aspects: individual and team performance, behaviour and level of collaboration, as well as contextual data in scenarios of remote collaboration using AR-based solutions.
An Efficient Planar Bundle Adjustment Algorithm
Lipu Zhou, Daniel Koppel, Hui Ju, Frank Steinbruecker, Michael Kaess
This paper presents an efficient algorithm for the least-squares problem using the point-to-plane cost, which aims to jointly optimize depth sensor poses and plane parameters for 3D reconstruction. We call this least-squares problem Planar Bundle Adjustment (PBA), due to the similarity between this problem and the original Bundle Adjustment (BA) in visual reconstruction. As planes ubiquitously exist in the man-made environment, they are generally used as landmarks in SLAM algorithms for various depth sensors. PBA is important to reduce drift and improve the quality of the map. However, directly adopting the well-established BA framework in visual reconstruction will result in a very inefficient solution for PBA. This is because a 3D point only has one observation at a camera pose. In contrast, a depth sensor can record hundreds of points in a plane at a time, which results in a very large nonlinear least-squares problem even for a small-scale space. The main contribution of this paper is an efficient solution for the PBA problem using the point-to-plane cost. We introduce a reduced Jacobian matrix and a reduced residual vector, and prove that they can replace the original Jacobian matrix and residual vector in the generally adopted Levenberg-Marquardt (LM) algorithm. This significantly reduces the computational cost. Moreover, when planes are combined with other features for 3D reconstruction, the reduced Jacobian matrix and residual vector can also replace the corresponding parts derived from planes. Our experimental results verify that our algorithm can significantly reduce the computational time compared to the solution using the traditional BA framework. In addition, our algorithm is faster, more accurate, and more robust to initialization errors compared to the state-of-the-art solution using the plane-to-plane cost [3].
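The point-to-plane cost that PBA optimizes can be illustrated with a minimal sketch. This is the naive formulation the abstract contrasts with, not the authors' reduced-Jacobian method; the single-plane setup, axis-angle pose parametrization, finite-difference Jacobian, and fixed damping factor are assumptions made only for the example.

```python
# Minimal point-to-plane least-squares sketch (illustrative only; this is the
# naive formulation, not the paper's reduced Jacobian). Plane: n^T x + d = 0
# with |n| = 1; pose: rotation R (axis-angle), translation t.
import numpy as np
from scipy.spatial.transform import Rotation as R

def residuals(x, points, normal, d):
    """Signed point-to-plane distances after applying the pose x = (rotvec, t)."""
    rot, t = R.from_rotvec(x[:3]).as_matrix(), x[3:]
    return (points @ rot.T + t) @ normal + d

def numeric_jacobian(x, points, normal, d, eps=1e-6):
    """Finite-difference Jacobian of the residual vector w.r.t. the 6-DoF pose."""
    J = np.zeros((len(points), 6))
    for k in range(6):
        dx = np.zeros(6); dx[k] = eps
        J[:, k] = (residuals(x + dx, points, normal, d)
                   - residuals(x - dx, points, normal, d)) / (2 * eps)
    return J

# Synthetic data: 500 points sampled on the plane z = 1, observed under a noisy pose guess.
rng = np.random.default_rng(0)
pts = np.c_[rng.uniform(-1, 1, (500, 2)), np.ones(500)]
normal, d = np.array([0.0, 0.0, 1.0]), -1.0
x = np.array([0.05, -0.03, 0.0, 0.02, 0.0, -0.04])   # initial pose guess

lam = 1e-3                                            # damping (a single plane constrains only 3 DoF)
for _ in range(20):
    r, J = residuals(x, pts, normal, d), numeric_jacobian(x, pts, normal, d)
    H = J.T @ J + lam * np.eye(6)                     # damped normal equations
    x -= np.linalg.solve(H, J.T @ r)

print("RMS residual:", np.sqrt(np.mean(residuals(x, pts, normal, d) ** 2)))
```

In this naive form, a depth sensor observing hundreds of points per plane makes the stacked Jacobian very tall, which is exactly the inefficiency the abstract's reduced Jacobian and residual are designed to avoid.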
An Evaluation of AR-Assisted Navigation for Search and Rescue in Underground Spaces
Doga Cagdas Demirkan, Sebnem Duzgun
In this study, we evaluated the performance of AR-assisted navigation in a real underground mine under good and limited illumination conditions, as well as with no illumination, considering possible search and rescue conditions. For this purpose, we utilized the Lumin SDK's embedded spatial mapping algorithm for mapping and navigation. We used the spatial mapping algorithm to create the mesh model of the escape route and to render it with the user input on the Magic Leap One. We then compared the spatial mapping algorithm in three different scenarios for the evacuation of an underground mine in an emergency. The escape route has two junctions and is 30 meters (100 feet) long. The baseline scenarios are (i) evacuation of the mine in a fully illuminated condition, (ii) evacuation with a headlamp, and (iii) evacuation without any illumination. For navigation in the third scenario, we provided an existing mesh of the underground space to the user. To assess the efficiency of the spatial mapping, we tested each scenario with the rendered meshes. In the first scenario (fully illuminated route with the rendered meshes) the evacuation took 40 seconds. In the second scenario (illumination with the headlamp), the evacuation took 44 seconds. In the last scenario (no light source and hence total darkness), the evacuation took 54 seconds. We found that AR-assisted navigation is effective for supporting search and rescue efforts in high-attrition conditions of underground spaces.
An Image-Based Method for Measuring Strabismus in Virtual Reality
Wolfgang Andreas Mehringe, Markus Gerhard Wirth, Stefan Gradl, Luis Simon Durner, Matthias Ring, Annemarie F. Laudanski, Bjoern M Eskofier, Georg Michelson
Strabismus is a visual disorder characterized by eye misalignment. The effect of Panum's Fusional Area (PFA) compensates for small misalignments. However, prominent misalignments affect binocular vision, and when present in childhood they may lead to amblyopia, a developmental disorder of the visual system. With the advent of Virtual Reality (VR) technology, novel binocular treatments for amblyopia arise, in which the measurement of strabismus is crucial to correctly compensate for it. VR thus holds great potential due to its ability to display content to each eye independently. Major research in VR addresses this topic using eye tracking, while there is a paucity of research on image-based assessment methods. In this work, we propose a VR application for measuring strabismus in nine lines of sight. We conducted a study with 14 healthy participants to evaluate the system under two conditions: no strabismus and an artificial deviation induced by prism lenses. Further, we evaluated the effect of the PFA on the system by measuring its extent in horizontal and vertical lines of sight. Results show a significant difference between the expected deviation induced by prism lenses and the measured deviation. The existing difference within the measurements can be explained by the recorded extent of the PFA.
An Intelligent Augmented Reality Training Framework for Neonatal Endotracheal Intubation
Shang Zhao, Xiao Xiao, Qiyue Wang, Xiaoke Zhang, Wei Li, Lamia Soghier, James Hahn
Neonatal Endotracheal Intubation (ETI) is a critical resuscitation skill that requires tremendous practice by trainees before clinical exposure. However, the current manikin-based training regimen is ineffective in providing satisfactory real-time procedural guidance for accurate assessment due to the lack of see-through visualization within the manikin. The training efficiency is further reduced by the limited availability of expert instructors, which inevitably results in a long learning curve for trainees. To this end, we propose an intelligent Augmented Reality (AR) training framework that provides trainees with a complete visualization of the ETI procedure for real-time guidance and assessment. Specifically, the proposed framework is capable of capturing the motions of the laryngoscope and the manikin and offering 3D see-through visualization rendered to the head-mounted display (HMD). Furthermore, an attention-based Convolutional Neural Network (CNN) model is developed to automatically assess the ETI performance from the captured motions as well as identify regions of motion that significantly contribute to the performance evaluation. Lastly, user-friendly augmented feedback is delivered, with results interpretable against the ETI scoring rubric, through a color-coded motion trajectory that highlights regions needing more practice. The classification accuracy of our machine learning model is 84.6%.
AR Circuit Constructor: Combining Electricity Building Blocks and Augmented Reality for Analogy-Driven Learning and Experimentation
Tobias Kreienbühl, Richard Wetzel, Naomi Burgess, Andrea Maria Schmid, Dorothee Brovelli
We present AR Circuit Constructor (ARCC), an augmented reality application to explore and inspect electric circuits for use in educational settings. Learners use tangible electricity building blocks to construct a working electric circuit. Then, they can use a tablet device for exploring the circuit in an augmented reality visualization. Learners can switch between three distinct conceptual analogies: bicycle chain, water pipes, and waterfalls. Through experimentation with different circuit configurations, learners explore different properties of electricity to ultimately improve their understanding of it. We describe the development of our application, including a qualitative user study with a group of STEM teachers. The latter allowed us to gain insights into the qualities required for such an application before it can ultimately be deployed in a classroom setting.
AR Mini-Games for Supermarkets
Urs Riedlinger, Leif Oppermann, Yücel Uzun, Constantin Brosda
We present an Augmented Reality (AR) application intended for use in supermarkets, with the primary goal to bring fun and digital engagement through AR mini-games to the customers while shopping. We believe that our approach can be extended and scaled up by integrating mini-games into existing shopping apps in the future.
ARPads: Mid-air Indirect Input for Augmented Reality
Eugenie Brasier, Olivier Chapuis, Nicolas Ferey, Jeanne Vezien, Caroline Appert
Interacting efficiently and comfortably with Augmented Reality (AR) headsets remains a major issue. We investigate the concept of mid-air pads as an alternative to gaze or direct hand input to control a cursor in windows anchored in the environment. ARPads allow users to control the cursor displayed in the headset screen through movements on a mid-air plane, which is not spatially aligned with the headset screen. We investigate a design space for ARPads, which takes into account the position of the pad relative to the user's body, and the orientation of the pad relative to that of the headset screen. Our study suggests that 1) indirect input can achieve the same performance as direct input while causing less fatigue than hand raycast, 2) an ARPad should be attached to the wrist or waist rather than to the thigh, and 3) the ARPad and the screen should have the same orientation.
Augmented Reality for Pack Optimization using Video and Depth Data
Mario Lorenz, Felix Pfeiffer, Philipp Klimant
Augmented Reality (AR) can be used in intra-logistics to optimize packing processes towards shipping less unused space by calculating and visualizing efficient pack schemas. This work describes the prototype of such a system using a Kinect v2. Based on the delivered RGB-D stream, a detection and tracking algorithm applying particle filters tracks boxes and articles. Further, the tracking approach can check whether an article is placed correctly in the box. A qualitative assessment of the prototype in a warehouse revealed that such a system is only useful for inexperienced packers.
Automatic Detection and Prediction of Cybersickness Severity using Deep Neural Networks from user's Physiological Signals
Rifatul Islam, Yonggun Lee, Mehrad Jaloli, Imtiaz Muhammad Arafat, Dakai Zhu, Peyman Najafirad, Yufei Huang, John Quarles
Cybersickness is one of the primary challenges to the usability and acceptability of virtual reality (VR). Cybersickness can cause motion sickness-like discomforts, including disorientation, headache, nausea, and fatigue, both during and after the VR immersion. Prior research suggested a significant correlation between physiological signals and cybersickness severity, as measured by the simulator sickness questionnaire (SSQ). However, the SSQ may not be suitable for automatic detection of cybersickness severity during immersion, as it is usually reported before and after the immersion. In this study, we introduced an automated approach for the detection and prediction of cybersickness severity from the user's physiological signals. We collected heart rate, breathing rate, heart rate variability, and galvanic skin response data from 31 healthy participants while immersed in a virtual roller coaster simulation. We found a significant difference in the participants' physiological signals during their cybersickness state compared to their resting baseline. We compared a support vector machine classifier and three deep neural classifiers for detecting current cybersickness severity and for predicting its severity two minutes into the future, given the previous two minutes of physiological signals. Our proposed simplified convolutional long short-term memory classifier achieved an accuracy of 97.44% for detecting current cybersickness severity and 87.38% for predicting future cybersickness severity from the physiological signals.
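As a rough illustration of the kind of convolutional long short-term memory classifier the abstract mentions, the sketch below builds a small PyTorch model over two-minute windows of the four physiological channels; the layer sizes, sampling rate, and number of severity classes are assumptions, not the published architecture.

```python
# Illustrative Conv + LSTM classifier over windows of physiological signals
# (heart rate, breathing rate, HRV, GSR). Shapes and hyperparameters are
# assumptions for the sketch, not the published model.
import torch
import torch.nn as nn

class ConvLSTMSeverity(nn.Module):
    def __init__(self, n_channels=4, n_classes=4):
        super().__init__()
        self.conv = nn.Sequential(                      # local temporal features
            nn.Conv1d(n_channels, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, n_classes)            # severity classes

    def forward(self, x):                               # x: (batch, channels, time)
        h = self.conv(x).permute(0, 2, 1)               # -> (batch, time, features)
        _, (hn, _) = self.lstm(h)                       # last hidden state
        return self.head(hn[-1])                        # class logits

# Example: a batch of 8 two-minute windows sampled at 1 Hz (120 samples).
model = ConvLSTMSeverity()
logits = model(torch.randn(8, 4, 120))
print(logits.shape)                                     # torch.Size([8, 4])
```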
Catching the Drone – A Tangible Augmented Reality Game in Superhuman Sports
Christian Eichhorn, David A. Plecher, Sandro Weber, Gudrun Klinker, Yuta Itoh
The newly defined genre of Superhuman Sports provides several challenges, including the need to develop rich Augmented Reality (AR) game concepts with tangible interactions and augmentation. In this paper, we provide insights into a Superhuman Sports ball game, where players are able to interact in mid-air, rapidly and precisely, with a smart, augmented and catchable drone ball. We describe our core concepts and a path towards a fully functional system with multiple and potentially different display solutions, ranging from smartphone-based AR to, eventually, HMD-based AR. For this AR game idea, a unique drone with a trackable cage based on LED pattern recognition has been developed. The player, as well as the drone, will move swiftly during the game. To precisely estimate the 6DoF pose of the fast-moving drone in this dynamic scenario, we propose a suitable pipeline with a tracking algorithm. As a foundation for the tracking, LEDs have been placed in a specific, spherical pattern on the drone cage. Furthermore, refinements based on the unique attributes of LEDs are considered.
Collaborative Augmented Reality on Smartphones via Life-long City-scale Maps
Lukas Platinsky, Michal Szabados, Filip Hlasek, Ross Hemsley, Luca Del Pero, Andrej Pancik, Bryan Baum, Hugo Grimmett, Peter Ondruska
CollaboVR: A Reconfigurable Framework for Creative Collaboration in Virtual Reality
Zhenyi He, Ruofei Du, Ken Perlin
Writing or sketching on whiteboards is an essential part of collaborative discussions in business meetings, reading groups, design sessions, and interviews. However, prior work in collaborative virtual reality (VR) systems has rarely explored the design space of multi-user layouts and interaction modes with virtual whiteboards. In this paper, we present CollaboVR, a reconfigurable framework for both co-located and geographically dispersed multi-user communication in VR. Our system unleashes users' creativity by sharing freehand drawings, converting 2D sketches into 3D models, and generating procedural animations in real time. To minimize the computational expense for VR clients, we leverage a cloud architecture in which the computationally expensive application (Chalktalk) is hosted directly on the servers, with results being simultaneously streamed to clients. We have explored three custom layouts – integrated, mirrored, and projective – to reduce visual clutter, increase eye contact, or adapt to different use cases. To evaluate CollaboVR, we conducted a within-subject user study with 12 participants. Our findings reveal that users appreciate the custom configurations and real-time interactions provided by CollaboVR. We have open sourced CollaboVR at https://github.com/snowymo/CollaboVR to facilitate future research and development of natural user interfaces and real-time collaborative systems in virtual and augmented reality.
Comparing Methods for Mapping Facial Expressions to Enhance Immersive Collaboration with Signs of Emotion
Natalie Hube, Oliver Lenz, Lars Engeln, Rainer Groh, Michael Sedlmair
We present a user study comparing a pre-evaluated mapping approach with a state-of-the-art direct mapping method of facial expressions for emotion judgment in an immersive setting. At its heart, the pre-evaluated approach leverages semiotics, a theory used in linguistics. In doing so, we want to compare pre-evaluation with an approach that seeks to directly map real facial expressions onto their virtual counterparts. To evaluate both approaches, we conducted a controlled lab study with 22 participants. The results show that users are significantly more accurate in judging virtual facial expressions with the pre-evaluated mapping. Additionally, participants were slightly more confident when deciding on a presented emotion. We could not find any differences regarding potential Uncanny Valley effects. However, the pre-evaluated mapping shows potential to be more convenient in a conversational scenario.
Comparison of Augmented Reality Display Techniques to Support Medical Needle Insertion
Florian Heinrich, Luisa Schwenderling, Fabian Joeres, Kai Lawonn, Christian Hansen
Augmented reality (AR) may be a useful technique to overcome issues of conventionally used navigation systems supporting medical needle insertions, like increased mental workload and complicated hand-eye coordination. Previous research primarily focused on the development of AR navigation systems designed for specific displaying devices, but differences between the employed methods have not been investigated before. To this end, a user study involving a needle insertion task was conducted comparing different AR display techniques with a monitor-based approach as baseline condition for the visualization of navigation information. A video see-through stationary display, an optical see-through head-mounted display, and a spatial AR projector-camera system were investigated in this comparison. Results suggest advantages of using projected navigation information in terms of lower task completion time, lower angular deviation, and affirmative subjective participant feedback. Techniques requiring an intermediate view on screens, i.e., the stationary display and the baseline condition, showed less favorable results. Thus, benefits of providing AR navigation information compared to a conventionally used method could be identified. Significant results on objective measures, as well as an identification of advantages and disadvantages of individual display techniques, contribute to the development and design of improved needle navigation systems.
Concept for a Virtual Reality Robot Ground Simulator
Mario Lorenz, Sebastian Knopp, Philipp Klimant, Johannes Quellmalz, Holger Schlegel
For many VR applications where natural walking is necessary, the problem arises that the real movement space is far smaller than the virtual one. Treadmills and redirected walking are established methods for this issue. However, both are limited to even surfaces and are unable to simulate different ground properties. Here, a concept for a VR robot ground simulator is presented that allows walking on steep ground or even staircases and can simulate different ground types such as sand, grass, or concrete. Starting from gait parameters, the technical requirements and implementation challenges for the realization of such a VR ground simulator are given.
Designing a Multitasking Interface for Object-aware AR applications
Brandon Huynh, Jason Orlosky, Tobias Höllerer
Many researchers and industry professionals believe Augmented Reality (AR) to be the next step in personal computing. But the idea of an always-on, context-aware AR device presents new and unique challenges to the way users organize multiple streams of information. What does multitasking look like when applications may be tied to specific elements in the environment? In this exploratory study, we look at one such element, physical objects, and explore an object-centric approach to multitasking in AR. We developed three prototype applications that operate on a subset of objects in a simulated test environment. We performed a pilot study of our multitasking solution with a novice user, a domain expert, and a system expert to develop insights into the future of AR application design.
Effects of Behavioral and Anthropomorphic Realism on Social Influence with Virtual Humans in AR
Hanseul Jun, Jeremy Bailenson
While many applications in AR will display embodied agents in scenes, there is little research examining the social influence of these AR renderings. In this experiment, we manipulated the behavioral and anthropomorphic realism of an embodied agent. Participants wore an AR headset and walked a path specified by four virtual cubes, designed to bring them close to either humans or objects rendered in AR. In addition, there was a control condition with no virtual objects in the room. Participants were then asked to choose between two physical chairs to sit on: one with a virtual human or object on it, or an empty one. We examined the interpersonal distance between participants and rendered objects, physical seat choice, body rotation direction while choosing a seat, and social presence ratings. For interpersonal distance, there was an effect of anthropomorphic realism but not behavioral realism: participants left more space for human-shaped objects than for non-human objects, regardless of how realistically the human behaved. There were no significant differences in seat choice and rotation direction. Social presence ratings were higher for agents high in both behavioral and anthropomorphic realism than for the other conditions. We discuss implications for social influence theory and for the design of AR systems.
Evaluation of different Visualization Techniques for Perception-Based Alignment in Medical AR
Marc J Fischer, Christoph Leuze, Alejandro Martin-Gomez, Stephanie L Perkins, Jarrett Rosenberg, Bruce Daniel
Many Augmented Reality (AR) applications require the alignment of virtual objects to the real world; this is particularly important in medical AR scenarios where medical imaging information may be displayed directly on a patient and is used to identify the exact locations of specific anatomical structures within the body. For optical see-through AR, alignment accuracy depends both on the optical parameters of the AR display as well as the visualization parameters of the virtual model. In this paper, we explore how different static visualization techniques influence users’ ability to perform perception-based alignment in AR for breast reconstruction surgery, where surgeons must accurately identify the locations of several perforator blood vessels while planning the procedure. We conducted a pilot study in which four subjects used four different visualization techniques with varying degrees of opaqueness and brightness as well as outline contrast to align virtual replicas of the relevant anatomy to their 3D-printed counterparts. We collected quantitative scores on spatial alignment accuracy using an external tracking system and qualitative scores on user preference and perceived performance. Results indicate that the highest source of alignment error was along the depth dimension, with users consistently overestimating depth when aligning the virtual renderings. The majority of subjects preferred visualization techniques rendered with lower levels of opaqueness and brightness as well as higher outline contrast, which were also found to support more accurate alignment.
Exploring Virtual Environments by Visually Impaired Using a Mixed Reality Cane Without Visual Feedback
Lei Zhang, Kelvin W, Bin Yang, Hao Tang, Zhigang Zhu
Though virtual reality (VR) has advanced to a certain level of maturity in recent years, the general public still cannot enjoy the benefits provided by VR, especially the population of the blind and visually impaired (BVI). Current VR accessibility applications have been developed either on expensive head-mounted displays or with extra accessories and mechanisms, which are not accessible and convenient for BVI individuals. In this paper, we present a mobile VR app that enables BVI users to access a virtual environment on an iPhone in order to build their skills of perception and recognition of the virtual environment and the virtual objects in it. The app uses the iPhone on a selfie stick to simulate a long cane in VR, and applies Augmented Reality (AR) techniques to track the iPhone's real-time poses in an empty space of the real world, which are then synchronized to the long cane in the VR environment. Due to the use of mixed reality (the integration of VR and AR), we call it the Mixed Reality Cane (MR Cane), which provides BVI users auditory and vibrotactile feedback whenever the virtual cane comes into contact with objects in VR. Thus, the MR Cane allows BVI individuals to interact with the virtual objects and identify approximate sizes and locations of the objects in the virtual environment. We performed preliminary user studies with blindfolded participants to investigate the effectiveness of the proposed mobile approach, and the results indicate that the proposed MR Cane is effective in helping BVI individuals understand the interaction with virtual objects and explore 3D virtual environments. The MR Cane concept can be extended to new applications in navigation, training, and entertainment for BVI individuals without significant additional effort.
Extended by Design: A Toolkit for Creation of XR Experiences
Arlindo Gomes, Lucas Silva Figueiredo, Walter F M Correia, Veronica Teichrieb, Jonysberg P. Quintino, Fabio Q. B. da Silva, Andre L M Santos, Helder de Sousa Pinho
Through the last decade, the creation of extended reality (XR) solutions has significantly increased due to the advent of cheaper, more advanced, and accessible instruments like smartphones, headsets, platforms, development kits, and engines. For instance, the number of GitHub repositories for XR-related projects jumped from 51 in 2010 to over 15,000 in 2020. At the same time, the developer community approaches the creation of XR applications using design processes and methods inherited from past mainstream platforms such as web, mobile, or even product design. Unfortunately, those platforms do not consider the spatial aspects of these applications. In this paper, we present a revisited design process and a toolkit focused on the challenges innate to XR that aims to help beginners and experienced teams in the creation of applications and interactions in Virtual, Augmented, and Mixed Reality. We also present a compendium of 113 techniques and 118 guidelines and a set of canvases that guides users through the process, preventing them from skipping important tasks and discoveries. Finally, we present a pilot case where we accompany a team of developers and designers running our process and using our toolkit for the first time, showing the benefits of a process that addresses specific issues of XR apps.
Generating Emotive Gaits for Virtual Agents Using Affect-Based Autoregression
Uttaran Bhattacharya, Nick Rewkowski, Pooja Guhan, Niall Williams, Trisha Mittal, Aniket Bera, Dinesh Manocha
We present a novel autoregression network to generate virtual agents that convey various emotions through their walking styles or gaits. Given the 3D pose sequence of a gait, our network extracts pertinent movement features and affective features from the gait. We use these features to synthesize subsequent gaits such that the virtual agents can express and transition between emotions represented as combinations of happy, sad, angry, and neutral. We incorporate multiple regularizations in the training of our network to simultaneously enforce plausible movements and noticeable emotions on the virtual agents. We also integrate our approach with an AR environment using a Microsoft HoloLens and can generate emotive gaits at interactive rates to increase the social presence. We evaluate how human observers perceive both the naturalness and the emotions from the generated gaits of the virtual agents in a web-based study. Our results indicate around 89% of the users found the naturalness of the gaits satisfactory on a five-point Likert scale, and the emotions they perceived from the virtual agents are statistically similar to the intended emotions of the virtual agents. We also use our network to augment existing gait datasets with emotive gaits and will release this augmented dataset for future research in emotion prediction and emotive gait synthesis.
Ginput: Fast hi-fi prototyping of gestural interactions in virtual reality
Jose Roberto Fonseca Junior, Jader Abreu, Lucas Silva Figueiredo, José Gomes Neto, Veronica Teichrieb, Jonysberg P. Quintino, Fabio Q. B. da Silva, Andre L M Santos, Helder de Sousa Pinho
Gestural interfaces in virtual reality (VR) expand the design space for user interaction, allowing spatial metaphors with the environment and more natural and immersive experiences. Typically, machine learning approaches recognize gestures with models that rely on a large number of samples for the training phase, which is an obstacle for rapidly prototyping gestural interactions. In this paper, we propose a solution designed for hi-fi prototyping of gestures within a virtual reality environment through a high-level Domain-Specific Language (DSL), as a subset of the natural language. The proposed DSL allows non-programmer users to intuitively describe a broad domain of poses and connect them for compound gestures. Our DSL was designed to be general enough for multiple input classes, such as body tracking, hand tracking, head movement, motion controllers, and buttons. We tested our solution for wands with VR designers and developers. Results showed that the tool gives non-programmers the ability to prototype gestures with ease and refine its recognition within a few minutes.
HuTrain: a Framework for Fast Creation of Real Human Pose Datasets
Ricardo Rossiter Barioni, Willams de Lima Costa, José André Carneiro Neto, Lucas Silva Figueiredo, Veronica Teichrieb, Jonysberg P. Quintino, Fabio Q. B. da Silva, Andre L M Santos, Helder de Sousa Pinho
Image-based body tracking algorithms are useful in several scenarios, such as avatar animation and gesture interaction for VR applications. In the last few years, the best-ranked solutions in the state of the art of body tracking (according to the most popular datasets in the field) have been intensively based on Convolutional Neural Network (CNN) algorithms and use large datasets for training and validation. Although these solutions achieve high precision scores when evaluated with some of these datasets, there are particular tracking challenges (for example, upside-down cases) that are not well modeled and, therefore, not correctly tracked. Instead of pursuing an all-in-one solution for all cases, we propose HuTrain, a framework for creating datasets quickly and easily. HuTrain comprises a series of steps, including automatic camera calibration, refined human pose estimation, and conversion to known dataset formats. We show that, with our system, the user can generate human pose datasets, targeting specific tracking challenges for the desired application context, with no need to annotate human pose instances manually.
HydrogenAR: Interactive Data-Driven Presentation of Dispenser Reliability
Matt Whitlock, Danielle Szafir, Kenny Gruchalla
When delivering presentations to a co-located audience, we typically use slides with text and 2D graphics to complement the spoken narrative. Though presentations have largely been explored on 2D media, augmented reality (AR) allows presentation designers to add data and augmentations to existing physical infrastructure on display. This could provide a more engaging experience to the audience and support comprehension. With HydrogenAR, we present a novel application that combines the benefits of data-driven storytelling with those of AR to explain the unique challenges of hydrogen dispenser reliability. Utilizing physical props, situated data, and virtual augmentations and animations, HydrogenAR serves as a unique presentation tool, particularly critical for stakeholders, tour groups, and VIPs. HydrogenAR is the product of multiple collaborative design iterations with a local hydrogen fuel research team and is assessed through interviews with team members and a user study with end-users evaluating the usability and quality of the interactive AR experience. Through this work, we provide design considerations for AR data-driven presentations and discuss how AR could be used for innovative content delivery beyond traditional slide-based presentations.
Industrial Augmented Reality: Concepts and User Interface Designs for Augmented Reality Maintenance Worker Support Systems
Jisu Kim, Mario Lorenz, Sebastian Knopp, Philipp Klimant
Maintenance departments of producing companies in most industrial countries are facing challenges originating from an aging workforce, increasing product variety, and the pressure to increase productivity. We present the concepts and the user interface (UI) designs for two Augmented Reality (AR) applications which help to tackle these issues. An AR Guidance System will allow new and inexperienced staff to perform medium to highly complex maintenance tasks, which they are currently incapable of. The AR Remote Service System enables technicians at the machine to establish a voice/video stream with an internal or external expert. The video stream can be augmented with 3D models and drawings so that problems can be solved remotely and more efficiently. A qualitative assessment with maintenance managers and technicians from three producing companies rated the AR application concepts as beneficial and the UI designs as very usable.
Intention to use an interactive AR app for engineering education
Alejandro Alvarez-Marin, J Ángel Velázquez-Iturbide, Mauricio Castillo-Vergara
An interactive AR app on electrical circuits was developed. The app allows the manipulation of circuit elements, computes the voltage and amperage values using the loop method, and applies Kirchhoff's voltage law. This research aims to determine students' intention to use the AR app. It also seeks to determine whether this intention is conditioned by how the survey is administered (online or face-to-face) or by the students' gender. The results show that the app is well evaluated by students in terms of intention to use. Regarding how the survey is administered, the attitude towards using the app does not present significant differences. In contrast, the students who completed the online survey presented a higher behavioral intention to use than those who participated in the guided laboratory. Regarding gender, women showed a higher attitude towards using and behavioral intention to use this technology compared to men.
Landmark-based mixed-reality perceptual alignment of medical imaging data and accuracy validation in living subjects
Christoph Leuze, Supriya Sathyanarayana, Bruce Daniel, Jennifer McNab
Medical augmented reality (AR) applications in which virtual renderings are aligned with the real world allow a medical caregiver wearing an AR headset to visualize the internal anatomy of the patient. Accurate alignment of virtual and real content is important for applications where the virtual rendering is used to guide a medical procedure such as surgery. Compared to 2D AR applications, where the alignment accuracy can be directly measured on the 2D screen, 3D medical AR applications require alignment measurements using phantoms and external tracking systems. In this paper we present an approach for landmark-based alignment, validation, and accuracy measurement of a 3D AR overlay of medical images on the real-world subject. This is done by performing an initial MRI scan to acquire a subject's head scan, an AR alignment task of the virtual rendering of the head MRI data to the subject's real-world head using virtual fiducials, and a second MRI scan to test the accuracy of the AR alignment task. We have performed these 3D medical AR alignment measurements on seven volunteers using a Magic Leap AR head-mounted display. Across all seven volunteers we measured an alignment accuracy of 4.7 ± 2.6 mm. These results suggest that such an AR application can be a valuable tool for guiding non-invasive transcranial magnetic brain stimulation treatment. The presented MRI-based accuracy validation will furthermore be an important and versatile tool to establish the safety of medical AR techniques.
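The reported alignment accuracy is a per-landmark distance aggregated across subjects; a short sketch of that computation is given below. The fiducial coordinates here are placeholders, not data from the study.

```python
# Illustrative per-landmark alignment error: Euclidean distance between
# fiducial positions from the initial MRI frame and the corresponding
# positions recovered from the validation MRI scan. Coordinates are made up.
import numpy as np

mri_fiducials = np.array([[12.1, 48.3, 102.6],    # mm, placeholder landmarks
                          [55.7, 47.9, 101.8],
                          [33.4, 90.2, 120.5]])
validation_fiducials = np.array([[15.0, 50.1, 104.9],
                                 [58.2, 49.0, 105.3],
                                 [36.8, 93.5, 123.0]])

errors = np.linalg.norm(mri_fiducials - validation_fiducials, axis=1)
print(f"per-landmark error (mm): {np.round(errors, 1)}")
print(f"mean ± std: {errors.mean():.1f} ± {errors.std():.1f} mm")
```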
Locomotive and Cognitive Trade-Offs for Target-based Travel
Chengyuan Lai, Afham Aiyaz, Ryan P. McMahan
Target-based travel has become a common travel metaphor for virtual reality (VR) applications. Three of the most common target-based travel techniques include Point-and-Instant-Teleport (Teleport), Point-and-Walk-Motion (Motion), and Automatic-Walk-Motion (Automatic). We present a study that employed a dual-task methodology to investigate the user performance characteristics and cognitive loads of the three target-based travel techniques, in addition to several subjective measures. Our results indicate that the Teleport technique afforded the best user travel performances, but that the Automatic technique afforded the best cognitive load.
MiXR: A Hybrid AR Sheet Music Interface for Live Performance
Shalva A. Kohen, Carmine Elvezio, Steven Feiner
Musicians face a number of challenges when performing live, including organizing and annotating sheet music. This can be an unwieldy process, as musicians need to simultaneously read and manipulate sheet music and interact with the conductor and other musicians. Augmented Reality can provide a way to ease some of the more cumbersome aspects of live performance and practice. We present MiXR, a novel interactive system that combines an AR headset, a smartphone, and a tablet to allow performers to intuitively and efficiently manage and annotate virtual sheet music in their physical environment. We discuss our underlying motivation, the interaction techniques supported, and the system architecture.
Mobile3DRecon: Real-time Monocular 3D Reconstruction on a Mobile Phone
Xingbin Yang, Liyang Zhou, Hanqing Jiang, Zhongliang Tang, Yuanbo Wang, Feng Pan, Hujun Bao, Guofeng Zhang
We present a real-time monocular 3D reconstruction system on a mobile phone, called Mobile3DRecon. Using an embedded monocular camera, our system provides an online mesh generation capability on the back end together with real-time 6DoF pose tracking on the front end for users to achieve realistic AR effects and interactions on mobile phones. Unlike most existing state-of-the-art systems, which produce only point-cloud-based 3D models online or surface meshes offline, we propose a novel online incremental mesh generation approach to achieve fast online dense surface mesh reconstruction to satisfy the demand of real-time AR applications. For each keyframe of 6DoF tracking, we perform a robust monocular depth estimation, with a multi-view semi-global matching method followed by a depth refinement post-processing. The proposed mesh generation module incrementally fuses each estimated keyframe depth map into an online dense surface mesh, which is useful for achieving realistic AR effects such as occlusions and collisions. We verify our real-time reconstruction results on two mid-range mobile platforms. The experiments with quantitative and qualitative evaluation demonstrate the effectiveness of the proposed monocular 3D reconstruction system, which can handle the occlusions and collisions between virtual objects and real scenes to achieve realistic AR effects.
Modeling Emotions for Training in Immersive Simulations (METIS): A Cross-Platform Virtual Classroom Study
Alban Delamarre, Cédric Buche, Christine Lisetti
Virtual training environments (VTEs) using immersive technology have been able to successfully provide training for technical skills. Combined with recent advances in virtual social agent technologies and in affective computing, VTEs can now also support the training of social skills. Research looking at the effects of different immersive technologies on users' experience (UX) can provide important insights about their impact on users' engagement with the technology, sense of presence, and co-presence. However, current studies do not address whether emotions displayed by virtual agents provide the same level of UX across different virtual reality (VR) platforms. In this study, we considered a virtual classroom simulator built for desktop computers and adapted for an immersive VR platform (CAVE). Users interact with virtual animated disruptive students able to display facial expressions, to help them practice their classroom behavior management skills. We assessed the effects of the VR platforms and of the display of facial expressions on presence, co-presence, engagement, and believability. Results indicate that users were engaged, found the virtual students believable, and felt presence and co-presence on both VR platforms. We also observed an interaction effect of facial expressions and VR platforms for co-presence (p = .018).
Optical Gaze Tracking with Spatially-Sparse Single-Pixel Detectors
Richard Li, Eric Whitmire, Michael Stengel, Ben Boudaoud, Jan Kautz, David Luebke, Shwetak Patel, Kaan Akşit
Gaze tracking is an essential component of next-generation displays for virtual reality and augmented reality applications. Traditional camera-based gaze trackers used in next-generation displays are known to be lacking in one or multiple of the following metrics: power consumption, cost, computational complexity, estimation accuracy, latency, and form factor. We propose the use of discrete photodiodes and light-emitting diodes (LEDs) as an alternative to traditional camera-based gaze tracking approaches while taking all of these metrics into consideration. We begin by developing a rendering-based simulation framework for understanding the relationship between light sources and a virtual model eyeball. Findings from this framework are used for the placement of LEDs and photodiodes. Our first prototype uses a neural network to obtain an average error rate of 2.67° at 400 Hz while demanding only 16 mW. By simplifying the implementation to use only LEDs, duplexed as light transceivers, and a more minimal machine learning model, namely a lightweight supervised Gaussian process regression algorithm, we show that our second prototype is capable of an average error rate of 1.57° at 250 Hz using 800 mW.
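A minimal sketch of the regression stage described for the second prototype is given below: a Gaussian process maps a vector of light-sensor intensities to two gaze angles. The number of sensors, the kernel, and the calibration data are assumptions for illustration only.

```python
# Illustrative Gaussian process regression from light-sensor intensity vectors
# to 2D gaze angles (degrees). Sensor count, kernel, and data are assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(42)
n_sensors, n_calib = 6, 200
X_calib = rng.uniform(0.0, 1.0, (n_calib, n_sensors))   # normalized intensities
y_calib = rng.uniform(-15.0, 15.0, (n_calib, 2))        # gaze (yaw, pitch) labels

gpr = GaussianProcessRegressor(kernel=RBF(length_scale=0.5) + WhiteKernel(1e-3),
                               normalize_y=True)
gpr.fit(X_calib, y_calib)                               # fit on calibration grid

gaze, std = gpr.predict(rng.uniform(0, 1, (1, n_sensors)), return_std=True)
print("predicted gaze (deg):", gaze.round(2), "uncertainty:", std.round(2))
```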
Real-Time Gait Reconstruction For Virtual Reality Using a Single Sensor
Tobias Feigl, Lisa Gruner, Christopher Mutschler, Daniel Roth
Embodying users through avatars based on motion tracking and reconstruction is an ongoing challenge for VR application developers. High-quality VR systems use full-body tracking or inverse kinematics to reconstruct the motion of the lower extremities and control the avatar animation. Mobile systems are limited to the motion sensing of head-mounted displays (HMDs) and typically cannot offer this. We propose an approach to reconstruct gait motions from a single head-mounted accelerometer. We train our models to map head motions to corresponding ground-truth gait phases. To reconstruct leg motion, the models predict gait phases to trigger equivalent synthetic animations. We designed four models: a threshold-based, a correlation-based, a Support Vector Machine (SVM)-based, and a bidirectional long short-term memory (BLSTM)-based model. Our experiments show that, while the BLSTM approach is the most accurate, only the correlation approach runs on a mobile VR system in real time with sufficient accuracy. Our user study with 21 test subjects examined the effects of our approach on simulator sickness and showed significantly less negative effects on disorientation.
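As an illustration of the correlation-based variant that the abstract reports as the only model running in real time on a mobile VR system, the sketch below correlates a window of vertical head acceleration against a step template and triggers a synthetic step animation when the correlation is high. The template, window length, and threshold are assumptions, not the authors' parameters.

```python
# Illustrative correlation-based gait-phase trigger from a single head-mounted
# accelerometer. The step template, window size, and threshold are assumptions.
import numpy as np

def step_template(n=50):
    """Idealized vertical head-acceleration profile of one step (placeholder)."""
    t = np.linspace(0, 2 * np.pi, n)
    return np.sin(t) * np.exp(-0.5 * ((t - np.pi) / 1.2) ** 2)

def correlates_with_step(window, template, threshold=0.7):
    """Return True when the normalized cross-correlation exceeds the threshold."""
    w = (window - window.mean()) / (window.std() + 1e-8)
    tmpl = (template - template.mean()) / (template.std() + 1e-8)
    return float(np.dot(w, tmpl) / len(w)) > threshold

template = step_template()
acc_z = np.sin(np.linspace(0, 2 * np.pi, 50)) + 0.05 * np.random.randn(50)  # fake sensor window

if correlates_with_step(acc_z, template):
    print("gait phase detected -> trigger synthetic step animation")
else:
    print("no step detected in this window")
```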
RGB-D-E: Event Camera Calibration for Fast 6-DOF Object Tracking
Etienne Dubeau, Mathieu Garon, Benoit Debaque, Raoul de Charette, Jean-François Lalonde
Augmented reality devices require multiple sensors to perform various tasks such as localization and tracking. Currently, popular cameras are mostly frame-based (e.g. RGB and depth), which imposes a high data bandwidth and power usage. With the necessity for low-power and more responsive augmented reality systems, using solely frame-based sensors imposes limits on the various algorithms that need high-frequency data from the environment. As such, event-based sensors have become increasingly popular due to their low power, bandwidth, and latency, as well as their very high-frequency data acquisition capabilities. In this paper, we propose, for the first time, to use an event-based camera to increase the speed of 3D object tracking in 6 degrees of freedom. This application requires handling very high object speed to convey compelling AR experiences. To this end, we propose a new system which combines a recent RGB-D sensor (Kinect Azure) with an event camera (DAVIS346). We develop a deep learning approach, which combines an existing RGB-D network along with a novel event-based network in a cascade fashion, and demonstrate that our approach significantly improves the robustness of a state-of-the-art frame-based 6-DOF object tracker using our RGB-D-E pipeline. Our code and our RGB-D-E evaluation dataset are available at https://github.com/lvsn/rgbde_tracking
Scale-aware Insertion of Virtual Objects in Monocular Videos
Songhai Zhang, Xiangli Li, Yingtian Liu, Hongbo Fu
In this paper, we propose a scale-aware method for inserting virtual objects with proper sizes into monocular videos. To tackle the scale ambiguity problem of geometry recovery from monocular videos, we estimate the global scale of objects in a video with a Bayesian approach incorporating the size priors of objects, where the scene object sizes should strictly conform to the same global scale and the possibilities of global scales are maximized according to the size distribution of object categories. To this end, we propose a dataset of sizes of object categories: Metric-Tree, a hierarchical representation of the sizes of more than 900 object categories with the corresponding images. To handle the incompleteness of objects recovered from videos, we propose a novel scale estimation method that extracts plausible dimensions of objects for scale optimization. Experiments have shown that our method for scale estimation performs better than state-of-the-art methods, and has considerable validity and robustness for different video scenes. Metric-Tree has been made available at: https://metric-tree.github.io
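The Bayesian scale estimation described above can be illustrated with a small maximum-a-posteriori sketch: every detected object's size in reconstruction units, multiplied by a candidate global scale, is scored against a log-normal size prior for its category, and the scale with the highest total score is kept. The categories and prior parameters below are placeholders, not values from Metric-Tree.

```python
# Illustrative MAP estimate of a global scene scale: each detected object's
# height in arbitrary reconstruction units, multiplied by the scale, should
# agree with a log-normal size prior for its category. Priors are made up.
import numpy as np

# (category, measured height in reconstruction units)
detections = [("chair", 1.8), ("monitor", 0.9), ("bottle", 0.5)]

# Placeholder priors: mean and std of log(height in meters) per category.
size_priors = {"chair": (np.log(0.85), 0.15),
               "monitor": (np.log(0.45), 0.20),
               "bottle": (np.log(0.25), 0.25)}

def log_posterior(scale):
    """Sum of log-normal log-likelihoods of the scaled object sizes."""
    lp = 0.0
    for cat, h in detections:
        mu, sigma = size_priors[cat]
        lp += -0.5 * ((np.log(h * scale) - mu) / sigma) ** 2
    return lp

scales = np.linspace(0.1, 2.0, 2000)                 # candidate global scales
best = scales[np.argmax([log_posterior(s) for s in scales])]
print(f"MAP global scale: {best:.3f} (reconstruction units -> meters)")
```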
Stepping over Obstacles with Augmented Reality based on Visual Exproprioception
Alessandro Luchetti, Edoardo Parolin, Isidro III Butaslac, Yuichiro Fujimoto, Masayuki Kanbara, Paolo Bosetti, Mariolino De Cecco, Hirokazu Kato
The purpose of this study is to analyze different kinds of exproprioceptive visual cues in an Augmented Reality (AR) system during gait exercise, and to understand which cues provide the best visualizations for stepping over obstacles. The main problem for users is to understand the position of virtual objects relative to themselves. Since visual exproprioception provides information about the body position in relation to the environment, it has the potential to yield positive effects with regard to position control and gait biomechanics in the AR system. This research was born out of a collaboration with the staff of Takanohara Central Hospital in Japan. Twenty-seven individuals were invited to take part in the user study to test three visual interfaces. The task of the user study involved having the subjects step over and avoid virtual obstacles of different heights that come from different directions. The AR application was implemented in the experiment using the Head-Mounted Display (HMD) Microsoft HoloLens. Data obtained from the experiment revealed that the interface projected in front of the user from a third-person point of view resulted in improvements in terms of posture, visual stimuli, and safety.
The Comfort Benefits of Gaze-Directed Steering
Chengyuan Lai, Xinyu Hu, Ann Segismundo, Ananya Phadke, Ryan P. McMahan
Spatial steering is a common virtual reality (VR) travel metaphor that affords virtual locomotion and spatial understanding. Variations of spatial steering include Gaze-, Hand-, and Torso-directed steering. We present a study that employed a dual-task methodology to investigate the user performance characteristics and cognitive loads of the three spatial steering techniques, in addition to several subjective measures. Using the two one-sided tests (TOST) procedure for dependent means, we have found that Gaze- and Hand-directed steering were statistically equivalent for travel performance and cognitive load. However, we found that Gaze-directed steering induced significantly less simulator sickness than Hand-directed steering.
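For readers unfamiliar with the TOST procedure for dependent means mentioned above, the sketch below runs the two one-sided tests on paired measurements; the equivalence bound and the sample values are placeholders, not the study's data.

```python
# Illustrative TOST (two one-sided tests) for paired samples: the two
# techniques are declared equivalent if the mean difference is significantly
# greater than -delta AND significantly smaller than +delta. Data are made up.
import numpy as np
from scipy import stats

gaze = np.array([21.3, 19.8, 22.5, 20.1, 18.9, 21.7, 20.4, 19.5])   # e.g. travel time (s)
hand = np.array([20.9, 20.3, 22.0, 20.6, 19.2, 21.1, 20.8, 19.9])
delta = 1.0                                                          # equivalence bound (s)

diff = gaze - hand
se = diff.std(ddof=1) / np.sqrt(len(diff))
df = len(diff) - 1

t_lower = (diff.mean() + delta) / se          # H0: mean difference <= -delta
t_upper = (diff.mean() - delta) / se          # H0: mean difference >= +delta
p_lower = 1 - stats.t.cdf(t_lower, df)
p_upper = stats.t.cdf(t_upper, df)

p_tost = max(p_lower, p_upper)                # both one-sided tests must reject
print(f"TOST p = {p_tost:.4f} -> {'equivalent' if p_tost < 0.05 else 'not shown equivalent'}")
```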
Towards Eyeglass-style Holographic Near-eye Displays with Statically Expanded Eyebox
Xinxing Xia, Yunqing Guan, Andrei State, Praneeth Chakravarthula, Tat-Jen Cham, Henry Fuchs
Holography is perhaps the only method demonstrated so far that can achieve a wide field of view (FOV) and a compact eyeglass-style form factor for augmented reality (AR) near-eye displays (NEDs). Unfortunately, the eyebox of such NEDs is impractically small (~<1mm). In this paper, we introduce and demonstrate a design for holographic NEDs with a practical, wide eyebox of ~10mm and without any moving parts, based on holographic lenslets. In our design, a holographic optical element (HOE) based on a lenslet array was fabricated as the image combiner with expanded eyebox. A phase spatial light modulator (SLM) alters the phase of the incident laser light projected onto the HOE combiner such that the virtual image can be perceived at different focus distances, which can reduce the vergence-accommodation conflict (VAC). We have successfully implemented a benchtop prototype following the proposed design. The experimental results show effective eyebox expansion to a size of ~10mm. With further work, we hope that these design concepts can be incorporated into eyeglass-size NEDs.
User-Aided Global Registration Method using Geospatial 3D Data for Large-Scale Mobile Outdoor Augmented Reality
Simon Burkard, Frank Fuchs-Kittowski
Accurate global camera registration is a key requirement for precise AR visualizations in large-scale outdoor AR applications. Existing approaches mostly use complex image-based registration methods requiring large pre-registered databases of geo-referenced images or point clouds that are hardly applicable to large-scale areas. In this paper, we present a simple yet effective user-aided registration method that utilizes common geospatial 3D data to globally register mobile devices. For this purpose, text-based 3D geospatial data including digital 3D terrain and city models is processed into small-scale 3D meshes and displayed in a live AR view. Via two common mobile touch gestures, the generated virtual models can be aligned manually to match the actual perception of the real-world environment. Experimental results show that, combined with a robust local visual-inertial tracking system, this approach enables an efficient and accurate global registration of mobile devices in various environments, determining the camera attitude with less than one degree of deviation while achieving a high degree of immersion through realistic occlusion behavior.
Virtual Reality in Education: A Case Study on Exploring Immersive Learning for Prisoners
Jonny Collins, Tobias Langlotz, Holger Regenbrecht
Our research presented here tries to bridge the gap between technology-oriented lab work and the praxis of introducing VR technology into contexts that are difficult to deploy to, in our case prisoners with high learning needs. We have developed a prototypical immersive VR application designed for delivering low-level literacy and numeracy content to illiterate adults. This development has been taken to the commercial sector and is currently under product development. The target population for the application are those currently held in a correctional facility, but who have the motivation and determination to educate themselves. In this paper we discuss the current life cycle of this project, including the development, initial tests, and an exploratory study we conducted. We conclude with a discussion of logistical issues, potential research opportunities, and current outcomes.
Walking and Teleportation in Wide-area Virtual Reality Experiences
Ehsan Sayyad, Misha Sra, Tobias Höllerer
Location-based or Out-of-Home Entertainment refers to experiences such as theme and amusement parks, laser tag and paintball arenas, roller and ice skating rinks, zoos and aquariums, or science centers and museums among many other family entertainment and cultural venues. More recently, location-based VR has emerged as a new category of out-of-home entertainment. These VR experiences can be likened to social entertainment options such as laser tag, where physical movement is an inherent part of the experience versus at-home VR experiences where physical movement often needs to be replaced by artificial locomotion techniques due to tracking space constraints. In this work, we present the first VR study to understand the impact of natural walking in a large physical space on presence and user preference. We compare it with teleportation in the same large space, since teleportation is the most commonly used locomotion technique for consumer, at-home VR. Our results show that walking was overwhelmingly preferred by the participants and teleportation leads to a significantly higher disorientation component in a standard simulator sickness questionnaire. The data also shows a trend towards higher self-reported presence for natural walking.
Session 2
"Kapow!": Augmenting Contact with Real and Virtual Objects Using Stylized Visual Effects
Victor Rodrigo MERCADO, Jean-Marie Normand, Anatole Lécuyer
In this poster we propose a set of stylized visual effects (VFX) meant to improve the sensation of contact with objects in Augmented Reality (AR). Various graphical effects have been conceived, such as virtual cracks, virtual wrinkles, or even virtual onomatopoeias inspired by comics. The VFX are meant to augment the perception of contact, with either real or virtual objects, in terms of material properties or contact location for instance. These VFX can be combined with a pseudo-haptics approach to further increase the range of simulated physical properties of the touched materials. An illustrative setup based on a HoloLens headset was designed, in which our proposed VFX could be explored. The VFX appear each time a contact is detected between the user's finger and one object of the scene. Such VFX-based approach could be introduced in AR applications for which the perception and display of contact information are important.
3D human model creation on a serverless environment
Peter Fasogbon
The creation of a realistic 3D human model is cumbersome, takes a long time, and is traditionally done by trained professionals. While computer vision technologies can generate human models in controlled environments, we demonstrate a pure mobile web approach for creating realistic human models from a few images captured using a smartphone camera. Our 3D reconstruction pipeline consists of various intermediate processes such as semantic human segmentation, human keypoint detection, and texture generation. The whole reconstruction process is containerized into a stateless function ready to be deployed to any serverless cloud. The use of a serverless architecture eases the building of such a multimedia service without any hard-coded structure. In addition, there is no need for a specialized mobile device with advanced hardware to obtain an accurate human model. Thanks to the proposed framework, anyone can easily generate 3D models on any kind of smartphone device without expert guidance, and in less than 3 minutes.
A Brain-Computer Interface and Augmented Reality Neurofeedback to Treat ADHD: A Virtual Telekinesis Approach
G S Rajshekar Reddy, Lingaraju G M
Attention-Deficit/Hyperactivity Disorder, or ADHD, poses a severe concern for today's youth, especially when the costs, efficacy, and side effects of medication and the lack of immediate risk discourage treatment. ADHD causes people to make impulsive decisions, making it harder to succeed in school, work, and other aspects of life. Neurofeedback Therapy has shown promising results as an alternative in treating mental disorders and improving cognition. It leverages the inherent mechanism of operant conditioning by presenting real-time feedback of the user's brainwave activity, which is usually acquired via an EEG. However, long sessions of monotonous feedback have proven tedious, and users lose motivation to continue. Engaging graphical interfaces and games have been developed to combat this issue and have been shown to improve the treatment's efficacy. In this work, we extend these methods to increase engagement by employing Augmented Reality in the context of a virtual telekinetic game. The system comprises three modules: an Emotiv headset for EEG acquisition, MATLAB for signal processing, and an AR mobile application to deliver the feedback. The hardware and software implementation, the signal processing methodology, and the Neurofeedback protocol are thoroughly outlined.
A Sense of Quality for Augmented Reality Assisted Process Guidance
Anes Redzepagic, Christoffer Löffler, Tobias Feigl, Christopher Mutschler
The ongoing automation of modern production processes requires novel human-computer interaction concepts that support employees in dealing with the unstoppable increase in time pressure, cognitive load, and the required fine-grained and process-specific knowledge. Augmented Reality (AR) systems support employees by guiding and teaching work processes. Such systems still lack a precise process quality analysis (monitoring), which is, however, crucial to close gaps in the quality assurance of industrial processes. We combine inertial sensors, mounted on work tools, with AR headsets to enrich modern assistance systems with a sense of process quality. For this purpose, we develop a Machine Learning (ML) classifier that predicts quality metrics from a 9-degrees-of-freedom inertial measurement unit, while we simultaneously guide and track the work processes with a HoloLens AR system. In our user study, 8 test subjects perform typical assembly tasks with our system. We evaluate the tracking accuracy of the system against a precise optical reference system and evaluate the classification of each work step's quality based on the collected ground truth data. Our evaluation shows a tracking accuracy for fast dynamic movements of 4.92 mm, and our classifier predicts the actions carried out with a mean F1 score of 93.8%.
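As a rough illustration of this kind of pipeline, the sketch below extracts simple window-level statistics from a 9-DoF IMU stream and trains an off-the-shelf classifier on synthetic data; the features, window length, and classifier choice are assumptions, not the classifier described in the abstract.

```python
# Illustrative sketch: window-level features from a 9-DoF IMU stream feeding a
# standard classifier, as a stand-in for the quality classifier described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(window):
    """window: (T, 9) array of accel/gyro/magnetometer samples."""
    return np.concatenate([window.mean(axis=0),
                           window.std(axis=0),
                           np.abs(np.diff(window, axis=0)).mean(axis=0)])

rng = np.random.default_rng(0)
# Synthetic stand-in data: 200 windows of 100 samples x 9 channels, binary quality label.
X = np.stack([window_features(rng.normal(size=(100, 9))) for _ in range(200)])
y = rng.integers(0, 2, size=200)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```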
A User Study on AR-assisted Industrial Assembly
Jan Schmitt, Bastian Engelmann, Uwe Sponholz, Florian Schuster
The utilization of modern assistance systems, e.g., Augmented Reality (AR), has reached industrial assembly scenarios. Besides the technical realization of AR assistance in the assembly scene, the worker has to accept the new technology. Only the combination of user acceptance and technical user-interface design leads to an optimized overall system. Hence, this contribution gives a brief literature overview and analysis of AR acceptance and acceptance modeling. Then, a proprietary model for acceptance measurement is developed, which includes and synthesizes previous models (TAM and UTAUT) and simplifies them considerably for the purpose of industrial assembly. Subsequently, a laboratory experiment is set up in the FHWS c-Factory, a smart, IoT-based production environment. A survey and an assembly cycle time measurement are conducted to collect data to characterize AR assistance. The study participants assemble a toy truck once without and once with AR support. The evaluation shows that the mean assembly time decreases. The results also show that AR is accepted by the participants as a support for their work.
A Virtual Morris Water Maze to Study Neurodegenerative Disorders
Daniel Roth, Christian Felix Purps, Wolf-Julian Neumann
Navigation is a crucial cognitive skill that allows humans and animals to move from one place to another without getting lost. In neurological patients this skill can be impaired, when neural structures that form the brain networks important for spatial learning are impaired. Thus, spatial navigation represents an important measure of cognitive health that is impossible to test in a clinical examination, due to lack of space in examination rooms. Consequently, spatial navigation is largely neglected in the clinical assessment of neurological, neurosurgical and psychiatric patients. Virtual reality represents a unique opportunity to develop a systematic assessment of spatial navigation for diagnosis and therapeutic monitoring of millions of patients presenting with cognitive decline in the clinical routine. Therefore, we have adapted a classical spatial navigation paradigm that was developed for animal research, the “Morris Water Maze”, as an openly available Virtual Reality (VR) application that allows objective quantification of navigational skills in humans. This tool may be used in the future to aid the assessment of the human navigation system in health and neurological disease.
An Exploratory Study for Designing Social Experience of Watching VR Movies Based on Audience’s Voice Comments
Shuo Yan, Wenli Jiang, Menghan Xiong, Xukun Shen
Social experience is important when audiences are watching movies. Virtual reality (VR) movies engage audiences through an immersive environment and interactive narrative. However, VR headsets restrict the audience to an individual experience, which disrupts the potential for shared social realities. In our study, we propose an approach to design an asynchronous social experience that allows the participant to receive other audiences' voice comments (such as their opinions, impressions, or emotional reactions) in VR movies. We measured the participants' feedback on their engagement levels, recall abilities, and social presence. The results showed that in a VR-Voice Comment (VR-VC) movie, the audience's voice comments could affect the participant's engagement and the recall of information in the scenes. The participants obtained social awareness and enjoyment at the same time. A few of them were worried, mainly because of the potential auditory clutter resulting from unpredictable voice comments. We discuss the design implications of this and directions for future research. Overall, we observe a positive tendency in watching VR-VC movies, which could be adopted in future VR movie experiences.
AR-Chat: an AR-based instant messaging system
Pierrick Jouet, Anthony Laurent, Tao Luo, Vincent Alleaume, Matthieu Fradet, Caroline Baillard
We describe a multi-user system enabling instant messaging in Augmented Reality. A user can get in contact with another user without requiring his/her identification number and can easily locate the person initiating the contact. It is also possible to exchange various types of personal information in a private manner. This innovative type of social interaction can significantly increase consumer interest in AR experiences.
Augmented illusionism. The influence of optical illusions through artworks with augmented reality
Borja Jaume Pérez
Studies of optical illusions in artistic practice usually focus on illusionist painting, placing its conceptualization and methodology within the framework of the pictorial tradition. However, few investigations have dealt with the phenomena of erroneous motion perception, color, anamorphosis, or trompe l'oeil in artistic pieces created through the use of augmented reality. This offers a new field of study through which to explore optical illusions, hereby considered erroneous and unconscious perceptions of the actual physical characteristics of an image or object, through the hybridization of real and virtual elements.
Automatic Generation of Diegetic Guidance in Cinematic Virtual Reality
Chong Cao, Zhaowei Shi, Miao Yu
One of the advantages of cinematic virtual reality over traditional cinema is that viewers can explore freely in the virtual space. However, such freedom leads to the problem of missing key events while watching. Therefore, directional visual guidance in the virtual space is vital for the viewer to follow the story line and capture key events. Generating moving objects is one of the most commonly used forms of diegetic guidance, which can implicitly guide the viewer's attention in the virtual space. In this paper, we investigate the formulation of key events in cinematic virtual reality with an event-of-interest script. Based on this formulation, we analyze the factors influencing diegetic guidance and propose an automatic generation approach named Dynamic Diegetic Guidance. Dynamic Diegetic Guidance maintains viewers' perception of immersion and takes less time than fixed guidance, which makes the viewing experience more informative and entertaining.
Comparing Single-modal and Multimodal Interaction in an Augmented Reality System
Zhimin Wang, Huangyue Yu, Haofei Wang, Zongji Wang, Feng Lu
Multimodal interaction is expected to offer a better user experience in Augmented Reality (AR), and thus has become a recent research focus. However, due to the lack of hardware-level support, most existing works only combine two modalities at a time, e.g., gesture and speech. Gaze-based interaction techniques have been explored for screen-based applications, but have rarely been used in AR systems. In this paper, we propose a multimodal interactive system that integrates gaze, gesture, and speech in a flexibly configurable augmented reality system. Our lightweight head-mounted device supports accurate gaze tracking, hand gesture recognition, and speech recognition simultaneously. More importantly, the system can be easily configured into different modality combinations to study the effects of different interaction techniques. We evaluated the system in a table lamp scenario and compared the performance of different interaction techniques. The experimental results show that Gaze+Gesture+Speech is superior in terms of performance.
Decoupled Localization and Sensing with HMD-based AR for Interactive Scene Acquisition
Soeren Skovsen, Harald Haraldsson, Abe Davis, Henrik Karstoft, Serge Belongie
Real-time tracking and visual feedback make interactive AR-assisted capture systems a convenient and low-cost alternative to specialized sensor rigs and robotic gantries. We present a simple strategy for decoupling localization and visual feedback in these applications from the primary sensor being used to capture the scene. Our strategy is to use an AR HMD and a 6-DOF controller for tracking and feedback, synchronized with a separate primary sensor for capturing the scene. This approach allows for convenient real-time localization of sensors that cannot do their own localization. In this short paper, we present a prototype implementation of this strategy and investigate the accuracy of decoupled tracking by comparing runtime pose estimates to the results of high-resolution offline SfM.
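A minimal sketch of the synchronization such decoupling implies: resampling the HMD/controller pose track at the primary sensor's capture timestamps, with linear interpolation for position and slerp for orientation. The interpolation scheme and the synthetic pose track are assumptions, not taken from the authors' prototype.

```python
# Sketch: resample a 6-DOF pose track (from the HMD/controller) at the capture
# timestamps of a separate primary sensor. Positions are interpolated linearly,
# orientations with quaternion slerp via SciPy.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# Hypothetical pose track: timestamps (s), positions (N, 3), orientations (N rotations).
t_track = np.linspace(0.0, 1.0, 11)
positions = np.column_stack([t_track, np.zeros(11), np.zeros(11)])
rotations = Rotation.from_euler("z", np.linspace(0, 90, 11), degrees=True)

slerp = Slerp(t_track, rotations)

def pose_at(t_capture):
    """Pose assigned to the primary sensor's frame captured at time t_capture."""
    p = np.array([np.interp(t_capture, t_track, positions[:, k]) for k in range(3)])
    r = slerp([t_capture])[0]
    return p, r

p, r = pose_at(0.37)
print(p, r.as_euler("xyz", degrees=True))
```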
Design preferences on Industrial Augmented Reality: a survey with potential technical writers
Michele Gattullo, Lucilla Dammacco, Francesca Ruospo, Alessandro Evangelista, Michele Fiorentino, Jan Schmitt, Antonio E. Uva
The research presented in this contribution aims to investigate user preferences about how to convey information to the user in Industrial Augmented Reality (IAR) interfaces. Our interest is focused on the opinion of potential technical writers of IAR documentation for assembly or maintenance operations. Authoring of IAR interfaces implies a choice among various visual assets, which is influenced by the information type and the AR display used. There are no specific standards to follow, and it is challenging to extract guidelines from the literature. This study gathers the preferences of 105 selected users who have knowledge of IAR issues, graphical user interface (GUI) design, and assembly/maintenance procedures. The results of this survey show a strong preference for 3D CAD models of components (product model) for almost all the information types. However, some alternative visual assets have also been proposed, such as video and auxiliary models. Contrary to common practice in industry, text was the least preferred visual asset. The insights from this research can help other IAR technical writers in the authoring of their interfaces.
Distortion Correction Algorithm of AR-HUD Virtual Image based on Neural Network Model of Spatial Continuous Mapping
Ke Li, Ling Bai, Yinguo Li, Zhongkui Zhou
We propose a distortion correction framework for the AR-HUD virtual image based on a multilayer feedforward neural network model (MFNN) and spatial continuous mapping (SCM). First, we put forward the concept and calculation method of the equivalent plane of the virtual image in the AR-HUD system. We then construct a network structure named MFNN-SCM and train a network model that can predict the vertex coordinates and pre-distortion map of the equivalent plane of the AR-HUD virtual image. At runtime, the driver's eye position is obtained, and the network model calculates the virtual image projection mapping relationship for the current eye position. Finally, the virtual image projection mapping relationship is used to pre-distort the AR-HUD projected image, and the pre-distorted image is projected to improve the AR-HUD imaging effect observed by the driver. In addition, we have embedded the framework into the AR-HUD system of intelligent vehicles and tested it in a real vehicle. The results show that the projected virtual image exhibits only a small relative pixel drift at any eye position. While ensuring the real-time performance of the algorithm, our method offers greater flexibility, higher accuracy, and lower cost than other existing methods.
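As a schematic of the pre-distortion idea, the sketch below uses a small feedforward network to map the driver's eye position to a coarse warp grid that could then be applied to the image before projection; the network size, grid resolution, and training step are illustrative assumptions rather than the MFNN-SCM architecture itself.

```python
# Sketch: a small feedforward network mapping a 3D eye position to a coarse
# pre-distortion grid (H x W x 2 pixel offsets). Sizes and training are illustrative.
import torch
import torch.nn as nn

GRID_H, GRID_W = 8, 16  # coarse warp grid resolution (assumed)

model = nn.Sequential(
    nn.Linear(3, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, GRID_H * GRID_W * 2),
)

def predistortion_grid(eye_pos):
    """eye_pos: tensor of shape (3,) -> warp grid of shape (GRID_H, GRID_W, 2)."""
    return model(eye_pos).view(GRID_H, GRID_W, 2)

# Toy training step against a (synthetic) calibrated warp grid for one eye position.
eye = torch.tensor([0.05, 1.20, 0.60])
target = torch.zeros(GRID_H, GRID_W, 2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(predistortion_grid(eye), target)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```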
Effects of Augmented Content’s Placement and Size on User's Search Experience in Extended Displays
Hyunjin Lee, Woontack Woo
Using an augmented reality head-mounted display (HMD) to extend the display of a smartphone could open up new user interface solutions, benefiting users with increased screen real estate and visualisations that leverage each device's capabilities. Some previous works have explored the viability of extended displays, but knowledge regarding the relevant design factors and constraints for such displays is still very limited. In our work, we conducted an exploratory study to investigate how different properties of augmented content in extended displays affect the task performance and subjective workload of the user. Through an experiment with 24 participants, we compared four augmented content placements (top, right, left, and bottom) and two augmented content sizes (small and large). The study results demonstrated that both the placement and the size of augmented content have significant effects on task performance and subjective workload. Based on the findings of our study, we provide design implications for the future design of extended displays: when extending the screen of the smartphone with HMDs, 1) augmentations should not occlude the dominant hand performing interactions on the smartphone, 2) placing information at the bottom of the smartphone should be avoided, and 3) augmented content should be provided at a size where both the augmented content and the smartphone content can be viewed together.
Effects of Background Complexity and Viewing Distance on an AR Visual Search Task
Hyunjin Lee, Sunyoung Bang, Woontack Woo
Information in augmented reality (AR) consists of virtual and real contexts and is presented in the AR environment at different distances for each user. Therefore, it is important to understand how the integration of these two types of information influences the user's AR experience. However, little has been studied regarding the complexity and viewing distance of the real space. Our study investigates how the complexity of the physical environment and the viewing distance influence workload and performance in a visual search task in an AR environment. We conducted an experiment in which participants performed conjunction search under three different levels of background complexity, at both near (1.5 m) and far (3 m) distances. The results indicated that as the complexity of the background increased, the users' performance time and workload were negatively impacted. In addition, when the distance between the user and the background was greater, search time increased. From the results of the study, we derived some recommendations for the design of AR interfaces. Our research contributes to the design of interfaces by demonstrating the need to consider background complexity and viewing distance.
EmnDash: M-sequence Dashed Markers on Vector-based Laser Projection for Robust High-speed Spatial Tracking
Ryota Nishizono, Tomohiro Sueishi, Masatoshi Ishikawa
Camera pose estimation is commonly used for augmented reality, and it is currently expected to be integrated into sports assistant technologies. However, conventional methods face difficulties in simultaneously achieving fast estimation in milliseconds or less for sports, bright outdoor lighting environments, and capture of large activity areas. In this paper, we propose EmnDash, M-sequence dashed markers on vector-based laser projection for an asynchronous high-speed dynamic camera, which provides both a graphical information display for humans and markers for the wearable high-speed camera with a high S/N ratio from a distance. One of the main notions is drawing a vector projection image with a single stroke using two dashed lines as markers. The other involves embedding the binary M-sequence as the length of each dashed line and its recognition method using locality. The recognition of the M-sequence dashed line requires only a one-shot image, which increases the robustness of tracking both in terms of camera orientation and occlusion. We experimentally confirm an increase in recognizable postures, sufficient tracking accuracy, and low computational cost in the evaluation of a static camera. We also show good tracking ability and demonstrate immediate recovery from occlusion in the evaluation of a dynamic camera.
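For background, a binary M-sequence of the kind embedded in the dash lengths can be generated with a linear-feedback shift register; the sketch below produces the length-15 sequence of a 4-bit register. The register length and taps are illustrative and not necessarily those used by EmnDash.

```python
# Sketch: generate a binary M-sequence with a Fibonacci LFSR.
# A 4-bit register with feedback polynomial x^4 + x^3 + 1 yields a
# maximal-length sequence of 2^4 - 1 = 15 bits.
def m_sequence(n_bits=4, taps=(4, 3), seed=0b1000):
    """taps are given as polynomial exponents; seed must be nonzero."""
    state = seed
    out = []
    for _ in range(2 ** n_bits - 1):
        out.append(state & 1)                        # output the lowest bit
        fb = 0
        for t in taps:
            fb ^= (state >> (n_bits - t)) & 1        # tap t reads bit index n_bits - t
        state = (state >> 1) | (fb << (n_bits - 1))  # shift right, feedback enters at the top
    return out

seq = m_sequence()
print(seq)
# Every length-4 window of the (cyclic) sequence is distinct, which is what allows a
# one-shot observation of a few consecutive dashes to localize its position in the code.
```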
Evaluate Optimal Redirected Walking Planning Using Reinforcement Learning
Ko Tsai-Yen, Su Li-wen, Chang Yuchen, Keigo Matsumoto, Takuji Narumi, Michitaka Hirose
Redirected Walking (RDW) is commonly used to overcome the limitations of real walking locomotion while exploring virtual worlds. Although a few machine learning-based RDW algorithms have been proposed, most of these systems did not go through live user evaluation. In this work, we evaluated a novel RDW controller proposed by Chang et al., in which the formatted steering rule is replaced with reinforcement learning (RL), through simulation and a live user experiment. We found that the RL-based RDW controller reduced boundary collisions significantly in both the simulation and the user study compared to the heuristic algorithm Steer-to-Center (S2C); also, there were no noticeable differences in immersiveness. These results indicate that the novel controller is superior to the heuristic method. Furthermore, as we conducted the experiments in a relatively simple space and still outperformed the heuristic method, we are optimistic that the RL-based controller can maintain its high performance in more complicated scenarios in the future.
Flower Factory: A Component-based Approach for Rapid Flower Modeling
Siyuan Wang, Junjun Pan, Junxuan Bai, Jinglei Wang
Rapid 3D object modeling is of great importance for enriching digital content and is one of the essential tasks in VR/AR research. Flowers are frequently utilized in real-time applications, such as video games and virtual reality scenes. Technically, generating a realistic flower using existing 3D modeling software is complicated and time-consuming for designers. Moreover, it is difficult to create imaginary and surreal flowers, which might be more interesting and attractive for artists and game players. In this paper, we propose a component-based framework for rapid flower modeling, called Flower Factory. The flowers are assembled from different components, e.g., petals, stamens, receptacles, and leaves. The shapes of these components are created using simple primitives such as points and splines. After the shapes of the models are determined, the textures are synthesized automatically based on a predefined mask, according to a number of rules derived from real flowers. The entire fabrication can be controlled by several parameters, which describe the physical attributes of the flowers. Our technique is capable of producing a variety of flowers rapidly. Even novices without any modeling skills are able to control and model the 3D flowers. Furthermore, thanks to its low computational cost, the developed system will be integrated into a lightweight smartphone application.
Industrial Augmented Reality: 3D-Content Editor for Augmented Reality Maintenance Worker Support System
Mario Lorenz, Sebastian Knopp, Jisu Kim, Philipp Klimant
Supporting maintenance with 3D-object-enhanced instructions is one of the key applications of Augmented Reality (AR) in industry. For the breakthrough of AR in maintenance, it is important that the technicians themselves can create AR instructions and perform the challenging task of placing 3D objects, as they know best how to perform a task and what information needs to be displayed. For this challenge, a 3D-content editor is presented in which, as a first step, the 3D objects can be roughly placed using a 2D image of the machine, thereby limiting the time required to access the machine. In a second step, the positions of the 3D objects can be fine-tuned at the machine site using live footage. The key challenges were to develop an easily accessible UI that requires no prior knowledge of AR content creation, in a tool that works both with live footage and with images, and that is usable with a touch screen as well as keyboard and mouse. The 3D-content editor was qualitatively assessed by technicians, revealing its general applicability, but also the considerable time required to gain the necessary experience for positioning 3D objects.
Investigating Three-dimensional Directional Guidance With Nonvisual Feedback with Target Searching Task
SeungA Chung, Kyungyeon Lee, Uran Oh
While directional guidance is essential for spatial navigation, little has been studied about providing nonvisual cues in 3D space for individuals who are blind or have limited visual acuity. To understand the effects of different nonvisual feedback for 3D directional guidance, we conducted a user study with 12 blindfolded participants. They were asked to search for a virtual target in a 3D space with a laser pointer as quickly as possible under 6 different feedback designs, varying the feedback mode (beeping vs. haptic vs. beeping+haptic) and the presence of stereo sound. Our findings show that beeping sound feedback, with and without haptic feedback, outperforms the mode where only haptic feedback is provided. We also found that stereo sound feedback generated from the target significantly improves both the task completion time and the travel distance. Our work can help people who are blind or have limited visual acuity understand directional guidance in a 3D space.
LCR-SMPL: Toward Real-time Human Detection and 3D Reconstruction from a Single RGB Image
Elena Peña-Tapia, Ryo Hachiuma, Antoine Pasquali, Hideo Saito
This paper presents a novel method for simultaneous human detection and 3D shape reconstruction from a single RGB image. It offers a low-cost alternative to existing motion capture solutions, allowing realistic human 3D shapes and poses to be reconstructed by leveraging the speed of an object-detection-based architecture and the broad applicability of a parametric human mesh model. Evaluation results using a synthetic dataset show that our approach is on par with conventional 3D reconstruction methods in terms of accuracy, and outperforms them in terms of inference speed, particularly in the case of multi-person images.
Learning Bipartite Graph Matching for Robust Visual Localization
Hailin Yu, Weicai Ye, Youji Feng, Guofeng Zhang, Hujun Bao
2D-3D matching is an essential step for visual localization, where the accuracy of the camera pose is mainly determined by the quality of 2D-3D correspondences. The matching is typically achieved by nearest neighbor search of local features. Many existing works have shown impressive results in both efficiency and accuracy. Recently emerged learning-based features further improve robustness compared to traditional hand-crafted ones. However, it is still not easy to establish enough correct matches in challenging scenes with illumination changes or repetitive patterns, due to the intrinsically local nature of the features. In this work, we propose a novel method to deal with 2D-3D matching in a very robust way. We first establish as many potentially correct matches as possible using local similarity. Then we construct a bipartite graph and use a deep neural network, referred to as the Bipartite Graph Network (BGNet), to extract global geometric information. The network predicts the likelihood of being an inlier for each edge and outputs the globally optimal one-to-one correspondences with a Hungarian pooling layer. The experiments show that the proposed method is able to find more correct matches and improves localization in both robustness and accuracy. The results on multiple visual localization datasets are clearly better than the existing state of the art, which demonstrates the effectiveness of the proposed method.
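The final one-to-one selection can be pictured independently of the network: given per-edge inlier likelihoods between 2D features and 3D points, a global assignment picks the best one-to-one correspondences. The sketch below uses SciPy's linear_sum_assignment on a synthetic score matrix; it stands in for, and is not, the paper's Hungarian pooling layer.

```python
# Sketch: turn per-edge inlier likelihoods into one-to-one 2D-3D correspondences
# with a global assignment, analogous in spirit to a Hungarian pooling step.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
likelihood = rng.random((6, 8))          # rows: 2D features, cols: 3D points (synthetic)

rows, cols = linear_sum_assignment(likelihood, maximize=True)
matches = [(i, j) for i, j in zip(rows, cols) if likelihood[i, j] > 0.5]  # keep confident pairs
print(matches)
```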
Living with Rules: An AR Approach
Vinu Kamalasanan, Monika Sester
The social distancing rule has proven to be an effective measure against the spread of the infectious COronaVIrus Disease 2019 (COVID-19). Even with a lot of research focusing on static-camera-based solutions for monitoring the rule, the real issue of visualising and monitoring rules for public spaces still remains an open question. In this work we propose a Social Distancing Helmet (SDH), a basic prototype of an outdoor augmentation system that uses body-worn sensors to visualise and monitor rules for shared spaces using AR. First results with some software components of the prototype are presented.
Lower Limb Balance Rehabilitation of Post-stroke Patients Using an Evaluating and Training Combined Augmented Reality System
Shuwei Che, Ben Hu, Yang Gao, Zhiping Liao, Jianhua Li, Aimin Hao
Augmented/virtual reality applications can provide an immersive and interactive virtual environment for motor rehabilitation using the collaborative stimulation of multiple sensory channels such as sight, hearing, and movement, and can enhance the rehabilitation effect through repetition, feedback, and encouragement. In this paper, we propose an evaluating and training integrated application for the rehabilitation of patients with lower limb balance disorders. The AR-based evaluation module visualizes the limits of the patients' lower limb balance abilities and provides quantitative data to their therapists, so that rehabilitation therapists can customize personalized VR training games accordingly.
LSFB: A Low-cost and Scalable Framework for Building Large-Scale Localization Benchmark
Haomin Liu, Mingxuan Jiang, Zhuang Zhan, Xiaopeng Huang, Linsheng Zhao, Meng Hang, Youji Feng, Hujun Bao, Guofeng Zhang
With the rapid development of mobile sensors, network infrastructure, and cloud computing, the scale of AR application scenarios is expanding from small or medium scale to large-scale environments. Localization in large-scale environments is a critical requirement for AR applications. Most of the commonly used localization techniques require a large amount of data with groundtruth localization for algorithm benchmarking or model training. The existing groundtruth collection methods can only be used outdoors, or require quite expensive equipment or special deployments in the environment, and are thus not scalable to large-scale environments or to the mass production of groundtruth data. In this work, we propose LSFB, a novel low-cost and scalable framework to build a localization benchmark in large-scale environments with groundtruth poses. The key is to build an accurate HD map of the environment. For each visual-inertial sequence captured in it, the groundtruth poses are obtained by joint optimization over both the HD map and visual-inertial constraints. The experiments demonstrate that the obtained groundtruth poses are accurate enough for AR applications. We use the proposed method to collect a dataset of both mobile phones and AR glasses exploring large-scale environments, and will release the dataset as a new localization benchmark for AR.
Machine Intelligence Matters: Rethink Human-Robot Collaboration Based on Symmetrical Reality
Zhenliang Zhang, Xuejiao Wang
Human-robot collaboration can be valuable in challenging tasks. Previous research only considers human-centered systems, but much changes in symmetrical reality systems because there are two perceptual centers in symmetrical reality. In this paper, we introduce the content of symmetrical reality-based human-robot collaboration and interpret human-robot collaboration from the perspective of equivalent interaction. By analyzing task definition in symmetrical reality, we present the special features of human-robot collaboration. Furthermore, there are many fields in which symmetrical reality can produce a remarkable effect; we list only some typical applications, such as service robots, remote training, interactive exhibitions, digital assistants, companion robots, the immersive entertainment community, and so forth. The current situation and future development of this framework are also analyzed to provide guidance for researchers.
Multi-feature 3D Object Tracking with Adaptively-Weighted Local Bundles
Jiachen Li, Fan Zhong, Xueying Qin
3D object tracking with monocular RGB images faces many challenges in real environments. The popular color- and edge-based methods, although well studied, are still known to be limited in handling specific cases. We observed that color and edge features are complementary across different cases, and thus propose to fuse them to improve tracking robustness. To optimize the combination and to cope with inconsistency between color and edge features, we propose to fuse the different energy terms with respect to a set of local bundles. Each bundle represents a local region containing a set of pixel locations for computing the color and edge energies, in which the two energy terms are adaptively weighted to exploit the advantages of each. Experiments show that the proposed method can improve accuracy in challenging cases, especially under changing lighting and similar-color conditions.
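As a schematic of the fusion idea, the snippet below combines a per-bundle color energy and edge energy with adaptive weights derived from each term's current magnitude; the specific weighting rule is an assumption for illustration, not the paper's exact formulation.

```python
# Sketch: per-bundle adaptive weighting of color and edge energy terms.
import numpy as np

def fused_energy(color_energy, edge_energy, eps=1e-8):
    """color_energy, edge_energy: per-bundle residual energies, shape (n_bundles,).
    Each bundle weights the two cues inversely to their current magnitude, so the
    locally more consistent (lower-energy) cue dominates that bundle."""
    w_color = 1.0 / (color_energy + eps)
    w_edge = 1.0 / (edge_energy + eps)
    total = w_color + w_edge
    w_color, w_edge = w_color / total, w_edge / total
    return np.sum(w_color * color_energy + w_edge * edge_energy)

# Toy example: three bundles; the color cue is unreliable in the last one.
color = np.array([0.2, 0.3, 5.0])
edge = np.array([0.4, 0.2, 0.3])
print(fused_energy(color, edge))
```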
NIID-Net: Adapting Surface Normal Knowledge for Intrinsic Image Decomposition in Indoor Scenes
Jundan Luo, Zhaoyang Huang, Yijin Li, Xiaowei Zhou, Guofeng Zhang, Hujun Bao
Intrinsic images (i.e., a reflectance image and a shading image) are used in some augmented reality applications to improve immersion, because better visual coherence between virtual components and real scenes can be achieved if we edit the input image via its intrinsic images. Intrinsic image decomposition estimates intrinsic images from a single input image. The main challenge is that the decomposition equation is ill-posed, especially in indoor scenes where lighting conditions are complicated and spatially varying. We propose the NIID-Net (Normal-Injected Intrinsic Image Decomposition Network) to alleviate the ambiguities by adapting surface normal knowledge, for which training data is relatively more abundant and low-cost. Instead of directly estimating the shading image, we propose to first estimate a normal adapting map, a mid-level feature map that encodes spatially varying lighting conditions and reconstructs shading together with the predicted surface normals. Besides, we propose normal feature adapters to propagate pre-trained geometry knowledge from the normal estimation network into intrinsic image decomposition. Our framework significantly reduces misinterpreted texture variation in the estimated shading images while recovering reasonable shading variation. In terms of both visual comparison and numerical accuracy, our NIID-Net outperforms all previous works in shading estimation and achieves competitive performance in reflectance estimation.
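For reference, the standard intrinsic image model underlying this decomposition expresses the observed image as the pixel-wise product of reflectance and shading, which is what makes the inverse problem ill-posed (two unknowns per pixel from one observation):

```latex
% Standard intrinsic image model, per pixel p:
I(p) = R(p)\, S(p), \qquad \log I(p) = \log R(p) + \log S(p)
% with reflectance R and shading S; recovering both from I alone is under-constrained.
```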
Perceptions of Integrating Augmented Reality into Network Cabling Tutors
Bradley Herbert, Grant Wigley, Barrett Ens, Mark Billinghurst
As networks become increasingly complex, professionals must familiarise themselves with the cabling rack to administer the network effectively. In line with this industry goal, we compared the usability, task load, and perceptions of three similar network cabling tutoring systems: (1) a hand-held AR-based cabling tutor (HAR); (2) a head-mounted AR-based cabling tutor (HMD); and (3) a 2D-based cabling tutor (HH). While the usability of different modalities has been compared previously, none of those comparisons used knowledge modelling approaches. In our comparison, each tutor therefore uses knowledge space modelling (KSM) approaches to detect learner mistakes and show arrows on the rack indicating the source of the mistake. While adding an AR sub-system to a tablet-based network cabling tutor may not necessarily improve usability, participants reported higher engagement in the HMD condition than with the AR hand-held tablet. Several potential reasons were identified to account for this effect, such as the AR sub-system's influence on learning perceptions, the additional physical effort of needing to point the tablet at the rack, and perceived performance degradation.
Pleistocene Crete: A narrative, interactive mixed reality exhibition that brings prehistoric wildlife back to life
Konstantinos Cornelis Apostolakis, George Margetis, Constantine Stephanidis
This paper describes a three-part interactive museum exhibition targeted at the domain of Natural History, where various technological components are utilized to deliver a compelling Mixed Reality experience for the museum's visitors. The goal is to create new educational pathways for both adults and children wishing to learn about prehistoric life on the island of Crete, while simultaneously attracting a broader audience and maintaining its engagement with the museum's digital content for longer periods of time. In these experiences, holographic technology, diminished reality, and multiview 3D reconstruction are combined with fact/fantasy storytelling, utilizing cost-efficient state-of-the-art solutions.
PRISME: An interaction model linking domain activities and mixed and tangible interactors in virtual environments
Jean-Michel FAZZARI, Ronan Querrec, Sébastien Kubicki
The PRISME model introduced in this article is part of ongoing research on VR and AR for ergonomics and the design of industrial operator platforms. In this context, PRISME is an innovative solution that provides an automatic link (removing the need for domain-specific adaptations) between the tasks operators perform on their platform and their interactions with it. This research will be presented in two stages: a generic topology of interactors in Mixed and Tangible reality, followed by an interaction model based on MASCARET's activity description meta-model syntax. Finally, an aeronautical use case will validate the model by simulating the standard operations performed by an airplane controller.
Real-Time Detection of Simulator Sickness in Virtual Reality Games Based on Players’ Psychophysiological Data during Gameplay
Jialin Wang, Hai-Ning Liang, Diego Vilela Monteiro, Wenge Xu, Hao Chen, Qiwen Chen
Virtual Reality (VR) technology has been proliferating in the last decade, especially in the last few years. However, Simulator Sickness (SS) still represents a significant problem for its wider adoption. Currently, the most common way to detect SS is the Simulator Sickness Questionnaire (SSQ). The SSQ is a subjective measurement and is inadequate for real-time applications such as VR games. This research aims to investigate how to use machine learning techniques to detect SS based on in-game character and user physiological data during gameplay in VR games. To achieve this, we designed an experiment to collect such data with three types of games. We trained a Long Short-Term Memory neural network with the collected eye-tracking and character movement data to detect SS in real time. Our results indicate that, in VR games, our model is an accurate and efficient way to detect SS in real time.
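A minimal sketch of the kind of sequence model described here: an LSTM over windows of per-frame features (e.g., gaze and character movement) with a binary SS prediction head. The feature dimensionality, window length, and layer sizes are assumptions for illustration.

```python
# Sketch: LSTM-based real-time SS detector over windows of per-frame features
# (e.g., gaze position/velocity and character movement). Sizes are illustrative.
import torch
import torch.nn as nn

class SSDetector(nn.Module):
    def __init__(self, n_features=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                        # x: (batch, time, n_features)
        _, (h, _) = self.lstm(x)                 # final hidden state summarizes the window
        return torch.sigmoid(self.head(h[-1]))   # probability of simulator sickness

model = SSDetector()
window = torch.randn(4, 90, 8)                   # 4 windows of 90 frames x 8 features (synthetic)
print(model(window).shape)                       # -> torch.Size([4, 1])
```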
Retargetable AR: Context-aware Augmented Reality in Indoor Scenes based on 3D Scene Graph
Tomu Tahara, Takashi Seno, Gaku Narita, Tomoya Ishikawa
We present Retargetable AR—a novel AR framework that yields an AR experience that is aware of scene contexts set in various real environments, achieving natural interaction between the virtual and real worlds. We characterize scene contexts with relationships among objects in 3D space. A context assumed by an AR content and a context formed by a real environment where users experience AR are represented as abstract graph representations, i.e. scene graphs. From RGB-D streams, our framework generates a volumetric map in which geometric and semantic information of a scene are integrated. Moreover, using the semantic map, we abstract scene objects as oriented bounding boxes and estimate their orientations. Then our framework constructs, in an online fashion, a 3D scene graph characterizing the context of a real environment for AR. The correspondence between the constructed graph and an AR scene graph denoting the context of AR content provides a semantically registered content arrangement, which facilitates natural interaction between the virtual and real worlds. We performed extensive evaluations on our prototype system through quantitative evaluation of the performance of the oriented bounding box estimation, subjective evaluation of the AR content arrangement based on constructed 3D scene graphs, and an online AR demonstration.
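The correspondence step can be pictured as subgraph matching between the content's assumed scene graph and the constructed environment graph. The sketch below illustrates this with networkx, using invented semantic labels and spatial relations; it is not the authors' matching procedure.

```python
# Sketch: match an AR content graph against a (larger) environment scene graph
# by label-preserving subgraph isomorphism, using networkx.
import networkx as nx
from networkx.algorithms import isomorphism

env = nx.Graph()                          # environment scene graph (from the semantic map)
env.add_node("table_1", label="table")
env.add_node("chair_1", label="chair")
env.add_node("sofa_1", label="sofa")
env.add_edge("table_1", "chair_1", rel="next_to")
env.add_edge("table_1", "sofa_1", rel="next_to")

content = nx.Graph()                      # the context an AR content assumes
content.add_node("anchor_surface", label="table")
content.add_node("seat", label="chair")
content.add_edge("anchor_surface", "seat", rel="next_to")

gm = isomorphism.GraphMatcher(
    env, content,
    node_match=lambda a, b: a["label"] == b["label"],
    edge_match=lambda a, b: a["rel"] == b["rel"],
)
print(next(gm.subgraph_isomorphisms_iter()))  # e.g. {'table_1': 'anchor_surface', 'chair_1': 'seat'}
```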
Stage-wise Salient Object Detection in 360 Omnidirectional Image via Object-level Semantical Saliency Ranking
Guangxiao Ma, Shuai Li, Chenglizhao Chen, Aimin Hao, Hong Qin
2D-image-based salient object detection (SOD) has been extensively explored, while 360 omnidirectional-image-based SOD has received less research attention, and three major bottlenecks limit its performance. Firstly, the currently available training data is insufficient for training a 360 SOD deep model. Secondly, the visual distortions in 360 omnidirectional images usually result in a large feature gap between 360 images and 2D images; consequently, stage-wise training, a widely used solution to the training data shortage problem, becomes infeasible when conducting SOD on 360 omnidirectional images. Thirdly, the existing 360 SOD approach follows a multi-task methodology that performs salient object localization and segmentation-like saliency refinement at the same time; faced with an extremely large problem domain, this makes the training data shortage dilemma even worse. To tackle all these issues, this paper divides 360 SOD into a multi-stage task, the key rationale of which is to decompose the original complex problem domain into sequential, easier sub-problems that demand only small-scale training data. Meanwhile, we learn how to rank the "object-level semantical saliency", aiming to locate salient viewpoints and objects accurately. Specifically, to alleviate the training data shortage problem, we have released a novel dataset named 360-SSOD, containing 1,105 360 omnidirectional images with manually annotated object-level saliency ground truth, whose semantical distribution is more balanced than that of the existing dataset. Also, we have compared the proposed method with 13 SOTA methods, and all quantitative results demonstrate its performance superiority.
Stencil Marker: Designing Partially Transparent Markers for Stacking Augmented Reality Objects
Xuan Zhang, Jonathan Lundgren, Yoya Mesaki, Yuichi Hiroi, Yuta Itoh
We propose a transparent colored AR marker that allows 3D objects to be stacked in space. Conventional AR markers make it difficult to display multiple objects in the same position in space, or to manipulate the order or rotation of objects. The proposed transparent colored markers are designed so that the order and rotation direction of each marker in the stack can be detected from the observed image, based on mathematical constraints. We describe the constraints used to design the markers, the implementation that detects the stacking order and rotation of each marker, and a proof-of-concept application, Totem Poles. We also discuss limitations of the current prototype and possible research directions.
TGA: Two-level Group Attention for Assembly State Detection
Hangfan Liu, Yongzhi Su, Jason Rambach, Alain Pagani, Didier Stricker
Assembly state detection, i.e., object state detection, is of critical importance in computer vision tasks, especially in AR-assisted assembly. Unlike other object detection problems, the visual difference between different object states can be subtle. For better learning of such subtle appearance differences, we propose a two-level group attention module (TGA), which consists of inter-group attention and intra-group attention. The relationships between feature groups as well as the representation within each feature group are enhanced simultaneously. We embedded the proposed TGA module in a popular object detector and evaluated it on two new datasets related to object state estimation. The results show that our proposed attention module outperforms the baseline attention module.
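To make the idea concrete, the sketch below shows one generic reading of a two-level scheme: a squeeze-and-excitation style gate within each channel group (intra-group) plus a gate across group descriptors (inter-group). It is an illustrative interpretation, not the paper's TGA module.

```python
# Sketch: a generic "two-level" channel-group attention block: per-group gating
# (intra-group) plus gating across group descriptors (inter-group).
import torch
import torch.nn as nn

class TwoLevelGroupAttention(nn.Module):
    def __init__(self, channels=64, groups=4):
        super().__init__()
        assert channels % groups == 0
        self.groups, self.cpg = groups, channels // groups
        self.intra = nn.Sequential(nn.Linear(self.cpg, self.cpg), nn.Sigmoid())
        self.inter = nn.Sequential(nn.Linear(groups, groups), nn.Sigmoid())

    def forward(self, x):                        # x: (B, C, H, W)
        b, c, h, w = x.shape
        g = x.view(b, self.groups, self.cpg, h, w)
        desc = g.mean(dim=(3, 4))                # (B, groups, cpg): squeezed group descriptors
        intra_w = self.intra(desc)               # channel gate within each group
        inter_w = self.inter(desc.mean(dim=2))   # gate across groups, (B, groups)
        g = g * intra_w.unsqueeze(-1).unsqueeze(-1) * inter_w.view(b, self.groups, 1, 1, 1)
        return g.view(b, c, h, w)

x = torch.randn(2, 64, 16, 16)
print(TwoLevelGroupAttention()(x).shape)         # -> torch.Size([2, 64, 16, 16])
```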
Towards an AR game for walking rehabilitation: Preliminary study of the impact of augmented feedback modalities on walking speed
Anne-Laure Guinet, Guillaume Bouyer, Samir Otmane, Eric Desailly
Designing a serious game for walking rehabilitation requires compliance with the theory of motor learning. Motivation, repetition, variability, and feedback are key elements in improving and relearning a new walking pattern. As a preamble to the development of an AR rehabilitation game, and in order to choose the most effective feedback to provide to the patient, this article presents a preliminary study on the impact of feedback presentation modalities on walking speed. We investigate which concurrent visual feedback modalities allow patients to reach and maintain a target speed (maximum or intermediate). Our first results on children with motor disabilities (n=10) show that some modalities improved walking performance and helped patients better control their walking speed. In particular, a combination of targets anchored in the real world with a time indication seems to be effective in maintaining maximum walking speed, while simple moving objects could be used to control speed.
Towards Sailing supported by Augmented Reality: Motivation, Methodology and Perspectives
Francesco Laera, Mario Massimo Foglia, Alessandro Evangelista, Antonio Boccaccio, Michele Gattullo, Vito M Manghisi, Joseph L Gabbard, Antonio E. Uva, Michele Fiorentino
Sailing is a multidisciplinary activity that requires years to master. Recently, this sustainable sport has become even harder due to the increasing number of onboard sensors, automation, artificial intelligence, and the high performance obtainable with modern vessels and sail designs. Augmented Reality (AR) technology has the potential to assist sailors of all ages and experience levels and to improve confidence, accessibility, situation awareness, and safety. This work presents our ongoing research and methodology for developing AR-assisted sailing. We started with the problem definition, followed by a state of the art established through a systematic review. Secondly, we elicited the main tasks and variables using an online questionnaire with experts. Third, we extracted the main variables and conceptualized several visual interfaces using 3 different approaches. As a final phase, we designed and implemented a user test platform using a VR headset to simulate AR in different marine scenarios. For real deployment, we note the lack of available AR devices, so we are developing a specific headset dedicated to this task. We also envision a possible redesign of the entire boat as a consequence of the introduction of AR technology.
Understanding Physical Common Sense in Symmetrical Reality
Zhenliang Zhang
Physical commonsense is the intuitive knowledge that can be obtained from the physical world. However, this commonsense breaks down in symmetrical reality because of the integration of the physical world and the virtual world. In this paper, we introduce the specific physics of symmetrical reality from two perspectives: existence and interaction. We emphasize the bi-directional mechanical control within the symmetrical reality framework and why the free will of machines can break this commonsense. Afterward, we present experiments on discovering new physical commonsense in symmetrical reality systems. Experiment I is about learning physical commonsense from symmetrical reality, which shows what can be learned and how to learn in a symmetrical reality environment. Experiment II is about changing the physical commonsense of symmetrical reality, which shows why physical commonsense deserves much attention. Finally, we draw an initial conclusion about physical commonsense in symmetrical reality and give some suggestions for understanding symmetrical reality-based physical commonsense.
Usability Considerations of Hand Held Augmented Reality Wiring Tutors
Bradley Herbert, William Hoff, Mark Billinghurst
Electrical repair tasks across domains use a common set of skills that combine problem solving, fine motor, and spatial skills. Augmented Reality (AR) helps develop these skills by overlaying virtual objects on the real world. We therefore designed a hand-held AR-based wiring tutor that incorporates Constraint-Based Modelling (CBM) paradigms to detect learner errors in an electrical wiring task. We compared the performance and usability of our prototype with a state-of-the-art hand-held AR-based training system with a consistent user interface design, which lacks CBM approaches. Although the CBM condition had significantly lower usability scores than the state of the art, participants using the CBM approach reported higher practical scores. We discuss reasons for the usability differences, including the potential for positive perceptions of the system to be distorted by the critical feedback needed to regulate learning in the electrical wiring domain. Next steps would include using the same system to evaluate the theoretical and practical learning outcomes.
User Study on Virtual Reality for Design Reviews in Architecture
Michele Fiorentino, Elisa Maria Klose, Alemanno Maria Lucia Valentina, Isabella Giordano, Alessandro De Bellis, S Ilaria Cavaliere, Dario Costantino, Giuseppe Fallacara, Oliver Straeter
Virtual reality is a candidate to become the preferred interface for architectural design review, but the effectiveness and usability of such systems are still an issue. We put together a multidisciplinary team to implement a test methodology and system to compare VR with 2D interaction, with a coherent test platform using Rhinoceros as the industry-standard CAD software. A direct and valid comparison of the two setups is made possible by using the same software for both conditions. We designed and modeled three similar CAD models of a two-story villa (one for training and two for the test) and implanted 13 artificial errors simulating common CAD issues. Users were asked to find the errors in a fixed 10-minute session for each setup. We completed our test with 10 students from the design and architecture faculty with proven experience with the 2D version of the CAD software. We did not find any significant differences between the two modalities in cognitive workload, but the user preference was clearly towards VR. The presented work may provide interesting insights for future human-centered studies and help improve future VR architectural applications.
Using Space Syntax to Enable Walkable AR Experiences
Derek Reilly, Joseph Malloch, Abbey Singh, Isaac Fresia, Shivam Mahajan, Jake Moore, Matthew Peachey
"Walkable" Augmented Reality (AR) experiences span floors of a building or involve exploring city neighbourhoods. In these cases setting greatly impacts object placement, interactive events, and narrative flow: in a zombie game for example, a standoff might best occur in an open foyer while a chase might be most effective in a narrow hallway. Spatial attributes are important when experiences are designed for a specific setting but also when settings are not known at design time. In this paper we explore how generic spatial attributes can facilitate design decisions in both cases. We conduct game design through the lens of space syntax, illustrating how attributes like openness, connectivity, and visual complexity can assist placement of walkable AR content in a site-specific narrative-driven scavenger hunt called ScavengAR and a "site-agnostic" game called Adventure AR. We contribute a Unity3D plugin that resolves design constraints expressed in terms of space syntax attributes to place AR content for a single setting or for multiple settings dynamically.
©2020 by ISMAR
Sponsored by the IEEE Computer Society Visualization and Graphics Technical Committee and ACM SIGGRAPH