Metaverse – Matthew Thomas Short, Sunny Webb, Joshua Opel, Theo E. Christensen, Accenture Global Solutions Ltd

Abstract for “Rendering virtual objects within 3D environments”

“Systems, methods, and devices for rendering and placing virtual objects in three-dimensional environments. One technique provides, at a device of a first computing system associated with a first user, a view of the environment of the first user. The first computing system receives an instruction to display a virtual marker at a specified position in the environment, where the position is derived from an interaction of a second user with a 3D model of at least a portion of the environment. The device then displays the virtual marker at the specified position in the environment.”

Background for “Rendering virtual objects within 3D environments”

“This specification generally relates to computer-based techniques for placing and rendering virtual objects within three-dimensional (3D) environments.”

There are many computing systems that can render 3D environments. Virtual reality (VR) systems, for example, render a 3D environment that can be displayed to a user wearing a head-mounted display. A head-mounted display can include an electronic display screen and optical lenses through which the user views the screen and the 3D environment. The system can display the 3D environment stereoscopically on the screen, which creates an illusion of depth when the screen is viewed through the lenses. VR systems can provide an immersive experience that allows the user to feel as though they are actually present in the virtual environment. VR systems can be used to view, move about, and manipulate a 3D environment.

“Another example is augmented reality (AR) systems, which augment real-world environments with virtual objects. Virtual objects can be made to appear as though they are part of the real-world environment, for example, a virtual robot appearing in live video of a lab setting or a virtual pet appearing in live video of a home. Mixed reality systems can likewise augment real-world views with virtual objects that appear to be part of the environment. A direct view of an environment can be obtained, for example, through transparent or semi-transparent lenses of a head-mounted display device, whereas a video feed provides an indirect view.”

“This specification describes, among other things, techniques for placing and rendering virtual objects in a 3D environment. Computing systems configured in accordance with the techniques described herein can allow remote users to coordinate tasks that involve physical interaction with objects in the environment of one of the users. For example, a first user may be a field worker performing maintenance or repair work on a target device, while a second user has expertise in that work but is not physically present. The system can present a 3D model of the target device to the second user, who can interact with the model and place virtual markers on or near it to indicate where the first user should take action. The system can then display the virtual markers to the field worker within real-world views of the worker's environment, for example by augmenting the view provided by a mixed-reality head-mounted display device. The virtual markers appear fixed on or near the target device as the field worker physically works on it.”

“Some implementations of the subject matter described herein include computer-implemented methods. A device of a first computing system associated with a first user can display a view of the environment of the first user. The first computing system receives an instruction to display a virtual marker at a specified position in the environment, where the position is derived from an interaction of a second user with a three-dimensional (3D) model of at least a portion of the environment. The device then displays the virtual marker at the specified position in the environment.”

“These and other implementations may optionally include one or more of the following features.”

The instruction to display the virtual marker at the specified position in the environment of the first user may include a coordinate value that identifies the position, and optionally the orientation, of the virtual marker in a 3D space. The coordinate value can identify the position of the virtual marker in a 3D space of a second environment other than the environment of the first user. To determine the position of the virtual marker in the environment of the first user, the system can map the coordinate value from the 3D space of the second environment to a 3D space of the environment of the first user. The coordinate value can identify the position of the virtual marker in the 3D space of the second environment as a first 3D offset from one or more anchor points in the second environment. Mapping the coordinate value from the 3D space of the second environment to the 3D space of the environment of the first user can include (1) identifying one or more corresponding anchor points in the environment of the first user and (2) applying a second 3D offset, based on the first 3D offset, to the one or more anchor points in the environment of the first user to identify the position of the virtual marker in that environment.
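To illustrate the anchor-point mapping described above, the following is a minimal sketch, assuming that the two environments express marker positions as offsets from corresponding anchor points and that their coordinate axes are already rotationally aligned; the function and variable names are illustrative rather than taken from the specification.

```python
import numpy as np

def map_marker_position(marker_pos_remote, anchor_remote, anchor_field):
    """Map a virtual marker position expressed in the second (remote) environment's
    coordinate system to the first user's (field) environment by reusing the
    marker's 3D offset from a corresponding anchor point."""
    offset = np.asarray(marker_pos_remote, dtype=float) - np.asarray(anchor_remote, dtype=float)
    return np.asarray(anchor_field, dtype=float) + offset
```

If the coordinate axes were not already aligned, a rotation estimated from two or more anchor-point pairs would be applied to the offset before adding it to the field anchor point.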

“Displaying the virtual marker at the specified position in the environment can include superimposing the virtual marker over the view of the first user's environment so that the virtual marker visually appears to be located at the specified position in the environment, at a distance from the first user.”

“The interaction of the second user with the 3D model of at least a portion of the environment can occur even though the second user is located remotely from the environment of the first user.”

The 3D model of at least a portion of the environment can include a model of a target object located within the environment. The interaction of the second user with the 3D model can include placing the virtual marker at a specified location on or near the 3D model of the target object. The virtual marker can then be displayed at the corresponding location on or near the target object in the environment of the first user.

“The first computing system can obtain, from one or more spatial sensors, spatial sensing data that indicates spatial characteristics of the environment of the first user. The first computing system can generate the 3D model of at least a portion of the environment based on the spatial sensing data.”

The first computing system can identify a target object within the environment of the first user. The system can generate a 3D model of the target object by extracting it from the 3D model of the environment. The second user can then place the virtual marker at a specified location within a virtual environment that includes the 3D model of the target object.

The device can be a head-mounted display device (HMD), and the HMD can display the virtual marker at the specified position in the environment while the HMD is worn by the first user.

“Providing the view of the environment can include generating an augmented view of the environment that shows one or more virtual objects at fixed positions within a 3D space.”

“Some implementations of the subject matter described herein include another computer-implemented method. The method can include: receiving, by a computing system, data representing a 3D model of at least a portion of an environment of a first user; rendering, by the computing system, the 3D model for display to a second user; identifying that a user input placed a virtual marker at a specified position within a virtual environment that includes the 3D model; and transmitting an instruction to display the virtual marker within a view of the environment of the first user at a position that corresponds to the specified position in the virtual environment at which the virtual marker was placed.”

These and other implementations may optionally include one or more of the following features. Rendering the 3D model of at least a portion of the environment can include augmenting a real-world view of another environment with the 3D model of at least a portion of the environment.

“A head-mounted display device (HMD) can be used to augment the real-world view with the 3D model of at least a portion of the environment for display to the second user.”

“A portable computing device or a 3D display device can be used to render the 3D model of at least a portion of the environment of the first user.”

“The 3D model of at least a portion of the environment of the first user can include a virtual object representing a target object located within the environment. The computing system can generate a 3D model of the target object from the 3D model of at least a portion of the environment. Identifying that the user input placed the virtual marker at the specified position in the virtual environment can include identifying that the user input placed the virtual marker at a specified location on or near the surface of the 3D model of the target object.”

“After transmitting the instruction to display the virtual marker within the view of the first user's environment, the computing system can perform additional operations, such as: receiving data that identifies a position in the environment at which a physical implement was brought into contact with or near the target object; and rendering, by the computing system and for display to the second user within the virtual environment, a virtual marker that represents the physical implement at a position, relative to the 3D model of the target object, that corresponds to the position in the environment at which the physical implement was brought into contact with or near the target object.”

“The system can display the virtual marker representing the physical implement concurrently with the previously placed virtual marker. The computing system can then receive a second user input that indicates whether the physical implement was correctly positioned relative to the target object, and transmit a notification based on the second user input to a computing device associated with the first user.”

“The 3D model of at least a portion of the environment can include a model of a target object located within the environment. The system can perform further operations, including receiving sensor data that identifies values of one or more operational parameters of the target object and rendering, for display to the second user along with the 3D model of the target object, user interface elements that represent the values of the one or more operational parameters.”

“Additional implementations of the subject matter disclosed herein include one or more computer-readable media encoding instructions that, when executed by one or more processors, cause the one or more processors to perform any of the methods or processes disclosed herein. The computer-readable media can be part of a computing system that also includes the one or more processors.”

“Some implementations of the subject matter described herein may achieve one or more of the following advantages. First, the system can facilitate better communication between a first user (e.g., a field user who performs repairs or diagnostics on a device) and a second user (e.g., a remote user with expertise in the device who provides assistance to the first user). The remote assistance can include placing virtual markers at specific locations within a virtual environment that is based on a 3D model of the first user's environment; because the second user interacts with the 3D model, the second user can place virtual markers on or near specific locations more accurately than if the second user interacted with a two-dimensional representation of the device, such as images or a video stream. Second, the virtual markers can be rendered in the environment of the first user so that the first user can see the precise location of each marker in the three-dimensional environment. Third, more accurate placement of the virtual markers may reduce the number of transmissions required between a field computing device and a remote computing device. Fourth, identifying a coordinate and, optionally, an orientation of a virtual marker may require less data than other methods of communicating positional information, such as voice or video, so using virtual markers to communicate positional information between remote users may reduce latency and conserve network bandwidth. Fifth, by extracting target objects from the remaining portions of a 3D environment model, the file size of the transmitted model can be decreased, which may reduce transmission times and required channel bandwidth. Sixth, a gaze-activated user interface element may be kept in a collapsed condition and expanded to present detailed information to a user based on the direction of the user's gaze, allowing the user to inspect a real-world environment with minimal distraction.”

“Additional features and advantages will be apparent from the description, the claims, and the drawings to one of ordinary skill in the art.”

“DESCRIPTION OF DRAWINGS”

“FIGS. 1A-1F depict an example interaction between a field worker and a remote expert using a remote modeling system. The remote expert can communicate with the field worker to provide precise information for specific repair or maintenance tasks.”

“FIG. 2 is a block diagram of an example remote modeling system.”

“FIGS. 3A-3C show a flowchart of an example process for placing and rendering virtual markers across remote locations.”

“FIGS. 4A-4D show a physical control object that can be used in augmented- or mixed-reality systems to provide efficient user interface elements superimposed on views of real-world environments.”

“FIG. 5 is a flowchart of an example process for rendering virtual display elements in a user interface based on interactions with a physical control object.”

“FIGS. 6A-6D show techniques for interacting with a gaze-activated user interface element in an augmented view of a real-world environment.”

“FIG. 7 is a flowchart of an example process for monitoring and responding to user interactions with a gaze-activated user interface element.”

“FIG. 8.”

“Like reference numbers and designations in the various drawings indicate like elements.”

“FIGS. 1A-1F depict a field user 102 interacting with a remote modeling system. Field user 102 is situated in a field environment where he or she can interact with one or more target objects. In FIGS. 1A-1F, the target object is a furnace system 104. The field user 102 could be, for example, an HVAC technician performing maintenance or repair work on furnace system 104 in a basement or other mechanical room. The field user 102 might want to consult another user during the repair or maintenance, such as a user who is an expert with the furnace system 104, a supervisor who conducts remote quality checks on the field user's work, or a homeowner who is absent during the repair or maintenance. The user with whom the field user 102 communicates via the remote modeling system is referred to here as the expert user.”

“Beginning with FIG. 1A, the remote modeling system may include a spatial modeling engine that uses data from a spatial sensing subsystem to create a 3D model of all or part of the environment surrounding the field user 102. For example, the spatial sensing subsystem could include an infrared emitter that projects a pattern of infrared light into the field environment and a depth camera that detects distortions in the pattern projected onto the environment to determine distances from the camera to surfaces in the environment. The spatial modeling engine can use the depth camera data to create the 3D model. In some cases, the field user 102 may point the spatial sensing subsystem at a target object (e.g., furnace system 104) and walk around the object to obtain a complete 3D model of it.”

“The field user 102 uses a display device to view the target object and other aspects of the surrounding environment. FIG. 1A shows a view 100 of the ambient environment of field user 102, with particular focus on furnace system 104. In some cases, the display device that provides the view 100 is a head-mounted display (HMD), which the field user 102 wears on his or her head to become immersed in the view 100.”

“After the spatial modeling engine generates a 3D model of the environment of the field user 102, the system transmits data representing at least a portion of the ambient environment to a remote portion of the modeling system. In some implementations, the transmitted 3D model is the entire model of the ambient environment generated by the spatial modeling engine. In other implementations, less than the entire model of the ambient environment is transmitted. For example, the system might identify furnace system 104 as the target object in the field user's environment, extract the portion of the 3D model that corresponds to the furnace 104 from the overall 3D model (e.g., by excluding surroundings not relevant to the furnace 104), and thereby create a 3D furnace model. The field computing system can then transmit the furnace model to the remote portion of the modeling system.”

“FIG. 1B shows a view 150 generated by the remote computing system for an expert user who provides remote assistance to the field user 102 in connection with the repair or maintenance task on furnace system 104. The 3D furnace model 154 can be presented to the expert user by the system in a variety of ways. In some cases, the expert user views the virtual environment using a mixed-reality HMD, in which case the 3D furnace model 154 is rendered as a virtual object that augments the expert user's direct, real-world view 150 of his or her own environment (e.g., a shop or lab) seen through the transparent lenses of the HMD. The 3D furnace model 154 can be placed at a natural, fixed location in the expert user's view 150, e.g., on the floor of the environment. The mixed-reality HMD allows the expert user to physically move around the 3D furnace model 154 to inspect and view the model of the target object from different angles. Other implementations allow the remote modeling system to render the 3D furnace model 154 for the expert user on different types of displays. For example, the system could render the 3D furnace model 154 on a tablet computer, a desktop monitor, or a television monitor. The 3D furnace model 154 can also be rendered on a 3D display that uses stereoscopy to give the model the appearance of depth.”

“The remote modeling system allows the expert user to interact directly with the 3D model. For example, the expert user can ‘annotate’ the virtual field environment in which the 3D model of the target object is rendered by placing virtual markers within it.”

Virtual markers are virtual objects that augment a view of a 3D environment and identify specific locations within the environment. FIG. 1C shows that the expert user has added a virtual marker 156 proximate to a coupling on the gas pipe of the 3D furnace model 154 from the environment of field user 102. To accurately communicate location information to the field user 102, the expert user can specify exactly where the virtual marker should be placed in the 3D environment. For example, if the expert user suspects there may be a leak in the gas pipe, the virtual marker 156 can be placed on or near the coupling. The location of the virtual marker 156 can be specified relative to the 3D model 154. On a tablet, the expert user can use touchscreen gestures to move a virtual marker and to rotate, pan, or zoom the 3D model 154. In implementations in which the 3D model 154 is displayed in a mixed-reality or virtual-reality environment rather than on a tablet, the expert user may use hand gestures, a pointing device, or a pen to indicate the location of the virtual marker in 3D space.

A virtual marker may provide more than just positional information. The expert user can assign an orientation to a virtual marker, give the marker a specific appearance (e.g., by selecting a 3D model for it), or add media content, an animation, text, or a note to the marker. For example, the expert user might suggest a tool (implement) that the field user 102 should use to work on the pipe coupling by selecting a 3D model of that tool, and might attach a brief note to the selected 3D model giving the field user 102 instructions on how to use the tool. The system may select an icon to represent a virtual marker that corresponds to the type of supplemental content attached to it. For example, a second virtual marker 158 shown in FIG. 1C includes audio content, and the system uses a speaker icon to alert the field user 102 that audio content, such as verbal instructions recorded by the expert user, is available.

“After an expert user confirms the placement of one or more virtual markers in the 3D environment on or near the 3D model of the target object, the remote modeling software transmits data representing these virtual markers to its field portion. The field portion can process the received data in order to render corresponding virtual objects in the view 100 of the field environment. FIG. 1D shows a first virtual marker 106, corresponding to virtual marker 156, rendered on or near the furnace's top pipe coupling at a location that corresponds to the location of virtual marker 156 relative to the 3D furnace model 154. A second virtual marker 108, corresponding to virtual marker 158, is rendered at the base of the furnace at a location that corresponds to the location of virtual marker 158 relative to the 3D furnace model 154. In implementations that use a mixed-reality HMD device, the virtual markers 106 and 108 can be rendered at fixed positions in 3D space, so that as the field user 102 moves about the environment, the markers remain anchored and the field user 102 can see their exact locations in the 3D environment.”

“In certain instances, the field user 102 can use a virtual marker displayed in the view 100 as a guide for work in the environment. For example, the expert user might have placed the virtual marker 106 to indicate to the field user 102 the exact location on the target object where the field user should apply a tool. As shown in FIG. 1E, the field user 102 can see the virtual marker 106 in the view 100, which indicates that the pipe coupling should be torqued using wrench 110. In some cases, the field portion of the remote modeling system captures the position and orientation of the wrench 110 as the field user 102 torques the pipe coupling. The captured data is transmitted to the remote computing system, and the view 150 can then display a virtual representation 160 of the wrench 110 at a location relative to the 3D furnace model 154 that corresponds to where the field user 102 applied the wrench 110 to the furnace system 104. The virtual representation 160 can be a 3D model of the wrench 110, sized to visually match the wrench used by the field user 102. The virtual representation 160 can be displayed concurrently with the virtual marker 156 that the expert user previously placed in the view of the furnace model. In the remote environment, the expert user can thus compare the location where the wrench 110 was applied with the location of the virtual marker 156. If the location of the wrench 110, as indicated by the virtual representation 160, is acceptable, the expert user can send a confirmation message to the field user 102. If the location where the wrench 110 was applied is not satisfactory, the expert user can adjust the location of virtual marker 156, and the updated location can be transmitted to the field user 102, causing the virtual marker 106 to move to the new location.”

“FIG. 2 is a block diagram of an example remote modeling system 200. The remote modeling system 200 may include multiple computers located at different locations. In some cases, the system 200 is configured to perform the techniques described with respect to FIGS. 1A-1F and 3A-3C. The system 200 generally includes two or more portions that are remote from each other: a field system 202, which is used in the field environment, and a remote system 204, which is located away from the field environment. The field system 202 and the remote system 204 may communicate over one or more networks. In some cases, the field system 202 and the remote system 204 communicate indirectly through a server system 206, which passes messages between the field system 202 and the remote system 204. In some implementations, one or more of the components 210-228 shown in FIG. 2 are part of the field system 202, and one or more of the components 230-242 shown in FIG. 2 are part of the remote system 204. Each of the components 210-228 and 230-242 can be implemented using one or more shared or dedicated processors.”

“The field system 202 comprises components 210-228, which provide services to a user in the field environment. The controller 210 manages operations of the field system 202 and coordinates activities among the various components 212-228. The communications interface 212 transmits and receives messages over one or more networks, allowing the field system 202 to communicate with the remote system 204, the server system 206, or both.”

In other implementations, the field system 202 could use display devices other than a head-mounted display. For example, an indirect view of the environment, such as a video stream, can be displayed on an electronic display screen of a computer monitor (e.g., an LCD or LED display), a television, a mobile phone, or a tablet computing device.

“The field system 202 also includes a spatial modeling engine 216, which is configured to create a 3D model of the field user's ambient environment. The spatial modeling engine 216 receives spatial sensing data from one or more sensors of a spatial sensing subsystem and processes the spatial sensing data to create the 3D model. The spatial sensing data indicates spatial characteristics of the field user's environment. In some implementations, the data comes from an infrared emitter that projects a pattern of infrared light into the field environment and a depth camera that detects distortions in the pattern projected onto the environment to determine distances from the camera to surfaces in the environment. The spatial modeling engine 216 can use the depth camera data to create the 3D model. Other computer vision technologies can be used in some instances to complement or substitute for the emitter and depth-camera approach. For example, the spatial modeling engine 216 can analyze images, video, and other spatial sensing data to identify objects within the environment, generate 3D models of the recognized objects, and arrange them in the same way as the corresponding physical objects. The spatial modeling engine 216 may generate the 3D model in any suitable form, such as a 3D polygonal mesh.”
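As a rough illustration of how depth-camera data can feed such a spatial modeling engine, the sketch below back-projects a depth image into a 3D point cloud using pinhole camera intrinsics; the intrinsic parameters and the surface-reconstruction step that would follow (e.g., meshing the points into a 3D polygonal mesh) are assumptions not detailed in the specification.

```python
import numpy as np

def depth_to_point_cloud(depth_m, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into a point cloud expressed in
    the camera coordinate frame, using pinhole intrinsics (fx, fy, cx, cy)."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # per-pixel image coordinates
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no valid depth reading
```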

The spatial modeling engine 216 can scan all or part of the field user's environment. The 3D model generated by the spatial modeling engine 216 may therefore include portions of the environment that are not required or desired by the remote expert. For example, the field user might be repairing a photocopying machine in an office mailroom, and the remote expert can help with the repair by interfacing with a 3D model of the machine. The 3D model generated by the spatial modeling engine 216 may represent the entire office mailroom, including aspects of the environment other than the photocopying machine. The remote modeling system can therefore include an extraction engine 228 to remove extraneous elements from the environment's 3D model. The extraction engine 228 is configured to identify a target portion of a 3D model and extract that target portion from the other portions of the model. The target portion can include one or more target objects that are to be rendered in the remote user's environment, while extraneous portions of the 3D model can be discarded. For example, the extraction engine 228 might analyze the initial 3D model of the mailroom environment, identify the photocopying machine within that model, and extract a 3D model of the photocopying machine by removing portions of the initial model that are not relevant to the machine.

The extraction engine 228 can identify target objects in 3D models automatically or based on user input. For example, the extraction engine 228 can use object recognition and object detection techniques to automatically identify a target object in the 3D model generated by the spatial modeling engine 216. Alternatively, the extraction engine 228 can be directed by user input, which may include a selection of an area or point in a representation of the 3D environment that corresponds to the target object. The extraction engine 228 can then correlate the selection with the target object and extract the 3D model of the target object from the initial 3D model of the environment. The representation of the environment from which the user makes the selection can be a rendering of the 3D model, an image, or another appropriate form.
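The sketch below shows one simple way an extraction engine might crop a polygonal mesh around a user-selected point; a production system would more likely rely on object recognition or segmentation, and the bounding-box approach, function names, and parameters here are assumptions for illustration only.

```python
import numpy as np

def extract_target(vertices, faces, seed_point, half_extent):
    """Keep only the faces of a triangle mesh whose vertices fall inside an
    axis-aligned box centered on a user-selected seed point, and re-index
    the surviving vertices so the extracted mesh is self-contained."""
    seed = np.asarray(seed_point, dtype=float)
    lo, hi = seed - half_extent, seed + half_extent
    inside = np.all((vertices >= lo) & (vertices <= hi), axis=1)
    keep_faces = faces[np.all(inside[faces], axis=1)]
    used = np.unique(keep_faces)
    new_faces = np.searchsorted(used, keep_faces)  # remap old vertex indices to new ones
    return vertices[used], new_faces
```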

“In FIG. 2, the extraction engine 228 is shown as part of the field system 202. In other implementations, however, the extraction engine 228 may be located in the remote system 204 or the server system 206, or distributed across multiple portions of the system 200. Different benefits may result depending on where the extraction engine 228 is located. If the extraction engine 228 is provided at the field system 202, the data files representing the 3D model that are transmitted from the field system may be smaller than those representing the original environment model before extraction. However, extracting a target object from the environment's 3D model can be computationally costly, and in some implementations the extraction is instead performed at the server system 206 or the remote system 204; the extraction engine 228 is then provided at one of the systems 204 or 206.”

The field system 202 can display virtual objects, such as virtual markers, within a view of the environment provided by the HMD 214. The virtual objects added to the view by the system 202 can include virtual markers and other items that have been placed by users at the remote system 204. Local users, such as the field user at the field system 202, may also insert virtual objects into the view of the ambient environment. To manage and render virtual objects, the field system 202 includes a virtual objects manager 220 and a rendering engine 222.

“First, the virtual objects manager 220 maintains information about virtual objects that can be displayed in the view of the ambient environment. The field system 202 can receive, via the communications interface 212, an instruction from a remote user to display a virtual object at a specified location in the ambient field environment. The instruction may include various parameters that affect the way the virtual object is displayed in the environment, such as a display coordinate value, an object orientation value, a model identification value, and one or more secondary data values. The display coordinate value specifies the location in 3D space where the virtual object should appear. The object orientation value specifies the orientation in which the virtual object should appear (e.g., vertical, horizontal, or tilted). The model identification value specifies the visual representation that should be displayed for the virtual object. The virtual objects manager 220 can include a models database that maps model identification values to pre-stored 2D models (e.g., graphics or icons) or 3D models at the field system 202. The virtual objects manager 220 can access the models database using the model identification value received in the instruction to determine which 2D or 3D model should be displayed for the virtual object in the view of the environment. The secondary data values can include any additional information about how the virtual object should be displayed to the user in the view of the environment. One example is a text label to be displayed alongside the 2D or 3D model of the virtual object. Another example is a secondary data value that contains a link to an additional resource (e.g., an application or a webpage) or media content to render with the 2D or 3D model of the virtual object.”
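For concreteness, a display instruction of the kind described above might be represented as follows; the field names, default values, and models-database contents are hypothetical and only illustrate the parameters the specification mentions (display coordinate, orientation, model identification, and secondary data).

```python
from dataclasses import dataclass, field

# Hypothetical models database mapping model identification values to
# pre-stored 2D/3D assets at the field system.
MODELS_DB = {"default_marker": "marker.obj", "speaker_icon": "speaker.png"}

@dataclass
class DisplayInstruction:
    display_coordinate: tuple                            # (x, y, z) in the shared 3D space
    orientation: tuple = (0.0, 0.0, 0.0)                 # e.g., Euler angles
    model_id: str = "default_marker"                     # key into the models database
    secondary_data: dict = field(default_factory=dict)   # text label, media link, etc.

def resolve_model(instruction: DisplayInstruction) -> str:
    """Look up which pre-stored visual representation to render for the marker."""
    return MODELS_DB.get(instruction.model_id, MODELS_DB["default_marker"])
```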

“As mentioned in the preceding paragraph, an instruction received by the field system 202 may contain a display coordinate value that indicates where a virtual object should appear in 3D space. A challenge lies in interpreting this coordinate value so that the virtual object is displayed at the correct position in the view of the environment, even though another user (e.g., a remote expert at remote system 204) placed the virtual object in a 3D environment different from the field user's ambient environment. The virtual objects manager 220 can include an alignment-and-translation engine (not shown in FIG. 2) that converts a received display coordinate value to a translated display coordinate value, which identifies the virtual object's display position in a 3D space specific to the ambient field environment. For instance, if the coordinate systems of the ambient field environment and the remote environment in which the remote user placed the virtual object are not aligned, the received display coordinate value can be translated. In other implementations, the alignment-and-translation engine aligns the coordinate systems of the remote environment and the ambient field environment in advance, eliminating the need to translate received display coordinate values. To align the coordinate systems, the spatial modeling engine 216 first determines one or more anchor points in the ambient field environment. These anchor points serve as reference positions (e.g., origins) in a 3D coordinate system that defines spatial positions within the ambient field environment. An anchor point could be located at the corner of a room within the ambient field environment and assigned the (0, 0, 0) origin position in a Cartesian coordinate system. The locations of objects within the 3D model of the environment can then be determined by calculating their offsets from the origin. The field system 202 can transmit to the remote system 204, along with the 3D model of the environment or of target objects within it, data identifying the anchor points in the ambient field environment. The remote system 204 can then use these anchor points to generate coordinate values for virtual objects placed in the remote environment in terms of their offsets from the anchor points. Because both coordinate systems are defined relative to corresponding anchor points, the field and remote environments yield equivalent coordinate values: if the remote user places a virtual object at a specific location on the surface of the 3D model of a target object in the remote environment, the coordinate value identifying that location will also identify the equivalent location on the surface of the actual target object in the field.”

“The field system 202 also includes a rendering engine 222, which renders virtual content for display on the HMD 214 or an alternative display device. The rendering engine 222 can augment a real-world view of the environment by superimposing virtual objects on it. The rendering engine 222 may use information from the spatial modeling engine 216 to determine display parameters for a virtual object so that the object appears to be part of the environment. For example, if a virtual object is to be placed on the surface of a physical object in the ambient environment, the rendering engine 222 can use information from the spatial modeling engine 216 to position the virtual object on top of the physical object in the augmented view of the environment. The rendering engine 222 can also render a virtual object so that it appears to be located in an obscured area of the 3D environment. The display parameter values for a virtual object indicate how the rendering engine 222 places, sizes, and manipulates the virtual object in real time to create the illusion that the virtual object is located at a specific position in the environment, at a distance from the user.”
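As a simplified illustration of how such display parameters might be computed, the sketch below projects a world-anchored marker position into pixel coordinates using the current camera pose and pinhole intrinsics; the pose representation and parameter names are assumptions, and a real mixed-reality renderer would additionally handle stereo rendering, lens distortion, and occlusion.

```python
import numpy as np

def project_marker(world_point, cam_rotation, cam_translation, fx, fy, cx, cy):
    """Project a world-space marker position into pixel coordinates for overlay
    rendering, given a world-to-camera rotation matrix, a translation vector,
    and pinhole intrinsics. Returns None if the marker is behind the camera."""
    p_cam = cam_rotation @ np.asarray(world_point, dtype=float) + cam_translation
    if p_cam[2] <= 0:
        return None
    u = fx * p_cam[0] / p_cam[2] + cx
    v = fy * p_cam[1] / p_cam[2] + cy
    return u, v
```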

“An input manager 218 in the field system 202 is configured to receive user inputs from one or more input sources. Any suitable mechanism for detecting user inputs may be used, such as a camera and gesture recognizer, a microphone and speech recognizer, a virtual keyboard, a pointing device, or a combination of several of these. A user of the field system 202 may provide input for a variety of purposes, such as to select an object, provide spoken or written information to the remote user, place or move virtual markers, or trigger a state capture event. The input manager 218 can also receive and process data that indicates where a user's gaze is directed. The direction of the user's gaze can be determined from the HMD 214's orientation and motion sensors (e.g., compasses, accelerometers, gyroscopes), eye-tracking sensors, or both.”

“In some cases, the field system 202 includes a state capture engine 224, which is configured to capture and store state data about the state of the ambient field environment at specific times. For example, the state capture engine 224 can capture and store information about the field user's actions in the ambient environment. The system 202 might render, in a view of the ambient field environment, a virtual marker that was placed by the remote user to prompt the field user to operate on a particular component of a target object. The state capture engine 224 may capture and store state data when the field user places a tool in position to begin work on the specified component, including the tool's position relative to the target object or the virtual marker. The state data can be transmitted to the remote system 204 for review by the remote user. The remote system 204 may use the state data to render a virtual object representing the field user's tool, which could be a generic marker or a 3D model of the tool.”
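A state capture record of the kind described above might look like the following sketch; every field name is an assumption, chosen only to reflect the items the specification mentions (the tool, its position and orientation relative to the target object, and the related virtual marker).

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class StateCaptureRecord:
    timestamp: float
    tool_model_id: str
    tool_position: tuple        # offset from a field-environment anchor point
    tool_orientation: tuple     # e.g., Euler angles of the tool
    related_marker_id: Optional[str] = None  # virtual marker the action relates to

def capture_tool_event(tool_model_id, tool_position, tool_orientation, related_marker_id=None):
    """Build a record when a pre-defined event (e.g., a tool placed on the target
    object) is detected; in practice the record would be persisted and forwarded
    to the remote system for review."""
    return StateCaptureRecord(time.time(), tool_model_id, tool_position,
                              tool_orientation, related_marker_id)
```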

“The field system 202, in certain implementations, also includes one or more auxiliary data sources 226, which provide auxiliary data to the system. Auxiliary data is any information that could be used to assess the field environment or that might bear on the field user's work in it. For example, a target object may have sensors that emit signals indicating its operating characteristics. The auxiliary data sources 226 can collect, store, or transmit these sensor signals to make the information available to the field system 202, the remote system 204, and/or the server system 206. Data from the auxiliary data sources 226 may be stored in data storage 208, where it is directly accessible to the server system 206. Some implementations present raw auxiliary data, or information at least partially derived from the raw auxiliary data, to the field user, the remote user, or both. For example, the field user might attach the leads of a multimeter to a target circuit board during an electrical repair. An auxiliary data source 226 can transmit the resulting electrical measurements, such as voltage, current, and resistance, and the measurements can be displayed to the field user, the remote user, or both while the repair is ongoing.”

The remote system 204 can be implemented as any combination of one or more computers at one or more locations. The remote system 204 can present a 3D model of at least a portion of the ambient field environment to a user who is located remotely from the field environment. The remote system 204 is also capable of adding virtual objects to the virtual field environment, for example, virtual markers that identify specific points in the virtual space. The virtual field environment can include the 3D model of at least a portion of the ambient field environment, such as a model of a target object located within the ambient environment.

“The remote system 204 includes a controller 230, which manages operations of the remote system 204 and coordinates activities among the components 232-242. The communications interface 232 allows messages to be sent and received over one or more networks, enabling the remote system 204 to communicate with the field system 202 either directly or via the server system 206.”

“A display device 234 displays the virtual field environment to the remote user; the display device is local to the remote system 204 and remote from the field system 202. The remote system 204 might offer different types of display devices 234 to provide different user experiences. In some cases, the display device 234 is a head-mounted display (HMD), e.g., similar to HMD 214 of the field system 202. An HMD provides a user with a realistic view of his or her own environment and can augment that view by overlaying the virtual field environment on top of the real-world environment. For example, a mixed-reality HMD may overlay the 3D model of the target object on a real-world view of the remote user's environment, allowing the remote user to see the 3D model at scale in his or her own environment. Other implementations display the virtual environment on a 2D display device 234, such as an LCD or LED screen of a computer monitor or tablet. In that case, the user can use touchscreen gestures or other input mechanisms to manipulate the 3D model of the target object (e.g., rotate, pan, zoom) or to place virtual markers at specific locations in the virtual environment. In yet other implementations, the display device 234 displays the virtual environment on a 3D screen that uses stereoscopic imaging to create an illusion of depth, which can provide a more immersive experience for the remote user than a 2D screen.”

“A rendering engine 236 of the remote system 204 is configured to render virtual objects for display on the display device 234. The rendering engine 236 receives data that characterizes a virtual environment (e.g., one or more virtual objects), renders the environment in a 3D space, and outputs a signal to the display device 234 for display of the rendered virtual environment.”

The remote system 204 also has a virtual objects manager 238, which maintains information about virtual objects that can be displayed in the virtual environment. The virtual objects manager 238 can include a models database that maps model identification values to pre-stored 2D models (e.g., graphics or icons) or 3D models at the remote system 204. The virtual objects manager 238 can access the models database using a model identification value to determine which 2D or 3D model should be displayed in the view of the environment. Any additional information about how a virtual object should be displayed to the user can be included as secondary data. One example is a text label to be displayed alongside the 2D or 3D model of the virtual object. Another example is a secondary data value that contains a link to an additional resource (e.g., an application or a webpage) or media content to render with the virtual object's 2D or 3D model.

“The remote system 204 includes a primary input manager 240 and a secondary input manager 242. The primary input manager 240 receives primary user inputs, including inputs from users who want to add virtual markers to the virtual environment or modify virtual markers within the virtual environment. The secondary input manager 242 receives secondary user inputs, which indicate that a user wants to add secondary objects to the virtual environment. Secondary objects are objects other than virtual marker objects that can be added to a virtual environment. For example, to clarify or provide more information about the tasks the field user is asked to complete, the remote user can add text or voice annotations to a virtual marker.”

“In some cases, the primary input manager 240 and the secondary input manager 242 are configured to receive inputs from one or more user input sources. Any suitable mechanism for detecting user inputs may be used, such as a camera and gesture recognizer, a microphone and speech recognizer, a virtual keyboard, a pointing device, or a combination of these. A user of the remote system 204 may provide input for many purposes, such as to select an object, provide spoken or written information to the field user, or place or move virtual markers. The input manager 242 can also be configured to receive and process data about the direction of the gaze of a user wearing an HMD. The direction of the user's gaze can be determined from the HMD's orientation and motion sensors (e.g., compasses, accelerometers, gyroscopes) or eye-tracking sensors.”

“Referring to FIGS. 3A-3C, a flowchart is shown of an example process 300 for placing and rendering virtual markers in a 3D environment. The process 300 can be performed by a system of computers at multiple locations, such as the remote modeling system 200 of FIG. 2. The flowchart shows a series of operations divided between a field portion of the computing system, e.g., field system 202, and a remote portion of the computing system, e.g., remote system 204. In some cases, however, operations are not performed directly between the field system and the remote system; an intermediate server system, e.g., server system 206, may relay communications between the field and remote systems, and some of the operations 302-342 may also be performed at this intermediate server system.”

The process 300 begins at stage 302, where the field system provides a real-world view of the ambient environment to a user located in the field. The ambient environment can be viewed directly through the lenses of a head-mounted display (e.g., head-mounted display 214) or indirectly via an electronic display screen that displays images or video of the environment.

“At stage 304, the field system scans at least a portion of the field environment to create a 3D model. A depth camera, a radio detection and ranging (RADAR) subsystem, a sound navigation and ranging (SONAR) subsystem, a light detection and ranging (LIDAR) subsystem, or any combination of these scanning techniques may be used to scan the environment. A spatial modeling engine (e.g., spatial modeling engine 216) may then process the spatial sensing data to create the 3D model.”

“At stage 306, the field system identifies a target object in the ambient field environment. Some implementations identify the target object automatically (e.g., without user input) during scanning of the environment. Other implementations use user input to identify the target object; for example, the system might prompt the user to choose the target object by selecting a region of an image of the environment or a portion of the 3D model that corresponds to the target object.”

“At stage 308, the field system extracts the target object from the 3D model of the environment. The field system can include an extraction engine (e.g., extraction engine 228) that creates a 3D model of the target object from the 3D model of the environment by separating the target object from the other portions of the environment model. In some cases, the 3D model of the target object is instead extracted at the remote system or at an intermediary server system.”

“At stage 310, the field system determines one or more anchor points in a 3D space that describes the ambient field environment. Anchor points can be used to align the coordinate systems of the field system and the remote system, which allows virtual objects placed in the virtual field environment to be rendered at corresponding positions in the real-world field environment. The system may designate one, two, three, or more positions in the 3D model of the target object (or of at least a portion of the ambient environment, if extraction has not occurred) as anchor points. The anchor points can be chosen arbitrarily, or logic can be applied to optimize the selection of specific positions as anchor points.”

“At stage 312, the field system transmits the 3D model of the target object (or of at least a portion of the environment, if extraction has not occurred) to the remote system. In some cases the 3D model is transmitted directly over a network to the remote system; in other implementations the 3D model is transmitted to the remote system indirectly, e.g., via an intermediate server system.”

“At stage 314, the remote system receives the 3D model of the target object from the field system.”

“At stage 316, the remote system renders the 3D model of the target object in a user interface for the remote user. The form of the remote user interface can vary depending on the preferences of the user and the display hardware used to present the rendered model; for example, the user interface can be a virtual representation of the field environment. A rendering engine of the remote system, such as rendering engine 236, can render the 3D model of the target object, and the rendered model can then be presented on a display device (e.g., display device 234).”

“At stage 318, the remote system receives a primary user input to place a virtual marker in the user interface of the remote system. The virtual marker can be placed in the virtual environment on or near the 3D model of the target object (or of at least a portion of the ambient environment). A coordinate value in a 3D coordinate system may specify the location of the marker within the virtual field environment, indicating an offset of the marker's specified location in 3D space relative to the one or more anchor points. The primary user input may also specify other characteristics of the virtual marker, such as the orientation, size, and 3D model of the marker.”

“At stage 320, the remote system can optionally receive a secondary user input, which identifies any additional information the remote user wishes to share with the field user. The additional information can be any type of information that the field system is capable of rendering for the field user. For example, the remote user could dictate or type instructions, or attach audio, hyperlinks, or video for the field user.”

“At stage 322, the remote system transmits an instruction to the field system. The instruction contains data that will cause the field system to augment the view of the field environment with the virtual marker. The instruction can contain a display coordinate value and, optionally, other data characterizing the primary and secondary user inputs, so that the virtual marker and any additional information can be presented to the field user in accordance with those inputs. The remote system transmits the instruction to the field system using a communications interface (e.g., communications interface 232).”

“At stage 324, the field system receives the instruction from the remote system. The instruction can be received over a network using a communications interface at the field system (e.g., communications interface 212).”

“At stage 326, the field system processes the received instruction and renders a virtual marker in an augmented view of the ambient environment of the field user. A rendering engine, such as rendering engine 222, can render the virtual marker in the augmented view of the field environment at the location indicated by the display coordinate value, and the rendered virtual marker can be presented on a display device such as HMD 214. The rendering engine can also render visual representations of any additional information that the remote user provided via the remote system.”

“At stage 328, the field system detects user interaction with the target object within the field environment. The system may monitor user actions in the field environment that occur within a certain distance of the virtual marker shown in the view of the ambient field environment. A state capture engine (e.g., state capture engine 224) may detect the occurrence of pre-defined events in this area of the field environment and, in response, capture and store state data that characterizes the current state of the field system, the field environment generally, the target object, or a combination of these. For example, the state capture engine can automatically capture and store data about a tool used to complete a task on the target object, such as where a wrench was applied to repair a pipe, the orientation of the wrench, the number of turns made, and the torque applied.”

“At stage 330, the field system transmits data characterizing the captured and stored state data to the remote system using a communications interface, e.g., communications interface 212. At stage 332, the remote system receives the state data from the field system via a communications interface (e.g., communications interface 232).”

“At stage 334, the remote system uses the state data to render a virtual representation of the field user's interaction in the user interface of the remote system. The virtual representation of the field user's interaction can be displayed in the remote system's virtual field environment, along with the 3D model of the target object (or of at least a portion of the environment, if the target object was not extracted). The virtual representation can be displayed simultaneously with the virtual marker that the remote user placed in the virtual environment at stage 318. For example, a second virtual marker representing the place in the environment where the field user applied a tool to the target object can be displayed alongside the first virtual marker placed by the remote user. Displaying both virtual markers simultaneously provides a visual comparison for the remote user: the first virtual marker indicates the desired location for the field user's work, and the second virtual marker shows the location where the field user actually worked. The remote user can compare the two markers and decide whether the field user's work is acceptable. In some cases, the remote system may issue an alert to the remote user if the positions of the two virtual markers differ by more than a threshold distance. The remote system may also issue an alert if the field user's work does not conform to prescribed parameters, e.g., if the field user applied a tool in a different manner than specified or torqued the wrench in the wrong direction.”
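A minimal sketch of the comparison and alerting step follows, assuming both positions are expressed in the shared anchor-based coordinate system; the 5 cm threshold is an illustrative default, not a value from the specification.

```python
import numpy as np

def check_tool_placement(prescribed_marker_pos, observed_tool_pos, threshold_m=0.05):
    """Compare the remote user's prescribed marker position with the position at
    which the field user actually applied the tool, and flag deviations that
    exceed a distance threshold so the remote system can raise an alert."""
    deviation = np.linalg.norm(np.asarray(prescribed_marker_pos, dtype=float) -
                               np.asarray(observed_tool_pos, dtype=float))
    return {"deviation_m": float(deviation), "alert": deviation > threshold_m}
```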

“At stage 336, the remote system may receive corrective feedback from the remote user if the remote user determines that the field user's work needs to be corrected in some way. Corrective feedback may be given if the remote user does not approve of the field user's work, has additional requirements for the work, or observes that the field user did not follow the remote user's instructions. In some implementations, the corrective feedback is provided to the remote system via an input manager (e.g., primary input manager 240 or secondary input manager 242). For example, if the remote user wants to communicate to the field user that the position at which the field user is applying a tool should be changed, the remote user can specify the change relative to the positions of the first and second virtual markers. As another example, the remote user can add annotations to existing virtual markers or create new markers to provide feedback to the field user.”

“At stage 338, the remote system transmits data characterizing the remote user's corrective feedback to the field system via a communications interface (e.g., communications interface 232). At stage 340, the field system receives the data using a communications interface (e.g., communications interface 212). After receiving the corrective feedback data, at stage 342, the field system updates the augmented real-world view of the ambient environment. For example, based on the corrective feedback data, the field system can move a virtual marker to a different location in 3D space. Some implementations may show a ‘ghosted’ copy of the virtual marker at its original position while the non-ghosted, fully textured virtual marker is displayed at its updated position.”

“FIGS. 4A-4D show a physical control object that can be used in augmented- or mixed-reality systems to provide efficient user interface elements superimposed on a view of a real-world environment. Many such computing systems do not provide traditional hand-based input controls, such as a keyboard or mouse, even though users may prefer them. A head-mounted display (HMD) that provides direct or indirect views of the environment may be used in mixed-reality and virtual-reality settings while the user performs other tasks, so traditional input controls such as a mouse or keyboard may not be practical. The techniques shown in FIGS. 4A-4D can be used to allow users to navigate complex user interface menus and other elements of a user interface (e.g., within a view of a real-world ambient environment) in an efficient and natural way.”

“FIG. 4A shows a physical control object in the form of a cube 402. Other geometries for the physical control object are also possible, such as a tetrahedron or a square pyramid. The cube 402 has six faces, three of which are shown in FIG. 4A, and respective symbols 406a-c are printed individually on the faces. For example, one face of the cube 402 bears a circular symbol 406a, while another face bears a triangular symbol 406b. The cube 402 is within the field of view of a camera 404. For example, the camera 404 could be an RGB video camera mounted on a head-mounted display so that the cube 402 is within the camera's view when the user turns his or her head to look at the cube 402.”

“While the cube 402 is within the field of view of the camera 404, a computing device coupled to the camera 404, e.g., field system 202 or remote system 204, may use computer vision techniques to recognize the cube 402 as a physical control object and render virtual display elements over its surfaces. In an example augmented view 408a of the cube 402, a first virtual display element 410a is rendered over a first face of the cube 402, a second virtual display element 410b is rendered over a second face, and a third virtual display element 410c is rendered over a third face. The virtual display elements can partially or completely obscure the contents of the respective faces of the cube 402 over which they are superimposed in the augmented view of the environment.”

“In some implementations, a virtual display element provides a user control with which the user can interact by manipulating the physical control object (e.g., cube 402). FIG. 4C illustrates a circular dial 412, a virtual display element that has been rendered over the second face of the cube 402. Based on the system’s identification that the second face of the cube 402 is oriented in a predetermined direction (e.g., toward the camera 404), the system renders the dial 412 over the second face. To perform an action with the computing system, the user can rotate the dial 412 by rotating the cube 402 while keeping the second face oriented toward the camera 404. The dial 412 can be fixed relative to the cube 402 so that it follows the cube 402 while the cube is in the field of view of the camera.
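
As an illustration of this kind of interaction, the following is a minimal sketch (not taken from the specification) of how a dial value might be derived from the control object's rotation about the camera axis; the function and parameter names are hypothetical.

    import math

    def dial_value_from_roll(roll_radians: float, min_value: float = 0.0,
                             max_value: float = 100.0) -> float:
        """Map the cube's roll angle about the camera axis to a dial value.

        Assumes an upstream pose tracker (hypothetical) reports the roll of the
        face that currently points toward the camera, in radians.
        """
        # Normalize the roll angle to the range [0, 2*pi).
        normalized = roll_radians % (2.0 * math.pi)
        # Linearly map the normalized angle onto the dial's value range.
        return min_value + (normalized / (2.0 * math.pi)) * (max_value - min_value)

    # Example: a quarter turn of the cube selects one quarter of the dial's range.
    print(dial_value_from_roll(math.pi / 2))  # -> 25.0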

“In some cases, the user can activate different virtual display elements simply by rotating the cube 402 so that different faces are oriented toward the camera 404. FIG. 4D illustrates that the system can render a menu 414 rather than the dial 412 when the cube 402 is rotated so that its first face is oriented toward the camera 404.

“FIG. 5 is a flowchart that illustrates an example process 500 for rendering virtual display elements in a user interface based on actions taken with a physical control object (e.g., cube 402) in the field of view of a camera. The process 500 may be carried out by a system of one or more computers in one or more locations. At stage 502, the system detects a physical control object in the field of view of the camera. The physical control object can be a geometric shape that has multiple faces with a respective symbol printed on each face. At stage 504, the system analyzes images or video of the physical control object to identify a first symbol on the face that is oriented toward the camera. A database may map each of the symbols printed on the physical control object to a corresponding virtual display element. At stage 506, the system determines the virtual display element that is mapped to the first symbol oriented toward the camera. At stage 508, the system renders the selected virtual display element in a view of an environment that includes the physical control object; the selected virtual display element is rendered at least partially over the face of the physical control object that is oriented toward the camera. At stage 510, the system identifies that the user has manipulated the physical control object while the face bearing the first symbol remains oriented toward the camera, e.g., by rotating, tilting, shaking, or tapping the control object. The system can detect the user’s interaction with the control object by analyzing images or video of the object using computer-vision techniques. At stage 512, the system performs an operation with the virtual display element based on the user interaction. For example, the system might respond to a tilt of the control object by scrolling through a list of items in the virtual display element, and a shake or tap by the user might activate the display element so that the item currently in focus is selected. In some cases, the user can activate a different virtual display element by turning the physical control object so that a different face is oriented toward the camera. At stage 514, the system detects a second symbol on a second face of the physical control object that is now oriented toward the camera, and the process returns to stage 506 to continue based on the orientation of the second face.
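
The following is a minimal, self-contained sketch of the symbol-to-display-element mapping loop described above; it is illustrative only, and the class names, the symbol detector, and the gesture labels are hypothetical stand-ins rather than elements of the specification.

    from dataclasses import dataclass
    from typing import Callable, Dict, Optional

    @dataclass
    class VirtualDisplayElement:
        name: str
        on_interaction: Callable[[str], None]  # receives a gesture label

    # Hypothetical database mapping printed symbols to virtual display elements.
    ELEMENT_DB: Dict[str, VirtualDisplayElement] = {
        "circle": VirtualDisplayElement("menu", lambda g: print(f"menu handles {g}")),
        "triangle": VirtualDisplayElement("dial", lambda g: print(f"dial handles {g}")),
    }

    def detect_front_symbol(frame) -> Optional[str]:
        """Stand-in for a computer-vision routine that reports which symbol
        currently faces the camera (returns None if no control object is seen)."""
        return frame.get("symbol")  # here a "frame" is just a dict for illustration

    def detect_gesture(frame) -> Optional[str]:
        """Stand-in for gesture detection (e.g., 'tilt', 'shake', 'tap')."""
        return frame.get("gesture")

    def process_frames(frames) -> None:
        active: Optional[VirtualDisplayElement] = None
        for frame in frames:
            symbol = detect_front_symbol(frame)          # stages 504/514
            if symbol in ELEMENT_DB:
                element = ELEMENT_DB[symbol]             # stage 506
                if element is not active:
                    active = element
                    print(f"render {active.name} over the front face")  # stage 508
            gesture = detect_gesture(frame)              # stage 510
            if active and gesture:
                active.on_interaction(gesture)           # stage 512

    # Example run with synthetic "frames".
    process_frames([
        {"symbol": "circle"},
        {"symbol": "circle", "gesture": "tilt"},
        {"symbol": "triangle", "gesture": "tap"},
    ])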

“FIGS. 6A-6D show techniques for interfacing with a gaze-activated user interface element. The physical limitations of an HMD can limit the area of a view of an environment that can be augmented with user interface elements (virtual display elements), and user interface elements can distract from the user’s ability to see the real-world environment through the HMD. The techniques of FIGS. 6A-6D can address these concerns in some implementations.

“FIG. 6A shows a view 602a of an environment that is presented to the user via an HMD (e.g., HMD 214). The environment can be real-world or virtual. The view contains a gaze-activated user interface element 606. The element 606 might be quite small, e.g., only 5-10 pixels in height and width, so that it occupies only a small portion of the viewing area that can be augmented with virtual objects. The HMD may include sensors, such as orientation and motion sensors and/or eye-tracking sensors, that detect the direction of the user’s gaze while the user wears the HMD 214. The reticle 604 represents the focal point of the user’s gaze, i.e., the point at which the user’s gaze is currently directed. The reticle 604 moves as the user moves his or her eyes or head.

“In some cases, the HMD displays the element 606 partially or fully transparently to reduce distraction while the user’s gaze is not in close proximity to the element 606. The transparency of the element 606 can then decrease as the focus of the user’s gaze moves closer to it. Other aspects of the element’s visual appearance, such as its color, size, shape, or a combination of these, may also vary based on the distance between the element 606 and the reticle 604.

“In some implementations, one or more action thresholds are defined around the user interface element 606. FIG. 6A illustrates a first action threshold 608 located at a first distance from the center of the element 606 and a second action threshold 610 located at a shorter distance from the element 606. The system executes an action when the focus of the user’s gaze crosses an action threshold. The actions can include, for example, expanding the element 606 or adding descriptive detail about the element 606. FIG. 6B shows that the system displays descriptive text 612 in a pop-up near the element 606 when the user’s gaze crosses the first threshold 608. As the user’s focus moves still closer to the element 606 and crosses the second threshold 610, the system may display a pop-up window 614, as shown in view 602c of FIG. 6C. The pop-up window 614 can display media content, textual information, or other user interface elements, such as a list of interactive items organized in a menu.
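
The following is a minimal sketch, not drawn from the specification itself, of how gaze distance might drive both the element's transparency and the threshold-based actions described above; the numeric thresholds and function names are hypothetical.

    import math

    # Hypothetical distances, in pixels, from the center of the gaze-activated element.
    FIRST_THRESHOLD = 120.0   # crossing this shows a short descriptive label
    SECOND_THRESHOLD = 40.0   # crossing this opens a detailed pop-up window

    def gaze_distance(reticle_xy, element_xy) -> float:
        """Screen-space distance between the gaze reticle and the element center."""
        dx = reticle_xy[0] - element_xy[0]
        dy = reticle_xy[1] - element_xy[1]
        return math.hypot(dx, dy)

    def element_opacity(distance: float) -> float:
        """Fade the element in as the gaze approaches: fully transparent beyond
        the first threshold, fully opaque at the element itself."""
        if distance >= FIRST_THRESHOLD:
            return 0.0
        return 1.0 - distance / FIRST_THRESHOLD

    def action_for_gaze(distance: float) -> str:
        """Choose a display action based on which threshold the gaze has crossed."""
        if distance <= SECOND_THRESHOLD:
            return "show_popup_window"
        if distance <= FIRST_THRESHOLD:
            return "show_descriptive_text"
        return "collapsed"

    # Example: gaze moving toward an element centered at (400, 300).
    for reticle in [(700, 300), (500, 300), (420, 300)]:
        d = gaze_distance(reticle, (400, 300))
        print(round(d), round(element_opacity(d), 2), action_for_gaze(d))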

“In some cases, the gaze-activated user interface element 606 can be divided into multiple parts. FIG. 6D shows a view 602d of an environment in which the gaze-activated element 606 is divided into a top half and a bottom half. A first pop-up window 616a opens when the user’s gaze is focused on the top half of the element 606, and a second pop-up window 616b is displayed when the user’s gaze is focused on the bottom half of the element 606.

“FIG. 7 is a flowchart showing an example process 700 for interfacing with a gaze-activated user interface element. The process 700 may be performed by one or more computers in one or more locations, such as the field system 202, the remote system 204, or a head-mounted display (e.g., HMD 214).

“At stage 702, a gaze-activated user interface element, e.g., element 606, is displayed within an augmented view. The element could be an icon or another small virtual marker that is fixed at a location in 3D space as seen through the lenses or screen of an HMD. The element can be rendered partially or fully transparent when the user’s gaze is focused beyond a certain distance from the element.

Summary for “Rendering virtual objects within 3D environments”

“These and other implementations may optionally include one or more of the following features.”

The instruction to display the virtual marker at the specified position in the environment of the first user may include a coordinate value that identifies the position, and optionally the orientation, of the virtual marker in a 3D space. The coordinate value can identify the position of the virtual marker in a 3D space of a second environment other than the environment of the first user. To determine the position of the virtual marker within the environment of the first user, the system can map the coordinate value from the 3D space of the second environment to a 3D space of the environment of the first user. The coordinate value can identify the position of the virtual marker in the 3D space of the second environment as a first 3D offset from one or more anchor points in the second environment. Mapping the coordinate value from the 3D space of the second environment to the 3D space of the environment of the first user can then include (1) identifying one or more corresponding anchor points in the environment of the first user and (2) applying a second 3D offset to the one or more anchor points of the environment of the first user to identify the position of the virtual marker within that environment.
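
As a concrete illustration of this anchor-and-offset mapping (a sketch under assumed data structures, not the patented implementation), the marker's position can be expressed as an offset from an anchor point in the second user's space and re-applied to the matching anchor point in the first user's space:

    from typing import Tuple

    Vec3 = Tuple[float, float, float]

    def offset_from_anchor(marker_pos: Vec3, anchor: Vec3) -> Vec3:
        """First 3D offset: marker position relative to an anchor point in the
        second (remote) environment's coordinate system."""
        return tuple(m - a for m, a in zip(marker_pos, anchor))

    def apply_offset(anchor: Vec3, offset: Vec3) -> Vec3:
        """Second 3D offset: re-apply the offset to the corresponding anchor
        point in the first (field) user's environment."""
        return tuple(a + o for a, o in zip(anchor, offset))

    # Example with aligned coordinate systems and corresponding anchor points.
    remote_anchor: Vec3 = (1.0, 0.0, 2.0)   # e.g., a corner of the remote workspace
    field_anchor: Vec3 = (4.0, 0.0, -1.0)   # the corresponding corner in the field
    remote_marker: Vec3 = (1.5, 1.2, 2.3)   # where the second user placed the marker

    offset = offset_from_anchor(remote_marker, remote_anchor)
    field_marker = apply_offset(field_anchor, offset)
    print(field_marker)  # -> approximately (4.5, 1.2, -0.7)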

“Displaying the virtual marker at the specified position in the environment can include superimposing the virtual marker over the view of the environment of the first user so that the virtual marker visually appears to be located at the specified position in the environment, at a distance from the first user.”

“The interaction of the second user with the 3D model of at least the portion of the environment can occur while the second user is located remotely from the environment of the first user.”

The 3D model of at least the portion of the environment can include a model of a target object that is located within the environment. The interaction of the second user with the 3D model can include placing the virtual marker at a location on the 3D model of the target object, and the specified position at which the virtual marker is displayed in the environment of the first user can correspond to that location.

“The first computing system can obtain, from one or more spatial sensors, spatial sensing data that indicates spatial characteristics of the environment of the first user, and can generate the 3D model of at least the portion of the environment based on the spatial sensing data.

The first computing system can identify, within the 3D model of at least the portion of the environment of the first user, a virtual object that represents a target object located in the environment. The system can create a 3D model of the target object by extracting the corresponding portion from the 3D model of the environment. The second user can then place the virtual marker at the specified position within a virtual environment that includes the 3D model of the target object.

The device can be a head-mounted display (HMD) device, and the HMD can display the virtual marker at the specified position in the environment while the HMD is worn by the first user.

“Providing the view of the environment can include generating an augmented view that shows one or more virtual objects at fixed positions within a 3D space.”

“Some implementations of the subject matter described herein include another computer-implemented method. The method may include: receiving, by a computing system, data representing a 3D model of at least a portion of an environment of a first user; rendering, by the computing system, the 3D model for presentation to a second user; identifying that a user input placed a virtual marker at a specified position in a virtual environment that includes the 3D model; and transmitting an instruction to display the virtual marker within a view of the environment of the first user at a position that corresponds to the specified position in the virtual environment at which the virtual marker was placed.

These and other implementations may optionally include one or more of the following features. Rendering the 3D model of at least the portion of the environment can include augmenting a real-world view of a second environment, in which the second user is located, with the 3D model of at least the portion of the environment of the first user.

“A head-mounted display device (HMD) can be used to augment the real-world view with the 3D model of at least the portion of the environment for the second user.”

“A portable computing device or a 3D display device can be used to render the 3D model of at least the portion of the environment of the first user.”

“The 3D model of at least the portion of the environment of the first user can include a virtual object that represents a target object located in the environment. The system can create a 3D model of the target object by extracting it from the 3D model of at least the portion of the environment. Identifying that the user input placed the virtual marker at the specified position in the virtual environment can include identifying that the user input placed the virtual marker at a location on a surface of the 3D model of the target object.

“After transmitting the instruction to display the virtual marker within the view of the environment of the first user, the system can perform additional operations, such as: receiving data that identifies a position in the environment at which a physical implement was brought into contact with, or near, the target object; and rendering, by the computing system and for display to the second user within the virtual environment, a marker that represents the physical implement at a position relative to the 3D model of the target object that corresponds to the position in the environment at which the physical implement was brought into contact with, or near, the target object.

“The system can also display the marker representing the physical implement concurrently with the previously placed virtual marker. The computing system can then receive a second user input that indicates whether the physical implement was correctly positioned relative to the target object, and, if the second input indicates that the positioning was correct, the computing system can transmit a corresponding notification to a computing device associated with the first user.

“The 3D model of at least the portion of the environment can include a model of a target object. The system can perform further operations that include receiving sensor data identifying values of one or more operational parameters of the target object and rendering, for display to the second user along with the 3D model of the target object, user interface elements that represent the values of those parameters.

“Additional implementations of the subject matter disclosed herein include one or more computer-readable media encoded with instructions that, when executed by one or more processors, cause the processors to perform any of the methods or processes disclosed herein. The computer-readable media may be part of a computing system that further includes the one or more processors.

“Some implementations may achieve one or more of the following advantages. First, the system can facilitate better communication between a first user (e.g., a field user performing repairs or diagnostics on a device) and a second user (e.g., a remote user who has expertise with the device). The second user can provide remote assistance by placing virtual markers at specific locations within a virtual environment that is based on a 3D model of the first user’s environment; because the second user interacts with a 3D model rather than a two-dimensional representation of the device, such as images or a video stream, the markers can be placed on or near specific locations more accurately. Second, rendering the virtual markers in the environment of the first user makes it easier for the first user to see the exact location indicated by each marker in the three-dimensional environment. Third, better placement of the virtual markers may reduce the number of transmissions required between a field computing device and a remote computing device. Fourth, identifying a coordinate, and optionally an orientation, of a virtual marker may require less data than other methods of communicating positional information, such as voice or video, so using virtual markers to communicate positional information between remote users may reduce latency and conserve bandwidth. Fifth, extracting a target object or other portion from a 3D model of an environment can decrease the file size to be transmitted, which may reduce transmission time and channel bandwidth. Sixth, a gaze-activated user interface element may be kept in a collapsed state and expanded to present detailed information based on the direction of the user’s gaze, allowing the user to inspect a real-world environment with minimal distraction.

“Additional features and advantages will be evident from the description, the claims, and the drawings to one of ordinary skill in the art.”

“DESCRIPTION OF DRAWINGS”

“FIGS. 1A-1F depict views from an example remote modeling system in which a remote expert communicates with a field worker to provide precise information for specific repair or maintenance tasks.

“FIG. 2 is a block diagram of an example remote modeling system.

“FIGS. 3A-3C show a flowchart of an example process for placing and rendering virtual markers across remote locations.

“FIGS. 4A-4D show a physical control object that can be used in augmented- or mixed-reality systems to provide efficient user interface elements superimposed on views of real-world environments.

“FIG. 5 is a flowchart of an example process for rendering virtual display elements in a user interface based on actions taken with a physical control object.

“FIGS. 6A-6D show techniques for interfacing with a gaze-activated user interface element in an augmented view of a real-world environment.

“FIG. 7 is a flowchart that illustrates an example process for monitoring and responding to user interactions with a gaze-activated user interface element.

“Like reference numbers and designations in the various drawings indicate like elements.”

“FIGS. 1A-1F depict interactions with an example remote modeling system. Field user 102 is situated in a field environment where he or she can interact with one or more target objects. In the example of FIGS. 1A-1F, the target object is a furnace system 104. The field user 102 could be, for example, an HVAC technician performing maintenance or repair work on the furnace system 104 in a basement or other mechanical room. During the repair or maintenance, the field user 102 might want to consult another user, such as a user who is an expert with the furnace system 104, a supervisor who conducts remote quality checks on the field user’s work, or a homeowner who is absent during the repair or maintenance. The user with whom the field user 102 communicates via the remote modeling system is referred to here as the expert user.

“Beginning with FIG. 1A, the field user 102 is associated with a field computing system that may include a spatial modeling engine, which uses data from a spatial sensing subsystem to create a 3D model of all or part of the environment surrounding the user. For example, the spatial sensing subsystem could include an infrared emitter that projects a pattern of infrared light into the field environment and a depth camera that detects distortions in the infrared pattern projected onto the environment to determine distances from the camera. The spatial modeling engine can use the depth camera data to create the 3D model. In some cases, the field user 102 may point the spatial sensing subsystem at a target object (e.g., furnace system 104) and walk around the object to obtain a more complete 3D model.

“The field user 102 uses a display device to view the target object and other aspects of the surrounding environment. FIG. 1A shows a view 100 of the ambient environment of the field user 102, with particular attention to the furnace system 104. In some cases, the display device that provides the view 100 is a head-mounted display (HMD), which the field user 102 wears on his or her head to become immersed in the view 100.

“After the spatial modeling engine generates a 3D model of the environment of the field user 102, the system transmits data representing at least a portion of the ambient environment to a remote system. In some implementations, the transmitted 3D model is the entire model of the ambient environment generated by the spatial modeling engine. In other implementations, less than the entire model is transmitted. For example, the system might identify the furnace system 104 as the target object in the field user’s environment, extract the portion of the 3D model that corresponds to the furnace 104 from the overall 3D model (e.g., by excluding surroundings not relevant to the furnace 104), and thereby create a 3D model of the furnace. The field computing system can then transmit the furnace model to the remote portion of the modeling system.

“FIG. 1B shows a view 150 generated by a remote computing system for an expert user who provides remote assistance to the field user 102 in connection with the repair or maintenance task on the furnace system 104. The system can present the 3D furnace model 154 to the expert user in a variety of ways. In some cases, the expert user uses a mixed-reality HMD, and the 3D furnace model 154 is rendered as an object that augments a direct real-world view 150 of the expert user’s own environment, e.g., a shop or lab seen through the HMD’s transparent lenses. The 3D furnace model 154 can be placed in the expert user’s view 150 at a natural location, e.g., on the floor at a fixed place in the environment, and the mixed-reality HMD allows the expert user to physically move around the virtual furnace model 154 in order to inspect and view the target object from different angles. Other implementations render the 3D furnace model 154 for the expert user using different types of displays. For example, the system could render the 3D furnace model 154 on a tablet computer, a desktop monitor, or a television monitor, or on a 3D display that uses stereoscopic imaging to give the model the illusion of depth.

“The remote modeling system allows the expert user to interact directly with the 3D model. For example, the expert user may “annotate” the model by placing virtual markers in the virtual field environment in which the 3D model of the target object is rendered.

Virtual markers are virtual objects that augment a view of a 3D environment and identify specific locations within the environment. FIG. 1C shows that the expert user has added a virtual marker 156 proximate to a coupling on the gas pipe of the 3D model 154 of the furnace 104 from the environment of the field user 102. To accurately communicate location (or position) information to the field user 102, the expert user can specify exactly where the virtual marker should be placed in the 3D environment. For example, if the expert user suspects a leak in the gas pipe, the expert user can place the virtual marker on or near the coupling. The expert user can manipulate the 3D model 154 to specify the location of the virtual marker 156 precisely; on a tablet, for instance, the expert user can use touchscreen gestures to move the virtual marker and to rotate, pan, or zoom the 3D model 154. In implementations in which the 3D model 154 is displayed in a mixed-reality or virtual-reality environment, the expert user may use hand gestures, a pointing device, or a pen to indicate the location of the virtual marker in 3D space.

A virtual marker may convey more than just positional or location information. For example, the expert user can assign an orientation to the virtual marker, give the marker a specific appearance (e.g., by selecting a 3D model for it), or add media content, animation, text, or a note to the marker. The expert user might suggest a tool (implement) that the field user 102 should use to work on the pipe coupling by selecting a 3D model of that tool, and might add a brief note to the selected 3D model giving instructions on how to use the tool. The system may select an icon to represent a virtual marker according to the type of supplemental content associated with it. For example, a second virtual marker 158 shown in FIG. 1C includes audio content, and the system renders it with a speaker icon to alert the field user 102 that audio content, here recorded verbal instructions from the expert user, is associated with the marker.

“After the expert user confirms the placement of one or more virtual markers in the 3D environment, on or around the 3D model of the target object, the remote portion of the modeling system transmits data representing these virtual markers to the field portion. The field portion processes the received data in order to render corresponding virtual objects in the view 100 of the environment. FIG. 1D shows a first virtual marker 106, corresponding to virtual marker 156, rendered on or near the furnace’s top pipe coupling at a location that corresponds to the location of the virtual marker 156 relative to the 3D model 154 of the furnace 104. A second virtual marker 108, corresponding to virtual marker 158, is rendered at the base of the furnace at a location that corresponds to the location of the virtual marker 158 relative to the furnace model. In implementations that use a mixed-reality HMD device, the virtual markers 106 and 108 can be fixed in 3D space within the view 100, which allows the field user 102 to see their exact locations even as he or she moves about the environment.

“In certain instances, the field user can use the virtual marker 106 displayed in the view 100 as guidance for work in the environment. For example, the expert user might have placed the virtual marker 106 to indicate to the field user 102 the exact location on the target object where a tool should be applied. The field user 102 can see the virtual marker 106 in the view 100, which indicates that the pipe coupling should be torqued using a wrench 110. In some cases, the field portion of the remote modeling system captures the position and orientation of the wrench 110 as the field user 102 torques the pipe coupling. The captured data is transmitted to the remote computing system, and the view 150 can then display a virtual representation 160 of the wrench 110 at a location on the furnace model 154 that corresponds to where the field user 102 applied the wrench 110 to the furnace system 104. The virtual representation 160 can be a 3D model of the wrench 110, sized and positioned to visually match the wrench used by the field user 102. The virtual representation 160 can be displayed concurrently with the virtual marker 156 that the expert user previously placed, so that, in the remote environment, the expert user can compare the location at which the wrench 110 was applied with the location of the virtual marker 156. If the location is acceptable, the expert user can send a confirmation message to the field user 102 indicating that the placement of the wrench 110, as shown by the virtual representation 160, is acceptable. If the location at which the wrench 110 was applied is not satisfactory, the expert user can adjust the location of the virtual marker 156, and the updated location can be transmitted to the field system so that the virtual marker 106 is moved to the new location in the view 100.

“FIG. 2 is a block diagram of an example remote modeling system 200. The remote modeling system 200 may include multiple computers located at different locations, and in some cases the system 200 is configured to perform the techniques described herein with respect to FIGS. 1A-1F and 3A-3C. The system 200 generally includes at least two portions that are remote from each other: a field system 202, which is used in the field environment, and a remote system 204, which is located away from the field environment. The remote system 204 and the field system 202 may communicate over one or more networks. In some cases, the remote system 204 and the field system 202 communicate indirectly through a server system 206, which passes messages between the field system 202 and the remote system 204. In some implementations, one or more of the components 210-228 shown in FIG. 2 are part of the field system 202, and one or more of the components 230-242 shown in FIG. 2 are part of the remote system 204. Each of the components 210-228 and 230-242 can include one or more shared or dedicated processors.

“The field system 202 includes components 210-228, which provide services to a user in the field environment. The controller 210 manages operations of the field system 202 and coordinates activities among the various components 212-228. The communications interface 212 transmits and receives messages over one or more networks, allowing the field system 202 to communicate with the remote system 204, the server system 206, or both.

In other implementations, the field system 202 may use display devices other than a head-mounted display. For example, an indirect view of the environment, such as a video stream, can be presented on an electronic display screen of a computer monitor (e.g., an LCD or LED display), a television, a mobile phone, or a tablet computing device.

“The field system 202 also includes a spatial modeling engine 216, which is configured to create a 3D model of the field user’s ambient environment. The spatial modeling engine 216 receives spatial sensing data from one or more sensors of a spatial sensing subsystem and processes the spatial sensing data to create the 3D model. The spatial sensing data indicates spatial characteristics of the field user’s environment. In some implementations, the spatial sensing data includes data from an infrared emitter that projects a pattern of infrared light into the field environment and from a depth camera that detects distortions in the projected infrared pattern to determine distances from the camera. The spatial modeling engine 216 can use the depth camera data to create the 3D model. In some instances, other computer-vision technologies can complement or substitute for the emitter and depth camera approach. For example, the spatial modeling engine 216 can analyze images, video, and other spatial sensing data to recognize objects in the environment, generate 3D models of the recognized objects, and arrange the models in a manner corresponding to the arrangement of the physical objects. The spatial modeling engine 216 may generate the 3D model in any suitable form, such as a 3D polygonal mesh.
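
As a simplified illustration of how depth data of this kind can be turned into 3D geometry (a sketch under assumed pinhole-camera parameters, not the engine's actual algorithm), a depth image can be back-projected into a point cloud that a meshing step could then consume:

    from typing import List, Tuple

    def depth_to_point_cloud(depth: List[List[float]],
                             fx: float, fy: float,
                             cx: float, cy: float) -> List[Tuple[float, float, float]]:
        """Back-project a depth image (meters per pixel) into camera-space 3D
        points using hypothetical pinhole intrinsics (fx, fy, cx, cy)."""
        points = []
        for v, row in enumerate(depth):
            for u, z in enumerate(row):
                if z <= 0.0:           # skip pixels with no depth reading
                    continue
                x = (u - cx) * z / fx  # standard pinhole back-projection
                y = (v - cy) * z / fy
                points.append((x, y, z))
        return points

    # Example: a tiny 2x2 depth image with readings in meters.
    cloud = depth_to_point_cloud([[1.0, 1.0], [0.0, 2.0]],
                                 fx=500.0, fy=500.0, cx=0.5, cy=0.5)
    print(cloud)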

The spatial modeling engine 216 can scan all or part of the field user’s environment. The 3D model it generates may therefore include portions that are not required or desired by the remote expert. For example, the field user might be repairing a photocopier machine in an office mailroom, with the remote expert assisting by interacting with a 3D model. The 3D model generated by the spatial modeling engine 216 may represent the entire office mailroom environment and thus include aspects of the environment beyond the photocopier machine itself. The remote modeling system can therefore include an extraction engine 228 that removes extraneous elements from the 3D model of the environment. The extraction engine 228 locates a target portion of the 3D model and extracts that target portion from the other portions. The target portion can include one or more target objects that are to be rendered in the remote user’s environment, while extraneous portions of the 3D model are excluded. For example, the extraction engine 228 might analyze the initial 3D model of the mailroom environment, identify the photocopier machine within that initial model, and extract a 3D model of the photocopier by removing the parts of the initial model that are not relevant to the photocopier.

The extraction engine 228 can identify target objects in 3D models automatically or based on user input. For example, the extraction engine 228 can use object recognition and object detection techniques to automatically identify a target object in the 3D model generated by the spatial modeling engine 216. Alternatively, the extraction engine 228 can operate on user input, such as a selection of an area or point in a representation of the 3D environment model that corresponds to the target object; the extraction engine 228 correlates the selection with the target object and extracts the 3D model of the target object from the initial 3D model of the environment. The representation of the 3D environment model can be presented to the user as a rendering, an image, or in another appropriate form.
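
The following is a minimal sketch of the user-input variant, assuming the environment model is represented as a point cloud like the one above; selecting a point and cropping everything within a radius of it stands in for the extraction step, and the radius parameter is purely illustrative.

    import math
    from typing import List, Tuple

    Point = Tuple[float, float, float]

    def extract_target(cloud: List[Point], selected: Point,
                       radius: float = 0.75) -> List[Point]:
        """Keep only the points within `radius` meters of the user-selected
        point, discarding the extraneous portions of the environment model."""
        def dist(a: Point, b: Point) -> float:
            return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
        return [p for p in cloud if dist(p, selected) <= radius]

    # Example: two clusters of points; selecting a point in the first cluster
    # extracts that cluster as the target object's model.
    environment = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.1), (3.0, 0.0, 4.0), (3.1, 0.1, 4.2)]
    target_model = extract_target(environment, selected=(0.05, 0.0, 1.05))
    print(target_model)  # -> the two points near the selection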

“In FIG. 2, the extraction engine 228 is shown as part of the field system 202. In other implementations, however, the extraction engine 228 may be located in the remote system 204 or the server system 206, or distributed across multiple parts of the system 200. Different benefits may result depending on where the extraction engine 228 is located. If the extraction engine 228 is provided at the field system 202, the data files representing the 3D model that are transmitted from the field can be smaller than the original environment model before extraction. However, extracting a target object from the environment’s 3D model can be computationally costly, and the time required may differ depending on whether the extraction is performed at the field system 202, the server system 206, or the remote system 204; the extraction engine 228 can accordingly be provided at any of the systems 202, 204, and 206.

The field system 202 can display virtual objects, such as virtual markers, within the view of the environment provided by the HMD 214. The virtual objects added to the view may include virtual markers or other items placed by users at the remote system 204, and local users, such as the field user at the field system 202, may also insert virtual objects into the view of the ambient environment. To these ends, the field system 202 includes a virtual objects manager 220 and a rendering engine 222.

“The virtual objects manager 220 maintains information about virtual objects that can be displayed in the view of the ambient environment. The field system 202 may receive, via the communications interface 212, an instruction from a remote user to display a virtual object at a specified location in the ambient field environment. The instruction may include various parameters that affect how the virtual object is displayed, such as a display coordinate value, an object orientation value, a model identification value, and one or more secondary data values. The display coordinate value specifies the location in 3D space at which the virtual object should appear. The object orientation value specifies the orientation in which the virtual object should appear (e.g., vertical, horizontal, or tilted). The model identification value specifies the visual representation to display for the virtual object: the virtual objects manager 220 can include a models database that maps model identification values to pre-stored 2D models (e.g., graphics or icons) and 3D models at the field system 202, and the virtual objects manager 220 can look up the received model identification value in the models database to determine which 2D or 3D model to display in the view of the environment. The secondary data values can include any additional information about how the virtual object should be displayed to the user, such as a text label to be displayed alongside the 2D or 3D model of the virtual object, a link to an additional resource (e.g., an application or a webpage), or media content to render with the 2D or 3D model of the virtual object.
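
A minimal sketch of such an instruction and the models-database lookup might look like the following; the field names and the example database contents are hypothetical, not part of the specification.

    from dataclasses import dataclass, field
    from typing import Dict, Optional, Tuple

    @dataclass
    class DisplayInstruction:
        """Instruction received by the field system to display a virtual object."""
        display_coordinate: Tuple[float, float, float]   # position in 3D space
        orientation: Tuple[float, float, float]          # e.g., Euler angles in degrees
        model_id: str                                    # key into the models database
        secondary_data: Dict[str, str] = field(default_factory=dict)  # label, link, media

    # Hypothetical models database mapping model identification values to assets.
    MODELS_DB: Dict[str, str] = {
        "marker.pin": "assets/pin.glb",
        "marker.audio": "assets/speaker_icon.png",
    }

    def resolve_model(instruction: DisplayInstruction) -> Optional[str]:
        """Look up which pre-stored 2D/3D asset should be rendered for the object."""
        return MODELS_DB.get(instruction.model_id)

    instr = DisplayInstruction(
        display_coordinate=(0.4, 1.2, 2.0),
        orientation=(0.0, 90.0, 0.0),
        model_id="marker.audio",
        secondary_data={"label": "Check this coupling", "media": "note.wav"},
    )
    print(resolve_model(instr))  # -> assets/speaker_icon.png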

“As mentioned in the previous paragraph, the instruction received by the field system 202 may contain a display coordinate value that indicates where a virtual object should appear in 3D space. A question arises as to how this coordinate value should be interpreted so that the virtual object is displayed at the correct position in the view of the environment, particularly when another user (e.g., a remote expert at the remote system 204) placed the virtual object in a 3D environment different from the field user’s ambient environment. In some implementations, the virtual objects manager 220 includes an alignment-and-translation engine (not shown in FIG. 2) that converts a received display coordinate value to a translated display coordinate value, which identifies the virtual object’s display position in a 3D space specific to the ambient field environment. If, for instance, the coordinate systems of the ambient field environment and the remote environment in which the remote user placed the virtual object are not aligned, the received display coordinate value can be translated in this way. In other implementations, the alignment-and-translation engine aligns the coordinate systems of the remote environment and the ambient field environment in advance, eliminating the need to translate the received display coordinate value. To align the coordinate systems, the spatial modeling engine 216 first determines one or more anchor points in the ambient field environment. These anchor points serve as reference positions (e.g., origins) in a 3D coordinate system that defines spatial positions within the ambient field environment. For example, an anchor point could be located at the corner of a room in the ambient field environment and assigned the origin position (0, 0, 0) in a Cartesian coordinate system; the locations of objects in a 3D model of the environment can then be determined by their offsets from the origin. The field system 202 can transmit to the remote system 204, along with the 3D model of the environment or of target objects within it, data that identifies the anchor points in the ambient field environment. The remote system 204 can then use the same anchor points to generate coordinate values for virtual objects placed in the remote environment in terms of their offsets from those anchor points. In this way, the coordinate systems of the field and remote environments yield equivalent coordinate values: if a remote user places a virtual object at a particular location on the surface of a 3D model of a target object in the remote environment, the coordinate value that identifies the location of the virtual object in the remote environment also identifies the equivalent location on the surface of the target object that actually exists in the field.
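
Where the two coordinate systems are not already aligned, a translation step like the following sketch could be used; it is illustrative only, and it assumes each system reports a single anchor point and a yaw rotation about the vertical axis, which is not something the specification prescribes.

    import math
    from typing import Tuple

    Vec3 = Tuple[float, float, float]

    def translate_display_coordinate(remote_coord: Vec3,
                                     remote_anchor: Vec3,
                                     field_anchor: Vec3,
                                     yaw_offset_degrees: float) -> Vec3:
        """Convert a display coordinate expressed in the remote environment's 3D
        space into the field environment's 3D space, given corresponding anchor
        points and the yaw rotation between the two coordinate systems."""
        # Express the coordinate as an offset from the remote anchor point.
        ox = remote_coord[0] - remote_anchor[0]
        oy = remote_coord[1] - remote_anchor[1]
        oz = remote_coord[2] - remote_anchor[2]
        # Rotate the offset about the vertical (y) axis to align the two frames.
        theta = math.radians(yaw_offset_degrees)
        rx = ox * math.cos(theta) + oz * math.sin(theta)
        rz = -ox * math.sin(theta) + oz * math.cos(theta)
        # Re-apply the rotated offset at the field environment's anchor point.
        return (field_anchor[0] + rx, field_anchor[1] + oy, field_anchor[2] + rz)

    print(translate_display_coordinate((2.0, 1.0, 3.0), (1.0, 0.0, 1.0),
                                       (0.0, 0.0, 0.0), yaw_offset_degrees=90.0))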

“The field system 202 also includes a rendering engine 222, which renders virtual content for display on the HMD 214 or an alternative display device. The rendering engine 222 may augment a real-world view by superimposing virtual objects on top of it, and may use information from the spatial modeling engine 216 to determine display parameters that make a virtual object appear to be part of the environment. For example, the virtual object could be placed on the surface of a physical object in the ambient environment: the rendering engine 222 can use information from the spatial modeling engine 216 to position the virtual object on top of the physical object in the augmented view of the environment. The rendering engine 222 can also render a virtual object so that it appears to be located in a partially obscured area of the 3D environment. Display parameter values for a virtual object indicate how the rendering engine 222 places, sizes, and manipulates the virtual object in real time to create the illusion that the virtual object is located at a specific position in the environment, at a distance from the camera or user.
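
To give one concrete, purely illustrative example of such a display parameter computation, the screen position and apparent size of a marker fixed in 3D space can be recomputed each frame from the camera pose with a simple pinhole projection; the intrinsics below are assumed values, not parameters from the specification.

    from typing import Optional, Tuple

    def project_marker(marker_cam: Tuple[float, float, float],
                       fx: float = 800.0, fy: float = 800.0,
                       cx: float = 640.0, cy: float = 360.0,
                       base_size_px: float = 64.0) -> Optional[Tuple[float, float, float]]:
        """Given a marker position already expressed in camera coordinates
        (meters), return its screen position (u, v) and an apparent size that
        shrinks with distance, or None if the marker is behind the camera."""
        x, y, z = marker_cam
        if z <= 0.0:
            return None                      # behind the camera: do not draw
        u = cx + fx * x / z                  # pinhole projection onto the image plane
        v = cy + fy * y / z
        size = base_size_px / z              # farther markers are drawn smaller
        return (u, v, size)

    # Example: the same marker at 1 m and at 4 m from the camera.
    print(project_marker((0.2, -0.1, 1.0)))
    print(project_marker((0.2, -0.1, 4.0)))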

“An input manager 218 in the field system 202 is configured to receive user inputs from one or more sources. Any suitable mechanism may be used to detect user inputs, such as a camera and gesture recognizer, a microphone and speech recognizer, a virtual keyboard, a pointing device, or a combination of several of these. A user of the field system 202 may provide inputs for a variety of purposes, such as to select an object, provide spoken or written information to the remote user, place or move virtual markers, or trigger a state-capture event. The input manager 218 can also receive and process data that indicates where the user’s gaze is directed; the direction of the user’s gaze can be determined using data from the HMD 214’s orientation and motion sensors (e.g., compasses, accelerometers, gyroscopes), eye-tracking sensors, or both.

“In some cases, the field system 202 includes a state capture engine 224, which captures and stores state data describing the state of the ambient field environment at particular times. For example, the state capture engine 224 can capture and store information about the field user’s actions in the ambient environment. The field system 202 might render, in the view of the ambient field environment, a virtual marker that was placed by the remote user to prompt the field user to operate on a particular component of a target object. When the field user places a tool in position to begin work on the specified component, the state capture engine 224 may capture and store state data that includes the tool’s position relative to the target object or to the virtual marker. The state data can be transmitted to the remote system 204 for review by the remote user, and the remote system 204 may use the state data to render a virtual object representing the field user’s tool, e.g., a generic marker or a 3D model of the tool.
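
As a rough sketch of what such a captured state record could contain (the field names below are illustrative assumptions, not defined by the specification):

    import json
    import time
    from dataclasses import dataclass, asdict
    from typing import Tuple

    @dataclass
    class ToolStateSnapshot:
        """One captured state event: where a tool was placed relative to the target."""
        timestamp: float
        tool_model_id: str
        position_offset: Tuple[float, float, float]   # offset from the target object, meters
        orientation: Tuple[float, float, float]       # e.g., Euler angles in degrees
        applied_torque_nm: float                      # optional measured parameter

    def capture_snapshot() -> str:
        """Serialize a snapshot so it can be transmitted to the remote system."""
        snapshot = ToolStateSnapshot(
            timestamp=time.time(),
            tool_model_id="tool.wrench",
            position_offset=(0.02, -0.01, 0.15),
            orientation=(0.0, 45.0, 0.0),
            applied_torque_nm=32.5,
        )
        return json.dumps(asdict(snapshot))

    print(capture_snapshot())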

“In certain implementations, the field system 202 also includes one or more auxiliary data sources 226, which provide auxiliary data to the field system. Auxiliary data can be any information that is useful for assessing the field environment or that may bear on the field user’s work in it. For example, a target object may include sensors that emit signals indicating its operating characteristics, and the auxiliary data sources 226 can collect, store, or transmit those sensor signals to make the information available to the field system 202, the remote system 204, and/or the server system 206. Data from the auxiliary data sources 226 may be stored in data storage 208, which is directly accessible to the server system 206. In some implementations, raw auxiliary data, or information at least partially derived from it, is presented to the remote user, the field user, or both. For example, during an electrical repair, the field user might attach the leads of a multimeter to a target circuit board; the multimeter can serve as an auxiliary data source 226 that transmits electrical measurements, such as voltage, current, and resistance, and this information can be displayed to the field user, the remote user, or both while the repair is ongoing.
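
A minimal, illustrative sketch of how such auxiliary readings might be packaged for display alongside the view of the target object follows; the reading names and units are assumptions.

    from dataclasses import dataclass
    from typing import Dict, Iterable

    @dataclass
    class AuxiliaryReading:
        source_id: str            # e.g., "multimeter-1" (hypothetical identifier)
        values: Dict[str, float]  # measured operational parameters

    def format_for_overlay(readings: Iterable[AuxiliaryReading]) -> str:
        """Turn auxiliary readings into short label strings suitable for rendering
        as user interface elements next to the target object."""
        lines = []
        for r in readings:
            summary = ", ".join(f"{name}={value:g}" for name, value in r.values.items())
            lines.append(f"{r.source_id}: {summary}")
        return "\n".join(lines)

    print(format_for_overlay([
        AuxiliaryReading("multimeter-1", {"voltage_V": 12.1, "current_A": 0.35}),
        AuxiliaryReading("furnace-sensor", {"flue_temp_C": 88.0}),
    ]))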

The remote system 204 can be implemented as any combination of one or more computers in one or more locations. The remote system 204 presents a 3D model of at least a portion of the ambient field environment to a user who is located remotely from the field environment, and allows that user to add objects to a virtual environment, for example, virtual markers that identify specific points in the virtual space. The virtual field environment can include a 3D model of all or a portion of the ambient field environment, such as a model of a target object located within the ambient environment.

“The remote system 204 includes a controller 230, which manages operations of the remote system 204 and coordinates activities among the components 232-242. The communications interface 232 allows messages to be sent and received over one or more networks, so that the remote system 204 can communicate with the field system 202, either directly or via the server system 206.

“A display device 234 displays the virtual field environment to the remote user, who may be local to the remote system 204 but remote from the field system 202. The remote system 204 might offer different display devices 234 for different user experiences. In some cases, the display device 234 is a head-mounted display (HMD), e.g., similar to the HMD 214 in the field system. An HMD provides the remote user with a view of his or her own environment and can augment that view by overlaying the virtual field environment on top of it; for example, a mixed-reality HMD may overlay the 3D model of the target object on a real-world view of the remote user’s environment, allowing the remote user to see the 3D model at scale in his or her own surroundings. Other implementations display the virtual environment on a 2D display screen 234, such as an LCD or LED screen of a computer monitor or tablet. In that case, the user can use touchscreen gestures or other input mechanisms to manipulate the 3D model of the target object (e.g., rotate, pan, zoom) or to place virtual markers at specified locations in the virtual environment. In still other implementations, the display device 234 displays the virtual environment on a 3D display screen, which uses stereoscopic imaging to create the illusion of depth; a 3D screen can provide a more immersive experience for the remote user than a 2D screen.

“A rendering engine 236 of the remote system 204 renders virtual objects for display with the display device 234. The rendering engine 236 receives data characterizing the virtual environment (e.g., one or more virtual objects), renders the environment in a 3D space, and outputs a signal to the display device 234 for display of the rendered virtual environment.

The remote system 204 also has a virtual objects manager 238 that maintains information about virtual objects that can be displayed in the virtual environment. The virtual objects manager 238 can include a models database that maps model identification values to pre-stored 2D models (e.g., graphics or icons) and 3D models at the remote system 204, and it can look up a model identification value in the models database to determine which 2D or 3D model should be displayed in the view of the environment. Secondary data can include any additional information about how a virtual object should be displayed, such as a text label to be displayed alongside the 2D or 3D model of the virtual object, a link to an additional resource (e.g., an application or a webpage), or media content to render with the virtual object’s 2D or 3D model.

“The remote system 204 includes a primary input manager 240 and a secondary input manager 242. The primary input manager 240 receives primary user inputs, i.e., inputs from a user who wants to add virtual markers to the virtual environment or modify virtual markers within it. The secondary input manager 242 receives secondary user inputs, which indicate that a user wants to add secondary objects to the virtual environment. Secondary objects are objects other than virtual marker objects that can be added to the virtual environment; for example, the remote user can add text or voice annotations to a virtual marker to clarify or elaborate on the task the field user is being asked to complete.

“In some cases, the primary input manager 240 and the secondary input manager 242 are configured to receive inputs from various user input sources. Any suitable mechanism may be used to detect user inputs, such as a camera and gesture recognizer, a microphone and speech recognizer, a virtual keyboard, a pointing device, or a combination of these. A user of the remote system 204 may provide inputs for many purposes, such as to select an object, provide spoken or written information to the field user, or place or move virtual markers. The input managers 240 and 242 can also be configured to receive and process data about the direction of the gaze of a user wearing an HMD; the direction of the user’s gaze can be determined using data from the HMD’s orientation and motion sensors (e.g., compasses, accelerometers, gyroscopes) or eye-tracking sensors.

“Referring to FIGS. 3A-3C, a flowchart is shown of an example process 300 for placing and rendering virtual markers in a 3D environment. The process 300 can be performed by a system of one or more computers in one or more locations, such as the remote modeling system 200 of FIG. 2. The flowchart shows a series of operations divided between a field portion of the computing system, e.g., field system 202, and a remote portion of the computing system, e.g., remote system 204. In some cases, however, operations are not performed directly between the field system and the remote system; an intermediate server system, e.g., server system 206, may relay communications between them, and some of the operations 302-342 may also be performed at the intermediate server system.

The process 300 begins at stage 302, where the field system provides a real-world view of the ambient environment to a user in the field. The ambient environment can be viewed directly, through the lenses of a head-mounted display (e.g., head-mounted display 214), or indirectly, via an electronic display screen that presents images or video of the environment.

“At stage 304, the field system scans at least a portion of the field environment to generate a 3D model. The environment may be scanned using a depth camera, a radio detection and ranging (RADAR) subsystem, a sound navigation and ranging (SONAR) subsystem, a light detection and ranging (LIDAR) subsystem, or any combination of these scanning techniques. A spatial modeling engine (e.g., spatial modeling engine 216) may process the resulting spatial sensing data to create the 3D model.

“At stage 306, the field system identifies a target object in the ambient field environment. Some implementations identify the target object automatically, e.g., without user input, while scanning the environment. Other implementations identify the target object based on user input; for example, the system might prompt the user to select the target object in an image of the environment or in a portion of the 3D model that corresponds to the environment.

“At stage 308, the field system extracts the target object from the 3D model of the environment. The field system can include an extraction engine (e.g., extraction engine 228) that creates a 3D model of the target object from the 3D environment model by separating the target portion from the other portions of the model. In some cases, the 3D model of the target object is instead extracted at the remote system or at an intermediary server system.

“At stage 310, the field system determines one or more anchor points in a 3D space that describes the ambient field environment. The anchor points can be used to align the coordinate systems of the field system and the remote system, which allows virtual objects placed in the virtual field environment to be rendered at equivalent positions in the real-world field environment. The system may designate one, two, three, or more positions in the 3D model of the target object (or of at least the portion of the ambient environment, if extraction has not occurred) as anchor points. The anchor points can be chosen arbitrarily, or logic can be applied to select particular positions as anchor points.

“At stage 312, the field system transmits the 3D model of the target object (or of at least the portion of the environment, if extraction has not occurred) to the remote system. In some cases, the 3D model is transmitted directly to the remote system over a network; in other implementations, it is transmitted indirectly, e.g., through a server system.

“At stage 314, the remote system receives the 3D model of the target object from the field system.

“At stage 316, the remote system renders the 3D model of the target object in a user interface for the remote user. The form of the remote user interface can vary depending on the preferences of the user and the display hardware used to present the rendered model; for example, the user interface can be a virtual representation of the field environment. A rendering engine of the remote system, e.g., rendering engine 236, can render the 3D model of the target object, and the rendered model can then be presented with a display device (e.g., display device 234).

“At stage 318, the remote system receives a primary user input to place a virtual marker in the remote system’s user interface. The virtual marker can be placed in the virtual environment on or around the 3D model of the target object (or of at least the portion of the ambient environment). The location of the marker within the virtual field environment may be specified by a coordinate value in a 3D coordinate system, e.g., a value that indicates an offset of the marker’s specified location in 3D space from one or more anchor points. The primary user input may also specify other characteristics of the virtual marker, such as its orientation, size, or the 3D model used to represent it.

“At stage 320, the remote system can optionally receive a secondary user input, which allows the remote user to identify any additional information to share with the field user. The additional information can be of any type that the field system can render for presentation to the field user; for example, the remote user could dictate or type instructions, or attach audio, hyperlinks, or video for the field user.

“At stage 322, the remote system transmits an instruction to the field system. The instruction contains data that will cause the field system to augment the view of the field environment with the virtual marker. The instruction can contain a display coordinate value and, optionally, other data describing the primary and secondary user inputs, so that the virtual marker and any additional information can be presented to the field user in accordance with those inputs. The remote system transmits the instruction to the field system using a communications interface (e.g., communications interface 232).

“At stage 324, the field system receives the instruction from the remote system. The instruction can be received over a network using a communications interface at the field system (e.g., communications interface 212).

“At stage 326, the field system processes the received instruction and renders the virtual marker in an augmented view of the field user’s ambient environment. A rendering engine, e.g., rendering engine 222, can render the virtual marker in the augmented view of the field environment at the location indicated by the display coordinate value, and a display device such as HMD 214 displays the rendered virtual marker. The rendering engine can also render visual representations of any additional information that the remote user provided to the remote system.

“At stage 328, the field system detects the user’s interaction with the target object in the field environment. For example, the system may monitor user actions that occur within a specified distance of the location of the virtual marker shown in the view of the ambient field environment. A state capture engine (e.g., state capture engine 224) may detect the occurrence of pre-defined events in this area of the field environment and capture data characterizing the current state of the field system, the field environment generally, the target object, or any combination of these. For example, the state capture engine can automatically capture and store data about a tool used to complete a task on the target object, such as where a wrench was applied to a pipe, the wrench’s orientation, the number of turns made, and the torque applied.

“At stage 332, the field system uses a communications interface, e.g., communications interface 312, to transmit data characterizing the captured and stored state to the remote system, and the remote system receives the state data from the field system via a communications interface (e.g., communications interface 232).

“At stage 334, the remote system uses the state data to create a virtual representation of the field user’s interaction in the user interface of the remote system. If the target object has not been extracted from the virtual field environment, the virtual representation of the field user’s interaction can be displayed in the remote system’s virtual field environment. The virtual representation of the field user’s interaction can be displayed simultaneously with the marker that the remote user placed in the virtual environment at stage 318. For example, a second virtual marker, representing the place in the environment where the field user applied the tool to the target object, can be displayed alongside the first virtual marker that was placed by the remote user. Displaying both virtual markers simultaneously provides a visual comparison for the remote user: the first virtual marker indicates the prescribed location of the field user’s work, while the second virtual marker shows the actual location of the field user’s work. The remote user can compare the virtual markers and determine whether the work of the field user is acceptable. In some cases, the remote system may issue an alert to the remote user if the positions of the virtual markers are separated by more than a certain distance. The remote system may also issue an alert if the field user’s work is not in line with prescribed parameters, for example, if the field user applied a tool in a manner other than specified or torqued a wrench in the wrong direction.
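As a hedged sketch of the comparison described here, the function below flags a separation between the prescribed and actual work locations that exceeds a threshold, and optionally a torque value outside a prescribed range; all names and threshold values are hypothetical:

```python
import math

def _distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def check_field_work(prescribed_pos, actual_pos, max_offset_m=0.1,
                     torque_nm=None, torque_range_nm=(30.0, 50.0)):
    """Return a list of alert messages for the remote user (stage 334 sketch).

    An alert is raised if the actual work location is farther than max_offset_m
    from the prescribed marker position, or if the recorded torque falls outside
    the prescribed range. All thresholds are illustrative.
    """
    alerts = []
    if _distance(prescribed_pos, actual_pos) > max_offset_m:
        alerts.append("Work location differs from the prescribed marker position.")
    if torque_nm is not None and not (torque_range_nm[0] <= torque_nm <= torque_range_nm[1]):
        alerts.append("Recorded torque is outside the prescribed range.")
    return alerts

# Example: the field user worked 15 cm away from the prescribed position.
print(check_field_work((1.5, 1.1, -2.0), (1.65, 1.1, -2.0), torque_nm=40.0))
```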

“At stage 336, the remote system may receive corrective feedback from the remote user if the remote user determines that the field user’s work needs to be corrected in some way. Corrective feedback could be given if the remote user does not approve of the field user’s work, if the remote user has additional requirements for the work, or if the field user did not follow the remote user’s instructions. In some implementations, the corrective feedback is provided to the remote system via an input manager (e.g., primary input manager 240 or secondary input manager 242). For example, if the remote user wants to communicate to the field user that the location at which the field user is applying a tool should be changed, the remote user can specify the change with reference to the positions of the first and second virtual markers. As another example, the remote user can add annotations to existing virtual markers or create new markers to provide feedback to the field user.

“At stage 338, the remote system transmits data characterizing the remote user’s corrective feedback to the field system via a communications interface (e.g., communications interface 232). The field system receives the data at stage 340 using a communications interface (e.g., communications interface 212). After receiving the corrective feedback data, the field system updates, at stage 342, the augmented real-world view of the ambient environment. For example, based on the corrective feedback data, the field system can move a virtual marker to a different location in 3D space. Some implementations may render a ghosted copy of the virtual marker at its original position while the non-ghosted, fully textured virtual marker is displayed at its updated position.
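A minimal sketch of how a field system might apply such corrective feedback, keeping a ghosted copy of the marker at its original position while rendering the fully opaque marker at the corrected position (the RenderedMarker type and opacity values are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class RenderedMarker:
    """Hypothetical render state for a marker in the field system's view."""
    position: tuple
    opacity: float = 1.0  # 1.0 = fully textured; lower values appear "ghosted"

def apply_corrective_feedback(current: RenderedMarker, new_position, ghost_opacity=0.3):
    """Move the marker to its corrected position and return both render states.

    The original marker is kept at reduced opacity so the field user can still
    see where the marker used to be, in the spirit of stage 342.
    """
    ghost = RenderedMarker(position=current.position, opacity=ghost_opacity)
    updated = RenderedMarker(position=new_position, opacity=1.0)
    return ghost, updated

# Example: the remote user shifts the marker 5 cm along the x-axis.
ghost, updated = apply_corrective_feedback(RenderedMarker((1.5, 1.1, -2.0)), (1.55, 1.1, -2.0))
```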

“FIGS. 4A-4D illustrate techniques for interacting with a computing system using a physical control object. Many computing environments do not provide convenient hand-based input controls, such as a keyboard or mouse, even though users may prefer traditional hand-based controls. For example, a head-mounted display (HMD) that provides direct or indirect views of an environment may be used in mixed-reality and virtual-reality settings, and the user may want to use the HMD while performing other tasks, so that traditional input controls such as a mouse or keyboard are impractical. The techniques described with respect to FIGS. 4A-4D can allow users to navigate complex user-interface menus and other elements of a user interface (e.g., within a real-world view of an ambient environment) in an efficient and natural way.

“FIG. 4A shows a physical control object in the form of a cube 402. Other geometries for the physical control object may also be used, such as a tetrahedron or a square pyramid. The cube 402 has six faces, three of which are shown in FIG. 4A. Respective symbols 406a-c are printed individually on the faces; for example, a circular symbol 406a is printed on a first face of the cube 402 and a triangular symbol 406b is printed on a second face. The cube 402 is within the field of view of a camera 404. For example, the camera 404 could be an RGB video camera mounted on a head-mounted display so that the cube 402 is within the camera’s view when the user turns his or her head to look at the cube 402.

“While the cube 402 is within the field of view of the camera 404, a computing device coupled to the camera 404, e.g., field system 202 or remote system 204, may use computer vision techniques to recognize the cube 402 as a physical control object and render virtual display elements over its surfaces. An example augmented view 408a of the cube 402 shows a first virtual display element 410a rendered over a first face of the cube 402, a second virtual display element 410b rendered over a second face, and a third virtual display element 410c rendered over a third face. The virtual display elements can partially or completely obscure the contents of the respective faces of the cube 402 when superimposed over them in the augmented view of the environment.
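One way the symbol-to-element association could be implemented is a simple lookup table consulted after the camera-facing symbol has been recognized; the table below is a hypothetical sketch loosely consistent with the dial and menu examples in FIGS. 4C and 4D, not the specification’s actual mapping:

```python
# Hypothetical mapping from the symbol detected on the camera-facing face of the
# control object to the virtual display element rendered over that face.
SYMBOL_TO_ELEMENT = {
    "circle": "menu",    # e.g., a menu when the circular symbol faces the camera
    "triangle": "dial",  # e.g., a dial when the triangular symbol faces the camera
    "square": "slider",
}

def element_for_detected_symbol(symbol: str) -> str:
    """Select which virtual display element to render over the camera-facing face."""
    return SYMBOL_TO_ELEMENT.get(symbol, "none")

print(element_for_detected_symbol("triangle"))  # dial
```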

“In some implementations, a virtual display element provides a user control interface with which a user can interact by manipulating the control object (e.g., cube 402). FIG. 4C illustrates a circular dial 412, a virtual display element that has been placed over the second face of the cube. Based on the system’s identification that the second face of the cube 402 is oriented in a predetermined direction (e.g., toward the camera 404), the system can render the dial 412 over the second face of the cube. To perform an action with the computing system, a user can turn the dial 412 by rotating the cube 402 while keeping the second face of the cube 402 oriented toward the camera 404. The dial 412 can be fixed relative to the cube 402 so that it follows the cube 402 while the cube is in the field of view of the camera.

“In some cases, the user can activate different virtual display elements simply by rotating the cube 402 so that different faces are oriented toward the camera 404. FIG. 4D illustrates that the system can render a menu 414, rather than the dial 412, when the cube 402 is rotated so that its first face is oriented toward the camera 404.

“FIG. 5 is a flowchart illustrating an example process 500 for rendering virtual display elements in a user interface based on actions taken with a physical control object (e.g., cube 402) within the field of view of a camera. The process 500 may be carried out by a computing system in one or more locations. At stage 502, the system detects a physical control object in the field of view of the camera. The physical control object could be a geometric shape that has multiple faces with a symbol printed on each. At stage 504, the system analyzes images or video of the physical control object to identify a first symbol that is oriented toward the camera. A database may map each of a set of virtual display elements to one of the symbols printed on the physical control object. At stage 506, the system determines which virtual display element is mapped to the first symbol oriented toward the camera and selects that element. At stage 508, the system renders the selected virtual display element in a view of an environment that includes the physical control object; the selected virtual display element is rendered at least partially over the face of the physical control object that is oriented toward the camera. At stage 510, the system identifies that the user has manipulated the physical control object while the face with the first symbol remains oriented toward the camera. For example, the user can rotate, shake, or tap the control object. The system can detect this user interaction by analyzing images or video of the control object using computer vision techniques. At stage 512, the system performs an operation with the virtual display element based on the user interaction. For example, the system might respond to tilting of the control object by scrolling through a list of items in the virtual display element, and shaking or tapping the control object might activate the display element or select the item currently in focus. In some cases, the user can activate a different virtual display element by turning the physical control object so that a second face points toward the camera. At stage 514, the system detects a second symbol on the second face of the physical control object. The system then returns to stage 506 and continues processing based on the orientation of the second face toward the camera.
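The following sketch strings the stages of process 500 together as a loop over hypothetical per-frame detection results; the gesture labels and print statements stand in for real computer-vision detection and rendering, which the specification leaves to the underlying system:

```python
def run_control_object_ui(frames, symbol_to_element):
    """Minimal sketch of process 500 for a stream of detection results.

    `frames` yields hypothetical per-frame detections of the form
    (symbol_facing_camera, gesture), where gesture is one of
    None, "tilt", "shake", or "tap".
    """
    active_symbol = None
    active_element = None
    focus_index = 0
    for symbol, gesture in frames:
        if symbol != active_symbol:                         # stages 504/514: new face toward camera
            active_symbol = symbol
            active_element = symbol_to_element.get(symbol)  # stage 506: map symbol to element
            focus_index = 0
            print(f"render {active_element} over face with symbol {symbol}")  # stage 508
        if gesture == "tilt":                               # stage 512: scroll the focused list
            focus_index += 1
        elif gesture in ("shake", "tap"):                   # stage 512: activate / select item
            print(f"select item {focus_index} in {active_element}")

# Example frame stream: the dial face appears, the user tilts twice, then taps.
run_control_object_ui(
    [("triangle", None), ("triangle", "tilt"), ("triangle", "tilt"), ("triangle", "tap")],
    {"triangle": "dial", "circle": "menu"},
)
```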

“FIGS. 6A-6D illustrate techniques for interacting with a gaze-activated user interface element. The physical characteristics of an HMD can limit the area of a view of an environment that can be augmented with user interface elements (virtual display elements). User interface elements can also distract from the user’s ability to see the real-world environment through the HMD. The techniques described with respect to FIGS. 6A-6D may therefore be a good option for some implementations.

“FIG. 6A shows a view 602 of an environment that is presented to the user via an HMD (e.g., HMD 214). The environment can be real-world or virtual. The environment contains a gaze-activated user interface element 606. The element 606 might be very small, e.g., as small as 5-10 pixels in height and width, so that it occupies only a small portion of the viewing area that can be augmented with virtual objects. The HMD may include sensors, such as orientation and motion sensors and/or eye-tracking sensors, that detect the direction of the user’s gaze while the user wears the HMD 214. The reticle 604 represents the point of focus of the user’s gaze, i.e., the point at which the user’s gaze is currently directed. The position of the reticle 604 changes as the user moves his or her eyes or head.

“In some cases, the HMD displays the element 606 partially or fully transparent to reduce distraction while the user’s gaze is not in close proximity to the element 606. The transparency of the element 606 can decrease as the user’s point of focus moves closer to it. The distance between the element 606 and the reticle 604 may also affect other aspects of the element’s visual appearance, such as its color, size, shape, or a combination of these.
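A minimal sketch of the distance-dependent transparency described here, assuming a screen-space reticle position and a linear fade; the fade radius and minimum opacity are illustrative values, not parameters from the specification:

```python
def element_opacity(reticle_pos, element_pos, fade_radius_px=120.0, min_opacity=0.1):
    """Compute the opacity of a gaze-activated element from the reticle distance.

    The element stays nearly transparent while the gaze is far away and becomes
    fully opaque as the reticle approaches it.
    """
    dx = reticle_pos[0] - element_pos[0]
    dy = reticle_pos[1] - element_pos[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist >= fade_radius_px:
        return min_opacity
    # Linear fade from fully opaque at the element to min_opacity at the fade radius.
    return min_opacity + (1.0 - min_opacity) * (1.0 - dist / fade_radius_px)

print(element_opacity((200, 200), (260, 200)))  # part-way through the fade
```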

“In some implementations, one or more action thresholds are defined around the user interface element 606. FIG. 6A illustrates a first action threshold 608 located at a first distance from the center of the element 606 and a second action threshold 610 located at a shorter distance from the element 606. The system executes an action when the user’s point of focus crosses an action threshold. The actions can include expanding the element 606 or adding descriptive details about the element 606. FIG. 6B shows that the system displays descriptive text 612 in a pop-up near the element 606 when the user’s gaze crosses the first threshold 608. As the user’s focus moves still closer to the element 606 and crosses the second threshold 610, the system may display a pop-up window 614, as shown in view 602c of FIG. 6C. The pop-up window 614 can display media content, textual information, or other user interface elements, such as a list of interactive items organized in a menu.
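The threshold-crossing behavior might be implemented along the lines of the sketch below, which picks an action based on which of two illustrative screen-space distances the gaze has crossed (the threshold values and action names are hypothetical):

```python
def action_for_gaze(reticle_pos, element_pos, first_threshold_px=80.0, second_threshold_px=30.0):
    """Decide which action to trigger based on which threshold the gaze has crossed.

    Crossing the outer threshold shows descriptive text (as in FIG. 6B); crossing
    the inner threshold opens the full pop-up window (as in FIG. 6C).
    """
    dx = reticle_pos[0] - element_pos[0]
    dy = reticle_pos[1] - element_pos[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= second_threshold_px:
        return "open_popup_window"
    if dist <= first_threshold_px:
        return "show_descriptive_text"
    return "no_action"

print(action_for_gaze((210, 200), (260, 200)))  # show_descriptive_text
```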

“In some cases, the gaze-activated user interface element 606 can be broken into multiple parts. FIG. 6D shows a view 602d of an environment in which the gaze-activated element 606 is divided into a top half and a bottom half. A first pop-up window 616a opens when the user’s gaze is focused on the top half of the element 606, and a second pop-up window 616b is displayed when the user’s gaze is focused on the bottom half of the element 606.

“FIG. 7 is a flowchart showing an example process 700 for interacting with a gaze-activated user interface element. The process 700 may be performed by one or more computers in one or more locations, such as field system 202, remote system 204, or head-mounted display 214.

“At stage 702, a gaze-activated user interface element, e.g., element 606, is displayed within an augmented view of an environment. The element could be an icon or another small virtual marker that is fixed at a location in 3D space as seen through the lenses or screen of an HMD. When the user’s gaze is more than a certain distance from the element, the element can be rendered partially or fully transparent.
