Metaverse – Stefan Johannes Josef HOLZER, Alexander Jay Bruen Trevor, Michelle Jung-Ah Ho, David Klein, Stephen David Miller, Shuichi Tsutsumi, Radu Bogdan Rusu, Fyusion Inc

Abstract for “Live Augmented Reality Guides”

“The present invention generally relates to systems and methods for analyzing and manipulating images or video in various embodiments. Particular embodiments analyze the spatial relationships between multiple images and videos, along with location data, to create a representation, referred to herein as a surround view, which can be presented on a device. A visual guide is used to help capture the images in the surround view. The visual guide is a virtual object that can be rendered in real time into the live images displayed on an image capture device, and it helps the user keep the device moving along a desired trajectory.”

Background for “Live Augmented Reality Guides”

“Reference will now be made to specific examples of the invention, including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. Although the disclosure is made in connection with these specific embodiments, it should be understood that the invention is not limited to them; it is instead intended to cover all modifications and equivalents that fall within the scope and spirit of the invention, as defined by the appended claims.

“The following description sets forth specific details in order to provide a thorough understanding of the invention. Some embodiments of the invention can be implemented without some or all of these details. In other instances, well-known process operations are not described in detail so as not to obscure the present invention.

“Various aspects of this invention concern systems and methods that analyze the spatial relationships among multiple images and videos together with location data, in order to create a single representation, called a surround view, which eliminates redundancy in the data and provides a user with an immersive and interactive active viewing experience. In various embodiments, “active” means that the user can change the viewpoint of the visual information displayed on a screen.

“In particular, AR is used in example embodiments to assist a user in capturing the multiple images used in a surround view. A virtual guide can be inserted into live image data on a mobile device. The virtual guide helps the user move the mobile device along a trajectory suitable for creating the surround view. The AR images respond to movements of the mobile device, which can be determined from a number of sources, including an Inertial Measurement Unit (IMU) and the image data itself.

“A surround view, according to various embodiments of the invention, is a multi-view interactive digital media representation. Referring to FIG. 1, shown is one example of a surround view acquisition system 100. In the present example embodiment, the surround view acquisition system 100 is depicted as a flow sequence that can be used to generate a surround view. According to various embodiments, the data used to generate a surround view can come from a variety of sources.

Data such as, but not limited to, two-dimensional (2D) images 104 can be used to generate a surround view. These 2D images can include color image data streams, such as multiple image sequences, video data, etc., or multiple images in any of various formats, depending on the application. As described below with respect to FIGS. 7A-11B, an AR system can be used during the image capture process. The AR system can augment the live image data with virtual data; in particular, the virtual data can include guides that help a user direct the motion of the image capture device.

Environment information 106 is another source of data that can be used to generate a surround view. This environment information 106 can be obtained from sources such as accelerometers, gyroscopes, magnetometers, GPS, WiFi, and IMU-like (Inertial Measurement Unit) systems. Depth images 108 are yet another source of data that can be used to generate a surround view. These depth images can include 3D or depth image data streams, and can be captured by devices such as stereo cameras, time-of-flight cameras, three-dimensional cameras, and the like.

“In the present example embodiment, the data can then be fused together at sensor fusion block 110. In some embodiments, a surround view can be generated from a combined data set that includes 2D images 104, environment information 106, and depth images 108. In other embodiments, depth images 108 can be combined with environment information 106 at sensor fusion block 110. Various combinations of image data can be combined with the environment information 106, depending on the application and the available data.

“In the present example embodiment, the data fused at sensor fusion block 110 can then be used for content modeling 112 and context modeling 114. As described in more detail with regard to FIG. 4, the subject matter featured in the images can be separated into content and context. The context can be defined as the scenery surrounding the object of interest. The content can be a three-dimensional model depicting an object of interest, although in certain embodiments it can be a two-dimensional image. In some embodiments, the context can be a two-dimensional model depicting the scenery surrounding the object of interest. Although in many cases the context provides two-dimensional views of the scenery surrounding the object of interest, some embodiments also allow for three-dimensional aspects. For instance, the context can be depicted as a “flat” image along a cylindrical “canvas,” so that the flat image appears projected onto the surface of a cylinder. Some examples also include three-dimensional context modeling, which is useful when objects in the surrounding scenery are identified as three-dimensional. The models generated by content modeling 112 and context modeling 114 can be combined with the location information data, as described in more detail with regard to FIG. 3.”

According to various embodiments, the content and context of a surround view are determined based on a specified object of interest. In some examples, the object of interest is selected automatically based on processing of the image data and location information; for instance, if a dominant object is identified in a set of images, that object can be chosen as the content. In other examples, a user-specified target 102 can be selected, as shown in FIG. 1. It should be noted, however, that in certain applications a surround view can be generated without a user-specified target.

“In the present example embodiment, one or more enhancement algorithms can be applied at enhancement algorithm(s) block 116. In certain embodiments, various algorithms can be employed during capture of surround view data, regardless of the capture mode used. These algorithms can be used to enhance the user experience. For instance, automatic frame selection, stabilization, and view interpolation can be used during capture of surround view data. In some embodiments, these enhancement algorithms are applied to the image data after acquisition; in other embodiments, they are applied during capture of the surround view data.

“Automatic frame selection can be used, according to certain example embodiments, to create a more pleasant surround view. Specifically, frames are automatically selected so that the transitions between them are smoother or more even. This automatic frame selection can incorporate blur and overexposure detection in some applications, as well as more uniform sampling of poses so that the selected frames are more evenly distributed.”
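
As a purely illustrative sketch of how such frame selection might be implemented (the thresholds, function names, and library choices below are assumptions, not part of the disclosure), candidate frames can be scored for blur using the variance of the Laplacian and for overexposure using the fraction of saturated pixels, after which the surviving frames can be sampled at roughly uniform intervals:

    import cv2
    import numpy as np

    def select_frames(frames, min_sharpness=100.0, max_overexposed=0.05, target_count=30):
        """Keep frames that are sharp and not overexposed, then sample them evenly.

        frames: list of BGR images (numpy arrays). Thresholds are illustrative.
        """
        candidates = []
        for idx, frame in enumerate(frames):
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()   # blur detection
            overexposed = np.mean(gray >= 250)                  # saturated pixel fraction
            if sharpness >= min_sharpness and overexposed <= max_overexposed:
                candidates.append(idx)
        if len(candidates) <= target_count:
            return candidates
        # Sample the surviving frames at approximately uniform spacing.
        picks = np.linspace(0, len(candidates) - 1, target_count).round().astype(int)
        return [candidates[i] for i in picks]

A more faithful implementation could sample uniformly in pose space (using IMU-derived camera poses) rather than by frame index, as the paragraph above suggests.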

Stabilization can be used for a surround view in some embodiments in a manner similar to that used for video. In particular, keyframes in a surround view can be stabilized to produce improvements such as smoother transitions and enhanced focus on the content. Unlike video, however, a surround view can be stabilized using additional sources of information, such as IMU information, depth information, computer vision techniques, direct selection of an area to stabilize, face detection, and the like.

IMU information, for example, can be very helpful for stabilization. IMU information provides an estimate, although possibly a rough or noisy one, of the camera tremor present during image capture. This estimate can be used to cancel, reduce, or remove the effects of that tremor.
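
One possible way to realize this, sketched below under the assumption that raw gyroscope samples are available at a fixed rate, is to low-pass filter the angular velocities to capture the deliberate camera motion and treat the high-frequency residual as tremor to be counter-rotated; the filter constant and function names are illustrative only:

    import numpy as np

    def estimate_tremor_correction(gyro_samples, dt, alpha=0.9):
        """Estimate per-frame counter-rotation angles that cancel camera shake.

        gyro_samples: (N, 3) angular velocities in rad/s; dt: sample period in seconds.
        A low-pass filter tracks the deliberate motion; the residual is treated
        as tremor. alpha is an assumed smoothing constant.
        """
        smooth = np.zeros_like(gyro_samples)
        smooth[0] = gyro_samples[0]
        for i in range(1, len(gyro_samples)):
            smooth[i] = alpha * smooth[i - 1] + (1.0 - alpha) * gyro_samples[i]
        tremor = gyro_samples - smooth            # high-frequency shake component
        # Integrate and negate to obtain angles that counter-rotate each frame.
        return -np.cumsum(tremor * dt, axis=0)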

“In some examples, depth information, if available, can be used to stabilize a surround view. Because points of interest in a surround view are three-dimensional rather than two-dimensional, these points are more constrained, and tracking and matching them is simplified as the search space shrinks. Descriptors for points of interest can use both depth and color information, making them more discriminative. In addition, depth information can make automatic or semi-automatic content selection easier to provide. For instance, a user can select a particular pixel in an image and have that selection expanded to fill the entire surface it belongs to. Furthermore, content can also be selected automatically using a foreground/background differentiation based on depth. In some cases, the content can remain visible and relatively stable as a result.
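
The depth-based content selection mentioned above might look roughly like the following sketch, which assumes a per-pixel depth map aligned with the color image; the tolerance and percentile values are arbitrary illustrations:

    import numpy as np

    def select_content_from_tap(depth_map, seed_xy, tolerance=0.25):
        """Expand a user-selected pixel into a content mask using depth similarity.

        depth_map: (H, W) distances in meters; seed_xy: (x, y) tap location.
        Pixels within a relative depth band around the tapped pixel are treated
        as belonging to the same surface (an assumed heuristic).
        """
        seed_depth = depth_map[seed_xy[1], seed_xy[0]]
        lo, hi = seed_depth * (1 - tolerance), seed_depth * (1 + tolerance)
        return (depth_map >= lo) & (depth_map <= hi)

    def foreground_mask(depth_map, percentile=40):
        """Automatic selection: treat the nearest depth band as foreground content."""
        return depth_map <= np.percentile(depth_map, percentile)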

According to many examples, computer vision techniques can also be used to stabilize surround views. For example, keypoints can be detected and tracked. However, certain scenes, such as dynamic scenes or static scenes with parallax, cannot be stabilized by a simple warp. There is, consequently, a trade-off in which certain parts of the scene receive more attention for stabilization while other parts receive less. Because a surround view is often focused on a particular object of interest, the surround view can be content-weighted in certain examples so that the object of interest is maximally stabilized.
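
A minimal sketch of content-weighted stabilization, assuming OpenCV is available and that a binary mask of the object of interest exists, is to detect and track keypoints only inside that mask and estimate the stabilizing warp from them, so the warp favors the content over the rest of the scene:

    import cv2
    import numpy as np

    def content_weighted_warp(prev_gray, curr_gray, content_mask):
        """Estimate a homography that stabilizes the object of interest.

        content_mask: uint8 image, 255 where the object of interest is.
        Keypoints are restricted to the mask so the content drives the warp.
        Parameter values are illustrative.
        """
        pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01,
                                      minDistance=8, mask=content_mask)
        if pts is None:
            return np.eye(3)
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
        ok = status.flatten() == 1
        if ok.sum() < 4:
            return np.eye(3)
        H, _ = cv2.findHomography(nxt[ok], pts[ok], cv2.RANSAC, 3.0)
        return H if H is not None else np.eye(3)

    # The current frame could then be warped back toward the reference frame:
    # stabilized = cv2.warpPerspective(curr_frame, H, (width, height))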

Direct selection of a region on the screen is another way to improve stabilization in surround views. For instance, if a user taps on a specific area of the screen to focus on, and then records a convex surround view, that region can be maximally stabilized. This allows stabilization algorithms to be focused on a particular area or object of interest.

“In some cases, face detection can be used to provide stabilization. When recording with a front-facing camera, for instance, it is often likely that the user is the object of interest in the scene. Face detection can therefore be used to weight stabilization toward that area. If face detection is precise enough, facial features themselves (such as the eyes, nose, or mouth) can be used as the areas to stabilize, rather than generic keypoints. In another example, a user can select an area of an image to be used as a source of keypoints.

“According to various examples, view interpolation can be used to improve the viewing experience. In particular, to avoid sudden “jumps” between stabilized frames, synthetic intermediate views can be rendered on the fly. This can be informed by content-weighted keypoint tracks, IMU information, and denser pixel-to-pixel matches. If depth information is available, the process can be simplified. In some embodiments, view interpolation is applied during capture of a surround view; in other embodiments, it is applied during surround view generation.
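
As an illustrative approximation of such view interpolation (a single homography warp standing in for the denser matches described above, with all names assumed), an intermediate view can be synthesized by moving matched keypoints part of the way from one frame toward the next and warping the first frame accordingly:

    import cv2
    import numpy as np

    def interpolate_view(frame_a, pts_a, pts_b, t):
        """Render a synthetic view part-way between two stabilized frames.

        pts_a, pts_b: (N, 2) float32 arrays of matched keypoints in the two frames.
        t in [0, 1] selects how far from frame_a toward frame_b to interpolate.
        A single homography is a coarse stand-in for denser pixel-to-pixel matches.
        """
        pts_mid = (1.0 - t) * pts_a + t * pts_b          # interpolated keypoint tracks
        H, _ = cv2.findHomography(pts_a, pts_mid, cv2.RANSAC, 3.0)
        h, w = frame_a.shape[:2]
        return cv2.warpPerspective(frame_a, H, (w, h))   # synthetic intermediate view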

“Filters can also be used during capture or generation of a surround view to improve the viewing experience. Just as many popular photo sharing sites offer aesthetic filters that can be applied to static two-dimensional images, such filters can likewise be applied to surround views. However, because a surround view is more expressive than a two-dimensional image and there is more information available in a surround view, these filters can be extended to include effects that are not well-defined in two-dimensional photos. For instance, in a surround view, motion blur can be added to the background (i.e., the context) while the content remains crisp. In another example, a drop shadow can be added to the object of interest in a surround view.

“Compression can also be used as an enhancement algorithm in various examples. In particular, compression can be used to improve the user experience by reducing data upload and download costs. Far less data can be sent for a surround view than for a typical video, while the desired characteristics of the surround view are retained. Specifically, the IMU data, keypoint tracks, and user input, combined with the view interpolation described above, can all reduce the amount of data that must be transferred to and from a device during upload or download of a surround view. For instance, if the object of interest can be properly identified, a variable compression style can be chosen for the content and context. This variable compression style can use a lower quality resolution for background information (i.e., the context) and a higher quality resolution for foreground information (i.e., the content) in some examples. In such examples, the amount of data transmitted can be reduced by sacrificing some of the context quality while maintaining the desired quality level for the content.
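
A toy version of the variable compression idea, assuming OpenCV's JPEG encoder and a precomputed content mask (quality values chosen arbitrarily for illustration), encodes the foreground and background at different qualities so the context can be transmitted with less data:

    import cv2

    def compress_variable(frame, content_mask, q_content=90, q_context=40):
        """Encode content (foreground) and context (background) at different qualities.

        content_mask: uint8 mask, 255 where the object of interest is.
        Returns two JPEG byte buffers; the context uses a lower quality setting
        so less data needs to be transferred. Values are illustrative.
        """
        content = cv2.bitwise_and(frame, frame, mask=content_mask)
        context = cv2.bitwise_and(frame, frame, mask=cv2.bitwise_not(content_mask))
        _, content_jpg = cv2.imencode('.jpg', content,
                                      [int(cv2.IMWRITE_JPEG_QUALITY), q_content])
        _, context_jpg = cv2.imencode('.jpg', context,
                                      [int(cv2.IMWRITE_JPEG_QUALITY), q_context])
        return content_jpg.tobytes(), context_jpg.tobytes()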

“In the present embodiment, a surround view 118 is generated after any enhancement algorithms are applied. The surround view can provide a multi-view interactive digital media representation. In various examples, the surround view can include a three-dimensional model of the content and a two-dimensional model of the context. However, in some cases the context can represent a “flat” view of the scenery: the background or scenery can be projected along a surface, such as a cylindrical or other-shaped surface, so that the context is not purely two-dimensional. In yet other examples, the context can include three-dimensional aspects.

According to various embodiments, surround views provide numerous advantages over traditional videos or two-dimensional images. These include the ability to cope with moving scenery, a moving acquisition device, or both; the ability to remove redundant information; and the ability for users to modify the view. These characteristics can be incorporated natively in the surround view representation and are available for use in many applications. For instance, surround views can be used in many fields, such as e-commerce, visual search, file sharing, user interaction, and entertainment.

According to various examples, once a surround view 118 has been generated, user feedback for acquisition 120 of additional image data can be provided. In particular, if a surround view is determined to need additional views to accurately represent the content or context, the user can be prompted to provide additional views. Once these additional views are received by the surround view acquisition system 100, they can be processed by the system 100 and incorporated into the surround view.

“With reference to FIG. 2, shown is an example of a process flow diagram for generating a surround view 200. In the present example, a plurality of images is obtained at 202. According to various embodiments, the plurality of images can include 2D images or data streams. These 2D images can include location information that can be used to generate a surround view, as described above with regard to FIG. 1. In various examples, the plurality of images can also include depth images accompanied by location information.

“As described in greater detail with regard to FIGS. 7A-11B, an AR system can be used to enhance the capture of the plurality of images. The plurality of images can be captured using a camera system on a mobile device. Live image data from the camera can be output to a display on the mobile device, and the live image data can be augmented with virtual data, such as guides and status indicators. The guides can help the user guide the motion of the mobile device, while the status indicators can indicate what portion of the images needed to generate the surround view has been captured. The virtual data may not be included in the image data that is captured for the purpose of generating the surround view.

According to various embodiments, the plurality of images obtained at 202 can include a variety of sources and characteristics. For instance, the plurality of images can be obtained from multiple users; these images can be a collection of 2D images or video gathered from the internet from different users. The plurality of images can also include images with different temporal information; in particular, images of the same object of interest can be taken at different times, such as multiple images of a particular statue taken at different times of day or in different seasons. In another example, the plurality of images can represent moving objects. The images can include an object of interest moving through the scenery, such as a vehicle traveling along a road or a plane flying through the sky. In other instances, the object of interest itself is in motion, such as a person running, dancing, or twirling.

“In the present example embodiment, the plurality of images is fused into content and context models at 204. According to various embodiments, the subject matter featured in the images can be separated into content and context. The context can be defined as the scenery surrounding the object of interest. In some embodiments, the content can be a three-dimensional model depicting an object of interest, while in others it can be a two-dimensional image.

“According to the present example embodiment, one or more enhancement algorithms can be applied to the content and context models at 206. These algorithms can be used to enhance the user experience and can include, for example, automatic frame selection, stabilization, view interpolation, filters, and/or compression. In some embodiments, these enhancement algorithms are applied to the image data during capture; in other embodiments, they are applied to the image data after acquisition.

“In the present embodiment, the surround view is generated from the content and context models at 208. The surround view can provide a multi-view interactive digital media representation. In various examples, the surround view can include a three-dimensional model of the content and a model of the context. Depending on the mode of capture and the viewpoints of the images, the surround view model can have certain characteristics; for instance, different styles of surround view include a locally concave surround view, a locally convex surround view, and a locally flat surround view. It should be noted, however, that surround views can include combinations of views and characteristics, depending on the application.

“With reference to FIG. 3, shown is one example of multiple camera views that can be combined to create a surround view. According to many embodiments, multiple images can be captured from different viewpoints and fused together to provide a surround view. In the present example, three cameras 312, 314, and 316 are positioned at locations 322, 324, and 326, respectively, in proximity to an object of interest 308. Scenery, such as object 310, can surround the object of interest 308. Views 302, 304, and 306, taken from their respective cameras 312, 314, and 316, include overlapping subject matter. Specifically, each view 302, 304, and 306 includes the object of interest 308 and varying degrees of visibility of the surrounding scenery 310. For instance, view 302 includes the object of interest 308 in front of the cylinder that is part of the surrounding scenery 310. View 306 shows the object of interest 308 to one side of the cylinder, and view 304 shows the object of interest without any view of the cylinder.

In the present embodiment, the views 302, 304, and 306, along with their associated locations 322, 324, and 326, respectively, provide rich information about the object of interest 308 and the surrounding context, which can be used to produce a surround view. For instance, when analyzed together, the various views 302, 304, and 306 provide information about the different sides of the object of interest and its relationship to the scenery. According to various embodiments, this information can be used to parse out the object of interest 308 as the content and the scenery as the context. Furthermore, as described above, various enhancement algorithms can be applied to images produced from these viewpoints to create an immersive, interactive viewing experience.

“FIG. 4 illustrates one example of a scene 400 from which a surround view can be generated. A surround view, according to various embodiments of the invention, is a multi-view interactive digital media representation of the scene 400. Referring to FIG. 4, a user 402 is located in the scene 400 and is capturing images of an object of interest, such as a statue. The digital visual data captured by the user can be used to generate a surround view.

According to various embodiments of the disclosure, the digital visual data included in a surround view can be separated, semantically and/or practically, into content 404 and context 406. In particular embodiments, the content 404 can include the object(s), person(s), or scene(s) of interest, while the context 406 represents the remaining elements of the scene surrounding the content 404. In some examples, a surround view can represent the content 404 as three-dimensional data and the context 406 as a two-dimensional panoramic background. In other examples, a surround view can represent both the content 404 and the context 406 as two-dimensional panoramic scenes. In yet other examples, the content 404 and context 406 can include three-dimensional components or aspects. In particular embodiments, the way the surround view depicts the content 404 and context 406 depends on the capture mode used to acquire the images.

In some cases, such as recordings of objects, persons, or parts thereof, recordings of large flat areas, and recordings of scenes in which no nearby subject is visible, the content 404 and context 406 can be the same. In these cases, the surround views produced can share some characteristics with other types of digital media, such as panoramas. However, according to various embodiments, surround views include additional features that distinguish them from these other types of digital media. For instance, a surround view can represent moving data. Additionally, a surround view is not limited to a specific cylindrical, spherical, or translational movement; image data can be captured with a camera or other capture device using various motions. Furthermore, unlike a stitched panorama, a surround view can display different sides of the same object.

“FIGS. 5A-5B illustrate examples of concave and convex views, respectively, where both views use a back-facing camera capture style. In particular, when a camera phone is used, these views use the camera on the back of the phone, facing away from the user. In particular embodiments, concave and convex views can affect how the content and context are designated in a surround view.

“With reference to FIG. 5A, shown is one example of a concave view in which a user is standing along a vertical axis 508. In this example, the user holds the camera such that the camera location 502 does not leave the axis 508 during image capture. However, as the user pivots about the axis 508, the camera captures a panoramic view of the scene around the user, forming a concave view. In this embodiment, the object of interest 504 and the distant scenery 506 are all viewed similarly because of the way the images are captured. In this example, all objects in the concave view appear to be at infinity, so the content corresponds to the context.

“With reference to FIG. 5B, shown is one example of a convex view 520 in which a user changes position while capturing images of an object of interest 524. In this example, the user moves around the object 524, taking pictures from different angles from camera locations 528, 530, and 532. Each of the images obtained includes a view of the object of interest and a background of the distant scenery 526. The object of interest 524 represents the content, and the distant scenery 526 represents the context.

“FIGS. 6A-6D illustrate examples of various capture modes for surround views. Although various motions can be used to capture a surround view, and a surround view is not constrained to any particular type of motion, three general types of motion can be used to capture particular features or views described in conjunction with surround views. These three types of motion, respectively, can yield a locally concave surround view, a locally convex surround view, and a locally flat surround view. In some examples, a single surround view can include combinations of these types of motion.

“With reference to FIG. 6A, shown is an example of a back-facing, concave surround view being captured. According to various embodiments, a locally concave surround view is one in which the viewing angles of the camera or other capture device diverge. In one dimension, this can be likened to the motion required to capture a spherical 360-degree panorama (pure rotation), although the motion can be generalized to any curved sweeping motion in which the view faces outward. In the present example, the experience is that of a stationary viewer looking out at a (possibly dynamic) context.

In the present example embodiment, a user 602 is using a back-facing camera 606 to capture images toward the world 600 and away from the user 602. A back-facing camera refers to a device with a camera that faces away from the user, such as the camera on the back of a smartphone. The camera is moved in a concave motion 608, such that views 604a, 604b, and 604c capture various parts of the capture area 609.

“With reference to FIG. 6B, shown is an example of a back-facing, convex surround view being captured. According to various embodiments, a locally convex surround view is one in which the viewing angles converge toward a single object of interest. In some examples, a locally convex surround view can provide the experience of orbiting about a point, so that the viewer can see multiple sides of the same object. This object, which can be an “object of interest,” can be segmented from the surround view to become the content, and any surrounding data can be segmented to become the context. Previous technologies failed to recognize this type of viewing angle in the media-sharing landscape.

In the present example embodiment, a user 602 is using a back-facing camera 614 to capture images toward the world 600 and away from the user 602. The camera is moved in a convex motion 610, such that views 612a, 612b, and 612c capture various parts of the capture area 611. As described above, the convex motion 610 can orbit around an object of interest, and in these examples views 612a, 612b, and 612c can show different sides of that object.

“With reference to FIG. 6C, shown is an example of a front-facing, concave surround view being captured. As described in various examples, a front-facing camera is one that faces toward the user, such as the camera on the front of a smartphone. Front-facing cameras are commonly used to take “selfies,” i.e., self-portraits of the user.

In the present example embodiment, camera 620 faces the user 602. The camera follows a concave motion 606, such that the views 618a, 618b, and 618c diverge from one another in an angular sense. The capture area 617 follows a concave shape that includes the user at its perimeter.

“With reference to FIG. 6D, shown is an example of a front-facing, convex surround view being captured. In the present example embodiment, camera 626 faces the user 602. The camera follows a convex motion 622, such that the views 624a, 624b, and 624c converge toward the user 602. As described above, various modes can be used to capture images for a surround view. These modes, including locally concave, locally convex, and locally linear motions, can be used during capture of individual images or during continuous recording of a scene. Such a recording can capture a series of images during a single session.

Next, with respect to FIG. 7A to FIG. 11B, details of an augmented reality (AR) system that can be used during image capture for surround views are described. In one embodiment, the AR system can be implemented on a mobile device, such as a cell phone. Virtual objects can be rendered into the live camera data displayed on the mobile device; that is, the virtual objects can be embedded into the live camera data. In particular embodiments, the virtual objects provide feedback to the user while images are being captured for a surround view.

“FIGS. 7A and 7B illustrate an example of a process flow for capturing images in a surround view using augmented reality. In 702, live image data can be received from a camera system. For example, live image data can be received from a camera on a mobile device, such as a smartphone. The image data can include pixel data captured from a camera sensor, and the pixel data varies from frame to frame. In one embodiment, the pixel data can be 2-D; in other embodiments, depth data can be included with the pixel data.

“In 704, sensor data can be received. For example, the mobile device can include an IMU with accelerometers and/or gyroscopes. The sensor data can be used to determine an orientation of the mobile device, such as a tilt orientation of the device relative to the gravity vector. Thus, the orientation of the live 2-D image data relative to the gravity vector can also be determined. In addition, when the accelerations applied by the user can be separated from those due to gravity, it can be possible to track changes in the position of the mobile device over time.
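
A minimal sketch of how the gravity direction and tilt might be recovered from raw accelerometer samples (assuming a simple low-pass filter rather than the platform's fused sensor output, and with all names and constants assumed):

    import numpy as np

    def split_gravity(accel_samples, alpha=0.98):
        """Separate gravity from user-applied acceleration in accelerometer data.

        accel_samples: (N, 3) accelerations in m/s^2 in the device frame.
        A low-pass filter estimates the slowly varying gravity component; the
        remainder approximates the acceleration applied by the user.
        """
        gravity = np.zeros_like(accel_samples)
        gravity[0] = accel_samples[0]
        for i in range(1, len(accel_samples)):
            gravity[i] = alpha * gravity[i - 1] + (1.0 - alpha) * accel_samples[i]
        user_accel = accel_samples - gravity
        return gravity, user_accel

    def tilt_angle(gravity_vec, camera_axis=(0.0, 0.0, 1.0)):
        """Angle (radians) between the camera axis and the estimated gravity vector."""
        g = np.asarray(gravity_vec) / np.linalg.norm(gravity_vec)
        return np.arccos(np.clip(np.dot(g, np.asarray(camera_axis)), -1.0, 1.0))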

“In certain embodiments, a camera reference frame can be determined. In the camera reference frame, one axis is aligned with a line perpendicular to the camera lens. Using an accelerometer, the camera reference frame can be related to an Earth reference frame, which is a 3-D coordinate system in which one of the axes is aligned with the Earth's gravitational vector. The relationship between the camera frame and the Earth reference frame can be expressed in terms of yaw, roll, and tilt/pitch. Typically, at least two of these three quantities are available from the sensors on a mobile device, such as its accelerometers and gyroscopes.

The combination of the yaw-roll-tilt information from the sensors, such as a smart phone's or tablet's accelerometers, and the data from the camera, including the pixel data, can be used to relate the 2-D arrangement of pixels in the camera field of view to the 3-D reference frame of the real world. In one embodiment, the 2-D pixel data of each picture can be related to a reference frame, such as the Earth reference frame, as if a line were drawn through the center of the lens perpendicular to the lens surface. This calibration of the pixel data allows a curve or object defined in 3-D space to be mapped onto the plane associated with the 2-D pixel data. If depth data is available (i.e., the distance from the camera to a pixel), this information can also be used in the transformation.
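
As a hedged illustration of this mapping, the following sketch rotates a point defined in an Earth-aligned frame into the camera frame using IMU-derived pitch and roll and then applies a pinhole projection; the intrinsics (fx, fy, cx, cy) and the axis conventions are assumptions made for the example:

    import numpy as np

    def rotation_from_pitch_roll(pitch, roll):
        """Rotation taking Earth-frame coordinates into the camera frame.

        Pitch and roll (radians) are assumed to come from the IMU; yaw is omitted
        because an accelerometer alone cannot observe heading.
        """
        cp, sp = np.cos(pitch), np.sin(pitch)
        cr, sr = np.cos(roll), np.sin(roll)
        rot_x = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])   # pitch about x
        rot_z = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])   # roll about z
        return rot_x @ rot_z

    def project_point(point_earth, camera_pos, rot_earth_to_cam, fx, fy, cx, cy):
        """Project a 3-D point defined in the Earth frame into 2-D pixel coordinates."""
        p_cam = rot_earth_to_cam @ (np.asarray(point_earth) - np.asarray(camera_pos))
        if p_cam[2] <= 0:
            return None                     # behind the camera
        u = fx * p_cam[0] / p_cam[2] + cx
        v = fy * p_cam[1] / p_cam[2] + cy
        return u, v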

In alternate embodiments, the 3-D reference frame in which an object is defined does not have to be the Earth reference frame. The 3-D reference frame is simply one in which an object can be drawn and then rendered into the 2-D pixel frame, and it can be defined relative to the Earth reference frame. In another embodiment, a 3-D reference frame can be defined relative to an object or surface identified in the pixel data, and the pixel data can then be calibrated to this 3-D reference frame.

“As an example, the object or surface can be defined by a number of tracking points identified in the pixel data. Then, as the camera moves, the sensor data and the new positions of the tracking points can be used to determine, from frame to frame, a change in the orientation of the 3-D reference frame. This information can then be used to render virtual data into the live image data and/or into a surround view.

“Returning to FIG. 7A, in 706, virtual data associated with a target can be generated in the live image data. For example, the target can be rendered as crosshairs; in general, the target can be rendered as any shape or combination of shapes. In one embodiment, the user may be able to adjust the position of the target via an input interface. For example, using a touch screen over the display on which the live image data is shown, the user may be able to place the target at a particular location in the synthetic image. The synthetic image can include a combination of the live image data rendered with one or more virtual objects.

“For example, the target can be placed over an object that appears in the image, such as a face or a person. Then, via an interface, the user can provide additional input indicating that the target is at the desired location. For instance, the user can tap the touch screen at the location of the target, and an object in the image beneath the target can then be selected. As another example, a microphone in the interface can be used to receive voice commands that direct the position of the target in the image.

“In some instances, object recognition can be available. Object recognition can identify possible objects in the image, and the live image data can then be augmented with indicators, such as targets, that mark the identified objects. For example, objects such as people, parts of people (e.g., faces), and cars can be marked in the image. Via an interface, the user may be able to select one of the marked objects. In another embodiment, the user may be able to select an object via a voice command; for instance, the user might say “select face” or “select car.”

“In 708, the object selection can be received. The object selection can be used to determine an area within the image data, which can be larger than the target, for identifying tracking points that associate the object with the live image data.

In 710, tracking points associated with the object can be identified. Once an object is selected, the tracking points on it can be identified on a frame-to-frame basis. Thus, if the camera translates or changes orientation, the locations of the tracking points in the new frame can be identified, and the target can be rendered into the live image data so that it appears to stay over the tracked object. This feature is discussed in greater detail below.

“In 712, a 3-D coordinate system in the physical world can be associated with the image data. As described above, this can be done using the camera reference frame associated with the 2-D pixel data. In one embodiment, the 2-D image data can be calibrated so that the 3-D coordinate system is anchored to the selected target.

“Then, in 714, a 2-D or 3-D trajectory or path can be defined in the 3-D coordinate system. For example, a trajectory or path, such as an arc or a parabola, can be mapped onto a drawing plane that is perpendicular to the Earth's gravity vector. As described above, based on the orientation of the camera (e.g., from IMU information), the camera reference frame, including the 2-D pixel data, can be mapped to the Earth reference frame. This mapping can be used to render the 3-D curve defined in the Earth reference frame into the live 2-D pixel data. A synthetic image including the live image data and the virtual object (the trajectory or path) can then be output to a display.
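
Continuing the illustrative sketch above (with the same assumed Earth frame whose z axis is aligned with gravity), a circular guide path can be sampled in a plane perpendicular to the gravity vector and each sample projected into the live 2-D pixel frame with a routine such as the project_point sketch shown earlier; all names are assumptions:

    import numpy as np

    def circle_in_gravity_plane(center_earth, radius, n_points=64):
        """Sample a circular guide path lying in a plane perpendicular to gravity.

        The Earth frame is assumed to have its z axis aligned with the gravity
        vector, so the circle lies in an x-y plane at the height of center_earth.
        """
        angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
        return np.stack([center_earth[0] + radius * np.cos(angles),
                         center_earth[1] + radius * np.sin(angles),
                         np.full_like(angles, center_earth[2])], axis=1)

    def render_guide(points_earth, project):
        """Map each 3-D guide point into 2-D pixel coordinates for overlay drawing.

        project is a callable (e.g. project_point bound to the current camera pose
        and intrinsics); points that fall behind the camera are skipped.
        """
        return [uv for uv in (project(p) for p in points_earth) if uv is not None]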

In general, virtual objects can be defined in a 3-D coordinate system, such as the Earth reference frame or some other coordinate system related to the orientation of the camera. The virtual objects can then be rendered into the 2-D pixel data associated with the live image data to create a synthetic image, and the synthetic image can be output to a display.

“In one embodiment, the curves or surfaces can be associated with a 3-D model of an object, such as a person or a car. In another embodiment, the curves or surfaces can be associated with text, so that a text message can be rendered into the live image data. In other embodiments, textures can be assigned to the surfaces in the 3-D model, and when a synthetic image is created, these textures can be rendered into the 2-D pixel data of the live image.

“When a curve is defined in 3-D coordinates on a drawing plane, such as one defined in the Earth reference frame, one or more of the determined tracking points can be projected onto the drawing plane. As another example, a centroid associated with the tracked points can be projected onto the drawing plane. The curve can then be defined relative to one or more of the points projected onto the drawing plane. For instance, based on the target location, a point can be determined on the drawing plane and then used as the center of a circle or arc drawn in the drawing plane.
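
One way this could be realized, sketched here under the assumption that the tracking points have already been expressed in a gravity-aligned Earth frame, is to drop the points onto a horizontal drawing plane, take their centroid as the circle center, and size the radius from their spread (the margin factor is an arbitrary choice):

    import numpy as np

    def guide_circle_from_tracking_points(points_earth, margin=1.3):
        """Define a guide circle from tracking points projected onto a drawing plane.

        points_earth: (N, 3) tracking point positions in an Earth frame whose z
        axis is aligned with gravity. The points are projected onto a horizontal
        drawing plane, their centroid becomes the circle center, and the radius
        is the largest projected distance scaled by an assumed margin.
        """
        plane_height = float(np.mean(points_earth[:, 2]))
        projected = points_earth.copy()
        projected[:, 2] = plane_height                  # drop onto the drawing plane
        centroid = projected.mean(axis=0)
        radius = margin * float(np.max(
            np.linalg.norm(projected[:, :2] - centroid[:2], axis=1)))
        return centroid, radius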

“In 714, based upon the associated coordinate systems, the curve can then be rendered into the live image data as part of the AR system. In general, one or more virtual objects, including a plurality of lines, curves, or surfaces, can be rendered into the live image data. The synthetic image, including the live image data and the virtual objects, can then be output to a display in real time.

“In one embodiment, the virtual objects rendered into the live image data can be associated with capturing images for a surround view. For instance, the user can indicate a desire to create a surround view of a real object identified in the live image data. The desired surround view can span any angular range, such as forty-five, ninety, or one hundred eighty degrees. A virtual object can then be rendered into the live image data as a guide, where the guide can indicate the path along which the camera should move and the progress along that path. Insertion of the guide can involve modifying the pixel data in the live image data in accordance with the coordinate system determined in 712.

In the example above, the real object can be any object that appears in the live image data, and a 3-D model of the real object may not be constructed. Instead, pixel locations or pixel areas in the 2-D pixel data can be associated with the real object. This definition of the real object is far less computationally expensive than attempting to construct a 3-D model of it. Furthermore, transformations such as projections between image space and a 3-D coordinate system can be avoided, which further reduces computational costs.

“The virtual objects, such as lines and surfaces, can be modeled in 3-D space and can be defined a priori. Because the virtual objects are defined in advance, their shapes don't need to be constructed in real time, which is computationally expensive. The real objects that may appear in an image, however, are not known a priori, and hence 3-D models of the real objects are typically not available. Therefore, the synthetic image can include “real” objects that are defined only in 2-D image space, via assigned tracking points or areas, together with virtual objects that are modeled in a 3-D coordinate system and then rendered into the live image data.

“Returning to FIG. 7A, an AR image including one or more virtual objects can be output. The pixel data in the live image data can be received at a particular frame rate. In certain embodiments, the augmented frames can be output at the same frame rate at which the pixel data is received; in other embodiments, they can be output at a reduced frame rate, which can reduce computational requirements. For example, live data can be received at thirty frames per second and synthetic images output at fifteen frames per second. In another embodiment, the AR images can be output at a reduced resolution, such as 240p instead of the standard 480p, which can also lower computational requirements.”

“In 718, one or more images can be selected from the live image data and stored for use in a surround view. In some embodiments, the stored image data can include one or more of the virtual objects; that is, the virtual objects can be incorporated into the surround view. In other embodiments, the virtual objects are output only as part of the AR system, and the image data stored for the surround view may not include them.

“In other embodiments, a portion of the virtual objects output to the display as part of the AR system can be stored. For example, the AR system can be used to render a guide during the surround view capture and to render a label associated with the surround view. The label may be stored in the surround view image data, while the guide may not be stored. To store the images separately from the rendered virtual objects, a copy of the images may be needed: the copy can be modified with the virtual data and then output to the display while the original is stored, or the original can be stored prior to its modification.

“In FIG. 7B, the method described in FIG. 7A is continued. In 722, new image data can be received, and in 724, new IMU data (or, more generally, sensor data) can be received. The IMU data can indicate a current orientation of the camera. In 726, the locations of the tracking points identified in previous image data can be identified in the current image data.

“The camera may have translated or rotated, so the tracking points may appear at different locations in the pixel data. As described above, the tracking points can be used to define a real object appearing in the live image data. Identifying their locations in the new image data therefore allows the real object to be tracked from image to image. Knowledge of the frame rate and of the differences in IMU data from frame to frame can help locate the tracking points in the new live image data.
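
A hedged sketch of this frame-to-frame update, assuming OpenCV's pyramidal Lucas-Kanade tracker and an optional IMU-derived pixel shift used to seed the search (both assumptions, not the disclosed method):

    import cv2
    import numpy as np

    def update_tracking_points(prev_gray, curr_gray, prev_pts, predicted_shift=None):
        """Locate the previous frame's tracking points in the new frame.

        prev_pts: (N, 1, 2) float32 points. predicted_shift, if given, is a coarse
        (dx, dy) guess derived from IMU deltas and the frame rate; it seeds the
        search so larger motions converge faster. Parameters are illustrative.
        """
        guess = prev_pts.copy()
        if predicted_shift is not None:
            guess += np.asarray(predicted_shift, dtype=np.float32)
        curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(
            prev_gray, curr_gray, prev_pts, guess,
            flags=cv2.OPTFLOW_USE_INITIAL_FLOW, winSize=(21, 21), maxLevel=3)
        found = status.flatten() == 1
        return curr_pts[found], found     # surviving points and a visibility mask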

The tracking points associated with a real object appearing in the live image data can change over time; as the camera moves around the real object, for example, some tracking points can become occluded. Thus, in 726, a determination can be made as to whether a tracking point is still visible in an image. In addition, a determination can be made as to whether a new portion of the targeted object has come into view. New tracking points can be added to the new portion so that the object can continue to be tracked in real time from frame to frame.

“In 728, a coordinate system can be associated with the image. For example, using the orientation of the camera determined from the sensor data, the pixel data can be calibrated to an Earth reference frame, as described above. In 730, based on the tracking points currently placed on the object and the coordinate system, a target location can be determined. The target can be placed over the real object, which is tracked in the live image data. As described above, the number and locations of the tracking points in an image can vary with time as the position of the camera changes relative to the object. Thus, the location of the target in the 2-D pixel data can change, and a virtual object representing the target can be rendered into the live image data at its current location.

“In 732, a track location in the live image data can be determined. The track can be used to provide feedback regarding the position and orientation of the camera in physical space during the image capture process for a surround view. As an example, the track can be rendered in a drawing plane defined relative to the gravity vector, such as a plane perpendicular to it, and relative to the position of the target, which is a virtual object placed over the real object appearing in the live image data. Thus, the track can appear to surround or partially surround the object. As described above, the position of the object can be determined from the current tracking points associated with it in the image, and that position can be projected onto the selected drawing plane.

“In 734, a capture indicator status can be determined. The capture indicator provides feedback on how much of the image data needed for a surround view has been captured. For example, the status indicator can indicate that half of the images required for the surround view have been captured. In another embodiment, the status indicator can provide feedback on whether the camera is following a desired path and maintaining a desired orientation in physical space. Thus, the status indicator can indicate whether the current path and orientation of the camera are desirable; when they are not, the status indicator can indicate the type of correction needed, such as but not limited to moving the camera more slowly, starting the capture over, tilting the camera in a particular direction, and/or translating the camera in a particular direction.

“In 736, a capture indicator location can be determined. This location can be used to render the capture indicator into the live image data and to generate the synthetic image. In one embodiment, the location of the capture indicator can be determined relative to the position of the real object in the image, as indicated by the current set of tracking points, such as above and to the left of the object. In 738, a synthetic image, i.e., a live image augmented with virtual objects, can be generated. The synthetic image can include the target, the track, and one or more status indicators at their determined locations. In 740, image data captured for the purposes of the surround view can be stored. As described above, the stored image data can be the raw image data without the virtual objects, or it can include the virtual objects.

“In 742, a check can be made as to whether the images needed to generate a surround view have been captured in accordance with the selected parameters, such as a surround view spanning a desired angular range. When the capture is not complete, additional image data can be received and the method can return to 722. When the capture is complete, a virtual object can be rendered into the live image data indicating that the capture of the surround view is complete, and the surround view can be generated. Some virtual objects associated with the capture process may then cease to be rendered; for example, once the needed images have been captured, the track used to help guide the camera may no longer appear in the live image data.

“FIGS. 8A and 8B illustrate aspects of generating an Augmented Reality (AR) image capture track for capturing images used in a surround view. In FIG. 8A, a mobile device 814 with a display 816 is shown. The mobile device can include at least one camera (not shown) with a field of view 800. A real object 802, which is a person in this example, is selected in the field of view 800 of the camera. A virtual object, which is a target (not shown), may have been used to help select the real object; for example, the target may have been placed over the object 802 on a touch screen of the mobile device 814 and then selected.

“The camera can include an image sensor that captures light from the field of view 800. The data from the image sensor can be converted into pixel data. The pixel data can be modified before it is output to the display 816 to generate a synthetic image; the modifications can include rendering virtual objects into the pixel data as part of an augmented reality (AR) system.

“Using the pixel data and/or the selection of the object 802, tracking points on the object can be determined. These tracking points can define the object in image space; the current locations of such tracking points, for example 805, 806, and 808, are attached to the object 802. As the camera changes position and orientation, the location and shape of the object 802 in the captured pixel data can change, and the locations of the tracking points in the pixel data can change as well. Thus, a previously identified tracking point can move from one location in the image data to another. A tracking point can also disappear from the image as portions of the object become occluded.

“Using sensor data from the mobile device 814, an Earth reference frame 3-D coordinate system 804 can be associated with the image data. The arrow 810 indicates the direction of the gravity vector. As described above, in particular embodiments the 2-D image data can be calibrated relative to the Earth reference frame. The arrow representing the gravity vector is not rendered into the live image data; however, if desired, an indicator representative of the gravity vector could be rendered into the synthetic image.

Next, a plane that is perpendicular to the gravity vector can be determined. The location of the plane can be determined using the locations of the tracking points in the image, such as 805, 806, and 808. Using this information, a curve, which is a circle in this example, can be drawn in the plane. The circle can be rendered into the 2-D image data and output as part of the AR system, so that on the display it appears to surround the object 802. The circle can then be used as a guide for capturing the images used in a surround view.

“If the camera on the mobile device 814 is rotated or tilted, the shape and position of the object 802 on the display 816 will change. The new orientation of the camera in space, including the direction of the gravity vector, can be determined. Using the new orientation, the plane associated with the gravity vector can again be determined, and the position of the plane, and thus the position of the curve in the image, can be determined using a centroid computed from the tracking points on the object 802. Thus, as the camera 814 moves, the curve can continue to appear parallel to the ground, and its position in the live image data can change as the position or apparent shape of the object 802 changes.

“FIG. 8B shows a mobile device 834 that includes a camera (not shown) and a display 836 for outputting the image data from the camera. A cup 822 appears in the field of view of the camera. Tracking points, such as 824 and 826, have been associated with the object 822; these tracking points can define the object 822 in image space. Using the IMU data from the mobile device 834, an Earth reference frame has been associated with the image data. As described above, in one embodiment the pixel data can be calibrated to this Earth reference frame. The reference frame is indicated by the 3-D axes 824, and the direction of the gravity vector is indicated by the arrow 828.

“As described above, a plane relative to the Earth reference frame can be determined. In this example, the plane is perpendicular to the axis associated with the gravity vector and is used to define a path for the surround view relative to the object 822. In general, any plane can be determined in the Earth reference frame, and the curve that acts as the guide can then be rendered into the selected plane.

“In one embodiment, using the locations and coordinates of the tracking points, a centroid of the object 822 on the selected plane in the Earth reference frame can be determined. A curve 830, such as a circle, can then be drawn relative to the centroid. In this example, the circle is drawn around the object 822 in the selected plane.

The curve 830 can serve as a guide for moving the camera along a particular path, and the images captured along this path can then be converted into a surround view. In one embodiment, a position of the camera along the path can be determined, and an indicator of the current position of the camera along the path can be generated. In this example, the current position is indicated by the arrow 832.

The position of the camera along the path may not directly correspond to a position in physical space. For example, the IMU data and, optionally, the frame rate can be used to estimate a distance traveled by the camera, and this estimated distance can then be mapped to a distance traveled along the curve; the distance traveled along the path 830 does not have to equal the distance traveled in physical space. In another example, a total time needed to traverse the path 830 can be estimated, and the amount of time during which images have been recorded can be tracked. The ratio of the recording time to the total time can then be used to indicate progress along the path 830.
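
Both options could be realized along the lines of the sketch below, which assumes gyroscope yaw rates about the gravity axis for the angular variant and simple wall-clock timing for the alternative; none of the names come from the disclosure:

    import numpy as np

    def progress_from_gyro(yaw_rates, dt, target_sweep_rad):
        """Progress along a circular guide path from integrated gyroscope data.

        yaw_rates: angular velocities (rad/s) about the gravity axis; dt is the
        sample period. The swept angle relative to the desired angular range is
        one possible way to drive the capture indicator.
        """
        swept = abs(float(np.sum(yaw_rates)) * dt)
        return min(swept / target_sweep_rad, 1.0)

    def progress_from_time(elapsed_s, estimated_total_s):
        """Alternative: ratio of recording time to the estimated traversal time."""
        return min(elapsed_s / estimated_total_s, 1.0)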

“The path 830, which is an arc in this example, and the arrow 832 are rendered into the live image data as virtual objects in accordance with their positions in the 3-D coordinate system associated with the live 2-D image data. The cup 822, the circle 830, and the arrow 832 are shown on the display 836. The orientations of the curve 830 and the arrow 832 on the display can change if the orientation of the camera changes relative to the cup 822.

“In particular embodiments, the apparent size of the object 822 in the image data can change. For example, a digital zoom can be used to make the object appear larger or smaller, or the camera on the mobile device 834 can be moved closer to or farther away from the object 822 to increase or decrease the size of the object in the image.

As the apparent size of the object changes, the distances between the tracking points can change; that is, the pixel distances between tracking points can increase or decrease. These distance changes can be used to determine a scaling factor. In one embodiment, as the size of the object changes, the AR system can scale the curve 830 and/or the arrow 832 accordingly, so that the curve maintains a relatively constant size with respect to the object.
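
For illustration, the scaling factor described above could be computed from how far the tracking points have spread from their centroid between two frames (function and variable names are assumed):

    import numpy as np

    def scale_factor(prev_pts, curr_pts):
        """Estimate how much the tracked object has grown or shrunk on screen.

        prev_pts, curr_pts: (N, 2) corresponding tracking point locations in two
        frames. The ratio of mean distances from the centroid gives a scale that
        can be applied to the rendered guide so it keeps a roughly constant size
        relative to the object.
        """
        spread_prev = np.mean(np.linalg.norm(prev_pts - prev_pts.mean(axis=0), axis=1))
        spread_curr = np.mean(np.linalg.norm(curr_pts - curr_pts.mean(axis=0), axis=1))
        return float(spread_curr / max(spread_prev, 1e-6))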

In another embodiment, the size of the curve can remain fixed. For example, a diameter of the curve can be related to a pixel height or width of the image, such as eighty percent of the pixel height or width. Thus, the object 822 can appear to grow or shrink as a zoom is used or as the position of the camera changes, but the size of the curve 830 in the image can remain relatively fixed.

“FIG. 9 shows a second example of generating an Augmented Reality (AR) image capture track for capturing images used in a surround view on a mobile device. The figure shows the display of the mobile device at three times, 906a, 906b, and 906c. A touch screen can be provided over the display.

“In state 906a, an image of an object 908 is output to the display. The object is a rectangular box. The image data output to the display can be live image data from a camera on the mobile device; a remote camera could also be used as the source of the image data.

“In one embodiment, a target, such as 910, can be rendered to the display. The target can be combined with the live image data to create a synthetic image. Via the input interface on the mobile device, the user may be able to adjust the position of the target. The target can be placed over an object, and then additional input can be provided to select that object; for example, the touch screen can be tapped at the location of the target.

“In another embodiment, object recognition can be applied to the live image data. Various markers can then be rendered to the display, indicating the locations of identified objects in the live image data. To select an object, the touch screen can be tapped at the location of one of the markers in the image.

“After an object is selected, a number of initial tracking points can be identified on the object, such as 912, 914, and 916. In one embodiment, the tracking points may not be visible on the display; in another embodiment, they can be rendered to the display. In the latter case, if a tracking point does not appear to lie on the object of interest, the user may be able to select it and delete it, or move it so that it lies on the object.

“Next, an orientation of the mobile device can change. The orientation change can include a rotation of the device through one or more angles as well as translational motion. The orientation change and the current orientation of the device can be captured via IMU data from the IMU 902 on the device.

“As the orientation of the device changes, one or more of the tracking points, such as 912, 914, and 916, can become occluded. In addition, the shapes of the surfaces currently appearing in the image can change. Based on changes between frames, movement at various pixel locations can be determined. Using the IMU data, and optionally the detected movement, the appearance of the surfaces associated with the object 908 can be predicted. As the position of the camera changes, new surfaces can come into view in the image, and new tracking points can be added to these surfaces.

Summary for “Live Augmented Reality Guides”

“We will now refer to specific examples of the invention, including the best methods that the inventors considered for carrying out their invention. The accompanying drawings show examples of specific embodiments. Although the disclosure is made in connection with the specific embodiments described, it should be understood that the invention is not limited to these embodiments. It is, however, intended to include all modifications and equivalents that may be within the scope and spirit of the invention, as described in the appended claims.

“The following description provides a detailed understanding of the invention. Some embodiments of this invention can be implemented without all or some of these details. Other instances of well-known process operations are not described in detail to avoid confusing the present invention.

“Various aspects of this invention concern systems and methods that analyze the spatial relationship among multiple images and videos together with location data. This creates a single representation called a surround view. It eliminates redundancy and provides a user an immersive and interactive active viewing experience. Active is used in various embodiments to allow a user to change the view of the visual information on a screen.

“In particular, AR is used in example embodiments to assist a user in capturing multiple images in a surround view. A virtual guide can be embedded into live image data on a mobile. A virtual guide is a tool that allows the user to guide the mobile device along the desired path for creating surround views. AR images have the ability to respond to the movements of the mobile phone. You can determine the movement of your mobile device from many sources, including an Inertial Measurement Unit or image data.

“A surround view, according to different embodiments of the invention, is a multi-view interactive media representation. Referring to FIG. FIG. 1 shows an example of a surround-view acquisition system 100. The surround view acquisition system 100 in the present example embodiment is shown in a flow sequence that can generate a surround image. According to different embodiments, data that is used to create a surround view can be derived from many sources.

To create a surround view, data like, but not limited, to 2D images 104 can be used. These 2D images may include color image streams, such as multiple sequences of images, video data, etc. or multiple images in any number of formats, depending on the application. With respect to FIGS. An AR system can be used during image capture. AR systems can augment live image data by incorporating virtual data. Virtual data can contain guides that allow users to direct the movement of an image-capture device.

Environment information 106 is another source of data that can help to create a surround view. This environment information 106 is available from various sources, including accelerometers and gyroscopes as well as magnetometers, GPS, WiFi and IMU-like systems (Inertial Measurement unit systems). Depth images are another source of data that could be used to create a surround view. These depth images include 3D or depth image data streams and can be captured using devices such as stereo cameras, time of flight cameras, three-dimensional cameras and the like.

“In the current example embodiment, the data may then be fused at sensor fusion block 110. Some embodiments allow for a surround view to be created using a combination data set that includes 2D images 104, environment information 106 and depth images 108. Other embodiments allow depth images 108 to be combined with environment information 106 at sensor fusion block 110. You can combine different combinations of image data with the environment information at 106 depending on your application and data.

“In the current example embodiment, the data fused at sensor fusion block 110 can then be used for content modeling 112 and context modeling 114. As described in more detail with regard to FIG. 4, the subject matter in the images can be separated into content and context. The context can be defined as the scenery surrounding the object of interest, while the content can be a three-dimensional model depicting an object of interest, although in certain embodiments the content can be a two-dimensional image. In some embodiments, the context can be a two-dimensional model depicting the scenery surrounding the object of interest. Although in many examples the context provides two-dimensional views of the scenery surrounding the object of interest, some embodiments include three-dimensional aspects. For instance, the context can be depicted as a “flat” image along a cylindrical canvas, such that the image appears projected onto the surface of a cylinder. Some examples also include three-dimensional context modeling, such as when objects in the surrounding scenery are identified as three-dimensional objects. The models generated by content modeling 112 and context modeling 114 can be combined with the location information data, as described with regard to FIG. 3.”

According to different embodiments, the content and context of a surround view are determined based on a specific object of interest. In some examples, an object of interest can be selected automatically based on processing of the image and location information data. For instance, if a dominant object is detected in a series of images, it can be chosen as the content. In other examples, as shown in FIG. 1, a user-specified target 102 can be selected. It should be noted, however, that in certain applications a surround view can be generated without a user-specified target.

“In the current example embodiment, one or more enhancement algorithms can be applied at enhancement algorithm(s) block 116. In certain embodiments, various algorithms can be employed during the capture of surround view data, regardless of the capture mode used. These algorithms can be used to enhance the user experience. For instance, automatic frame selection, stabilization, view interpolation, filters, and/or compression can be used during capture of surround view data. In some examples, these enhancement algorithms are applied to the image data after acquisition; in other examples, they are applied during the capture of the surround view data.

“According to certain example embodiments, automatic frame selection can be used to create a more enjoyable surround view. Frames are automatically selected so that the transition between them is smoother or more even. In some applications, this automatic frame selection can incorporate blur and overexposure detection, as well as sampling poses more uniformly so that they are more evenly distributed.”
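
The patent does not prescribe a particular selection algorithm, but as a rough sketch of the kind of scoring such automatic frame selection might use, the hypothetical Python below ranks frames by a Laplacian-variance sharpness measure, penalizes overexposed frames, and keeps the best frame in each temporal bucket so the selection is roughly evenly spaced. The function names, thresholds, and bucket count are illustrative assumptions.

```python
# Hypothetical sketch of automatic frame selection: score each frame for
# sharpness (variance of the Laplacian) and overexposure (fraction of
# near-white pixels), then keep the best frame in each temporal bucket so
# the selected frames are roughly evenly spaced in time.
import cv2
import numpy as np

def frame_score(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()   # higher = less blur
    overexposed = np.mean(gray > 240)                   # fraction of blown-out pixels
    return sharpness * (1.0 - overexposed)

def select_frames(frames, num_keyframes=12):
    # Split the capture into equal temporal buckets and keep the best-scoring
    # frame from each bucket, which approximates uniform sampling of poses.
    buckets = np.array_split(np.arange(len(frames)), num_keyframes)
    return [int(max(b, key=lambda i: frame_score(frames[i]))) for b in buckets if len(b)]
```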

In some embodiments, stabilization can be used for a surround view in a manner similar to that used for video. In particular, keyframes in a surround view can be stabilized to produce improvements such as smoother transitions and an enhanced focus on the content. Unlike video, however, a surround view can be stabilized using additional sources of information, such as IMU data, depth information, computer vision techniques, direct selection of an area to be stabilized, face detection, and the like.

For instance, IMU information can be very helpful for stabilization. IMU information provides an estimate, although sometimes a rough or noisy one, of the camera tremor present during capture. This estimate can be used to remove, cancel, or reduce the effects of that tremor.
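
One common way to apply such gyro-based correction, offered here only as a sketch under assumed conventions (not the patented method), is to counter-rotate each frame by the inter-frame rotation estimated from the IMU using the homography H = K R^T K^-1, where K is the camera intrinsic matrix. All numeric values below are illustrative.

```python
# Sketch of gyro-based rotational stabilization (assumed approach): a small
# inter-frame camera rotation R estimated from the gyroscope can be undone by
# warping the image with the homography H = K @ R.T @ inv(K).
import cv2
import numpy as np

def stabilize_frame(frame, R, K):
    H = K @ R.T @ np.linalg.inv(K)           # undo the estimated rotation
    h, w = frame.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h))

# Example: undo a 1.5-degree rotation about the camera's vertical axis.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])   # illustrative intrinsics
theta = np.deg2rad(1.5)
R = np.array([[np.cos(theta), 0, np.sin(theta)],
              [0, 1, 0],
              [-np.sin(theta), 0, np.cos(theta)]])
stabilized = stabilize_frame(np.zeros((480, 640, 3), np.uint8), R, K)
```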

“In some examples, depth information, if available, can be used to provide stabilization for a surround view. Because points of interest in a surround view are three-dimensional rather than two-dimensional, they are easier to track and match, as the search space shrinks. Descriptors for points of interest can use both color and depth information, which makes them more discriminative. In addition, depth information can make automatic or semi-automatic content selection easier to provide. For instance, a user can select a particular pixel of an image and that selection can be expanded to fill the entire surface it touches. Furthermore, content can also be selected automatically using a foreground/background differentiation based on depth. The content can then remain relatively stable and visible in such cases.
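
As a minimal illustration of foreground/background differentiation based on depth (a hypothetical sketch, assuming a depth map aligned with the color image), the snippet below grows a content mask from the depth at a user-selected pixel.

```python
# Minimal sketch of depth-based content selection (assumed approach): start
# from a user-selected pixel, take its depth, and keep everything within a
# tolerance of that depth as foreground "content"; the rest is "context".
import numpy as np

def select_content_mask(depth_m, seed_xy, tolerance_m=0.4):
    seed_depth = depth_m[seed_xy[1], seed_xy[0]]
    return np.abs(depth_m - seed_depth) < tolerance_m   # boolean foreground mask

# Example with a synthetic depth map: a near object (1.0 m) on a far wall (3.0 m).
depth = np.full((480, 640), 3.0)
depth[200:300, 250:400] = 1.0
mask = select_content_mask(depth, seed_xy=(300, 250))
```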

According to various examples, computer vision techniques can also be used to stabilize surround views. For instance, keypoints can be detected and tracked. In certain scenes, however, such as a dynamic scene or a static scene with parallax, no simple warp can stabilize everything. Consequently, there is a trade-off in which certain aspects of the scene receive more attention for stabilization and other aspects receive less. Because a surround view is often focused on a particular object of interest, the surround view can, in certain examples, be content-weighted so that the object of interest is maximally stabilized.

Another way to improve stabilization in a surround view is direct selection of a region on the screen. For example, if a user taps to focus on a particular area of the screen and then records a convex surround view, the tapped area can be maximally stabilized. This allows the stabilization algorithms to be focused on a specific area or object of interest.

“In some cases, face detection can be used to provide stabilization. When recording with a front-facing camera, it is often likely that the user is the object of interest in the scene. Face detection can therefore be used to weight stabilization toward that area. When face detection is precise enough, facial features themselves (such as the eyes, nose, and mouth) can be used as the areas to stabilize, rather than generic keypoints. In another example, a user can select an area of the image to use as a source of keypoints.

“According to various examples, view interpolation can be used to improve the viewing experience. In particular, to avoid sudden “jumps” between stabilized frames, synthetic intermediate views can be rendered on the fly. This can be informed by content-weighted keypoint tracks, IMU information, and denser pixel-to-pixel matches. If depth information is available, the process can be simplified. In some embodiments, view interpolation is applied during the capture of a surround view; in other embodiments, it is applied during surround view generation.
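
The patent leaves the interpolation method open; one common flow-based approach, sketched here with assumed OpenCV defaults, estimates dense optical flow between two stabilized keyframes and warps the first frame halfway along the flow to approximate an intermediate view.

```python
# Hedged sketch of view interpolation between two stabilized keyframes using
# dense optical flow: warp the first frame roughly halfway along the estimated
# flow field to synthesize an approximate intermediate view.
import cv2
import numpy as np

def interpolate_halfway(frame_a, frame_b):
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(gray_a, gray_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = gray_a.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Sample frame_a half a flow step "back" to approximate the midpoint view.
    map_x = (grid_x - 0.5 * flow[..., 0]).astype(np.float32)
    map_y = (grid_y - 0.5 * flow[..., 1]).astype(np.float32)
    return cv2.remap(frame_a, map_x, map_y, cv2.INTER_LINEAR)

# Example with two synthetic frames: a bright square shifted 5 pixels.
a = np.zeros((120, 160, 3), np.uint8); a[40:80, 40:80] = 255
b = np.roll(a, 5, axis=1)
mid_view = interpolate_halfway(a, b)
```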

“In some examples, filters can also be used during the capture or generation of a surround view to enhance the viewing experience. Just as many popular photo sharing sites offer aesthetic filters that are applied to static two-dimensional images, such filters can also be applied to surround views. Because a surround view is more expressive than a two-dimensional image and contains more information, filters can be extended to include effects that are not practical in two-dimensional photos. For instance, in a surround view, motion blur can be added to the background (i.e. the context) while the content remains crisp. In another example, a drop shadow can be added to the object of interest in a surround view.

“In various examples, compression can also be used as an enhancement algorithm. In particular, compression can improve the user experience by reducing data upload and download costs. Surround views can be transmitted using far less data than a typical video while retaining their desired characteristics. Specifically, the IMU, keypoint tracks, and user input, combined with the view interpolation described above, can all reduce the amount of data that must be transferred to and from a device during the upload or download of a surround view. For instance, if the object of interest can be properly identified, a variable compression style can be chosen for the content and context. This variable compression style can use a lower quality resolution for the background information (i.e. the context) and a higher quality resolution for the foreground information (i.e. the content). In such examples, the amount of data transmitted can be reduced by sacrificing some of the context quality while maintaining the desired quality level for the content.
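
As an illustrative sketch of such a variable compression style (the patent does not specify a codec or container), the snippet below encodes a hypothetical foreground region of interest at high JPEG quality and a downscaled background at low quality. The region, quality values, and scale factor are assumptions.

```python
# Illustrative sketch of content/context variable compression: encode the
# foreground region of interest at high JPEG quality and a downscaled copy
# of the full frame (the context) at low quality.
import cv2
import numpy as np

def compress_variable(frame, roi_xywh, fg_quality=90, bg_quality=30):
    x, y, w, h = roi_xywh
    foreground = frame[y:y + h, x:x + w]
    background = cv2.resize(frame, None, fx=0.5, fy=0.5)    # cheaper context
    _, fg_bytes = cv2.imencode('.jpg', foreground,
                               [cv2.IMWRITE_JPEG_QUALITY, fg_quality])
    _, bg_bytes = cv2.imencode('.jpg', background,
                               [cv2.IMWRITE_JPEG_QUALITY, bg_quality])
    return fg_bytes, bg_bytes

frame = np.zeros((480, 640, 3), np.uint8)
fg, bg = compress_variable(frame, roi_xywh=(200, 120, 240, 240))
```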

“In the present embodiment, a surround view 118 is generated after any enhancement algorithms are applied. The surround view provides a multi-view interactive digital media representation. In various examples, the surround view can include a three-dimensional model of the content and a two-dimensional model of the context. In some examples, however, the context can represent a “flat” view of the scenery or background as projected along a surface, such as a cylindrical or other-shaped surface, so that the context is not purely two-dimensional. In yet other examples, the context can include three-dimensional aspects.

According to various embodiments, surround views provide numerous advantages over traditional two-dimensional images or videos. These include the ability to cope with moving scenery, a moving acquisition device, or both; the ability to remove redundant information; and the ability for the user to modify the view. These characteristics can be integrated natively into the surround view representation and are available for use in many applications. Surround views can be used in many fields, such as e-commerce, visual search, and file sharing, and they also allow for user interaction and entertainment.

According to various examples, once a surround view 118 is generated, user feedback for acquisition 120 of additional image data can be provided. In particular, if a surround view is determined to need additional views to accurately represent the content or context, the user may be prompted to provide those additional views. Once these additional views are received by the surround view acquisition system 100, they can be processed by the system 100 and incorporated into the surround view.

“With reference to FIG. 2, shown is an example of a process flow diagram for generating a surround view 200. In the present example, a plurality of images is obtained at 202. According to various embodiments, the plurality of images can include two-dimensional images or data streams. These two-dimensional images can include location information that can be used to generate a surround view. In various examples, and as described with regard to FIG. 1, depth images with location information can also be included.

“As described in greater detail with regard to FIGS. 7A-11B, the plurality of images can be captured using a camera system on a mobile device, such as a smartphone. The live image data output to the mobile device’s display can be augmented with virtual data, such as guides and status indicators. The guides can help the user direct the motion of the mobile device, and the status indicators can indicate how many of the images needed to generate a surround view have been captured. The virtual data may not be included in the image data that is actually captured to generate the surround view.

According to various embodiments, the plurality of images obtained at 202 can come from a variety of sources and have a variety of characteristics. For example, the plurality of images can be obtained from multiple users, such as a collection of video or 2D images gathered from the internet from different users. In some examples, the plurality of images can include images with different time information. In particular, the images can be taken of the same object of interest at different times, such as multiple images of a statue taken at different times of day or in different seasons. In other examples, the plurality of images can represent moving objects. For instance, the images may include an object of interest moving through scenery, such as a vehicle traveling along a road or a plane flying through the sky. In other instances, the images may include an object of interest that is itself in motion, such as a person running, dancing, or twirling.

“In the current example embodiment, the plurality of images is fused into content and context models at 204. According to various embodiments, the subject matter in the images can be separated into content and context. The content can be delineated as the object of interest and the context as the scenery surrounding that object. In some embodiments, the content can be a three-dimensional model depicting the object of interest, while in others it can be a two-dimensional image.

“According to the present example embodiment, one or more enhancement algorithms can be applied to the content and context models at 206. These algorithms can be used to enhance the user experience and include, for instance, automatic frame selection, stabilization, view interpolation, filters, and/or compression. In some examples, these enhancement algorithms are applied to the image data during capture; in other examples, they are applied after the image data has been acquired.

“In the present embodiment, a surround view is generated from the content and context models at 208. The surround view provides a multi-view interactive digital media representation, and in various examples it can include a three-dimensional model of the content and a model of the context. Depending on the mode of capture and the viewpoints of the images, the surround view model can include certain characteristics, such as a locally concave surround view, a locally convex surround view, or a locally flat surround view. It should be noted, however, that a surround view can combine different characteristics, depending on the application.

“With reference to FIG. 3, shown is an example of multiple camera views that can be fused together to create a surround view. According to various embodiments, multiple images can be captured from different viewpoints and combined to provide a surround view. In the present example, three cameras 312, 314, and 316 are positioned at locations 322, 324, and 326, respectively, in proximity to an object of interest 308. Scenery, such as object 310, can surround the object of interest 308. Views 302, 304, and 306, taken from their respective cameras 312, 314, and 316, include overlapping subject matter. Each of the views 302, 304, and 306 includes the object of interest 308 and varying degrees of visibility of the surrounding scenery 310. For instance, view 302 includes the object of interest 308 in front of the cylinder that is part of the surrounding scenery 310. View 306 shows the object of interest 308 to one side of the cylinder, and view 304 shows the object of interest without any view of the cylinder.

In the present embodiment, the views 302, 304, and 306, along with their associated locations 322, 324, and 326, respectively, provide a rich source of information about the object of interest 308 and the surrounding context that can be used to produce a surround view. When analyzed together, the various views 302, 304, and 306 provide information about different sides of the object of interest and the relationship between the object and the scenery. According to various embodiments, this information can be used to separate the object of interest 308 into content and the surrounding setting into context. Furthermore, as described in greater detail with regard to the later figures, these viewpoints can be used to produce images of the scene that are immersive and interactive.

“FIG. 4 shows an example of the separation of content and context in a surround view. A surround view, according to various embodiments of the invention, is a multi-view interactive media representation of a scene 400. Referring to FIG. 4, a user 402 is located in the scene 400 and is capturing images of an object of interest, such as a statue. The digital visual data captured by the user can be used to generate a surround view.

According to various embodiments of the disclosure, the digital visual data included in a surround view can be separated, semantically and/or practically, into content 404 and context 406. In particular embodiments, the content 404 can include the object(s), person(s), or scene(s) of interest, while the context 406 represents the remaining elements of the scene surrounding the content 404. In some examples, a surround view can represent the content 404 as three-dimensional data and the context 406 as a two-dimensional panoramic background. In other examples, a surround view can represent both the content 404 and the context 406 as two-dimensional panoramic scenes. In yet other examples, the content 404 and context 406 can include three-dimensional components. In particular embodiments, the way in which the surround view depicts the content 404 and context 406 depends on the capture mode used to acquire the images.

In some cases, such as recordings of objects, people, or parts thereof, recordings of large flat areas, and recordings in which no distinct subject is visible, the content 404 and the context 406 may be the same. Such surround views may share characteristics with other types of digital media, such as panoramas. According to various embodiments, however, surround views include additional features that distinguish them from these other types of digital media. For instance, a surround view can represent moving data. Additionally, a surround view is not limited to a specific cylindrical, spherical, or translational movement; image data can be captured with a camera or any other capture device. Furthermore, unlike a stitched panorama, a surround view can display different sides of the same object.

“FIGS. 5A-5B illustrate examples of concave and convex views, respectively, where both views use a back-camera capture style. That is, if a camera phone is used, these views use the camera on the back of the phone, facing away from the user. In particular embodiments, concave and convex views can affect how the content and context are identified in a surround view.

“With reference to FIG. 5A, shown is an example of a concave view in which the user holds the camera such that the camera location 502 does not leave axis 508 during image capture. By pivoting about axis 508, the camera captures a panoramic view of the scene around the user, forming a concave view. In this embodiment, the object of interest 504 and the distant scenery 506 are all viewed in a similar manner because of the way the images are captured. In this example, all objects in the concave view appear to be at infinity, so the content is equal to the context.

“With reference to FIG. 5B, shown is an example of a convex view 520 in which the user changes position while capturing images of an object of interest 524. In this example, the user moves around the object 524, taking pictures from different sides from camera locations 528, 530, and 532. Each of the images obtained includes a view of the object of interest along with a background of the distant scenery 526. The object of interest 524 represents the content, and the distant scenery 526 represents the context.

“FIGS. 6A-6D illustrate examples of various capture modes for surround views. Although various motions can be used to capture a surround view, and a surround view is not constrained to any particular type of motion, three general types of motion can be used to capture particular features or views. These three types of motion, respectively, can yield a locally concave surround view, a locally convex surround view, and a locally flat surround view. In some examples, a surround view can include combinations of different types of motion within the same surround view.

“With reference to FIG. 6A, shown is an example of a back-facing, concave surround view being captured. According to various embodiments, a locally concave surround view is one in which the viewing angles of the camera or other capture device diverge. This can be compared to the motion required to capture a 360-degree spherical panorama (pure rotation), although the motion can be generalized to any curved sweeping motion in which the view faces outward. In the present example, the experience is that of a stationary viewer looking out at a (possibly) dynamic context.

In the current example embodiment, a user 602 uses a back-facing camera 606 to capture images of the world 600, with the camera directed away from the user 602. As described in various examples, a back-facing camera refers to a camera that faces away from the user, such as the camera on the back of a smartphone. The camera is moved in a concave motion 608 so that views 604a, 604b, and 604c capture different parts of the capture area 609.

“With reference to FIG. 6B, shown is an example of a back-facing, convex surround view being captured. According to various embodiments, a locally convex surround view is one in which the viewing angles converge toward a single object of interest. A locally convex surround view can provide the experience of orbiting about a point, such that the viewer can see multiple sides of the same object. This object, which may be an “object of interest,” can be segmented from the surround view to become the content, and any surrounding data can be segmented to become the context. Previous technologies fail to recognize this type of viewing angle.

In the present example embodiment, the user 602 uses a back-facing camera 614 to capture images of the world 600, with the camera directed away from the user 602. The camera is moved in a convex motion 610 so that views 612a, 612b, and 612c capture different parts of the capture area 611. As described above, the convex motion 610 can orbit around an object of interest, and in these examples the views 612a, 612b, and 612c can show different sides of that object.

“With reference to FIG. 6C, shown is an example of a front-facing, concave surround view being captured. As described in various examples, a front-facing camera refers to a camera that faces toward the user. Front-facing cameras are commonly used to take “selfies,” i.e. self-portraits of the user.

In the present example embodiment, camera 620 faces the user 602 and is moved in a concave motion 606 so that the views 618a, 618b, and 618c diverge from one another in an angular sense. The capture area 617 follows a concave shape that includes the user on its perimeter.

“With reference to FIG. 6D, shown is an example of a front-facing, convex surround view being captured. In the present example embodiment, camera 626 faces the user 602 and moves in a convex motion 622 so that the views 624a, 624b, and 624c all converge toward the user 602. As described above, various modes can be used to capture images for a surround view, including locally concave, locally convex, and locally linear motions. These can be applied to individual images or to a continuous recording of a scene, and such a recording can capture a series of images during a single session.

Next, with respect to FIG. 7A to FIG. 11B, details of an augmented reality (AR) system that can be used during image capture for surround views are described. In one embodiment, the augmented reality system is implemented on a mobile device, such as a cell phone. The AR system can embed virtual objects into the live camera data that is displayed on the mobile device. In one embodiment, these virtual objects provide feedback to the user while images are being captured for a surround view.

“FIGS. 7A and 7B illustrate an example of a process flow for capturing images in a surround view using augmented reality. In 702, live image data can be received from a camera system, for example a camera system on a mobile device such as a smartphone. The image data can include pixel data captured from a camera sensor, and the pixel data varies from frame to frame. In one embodiment, the pixel data is 2-D; in other embodiments, depth data can be included with the pixel data.

“In 704, sensor data can be received. For example, the mobile device can include an IMU with accelerometers and/or gyroscopes. The sensor data can be used to determine an orientation of the mobile device, such as a tilt orientation relative to the gravity vector. Thus, the orientation of the live 2-D image data relative to the gravity vector can also be determined. In addition, if the accelerations applied by the user can be separated from the acceleration due to gravity, it may be possible to track changes in the position of the mobile device over time.

“In certain embodiments, a camera reference frame can be determined. In the camera reference frame, one axis is aligned with a line perpendicular to the camera lens. Using an accelerometer, the camera reference frame can be related to an Earth reference frame, i.e. a 3-D coordinate system in which one of the axes is aligned with the Earth’s gravitational vector. The relationship between the camera frame and the Earth reference frame can be expressed in terms of yaw, roll, and tilt/pitch. Typically, at least two of these three quantities are available from the sensors on a mobile device, such as its accelerometers and gyroscopes.
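
For illustration, a static accelerometer reading dominated by gravity yields two of these quantities directly; the sketch below (with an assumed sensor-axis convention, not the patent's) estimates pitch and roll, while yaw would require a gyroscope or magnetometer.

```python
# Sketch of relating the camera frame to an Earth frame with an accelerometer:
# when the device is roughly static, the measured acceleration is dominated by
# gravity, so pitch and roll can be estimated from it. Yaw is unobservable from
# the accelerometer alone. The axis convention here is an assumption.
import numpy as np

def pitch_roll_from_accel(ax, ay, az):
    pitch = np.arctan2(-ax, np.sqrt(ay**2 + az**2))
    roll = np.arctan2(ay, az)
    return np.degrees(pitch), np.degrees(roll)

# Example: device lying flat, gravity along the sensor's +z axis.
print(pitch_roll_from_accel(0.0, 0.0, 9.81))   # -> (0.0, 0.0)
```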

The combination of the yaw-roll-tilt information from sensors, such as a smartphone or tablet’s accelerometers, and the data from the camera, including the pixel data, can be used to relate the 2-D arrangement of pixels in the camera’s field of view to a 3-D reference frame in the real world. In one embodiment, the 2-D pixel data of each picture can be calibrated to a reference frame, such as the Earth reference frame, based on a line through the center of the lens and perpendicular to the lens surface. This calibration of the pixel data allows a curve or object defined in 3-D space to be mapped to the plane associated with the 2-D pixel data. If depth data is available (i.e. the distance from the camera to a pixel), this information can also be used in the transformation.
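
A minimal sketch of this kind of calibration, assuming a standard pinhole camera model with hypothetical intrinsics K and an IMU-derived rotation, back-projects a pixel with known depth into the camera frame and re-expresses it in an Earth-aligned frame.

```python
# Sketch of calibrating 2-D pixel data to a 3-D reference frame (assumed
# pinhole model): back-project a pixel with known depth into the camera frame,
# then rotate into an Earth-aligned frame using the IMU-derived rotation.
import numpy as np

def pixel_to_earth(u, v, depth, K, R_earth_cam):
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # direction through the pixel
    point_cam = ray_cam * depth                          # scale by the measured depth
    return R_earth_cam @ point_cam                       # express in the Earth frame

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # illustrative intrinsics
R = np.eye(3)                                                # camera aligned with Earth frame
print(pixel_to_earth(320, 240, 2.0, K, R))                   # ~[0, 0, 2] metres
```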

In alternate embodiments, the 3-D reference frame in which an object is defined does not have to be the Earth reference frame. In one embodiment, a 3-D reference frame in which an object is drawn and then rendered into the 2-D pixel frame can be defined relative to the Earth reference frame. In another embodiment, a 3-D reference frame can be defined relative to an object or surface identified in the pixel data, and the pixel data can then be calibrated to this 3-D reference frame.

“As an example, the object or surface can be defined by a number of tracking points identified in the pixel data. Then, as the camera moves, the sensor data and the new positions of the tracking points can be used to determine a change in orientation from frame to frame. This information can be used to render virtual data into the live image data and/or into a surround view.

“Returning to FIG. 7A, in 706, virtual data associated with a target can be generated in the live image data. For example, the target can be rendered as crosshairs, although in general the target can be rendered in any shape or combination of shapes. In one embodiment, using an input interface, the user may be able to adjust the position of the target. For example, the user may be able to place the target at a particular location in the synthetic image by touching a touch screen over the display on which the live image data is output. The synthetic image can include a combination of live image data rendered with one or more virtual objects.

“For example, the target can be placed over an object that appears in the image, such as a face or a person. Then, the user can provide additional input via an interface to indicate that the target is at the desired location. For instance, the user can tap the touch screen at the target’s location, and an object in the image beneath the target can then be selected. As another example, a microphone in the interface can be used to receive voice commands that direct the position of the target in the image.”

“In certain cases, object recognition can be applied. Object recognition can identify possible objects in the image, and the live image data can then be augmented with indicators, such as targets, that mark the identified objects. For example, objects such as cars, people, and parts of people (e.g. faces) can be marked in the image. Via an interface, the user may be able to select one of the marked objects. In another embodiment, the user can select an object via a voice command, for instance by saying “select face” or “select car.”

“In 708, the object selection can be received. The object selection can be used to identify tracking points, such as within an area of the image data larger than the target, that associate the object with the live image data.

In 710, tracking points associated with the object can be identified. Once the object is selected, the tracking points on it can be identified on a frame-to-frame basis, even as the camera translates or changes orientation. The target can then be rendered into the live images so that it appears to remain over, or hover above, the tracked object. This feature is discussed in greater detail below.

“In 712, a 3-D coordinate system in the physical world can be associated with the image. As described above, this coordinate system can be related to the camera reference frame associated with the 2-D pixel data. In one embodiment, the 2-D image data can be calibrated so that the 3-D coordinate system is anchored to the selected target.

“Then, in 714, a 2-D or 3-D trajectory or path can be defined in the 3-D coordinate system. For example, a trajectory or path, such as an arc or a parabola, can be mapped to a drawing plane that is perpendicular to the Earth’s gravity vector. As described above, based on the orientation of the camera (e.g. information provided by an IMU), the camera reference frame, including the 2-D pixel data, can be mapped to the Earth reference frame. This mapping can be used to render the 3-D curve defined in the drawing plane into the 2-D pixel data of the live image data. A synthetic image including the live image data and the virtual object (the trajectory or path) can then be output to a display.
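
As a hypothetical sketch of this rendering step (assuming a pinhole projection model with illustrative intrinsics and an identity camera rotation purely to exercise the code), the snippet below samples a circular path in a plane of the Earth frame, transforms each point into the camera frame, and projects it to pixel coordinates; the resulting 2-D points could then be drawn over the live image.

```python
# Sketch of projecting a circular guide, defined in a gravity-perpendicular
# plane of the Earth frame, into 2-D pixel coordinates via a pinhole model.
import numpy as np

def project_circle(center_earth, radius, K, R_cam_earth, t_cam_earth, n=64):
    angles = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    # Circle in the horizontal (x-y) plane of the Earth frame.
    circle = center_earth + radius * np.stack(
        [np.cos(angles), np.sin(angles), np.zeros(n)], axis=1)
    pixels = []
    for p_earth in circle:
        p_cam = R_cam_earth @ p_earth + t_cam_earth      # Earth -> camera frame
        if p_cam[2] <= 0:                                # point behind the camera
            continue
        uvw = K @ p_cam
        pixels.append(uvw[:2] / uvw[2])                  # perspective divide
    return np.array(pixels)

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # illustrative intrinsics
pts = project_circle(np.array([0.0, 0.0, 3.0]), 1.0, K, np.eye(3), np.zeros(3))
```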

Virtual objects can be defined in a 3-D coordinate system, such as the Earth reference frame, or in some other coordinate system related to the orientation of the camera. The virtual objects can then be rendered into the 2-D pixel data associated with the live image data to create a synthetic image, which can be output to a display.

“In one embodiment, the curves or surfaces can be associated with a 3-D model of an object, such as a person or a car. In another embodiment, the curves or surfaces can be associated with text, so that a text message can be rendered into the live image data. In other embodiments, textures can be assigned to the surfaces of the 3-D model, and when a synthetic image is created these textures can be rendered into the 2-D pixel data of the live image.

“When a curve is defined in a 3-D coordinate system, such as the Earth reference frame, and drawn on a drawing plane, one or more of the determined tracking points can be projected onto that drawing plane. As another example, a centroid associated with the tracked points can be projected onto the drawing plane. The curve can then be defined relative to the one or more points projected onto the drawing plane. For instance, based on the target location, a point can be determined on the drawing plane and used as the center of a circle or arc drawn in that plane.

“In 714, based on the associated coordinate systems, a curve can be rendered into the live image data as part of the AR system. In general, one or more virtual objects, including a plurality of lines, curves, or surfaces, can be rendered into the live image data, and the synthetic image including the live image data and the virtual objects can be output in real time.

“In one embodiment, the virtual objects rendered into the live image data can aid the user in capturing images used to create a surround view. For example, the user may indicate a desire to create a surround view of a real object identified in the live image data. The desired surround view can span any angular range, such as forty-five, ninety, or one hundred eighty degrees. A virtual object can then be rendered into the live image data as a guide, where the guide indicates a path along which the camera should move and the progress made along that path. Rendering the guide can involve modifying the pixel data in the live image data in accordance with the coordinate system determined in 712.

In the example above, the real object can be any object that appears in the live image data, and a 3-D model of the object may not be constructed. Instead, the real object can be associated with pixel locations or pixel areas in the 2-D pixel data. This definition of the real object is far less computationally expensive than attempting to construct a 3-D model of it. In addition, transformations, such as projections between image space and a 3-D coordinate system, can be avoided for the real object, which further reduces computational costs.

“In contrast, the virtual objects, such as lines and surfaces, can be modeled in 3-D space and defined a priori. Because the virtual objects are defined in advance, their shapes do not need to be constructed in real time, which is computationally expensive. The real objects that may appear in an image, however, are not known in advance, so 3-D models of them are typically not available. The synthetic image can therefore include “real” objects that are defined only in 2-D image space, by assigning tracking points or areas to them, together with virtual objects that are modeled in a 3-D coordinate system and then rendered into the live image data.

“Returning to FIG. 7A, an AR image including one or more virtual objects can be output. The pixel data in the live image data can be received at a particular frame rate. In certain embodiments, the augmented frames can be output at the same frame rate at which they are received; in other embodiments, they can be output at a reduced frame rate, which can lessen the computation requirements. For example, live data can be received at 30 frames per second and the synthetic images output at 15 frames per second. As another alternative, the AR images can be output at a reduced resolution, such as 240p instead of 480p, which likewise lowers the computational requirements.”

“In 718, one or more images can be selected from the live image data and stored for use in a surround view. In some embodiments, one or more of the virtual objects can be stored as well, so that they can later be included in the surround view. In other embodiments, the virtual objects are output only as part of the AR system, and the image data stored for the surround view may not include them.

“In yet other embodiments, a portion of the virtual objects output to the display as part of the AR system can be stored. For example, the AR system can render a guide and also render a label into the surround view images; the label may be included in the surround view image data, while the guide is not saved. To store both the images and the virtual objects, a copy of the images may be made. The copy can be modified with the virtual data and then output to the display, while the original is stored before it is modified.

“In FIG. 7B, the method of FIG. 7A is continued. In 722, new image data can be received. In 724, new IMU data (or, more generally, sensor data) can be received. The IMU data can indicate the current orientation of the camera. In 726, the locations of the tracking points identified in previous image data can be identified in the current image data.

“Because the camera may have translated or changed orientation, the tracking points may appear at different locations in the pixel data. As described above, the tracking points can be used to define a real object appearing in the live image data, so identifying their locations in the new image data allows the real object to be tracked from image to image. The differences in the IMU data from frame to frame, together with the frame rate, can help in determining where the tracking points will appear in the new live image data.
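
One widely used way to carry tracking points from frame to frame is pyramidal Lucas-Kanade optical flow; the patent does not mandate a specific tracker, so the sketch below is only an assumed implementation that drops points whose tracking status fails.

```python
# Sketch of carrying tracking points from one frame to the next with
# pyramidal Lucas-Kanade optical flow (one common choice, not necessarily
# the patented method). Points that fail to track are dropped.
import cv2
import numpy as np

def track_points(prev_gray, next_gray, prev_pts):
    pts = prev_pts.astype(np.float32).reshape(-1, 1, 2)
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    good = status.reshape(-1) == 1
    return next_pts.reshape(-1, 2)[good], good

# Example with two synthetic frames and one point on a bright patch.
prev_gray = np.zeros((240, 320), np.uint8); prev_gray[100:120, 100:120] = 255
next_gray = np.roll(prev_gray, 3, axis=1)              # scene shifted 3 px to the right
tracked, good = track_points(prev_gray, next_gray, np.array([[110.0, 110.0]]))
```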

The tracking points associated with a real object appearing in the live image data can change over time. For example, as the camera moves around the real object, some tracking points identified on it may become occluded. Thus, in 726, a determination can be made as to whether a tracking point is still visible in an image, and whether a new portion of the targeted object has come into view. New tracking points can be added to the new portion so that the object can continue to be tracked in real time from frame to frame.

“In 728, a coordinate system can be associated with the image. For example, using an orientation of the camera determined from the sensor data, the pixel data can be calibrated to an Earth reference frame as previously described. In 730, based on the coordinate system and the current set of tracking points on the object, a target location can be determined. The target can be placed over the real object that is tracked in the live image data. As described above, the number and location of the tracking points in an image can change with time as the position of the camera changes relative to the object. Thus, the location of the target in the 2-D pixel data can change, and a virtual object representing the target can be rendered into the live image data at the new location.

“In 732, a track location in the live image data can be determined. The track can be used to provide feedback about the position and orientation of the camera in physical space during image capture for a surround view. As an example, the track can be rendered in a drawing plane that is perpendicular or parallel to the gravity vector, and it can be rendered relative to the position of the target, which is the virtual object placed over the real object in the live image data. The track can surround or partially surround the object. As described above, the position of the target can be determined from the current set of tracking points associated with the object in the image, and the position of the target can be projected onto the selected drawing plane.

“In 734, a capture indicator status can be determined. The capture indicator can provide feedback about what portion of the image data needed for a surround view has been captured. For example, the status indicator may indicate that half of the images required for the surround view have been captured. In another embodiment, the status indicator can provide feedback about whether the camera is following a preferred path and maintaining a desired orientation in physical space. Thus, the status indicator can indicate whether the current path or orientation of the camera is desirable. When the current path or orientation of the camera is undesirable, the status indicator can indicate what type of correction is needed, such as but not limited to moving the camera more slowly, starting the capture over, tilting the camera in a certain direction, and/or translating the camera in a particular direction.
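
A minimal sketch of such a status computation, under the assumption that progress is measured as the fraction of a requested angular span swept by the camera (integrated from gyroscope yaw rates), might look like the following; the sample rate and target angle are illustrative.

```python
# Sketch of a capture-progress status indicator: integrate the gyroscope's yaw
# rate over time and report the fraction of the requested angular span (for
# example 180 degrees) that has been swept so far.
import numpy as np

def capture_progress(yaw_rates_dps, dt_s, target_deg=180.0):
    swept = np.abs(np.sum(np.asarray(yaw_rates_dps) * dt_s))   # integrated yaw, degrees
    return min(swept / target_deg, 1.0)

# Example: 3 seconds of rotation at ~30 deg/s sampled at 100 Hz -> 50% of 180 deg.
rates = np.full(300, 30.0)
print(capture_progress(rates, dt_s=0.01))   # ~0.5
```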

“In 736, a location for rendering the capture indicator into the live image data can be determined, for use in generating the synthetic image. In one embodiment, the location of the capture indicator can be determined relative to the position of the real object in the image, as indicated by the current set of tracking points, such as above and to the left of the object. In 738, a synthetic image, i.e. a live image augmented with virtual objects, can be generated. The synthetic image can include the target, the track, and one or more status indicators at their determined locations. In 740, image data captured for use in a surround view can be stored. As described above, the stored image data can be the raw image data without the virtual objects, or it can include the virtual objects.

“In 742, a check can be made as to whether the images needed to generate a surround view have been captured in accordance with the selected parameters, such as a surround view spanning a desired angular range. When the capture is not complete, additional image data can be received and the method can return to 722. When the capture is complete, a virtual object can be rendered into the live image data to indicate that the capture of the surround view is finished, and the surround view can be generated. Some virtual objects associated with the capture process may then cease to be rendered; for example, once the capture is complete, the track used to help guide the camera during capture may no longer appear in the live image data.

“FIGS. 8A and 8B illustrate aspects of generating an Augmented Reality (AR) image capture track for capturing images used in a surround view. In FIG. 8A, a mobile device 814 with a display 816 is shown. The mobile device can include at least one camera (not shown) with a field of view 800. A real object 802, which may be a person, is selected in the field of view 800 of the camera. A virtual object, such as a target (not shown), may have been used to help select the real object; for example, the target may have been placed over the object 802 and then selected via the touch screen of the mobile device 814.

“The camera can include an image sensor that captures light in the field of view 800. The data from the image sensor can be converted to pixel data. The pixel data can then be modified prior to being output to the display 816 in order to generate a synthetic image. The modifications can include rendering virtual objects into the pixel data as part of an augmented reality (AR) system.

“Using the pixel data and the selection of the object 802, tracking points on the object can be determined. These tracking points can define the object in image space. Locations of a current set of tracking points, such as 805, 806, and 808, which can be attached to the object 802, are shown. As the position and orientation of the camera change, the shape and location of the object 802 in the captured pixel data can change, and the locations of the tracking points in the pixel data can change as well. Thus, a previously identified tracking point can move from one location in the image data to another, and a tracking point can disappear from an image as portions of the object become occluded.

“Using sensor data from the mobile device 814, an Earth reference frame 3-D coordinate system 804 can be associated with the image data. The arrow 810 indicates the direction of the gravity vector. In a particular embodiment, the 2-D image data can be calibrated relative to the Earth reference frame. The arrow representing the gravity vector is not rendered into the live image data; however, if desired, an indicator representing the gravity vector could be rendered into the synthetic image.

A plane that is perpendicular to the gravity vector can be determined. The location of the plane can be determined using the tracking points in the image, such as 805, 806, and 808. Using this information, a curve, such as a circle, can be drawn in the plane. The circle can then be rendered into the 2-D image data and output as part of the AR system so that it appears to surround the object 802. The circle can be used as a guide for capturing the images used in a surround view.

“If the camera on the mobile device 814 is rotated in some way, such as tilted, the shape of the object on the display 816 will change. However, the new orientation of the camera in space, including the direction of the gravity vector, can be determined, and hence a plane perpendicular to the gravity vector can again be determined. The position of the plane, and thus the position of the curve in the image, can be based on a centroid of the object determined from the tracking points associated with the object 802. As a result, the curve can appear to remain parallel to the ground as the camera 814 moves, although the curve’s position in the live images can change as the position and apparent shape of the object change.

“In FIG. 8B, a mobile device 834 that includes a camera (not shown) and a display 836 for outputting the image data from the camera is shown. A cup 822 is visible in the camera’s field of view. Tracking points, such as 824 and 826, have been associated with the object 822; these tracking points can define the object 822 in image space. Using the IMU data from the mobile device 834, an Earth reference frame has been associated with the image data. In one embodiment, the pixel data can be calibrated to the Earth reference frame as described above. The 3-D axes indicate the Earth reference frame, and the arrow 828 indicates the direction of the gravity vector.

“As described above, a plane can be determined relative to the Earth reference frame. In this example, the plane is perpendicular to the axis of the gravity vector, and it is used to define a path for the surround view that goes above the object. In general, any plane can be defined in the Earth reference frame, and the curve used as a guide can then be rendered into the selected plane.

“In one embodiment, using the locations of the tracking points, the centroid of the object 822 on the selected plane in the Earth reference frame can be determined. A curve 830, such as a circle, can then be rendered relative to the centroid. In this example, a circle is rendered around the object 822 in the selected plane.

The curve 830 serves as a guide for moving the camera along a particular path, and the images captured along that path can then be converted into a surround view. In one embodiment, the position of the camera along the path can be determined, and an indicator of the camera’s current position on the path can be generated. In this example, the current position is indicated by the arrow 832.

The position of the camera along the path may not map directly to physical space. For example, the IMU data, and optionally the frame rate, can be used to estimate a distance traveled by the camera, and this estimate can be mapped to a distance traveled along the curve. The distance moved along the path 830, however, need not equal the distance moved in physical space. As another example, the total time needed to traverse the path 830 can be estimated, and the amount of time for which images have been recorded can be tracked. The ratio of the recording time to the total estimated time can then be used to indicate progress along the path 830.

“The path 830, which is an arc, and the arrow 832 are rendered into the live image data as virtual objects according to their positions in the 3-D coordinate system associated with the live 2-D image data. Thus, the cup 822, the circle 830, and the arrow 832 are shown on the display 836. If the orientation of the display 836 changes relative to the cup 822, the orientations of the curve 830 and the arrow 832 can change accordingly.

“In particular embodiments, the size of the object 822 in the image data can change. For example, a digital zoom can make the object appear larger or smaller. As another example, the camera (e.g. on the mobile device 834) can be moved closer to or farther away from the object 822, which makes the object appear larger or smaller in the image.

As the apparent size of the object changes, the distances between the tracking points can change as well. These distance changes can be used to determine a scaling factor. In one embodiment, as the size of the object changes, the AR system can scale the curve 830 and/or the arrow 832 accordingly, so that the curve maintains a constant size relative to the object.
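
As a small illustration of deriving that scaling factor (an assumed approach, not necessarily the patent's), the ratio of the tracking points' current spread about their centroid to their initial spread can be used to rescale the rendered guide.

```python
# Sketch of deriving a scale factor for the rendered guide from the spread of
# the tracking points: as the object grows or shrinks in the image, the mean
# distance of the points from their centroid changes by the same factor.
import numpy as np

def guide_scale(initial_pts, current_pts):
    def spread(pts):
        pts = np.asarray(pts, dtype=float)
        return np.mean(np.linalg.norm(pts - pts.mean(axis=0), axis=1))
    return spread(current_pts) / spread(initial_pts)

# Example: the object appears twice as large, so the guide should scale by ~2.
p0 = [[100, 100], [140, 100], [100, 140], [140, 140]]
p1 = [[ 80,  80], [160,  80], [ 80, 160], [160, 160]]
print(guide_scale(p0, p1))   # -> 2.0
```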

In another embodiment, the size of the curve can be fixed. For example, the diameter of the curve can be related to the pixel height or width of the image, such as 80 percent of the pixel height or width. In that case, the object 822 may appear to grow or shrink as the zoom is used or as the position of the camera changes, while the size of the curve 830 in the image remains relatively constant.

“FIG. 9 illustrates a second example of generating an Augmented Reality (AR) image capture track for capturing images used in a surround view on a mobile device. The figure shows the display of the mobile device at three times 906a, 906b, and 906c. A touch screen can be provided over the display.

“In 906a, an image of an object 908 is output to the display. The object is a rectangular box. The image data output to the display can be live image data from a camera on the mobile device, although a remote camera could also be used.

“In one embodiment, a target, such as 910, can be rendered to the display. The target can be combined with the live image data to create a synthetic image. Via the input interface on the phone, the user may be able to adjust the position of the target. The target can be placed over an object, and additional input can then be provided to select that object; for example, the user can tap the touch screen at the location of the target.

“In another embodiment, object recognition can be applied to the live image data. Various markers can be rendered to the display indicating the locations of objects identified in the live image data. To select an object, the user can tap the touch screen at the location of one of the markers in the image.

“After an object is selected, a number of initial tracking points can be identified on it, such as 912, 914, and 916. In one embodiment, the tracking points are not shown on the display; in another embodiment, they can be rendered to the display. In one embodiment, if a tracking point does not lie on the object of interest, the user can select it and delete it, or move it so that it lies on the object.

“Next, an orientation of the mobile device can change. The orientation change can include a rotation of the device through one or more angles as well as translational motion. The orientation change and the current orientation of the device can be captured via the IMU data from the IMU 902 on the device.

“As the orientation of the device changes, one or more of the tracking points, such as 912, 914, or 916, can become occluded. In addition, the appearance of surfaces in the image can change, and movement at different pixel locations can be determined from the frame-to-frame changes. Surfaces associated with the object 908 can be predicted using the IMU data. As the camera position changes, new surfaces can come into view in the image, and new tracking points can be identified on these surfaces.


How to Search for Patents

A patent search is the first step toward getting your patent. You can do a Google patent search or a USPTO search. Patent-pending refers to a product that is covered by a pending patent application; you can search Public PAIR to find the patent application. After the patent office approves your application, you can do a patent number lookup to locate the issued patent, and your product is then patented. You can also use the USPTO search engine; see below for details. You can also get help from a patent lawyer. Patents in the United States are granted by the United States Patent and Trademark Office (USPTO), which also reviews trademark applications.

Are you interested in similar patents? These are the steps to follow:

1. Brainstorm terms to describe your invention, based on its purpose, composition, or use.

Write down a brief, but precise description of the invention. Don’t use generic terms such as “device”, “process,” or “system”. Consider synonyms for the terms you chose initially. Next, take note of important technical terms as well as keywords.

Use the questions below to help you identify keywords or concepts.

  • What is the purpose of the invention? Is it a utilitarian device or an ornamental design?
  • Is the invention a way to create something or perform a function? Is it a product?
  • What is the composition of the invention? What is its physical makeup?
  • What is the purpose of the invention?
  • What technical terms and keywords describe the invention’s nature? A technical dictionary can help you locate the right terms.

2. Use these terms to search for relevant Cooperative Patent Classifications with the Classification Search Tool. If you are unable to find the right classification for your invention, scan through the classification’s class schemas (class schedules) and try again. If you don’t get any results from the Classification Text Search, consider substituting synonyms for the words you used to describe your invention.

3. Check the CPC Classification Definition to confirm the CPC classification you found. If the selected classification title has a blue box with a “D” to its left, the hyperlink will take you to a CPC classification definition. CPC classification definitions will help you determine the applicable classification’s scope so that you can choose the most relevant one. These definitions may also include search tips or other suggestions that could be helpful for further research.

4. The Patents Full-Text Database and the Image Database allow you to retrieve patent documents that include the CPC classification. By focusing on the abstracts and representative drawings, you can narrow down your search for the most relevant patent publications.

5. Review this selection of patent publications closely for any similarities to your invention, paying attention to the claims and the specification. References cited by the applicant and the patent examiner can lead you to additional relevant patents.

6. Retrieve published patent applications that match the CPC classification you chose in Step 3. You can use the same strategy as in Step 4 to narrow your results to the most relevant patent applications by reviewing the abstracts and representative drawings on each page. Next, examine the published patent applications carefully, paying special attention to the claims and the drawings.

7. You can find additional US patent publications by keyword searching in the AppFT and PatFT databases, as well as by classification searching of non-US patents as described below. You can also use web search engines to search non-patent literature disclosures about inventions. Here are some examples:

  • Add keywords to your search. Keyword searches may turn up documents that are not well-categorized or have missed classifications during Step 2. For example, US patent examiners often supplement their classification searches with keyword searches. Think about the use of technical engineering terminology rather than everyday words.
  • Search for foreign patents using the CPC classification. Re-run the search using international patent office search engines such as Espacenet, the European Patent Office’s worldwide patent publication database of over 130 million patent publications. Other national databases are also available.
  • Search non-patent literature. Inventions can be made public in many non-patent publications. It is recommended that you search journals, books, websites, technical catalogs, conference proceedings, and other print and electronic publications.

To review your search, you can hire a registered patent attorney to assist. A preliminary search will help you better prepare to discuss your invention and related prior inventions with a patent attorney, and it means the attorney will not need to spend much of your time or money on patenting basics.
