Invented by Helen Balinsky, Steven J Simske, David Subiros Perez, Hewlett Packard Development Co LP
The Hewlett Packard Development Co LP invention works as follows: A data leak prevention system includes an application residing on a client device, the application having source code that is unavailable or cannot be modified. The application emits a system call as a result of an action that must occur before a data-leak event occurs. The action is based on a document that contains i) the latest full contents, ii) metadata, or iii) a combination of the full contents and the metadata. Also installed on the client device is an interceptor agent that includes a system-call interceptor, which intercepts the system call emitted by the application and suspends it. The system further includes a decision engine that analyzes at least one of i) the latest full contents, ii) the metadata, or iii) the combination, and then implements a policy action on the basis of the analysis.
Background for Data Leakage Prevention Systems and Methods
This disclosure relates generally to data leakage prevention methods and systems.
Electronic data communication is ubiquitous. Data transmission is often required by business practices and commerce, and must meet legal, customer, or internal standards. These standards are designed to prevent unauthorized transmission of information and attachments. To achieve this goal, policies were implemented that required labels and tags such as “private”, “confidential”, “restricted”, and “sensitive” to be added to documents and data transmissions. Some policies call for keywords to be added to the document content (e.g., persistent terminology that differentiates one class of sensitive documents from another). It may be difficult to ensure that these policies are implemented, in part because employees have different levels of competence (e.g., new to the company, role, or job; temporary replacement); make genuine mistakes (e.g., distraction, forgetfulness, or lack of concentration); or lack current corporate, legal, or other policy information. Adequate employee training may be provided in conjunction with policy changes or soon after, but there may be a delay before a policy is implemented. If policies aren’t enforced, sensitive information/data may leak. Data leak incidents range from sending sensitive e-mails to the wrong addresses, to uploading confidential documents instead of conference submissions, to saving unprotected backups on USB devices, to printing sensitive documents on an easily accessible printer and forgetting to collect them. Data leak prevention methods include, for instance, the analysis of outgoing information on gateway systems, secure erasure, USB lock-out, and mandatory encryption. Existing leak prevention methods are office-based and may not be appropriate for data leaks from devices outside the office (e.g., DVDs, USB drives, external hard disks, etc.).
In the examples presented here, sensitive data policies are enforced locally on devices within a trusted environment. These devices can range from desktop computers to mobile phones. Part of the disclosed system resides on the device and captures the system calls emitted by an application whose source code is unavailable or cannot be modified (e.g., due to practical considerations or legal concerns). Such an application is often a third-party application installed on the device that was not designed to handle sensitive data in accordance with the policies set up for the trusted environment. The method and system disclosed herein sandbox such applications to prevent data leakage and policy breaches. Sandboxing involves evaluating dynamically-changing document sensitivity, based on document metadata and/or actual document content, before a document can be exported from the trusted environment, i.e., when the user attempts to export it. The system and method described herein are designed to prevent sensitive information leaks by blocking sensitive data exports from the trusted area before they have a chance to happen.
The program for sensitive data policy enforcement can be installed on the user’s device in a safe compartment/account, and can be deployed in either an advisory or a mandatory mode. In advisory mode, the program/solution is deployed for the user’s benefit as a safety net that prevents accidental user errors. In mandatory mode, the user cannot access, modify, or disable the policy enforcement program/solution.
Examples of the system disclosed herein detect and, in some cases, amend potential policy-violating user actions at the intention stage, i.e., when a system call associated with the action has not yet been performed. Before deploying the method that actually prevents data leaks, an application whose source code is unavailable or cannot be modified is analyzed in order to catalog its system calls and reduce them to those associated with sensitive tasks for which data leak prevention is desired. FIG. 1 shows the analysis of the application.
Each of the steps shown in FIG. 1 may be performed using hardware, firmware, or software (i.e., computer-readable instructions). In one example, hardware (e.g., a processor on a client device or local device) executes computer-readable instructions (i.e., software) embedded on a tangible, non-transitory computer-readable medium.
At first, the system calls emitted by an application are captured and analyzed (see reference number 100 in FIG. 1). For example, it is possible to capture and analyze all of the system calls emitted by an application. In some cases, however, it may be undesirable to capture and analyze every system call. The computer-readable instructions that perform the analysis of the application may, for example, be programmed to recognize certain system calls, such as “open” or “new”, that are not related to data leakage events and would therefore not be captured or analyzed. In other cases, the system calls are filtered so that every system call associated with a potentially data-leaking action is captured. In one example, StraceNT, a Windows equivalent of the original Linux strace tool, can be used to intercept and record the calls. Filtering can be done manually or with an automation tool to identify the system calls associated with data leakage actions.
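The capture-and-filter step described above might be sketched as follows. This is a minimal illustration, not the patent's implementation: the trace record format and the lists of watched and ignored call names are assumptions made for the example.

```python
# Minimal sketch (illustrative only): filtering a captured system-call trace
# down to the calls associated with potential data-leak actions.
# The trace format and both call lists below are assumed for this example.

LEAK_RELATED_CALLS = {"WriteFile", "ReplaceFile", "WritePrinter"}
IGNORED_CALLS = {"open", "new"}  # calls unrelated to data leakage events

def filter_trace(trace):
    """Keep only the calls that may correspond to a data-leak action."""
    kept = []
    for call in trace:
        name = call["name"]
        if name in IGNORED_CALLS:
            continue  # not captured or analyzed further
        if name in LEAK_RELATED_CALLS:
            kept.append(call)
    return kept

# A hypothetical recorded trace, in the spirit of a StraceNT log:
trace = [
    {"name": "open", "args": ["report.docx"]},
    {"name": "WriteFile", "args": ["report.docx"]},
    {"name": "new", "args": []},
    {"name": "WritePrinter", "args": ["hPrinter"]},
]
print([c["name"] for c in filter_trace(trace)])  # ['WriteFile', 'WritePrinter']
```

In practice the filtering could equally be done manually over a recorded log; the automated form above simply makes the selection criteria explicit.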
During the analysis, the system calls that have been identified as being associated with or related to a potential data leakage event are filtered out (see FIG. 1). As used herein, “potential data leak events” or “data leak events” are actions that can result in sensitive data being transmitted outside of the trusted environment, including any action that can export data recognized by a policy out of the trusted area. Exporting data can be done in many ways, including saving, printing, copying and moving files, dragging-and-dropping, uploading, or other methods.
The action(s), which may be triggered at any time by the user, is/are referred to herein as user actions. User actions can include pressing a button to send an e-mail, uploading a file, printing a file, or copying and moving a file (using Microsoft Windows Explorer, for example). Saving may result in a data leak when sensitive data, such as passwords, is stored in plain text on an external device (such as a USB drive, pen drive, or external hard drive) and the device is then lost. Data leaks can also occur when sensitive data is stored on a network drive that has a low level of security or is widely accessible. A data leak can occur when an e-mail is sent to an unintended recipient or to too large an audience, or when a user accidentally sends confidential information to the wrong recipient via a public network. Uploading a document may result in a data breach when, for instance, the incorrect file containing sensitive data is accidentally uploaded to a server instead of the intended file that does not contain sensitive data. Data leaks can also occur when a sensitive or confidential document is printed without personal identification number (PIN) retrieval.
The application action can also be triggered by an internal application state (i.e., a process within the application). One example of an application action is Microsoft Outlook sending a scheduled or delayed e-mail (e.g., a user can select the desired date and time for delivery under the “Delivery Options” sub-menu of the “Options” menu). Microsoft Word’s “AutoSave” is another example of an action performed by an application. Yet another example is an application that is programmed to automatically upload information to a remote web site.
When filtering occurs, the computer-readable instructions are programmed to analyze each captured system call and determine whether the call(s), individually or as a group, correspond to a possible data leakage event (see reference number 104 in FIG. 1). The system calls deemed unrelated to possible data leakage events are not filtered and are not further analyzed.
In contrast, the system calls that were filtered during the initial analysis undergo further processing to determine which of them correspond to a specific action that, if executed fully, could leak information from the trusted environment (see again reference number 104 in FIG. 1). The system determines which of the filtered system calls correspond to actions i) taking place before a data leak can happen and ii) involving the latest version of the document (including its full contents and/or metadata). This is achieved by identifying, from the filtered calls, the single call or group of calls responsible for performing the actual action, and, from those, the system call (or other means) that provides access to the most recent (i.e., current, latest) version of the document (i.e., its contents and/or metadata). The system call(s) responsible for performing the action are temporarily blocked (i.e., suspended while analysis takes place) to prevent data leakage, and the behavior of the application resulting from the temporary blocking is observed. The application’s response to the blocked/suspended calls identifies the system call lowest in the hierarchy, or the combination of system calls, whose suspension prevents the action from being executed; the last or most convenient system call(s) that provide access to the latest document contents and/or metadata are thereby identified. The application’s response also identifies which system call provides the most recent version of the full document, as opposed to, say, just a fragment. A system call that is only responsible for writing part of a file, or one that occurs after the document has been written to a remote location, may not be used: the first does not provide the complete document context, and the second comes too late (i.e., after sensitive data may have leaked).
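The suspend-analyze-resume flow at the heart of this step can be illustrated with a small sketch. The keyword-based sensitivity check below is an assumption for illustration only; the patent leaves the decision engine's policy logic open.

```python
# Hedged sketch of the suspend/analyze flow: a suspected system call is held
# while a decision engine inspects the latest document contents and metadata.
# The keyword scan for sensitivity labels is an illustrative assumption,
# not the patent's actual decision logic.

SENSITIVE_MARKERS = ("confidential", "restricted", "private", "sensitive")

def is_sensitive(contents, metadata):
    """Naive policy check: look for sensitivity labels in content/metadata."""
    text = (contents + " " + " ".join(metadata.values())).lower()
    return any(marker in text for marker in SENSITIVE_MARKERS)

def on_intercepted_call(call_name, latest_contents, metadata):
    """Suspend the call, analyze the latest document version, then decide."""
    # The call is suspended at this point; the application waits on us.
    if is_sensitive(latest_contents, metadata):
        return "BLOCK"   # policy action: prevent the export
    return "ALLOW"       # resume the suspended system call

decision = on_intercepted_call(
    "WriteFile",
    latest_contents="Q3 forecast - CONFIDENTIAL - do not distribute",
    metadata={"label": "internal"},
)
print(decision)  # BLOCK
```

The key property mirrored here is timing: the decision is made while the call is suspended, i.e., before any bytes leave the trusted environment.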
The system call(s) that involve the latest version of the document are defined and identified for the event that could lead to a data leak. It is important to note, however, that other methods may be used to retrieve the document contents corresponding to a data leak event. If the full document contents for a specific application and action cannot be retrieved from the corresponding system call, the latest document content can be recovered by other means. For example, during document modification operations, the contents of the document may be cached so that they are available for evaluation when the system call is suspended.
FIG. 1 thus shows the analysis of the application 12 to determine which system calls are emitted by the application when data leakage events are about to occur, and illustrates how to select the system calls that effectively prevent data leakage while providing full document content and/or metadata. Although not shown in FIG. 1, it is understood that the client device 14 can perform a detour for any system calls specified or defined by the policy. For example, the client device 14 may perform a detour for each suspicious/suspended call found during analysis, in order to determine which calls are most easily detourable, safest, or best suited for attaching/executing a desired action. In some cases, the potential data leak may be simulated during the analysis; the detour functions are then implemented and tested to help improve their function during deployment. Detour functions may include suspending system calls and, while an analysis is being performed, informing the user that, for instance, the system calls have been temporarily blocked, that an analysis is underway, or that the application will resume shortly. The communications to the user can depend on the deployment scenario.
When testing multiple system calls, they can be blocked individually or in groups. The system call(s) identified as performing/preventing the action and as being associated with the latest version of the document are added to the set of system calls for that particular action. The set contains the minimum number of system calls required to perform the action. The emission of any system call in the set during application use triggers, as will be explained further below, a series of processes to prevent data leakage.
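The block-and-observe search for a minimal call set could be sketched as below. Everything here is illustrative: in the real analysis, "does the action still succeed?" is answered by observing the live application, whereas the stub below is driven by a hypothetical dependency table.

```python
# Illustrative sketch of the analysis step: block candidate calls one at a
# time (or in groups of increasing size) and keep the smallest group whose
# suspension prevents the action. `action_succeeds_when_blocked` stands in
# for observing the real application; here it is a stub driven by an
# assumed dependency table.

from itertools import combinations

# Hypothetical ground truth: the action fails if any call in this set is blocked.
ESSENTIAL = {"WriteFile"}

def action_succeeds_when_blocked(blocked):
    return not (ESSENTIAL & set(blocked))

def minimal_call_set(candidates):
    """Smallest group of calls whose blocking prevents the action."""
    for size in range(1, len(candidates) + 1):
        for group in combinations(candidates, size):
            if not action_succeeds_when_blocked(group):
                return set(group)
    return set()

print(minimal_call_set(["CreateFile", "WriteFile", "CloseHandle"]))
# {'WriteFile'}
```

Searching smallest groups first reflects the goal stated above: the final set should contain the minimum number of system calls required to perform (and hence to prevent) the action.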
Table 1 gives examples of actions, applications, and the system calls that make up the set for a specific action. In addition to traditional system calls (e.g., WriteFile, ReplaceFile, etc.), higher-level calls exposed by application add-ons or plug-ins may be designated as “system calls” in the system and method disclosed herein. As Table 1 shows, higher-level calls such as events generated by Outlook MAPI (e.g., fnevCreated) or Browser Helper Object events (e.g., BeforeNavigate2) can be designated as a system call for a specific action. Browser Helper Objects (BHOs) are DLL modules that add functionality to Microsoft’s Internet Explorer browser. With BHOs, for example, one can create components that Internet Explorer loads each time the browser starts. BHOs run in the same memory context as the browser and can perform any action on the available modules and windows.
TABLE 1

Action       Application            System Call(s) to Capture
Save File    Microsoft Word         WriteFile, ReplaceFile
             Microsoft Excel        WriteFile, ReplaceFile
             Microsoft PowerPoint   WriteFile
             Acrobat Reader         WriteFile
             Notepad                WriteFile
Send Email   Microsoft Outlook      fnevCreated
             Web Based              BeforeNavigate2
Upload File  Internet Explorer      BeforeNavigate2
Print        Notepad                WriteFile, WritePrinter
             Microsoft Word         WriteFile, WritePrinter
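The per-action, per-application sets in Table 1 lend themselves to a simple lookup structure that a deployed interceptor might consult. The table contents below are taken directly from Table 1; the data-structure choice and function name are illustrative assumptions.

```python
# Table 1 as a lookup structure: for each (action, application) pair, the
# set of system calls to capture. Entries mirror Table 1; the dict-based
# representation itself is an illustrative choice, not from the patent.

CAPTURE_SETS = {
    ("Save File", "Microsoft Word"):       {"WriteFile", "ReplaceFile"},
    ("Save File", "Microsoft Excel"):      {"WriteFile", "ReplaceFile"},
    ("Save File", "Microsoft PowerPoint"): {"WriteFile"},
    ("Save File", "Acrobat Reader"):       {"WriteFile"},
    ("Save File", "Notepad"):              {"WriteFile"},
    ("Send Email", "Microsoft Outlook"):   {"fnevCreated"},
    ("Send Email", "Web Based"):           {"BeforeNavigate2"},
    ("Upload File", "Internet Explorer"):  {"BeforeNavigate2"},
    ("Print", "Notepad"):                  {"WriteFile", "WritePrinter"},
    ("Print", "Microsoft Word"):           {"WriteFile", "WritePrinter"},
}

def should_capture(action, application, call_name):
    """True if this call belongs to the capture set for the given pair."""
    return call_name in CAPTURE_SETS.get((action, application), set())

print(should_capture("Print", "Notepad", "WritePrinter"))      # True
print(should_capture("Save File", "Notepad", "ReplaceFile"))   # False
```

Keying on the (action, application) pair reflects the point made below: even for seemingly identical actions, different applications emit different system calls.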
As mentioned above, the method shown in FIG. 1 can be used for any application, even if its source code is unavailable or cannot be modified. Even for actions that appear identical, the system calls used by various applications depend on their internal behavior and implementation. The system calls for the Save File action in Microsoft Word can differ from the system calls for the Save File action in Notepad. It may therefore be desirable to generate a set (or sets) of system calls for each application, and for every data leak event in each application.
The method of data leakage prevention illustrated in FIG. 2 includes both the analysis of the application and its deployment after the analysis. In the description of FIG. 2, the components of the systems 10 and 10′ shown in FIGS. 3 and 4 may also be referenced.
The application 12 is loaded onto the local or client device 14, 14′ (as shown by reference numeral 200). The local or client device 14, 14′ may be an in-office device or an out-of-intranet device. Out-of-intranet devices include any company-owned mobile device (e.g., laptops, smart phones, etc.) that can carry sensitive data and be used inside or outside a trusted/secure environment. A peripheral device is a device that is or can be connected to a company intranet, such as a USB drive, memory device, or other device.
In the examples provided here, the application 12 refers to an application that is useful to the user but does not handle sensitive data according to a policy established for the document, user, etc. The source code for the application 12 is unavailable or cannot be modified. This type of application 12 can cause accidental or deliberate cross-contamination of sensitive information if sensitive data is opened by, or copied into, the application. Because the application 12 initially cannot handle sensitive data, it is sandboxed using the method described herein (see FIG. 2), which builds upon the analysis of FIG. 1.
The term “document” is used herein to refer to all types of data and media, including text, audio, video, files, labels, tags, phrases, patterns, network packet streams, signals, records, e-mails, and other data. The “contents” of a document may include document metadata.