Invented by Rahul Chander Kashyap, Vadim Dmitriyevich Kotov, Samuel John Oswald, Homer Valentine Strong, Cylance Inc
The Cylance Inc invention works as follows
A number of events are monitored for each computing node that is part of a network topology. The network topology contains antivirus tools that detect malicious software before it can access one of the nodes. Using at least one machine learning model, it is then determined that at least one event is indicative of malicious behavior that has bypassed or circumvented the antivirus tools. Data that characterizes this determination is then provided. Also described are related apparatus, systems, technologies and articles.
Background for Machine learning for endpoint detection and response
Antivirus software and hardware are increasingly effective in preventing malware infection of individual computers, as well as enterprise computing systems. These antivirus tools, however, are not foolproof and can be vulnerable to malicious activities. Endpoint detection and response (EDR) systems are used to detect suspicious activity, contain identified threats, and remediate them to reduce the risk of sensitive data being lost.
In a first approach, multiple events are monitored for each computing node that is part of a network topology. The network topology contains antivirus tools that detect malicious software before it can access one of the nodes. Using at least one machine learning model, it is determined that at least one event is indicative of malicious behavior that bypassed or circumvented the antivirus tools. Data that characterizes the determination is then provided.
Providing data describing the determination can include: displaying the data on an electronic visual display, loading the data into memory, storing the data in physical persistence, or transmitting the data to a remote computing system.
The data that is provided can include an identification of a source of the malicious activity. A node that corresponds to the source of the malicious activity can then be isolated from other nodes based on the identification. As an alternative or additional response to identifying the source of the malicious activity, remediation can be initiated at the corresponding network node to prevent further damage to the node. In some variations, remediation uses at least one reinforcement learning method from the group of multi-armed bandits, Q-learning, or Bayesian optimization.
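As a rough illustration of how a multi-armed bandit could drive remediation, the sketch below uses an epsilon-greedy strategy to learn which remediation action tends to succeed. The action names, the simulated success rates, and the `EpsilonGreedyRemediator` class are all hypothetical and are not taken from the patent; a real system would derive rewards from observed remediation outcomes.

```python
import random


class EpsilonGreedyRemediator:
    """Epsilon-greedy multi-armed bandit over hypothetical remediation actions."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}  # running mean reward per action

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental mean: new = old + (reward - old) / n
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n


# Hypothetical actions with simulated success rates (assumptions for illustration).
SUCCESS_RATE = {"kill_process": 0.3, "quarantine_file": 0.8, "revert_registry": 0.5}

bandit = EpsilonGreedyRemediator(SUCCESS_RATE, epsilon=0.2, seed=42)
sim = random.Random(7)
for _ in range(2000):
    action = bandit.choose()
    reward = 1.0 if sim.random() < SUCCESS_RATE[action] else 0.0
    bandit.update(action, reward)

best_action = max(bandit.values, key=bandit.values.get)
```

Over many rounds the estimated value of each action approaches its simulated success rate, so the bandit increasingly favors the most effective action while still exploring occasionally.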
The antivirus tools may include antivirus software or at least one computer gateway appliance.
The machine learning models may include generalized linear models, ordinary least squares (OLS), ridge regression, elastic net, multi-task elastic net, least angle regression (LARS), orthogonal matching pursuit, Bayesian regression, naive Bayes, logistic regression, stochastic gradient descent (SGD), Perceptron, passive aggressive algorithms, Huber regression, polynomial regression, support vector machines (SVM), nearest neighbor, Gaus
At least one machine learning model can use supervised training. These models can use labels from existing label corpora, indicators of compromise, or deterministic finite automata tailored to recognize specific tactics, techniques and procedures.
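To make the idea of an automaton that produces training labels more concrete, here is a minimal sketch of a deterministic finite automaton that labels an event sequence as matching one tactic. The event names and the specific tactic (a document application spawning a shell that then opens a network connection) are hypothetical examples, not taken from the patent.

```python
# Hypothetical tactic: a document app spawns a shell, which then connects to the network.
TRANSITIONS = {
    ("start", "office_spawns_shell"): "shell_spawned",
    ("shell_spawned", "shell_network_connect"): "matched",
}
ACCEPTING = {"matched"}


def label_sequence(events):
    """Return 1 if the event sequence drives the automaton into an accepting state, else 0."""
    state = "start"
    for event in events:
        # Events with no defined transition leave the state unchanged (implicit self-loop).
        state = TRANSITIONS.get((state, event), state)
        if state in ACCEPTING:
            return 1
    return 0


suspicious = ["file_open", "office_spawns_shell", "dns_lookup", "shell_network_connect"]
benign = ["shell_network_connect", "office_spawns_shell"]
labels = [label_sequence(suspicious), label_sequence(benign)]
```

The resulting 0/1 labels could then be paired with event features as targets for a supervised model.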
At least one machine learning model can use unsupervised methods to characterize qualitative changes in a node, based on monitored events. These methods include clustering, anomaly detection, or latent variable models.
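One very simple form of unsupervised anomaly detection is a z-score test against a node's own baseline. The sketch below assumes events have already been aggregated into per-window counts; the function name, the threshold, and the sample numbers are illustrative assumptions, and a production system would likely use a richer model.

```python
from statistics import mean, stdev


def anomalous_windows(baseline, observed, threshold=3.0):
    """Return indices of observed windows whose count deviates strongly from the baseline."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs at least two baseline points
    if sigma == 0:
        # A perfectly flat baseline: anything different is flagged.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]


# Hypothetical per-minute process-creation counts on one node.
baseline = [10, 12, 11, 9, 10, 11, 12, 10]
observed = [11, 10, 48, 9]
flagged = anomalous_windows(baseline, observed)
```

Here only the third observed window (index 2) stands far outside the baseline, so it is the one flagged for further analysis.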
The machine learning models can be embedded in a variety of machine learning packs. Each machine learning pack can be distinct and separate, and can identify a different type of malicious activity. For example, the different types of malicious activity identified by the machine learning packs can include memory-based attacks, POWERSHELL/macro-based exploits, privilege escalation, lateral movement, data exfiltration, anti-analysis efforts, password stealers, backdoors/tunnels, and/or insider threats.
At least one of the machine learning packs can be dynamically configured on a node in response to communications with another node or a remote computing system. At least one machine learning pack on a node can also be updated through communications with another system or node. At least one machine learning model can also be dynamically modified based on monitored events.
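The pack concept can be sketched as a small registry on each node that accepts new or newer packs pushed from elsewhere. Everything below (the `MLPack` and `Node` classes, the version field, and the threshold lambdas standing in for trained models) is a hypothetical illustration rather than the patent's actual design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MLPack:
    name: str                       # type of malicious activity the pack targets
    version: int
    detect: Callable[[dict], bool]  # stand-in for a trained model's decision function


@dataclass
class Node:
    packs: Dict[str, MLPack] = field(default_factory=dict)

    def install(self, pack: MLPack) -> bool:
        """Install a pack received from another node or remote system if it is new or newer."""
        current = self.packs.get(pack.name)
        if current is None or pack.version > current.version:
            self.packs[pack.name] = pack
            return True
        return False

    def scan(self, event: dict) -> List[str]:
        """Return the names of all packs that flag the event."""
        return [name for name, pack in self.packs.items() if pack.detect(event)]


node = Node()
node.install(MLPack("lateral_movement", 1, lambda e: e.get("remote_logins", 0) > 5))
node.install(MLPack("data_exfiltration", 1, lambda e: e.get("bytes_out", 0) > 1_000_000))
# A newer pack arrives and replaces the old one without restarting the node.
updated = node.install(MLPack("data_exfiltration", 2, lambda e: e.get("bytes_out", 0) > 500_000))
hits = node.scan({"remote_logins": 1, "bytes_out": 750_000})
```

Keeping each pack separate means one detector can be updated or swapped without touching the others.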
In some cases, missing data may be reconstructed using at least one generative model. This missing data may provide context for at least one malicious event.
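As one possible reading of this idea, a first-order Markov chain over event types is a simple generative model that can propose the most likely event for a gap between two observed events. The patent does not specify which generative model is used; the training sequences, event names, and helper functions here are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count how often each event type is followed by each other event type."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts


def prob(counts, a, b):
    """Estimated probability that event b follows event a."""
    total = sum(counts[a].values())
    return counts[a][b] / total if total else 0.0


def impute(counts, prev, nxt):
    """Pick the missing event x that maximizes P(x | prev) * P(nxt | x)."""
    candidates = list(counts[prev])
    if not candidates:
        return None
    return max(candidates, key=lambda x: prob(counts, prev, x) * prob(counts, x, nxt))


# Hypothetical historical event sequences from monitored nodes.
history = [
    ["login", "open_doc", "spawn_shell", "net_connect"],
    ["login", "open_doc", "spawn_shell", "net_connect"],
    ["login", "open_doc", "save_doc", "logout"],
]
counts = fit_transitions(history)
missing = impute(counts, "open_doc", "net_connect")
```

In this toy history, the only event that both follows `open_doc` and precedes `net_connect` is `spawn_shell`, so that is the reconstructed context.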
Non-transitory computer program products (i.e., physically embodied computer program products) that store instructions are also described. These instructions, when executed by one or more data processors of one or more computing systems, cause at least one data processor to perform the operations described here. Computer systems are also described, which may include one or more data processors coupled with memory. The memory can temporarily or permanently store instructions that cause one or more processors to perform any of the operations described in this document. Methods can be implemented by one or more data processors, either within a single computing system or distributed across two or more computing systems. Such computing systems can be connected and can exchange data, commands, or other instructions via one or more connections, including but not limited to a connection over a network (e.g., the Internet, a wireless wide area network, a local area network, a wide area network, a wired network, or the like).
The subject matter described in this document provides many advantages. The current subject matter, for example, provides enhanced techniques that allow earlier and more effective identification of threats, which in turn allows more rapid remediation or containment of threats. More rapid remediation can reduce the amount and/or likelihood of sensitive data being lost within an enterprise computing environment.
The drawings and description below provide details on one or more variations. The description, drawings and claims will reveal other features and benefits of the subject matter.
Description of Drawings
The current subject matter is directed to machine-learning-based endpoint detection and response (EDR), which identifies threats more quickly and accurately within a network topology. FIG. 1 shows a simplified network topology (which can correspond to a computer network in an enterprise) with a number of computing nodes 110 (also known as endpoints), which are connected via a network 120 (e.g. the Internet, a wide area network, a local area network, etc.). The nodes can communicate peer-to-peer or asynchronously. The nodes 110 may also be able to access remote computing systems or networks (not shown) through the network 120, or in other ways. The nodes 110 can be mobile phones, desktop computers, laptops, or servers.
Events associated with each computing node 110 can be monitored locally by the nodes 110 or centrally by a monitoring node. Some or all of these events can be logged, or otherwise made accessible for analysis or retrieval. Event data can, for instance, be multidimensional and stored in one or more tables within a database or another data storage device. Whether a monitored event is logged can be decided by a rules engine that executes a static rule set, by one or more machine learning models, or by a combination of both. The static rule set, for example, can be used to filter out certain events for later analysis by a machine learning model. In some cases, software hooks and other event handlers can be used to monitor the operating system or application events of each node. Alternatively, software hooks or data interception techniques can be used to monitor packet traffic to and/or from the nodes 110.
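The combination of a static rule set and a machine learning model for deciding what to log can be sketched as a two-stage filter. The ignored event types, feature names, weights, and threshold below are made up for illustration; the hand-set logistic scorer simply stands in for whatever trained model a real deployment would use.

```python
import math

# Stage 1: hypothetical static rule set that drops routine event types outright.
IGNORED_EVENT_TYPES = {"heartbeat", "screen_unlock"}


def passes_rules(event):
    return event["type"] not in IGNORED_EVENT_TYPES


# Stage 2: stand-in for a trained model, here a logistic scorer with hand-set weights.
WEIGHTS = {"unsigned_binary": 2.0, "writes_to_system_dir": 1.5, "network_connect": 0.5}
BIAS = -2.0


def model_score(event):
    z = BIAS + sum(w for feature, w in WEIGHTS.items() if event.get(feature))
    return 1.0 / (1.0 + math.exp(-z))


def should_log(event, threshold=0.5):
    """Log only events that survive the static rules and score above the threshold."""
    return passes_rules(event) and model_score(event) >= threshold


risky = {"type": "process_start", "unsigned_binary": True, "writes_to_system_dir": True}
routine = {"type": "heartbeat", "unsigned_binary": True, "writes_to_system_dir": True}
quiet = {"type": "process_start"}
decisions = [should_log(risky), should_log(routine), should_log(quiet)]
```

Running the cheap rule check first keeps obviously routine events away from the model, which matters when every node produces a high volume of events.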
In some variations, all or some of the nodes 110 can have a local log 112 in which data describing the monitored events is stored. In some variations, the nodes 110 transmit data describing various monitored events to a monitoring node 130, either periodically or upon a triggering event such as a demand, and the monitoring node 130 stores that data in a central log. The monitoring node 130, which can include distributed computing devices, can use the central log whether or not local logs 112 are part of the topology. If no logs at all are used, some or all of the data collected as part of monitoring can be applied to one or more machine learning packs 134. Reference: U.S. patent application Ser. No. 15/354,966, filed Nov. 17, 2016.
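The relationship between local logs and a central log can be sketched as follows. The class names, the batch-size trigger, and the on-demand `flush` method are hypothetical choices meant only to illustrate periodic or triggered transfer from a node to the monitoring node.

```python
class MonitoringNode:
    """Stand-in for the monitoring node, holding a central log."""

    def __init__(self):
        self.central_log = []

    def receive(self, node_id, events):
        self.central_log.extend((node_id, event) for event in events)


class EndpointNode:
    """Stand-in for a computing node with a local log."""

    def __init__(self, node_id, monitor, batch_size=3):
        self.node_id = node_id
        self.monitor = monitor
        self.batch_size = batch_size
        self.local_log = []

    def record(self, event):
        self.local_log.append(event)
        # Hypothetical triggering event: the local log reaching a batch size.
        if len(self.local_log) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send buffered events to the monitoring node, e.g. on demand."""
        if self.local_log:
            self.monitor.receive(self.node_id, self.local_log)
            self.local_log = []


monitor = MonitoringNode()
node_a = EndpointNode("node-a", monitor, batch_size=2)
node_a.record("process_start")
pending = list(node_a.local_log)   # still buffered locally
node_a.record("file_write")        # batch size reached, events are sent
node_a.record("net_connect")
node_a.flush()                     # explicit on-demand transfer
```

Batching trades a little detection latency for less network chatter; the on-demand flush covers the case where the monitoring node requests data immediately.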