Invented by Rahul Chander Kashyap, Vadim Dmitriyevich Kotov, Samuel John Oswald, Homer Valentine Strong, Cylance Inc

The market for machine learning for endpoint detection and response (EDR) has been rapidly growing in recent years. Endpoint detection and response refers to the process of monitoring and responding to security threats on individual devices, such as laptops, desktops, and mobile devices, within a network.

Machine learning, a subset of artificial intelligence, has become a powerful tool in the field of cybersecurity. It involves training computer systems to learn from data and make predictions or take actions without being explicitly programmed. In the context of EDR, machine learning algorithms can analyze large volumes of data from endpoints to identify patterns and anomalies that may indicate a security breach or malicious activity.

One of the main advantages of using machine learning for EDR is its ability to detect previously unknown threats. Traditional signature-based detection methods rely on known patterns or signatures of known malware to identify threats. However, cybercriminals are constantly evolving their tactics and creating new malware variants that can bypass these signature-based systems. Machine learning algorithms, on the other hand, can learn from historical data and adapt to new and emerging threats, making them more effective in detecting and responding to unknown attacks.

Another benefit of machine learning for EDR is its ability to reduce false positives. False positives occur when a security system incorrectly identifies benign activity as malicious, leading to unnecessary alerts and potentially overwhelming security teams. Machine learning algorithms can analyze vast amounts of data and learn to differentiate between normal and abnormal behavior, reducing the number of false positives and allowing security teams to focus on genuine threats.

The market for machine learning for EDR is driven by the increasing complexity and sophistication of cyber threats. With the proliferation of connected devices and the rise of remote work, organizations are facing a growing number of endpoints that need to be secured. Traditional security solutions are struggling to keep up with the rapidly evolving threat landscape, leading to a demand for more advanced and intelligent solutions.

Several major players in the cybersecurity industry have recognized the potential of machine learning for EDR and have developed their own solutions. These solutions typically involve a combination of machine learning algorithms, behavioral analytics, and threat intelligence to provide comprehensive endpoint security. They can automatically detect and respond to threats in real-time, enabling organizations to proactively protect their endpoints and mitigate potential damage.

The market for machine learning for EDR is expected to continue its growth trajectory in the coming years. As organizations increasingly prioritize cybersecurity and invest in advanced threat detection and response capabilities, the demand for machine learning-based solutions will continue to rise. Additionally, advancements in machine learning techniques and technologies, such as deep learning and neural networks, will further enhance the effectiveness and accuracy of EDR systems.

In conclusion, the market for machine learning for endpoint detection and response is experiencing significant growth as organizations seek more advanced and intelligent solutions to combat evolving cyber threats. Machine learning algorithms offer the ability to detect unknown threats and reduce false positives, making them a valuable tool in the fight against cybercrime. As the threat landscape continues to evolve, the demand for machine learning-based EDR solutions will only continue to increase.

The Cylance Inc invention works as follows

Events are monitored for each computing node that forms part of a network topology. The network topology includes antivirus tools that detect malicious software before it can access one of the nodes. Using at least one machine learning model, it is then determined that at least one of the events is indicative of malicious activity that has bypassed or circumvented the antivirus tools. Data characterizing this determination is then provided. Related apparatus, systems, techniques and articles are also described.

Background for Machine learning for endpoint detection and response

Antivirus software and hardware are increasingly effective in preventing malware infection of individual computers, as well as enterprise computing systems. These antivirus tools, however, are not foolproof, which leaves such computers and systems vulnerable to malicious activity. Endpoint detection and response (EDR) systems are used to detect suspicious activity, contain identified threats, and remediate them, reducing the risk that sensitive data is lost.

In a first approach, multiple events are monitored for each computing node that forms part of a network topology. The network topology includes antivirus tools that detect malicious software before it can access one of the nodes. Using at least one machine learning model, it is determined that at least one of the events is indicative of malicious activity that has bypassed or circumvented the antivirus tools. Data characterizing the determination is then provided.

Providing data describing the determination can include displaying the data on an electronic visual display, loading the data into memory, storing the data in physical persistence, or transmitting the data to a remote computing system.

The data that is provided can include an identification of a source of the malicious activity. A node that corresponds to the source of the malicious activity can then be isolated from the other nodes based on the identification. As an alternative or additional response to identifying the source of the malicious activity, remediation can be initiated at the corresponding node to prevent further damage to it. In some variations, remediation uses at least one reinforcement learning method selected from the group of multi-armed bandits, Q-learning, or Bayesian optimization.
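
To make the reinforcement learning idea concrete, the following is a minimal sketch of an epsilon-greedy multi-armed bandit choosing among remediation actions. It is not taken from the patent: the action names, the simulated success rates, and the 0/1 reward signal are all assumptions made purely for illustration.

```python
import random


class EpsilonGreedyBandit:
    """Chooses among actions, favouring those with the best observed reward."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}  # running mean reward

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental update of the mean reward observed for this action.
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n


# Hypothetical remediation actions with made-up success probabilities.
SUCCESS_RATE = {
    "kill_process": 0.3,
    "quarantine_file": 0.8,
    "reset_credentials": 0.5,
}


def simulate(rounds=2000, seed=1):
    """Run the bandit against a simulated environment and return it."""
    bandit = EpsilonGreedyBandit(SUCCESS_RATE, epsilon=0.1, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        action = bandit.select()
        # Reward 1.0 if the (simulated) remediation succeeded, else 0.0.
        reward = 1.0 if env.random() < SUCCESS_RATE[action] else 0.0
        bandit.update(action, reward)
    return bandit


bandit = simulate()
print(bandit.counts)
print(bandit.values)
```

In a real deployment the reward would have to come from some measure of whether the threat was actually contained, which this sketch does not attempt to model.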

The antivirus tools may include antivirus software or at least one computer gateway appliance.

The machine learning models may include generalized linear models, ordinary least squares (OLS), ridge regression, elastic net, multi-task elastic net, least angle regression (LARS), orthogonal matching pursuit, Bayesian regression, naive Bayes, logistic regression, stochastic gradient descent (SGD), Perceptron, passive aggressive algorithms, Huber regression, polynomial regression, support vector machines (SVM), nearest neighbor, Gaus…

At least one machine learning model can use supervised training. Such models can use labels from existing label corpora, indicators of compromise, or deterministic finite automata tailored to recognize specific tactics, techniques and procedures.

At least one machine learning model can use unsupervised methods to characterize qualitative changes in a node based on the monitored events. These methods include clustering, anomaly detection, and latent variable models.
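
As a rough illustration of the anomaly detection case, the sketch below flags observations that deviate strongly from a node's own baseline using a simple z-score. The feature (outbound connections per minute), the sample values, and the threshold of 3 standard deviations are assumptions chosen for the example, not details from the patent, and a production system would likely use a richer model.

```python
from statistics import mean, stdev


def zscore_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value, z) for observations far from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        sigma = 1e-9  # avoid division by zero for a constant baseline
    flagged = []
    for i, value in enumerate(observed):
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, value, round(z, 2)))
    return flagged


# Hypothetical feature: outbound connections per minute on one node.
baseline_counts = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6]
new_counts = [5, 6, 48, 4]

anomalies = zscore_anomalies(baseline_counts, new_counts)
print(anomalies)
```

Only the burst of 48 connections exceeds the threshold here; the other values stay close to the baseline mean of 5.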

The machine learning models can be embedded in a variety of machine learning packs. Each machine learning pack can be separate and distinct, and can identify a different type of malicious activity. For example, the types of malicious activity identified by the machine learning packs can include memory-based attacks, POWERSHELL/macro-based exploits, privilege escalation, lateral movement, data exfiltration, anti-analysis efforts, password stealers, backdoors/tunnels, and/or insider threats.
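
One way to picture separate packs, each responsible for one category of activity, is sketched below. The `Pack` structure, the event field names (`command_line`, `bytes_out`), and the hand-written scoring functions standing in for trained models are all hypothetical; the patent does not specify how a pack is represented.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


@dataclass
class Pack:
    """A self-contained detector for one category of malicious activity."""
    name: str
    score: Callable[[Event], float]  # stand-in for a trained model
    threshold: float = 0.5


def powershell_score(event: Event) -> float:
    # Toy heuristic: encoded PowerShell command lines are suspicious.
    cmd = str(event.get("command_line", "")).lower()
    return 0.9 if "powershell" in cmd and "-enc" in cmd else 0.0


def exfiltration_score(event: Event) -> float:
    # Toy heuristic: score grows with outbound volume, saturating at 1 GB.
    sent = float(event.get("bytes_out", 0))
    return min(sent / 1e9, 1.0)


PACKS = [
    Pack("powershell_exploit", powershell_score),
    Pack("data_exfiltration", exfiltration_score),
]


def evaluate(event: Event, packs: List[Pack] = PACKS) -> List[str]:
    """Return the names of all packs whose score meets their threshold."""
    return [p.name for p in packs if p.score(event) >= p.threshold]


print(evaluate({"command_line": "powershell.exe -enc SQBFAFgA"}))
print(evaluate({"bytes_out": 2_000_000_000}))
print(evaluate({"command_line": "notepad.exe"}))
```

Because each pack is independent, a pack could be added to or removed from the list without touching the others, which is the property the paragraph above emphasizes.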

At least one of the machine learning packs can be dynamically configured on a node in response to communications with another node or with a remote computing system. At least one machine learning pack on a node can also be updated through communications with another node or system. At least one machine learning model can also be dynamically modified based on the monitored events.

In some cases, missing data may be reconstructed using at least one generative model. The reconstructed data may provide context for at least one malicious event.

Non-transitory computer program products (i.e., physically embodied computer program products) that store instructions are also described. These instructions, when executed by one or more data processors of one or more computing systems, cause at least one data processor to perform the operations described in this document. Computer systems are also described, which may include one or more data processors coupled with memory. The memory can temporarily or permanently store the instructions that cause the one or more processors to perform any of the operations described in this document. Methods can be implemented by one or more data processors, either within a single computing system or distributed across two or more computing systems. Such computing systems can be connected and can exchange data, commands, other instructions, or the like via one or more connections. This includes, but is not limited to, a connection over a network (e.g., the Internet, a wireless wide area network, a local area network, a wide area network, a wired network, etc.).

The subject matter described in this document provides many advantages. For example, the current subject matter provides enhanced techniques that allow earlier and more effective identification of threats, which in turn allows more rapid remediation or containment of those threats. More rapid remediation can reduce the amount and/or likelihood of sensitive data being lost within an enterprise computing environment.

The drawings and description below provide details on one or more variations. The description, drawings and claims will reveal other features and benefits of the subject matter.

Description of Drawings

The current subject matter is directed to machine-learning-based endpoint detection and response (EDR) that identifies threats more quickly and accurately within a network topology. FIG. 1 shows a simplified network topology (which can correspond to an enterprise computer network) with a number of computing nodes 110 (also known as endpoints) that are connected via a network 120 (e.g., the Internet, a wide area network, a local area network, etc.). Communication among the nodes can be peer-to-peer or asynchronous. The nodes 110 may also be able to access remote computing systems or networks (not shown) through the network 120 or by other means. The nodes 110 can be mobile phones, desktop computers, laptops, or servers.

Events associated with each computing node 110 can be monitored locally by the nodes 110 or centrally by a monitoring node 130. Some or all of these events can be logged or otherwise made accessible for analysis or retrieval. Event data can, for instance, be multidimensional and stored in one or more tables within a database or another data storage device. Whether a monitored event is logged can be decided by a rule engine that executes a static set of rules, by one or more machine learning models, or by a combination of both. The static rule set can, for example, be used to filter out certain events before the remaining events are analyzed by a machine learning model. In some cases, software hooks and other event handlers can be used to monitor the operating system or application events of each node. Alternatively, software hooks or data interception techniques can be used to monitor packet traffic to and/or from the nodes 110.
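
The combination of a static rule set with a machine learning model can be sketched as a two-stage decision: rules discard obviously uninteresting events first, and a model scores what remains. The event fields, the single rule, and the hand-set logistic weights below are hypothetical stand-ins chosen for illustration; a real model would be trained rather than hard-coded.

```python
import math

# Hypothetical static rule set: events matching any rule are filtered out.
STATIC_RULES = [
    lambda e: e.get("process") in {"svchost.exe", "explorer.exe"}
    and e.get("event_type") == "heartbeat",
]

# Hand-set weights standing in for a trained logistic regression model.
WEIGHTS = {
    "writes_to_system_dir": 2.0,
    "spawned_by_office_app": 1.5,
    "network_connect": 0.5,
}
BIAS = -2.0


def model_score(event):
    """Logistic score in (0, 1) computed from the boolean event features."""
    z = BIAS + sum(w for name, w in WEIGHTS.items() if event.get(name))
    return 1.0 / (1.0 + math.exp(-z))


def should_log(event, threshold=0.5):
    """Apply the static rules first, then let the model decide."""
    if any(rule(event) for rule in STATIC_RULES):
        return False
    return model_score(event) >= threshold


suspicious = {"process": "winword.exe", "writes_to_system_dir": True,
              "spawned_by_office_app": True}
routine = {"process": "svchost.exe", "event_type": "heartbeat"}
benign = {"process": "chrome.exe", "network_connect": True}

print(should_log(suspicious), should_log(routine), should_log(benign))
```

Running the cheap rules first keeps the model from being invoked on high-volume routine events, which is one plausible reason to combine the two.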

In some variations, some or all of the nodes 110 have a local log 112 in which data describing the monitored events is stored. In some variations, the nodes 110 transmit data describing the monitored events to the monitoring node 130, either periodically or upon a triggering event such as a demand, and the monitoring node 130 stores that data in a central log. The monitoring node 130, which can comprise distributed computing devices, can use a central log whether or not local logs 112 are part of the topology. If no logs are used at all, some or all of the data collected during monitoring can be applied to one or more machine learning packs 134. See also U.S. patent application Ser. No. 15/354,966, filed Nov. 17, 2016.
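
The interplay between local logs and a central log might look roughly like the sketch below, in which each node buffers events locally and forwards them to the monitoring node either when a flush interval has elapsed or on demand. The class names, the 60-second interval, and the explicit `now` timestamp parameter are assumptions made for this example only; the text above does not prescribe a transport or schedule.

```python
from collections import deque


class MonitoringNode:
    """Collects events from all nodes into a single central log."""

    def __init__(self):
        self.central_log = []

    def receive(self, node_id, event):
        self.central_log.append((node_id, event))


class Node:
    """An endpoint that buffers events in a local log before forwarding."""

    def __init__(self, node_id, flush_interval=60.0):
        self.node_id = node_id
        self.flush_interval = flush_interval
        self.local_log = deque()
        self.last_flush = 0.0

    def record(self, event):
        self.local_log.append(event)

    def maybe_flush(self, monitor, now, on_demand=False):
        """Forward buffered events if the interval elapsed or on demand.

        Returns the number of events forwarded.
        """
        due = now - self.last_flush >= self.flush_interval
        if not (due or on_demand):
            return 0
        sent = len(self.local_log)
        while self.local_log:
            monitor.receive(self.node_id, self.local_log.popleft())
        self.last_flush = now
        return sent


monitor = MonitoringNode()
node = Node("node-1")
node.record({"event_type": "process_start", "process": "cmd.exe"})

print(node.maybe_flush(monitor, now=10.0))                  # interval not yet elapsed
print(node.maybe_flush(monitor, now=10.0, on_demand=True))  # forced by a demand
print(monitor.central_log)
```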
