Internet – Andrew Layman, Eric Hauser, Matthew John Fleckenstein, Tatyana Mamut, Adam Bosworth, Salesforce.com, Inc.

Abstract for “Simplified entity lifecycle management”

The technology disclosed provides a declarative framework for implementing a state machine that carries out multi-step interaction with an entity. Because it offers a simple rule-based authoring platform for specifying the elements and components of a complex state machine, including state definitions and state transition triggers, the declarative framework can be reused across many different applications. Declarative input from a non-technical user is used to generate and implement the state machine.

Background for “Simplified entity lifecycle management”

The subject matter discussed in this section should not be assumed to be prior art merely because it is mentioned in this section. Similarly, a problem mentioned in this section or associated with the subject matter provided as background should not be assumed to have been previously recognized in the prior art. The subject matter in this section merely represents different approaches, which in and of themselves may also correspond to implementations of the claimed technology.

The technology disclosed provides a declarative framework for implementing a state machine that carries out multi-step interaction with an entity. Because it is a simple rule-based authoring system for specifying the different elements and components of a complex state machine, such as state definitions, state transition triggers, state transition conditions and state actions, the declarative framework can be reused for many different applications. Declarative input from a non-technical user is used to create and implement the state machine.

In today's world, we deal with massive data volumes, commonly referred to as “Big Data”. Web applications that serve millions of Internet users, such as Facebook, Instagram and Twitter, and banking websites or online retail shops such as Amazon.com and eBay, must ingest large volumes of data as quickly as possible so that end users can have a real-time experience.

The “Internet of Things” (IoT) is another major contributor to Big Data. IoT refers to a pervasive presence in the environment of a variety of things/objects that, through wireless and wired connections, are able to interact with each other and cooperate with other things/objects to create new applications/services. These applications/services exist in areas such as smart cities (and regions), smart cars and mobility, smart homes and assisted living, smart industries, public safety, energy and environmental protection, agriculture and tourism.

Currently, there is a need to make such IoT applications/services more accessible to non-experts. Highly valued non-technical domain experts have so far watched from the sidelines of the IoT ecosystem because IoT development relies on technical platforms that demand extensive programming experience. It is imperative that non-experts be able to carry out big data computing and analysis without the need for expensive technical consultants.

An opportunity therefore arises to provide systems and methods that use a simple and easily codable declarative language to accomplish big data computing and analysis tasks. Increased revenue, higher user retention, and improved user engagement and experience may result.

This summary is provided to give a quick overview of various aspects of the exemplary, non-limiting implementations; a more in-depth explanation follows in the detailed description and the accompanying drawings. This summary is not, however, intended to be an extensive or exhaustive overview. Instead, it serves as a brief introduction to the details presented in the various implementations.

The technology disclosed provides a declarative framework to implement a state machine that allows for multi-step interaction with an entity. Because it is a simple rule-based authoring system that allows for the specification of different elements and components of complex state machines, such as state definitions, state triggers, conditional state transitions and state actions, the declarative framework can be used over and over again for many applications. The declarative input from a non-technical user is used to create and implement the state machine.

Other aspects and advantages of the technology disclosed can be seen on review of the drawings, the detailed description and the claims, which are included.

The following description is made with reference to the figures. Sample implementations are described to illustrate the technology disclosed, not to limit its scope, which is defined by the claims. Those of ordinary skill in the art will recognize a variety of equivalent variations on the description that follows.

The discussion is organized as follows: First, an explanation of the terminology used is provided. Next, an introduction describes some of the technical problems addressed and the solutions offered by different implementations. An architectural overview of several implementations is then provided, followed by a description of a state machine that implements an entity management workflow. Some user interface views used in different implementations are then discussed. Next, more focused actions for implementing the system are discussed, including the data entry columnar, transition triggers and condition definitions. Finally, some particular implementations are discussed.

“Terminology”

“Entity”: An entity is a thing or object that interacts and communicates with other things or objects and with the environment, exchanging data and information sensed about the environment while reacting to real/physical-world events. An entity provides services for information transfer, analytics and the like. People, web users, wired/wireless sensors, smart phones and smart watches are all examples of entities. The singular “entity” and the plural “entities” are used interchangeably in this application for clarity. In this application, in some implementations, “entities” are “data sources”, “users” and other actors.

“Internet of Things Platform”: The “Internet of Things (IoT) platform” disclosed herein is defined as an integrated environment that collects and processes a high volume of data from a plurality of entities in real time or near real time, often with low latency. In some instances, processing logic can be applied to the data to generate real-time or near-real-time analytics. An IoT platform can be described as an integrated framework that uses a combination of batch and stream mode computation, periodically generating aggregates using offline and batch analytics and combining them with results from real-time data streams to generate real-time analytics. This includes computational tasks such as data mining, statistical processing, predictive analytics and the like. In one implementation, the IoT platform offers high throughput, with nodes processing on the order of a million tuples per second. The IoT platform also offers insights to end users through rich visualization, using GUI and/or API based tools such as standard graphs, bars, charts and overlaid infographics.
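
As a rough illustration of the combined stream-and-batch computation described above, the following sketch shows how periodically recomputed batch aggregates might be merged with counts accumulated from the live stream to serve an up-to-date metric. The store, metric names and functions are illustrative assumptions, not components of the platform described in this application.

```python
# Sketch: real-time analytics = batch aggregates + increments from the live stream.
from collections import defaultdict

batch_aggregates = {"clicks": 10_000}      # produced periodically by an offline batch job
realtime_increments = defaultdict(int)     # updated as stream events arrive

def on_stream_event(event):
    """Fold one incoming event into the real-time increments."""
    realtime_increments[event["metric"]] += event.get("value", 1)

def current_total(metric):
    """Combine the last batch aggregate with increments received since it ran."""
    return batch_aggregates.get(metric, 0) + realtime_increments[metric]

on_stream_event({"metric": "clicks", "value": 3})
print(current_total("clicks"))  # 10003
```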

“Near Real-Time Data Stream”: A near real-time (NRT) data stream is a collection of events that are registered as they are generated. An NRT data stream can be described as an unbounded sequence of data tuples. In some implementations, an NRT data stream emits on the order of one million events per second.

“Event”: An event is any identifiable unit of data that conveys information about an occurrence. In one implementation, an event can also provide information about an entity. An event can have three aspects: a timestamp indicating when the event occurred; a set of dimensions indicating various attributes of the event; and a set of metrics related to the event. Events can be user-generated, such as keystrokes and mouse clicks, among a wide variety of other possibilities. System-generated events include statistics (e.g., latency/number of bytes, etc.), program loading and errors, also among a wide variety of other possibilities. In one implementation, events include network flow variables, device information, user and group information, and information on an application (e.g., resource condition, variables and custom triggered events). An event typically represents some message, token, count, pattern, value or marker that can be recognized within an NRT data stream, such as network traffic, specific error conditions or signals, thresholds crossed or counts accumulated. A typical user interaction with an application like Pardot processes a sequence of events that occur in the context of a session. The main events of note are: (a) login, providing user credentials to a hosted server to authenticate the user; (b) application transactions, executing a series of application-level transactions, e.g., adding leads or defining new operations; and (c) log-out, the event that ends the session with the server. In some implementations, deep packet inspection logic tracks raw event data to identify events and stores them in an event repository. This application, in some implementations, interchangeably refers to “events” as “data”, and vice versa. Other examples of events generated by or about various entities include telemetry from a wearable sensor, data from a smart watch, data and/or metadata generated when a user uses a feature of an application (such as Microsoft Word™), trip or journey data generated from a GPS device used by a driver starting or completing a trip, speed and location data reported by a vehicle, data generated by a medical device reporting a sensor reading, etc.
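
The three-part event structure described above (a timestamp, a set of dimensions and a set of metrics) can be pictured with a small sketch. The field names and example values below are illustrative assumptions, not a schema defined by this application.

```python
# Sketch of an event with the three aspects mentioned above:
# a timestamp, a set of dimensions and a set of metrics.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Event:
    timestamp: datetime                              # when the event occurred
    dimensions: dict = field(default_factory=dict)   # attributes describing the event
    metrics: dict = field(default_factory=dict)      # measured values for the event

login_event = Event(
    timestamp=datetime.now(timezone.utc),
    dimensions={"type": "login", "user": "jdoe", "device": "smart watch"},
    metrics={"latency_ms": 42, "bytes": 512},
)
```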

“Pipeline”: A pipeline is defined as a series of grouped events that are related. In one implementation, the grouping is done on a type-by-type basis. In another implementation, the grouping is done on a batch-by-batch basis.

“Online Social Network”: An “online social network” is defined as any combination of software, protocols and/or hardware that allows a community of individuals or other entities to share information, resources and the like over a computer network (such as the Internet). An online social network is a platform, such as a website, a blog or a forum, that fosters interaction, engagement and information sharing. Examples of online social networks include Facebook™, Twitter™, Flickr™, Picasa™, Digg™, RSS™, Blogs™, Reddit™, LinkedIn™, Wikipedia™, Pinterest™, Google Plus+™, MySpace™, Bitly™ and the like. This application, in some implementations, interchangeably refers to an “online social network” as a “social network”, “social media site”, “social networking service” or “social network entity”, and vice versa.

“Application Programming Interface (API)”: An “application programming interface (API)” is defined as a packaged collection of code libraries, methods and fields that belong to a set of classes, including their interface types. The API defines the way developers and programmers can use the classes for their own software development, simply by importing the relevant classes and writing statements that instantiate the classes and call their methods and fields. An API can also be a source-code-based specification intended to be used as an interface by software components to communicate with one another, and it can include specifications for routines and data structures. In essence, an API is an interface that allows programmers and developers to access the platform capabilities and features of online social networks. For example, Twitter's Search API lets developers and programmers poll Twitter's data by entering a specific username or search term. The Search API provides access to data sets that already exist from tweets that have occurred, and through it developers and programmers can request tweets that match their search criteria, such as keywords, usernames, locations and named places. As another example, Twitter's Streaming API pushes data to developers and programmers in real time, almost immediately after tweets are published. With the Streaming API, developers and programmers register a set of criteria (e.g., keywords, usernames, locations, named places, etc.), and as soon as tweets match the criteria they are pushed directly to the developers and programmers. In yet another example, Twitter Firehose pushes data to developers and programmers in near real time and guarantees delivery of all tweets that match the set criteria.

“Application”: An “application” refers to a network-hosted service accessed via a uniform resource locator (URL). Examples include software as a service (SaaS) offerings, platform as a service (PaaS) offerings and infrastructure as a service (IaaS) offerings. Examples of such applications include Salesforce1 Platform™, Sales Cloud™, Data.com™, Service Cloud™, Desk.com™, Marketing Cloud™, Pardot™, Wave Analytics™, Box.net™ and Concur™. An application can provide insights to end users through rich visualization, using GUI and/or API based tools such as standard graphs, bars, charts and overlaid infographics.

“Entity Experience Operation”: An “entity experience operation” is defined as an orchestrated effort, usually undertaken by an operator (e.g., a company or an organization), to enable effective user management, resource provisioning, application life cycle management, user engagement, traffic monitoring, activity tracking, provisioning for application modeling and the like.

“Identification”: As used herein, the “identification” of an item of information does not necessarily require the direct specification of that item. Information can be “identified” in a field by simply referring to the actual information through one or more layers of indirection, or by identifying one or more items of different information that are together sufficient to determine the actual item of information. In addition, the term “specify” is used herein to mean the same as “identify”.

“Physical Thread”: Once deployed, a container operates over a set of so-called “physical threads”. A physical thread utilizes a processor core of a worker node and runs inside a set of code processes (e.g., Java processes) that are distributed over the worker node, with no more than one physical thread per core. A physical thread also carries out the logic of a set of tasks/jobs for different elements and components (e.g., emitters and transformers) of a container.

“Long Tail Task Sequence”: A “long tail task sequence” is a task sequence that continues to consume the dedicated computing resources allocated to it, even though those resources were properly sized for the beginning of the task sequence, once the task sequence tails off. One example of a long tail task sequence is a gaming company's distribution of fantasy football game tokens during the Super Bowl. Demand for fantasy football tapers off after the Super Bowl, so the number of game tokens used falls, and so does the number of electronic requests to redeem game tokens. Nevertheless, the gaming company continues to honor unused tokens, which are redeemed gradually over a longer period after the Super Bowl. Because this extended lull requires far fewer computational resources than the Super Bowl surge, it characterizes a long tail task that can be run on fewer computational resources than originally allocated.

“Emitter”: Data enters a container through a so-called “emitter”. Emitters are event tuple sources for a container and are responsible for getting the event tuples into the container. In one implementation, emitters pull event tuples from input queues. In some implementations, emitters include user-specified conversions, such that they consume byte strings from an input queue and forward them as tuples to downstream transformers. An emitter retrieves one or more tasks/jobs to be executed by one or more physical threads of a worker node.

“Transformer”: A transformer is a computation unit of a container that processes incoming event tuples and passes them to the next set of transformers downstream in the container. A transformer passes one or more tasks/jobs downstream, typically to be further transformed by one or more physical threads of a worker node.
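
To make the emitter and transformer roles concrete, here is a minimal sketch in which an in-process queue stands in for the input queue and plain generator functions stand in for an emitter and a downstream transformer. It is an illustration of the pattern, not the container implementation described in this application.

```python
# Sketch: an emitter pulls byte strings off an input queue, converts them to event
# tuples, and forwards them through a downstream transformer.
import json
from queue import Queue

input_queue = Queue()
input_queue.put(b'{"type": "click", "user": "jdoe"}')

def emitter(queue):
    """Consume byte strings from the input queue and emit event tuples."""
    while not queue.empty():
        event = json.loads(queue.get())
        yield (event["type"], event["user"])

def enrich(tuples):
    """A transformer: process each tuple and pass it downstream."""
    for event_type, user in tuples:
        yield (event_type, user, user.upper())

for out in enrich(emitter(input_queue)):
    print(out)  # ('click', 'jdoe', 'JDOE')
```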

“Introduction”

We describe a system and various implementations for simplifying, for a non-programming user, the creation of an entity management workflow. The technology disclosed includes generating, for display, a data entry columnar that accepts declarative input specifying a state machine that implements an automated multi-step progression of interaction with an entity. The data entry columnar can include at least one column listing the states in the multi-step progression, along with time-based and event-based transition triggers, conditions, and actions responsive to state transitions, and the system receives data indicating inputs to the columnar. The state transition triggers can alternatively be defined by timers or by events that cause state changes. Further, the disclosed technology includes evaluating the condition of a state transition against at least one value in a database field to which the condition refers, and responding to the condition by executing alternative actions during the state transition.
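
One way to picture the declarative input is as rows of a data entry columnar interpreted by a small engine. The sketch below assumes hypothetical column names, rule contents and a dictionary standing in for the database field a condition refers to; it illustrates the idea of states, transition triggers, conditions and alternative actions rather than the actual implementation.

```python
# Hypothetical declarative rows: state, transition trigger, condition on a database
# field, action to execute, and the next state to enter.
workflow = [
    {"state": "new",    "trigger": "signup",       "condition": ("plan", "trial"),
     "action": "send_welcome_email", "next": "active"},
    {"state": "active", "trigger": "30_days_idle", "condition": ("plan", "trial"),
     "action": "send_reminder",      "next": "dormant"},
]

database = {"plan": "trial"}  # stands in for the database field a condition refers to

def run_action(name):
    print(f"executing action: {name}")

def step(current_state, trigger):
    """Match state and trigger, test the condition against the database, run the action."""
    for row in workflow:
        field_name, expected = row["condition"]
        if (row["state"] == current_state and row["trigger"] == trigger
                and database.get(field_name) == expected):
            run_action(row["action"])
            return row["next"]
    return current_state  # no transition fired

state = step("new", "signup")  # prints the welcome-email action and returns "active"
```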

The Internet of Things (IoT) is a revolutionary new technology. Because objects and things can communicate information about themselves, they can make, or enable, context-related decisions. Big data is the foundation of IoT. However, big data computing and analytic systems like Pig™ were intended for programmers with advanced programming skills, while most companies need big data computing and analytics to be available to other people, such as domain experts (e.g., marketers, CEOs, sales representatives) who are not code developers. Such users do not have the time or the ability to write fully developed workflow processes based on big data.

Also, the IoT applications currently used to implement entity lifecycle management operations are often created by technical experts for technical experts, making them difficult for non-technical users to operate. Current IoT applications are not suitable for novice programmers who have little experience with formal methods. Furthermore, commercially available and research-driven IoT solutions are primarily applications rather than flexible frameworks, and they require extensive reprogramming to be used in different situations and for different purposes.

The technology disclosed provides a declarative framework to implement a state machine that allows for multi-step interaction with an entity. Because it is a simple rule-based authoring system that allows for the specification of different elements and components of complex state machines, such as state definitions, state triggers, conditional state transitions and state actions, the declarative framework can be used over and over again for many applications. The declarative input from a non-technical user is used to create and implement the state machine.

Our world is made up of the 1s and 0s of the binary code created by the streams of data flowing through every sector of the global economy. How much data is that?

According to IBM, 2.5 exabytes of data were created each day in 2012. That is 2.5 billion gigabytes of data created in a single day. Facebook alone generated 500,000 gigabytes per day in the same year. Data.gov is an initiative by the U.S. Government to make data available and accessible for analysis. Over the past decade, the cost of data storage and processing has fallen by a factor of more than 1,000. However, once the data is stored, it remains difficult to retrieve and use.

According to The Boston Consulting Group, one third of all bank data is never used. A big part of the reason is that 75% of our data is unstructured; it is difficult to index and retrieve because it lacks structure.

Where does all this data come from? An obvious source is the data generated from legacy systems of record. Another is the data generated from cloud software, as evidenced by the rapid adoption of Software as a Service (SaaS) as the new business application model.

There is also the data being created every second by mobile phones, devices and sensors that are being placed on just about everything that can be tracked in the physical world. And social media remains the largest data stream, being created at astronomical volumes.

“Forget text messages, think about all the videos and photos being uploaded to social media sites like YouTube, Facebook and Instagram.”

The smartphone is the main enabler of this data tsunami. Feature phones (mobile phones other than smartphones) and PCs are in decline, while smartphones are growing in the opposite direction, even in sub-Saharan Africa. And where there is a smartphone, there is an application for almost every human activity.

Applications are the control point for all the real-time data streams created by our fingers, the camera, the motion sensor, the GPS antenna and the Bluetooth antenna. While smartphone manufacturers continue to add more capabilities and sensors to these devices, developers continue to create applications that delight us all.

According to The Economist, in 2015 half of all adults owned a smartphone; that number will rise to 80% by 2020. The biggest ripple, however, is still just beginning to form: the “sensorification” of the physical world, a term coined by Andreessen Horowitz. Combining cheap, connected, miniature computers with sensors will create a new world of intelligent, connected products.

A new category of technology is emerging: the “Internet of Things” (IoT). General Electric takes it one step further, using the term “industrial internet” to include things like locomotives, jet engines and MRI machines.

The Internet of Things is a major and transformative wave of IT innovation. The Harvard Business Review calls this the third wave of IT-driven competition, the first two waves having been brought about by mainframes and minicomputers and then the rise of the Internet. It is clear that harnessing and analyzing these data streams will be the greatest challenge IT and businesses face over the coming decade.

This massive volume of data is described as “Big Data”, and traditional data storage technology cannot handle these volumes and velocities. The challenges do not end there.

Enterprises must also figure out how to capture this data and how to search, analyze and visualize it, as well as how to connect it with their customer and business data. The ultimate goal is to be able to use it for predictive analytics and intelligent decision-making in real time. This will require an IT transformation from systems of record to systems of intelligence.

Before the advent of big data, the concept of business intelligence (BI) had already become a common phrase in the 1990s, when a number of new BI software vendors entered the market.

BI provided the tools and methods required to transform data into meaningful and useful information for the business. The functions of BI during this period were fairly basic: gather and organize the data and visualize it in a presentable way.

Innovation continued, and the introduction of data warehouses dramatically reduced the time required to access enterprise data from systems of record. These innovations did not, however, eliminate the core challenge: data warehouses require deep expertise and extensive training in BI tools.

Such tools remain out of reach for mere mortals on the business side. Most BI tools can provide answers if you already have the right questions in mind, but often you do not know which questions to ask in the first place. In short, these tools do not let business users obtain the insights they need when, how and where they need them.

Fortunately, all this is changing. For the first time, data analytics tools are being built that run entirely in the cloud. There is no hardware for IT to provision and no data platform to install or configure. All of the integration and schema development takes just a few days to complete. This technological innovation has abolished the two-step service bureau model that required IT involvement for every request.

These innovations open the door to the democratization of data, so that business users can not only access data but also take part in its analysis. This means a self-service model in which users obtain answers directly, without having to consult IT, analysts or data scientists. Users can quickly find and share information with their business colleagues. It is not necessary to know in advance what questions to ask of the data; business users can quickly discover the questions that will yield insights into the data sets.

Mobile is also a key driver of this democratization. Workers can now access data and answers to business questions from their smartphones, tablets and wearables, anywhere and anytime. The democratization and sharing of data is a crucial step in our journey towards building systems of intelligence.

While data democratization has many benefits, it is mainly about empowering business users to access and analyze data from legacy systems of record as well as cloud-based business applications. Although some of these new BI tools can provide real-time data access and analysis, they were not designed to capture and analyze the real-time streams from smartphones, wearables and the upcoming explosion of sensors in the physical world.

Real-time data streams deliver information that is very different from the historical, backward-looking data that most BI platforms and tools harness. Real-time data is perishable, so it must be detected and acted on quickly. The concept of “time to insight” is therefore one of the most important performance indicators for systems of intelligence. These insights will also require new packaging and consumption: the information must be presented in context, at the right time, and in a way that cuts through the noise of the data we receive in our day-to-day work lives.

Systems of intelligence also require knowledge of how data insights should be used and how to get them to the right worker, based on their job function. These systems must be easy to set up and as democratized as the newest BI tools, and they are designed to handle the daily flood of data we face at work, much as consumer applications such as social media and traffic apps help us deal more intelligently with the things that matter most to us.

The standard for applications connected to our systems of intelligence is the same as it is for consumer applications: one-click installation, a beautiful and simple user interface, and access from your mobile device. Harnessing and analyzing real-time data streams so as to take action, and even to predict what will happen, has traditionally been the domain of data scientists, who handle everything from statistical modeling and computation to visualization and reporting. Data scientists build models that look at historical trends in order to predict future patterns. Enterprises face a major human resource challenge in building the computational models needed to analyze large amounts of real-time data streams.

The McKinsey Global Institute estimates that the United States could face a shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts able to use big data analysis to make informed decisions.

Few companies have data scientists who can both analyze big data streams in real time and turn the analysis into something actionable, and many companies cannot fill open positions with qualified people. Nor will universities be able to prepare enough data scientists to meet future demand. Suppose, though, that you do have data scientists trained to organize and analyze the data. What next? How do you make the results actionable? How do you teach your line managers and directors to interpret the analysis so that they make the right decisions?

While systems of intelligence will not replace data scientists anytime soon, they will reduce the need to hire them. Systems of intelligence harness the collective wisdom, expertise, gained insights and experience of an organization to make intelligent decisions. This collective intelligence can be expressed as rules in a rules engine. These are powerful tools that let business users encode the collective intelligence as simple, logical business rules that evaluate and analyze real-time data streams to make intelligent decisions.
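
A minimal sketch of the kind of rules engine described above follows, with business rules written as predicate/action pairs evaluated against incoming events. The rules and field names are invented for illustration.

```python
# Sketch: business rules as (predicate, action) pairs evaluated against a stream of events.
rules = [
    (lambda e: e["amount"] > 10_000,       lambda e: print("flag for review:", e)),
    (lambda e: e["country"] != e["home"],  lambda e: print("notify customer:", e)),
]

def evaluate(event):
    """Apply every matching rule to an incoming event."""
    for predicate, action in rules:
        if predicate(event):
            action(event)

for event in [{"amount": 25_000, "country": "FR", "home": "US"}]:
    evaluate(event)  # both rules fire for this event
```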

“Data science is the art of formulating a quantitative query that can be answered using data, cleaning and analyzing data, and communicating the result to the relevant audience.”

The low-hanging fruit that enterprises will first harvest from their systems of intelligence is the value derived from simple business rules. As organizations gain more insight from their systems of intelligence, and as more sensors and devices enter the equation, machine learning and algorithms will play an increasing role in intelligent decision-making.

Enterprises will turn more often to artificial intelligence because they will not be able to employ enough business analysts and data scientists to sift through all the data. Artificial intelligence is becoming very sophisticated, and credit card fraud detection is one great example.

Although artificial intelligence does not completely eliminate the need for a trained fraud expert, it dramatically reduces the number of suspicious cases that require human investigation.

There will be many things to consider as organizations begin to spin up big data initiatives, starting with the right people, tools and methods. The technology is unbounded in terms of data sources and sizes, and it can handle ad-hoc questions to whatever depth you wish.

The next step is systems of intelligence that start to tell customers what questions they should be asking. Getting there requires a blueprint for systems of intelligence.

The data streams come from signals emanating in real time from smartphones and from consumer wearables such as the Fitbit and the Apple Watch. The control point for these signals is the application.

“The application is what gives context to the raw data created by human inputs or the sensors embedded within these devices.”

Wikipedia defines a sensor as a device that detects or senses a characteristic of its environment: it detects events or changes in quantities and provides a corresponding output, generally as an electrical or optical signal.

The digital plumbing, or application programming interfaces (APIs), is what ties all this together. Along every element of the data stream flow shown in the schematic, APIs will enable the end-to-end transport of high-speed, high-volume data in the system. Although the term API may not be in common usage outside of IT, it soon will be, in much the same way that other terms of art, like the Internet itself, are common language in business communication today.

The largest data streams will come from connected consumer products and industrial machines. These signals will emanate from product sensors in our cars, our homes, our security systems, and everywhere else that matters.

Signals from the industrial internet will emanate from sensors on any machine or piece of equipment that needs monitoring, maintenance and repair. Anything that can be digitally monitored with sensors in the environment will be. Systems of intelligence must be able to identify and harness these signals.

Capturing high-volume, high-speed data signals requires a “digital watchdog” to monitor these signal inputs. If anything significant happens with these digital signals, an event is registered; an event is a digital signal indicating that something significant has occurred. An example of an event is your car's temperature sensor triggering an alert to warn you of the danger of freezing temperatures.
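
The car-temperature example can be sketched as a tiny digital watchdog that registers an event whenever a monitored signal crosses a threshold. The threshold value and field names are assumptions made for illustration.

```python
# Sketch: register an event whenever a monitored signal crosses a threshold.
FREEZING_C = 0.0

def watchdog(readings, threshold=FREEZING_C):
    """Yield an event whenever the temperature drops to or below the threshold."""
    for reading in readings:
        if reading["temperature_c"] <= threshold:
            yield {"event": "freezing_warning", "value": reading["temperature_c"]}

signal = [{"temperature_c": 4.2}, {"temperature_c": -0.5}]
for event in watchdog(signal):
    print(event)  # {'event': 'freezing_warning', 'value': -0.5}
```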

Systems of intelligence will need the technology to ingest and monitor these data streams. The events created by the digital signals are broadcast via messages, which lets the system move the information around so that the digestion process can proceed as planned. This is where filters can begin to analyze the data streams. The system must be able to handle growing volumes and ever-faster data flows, and it must not lose data even if it crashes.

Once data has been captured and processed, it moves along into the digestion phase. This is where the magic happens: it involves the monitoring and analysis of the real-time data streams. After the data has been analyzed and processed, it must be placed somewhere.

The data streams flowing in are not suited to traditional database storage such as relational databases using structured query language. Specialized technology that can handle and store very large data sets is therefore necessary, and it is an essential component of systems of intelligence.

Another important component of this system is its ability to apply filters, in the form of business rules, to the analysis of the data streams. This begins to eliminate human error by directly incorporating the collective wisdom and expertise of the company into the system. These data streams can be further analyzed using artificial intelligence, which can come in the form of machine learning and algorithms.

Enterprise data is made up of systems of record, systems of engagement and the other systems that form the core of enterprise IT. Systems of record were created as IT moved from mainframes and minicomputers to the Internet, and they have been primarily about moving paper-based and manual processes into the digital era. Systems of record are about automating everyday activities, capturing the information they produce as by-products, and reporting on what are essentially historical documents.

Systems of engagement are fundamentally different from systems of record in that they focus on the social nature of conversations and interactions with customers and partners. These conversations are shaped by social media and the consumerization of IT. Systems of engagement are not document-based digital artifacts; they add elements such as time, context and place. Systems of record will not disappear; enterprises simply need to embrace next-generation communication and collaboration through systems of engagement.

Systems of engagement and systems of record are essential elements in providing context to the data streams and their filtering. Without a complete picture of the customer or partner, you cannot understand the data streams and their outputs. These systems are essential for giving meaning to the analytical insights and intelligent decisions driven by systems of intelligence.

After ingesting and digesting the data streams and applying enterprise context to them, intelligent outputs can be produced and delivered in the right form, at the right moment, and to the right channel. The first two channels are dashboards and insights. Dashboards provide context and visualization of what is happening so that people can take action, such as launching new initiatives or tweaking existing marketing programs. Insights, by contrast, are focused on supporting real-time decision-making; this is the key difference between analytical insights and dashboards. Expressing the collective knowledge and expertise of the organization as business rules is a great way to eliminate bad and avoidable decisions. As the signals multiply, data scientists will become increasingly adept at applying their models and methods to create machine learning algorithms that make intelligent and predictive decisions.

The enterprise is now ready to start transforming the business by applying these intelligent outputs. Our core premise is that behind every application, device and connected product there is a customer. The role of this IoT platform is to connect device data to the user success platform in order to engage customers through sales, marketing, customer service, communities, applications and analytics.

The technology disclosed simplifies, for a non-programming user, the creation of an entity management workflow using a computer-implemented system. The technology disclosed can be used in the context of any computer-implemented system, including a database system or a multi-tenant environment. It can be used with a relational database implementation such as an Oracle™ compatible database implementation, an IBM DB2 Enterprise Server™ compatible relational database implementation, a MySQL™ or PostgreSQL™ compatible relational database implementation, or a Microsoft SQL Server™ compatible relational database implementation, or with a NoSQL non-relational database implementation such as a Vampire™ compatible non-relational database implementation, an Apache Cassandra™ compatible non-relational database implementation, a BigTable™ compatible non-relational database implementation, or a DynamoDB™ compatible non-relational database implementation.

Moreover, the technology disclosed can be used with multiple computer-implemented systems that communicate and cooperate with one another. The technology can be implemented in numerous ways, including as a method, an apparatus, a system, or a computer program product comprising a computer-usable medium having computer-readable program code embodied therein.

In addition, the technology disclosed can be implemented using different programming models such as MapReduce™, bulk synchronous programming, MPI primitives and the like, or with different stream management systems such as Apache Storm™, Apache Spark™, Apache Kafka™, Truviso™, IBM Info-Sphere™, Borealis™ and Yahoo! S4™.

“IoT Platform & Stream-Batch Processing Framework”

We describe a system and various implementations for simplifying, for non-programming users, the creation of entity management workflows. The system and its processes will be described with reference to FIG. 1 and FIG. 2, which show an architectural-level schematic of a system in accordance with an implementation. Because FIG. 1 and FIG. 2 are architectural diagrams, certain details are intentionally omitted to improve the clarity of the description. The discussion of FIG. 1 and FIG. 2 is organized as follows: first, the elements of each figure are described, followed by their interconnections; then, the use of the elements in the system is described in greater detail.

FIG. 1 shows an exemplary IoT platform 100. IoT platform 100 comprises data sources 102, input connectors 104, stream container(s) 106, batch container(s) 108, rich contextual data store 110, orchestration system 112, output connectors 122 and application(s) 123. Rich contextual data store 110 includes various storage nodes C1-C3. Orchestration system 112 includes a data entry columnar 114, an explorer engine, a live dashboard builder engine 128 and a morphing engine 117. Application(s) 123 include various SaaS, PaaS and IaaS offerings.

FIG. 2 shows a stream processing framework 200 used in the platform illustrated in FIG. 1, in accordance with one implementation of the technology. Framework 200 includes data sources 102, input pipeline 204, stream container 106, rich contextual data store 110 and output pipeline 218. Stream container 106 includes an emitter tier 206, a scheduler 208, a coordinator 210 and a worker tier 214.

Turning to the interconnection of the elements of IoT platform 100 and streaming framework 200, a network (not shown) couples the data sources 102, the input connectors 104, the stream container 106, the batch container 108, the rich contextual data store 110, the orchestration system 112, the columnar 114, the output connectors 122, the application(s) 123, the input pipeline 204, the scheduler 208, the coordinator 210, the worker tier 214 and the output pipeline 218 in communication with one another. The actual communication path can be point-to-point over public and/or private networks. Some items, such as data from data sources 102, may be delivered indirectly, e.g., via an application store (not shown). All of the communications can occur over a variety of networks, e.g., private networks, VPN, MPLS circuit or the Internet, and can use appropriate application programming interfaces (APIs) and data interchange formats, e.g., REST, JSON, XML, SOAP and/or JMS. All of the communications can be encrypted. The communication generally occurs over a network such as a LAN (local area network), WAN (wide area network), telephone network (Public Switched Telephone Network (PSTN)), Session Initiation Protocol (SIP), wireless network, point-to-point network, star network, token ring network, hub network, or the Internet, inclusive of the mobile Internet, via protocols such as EDGE, 3G and 4G LTE. Additionally, a variety of authorization and authentication techniques, such as username/password, OAuth, Kerberos and SecureID, can be used to secure the communications.

Having described the elements of FIG. 1 (IoT platform 100) and FIG. 2 (streaming framework 200) and their interconnections, the elements will now be described in greater detail.

Data sources 102 are entities such as a smart phone, a WiFi access point, a sensor or sensor network, a mobile application, a web client, logs from a server, a social media site, etc. In one implementation, data from data sources 102 is accessed via an API (Application Programming Interface) that allows sensors, devices, gateways, proxy servers and other kinds of clients to register data sources 102 in the IoT platform 100 so that data can be ingested from them. Data from data sources 102 can include events in the form of structured data (e.g., user profiles and the interest graph), unstructured text (e.g., tweets) and semi-structured interaction logs. Examples of events include clicks on links and impressions of recommendations; server logs containing user identity information such as user handles, user IDs and user names; content posted by users to their respective feeds on social network services; data on social graphs; metadata, including comments posted in reply to prior posts; events; news articles; and so forth. Events can be stored in a semi-structured data format such as a JSON (JavaScript Object Notation), BSON (Binary JSON), XML or Thrift object, which comprises string fields (or columns) and corresponding values of potentially different types such as numbers, strings, arrays and objects. JSON objects can be nested, and fields can be multi-valued, e.g., arrays or nested arrays.

Near real-time (NRT) data streams 103, as described supra, are collections of events that are registered as they are generated. In one implementation, events are delivered over HTTP to input pipeline 204. In another implementation, events are transmitted via POST requests to a receiver operating on behalf of input pipeline 204. For instance, the Twitter Firehose API, accessible via Twitter-affiliated companies such as DataSift™, nTweetStreamer™ and twitter4j™, provides unbounded time-stamped events, called tweets, as a stream of JSON objects along with metadata about those tweets, including timestamp data, user information and location. These JSON objects are stored in a schema-less or NoSQL data store such as Apache Cassandra™ or Google's BigTable™, with the parsed JSON objects stored using key spaces that are equivalent to a database in SQL. Each key space is divided into column families that are similar to tables and comprise rows and sets of columns.

Input connectors 104 acquire data from data sources 102 and transform the data into an input format that is consumable by containers 106 and 108. In one implementation, the input connectors 104 perform full data pulls and/or incremental data pulls from the data sources 102. In another implementation, the input connectors 104 also access metadata from the data sources 102; for instance, an input connector can issue a “describe” API call to retrieve metadata about an entity and then issue the appropriate API calls to fetch the data for that entity. In some implementations, a connector SDK is used to create customized input connectors 104 for individual data sources 102.

In other implementations, a workflow definition includes a collection of connectors and operators as well as the order in which they are executed. In one implementation, such a workflow is specified as a directed graph, where connectors and operators are graph nodes and edges represent the data flow. In yet other implementations, multiple data streams 103 are combined and transformed before being fed into the containers 106 and 108.

A batch processing framework operating in container(s) 108 generates business insight using OnLine Analytical Processing (OLAP) queries, which is stored in rich contextual data store 110. In one implementation, events are stored in batch container(s) 108 to act as a backup for raw events on which batch processing jobs can run at any time. In some implementations, batch container(s) 108 provides raw counts as well as descriptive statistics such as mean, median and percentile breakdowns. In one implementation, an analytics tool such as Scalding™ or Pig™ is included in batch container(s) 108 to provide retrospective analysis, machine learning modeling and other batch analytics. In yet other implementations, batch container(s) 108 is used to correct errors made by the stream container 106 or to handle upgraded capabilities by running analytics on historical data and recomputing results. An example of a batch processing framework is the Hadoop distributed file system (HDFS) implementing a MapReduce programming model.
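
As a small illustration of the raw counts and descriptive statistics a batch job might produce over stored events, the sketch below uses only the Python standard library; it is not the Scalding™ or Pig™ code referred to above, and the latency values are invented.

```python
# Sketch: raw counts and descriptive statistics over a batch of stored event latencies.
from statistics import mean, median, quantiles

latencies_ms = [12, 18, 25, 31, 47, 52, 88, 120, 240]

report = {
    "count": len(latencies_ms),
    "mean": round(mean(latencies_ms), 1),
    "median": median(latencies_ms),
    "p95": quantiles(latencies_ms, n=100)[94],  # 95th percentile breakdown
}
print(report)
```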

Batch container(s) 108 ingest event tuples from respective input pipelines that collect data for a plurality of NRT data streams. In some implementations, multiple NRT data streams can be assigned to a single pipeline, and multiple pipelines can be assigned to a single batch container.

Stream processing framework 200 provides near-real-time (NRT) processing of sequences of unbounded events, delivering immediate analytics and insights. In one implementation, framework 200 processes one million events per second per node. Framework 200 can be implemented using one of the available stream processors such as Apache Storm™ or Apache Samza™, or a batch-stream processor such as Apache Spark™. In one implementation, framework 200 includes an API for writing jobs that run over a sequence of event tuples and perform operations over those event tuples.

Events are ingested into framework 200 by input pipeline 204, which reads data from data sources 102 and holds events for consumption by stream container 106. In one implementation, input pipeline 204 is a single delivery endpoint for events entering container 106. Examples of input pipeline 204 include Flume™ and Apache Kafka™, which provide a single delivery endpoint for events entering the container 106. In some implementations, input pipeline 204 also includes a listener capable of listening to NRT data streams 103 and data flows originating from data sources 102 by connecting with their respective APIs (e.g., Chatter API, Facebook API (e.g., Open Graph), Twitter API) via the Internet. In some implementations, a listener includes heterogeneous instances responsible for ingesting data from different data sources 102. According to one implementation, input pipeline 204 can be configured to receive data over the network using an application protocol layer or other higher protocol layer, such as the HTTP protocol layer, among many possible standard and proprietary protocol layers. These higher protocol layers can encode, package and/or reformat data for sending and receiving messages over a network layer such as Internet Protocol (IP) and/or a transport layer such as Transmission Control Protocol (TCP) and/or User Datagram Protocol (UDP).
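
As an illustration of events arriving over an HTTP protocol layer as described above, the sketch below assumes a bare-bones receiver built on Python's standard http.server that appends each POSTed JSON event to an in-memory queue standing in for the input pipeline. The port and endpoint behavior are assumptions made for illustration.

```python
# Sketch: a minimal HTTP receiver that accepts POSTed JSON events for the input pipeline.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from queue import Queue

input_pipeline = Queue()

class EventReceiver(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length))
        input_pipeline.put(event)   # hand the event to the pipeline for the stream container
        self.send_response(202)     # accepted for processing
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), EventReceiver).serve_forever()
```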

In a particular implementation, Apache Kafka™ is used as the input pipeline 204. Kafka is a distributed messaging system with a publish-subscribe model. Kafka maintains events in categories called topics; events are published by so-called producers and are pulled and processed by so-called consumers. As a distributed system, Kafka runs in a cluster, and each node is called a broker, which stores events in a replicated commit log. In other implementations, different messaging and queuing systems can be used.
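
The publish-subscribe pattern just described can be sketched with the third-party kafka-python client; the topic name, broker address and payload below are placeholders, and the client library is an assumption rather than part of the disclosed platform.

```python
# Sketch: publish an event to a Kafka topic and consume it, using the kafka-python client.
import json
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("device-events", {"device": "thermostat-1", "temperature_c": 21.5})
producer.flush()

consumer = KafkaConsumer(
    "device-events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)
for message in consumer:
    print(message.value)  # events are pulled from the topic's replicated commit log
    break
```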

In one implementation, NRT data streams 103 are queued as batches. A batch is defined as an assemblage or grouping of event tuples, which can also be referred to as a unit of work. In one implementation, a batch is defined on a time-slice basis, including the portion of at least one incoming NRT data stream most recently received within a time window. In another implementation, a batch is defined on a batch-size basis, including the most recently received portion of at least one incoming NRT data stream that is restricted or limited to a certain data size (e.g., one batch includes 10 MB of the most recently received event tuples). In yet other implementations, a combination of the time-slice basis and the batch-size basis is used to define batches.
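
The two batch definitions just described, time-slice based and batch-size based, can be sketched as simple splitters over an iterable of event tuples. The 10 MB figure mirrors the example in the text, while the one-second window and the size estimate are assumptions made for illustration.

```python
# Sketch: group an incoming stream of event tuples into batches by size or by time slice.
import time

def batch_by_size(events, max_bytes=10 * 1024 * 1024):
    """Emit a batch whenever the accumulated tuples reach max_bytes (e.g. 10 MB)."""
    batch, size = [], 0
    for event in events:
        batch.append(event)
        size += len(repr(event).encode())
        if size >= max_bytes:
            yield batch
            batch, size = [], 0
    if batch:
        yield batch

def batch_by_time(events, window_s=1.0):
    """Emit a batch containing the tuples received within each time slice."""
    batch, start = [], time.monotonic()
    for event in events:
        batch.append(event)
        if time.monotonic() - start >= window_s:
            yield batch
            batch, start = [], time.monotonic()
    if batch:
        yield batch
```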

Summary for “Simplified entity management of the lifecycle”

“The discussion of the subject matter in this section is not to be taken to be prior art simply because it has been mentioned in this section. A problem that is mentioned in this section, or related to the subject matter given as background, should not be assumed previously to have been recognized in prior art. This section does not contain any specific approaches. They may be used in conjunction with implementations of the claimed technology.

The technology disclosed implements a state machine that allows for multi-step interaction with an entity. Because it is a simple rule-based authoring system that allows for the specification of different elements and components of complex state machines, such as state definitions, state triggers, conditional state transitions and state actions, the declarative framework can be used over and over again for many applications. The declarative input from a non-technical user is used to create and implement the state machine.

“In today?s world, we deal with massive data volumes, commonly referred to as ‘Big Data?. Web applications that manage millions of Internet users such as Facebook, Instagram, Twitter, and banking websites or online retail shops such as Amazon.com. Are you looking for eBay or Facebook? They must ingest large volumes of data quickly so end users have a real-time experience.

“Internet of Things” is another major contributor to Big Data. (IoT). IoT is about a pervasive presence in the environment of a variety of things/objects that through wireless and wired connections are able to interact with each other and cooperate with other things/objects to create new applications/services. These applications/services are in areas likes smart cities (regions), smart car and mobility, smart home and assisted living, smart industries, public safety, energy and environmental protection, agriculture and tourism.”

“Currently, there is a need to make such IoT applications/services more accessible to non-experts. Non-technical domain experts who are highly valued have been watching from the sidelines of IoT’s ecosystem. This is because IoT relies on technical products that require extensive programming experience. It is imperative that non-experts are able to combine big data computing and analysis without the need for expensive technical consultants.

“There is a great opportunity to offer systems and methods that are simple and easily codeable declarative language-based solutions to big data computing and analysis tasks. This could lead to higher revenue, user retention, better user engagement and improved experience.

This summary provides a quick overview of the various aspects of the exemplary, non-limiting implementations. The detailed description and accompanying drawings will provide a more in depth explanation. However, this summary is not meant to be an exhaustive or comprehensive overview. This summary serves as a brief introduction to the details of various implementations.

The technology disclosed provides a declarative framework to implement a state machine that allows for multi-step interaction with an entity. Because it is a simple rule-based authoring system that allows for the specification of different elements and components of complex state machines, such as state definitions, state triggers, conditional state transitions and state actions, the declarative framework can be used over and over again for many applications. The declarative input from a non-technical user is used to create and implement the state machine.

“Other advantages and aspects of the technology disclosed can also be seen by reviewing the drawings, detailed description, and claims which are included.”

The following description includes a reference to the figures. The claims do not limit the scope of the disclosed technology. Instead, samples are used to illustrate its implementations. Anyone with ordinary skill in art will be able to recognize many equivalent variations in the following description.

The following structure is used for the discussion: The discussion begins with an explanation of the terminology used. Next, an introduction will describe some of the technical issues and solutions offered by different implementations. An architectural overview of several implementations will then be provided. A state machine that implements an entity management workflow will be described. A few user interface views that are used in different implementations will also be discussed. The next section will discuss more specific actions to implement the system. This includes data entry models and transitive triggers, condition definitions, and data entry models. Finally, we will discuss some specific implementations.

“Terminology”

“Entity”: A thing or object that interacts or communicates with another thing or object and the environment. This includes exchanging data and information about the environment and reacting to real/physical events. The entity provides services for information transfer and analytics. People, web users, wireless/wireless sensors, smart phones and smart watches are all examples of entities. Some implementations use the singular “entity” designation. The singular?entity? and plural?entities? may be used interchangeably in some implementations. These terms are interchangeable in this application to ensure clarity. In some implementations of this application, “entities” is used. are?data sources?”,?users?”, and other actors?

“Internet of Things Platform: What is the?Internet of Things platform (IoT?)? The integrated environment described herein can be defined as one that gathers and processes large amounts of data from multiple entities in real-time (or near real-time), often with low latency. Sometimes, processing logic can also be applied to the data in order to generate real-time and near-real-time analytics. An IoT platform can be described as an integrated framework that uses a combination of batch and stream mode computation to periodically generate aggregates using offline and batch analytics. It also substitutes real-time data streams with real-time results to generate real time analytics. This includes computational tasks such as data mining, statistical processing and predictive analytics. One implementation of the IoT platform allows for high throughput, with nodes processing 1 million tuples per minute. The IoT platform also offers insight to end-users through rich visualization using GUI and/or API tools such as standard graphs and bars, charts, and overlaid infographics.

“Near Real-Time data stream: A near real time (NRT) data streaming is a collection events that are recorded as they are generated. An NRT data stream can be described as an unbounded sequence or tuples of data. A NRT data stream can emit one million events per second in some implementations.

“Event”: A data unit that identifies an event and provides information about it. An event can provide information about an entity in one implementation. An event can include three components: a timestamp that indicates when it occurred; a set dimensions that indicate various attributes about the event; as well as metrics. Event can also be generated by users, such as mouse clicks or keystrokes. Statistics (e.g. Latency/numbers of bytes, etc. There are many other options, including program loading and errors. Events can include network flow variables as well as information about user and group members, device information, device information, and information on applications (e.g. resource condition, variables, and custom triggered events). A message, token or count is a value, token, pattern, or marker that can easily be identified within a NRT stream. This includes network traffic, specific error signals or conditions, thresholds crossed and counts accumulated. An example of a user interaction with Pardot. A sequence of events occurs in the context of a session. These are the main events to be noted: (a) login?provide user credential to a hosted server to authenticate the user; and (b) application transaction?execute a series of application-level transactions, e.g. Add leads or create new operations. Log-out?this event ends the session with the server. Deep packet inspection logic can be used to track event data and store them in an event repository. In some cases, this application is interchangeable with?events’. In some implementations, ‘events? can be interchanged with?data?. Another example of events that are generated by or about entities is telemetry from wearable sensors, data from smart watches, and data and/or metadata generated when a user uses a feature in an application (such Microsoft Word). Trip or journey data generated using a GPS device used by a driver to start or complete a trip, speed and location data from a vehicle, data generated via a medical device reporting a sensor reading etc.

"Pipeline": A pipeline is defined as a series of grouped, interrelated events. In one implementation, the grouping is done on a type-by-type basis. In another implementation, the grouping is done on a batch-by-batch basis.

"Online Social Network": An "online social network" is defined as any combination of software, protocols and/or hardware configured to allow a community of users, individuals and/or other entities to share information, resources and the like over a computer network (such as the Internet). An online social network uses a platform such as a website, blog or forum to foster interaction, engagement and information sharing. Examples of online social networks include Facebook, Twitter, Flickr, Picasa, Digg, RSS, blogs, Reddit, LinkedIn, Wikipedia, Pinterest, Google Plus+, MySpace, Bitly and the like. In some implementations, the term "online social network" is used interchangeably in this application with "social network", "social media site", "social networking service" and "social network entity", and vice versa.

"Application Programming Interface (API)": An "application programming interface (API)" is defined as a packaged collection of code libraries, methods and fields that belong to a set of classes, including their interface types. The API defines the way that developers and programmers can use the classes for their own software development, simply by importing the relevant classes and writing statements that instantiate the classes and call their methods and fields. An API can also be a source-code-based specification intended to be used as an interface by software components to communicate with each other. An API can include specifications for routines and data structures. In essence, an API provides an interface for developers and programmers to access the underlying platform capabilities and features of online social networks. For example, Twitter's Search API allows developers and programmers to poll Twitter's data by entering a specific username or search term, giving access to data sets that already exist from tweets that have occurred. Through the Search API, developers and programmers can request tweets that match search criteria such as keywords, usernames, locations and named places. As another example, Twitter's Streaming API pushes data in real time as tweets are published, nearly immediately. With the Streaming API, developers and programmers register a set of criteria (e.g. keywords, usernames, locations, named places), and as tweets match the criteria they are pushed directly to the developers and programmers. In yet another example, Twitter Firehose pushes data to developers and programmers in near real-time and guarantees delivery of all tweets that match the set criteria.

"Application": An application refers to a network-hosted service accessed via a uniform resource locator (URL). Examples include software as a service (SaaS) offerings, platform as a service (PaaS) offerings and infrastructure as a service (IaaS) offerings. Examples of applications include Salesforce1 Platform, Sales Cloud, Data.com, Service Cloud, Desk.com, Marketing Cloud, Pardot, Wave Analytics, Box.net and Concur. An application offers insights to end-users in the form of rich visualization, using GUI and/or API based tools such as standard graphs, bars, charts and overlaid infographics.

"Entity Experience Operation": An "entity experience operation" is defined as an orchestrated effort, usually undertaken by an operator, to enable effective user management, resource provisioning, application life cycle management, user engagement, traffic monitoring, activity tracking and provisioning for application modeling.

"Identification": As used herein, the "identification" of an item of information does not necessarily require the direct specification of that item of information. Information can be "identified" in a field by simply referring to the actual information through one or more layers of indirection, or by identifying one or more items of different information that are together sufficient to determine the actual item of information. In addition, the term "specify" is used herein to mean the same as "identify".

"Physical Thread": Once deployed, a container operates over a set of so-called "physical threads". A physical thread utilizes a processor core of a worker node and runs inside a set of code processes (e.g. Java processes) that are distributed over the worker node, with no more than one physical thread per core. A physical thread also carries out the logic of a set of tasks/jobs assigned to different elements and components (e.g. emitters and transformers) of a container.

"Long Tail Task Sequence": A "long tail task sequence" is a task sequence that consumes dedicated computing resources which, even when sized correctly for the beginning of the task sequence, become excessive as the task sequence tails off. An example of a long tail task sequence is a gaming company giving out fantasy football game tokens during the Super Bowl. After the Super Bowl, demand for fantasy football tapers off, so use of the game tokens decreases, as does the number of requests made electronically at events to redeem them. However, the gaming company still honors the unused tokens, which are redeemed gradually over a long period after the Super Bowl. This extended lull characterizes a long tail task: because it does not require as many computational resources as the Super Bowl surge, it can be run on fewer computational resources than originally allocated.

"Emitter": Data enters a container through a so-called "emitter". Emitters are event tuple sources for a container and are responsible for getting event tuples to the container. In one implementation, emitters pull event tuples from input queues. In some implementations, emitters include user-specified conversions, such that they consume byte strings from an input queue and forward them as tuples to downstream transformers. An emitter retrieves one or more tasks/jobs that are executed by one or more physical threads of a worker node.

"Transformer": A transformer is a computation unit of a container. It processes the incoming event tuples in the container and passes them to the next set of transformers downstream in the container. A transformer passes one or more tasks/jobs downstream, typically to be further transformed by one or more physical threads of a worker node.
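To make the emitter and transformer roles concrete, here is a minimal, hypothetical sketch (not the disclosed implementation): an emitter pulls byte strings off an input queue and converts them to tuples, and a transformer processes each tuple and passes the result downstream.

```python
import json
from queue import Queue

input_queue = Queue()  # stands in for the container's input queue

def emitter(queue):
    """Pull byte strings off the input queue and yield event tuples."""
    while not queue.empty():
        raw = queue.get()                       # byte string from the input queue
        yield json.loads(raw.decode("utf-8"))   # user-specified conversion to a tuple/dict

def transformer(events):
    """Compute over each event tuple and pass the result downstream."""
    for event in events:
        event["processed"] = True               # placeholder computation
        yield event

# Hypothetical usage: two events flow from the queue through the emitter and transformer.
input_queue.put(b'{"type": "click", "user": "u-1"}')
input_queue.put(b'{"type": "login", "user": "u-2"}')
for result in transformer(emitter(input_queue)):
    print(result)
```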

“Introduction”

We describe a system and various implementations for simplifying, for a non-programming user, the creation of an entity management workflow. The technology disclosed includes generating a data entry columnar that accepts declarative input specifying a state machine which implements an automated multi-step progression of interaction with an entity. The data entry columnar includes at least one column that lists the states in the multi-step progression, time-based and event-based state transition triggers, conditions for the state transitions, and actions responsive to the state transitions, and it accepts data indicating inputs to the columnar. The state transition triggers can alternatively be defined by timers or by events that cause the state changes. The technology disclosed further includes evaluating conditions of a state transition against at least one value in a database field referred to by the condition, and responding to the conditions by executing alternative actions during the state transitions.
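As a hedged illustration of what such declarative input might look like, the sketch below encodes a column-style specification (states, a time-based trigger, an event-based trigger, a condition over a database field, and alternative actions) as plain data and interprets it with a small, generic engine. All rule names, field names and values are hypothetical and are not drawn from the specification.

```python
from datetime import timedelta

# Declarative, column-style specification of a simple entity lifecycle.
# Columns: state, trigger (event- or time-based), condition on a database field, action, next state.
workflow = [
    {"state": "new",     "trigger": {"event": "reading_received"},
     "condition": ("temperature", ">", 100), "action": "send_alert",    "next": "alerted"},
    {"state": "new",     "trigger": {"after": timedelta(days=7)},
     "condition": None,                      "action": "send_reminder", "next": "stale"},
    {"state": "alerted", "trigger": {"event": "acknowledged"},
     "condition": None,                      "action": "close_case",    "next": "closed"},
]

def condition_holds(condition, record):
    """Evaluate a condition against a value in a (hypothetical) database record."""
    if condition is None:
        return True
    field_name, op, value = condition
    actual = record.get(field_name)
    if actual is None:
        return False
    if op == ">":
        return actual > value
    if op == "<":
        return actual < value
    return actual == value

def step(current_state, event_name, record):
    """Find a rule whose trigger and condition match, run its action, and transition."""
    for rule in workflow:
        if rule["state"] != current_state:
            continue
        if rule["trigger"].get("event") != event_name:
            continue  # time-based triggers would be fired by a timer instead
        if condition_holds(rule["condition"], record):
            print("action:", rule["action"])
            return rule["next"]
    return current_state  # no matching rule, stay in the same state

# Hypothetical usage: a device reading arrives while the entity is in the "new" state.
print(step("new", "reading_received", {"temperature": 105}))  # action: send_alert -> "alerted"
```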

The Internet of Things (IoT) is a revolutionary new technology. Because objects or things can communicate information about themselves and their surroundings, they can make or enable context-related decisions. Big data is the foundation of IoT. However, while big data computing and analytic systems like Pig were designed for programmers with advanced skills, most companies need big data computing and analytics to be usable by other people, such as domain experts (e.g. marketers, CEOs, sales representatives) who are not code developers. Such users often do not have the time or the skill to build fully developed workflow processes based on big data.

Also, the IoT applications currently used to implement entity lifecycle management operations are often created by technical experts for technical experts, making them hard for non-technical users to work with. Current IoT applications are not suitable for novice programmers with little experience in formal methods. Furthermore, commercially and research-driven IoT solutions are primarily applications rather than flexible frameworks, and they require extensive reprogramming before they can be used in different situations and for different purposes.

The technology disclosed provides a declarative framework to implement a state machine that allows for multi-step interaction with an entity. Because it is a simple rule-based authoring system that allows for the specification of different elements and components of complex state machines, such as state definitions, state triggers, conditional state transitions and state actions, the declarative framework can be used over and over again for many applications. The declarative input from a non-technical user is used to create and implement the state machine.

Our world today is composed of the 1s and 0s that make up the binary code created by the streams of data flowing through every sector of the global economy. How much data is that?

According to IBM, 2.5 exabytes of data were created each day in 2012. That is 2.5 billion gigabytes of data created in a single day. Facebook alone generated 500,000 gigabytes a day in the same year. Data.gov is an initiative by the U.S. government to make data accessible for analysis. Over the past decade, data storage and processing costs have fallen by a factor of more than 1,000. However, once the data is stored, it can be difficult to retrieve and use.

According to The Boston Consulting Group, one third of all bank data is never used. A big part of this is that 75% of our data is unstructured, making it difficult to index and retrieve.

Where does all this data come from? An obvious source is the data generated from legacy systems of record. Another is the data generated from cloud software, as evidenced by the rapid adoption of Software as a Service (SaaS) as the new business model.

It is data that is being created every second by mobile phones, devices and sensors that are placed on just about everything that can be tracked in the physical world. And social media is arguably the largest data stream, being created at astronomical volumes.

Forget text messages; think about all the videos and photos being uploaded to social media sites like YouTube, Facebook and Instagram.

The smartphone is the main enabler of this data tsunami. PCs and feature phones (mobile phones that are not smartphones) are both in decline, while smartphones are growing, even in regions such as sub-Saharan Africa. And where there is a smartphone, there is an application for almost every human activity.

Applications are the smartphone's control point for all of the real-time data streams created by our fingers, the camera, the motion sensor, the GPS antenna, the Bluetooth antenna and more. While smartphone manufacturers continue to add more capabilities and sensors to these devices, developers continue to create applications that delight us all.

According to The Economist, in 2015 half of all adults owned a smartphone; by 2020 that number will rise to 80%. The biggest ripple, however, is just beginning to form. It is the "sensorification" (a term coined by Andreessen Horowitz) of the physical world. Combining cheap, connected, miniature computers with sensors will create a new world of intelligent, connected products.

This new category of technology is called the "Internet of Things" (IoT). General Electric takes it one step further, using the term "industrial internet" to include things like locomotives, jet engines and MRI machines.

The Internet of Things is a transformative wave of IT innovation. According to the Harvard Business Review, this is the third wave of IT-driven competition, the first two waves having been brought about by mainframes and minicomputers. It is clear that harnessing and analyzing these data streams will be the greatest challenge IT and businesses will face in the coming decade.

The apt term for this massive amount of data is "Big Data". Traditional data storage technology cannot handle these large, high-speed volumes. And the challenges do not end there.

Enterprises must also figure out how to capture the data and search, analyze and visualize it, as well as connect it to their business and customer data. The ultimate goal is to use it for predictive analytics and intelligent decision-making in real time. This will require an IT transformation from systems of record to systems of intelligence.

Before the advent of big data, the term business intelligence (BI) had already become a common phrase in the 1990s, when a number of new BI software vendors entered the market.

BI provided the methods and tools required to transform data into meaningful and useful information for the business. During this period, BI's functions were fairly basic: gather and organize the data and visualize it in a presentable way.

Innovations continued, and the introduction of data warehouses dramatically reduced the time required to access enterprise data from systems of record. Despite these innovations, a core challenge remains: these data warehouses require deep expertise and extensive training in BI tools.

Such tools are still not accessible to mere mortals in the business. If you know the right questions to ask ahead of time, most BI tools will provide the answers; but sometimes you simply do not know what questions to ask. In short, these tools do not allow business users to obtain the insights when, how and where they need them.

Fortunately, all of this is changing. For the first time, data analytics tools are being built that run entirely in the cloud. IT is not required to provision hardware or install and configure the data platform, and all of the integration and schema development takes only a few days. This technological innovation has done away with the two-step service bureau model in which every request from the business required IT involvement.

These innovations open the door to the democratization of data, allowing business users not only to access data but also to take part in its analysis. This enables a self-service model in which users can get answers directly without having to consult IT, analysts or data scientists, and can quickly find and share information with their business colleagues. It is not necessary to know in advance what questions to ask of the data; business users can quickly pose questions that help them gain insight into the data sets.

Mobile is also a key driver of this democratization. Workers can now access data and answers to business questions from their smartphones, tablets and wearables, anywhere and anytime. The democratization of data is a crucial step in the journey toward building systems of intelligence.

While data democratization has many benefits, it is mainly about empowering business users to access and analyze data from legacy systems of record as well as cloud-based business applications. Some of these new BI tools provide real-time data access and analysis, but they are not designed to capture and analyze the actual real-time streams coming from smartphones, wearables and the upcoming explosion of sensors in the physical world.

Real-time data streams provide information that is very different from the historical, backward-looking data most BI platforms and tools harness. Real-time data is perishable, so it must be detected and acted upon quickly. The concept of "time to insight" is therefore one of the key performance indicators for systems of intelligence. These insights will also require new ways of packaging and consuming the information: it must be presented in context, at the right time, and in a manner that cuts through the noise of data we receive in our day-to-day working lives.

Systems of intelligence require knowing what to do with data insights and how to deliver them to the right worker based on their job function. These systems can be set up quickly and are as democratic as traditional BI tools, and they are designed to handle the daily flood of data we face at work. Consumer applications such as traffic and social media already help us deal more intelligently with the things that matter most to us.

The bar set by consumer applications is the same one that applications connected to our systems of intelligence must meet: one-click installation, a beautiful and simple user interface, and access via a mobile device. Harnessing and analyzing real-time data streams is about enabling action, as well as the ability to predict what will happen. This has traditionally been the domain of data scientists, who handle everything from statistical modeling and computation to visualization and reporting, and who create models that look at historical trends to predict future patterns. Enterprises face a major human-resource challenge when trying to create computational models that can analyze large volumes of real-time data streams.

The McKinsey Global Institute estimates that the United States could face a shortage of between 140,000 and 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts who can use big data analysis to make informed decisions.

Few companies have data scientists who can both analyze big data streams in real time and make something useful out of the analysis. Many companies are unable to fill open positions with qualified candidates, and universities will not graduate enough data scientists to meet future demand. Suppose you do get your data scientists in place to organize and analyze the data: what next? How do you make it actionable? How do you teach your line managers and directors to interpret the analysis and make the right decisions?

While systems of intelligence will not replace data scientists anytime soon, they will greatly reduce the need to hire them. Such systems harness the collective wisdom, expertise and gained insights of an organization to make intelligent decisions, and that collective intelligence can be expressed as rules in a rules engine. These are powerful tools that allow business users to apply this collective intelligence by authoring simple, logical business rules that evaluate and analyze real-time data streams and make intelligent decisions.
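A minimal sketch of this idea, assuming entirely hypothetical rule and event shapes: business rules are expressed as simple predicate/action pairs and evaluated against each event in a stream.

```python
# Hypothetical business rules: each rule pairs a predicate over an event with an action.
rules = [
    {"name": "high_value_lead",
     "when": lambda e: e.get("type") == "signup" and e.get("score", 0) > 80,
     "then": lambda e: print("notify sales about", e["user"])},
    {"name": "churn_risk",
     "when": lambda e: e.get("type") == "login" and e.get("days_since_last", 0) > 30,
     "then": lambda e: print("send re-engagement email to", e["user"])},
]

def evaluate(event):
    """Apply every matching rule to an incoming event."""
    for rule in rules:
        if rule["when"](event):
            rule["then"](event)

# Example events flowing through the rules
evaluate({"type": "signup", "user": "u-1", "score": 91})
evaluate({"type": "login", "user": "u-2", "days_since_last": 45})
```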

“Data science is the art of formulating a quantitative query that can be answered using data, cleaning and analyzing data, and communicating the result to the relevant audience.”

The low-hanging fruit that enterprises will first harvest from their systems of intelligence is the value derived from simple business rules. As organizations gain more insight from these systems, and as more sensors and devices enter the equation, machine learning and algorithms will play an increasing role in intelligent decision-making.

Enterprises will turn to artificial intelligence more often because they will never be able to employ enough business analysts and data scientists to sort through all the data. Artificial intelligence has become very sophisticated, and credit card fraud detection is one great example.

Although artificial intelligence does not completely eliminate the need for a trained fraud expert, it dramatically reduces the number of suspicious cases that require human investigation.

As organizations begin to spin up big data initiatives, there will be many things to consider: the right people, tools and methods among them. The technology is unbounded in terms of the number and sizes of data sources, and it can handle any ad-hoc question to whatever depth you wish.

The next step is systems of intelligence that start to tell customers what questions to ask. Getting there requires a blueprint for systems of intelligence.

The data streams come from signals emanating in real time from smartphones and consumer wearables such as the Fitbit or Apple Watch. The control point for these signals is the application.

“The application is what gives context to the raw data created by human inputs or the sensors embedded within these devices.”

Wikipedia defines a sensor as a device that detects or senses events or changes in quantities in its environment and produces a corresponding output, usually as an optical or electrical signal.

The digital plumbing, or application programming interfaces (APIs), is what ties all of this together. Along every element of the data stream flow shown in the schematic, APIs enable the end-to-end transport of high-speed, high-volume data in the system. The term API may not be in common usage outside of IT today, but it will be, in much the same manner that other terms of art and "internet" have become common language in business communication.

The largest data streams will come from connected consumer products and industrial machines. These signals will emanate from product sensors in our cars, our homes, our security systems, and everywhere else that is relevant.

Signals from the industrial internet will emanate from sensors on any machine or piece of equipment that requires monitoring, maintenance and repair. Anything that can be digitally monitored with sensors in the environment will be. Systems of intelligence must be able to identify these signals and harness them.

In order to capture the high-volume, high-speed data signals, a digital "watchdog" is needed to monitor these signal inputs. If anything significant happens with these digital signals, an event is registered. A simple example of an event is when your car's temperature sensor goes off to warn you of the danger of freezing temperatures.

Systems of intelligence will need the technology to monitor and ingest these data streams. The events created by the digital signals are broadcast via messages, which move the information through the system so that the digestion process can proceed as planned. This is where filters can begin to analyze these data streams. The system must be capable of handling growing volumes and faster data flows, and it must not lose data even if it crashes.

Once data has been captured and processed, it moves into the digestion phase. This is where much of the magic happens: it involves the monitoring and analysis of real-time data streams. After the data has been analyzed and processed, it must be put somewhere.

The data streams flowing in are not suited to traditional database storage, such as structured query language relational databases. This requires specialized technology that can store and handle large data sets, an essential component of systems of intelligence.

Another important component of this system is the ability to apply filters, in the form of business rules, to the analysis of the data streams. This begins the process of eliminating human error by directly incorporating the collective wisdom and expertise of the company into the system. Artificial intelligence, in the form of machine learning and algorithms, can also be applied to these data streams for further analysis.

Enterprise data is made up of systems of record and systems of engagement, which are the core of enterprise IT today. As IT moved from mainframes and minicomputers to the Internet, systems of record were primarily about moving paper and manual processes into digital form. Systems of record are about automating daily activities, capturing their information by-products, and reporting what are essentially historical documents.

Systems of engagement are fundamentally different from systems of record in that their focus is on the social nature of conversations and interactions with customers and partners. These conversations are shaped by social media and the consumerization of IT. Rather than being document-based digital artifacts, systems of engagement add elements such as time, context and place. Systems of record will not disappear; enterprises simply need to embrace next-generation communication and collaboration through systems of engagement.

Systems of engagement and systems of record are essential elements in providing context to the data streams and filtering. Without a complete picture of the customer or partner, you cannot understand the data streams and their outputs. These systems are essential for illuminating the analytical insights and intelligent decisions driven by systems of intelligence.

After ingesting, digesting and applying enterprise context to the data streams, intelligent outputs can be produced and delivered in the correct form, at the right moment, and to the right channel. The first two channels are dashboards and insights. Dashboards provide context and visualization of what is happening so that people can take action, such as launching new initiatives or tweaking existing marketing programs. Insights, on the other hand, are focused on providing real-time decision-making; this is the key difference between analytical insights and dashboards. Expressing the collective knowledge and expertise of the organization through business rules goes a long way toward eliminating bad and avoidable decisions. As signals increase, data scientists will become more adept at applying their models and methods to create machine learning algorithms that make intelligent and predictive decisions.

The enterprise is now ready to start transforming the business by applying the intelligent outputs. Our core premise is that every application, device and connected product is a customer. The role of the IoT platform is to connect the device data to the user success platform to engage customers through sales, marketing, customer service, communities, applications and analytics.

The technology disclosed simplifies, for non-programming users, the creation of an entity management workflow using computer-implemented systems. The disclosed technology can be used in any computer-implemented system, including a database system or a multi-tenant environment, and with a relational database implementation such as an Oracle compatible database implementation, an IBM DB2 Enterprise Server compatible relational database implementation, a MySQL or PostgreSQL compatible relational database implementation, or a Microsoft SQL Server compatible relational database implementation, or with a NoSQL non-relational database implementation such as a Vampire compatible non-relational database implementation, an Apache Cassandra compatible non-relational database implementation, a BigTable compatible non-relational database implementation, or a DynamoDB compatible non-relational database implementation.

Moreover, the technology disclosed can be used with multiple computer-implemented systems that communicate and cooperate. The technology can be implemented in many ways: as a method, an apparatus or a system, or as a product that includes a computer-usable medium with computer-readable program code embodied therein.

In addition, the technology disclosed may be implemented using different programming models such as MapReduce, bulk synchronous programming, MPI primitives and others, or with different stream management systems such as Apache Storm, Apache Spark, Apache Kafka, Truviso, IBM Info-Sphere, Borealis and Yahoo! S4.

“IoT Platform & Stream-Batch Processing Framework”

We describe a system and several implementations for simplifying the creation of entity management workflows by non-programming users. The system and its processes will be illustrated with reference to FIG. 1 and FIG. 2, which show an architectural level schematic of a system in accordance with an implementation. Because FIG. 1 and FIG. 2 are architectural diagrams, certain details are intentionally omitted to improve the clarity of the description. The discussion of FIG. 1 and FIG. 2 will be organized as follows: first, the elements of the respective figures will be described, followed by their interconnections; then, the use of the elements in the system will be described in greater detail.

FIG. 1 illustrates one implementation of an IoT platform 100. IoT platform 100 comprises data sources 102, input connectors 104, stream container(s) 106, batch container(s) 108, rich contextual data store 110, orchestration system 112, output connectors 122 and application(s) 123. Rich contextual data store 110 contains various storage nodes C1-C3. Orchestration system 112 contains a data entry columnar 114, an explorer engine, a live dashboard builder engine 128 and a morphing engine 117. Application(s) 123 include various SaaS, PaaS and IaaS offerings.

FIG. 2 shows a stream processing framework 200 used in the platform illustrated in FIG. 1, according to one implementation of the technology disclosed. Framework 200 contains data sources 102, input pipeline 204, stream container 106, rich contextual data store 110 and output pipeline 218. Stream container 106 contains an emitter tier 206, a scheduler 208, a coordinator 210 and a worker tier 214.

The interconnection of the elements of IoT platform 100 and streaming framework 200 will now be described. A network (not shown) couples the data sources 102, the input connectors 104, the stream container 106, the batch container 108, the rich contextual data store 110, the orchestration system 112, the columnar 114, the output connectors 122, the application(s) 123, the input pipeline 204, the emitter tier 206, the scheduler 208, the coordinator 210, the worker tier 214 and the output pipeline 218, all in communication with each other. The actual communication path may be point-to-point over public and/or private networks. Some items, such as data from data sources 102, may be delivered indirectly, e.g. via an application store (not shown). The communications can occur over a variety of networks, such as private networks, VPNs, MPLS circuits or the Internet, and can use appropriate APIs and data interchange formats such as REST, JSON, XML, SOAP and/or JMS. All communications can be encrypted. The communication generally takes place over a network such as a LAN (local area network), WAN (wide area network), telephone network (Public Switched Telephone Network (PSTN)), Session Initiation Protocol (SIP), wireless network, point-to-point network, star network, token ring network, hub network, or the Internet, inclusive of the mobile Internet, via protocols such as EDGE, 3G and 4G LTE. A variety of authorization and authentication techniques, such as username/password, OAuth, Kerberos and SecureID, can also be used to secure the communications.

Having described the elements of FIG. 1 (IoT platform 100) and FIG. 2 (streaming framework 200) and their interconnections, the system will now be described in greater detail.

Data sources 102 are entities such as a smart phone, a WiFi access point, a sensor or sensor network, a mobile application, a web client, a log from a server, a social media site, etc. In one implementation, data from data sources 102 are accessed via an API (Application Programming Interface) that allows sensors, devices, gateways, proxy servers and other kinds of clients to register data sources 102 in the IoT platform 100 so that data can be ingested from them. Data from the data sources 102 can include events in the form of structured data (e.g. user profiles and the interest graph), unstructured text (e.g. tweets) and semi-structured interaction logs. Examples of events include clicks on links, impressions of recommendations, server log entries, user identities (such as user handles, user IDs and user names), content posted by users to respective feeds on social network sites, social graph data, metadata including comments posted in reply to prior posts, events, news articles and so on. Events can be stored in semi-structured data formats such as JSON, BSON (Binary JSON), XML or Thrift objects, which include string fields (or columns) and corresponding values of potentially different types such as numbers, strings, arrays and objects. JSON objects can be nested, and the fields can be multi-valued, e.g. arrays or nested arrays.
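A short, hypothetical example of such a semi-structured event, with string fields, a nested object and a multi-valued array (the field names are illustrative only):

```python
import json

# Hypothetical semi-structured event (JSON) with nested fields and a multi-valued array.
event_json = """
{
  "user": {"id": "u-42", "handle": "astro"},
  "action": "post",
  "tags": ["iot", "analytics"],
  "metrics": {"chars": 120, "links": 1},
  "timestamp": "2015-06-01T12:00:00Z"
}
"""
event = json.loads(event_json)
print(event["user"]["handle"], event["tags"])
```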

Near real-time (NRT) data streams 103, as described supra, are collections of events that are registered as they are generated. In one implementation, events are delivered over HTTP to input pipeline 204. In another implementation, events are transmitted via POST requests to a receiver operating on behalf of input pipeline 204. For instance, the Twitter Firehose API, accessible via Twitter-affiliated companies like Datasift, nTweetStreamer and twitter4j, provides unbounded time-stamped events (called tweets) as a stream of JSON objects, along with metadata about those tweets including timestamp data, user information and location. These JSON objects are stored in a schema-less or NoSQL data store such as Apache Cassandra or Google's BigTable. The parsed JSON objects are stored using key spaces that are equivalent to a database in SQL. Each key space is divided into column families that resemble tables and consist of rows and sets of columns.

Input connectors 104 obtain data from data sources 102 and convert it into an input format that is consumable by containers 106 and 108. In one implementation, the input connectors 104 perform full data pulls and/or incremental data pulls from the data sources 102. In another implementation, the input connectors 104 also access metadata from the data sources 102. For instance, an input connector can issue a "describe" API call to retrieve metadata about an entity, and then issue the appropriate API calls to fetch that entity's data. In some implementations, a connector SDK is used to create customized input connectors 104 for individual data sources 102.
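The "describe"-then-fetch pattern above can be illustrated with an entirely hypothetical connector interface; neither the class nor the client API below is part of the disclosed platform.

```python
class ExampleConnector:
    """Hypothetical input connector: describe a data source, then pull and normalize its data."""

    def __init__(self, client):
        self.client = client  # stands in for a data-source API client

    def describe(self, entity):
        # e.g. returns {"fields": ["id", "name"]}
        return self.client.call("describe", entity=entity)

    def fetch(self, entity, since=None):
        fields = self.describe(entity)["fields"]
        # Incremental pull: only records newer than `since`; full pull when since is None.
        for record in self.client.call("query", entity=entity, fields=fields, since=since):
            yield {f: record.get(f) for f in fields}  # normalize to the container's input format


class FakeClient:
    """Stand-in client so the sketch runs without a real data source."""
    def call(self, op, **kwargs):
        if op == "describe":
            return {"fields": ["id", "name"]}
        return [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]

connector = ExampleConnector(FakeClient())
print(list(connector.fetch("lead")))
```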

In other implementations, a workflow definition includes a collection of connectors and operators as well as the order in which they are executed. In one implementation, such a workflow is specified as a directed graph, where connectors and operators are graph nodes and edges represent the data flow. In yet another implementation, multiple data streams 103 are joined and transformed before being fed to the containers 106 and 108.
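A minimal sketch of such a workflow definition, assuming hypothetical operator names: nodes are connectors and operators, edges give the order of data flow, and a topological sort yields an execution order consistent with the edges.

```python
# Hypothetical workflow: nodes are connectors/operators, edges describe the data flow.
workflow = {
    "nodes": ["twitter_connector", "parse", "filter_spam", "store"],
    "edges": [("twitter_connector", "parse"), ("parse", "filter_spam"), ("filter_spam", "store")],
}

def execution_order(graph):
    """Topological sort: run each operator only after all of its upstream operators."""
    incoming = {n: 0 for n in graph["nodes"]}
    for _, dst in graph["edges"]:
        incoming[dst] += 1
    ready = [n for n, deg in incoming.items() if deg == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for src, dst in graph["edges"]:
            if src == node:
                incoming[dst] -= 1
                if incoming[dst] == 0:
                    ready.append(dst)
    return order

print(execution_order(workflow))  # ['twitter_connector', 'parse', 'filter_spam', 'store']
```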

The batch processing framework operating in container(s) 108 generates business insights using OnLine Analytical Processing (OLAP) queries, which are stored in rich contextual data store 110. In one implementation, events are stored in batch container(s) 108 as a backup of the raw events, so that batch processing jobs can be run at any time. In some implementations, batch container(s) 108 provides raw counts as well as descriptive statistics such as mean, median and percentile breakdowns. In one implementation, analytics tools such as Scalding and Pig are included in batch container(s) 108 to provide retrospective analysis, machine learning modeling and other batch analytics. In yet other implementations, batch container(s) 108 is used to correct errors made by the stream container 106 or to handle upgraded capabilities by running analytics on historical data and recomputing the results. An example of a batch processing framework is the Hadoop distributed file system (HDFS) implementing a MapReduce programming model.
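As a hedged illustration of the kind of batch output mentioned above (raw counts, mean, median and percentile breakdowns), computed here with the Python standard library rather than Scalding or Pig, and using made-up measurements:

```python
import math
import statistics

# Hypothetical batch of latency measurements collected from historical events.
latencies_ms = [12, 18, 25, 31, 40, 47, 55, 63, 72, 90]

def percentile(values, p):
    """Nearest-rank percentile: the value below which p percent of observations fall."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

print("count:", len(latencies_ms))
print("mean:", statistics.mean(latencies_ms))
print("median:", statistics.median(latencies_ms))
print("p95:", percentile(latencies_ms, 95))
```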

Batch container(s) 108 ingest event tuples from respective input pipelines that collect data for a plurality of NRT data streams. In some implementations, multiple NRT data streams can be assigned to a single pipeline, and multiple pipelines can be assigned to a single batch container.

Stream processing framework 200 provides near real-time (NRT) processing of sequences of unbounded events to deliver immediate analytics and insights. In one implementation, framework 200 processes one million events per second per node. Framework 200 can be implemented using one or more stream processors such as Apache Storm and Apache Samza, or a batch-stream processor such as Apache Spark. Framework 200 includes an API that allows jobs to be written that execute operations over a sequence of event tuples.

Events are ingested into framework 200 by input pipeline 204, which reads data from data sources 102 and holds events for consumption by stream container 106. In one implementation, input pipeline 204 is a single delivery endpoint for events entering the container 106. Examples of input pipeline 204 include Apache Kafka and Flume, which provide a single delivery endpoint for events entering the container 106. In some implementations, the input pipeline 204 includes a listener capable of listening to NRT data streams 103 and data flows originating from data sources 102 by connecting with their respective APIs (e.g. Chatter API, Facebook API (e.g. Open Graph), Twitter API) via the Internet. In some implementations, a listener includes heterogeneous instances responsible for the intake of data from different data sources 102. According to one implementation, the input pipeline 204 can be configured to receive data over the network using an application protocol layer, or other higher protocol layers such as the HTTP protocol layer, among many standard and proprietary protocol layers. These higher protocol layers can encode, package and/or reformat data for sending and receiving messages over a network layer such as Internet Protocol (IP) and/or a transport layer such as Transmission Control Protocol (TCP) and/or User Datagram Protocol (UDP).

In a particular implementation, Apache Kafka is used as the input pipeline 204. Kafka is a distributed messaging system with a publish-subscribe model. Kafka maintains events in categories called topics. Events are published by so-called producers and are pulled and processed by so-called consumers. As a distributed system, Kafka runs in a cluster; each node is called a broker, which stores events in a replicated commit log. In other implementations, different messaging and queuing systems can be used.
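A minimal sketch of the publish-subscribe pattern described above, using the third-party kafka-python client and assuming a broker at localhost:9092 and a hypothetical topic name; the disclosed technology does not prescribe this client or configuration.

```python
from kafka import KafkaProducer, KafkaConsumer  # third-party package: kafka-python

# Producer publishes events to a topic on the broker.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("device-events", b'{"sensor": "temp-1", "value": 21.5}')
producer.flush()

# Consumer subscribes to the topic and pulls events for processing.
consumer = KafkaConsumer(
    "device-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating if no new events arrive
)
for message in consumer:
    print(message.value)
```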

In an implementation, NRT data streams 103 are queued in input pipeline 204 as batches. A batch is defined as an assemblage or grouping of event tuples, also referred to as units of work, partitioned on a time-slice basis and/or a batch-size basis. A time-slice based definition partitions at least one incoming NRT data stream by its most recently received portion within a time window. A batch-size based definition partitions at least one incoming NRT data stream by its most recently received portion limited or restricted by a data size (e.g. one batch contains 10 MB of the most recently received event tuples). In other implementations, a combination of time-slice basis and batch-size basis is used to define batches.
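A hedged sketch of these two batch definitions: the stream is cut into a new batch either when a time slice elapses or when the accumulated data reaches a size limit. The limits and the in-memory stream below are illustrative only.

```python
import time

def batches(stream, max_seconds=1.0, max_bytes=10 * 1024 * 1024):
    """Group an unbounded stream of byte strings into batches by time slice or data size."""
    batch, size, started = [], 0, time.monotonic()
    for tuple_bytes in stream:
        batch.append(tuple_bytes)
        size += len(tuple_bytes)
        if time.monotonic() - started >= max_seconds or size >= max_bytes:
            yield batch  # emit the current unit of work
            batch, size, started = [], 0, time.monotonic()
    if batch:
        yield batch      # flush the final partial batch

# Hypothetical usage with a small in-memory stream and a tiny size limit.
events = [b"a" * 100 for _ in range(50)]
for b in batches(iter(events), max_seconds=0.5, max_bytes=1000):
    print(len(b), "tuples in batch")
```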
