Software – Venkatraman Iyer, Darren S. Jue, Robert G. Blankenship, Fulvio Spagna, Ashish Gupta, Intel Corp

Abstract for “High performance interconnect physical Layer”

“Re-initialization of a link can be performed without terminating the link. The link includes a transmitter and a receiver coupled to each lane of a number of lanes. Re-initialization includes transmission of a predetermined sequence on each lane.

Background for “High performance interconnect physical Layer”

Advances in semiconductor processing have made it possible to increase the amount of logic found on integrated circuit devices. Computer system configurations have evolved from one or a few integrated circuits per system to designs including multiple cores, multiple hardware threads, and multiple logical processors. A processor or integrated circuit typically comprises a single physical processor, which may contain any number of cores, hardware threads, and logical processors, and may also include interfaces, memory hubs, controller hubs, memory controllers, and so on.

As greater processing power has become available in smaller packages, smaller computing devices have grown in popularity. Smartphones, tablets, ultrathin notebooks, and other equipment have seen exponential growth. These smaller devices depend on servers for data storage and for complex processing. Consequently, demand in the high-performance computing market (i.e., server space) has also increased. Modern servers typically include not only a single processor with multiple cores, but also multiple physical processors (also referred to as multiple sockets) to provide greater computing power. As computing power increases, so does the importance of communication between sockets.

Interconnects have evolved from traditional multi-drop buses, which primarily handled electrical communication, to fully fledged interconnect architectures that facilitate fast communication. As processor consumption rates continue to increase, corresponding demands are placed on the capabilities of existing interconnect architectures.

“The following description sets forth numerous specific details, such as examples of specific processors and system configurations, specific register configurations, specific instruction types, specific processor pipeline stages, specific packet/transaction configurations, specific transaction names, specific protocol exchanges, specific link widths, and specific implementation and operation details, in order to provide a thorough understanding of the invention. It may be apparent, however, to those skilled in the art that not all of these details are necessary to practice the subject matter disclosed. In other instances, well-known components and methods, such as specific processor architectures, code for the described algorithms, specific firmware operation, low-level interconnect operation, specific logic configurations, specific fabrication techniques and materials, specific compiler implementations, specific expression of algorithms in code, specific power-down and gating techniques/logic, and other operational details of a computer system, have not been described in detail in order not to obscure the present disclosure.

The following embodiments may be described with reference to energy conservation, energy efficiency, and processing efficiency in particular integrated circuits, such as microprocessors or computing platforms; however, they are also applicable to other types of integrated circuits and logic devices. The same techniques and teachings may be applied to other types of circuits or semiconductor devices that might also benefit from the features described herein. The disclosed embodiments can be used in a variety of devices, such as smartphones, tablets, Ultrabooks, server computers, laptops, and desktop computer systems. Examples of handheld devices include cellular phones, Internet protocol devices, digital cameras, personal digital assistants (PDAs), and handheld computers. Similar techniques can also be applied to a high-performance interconnect. Embedded applications usually include a microcontroller, a digital signal processor (DSP), set-top boxes, network computers, hubs, wide-area network (WAN) switches, and any other system capable of performing the functions and operations described below. The apparatus, methods, and systems described herein are not limited to physical computing devices; they may also relate to software optimizations for energy conservation and efficiency. As will become clear from the description below, the described embodiments of methods, apparatus, and systems (whether in reference to hardware, firmware, software, or a combination thereof) may be vital to a “green technology” future balanced with performance considerations.”

“As computing systems advance, the components in them are becoming more complex. The interconnect architecture used to couple and communicate between components has also increased in complexity to ensure optimal component operation. Furthermore, different market segments demand different aspects of interconnect architectures to suit their needs. For example, servers require greater performance, while the mobile ecosystem can sometimes sacrifice overall performance for power savings. Yet, it is a singular purpose of most fabrics to provide the highest possible performance with maximum power savings. Further, a variety of interconnects can potentially benefit from the subject matter described herein.

“The Peripheral Component Interconnect Express (PCIe) interconnect fabric architecture and the QuickPath Interconnect (QPI) fabric architecture, among others, can potentially be improved according to one or more of the principles described herein. A primary goal of PCIe, for example, is to enable components and devices from different vendors to inter-operate in an open architecture spanning multiple market segments: Clients (Desktops and Mobile), Servers (Standard and Enterprise), and Embedded and Communication devices. PCI Express is a general-purpose I/O interconnect designed for a wide variety of future computing platforms. Some PCI attributes, such as its usage model and load-store architecture, have been maintained through its revisions, whereas previously implemented parallel buses have been replaced by a fully serialized, highly scalable interface. More recent versions of PCI Express take advantage of point-to-point interconnects, switch-based technology, and packetized protocols to deliver new levels of performance and features. Power Management, Quality of Service (QoS), Hot-Plug/Hot-Swap support, and Data Integrity are among the advanced features supported by PCI Express. While the primary discussion herein is in reference to a new high-performance interconnect (HPI), aspects of the invention described in this document may also be applicable to other interconnect architectures, such as a PCIe-compliant architecture, a QPI-compliant architecture, a MIPI-compliant architecture, a high-performance architecture, or another known interconnect architecture.

Referring to FIG. 1, an embodiment of a fabric composed of point-to-point links that interconnect a set of components is illustrated. System 100 includes processor 105, system memory 110, and controller hub 115. Processor 105 may include any processing element, such as a microprocessor, an embedded processor, a co-processor, or another processor. Processor 105 is coupled to controller hub 115 through front-side bus (FSB) 106. In one embodiment, FSB 106 is a serial point-to-point interconnect, as described below. In another embodiment, link 106 includes a serial, differential interconnect architecture that is compliant with a different interconnect standard.

“System memory 110 includes any memory device, such as random access memory (RAM) or non-volatile (NV) memory, accessible by devices in system 100. System memory 110 is coupled to controller hub 115 through memory interface 116. Examples of a memory interface include a double-data-rate (DDR) memory interface, a dual-channel DDR memory interface, and a dynamic RAM (DRAM) memory interface, among others.

“In one embodiment, controller hub 115 may include a root complex, root hub, or root controller, such as in a PCIe interconnection hierarchy. Examples of controller hub 115 include a chipset, a memory controller hub (MCH), an interconnect controller hub (ICH), a southbridge, and a root controller/hub. The term chipset often refers to two physically separate controller hubs, for example, a memory controller hub (MCH) coupled to an interconnect controller hub (ICH). The MCH is often integrated with processor 105, while controller hub 115 is used to communicate with I/O devices in a manner similar to that described below. Peer-to-peer routing may be supported in some embodiments through root complex 115.

“Here, controller hub 115 is coupled to switch/bridge 120 through serial link 119. Interfaces/ports 117 and 121, which may also be referred to as input/output modules 117 and 121, can include/implement a layered protocol stack to provide communication between controller hub 115 and switch 120. In one embodiment, multiple devices are capable of being coupled to switch 120.

“Switch/bridge 120 routes packets/messages from device 125 upstream, i.e., up a hierarchy towards a root complex, to controller hub 115, and downstream, i.e., down a hierarchy away from a root controller, from processor 105 or system memory 110 to device 125. Switch 120, in one embodiment, is a logical assembly of multiple virtual PCI-to-PCI bridge devices. Device 125 includes any internal or external device to be coupled to an electronic system, such as an I/O device, a Network Interface Controller (NIC), an add-in card, an audio processor, a network processor, a hard drive, a storage device, a router, a portable storage device, a Firewire device, a Universal Serial Bus (USB) device, a scanner, and other input/output devices. In PCIe vernacular, such a device is often referred to simply as an endpoint. Although not shown in detail, device 125 may include a bridge (e.g., a PCIe to PCI/PCI-X bridge) to support legacy or other versions of interconnect fabrics.

“Graphics accelerator 130 may also be coupled to controller hub 115 through serial link 132. In one embodiment, graphics accelerator 130 is coupled to an MCH, which is in turn coupled to an ICH. The ICH is coupled to switch 120 and I/O device 125. I/O modules 131 and 118 are also to implement a layered protocol stack for communication between controller hub 115 and graphics accelerator 130. A graphics controller, or the graphics accelerator 130 itself, may be integrated in processor 105.

“Turning to FIG. 2, an embodiment of a layered protocol stack is illustrated. Layered protocol stack 200 may include any form of layered communication stack, such as a QPI stack, a PCIe stack, a next-generation high-performance computing interconnect (HPI) stack, or another layered stack. In one embodiment, protocol stack 200 includes a transaction layer, a link layer, and a physical layer. Interfaces, such as interfaces 117, 118, 121, 122, 126, and 131 in FIG. 1, may be represented as communication protocol stack 200. Representation as a communication protocol stack may also be referred to as a module or interface implementing/including a protocol stack.”

“Packets are used to communicate information between components. Packets can be formed in the Transaction Layer 205 and Data Link Layer 210 to carry information from one component to another. As the transmitted packets flow through the other layers, they are extended with additional information used to handle packets at those layers. The reverse process occurs at the receiving side: packets are transformed from their Physical Layer 220 representation to the Data Link Layer 210 representation and finally (for Transaction Layer Packets) to the form that can be processed by the Transaction Layer 205.

“In one embodiment, transaction layer 205 may provide an interface between a device’s processing core and the interconnect architecture, such as Data Link Layer 210 and Physical Layer 220. A primary responsibility of the transaction layer 205 can include the assembly and disassembly of packets (i.e., transaction layer packets, or TLPs). The transaction layer 205 can also manage credit-based flow control for TLPs. Split transactions are possible in some implementations, allowing a link to carry other traffic while the target device gathers data for the response.

“Credit-based flow control can be used to realize virtual channels and networks utilizing the interconnect fabric. In one example, a device can advertise an amount of credit for each receive buffer in Transaction Layer 205. An external device at the opposite end of the link, such as controller hub 115 in FIG. 1, can count the number of credits consumed by each TLP. A transaction may be transmitted if it falls within the credit limit. Upon receiving a response, an amount of credit is restored. One advantage of this credit scheme is that the latency of credit return does not affect performance, provided the credit limit is not encountered.
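
As a rough, hypothetical sketch of the credit scheme described above (the structure, credit units, and function names below are illustrative, not taken from the specification), a transmitter may track advertised credits per buffer and only release a TLP when it fits within the remaining credit, restoring credit when the receiver reports that buffer space has been freed:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical credit counter for one virtual channel / buffer type. */
typedef struct {
    unsigned advertised;   /* credits advertised by the receiver           */
    unsigned consumed;     /* credits consumed by transmitted TLPs         */
} credit_state;

/* A TLP may be sent only if its cost fits within the remaining credit. */
static bool can_send_tlp(const credit_state *cs, unsigned tlp_credits)
{
    return cs->consumed + tlp_credits <= cs->advertised;
}

static bool send_tlp(credit_state *cs, unsigned tlp_credits)
{
    if (!can_send_tlp(cs, tlp_credits))
        return false;              /* transmitter must wait                 */
    cs->consumed += tlp_credits;   /* count credit consumed by this TLP     */
    return true;
}

/* Receiver frees buffer space and returns credit to the transmitter. */
static void restore_credit(credit_state *cs, unsigned tlp_credits)
{
    cs->consumed -= tlp_credits;
}

int main(void)
{
    credit_state cs = { .advertised = 8, .consumed = 0 };

    printf("send 6-credit TLP: %s\n", send_tlp(&cs, 6) ? "ok" : "blocked");
    printf("send 4-credit TLP: %s\n", send_tlp(&cs, 4) ? "ok" : "blocked");
    restore_credit(&cs, 6);        /* receiver drained the first TLP        */
    printf("send 4-credit TLP: %s\n", send_tlp(&cs, 4) ? "ok" : "blocked");
    return 0;
}
```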

In one embodiment, four transaction address spaces may include a configuration space, a memory space, an input/output space, and a message space. Memory space transactions include one or more read requests and write requests to transfer data to/from a memory-mapped location. In one embodiment, memory space transactions are capable of using two different address formats, for example, a short 32-bit address format or a long 64-bit address format. Configuration space transactions are used to access the configuration space of devices connected to the interconnect, and may include both read and write requests. Message space transactions (or simply messages) may also be defined to support in-band communication between interconnect agents. In one embodiment, transaction layer 205 is capable of assembling packet header/payload 206.

Referring to FIG. 3, an example embodiment of a transaction layer packet descriptor is illustrated. Transaction descriptor 300 is a mechanism for carrying transaction information. In this regard, transaction descriptor 300 supports identification of transactions within a system, as well as tracking of modifications to default transaction ordering. Transaction descriptor 300 may include, for example, global identifier field 302, attributes field 304, and channel identifier field 306. The illustrated example shows global identifier field 302 comprising local transaction identifier field 308 and source identifier field 310. In one embodiment, global transaction identifier 302 is unique for all outstanding requests.

According to one implementation, local transaction identifier field 308 is a field generated by a requesting agent, and can be unique for all outstanding requests that require a completion for that agent. In this example, source identifier 310 uniquely identifies the requester agent within an interconnect hierarchy. Accordingly, together, source ID 310 and local transaction identifier field 308 provide global identification of a transaction within the hierarchy domain.

“Attributes field 304 specifies characteristics and relationships of the transaction. In this regard, attributes field 304 can be used to provide additional information that allows modification of the default handling of transactions. In one embodiment, attributes field 304 includes priority field 312, reserved field 314, ordering field 316, and no-snoop field 318. Here, priority sub-field 312 may be modified by the initiator to assign a priority to the transaction. Reserved attribute field 314 is left reserved for future or vendor-defined usage; possible usage models using priority or security attributes may be implemented using the reserved attribute field.

“In this case, ordering attribute field 316 is used to supply optional information conveying the type of ordering that may modify default ordering rules. According to one example implementation, an ordering attribute of ‘0’ denotes that default ordering rules are to apply, while an ordering attribute of ‘1’ denotes relaxed ordering, wherein writes can pass writes in the same direction and read completions can pass writes in the same direction. Snoop attribute field 318 is utilized to determine whether transactions are snooped. As shown, channel ID field 306 identifies the channel with which a transaction is associated.
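
For illustration only, the descriptor fields named above might be modeled as follows; the field widths and layout here are assumptions, since the text does not define them:

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative layout only: field widths are assumptions, not taken from the
 * specification. The struct mirrors the fields named in the text: global
 * identifier 302 (source ID 310 + local transaction ID 308), attributes 304
 * (priority 312, reserved 314, ordering 316, no-snoop 318), and channel
 * identifier 306. */
typedef struct {
    uint16_t source_id;        /* uniquely identifies the requester agent   */
    uint16_t local_txn_id;     /* unique among that agent's outstanding
                                  requests that require a completion        */
    uint8_t  priority   : 3;   /* initiator-assigned priority               */
    uint8_t  reserved   : 2;   /* future / vendor-defined use               */
    uint8_t  ordering   : 1;   /* 0 = default ordering, 1 = relaxed         */
    uint8_t  no_snoop   : 1;   /* transaction is not snooped when set       */
    uint8_t  channel_id;       /* channel the transaction is associated with */
} txn_descriptor;

int main(void)
{
    txn_descriptor d = {
        .source_id = 0x01, .local_txn_id = 0x2a,
        .priority = 0, .reserved = 0, .ordering = 1, .no_snoop = 0,
        .channel_id = 0,
    };
    /* The global transaction identifier is the (source ID, local ID) pair. */
    printf("global id = (%u, %u), ordering=%u\n",
           (unsigned)d.source_id, (unsigned)d.local_txn_id, (unsigned)d.ordering);
    return 0;
}
```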

“Returning to the discussion of FIG. 2, a Link Layer 210, also referred to as data link layer 210, can act as an intermediate stage between transaction layer 205 and physical layer 220. In one embodiment, a responsibility of the data link layer 210 is to provide a reliable mechanism for exchanging Transaction Layer Packets (TLPs) between two components across a link. The Data Link Layer 210 accepts TLPs assembled by the Transaction Layer 205, applies packet sequence identifier 211, i.e., an identification number or packet number, calculates and applies an error detection code, i.e., CRC 212, and submits the modified TLPs to the Physical Layer 220 for transmission across a physical medium to an external device.”
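
A minimal sketch of the framing step described above, assuming a generic CRC-32 in place of whatever error detection code the link layer actually defines; the code width, polynomial, and framing layout here are illustrative only:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Generic bitwise CRC-32 (reflected, polynomial 0xEDB88320). The actual
 * link-layer CRC polynomial and width are not specified here; this is only
 * to show the "sequence number + error-detection code" framing step. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (uint32_t)-(int32_t)(crc & 1));
    }
    return ~crc;
}

/* Frame a TLP handed down by the transaction layer: prepend a sequence
 * number (packet identifier 211) and append a CRC (212). */
static size_t frame_tlp(uint16_t seq, const uint8_t *tlp, size_t tlp_len,
                        uint8_t *out)
{
    size_t n = 0;
    out[n++] = (uint8_t)(seq >> 8);
    out[n++] = (uint8_t)(seq & 0xFF);
    memcpy(out + n, tlp, tlp_len);
    n += tlp_len;
    uint32_t crc = crc32_calc(out, n);   /* cover sequence number + payload */
    for (int i = 0; i < 4; i++)
        out[n++] = (uint8_t)(crc >> (8 * i));
    return n;
}

int main(void)
{
    uint8_t tlp[] = { 0xDE, 0xAD, 0xBE, 0xEF };
    uint8_t frame[64];
    size_t len = frame_tlp(1, tlp, sizeof tlp, frame);
    printf("framed length = %zu bytes\n", len);
    return 0;
}
```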

“In one example, physical layer 220 includes logical sub-block 221 and electrical sub-block 222 to physically transmit a packet to an external device. Here, logical sub-block 221 is responsible for the ‘digital’ functions of Physical Layer 220. In this regard, the logical sub-block may include a transmit section to prepare outgoing information for transmission by electrical sub-block 222, and a receiver section to identify and prepare received information before passing it to the Link Layer 210.

“Electrical sub-block 222 includes a transmitter and a receiver. The logical sub-block 221 supplies the transmitter with symbols, which the transmitter serializes and transmits to an external device. The receiver is supplied with serialized symbols from an external device and transforms the received signals into a bit-stream. The bit-stream is then de-serialized and supplied to logical sub-block 221. In one example embodiment, an 8b/10b transmission code is employed, where ten-bit symbols are transmitted/received. Here, special symbols are used to frame a packet with frames 223. In addition, in one example, the receiver also provides a symbol clock recovered from the incoming serial stream.

“Although transaction layer 205 and link layer 210 are discussed in reference to a specific embodiment of a protocol stack (such as a PCIe protocol stack), a layered protocol stack is not so limited. Any layered protocol may be implemented and may adopt the features described herein. As an example, a port/interface that is represented as a layered protocol can include: (1) a first layer to assemble packets, i.e., a transaction layer; a second layer to sequence packets, i.e., a link layer; and a third layer to transmit the packets, i.e., a physical layer. As a specific example, a high-performance interconnect layered protocol is utilized.

Referring to FIG. 4, an example embodiment of a serial point-to-point fabric is illustrated. A serial point-to-point link can include any transmission path for transmitting serial data. In the embodiment shown, two low-voltage, differentially driven signal pairs are included: a transmit pair 406/411 and a receive pair 412/407. Accordingly, device 405 includes transmission logic 406 to transmit data to device 410 and receiving logic 407 to receive data from device 410. In other words, two transmitting paths, i.e., paths 416 and 417, and two receiving paths, i.e., paths 418 and 419, are included in some implementations of a link.

A transmission path refers to any path for transmitting data, such as a transmission line, a copper line, an optical line, a wireless communication channel, an infrared link, or another communication path. A connection between two devices, such as device 405 and device 410, is referred to as a link. A link may support one lane, where each lane represents a set of differential signal pairs (one pair for transmission, one pair for reception). To scale bandwidth, a link may aggregate multiple lanes, denoted by xN, where N is any supported link width.

A differential pair refers to two transmission paths, such as lines 416 and 417, that transmit differential signals. As an example, when line 416 toggles from a low voltage level to a high voltage level, i.e., a rising edge, line 417 drives from a high logic level to a low logic level, i.e., a falling edge. Differential signals potentially demonstrate better electrical characteristics, such as better signal integrity, i.e., reduced cross-coupling, voltage overshoot/undershoot, and ringing, among other example advantages. This allows for a better timing window, which enables faster transmission frequencies.

“In one embodiment, a new High Performance Interconnect (HPI) is provided. HPI may include a next-generation cache-coherent, link-based interconnect. HPI can be utilized in high-performance computing platforms, such as workstations and servers, including in systems where PCIe or another interconnect protocol is typically used to connect processors, accelerators, I/O devices, and the like. However, HPI is not so limited; it can be utilized in any of the systems or platforms described herein. Furthermore, the individual ideas developed may be applied to other interconnects or platforms, such as PCIe and MIPI.

To support multiple devices, one example implementation can include an Instruction Set Architecture (ISA) agnostic definition (i.e., HPI is able to be implemented in many different devices). HPI can also be used to connect high-performance I/O devices. A high-performance PCIe device, for example, could be coupled to HPI through an appropriate translation bridge (i.e., HPI to PCIe). Moreover, the HPI links can be used by many HPI-based devices, such as processors, in various ways (e.g., stars, rings, meshes, etc.). FIG. 5 illustrates examples of multiple potential multi-socket configurations. A two-socket configuration 505 can include two HPI links; however, in other implementations, one HPI link may be utilized. For larger topologies, any configuration may be utilized as long as an identifier is assignable and there is some form of virtual path, among other additional or substitute features. In one example, the four-socket configuration 510 includes an HPI link from each processor to another. In the eight-socket example shown in configuration 515, sockets are connected through HPI links; if a virtual channel or path exists between the processors, the configuration is supported. In one example, two (2) processors are supported in a native domain. Larger numbers of processors can be supported through multiple domains or other interconnects between node controllers.

The HPI architecture defines a layered protocol architecture. This architecture may include, in some cases, protocol layers (coherent, non-coherent, and, optionally, other memory-based protocols), a routing layer, a link layer, and a physical layer. HPI can also include enhancements related to power management (such as power control units (PCUs)), design for test and debug (DFT), fault handling, registers, security, among others. FIG. 6 illustrates an example HPI layered protocol stack. In some implementations, some of the layers illustrated in FIG. 6 may be optional. Each layer deals with its own level of granularity or quantum of information: the protocol layer 620a,b with packets 630, the link layer 610a,b with flits 635, and the physical layer 605a,b with phits 640. A packet may contain partial flits or a single flit depending on the implementation.

“As a first example, a width of a phit 640 includes a 1-to-1 mapping of link width to bits (e.g., a 20-bit link width includes a phit of 20 bits, etc.). Flits can have a greater size, such as 184, 192, or 200 bits. Note that if phit 640 is 20 bits wide and the size of flit 635 is 184 bits, then it takes a fractional number of phits 640 to transmit one flit 635 (e.g., 9.2 phits at 20 bits to transmit an 184-bit flit 635, or 9.6 phits at 20 bits to transmit a 192-bit flit, to name a few examples). The width of the fundamental link at the physical layer can vary. For example, the number of lanes per direction can include 2, 4, 6, 8, 12, 14, 16, 18, 20, 22, 24, etc. Link layer 610a,b is capable of embedding multiple pieces of different transactions in a single flit, and one or multiple headers (e.g., 1, 2, 3, 4) may be embedded within the flit. In one example, HPI splits the headers into corresponding slots to enable multiple messages in the flit destined for different nodes.
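
The fractional-phit arithmetic above can be checked directly; the sketch below simply divides flit size by lane count for the flit widths named in the text:

```c
#include <stdio.h>

/* Worked version of the fractional-phit arithmetic in the text: with a
 * 1-to-1 mapping of link width to phit width, the number of phits needed to
 * carry one flit is flit_bits / lane_count. */
int main(void)
{
    const int flit_bits[] = { 184, 192, 200 };
    const int lanes = 20;   /* 20-lane link -> 20-bit phit */

    for (int i = 0; i < 3; i++)
        printf("%3d-bit flit over %d lanes: %.1f phits\n",
               flit_bits[i], lanes, (double)flit_bits[i] / lanes);
    /* Prints 9.2, 9.6, and 10.0 phits respectively. */
    return 0;
}
```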

“Physical Layer 605a,b, in one embodiment, can be responsible for the fast transfer of information on the physical medium (electrical, optical, etc.). The physical link is point-to-point between two Link layer entities, such as layers 605a and 605b. The Link Layer 610a,b can abstract the Physical Layer 605a,b from the higher layers and provides the ability to reliably transfer data and requests, as well as manage flow control, between two directly connected entities. The Link Layer is also responsible for virtualizing the physical channel into multiple virtual channels and message classes. The Protocol layer 620a,b relies on the Link layer 610a,b to map protocol messages into the appropriate message classes and virtual channels before handing them to the Physical layer 605a,b for transfer across the physical links. The Link layer 610a,b may support multiple messages, such as a request, snoop, response, writeback, and non-coherent data.

“The Physical layer 605a,b (or PHY) of HPI can be implemented above the electrical layer (i.e., the electrical conductors connecting two components) and below the link layer 610a,b, as illustrated in FIG. 6. The Physical layer and corresponding logic can reside on each agent and connect the link layers of two agents (A and B) that are separated from each other (e.g., on devices on either side of the link) across the physical media (e.g., wires, conductors, optics, etc.). In one embodiment, the Physical layer 605a,b has two major phases: initialization and operation. During initialization, the connection is opaque to the link layer, and signaling may involve a combination of timed states and handshake events. During operation, the connection is transparent to the link layer, and signaling is at speed, with all lanes operating together as a single link. During the operation phase, the Physical layer transports flits between agent A and agent B. The connection, also referred to as a link, abstracts certain physical aspects, such as media, width, and speed, from the link layers while exchanging flits and control/status of the current configuration (e.g., width) with the link layer. The initialization phase includes minor phases, e.g., Polling and Configuration. The operation phase also includes minor phases (e.g., link power management states).

“In one embodiment, Link Layer 610a,b can be implemented to provide reliable data transfer between two protocol or routing entities. The Link layer can abstract the Physical Layer 605a,b from the Protocol layer 620a,b, can be responsible for flow control between two protocol agents (A and B), and can provide virtual channel services to the Protocol layer (Message Classes) and Routing layer (Virtual Networks). The interface between the Protocol layer 620a,b and the Link Layer 610a,b is typically at the packet level. In one embodiment, the smallest unit of transfer at the Link Layer is referred to as a flit, which is a specified number of bits, such as 192 bits or another denomination. The Link Layer 610a,b relies on the Physical Layer 605a,b to frame the Physical Layer's 605a,b units of transfer (phits) into the Link Layer's 610a,b units of transfer (flits). In addition, the Link Layer 610a,b can be logically divided into two parts: a sender and a receiver. A sender/receiver pair on one entity may be connected to a receiver/sender pair on another entity. Flow control can be performed on both a flit and a packet basis. Error detection and correction can also be performed on a packet basis.

“In one embodiment, Routing layer 615a,b can provide a flexible and distributed method to route HPI transactions from a source to a destination. The scheme can be customized for multiple topologies through programmable routing tables at each router; programming, in one embodiment, can be performed by firmware, software, or a combination thereof. The routing functionality can be distributed: each routing step is defined through a lookup of a table at the source, intermediate, or destination routers. A lookup at a source can be used to inject an HPI packet into the HPI fabric. A lookup at an intermediate router can be used to route an HPI packet from an input port to an output port. A lookup at a destination port can be used to target the destination HPI protocol agent. Note that the Routing layer, in some implementations, can be thin since the routing tables, and hence the routing algorithms, are not specifically defined by specification. This allows for flexibility and a variety of usage models, including flexible platform architectural topologies defined by system implementations. The Routing layer 615a,b relies on the Link layer 610a,b for providing the use of up to three (or more) virtual networks (VNs), in one example, two deadlock-free VNs, VN0 and VN1, with several message classes. A shared adaptive virtual network (VNA) may be defined in the Link layer, but this adaptive network may not be exposed directly in routing concepts, since each message class and virtual network may have dedicated resources and guaranteed forward progress, among other features.

An embedded clock can be used in some implementations of HPI. A clock signal can be embedded in the data transmitted over the interconnect. With the clock signal embedded in the data, distinct and dedicated clock lanes can be omitted. This can be useful, for instance, because it allows more pins of a device to be dedicated to data transfer, particularly in systems where space for pins is at a premium.

A link can be established between two agents on either side of an interconnect. The agent sending the data can either be a local agent or a remote agent. Both agents can employ state machines to manage various aspects of the link. In one embodiment, the Physical layer datapath can transmit flits from the link layer to the electrical front-end. In one implementation, the control path includes a state machine (also referred to as a link training state machine or the like). The state machine's actions and exits from states may depend on internal signals, timers, and external signals, among other information. In fact, some states, such as a few initialization states, may have timers that provide a timeout value for exiting the state. Note that detect, in some embodiments, refers to detecting an event on both legs of a lane, but not necessarily simultaneously. In other embodiments, detect refers to detection of an event by an agent of reference. Debounce, as one example, refers to sustained assertion of a signal. In one embodiment, HPI supports operation even in the event of non-function lanes. Here, lanes may be dropped at specific states.

“States can include reset states, initialization states, and operational states, among other categories and subcategories. In one example, some initialization states can have a secondary timer that is used to exit the state on a timeout (essentially an abort due to failure to make progress in the state). Aborting a state may include updating one or more status registers. Some states can also have primary timers that are used to time the primary functions in the state. Other states can be defined such that internal or external signals (e.g., handshake protocols) drive transition from the state to another state.
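
Purely as an illustration of the timer-driven behavior described above, the following sketch models a handful of states with a primary timer for the state's timed actions and a secondary timer that aborts back to reset; the state names, timer values, and transition rule are assumptions, not the actual machine of FIG. 7:

```c
#include <stdbool.h>
#include <stdio.h>

/* A minimal sketch of a timer-driven link training state machine. */
typedef enum { S_RESET, S_DETECT, S_POLLING, S_CONFIG, S_LINK } lt_state;

typedef struct {
    lt_state state;
    unsigned primary_timer;    /* time allotted to the state's primary actions */
    unsigned secondary_timer;  /* timeout used to abort the state              */
    unsigned elapsed;
} lt_machine;

static void enter(lt_machine *m, lt_state s, unsigned primary, unsigned secondary)
{
    m->state = s;
    m->primary_timer = primary;
    m->secondary_timer = secondary;
    m->elapsed = 0;
}

/* Called once per time step; 'actions_done' models completion of the state's
 * actions (e.g., calibration, debounce, deskew). */
static void tick(lt_machine *m, bool actions_done)
{
    m->elapsed++;
    if (actions_done && m->elapsed >= m->primary_timer) {
        enter(m, (lt_state)(m->state + 1), 10, 40);   /* advance to next state */
    } else if (m->elapsed >= m->secondary_timer) {
        enter(m, S_RESET, 10, 40);                    /* abort: exit to reset  */
    }
}

int main(void)
{
    lt_machine m;
    enter(&m, S_RESET, 10, 40);
    for (int t = 0; t < 100 && m.state != S_LINK; t++)
        tick(&m, true);
    printf("reached state %d\n", m.state);
    return 0;
}
```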

“A state machine can also support debug through single steps, freeze on initialization abort, and the use of testers. Here, state exits can be postponed or held until the debug software is ready. In some cases, the exit can be postponed or held until a secondary timeout. Actions and exits, in one embodiment, can be based on the exchange of training sequences. In one embodiment, the link state machine is to run in the local agent clock domain, and transitions from one state to the next are to coincide with transmitter training sequence boundaries. Status registers may be utilized to reflect the current state.

“FIG. 7 illustrates a representation of at least a portion of the state machine used in an example implementation of HPI. The state table in FIG. 7 includes a non-exhaustive list of possible states. Certain transitions have been omitted to simplify the diagram. Also, some states may be combined, split, or omitted, while others might be added. Such states can include:

“Event reset state: entered on a warm or cold reset event. Restores default values. Initializes counters (e.g., sync counters). May exit to another state, such as another reset state.

“Timed reset state: a timed state for an in-band reset. May include a predefined EOS detectable by remote receivers to enter the timed reset. The receiver has lanes holding electrical settings. May exit to a calibrate reset state.

“Calibrate reset state: calibration without signaling on the lane (e.g., receiver calibration state) or turning drivers off. May be a predetermined amount of time in the state based on a timer. May set an operational speed. May act as a wait state when a port is not enabled. May include a minimum residency time. Receiver conditioning or staggering off may occur based on design. May exit to a receiver detect state upon completion of calibration.

“Receiver detect state: detect the presence of a receiver on lane(s). May look for receiver termination (e.g., receiver pulldown insertion). May exit to the calibrate reset state when a specified value is set or when another specified value is not set. May exit to the transmitter calibrate state if a receiver is detected or a timeout is reached.

“Transmitter calibrate state: for transmitter calibrations. May be a timed state allotted for transmitter calibrations. May include signaling on a lane. May continuously drive an EOS, such as an electric idle exit ordered set (or EIEOS). May exit to a compliance state when calibration is complete or on expiration of a timer. May exit to a transmitter detect state if a counter has expired or a secondary timeout has occurred.

“Transmitter detect state: qualifies valid signaling. May be a handshake state in which an agent completes actions and exits to a next state based on remote agent signaling. A receiver may qualify valid signaling from the transmitter. In one embodiment, the receiver looks for a wake detect and, if debounced on one or more lanes, looks for it on the other lanes. The transmitter drives a detect signal. May exit to the polling state upon completion of debounce on all lanes and/or a timeout, or if debounce has not completed on all lanes. Here, one or more monitor lanes may be kept awake to debounce a wake signal. If debounced, the other lanes are potentially debounced. This can enable power savings in low power states.

“Polling state: receiver adapts, initializes drift buffer, and locks on bits/bytes (e.g., identifies symbol boundaries). Lanes may also be deskewed. A remote agent may cause an exit to a next state (e.g., a link width state) in response to an acknowledgement message. Polling can additionally include locking to a training sequence header and an EOS. Lane-to-lane skew at the remote transmitter may be capped at a maximum length for top speed and a shorter length for slow speed. Deskew may be performed in a slow mode as well as in an operational mode. A receiver may have a specific maximum lane-to-lane skew to deskew, such as 8, 16, or 32 intervals. Receiver actions may also include latency fixing. Receiver actions, in one embodiment, can be completed on successful deskew of a valid lane map. A successful handshake can be achieved, in one example, when a number of consecutive training sequence headers are received with acknowledgements and a number of training sequences with acknowledgements are transmitted after the receiver has completed its actions.

“Link width state: agent communicates a final lane map to the remote transmitter. The receiver receives the information and decodes it. The receiver may record a configured lane map in a structure after checking a previous lane map value in another structure. The receiver may also respond with an acknowledgement (‘ACK’). May initiate an in-band reset. As one example, this is the first state to initiate an in-band reset. In one embodiment, the ACK causes exit to a next state (e.g., a flit configuration state). Further, prior to entering a low power state, a reset signal may also be generated if the frequency at which a wake detect signal occurs drops below a specified value (e.g., 1 every number of unit intervals, such as 4K UI). The receiver may hold both current and previous lane maps. The transmitter may use different groups of lanes based on different training sequences. In some embodiments, the lane map may not modify some status registers.

“Flit lock configuration state: entered by a transmitter, but the state is considered exited (i.e., a secondary timeout is moot) when both the transmitter and receiver have exited to a blocking link state or another link state. In one embodiment, a transmitter exit to a link state includes a start of data sequence (SDS) and training sequence boundary after receiving a planetary alignment signal. Here, the receiver exit may be based on receiving an SDS from a remote transmitter. This state may be a bridge from agent to link state. The receiver identifies the SDS and may exit to a blocking link state (BLS) (or a control window) if the SDS is received after descrambler initialization. If a timeout occurs, the exit may be to a reset state. The transmitter drives lanes with configuration signals. Transmitter exit may be to reset, BLS, or other states based on conditions or timeouts.

“Transmitting link state: a link state. Flits are sent to a remote agent. May be entered from a blocking link state and return to a blocking link state on an event such as a timeout. The transmitter transmits flits and the receiver receives flits. May also exit to a low power link state. In some implementations, the transmitting link state (TLS) can be referred to as the L0 state.

“Blocking link state: a link state. The transmitter and receiver operate in a coordinated manner. May be a timed state during which the link layer flits are held off while Physical layer information is communicated to the remote agent. May exit to a low power link state (or another link state based on the design). A blocking link state (BLS), in one embodiment, occurs periodically; the period is referred to as a BLS interval, which can be timed and may differ between slow speed and operational speed. Note that the link layer may be periodically blocked from sending flits so that a Physical layer control sequence of a certain length may be sent, such as during a transmitting link state or a partial width transmitting link state. In some implementations, the blocking link state (BLS) can be referred to as the L0 control, or L0c, state.

“Partial width transmitting link state: a link state. Power may be saved by entering a partial width state. In one embodiment, asymmetric partial width refers to each direction of a two-direction link having different widths, which may be supported in some designs. An example of an initiator, such as a transmitter, sending a partial width indication to enter a partial width transmitting link state is shown in FIG. 9. Here, a partial width indication is sent while transmitting on a link with a first width to transition the link to transmit at a second, new width. A mismatch may result in a reset. Note that speeds may not be altered but widths may be. Flits can be sent at different widths. Logically, this state may be similar to the transmitting link state, but because the width is smaller, it may take longer to transmit flits. May exit to other link states, such as a low power link state based on certain received and sent messages, or an exit of the partial width transmitting link state or a link blocking state based on other events. In one embodiment, a transmitter port may turn idle lanes off in a staggered manner, as shown in the timing diagram, to provide better signal integrity (i.e., noise mitigation). Non-retryable flits, such as null flits, may be utilized during periods when the link width is changing. The null flits may be dropped by the receiver, and idle lanes may be turned off in a staggered manner; the current and previous lane maps may also be recorded in one or more structures. Note that status and associated status registers may remain unaltered. In some implementations, the partial width transmitting link state can be referred to as a partial L0, or L0p, state.

“Exit partial width transmitting link state: exit the partial width state. In some cases, a blocking link state may be used. In one embodiment, the transmitter initiates the exit by sending partial width exit patterns on the idle lanes to train and deskew them. One example exit pattern starts with EIEOS, which is detected and debounced to signal that the lane is ready to begin entry to the full transmitting link state, and may end with SDS or a Fast Training Sequence (FTS) on the idle lanes. Any failure during the exit sequence (e.g., receiver actions such as deskew not completed prior to a timeout) stops flit transfers and asserts a reset, which is handled by resetting the link on the next occurrence of the blocking link state. The SDS may also initialize the scrambler/descrambler on the lanes to appropriate values.”

“Low power link state: a lower power state. In one embodiment, it is lower power than the partial width link state, since signaling in this embodiment is stopped on all lanes and in both directions. Transmitters may use a blocking link state to request a low power link state. Here, the receiver can decode the request and respond with an ACK or a NAK; otherwise, a reset may be triggered. In some implementations, the low power link state can be referred to as the L1 state.

“In some implementations, state transitions can be facilitated to allow states to be bypassed, for instance, when the state actions of the states, such as certain calibrations and configurations, have already been completed. Previous state results and configurations of a link can be stored and reused in subsequent initializations and configurations of the link. Rather than repeating the corresponding configurations and state actions, such states can be bypassed. Traditional systems implementing state bypasses, however, often have complex designs and expensive validation escapes. Rather than using a traditional bypass, HPI can utilize short timers in certain states, such as where the state actions do not need to be repeated. This can potentially allow for more uniform and synchronized state machine transitions, among other benefits.

“In one example, a software-based controller (e.g., through an external control point for the Physical Layer) can enable a short timer for one or more particular states. For instance, a state for which actions have already been performed and stored can be short-timed to allow a quick exit to a next state. If, however, the previous state action fails or cannot be applied within the short timer duration, a state exit can be performed. The controller can also disable the short timer, for instance, when the state actions should be performed anew. A long, or default, timer can be set for each state. If configuration actions at the state cannot be completed within the long timer, a state exit can occur. The long timer can be set to a reasonable duration to allow the state actions to be completed, whereas the short timer may be considerably shorter, making it impossible, in some cases, to perform the state actions without reference back to previously performed state actions.

“In some instances, during initialization (or re-initialization) of a link, as agents progress through a state machine toward an operational link state, one or more failures or state exits can occur that cause the state to reset (e.g., to a reset state or another state). In effect, the initialization of the link can loop through one or more states without entering a link state. In one example, a count can be maintained of the number of unproductive loops that occur in state transitions during the initialization of a link. For instance, each time an initialization returns to a reset state without reaching a link state, the counter can be incremented. The counter can be reset once the link enters a link state. Such counters can be maintained by agents on both sides of the link. Further, a threshold can be set, for instance, by a software-based controller utilizing one or more external control points. When the number of unproductive loops meets (or exceeds) the threshold, initialization of the link can be suspended, for instance, by setting and holding the link in the reset state. A software-based controller may then be used to restart the initialization or release it from suspension. In some cases, the software-based tools can analyze the nature of the suspended initialization, perform diagnostics, set register values, and perform other operations to guard against further looping. In some cases, the controller can set a higher counter threshold or override the counter in connection with restarting a suspended link initialization.
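
A minimal sketch of the loop-counting mechanism described above; the threshold value and function names are hypothetical, and in practice the threshold would be set by a software-based controller through an external control point:

```c
#include <stdbool.h>
#include <stdio.h>

/* Sketch of the unproductive-loop counter described above. */
typedef struct {
    unsigned loop_count;      /* initializations that fell back to reset    */
    unsigned threshold;       /* set by a software-based controller         */
    bool     held_in_reset;   /* link suspended, awaiting software action   */
} init_tracker;

/* Called when initialization returns to the reset state without having
 * reached the link state. */
static void on_unproductive_loop(init_tracker *t)
{
    if (++t->loop_count >= t->threshold)
        t->held_in_reset = true;     /* suspend: set and hold reset state   */
}

/* Called when the link successfully enters the link (operational) state. */
static void on_link_state_entered(init_tracker *t)
{
    t->loop_count = 0;
}

int main(void)
{
    init_tracker t = { .loop_count = 0, .threshold = 3, .held_in_reset = false };
    on_unproductive_loop(&t);
    on_unproductive_loop(&t);
    on_unproductive_loop(&t);
    printf("held in reset: %s\n", t.held_in_reset ? "yes" : "no");
    /* Software may now diagnose, raise the threshold or override the counter,
     * and release the initialization from suspension (on_link_state_entered
     * would clear the count once the link comes up). */
    (void)on_link_state_entered;
    return 0;
}
```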

“In some implementations of HPI, supersequences can be defined, each corresponding to a respective state or entry/exit to/from the respective state. A supersequence can contain a repeating sequence of data sets and symbols. In some cases, the sequences can repeat until completion of a state or state transition, or communication of a corresponding event. In some instances, the repeating sequence of a supersequence can repeat according to a defined frequency, such as a defined number of unit intervals (UIs). A unit interval (UI) can be defined as the time taken to transmit a single bit on a lane of a link or system. In some implementations, the repeating sequence can begin with an electrically ordered set (EOS). Accordingly, an instance of the EOS can be expected to repeat in accordance with the predefined frequency. Such ordered sets can be implemented as defined 16-byte codes, which may be represented in hexadecimal format. In one example, the EOS of a supersequence can be an EIEOS. An EIEOS can resemble a low frequency clock signal, such as a predefined number of repeating FFF00 or FFF000 hexadecimal symbols, etc. The EOS can be followed by a predefined set of data, such as a predefined number of training sequences or other data. Such supersequences can be utilized in state transitions, including link state transitions, as well as initialization.
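
As a hedged illustration of how such a supersequence might be composed, the sketch below builds one period from a placeholder EIEOS followed by seven placeholder training sequences; the 16-byte lengths are chosen only so that one period per lane works out to roughly 1 KUI, and the byte patterns are not the codes defined by the specification:

```c
#include <stdio.h>
#include <string.h>

/* Placeholder sizes and counts for one supersequence period. */
#define EIEOS_BYTES 16
#define TS_BYTES    16
#define TS_PER_EOS   7

static size_t build_supersequence_period(unsigned char *out)
{
    size_t n = 0;
    memset(out + n, 0xF0, EIEOS_BYTES);   /* stand-in for the EIEOS pattern   */
    n += EIEOS_BYTES;
    for (int i = 0; i < TS_PER_EOS; i++) {
        memset(out + n, 0xA5, TS_BYTES);  /* stand-in for a training sequence */
        n += TS_BYTES;
    }
    return n;   /* one period; the transmitter repeats this, e.g., every ~1 KUI */
}

int main(void)
{
    unsigned char buf[EIEOS_BYTES + TS_PER_EOS * TS_BYTES];
    size_t period = build_supersequence_period(buf);
    printf("one supersequence period = %zu bytes (%zu UI on a single lane)\n",
           period, period * 8);
    return 0;
}
```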

“In some implementations, such as QPI, terminations can be brought on or off of a serial data link, such as when a link is reset or initialized. This approach can add complexity and time to the initialization of a link. In some implementations of HPI, terminations of the link can be maintained, including during a reset or re-initialization. Further, HPI can permit hot-plugging of devices. When another device is introduced, whether through hot-plugging or otherwise, the voltage characteristics of the lane on which the remote agent is added will change. The local agent can sense these changes in lane voltage to detect the presence of the remote agent and prompt initialization of a link. State machine states and timers can be defined in the state machine to coordinate the detection, configuration, and initialization of a link.

In one implementation, HPI supports re-initialization on an in-band reset without changing the termination values, through the screening of a lane by the receiving agent for incoming signaling. The signaling can be used to identify good lanes. As an example, the lane can be screened for any of a set of pre-defined signals that are to be sent by a transmitter device to facilitate discovery and configuration of the link. In one example, a supersequence can be defined corresponding to one or more initialization or re-initialization tasks. The pre-defined sequence can include an EIEOS followed by additional sequence data. In some instances, as devices on either side of a lane come on line, the devices can begin sending a supersequence corresponding to an initialization state. In one embodiment, two types of pin resets can be supported: power-on (or ‘cold’) reset and warm reset. A reset initiated by software or originating (in the Physical layer or another layer) on one agent may be communicated in-band to the other agent. With an embedded clock, an in-band reset may be handled by communicating with the other agent using an ordered set, such as a specific electrical ordered set, e.g., an electrical idle ordered set (EIOS).

The ordered set can be sent after initialization. The link layer can be blocked from sending flits by setting a blocking link state. As another example, link layer traffic may be blocked to send a few NULL flits, which may be discarded at the receiver.

Initialization, as mentioned above, can be performed at slow speed followed by initialization at fast speed. Initialization at slow speed uses the default values for the registers and timers. Software then uses the slow speed link to set up the registers, timers, and electrical parameters, and clears the calibration semaphores to pave the way for fast speed initialization. As one example, initialization can consist of states and tasks such as Reset, Detect, Polling, and Configuration, among potentially others.

In one example, a link layer blocking control sequence, i.e., a blocking link state (BLS or L0c) state, can include a timed state during which the link layer flits are held off while the PHY information is communicated to the remote agent. Here, the transmitter and receiver may start a blocking control sequence timer. Upon expiration of the timers, the transmitter and receiver can exit the blocking state and take other actions, such as exiting to reset, exiting to a different link state (or another state), or exiting to a state in which flits are sent across the link.

In one embodiment, link training can be provided and can include the sending of one or more of ordered sets, scrambled training sequences, and control sequences, such as in connection with a defined supersequence. A training sequence symbol may include one or more of a header, reserved portions, a target latency, a pair number, a physical lane map code reference lanes or lane group, and an initialization state. In one embodiment, the header can be sent with an ACK or NAK. Training sequences may be sent as part of supersequences and may be scrambled.

“In one embodiment, ordered sets and control sequences are not scrambled or staggered, and they are transmitted identically, simultaneously, and completely on all lanes. Valid reception of an ordered set may include checking of at least a portion of the ordered set (or the entire ordered set for partial ordered sets). Ordered sets can include an electrically ordered set (EOS), such as an Electrical Idle Ordered Set (EIOS) or an EIEOS. A supersequence can include a start of data sequence (SDS) or a Fast Training Sequence (FTS). Such sets and control sequences can be predefined and may have any pattern or hexadecimal representation, as well as any length. For example, ordered sets and supersequences may have lengths of 8, 16, 32, or 64 bytes, etc. FTS, as a further example, can be utilized for fast bit lock during exit of a partial width transmitting link state. Note that the FTS definition may be per lane and may utilize a rotated version of the FTS.

In one embodiment, supersequences can include the insertion of an EOS, such as an EIEOS, in a training sequence stream. In one implementation, when signaling starts, lanes power-on in a staggered manner. This may result, however, in initial supersequences appearing truncated at receivers on certain lanes. Supersequences can, however, be repeated over short intervals (e.g., approximately one thousand unit intervals, or ~1 KUI). The training supersequences may additionally be used for one or more of deskew, configuration, and communicating initialization targets and lane map, among other purposes. The EIEOS can be used for one or more of transitioning a lane from an inactive to an active state, screening for good lanes, and identifying symbol and TS boundaries.

“Turning to FIG. 8, representations of example supersequences are illustrated. For instance, an example Detect supersequence 805 is shown. The Detect supersequence 805 can include a repeating sequence of a single EIEOS (or other EOS) followed by a predefined number of instances of a particular training sequence (TS). In one example, the EIEOS can be transmitted, immediately followed by seven instances of the TS. When the seventh TS has been sent, the EIEOS can be transmitted again, followed by seven additional instances of the TS, and so on. The sequence can be repeated according to a predetermined frequency. In the example of FIG. 8, the EIEOS can reappear on the lanes approximately once every one thousand UIs (or ~1 KUI), followed by the remainder of the Detect supersequence 805. A receiver can monitor the lanes for the presence of the repeating Detect supersequence 805.

“In another example, another supersequence 810 can be used to indicate a polling, configuration, or loopback condition or state. As with the example Detect supersequence 805, a receiver can monitor the lanes of a link for such a Poll/Config/Loop supersequence 810 in order to identify a polling state, configuration state, or loopback state or condition. In one example, a Poll/Config/Loop supersequence 810 can begin with an EIEOS followed by a predefined number of repeated instances of the TS. For instance, the EIEOS can be followed by thirty-one (31) instances of the TS, with the EIEOS repeating approximately every four thousand UIs (e.g., ~4 KUI).

“In yet another example, a partial width transmitting state (PWTS) exit supersequence 815 can be defined. A PWTS exit supersequence can include an initial EIEOS that repeats to pre-condition lanes in advance of the sending of the full sequence. For instance, the sequence in supersequence 815 can begin with an EIEOS that repeats approximately once every 1 KUI. Further, fast training sequences (FTS) can be utilized in place of other training sequences; the FTS is configured to assist in quicker bit lock, byte lock, and deskew. In some cases, an FTS can be unscrambled to further assist in bringing idle lanes back to active as quickly as possible. As with other supersequences preceding an entry into a link transmitting state, the supersequence 815 can be interrupted and ended through the sending of a start of data sequence (SDS). Further, a partial FTS (FTSp) can be sent to assist in synchronizing the new lanes with the active lanes, such as by allowing bits to be subtracted from (or added to) the FTSp.

“Supersequences, such as Detect supersequence 705 and Poll/Config/Loop supersequence 710, can potentially be sent substantially throughout the initialization or re-initialization of a link. A receiver, upon receiving and detecting a particular supersequence, can, in some instances, respond by echoing the same supersequence over the lanes to the transmitter. The receiving and validation of a particular supersequence by the transmitter and receiver can serve as a handshake to acknowledge a state or condition communicated through the supersequence. For instance, such a handshake (e.g., utilizing a Detect supersequence 705) can be used to identify re-initialization of a link. In another example, such a handshake can be utilized to indicate the end of an electrical reset or low power state, resulting in corresponding lanes being brought back up, among other examples. The end of an electrical reset can be identified, for instance, from a handshake between the transmitter and receiver, each transmitting a Detect supersequence 705.

“In another example, lanes can be monitored for supersequences, and the supersequences can be used in connection with the screening of lanes for state exits, entries, and wakes, among other events. The predefined and predictable nature of supersequences can be further used to perform such initialization tasks as bit lock, byte lock, debouncing, descrambling, deskewing, adaptation, latency fixing, and negotiated delays, among other potential uses. Indeed, lanes can be monitored substantially continuously for such events to quicken the system's ability to react to and process these conditions.

“In the case of debouncing, transients can be introduced on lanes as a result of a variety of conditions. For instance, the addition or powering-on of a device can introduce transients onto a lane. Voltage irregularities can also be presented on a lane because of poor lane quality or electrical failure. In some cases, ‘bouncing’ on a lane can produce false positives, such as a false EIEOS. However, while supersequences can begin with an EIEOS, defined supersequences can further include additional data sequences, as well as a defined frequency at which the EIEOS will repeat. As a result, even where a false EIEOS appears on a lane, a logic analyzer at the receiver can determine that the EIEOS is a false positive by validating the data that succeeds it. For instance, if the expected TS or other data does not follow the EIEOS, or the EIEOS does not repeat within one of the predefined frequencies of one of the predefined supersequences, the receiver logic analyzer can fail validation of the received EIEOS. False negatives can also result when a device is added to or removed from a line. For instance, upon being added to a link, a device can begin sending a Detect supersequence 705 to alert the other side of the link of its presence and initiate initialization of the link. However, transients can corrupt portions of the supersequence, including EIEOS and TS instances. A logic analyzer on the receiving device can continue to monitor the lanes and identify the next EIEOS sent by the new device in the repeating Detect supersequence 705, among other examples.
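
The screening logic described above might be sketched as follows; the symbol values, the eight-symbol period, and the matching functions are stand-ins, and a real receiver would operate on the actual EIEOS/TS encodings and the defined supersequence frequencies:

```c
#include <stdbool.h>
#include <stdio.h>

/* Sketch of receiver-side screening: a candidate EIEOS is only accepted if
 * the data that follows it matches the expected training sequences and the
 * EIEOS recurs at the supersequence's defined period. */
#define TS_PER_EOS 7

static bool looks_like_eieos(unsigned symbol) { return symbol == 0xF0; }
static bool looks_like_ts(unsigned symbol)    { return symbol == 0xA5; }

/* Validate one supersequence period starting at 'stream[pos]'. */
static bool validate_supersequence(const unsigned *stream, size_t len, size_t pos)
{
    if (pos + 1 + TS_PER_EOS + 1 > len)
        return false;
    if (!looks_like_eieos(stream[pos]))
        return false;                       /* no EIEOS candidate here        */
    for (int i = 1; i <= TS_PER_EOS; i++)
        if (!looks_like_ts(stream[pos + i]))
            return false;                   /* transient: TSes do not follow  */
    /* The EIEOS must repeat at the predefined period (here, every 8 symbols). */
    return looks_like_eieos(stream[pos + 1 + TS_PER_EOS]);
}

int main(void)
{
    /* A glitch (EIEOS-like symbol with garbage after it) followed by a
     * genuine period of EIEOS + 7 TS + repeated EIEOS. */
    unsigned lane[] = { 0xF0, 0x00, 0x13,
                        0xF0, 0xA5, 0xA5, 0xA5, 0xA5, 0xA5, 0xA5, 0xA5, 0xF0 };
    size_t n = sizeof lane / sizeof lane[0];

    printf("candidate at 0: %s\n", validate_supersequence(lane, n, 0) ? "valid" : "rejected");
    printf("candidate at 3: %s\n", validate_supersequence(lane, n, 3) ? "valid" : "rejected");
    return 0;
}
```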

“In some implementations, an HPI link is capable of operating at multiple speeds, made possible in part by the embedded clock. For instance, a slow mode can be defined. In some cases, the slow mode can be used to facilitate initialization of a link. Calibration of the link can involve software-based controllers that provide logic for setting various calibrated characteristics of the link, such as which lanes the link is to use, the configuration of the lanes, the operational speed of the link, synchronization of the agents and lanes, target latency, and deskew, among other potential characteristics. Such software-based tools can make use of the ability to add data to Physical Layer registers in order to control various aspects of the Physical Layer facilities and logic.

The operational speed of a link can be considerably faster than the effective operational speed of the software-based controllers used in its initialization. A slow mode can be used to allow the use of these software-based controllers, such as during initialization or re-initialization. The slow mode can be applied on the lanes connecting a receiver and transmitter, for instance, when a link is turned on, initialized, or reset, to facilitate calibration of the link.

“In one embodiment, the clock can be embedded in the data, eliminating separate clock lanes. The flits transmitted over the lanes can be scrambled to facilitate clock recovery: a clock recovery unit at the receiver can deliver sampling clocks to the receiver (i.e., the receiver extracts the clock from the data and uses it to sample the incoming data). Receivers in some implementations adapt to the incoming bit stream continuously. Embedding the clock allows pinout to be reduced; however, embedding the clock in the in-band data can affect how an in-band reset is accomplished. One embodiment therefore uses a blocking link state (BLS) after initialization, and electrical ordered set sequences may be used during initialization to facilitate reset, among other considerations. An embedded clock can be common to the devices on a link, and the common operational clock can be set during calibration and configuration. HPI links can also reference a common clock with drift buffers; such an implementation may achieve lower latency than elastic buffers used with non-common reference clocks. The reference clock distribution segments can be matched within specified limits.
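
Because the clock is embedded, scrambling keeps the bit stream transition-rich for the receiver's clock recovery unit. As a minimal sketch of that general idea only (the polynomial, seed, and additive-scrambler structure here are assumptions, not HPI's actual scrambler), consider:

def lfsr_stream(seed, taps, n):
    """Generate n pseudo-random bits from a Fibonacci LFSR.

    `taps` lists bit positions (1-based from the LSB) XORed to form feedback.
    The taps/seed here are placeholders, not HPI's defined scrambler.
    """
    state = seed
    width = max(taps)
    for _ in range(n):
        out = state & 1
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        state = (state >> 1) | (fb << (width - 1))
        yield out

def scramble(bits, seed=0xACE1, taps=(16, 14, 13, 11)):
    """XOR data bits with the LFSR stream so long runs still have transitions."""
    return [b ^ s for b, s in zip(bits, lfsr_stream(seed, taps, len(bits)))]

data = [0] * 16        # a long run of zeros: no transitions for clock recovery
tx = scramble(data)    # scrambled: transition-rich on the wire
rx = scramble(tx)      # descrambling with the same seed recovers the data
assert rx == data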

An HPI link, as noted above, can operate at multiple speeds, including a slow mode for initialization and default power-up as well as the operational (or “fast”) speed. The operational speed of each device can be set by BIOS, which can statically control the speed or mode of each device. The common clock on the link can then be configured based on the operational speeds of each device; the link speed can, for example, be governed by the slower of the two devices' operational speeds. Any change of operational speed may be accompanied by a warm or cold reset.

“In some examples, the link comes up at power-on in slow mode with a transfer rate of, e.g., 100 MT/s. Software then sets up both sides of the link for the operational speed and begins the initialization. In other cases, a sideband mechanism can be used to establish the link, for example where slow mode is absent or unavailable, among other situations.

In one embodiment, the initialization phase in slow mode can use the same encoding, scrambling, training sequences, states, and so on as at the operational speed, though potentially with fewer features (e.g., no adaptation or electrical parameter setup). The slow mode operation phase can likewise use the same encoding, scrambling, and related features (although other implementations may not), and these states and features may be reduced relative to operational speed (e.g., no low power states).

“Further, slow mode can be implemented using the device's native phase lock loop (PLL) clock frequency. HPI can also support an emulated slow mode that does not require changing the PLL clock frequency. While some designs may use separate PLLs for slow and fast speeds, some HPI implementations achieve an emulated slow mode by letting the PLL continue to run at the fast clock frequency: a transmitter emulates a slower clock signal by repeating each bit multiple times, producing emulated slow high and slow low signals on the wire. The receiver can then oversample the received signal, locate the edges emulated by the repeated bits, and identify each bit. In such an implementation, ports sharing a PLL can coexist at slow and fast speeds.
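
A minimal sketch of the emulated slow mode idea, assuming an integer emulation multiple and a simple majority-vote receiver (both assumptions for illustration, not the specified behavior), might look like this:

def tx_emulate_slow(bits, multiple):
    """Transmitter: emulate a slow bit rate at the fast PLL clock by
    repeating each bit `multiple` times (the emulation multiple is assumed
    here to be an integer ratio of fast speed to slow speed)."""
    out = []
    for b in bits:
        out.extend([b] * multiple)
    return out

def rx_recover_slow(samples, multiple):
    """Receiver: sample at the fast rate and recover each slow bit by
    majority vote over one repetition window, which tolerates a few
    corrupted samples near the emulated edges."""
    bits = []
    for i in range(0, len(samples), multiple):
        window = samples[i:i + multiple]
        bits.append(1 if sum(window) * 2 > len(window) else 0)
    return bits

fast_per_slow = 4                       # assumed emulation multiple
payload = [1, 0, 1, 1, 0]
wire = tx_emulate_slow(payload, fast_per_slow)
wire[1] ^= 1                            # a corrupted sample near an edge
assert rx_recover_slow(wire, fast_per_slow) == payload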

Two devices joined by a link can also share a common slow mode speed even when their operational speeds differ. The common slow speed can, for example, be established during a discovery phase. An emulation multiple can be an integer (or non-integer) ratio of fast speed to slow speed, and different fast speeds can thereby be down-converted to the same slow speed. Accordingly, two device agents that support at least one common slow frequency can be hot-attached regardless of the speed at which the host port is running. Software discovery can then use the slow-mode link to identify and set up the most optimal operational link speeds. Because the emulation multiple is a ratio of fast speed to slow speed, different fast speeds can be used with a common slow speed during software discovery (e.g., of a hot attach).

“Some implementations of HPI support adaptation of lanes on a link. The Physical layer can support both receiver adaptation and transmitter (or sender) adaptation. With receiver adaptation, the transmitter sends sample data to the receiver, which the receiver logic uses to identify deficiencies in the electrical characteristics of the lane or the quality of the signal. The receiver can then adjust its calibration of the lane, based on the received data, to optimize the lane. With transmitter adaptation, the receiver again receives sample data and generates metrics describing the quality of the lane, but in this case communicates them to the transmitter over a backchannel (e.g., a software, hardware, embedded, sideband, or other channel) so that the transmitter can adjust the lane based on that feedback. Receiver adaptation can be initiated using the Polling supersequence sent from the remote transmitter; a similar procedure can be used to adapt transmitter parameters. For instance, both agents can enter a Loopback Pattern state as masters and transmit a specified pattern, and each receiver can measure a metric (e.g., BER) for the particular transmitter setting at the remote agent. Both agents can then use a Loopback Marker state to reset and exchange the metrics. From these metrics, the next transmitter setting can be identified, and eventually the optimal transmitter setting is found and saved for subsequent use.
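
The transmitter-adaptation loop described above can be sketched as a sweep over candidate transmitter settings, with a bit-error-rate metric standing in for whatever metric the remote receiver actually reports. The channel model, setting range, and function names below are invented for illustration only:

import random

def measure_ber(tx_setting, n_bits=20_000):
    """Stand-in for a loopback-pattern BER measurement at the remote
    receiver; in hardware this metric would be produced while both agents
    act as loopback masters and exchanged at a Loopback Marker."""
    # Hypothetical channel model: BER improves toward an assumed optimum of 7.
    error_prob = 1e-3 * (1 + abs(tx_setting - 7))
    errors = sum(random.random() < error_prob for _ in range(n_bits))
    return errors / n_bits

def adapt_transmitter(settings):
    """Sweep candidate transmitter settings (e.g., equalization/swing codes),
    record the metric for each, and keep the best for future initializations."""
    results = {s: measure_ber(s) for s in settings}
    best = min(results, key=results.get)
    return best, results

best_setting, metrics = adapt_transmitter(range(0, 16))
print("saved transmitter setting:", best_setting)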

“Because both devices on a link can run off the same reference clock (e.g., ref clk), elastic buffers can be omitted (any elastic buffers can be bypassed or used as drift buffers with the lowest possible latency). Phase adjustment or drift buffers can then be used on each lane to transfer the respective receiver bit stream from the remote clock domain to the local clock domain. The latency of the drift buffers should be sufficient to handle the sum of all sources of drift in the electrical specification (e.g., temperature, voltage, residual SSC introduced by reference clock routing mismatches, and so on), but as small as possible to reduce transport delay. If a drift buffer is too shallow, drift errors can result, manifesting as a series of CRC errors. Consequently, in some implementations, a drift alarm can be provided to initiate a Physical layer reset before an actual drift error occurs.

Some implementations of HPI support the two sides of a link running at the same nominal reference clock frequency but with a parts-per-million (ppm) mismatch. In this case frequency adjustment (or elasticity) buffers may be needed, and these can be readjusted during an extended BLS window or during special sequences that occur periodically, among other examples.

“The HPI PHY logical layer and its operation can be independent of the underlying transmission medium, provided the latency does not result in latency-fixing errors or timeouts at the link layer.”

To aid in management of the Physical layer, HPI can provide external interfaces. External signals (such as those from pins, fuses, and other layers), timers, and control and status registers can be provided. The input signals can change at any time relative to PHY state, but are observed by the Physical layer at specific points within each state. For example, a changing alignment signal, as described below, may be received but have no effect after the link has entered a transmitting state. Similarly, command register values are observed by Physical layer entities only at specific points in time; for instance, Physical layer logic can take a snapshot of a value and use it in subsequent operations. Consequently, in some implementations, updates to command registers are associated with specific periods (e.g., in a transmitting link state, or when holding in Reset calibration in a slow mode transmitting link state) to avoid anomalous behavior.

“Status values track hardware changes, so the value read can depend on when it is read. Some status values, however, such as the link map, latency, speed, and so on, may not change after initialization; such values change only as a result of events such as a re-initialization or a low power link state (LPLS) exit (e.g., a hard lane failure in a TLS may not cause reconfiguration of the link until a re-initialization is triggered, among other examples).

“Interface signals can include signals that are not external but that nonetheless affect Physical layer behavior. Such interface signals can include, as examples, encoding and timing signals. Interface signals can be design specific and may be inputs or outputs. Some of these signals, such as semaphores (e.g., those prefixed EO), may only be active once per assertion edge, i.e., they may be deasserted and then reasserted to take effect again, among other examples. Table 1 includes an example listing of example functions:

“TABLE 1
Function
input pin reset (aka warm reset)
input pin reset (aka cold reset)
input in-band reset pulse; causes semaphore to be set; semaphore is cleared when in-band reset occurs
input enables low power states
input loopback parameters; applied for loopback pattern
input to enter PWLTS
input to exit PWLTS
input to enter LPLS
input to exit LPLS
input from idle exit detect (aka squelch break)
input enables use of CPhyInitBegin
input from local or planetary alignment for transmitter to exit initialization
output when remote agent NAKs LPLS request
output when agent enters LPLS
output to link layer to force non-retryable flits
output to link layer to force NULL flits
output when transmitter is in partial width link transmitting state (PWLTS)
output when receiver is in PWLTS”

Default values for CSR timers may be provided in pairs: one for slow mode and one for operational speed. In some cases a value of 0 disables the timer (i.e., a timeout never occurs). Timers can include those shown in Table 2 below. Primary timers are used to time the expected actions in a state. Secondary timers are used to abort initializations that are not progressing, or to make forward state transitions at precise times in an automated test equipment (ATE) mode; in some cases, a secondary timer can be much larger than the primary timer for a state. Exponential timer sets are suffixed with exp, and the timer value is 2 raised to the field value; for linear timers, the timer value is the field value itself. Either type of timer may use a different granularity. Additionally, some timers in the power management section can be organized into a set called a timing profile, and these can be associated with a common timing diagram.

“TABLE 2
Timers
Tpriexp Set
Reset residency for driving EIEOS
Receiver calibration minimum time; for stagger transmitter off
Transmitter calibration minimum time; for stagger on
Tsecexp Set
Timed receiver calibration
Timed transmitter calibration
Squelch exit detect/debounce
DetectAtRx overhang for handshake
Adapt + bitlock/bytelock/deskew
Configure link widths
Wait for planetary aligned clean flit boundary
Re-bytelock/deskew
Tdebugexp Set
For hot plug; non-0 value to debug hangs
TBLSentry Set
BLS entry delay – fine
BLS entry delay – coarse
TBLS Set
BLS duration for transmitter
BLS duration for receiver
BLS clean flit interval for transmitter
TBLS clean flit interval for receiver”
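
The exponential-versus-linear encoding of the timer fields described above can be illustrated with a short sketch; the time granularity used here is a placeholder assumption:

def decode_timer(field_value, exponential, granularity_us=1):
    """Decode a CSR timer field.

    Exponential timer sets (suffixed 'exp') encode a timeout of
    2 ** field_value time units; linear sets use the field value directly.
    A field value of 0 disables the timer (a timeout never occurs).
    The granularity here is an assumed placeholder.
    """
    if field_value == 0:
        return None  # timer disabled
    ticks = 2 ** field_value if exponential else field_value
    return ticks * granularity_us

print(decode_timer(10, exponential=True))    # 1024 us for an 'exp' field of 10
print(decode_timer(10, exponential=False))   # 10 us for a linear field of 10
print(decode_timer(0, exponential=True))     # None: timer disabled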

“Command and control registers can also be provided. Control registers may be read or written by software in certain cases. Late-action values may take effect in Reset (e.g., passing through from a software-facing stage to a hardware-facing stage). Control semaphores (prefixed CP) are RW1S and can be cleared by hardware. Control registers may be used to perform any of the items described herein, and they may be modifiable by and accessible to hardware, software, firmware, or a combination thereof.

“Status registers can be provided to track hardware changes; they are written and used by hardware and can be read-only (though debug software may also be able to write to them). Such registers may not affect interoperability and can be complemented by many private status registers. Status semaphores (prefixed SP) can be mandated, since they may be cleared by software to undo the actions that set the status. A default subset of status values can indicate the initial (on reset) values. On an initialization abort, this register can be copied into a storage structure.

Tool Box registers can also be provided. For example, testability tool-box registers in the Physical layer can provide pattern generation, pattern checking, and loopback control mechanisms. These registers can be used in conjunction with electrical parameters to determine margins; for instance, interconnect built-in test may utilize this tool-box to determine margins. For transmitter adaptation, these registers can be used together with the registers discussed in the preceding sections, among other examples.

In some implementations, HPI supports Reliability, Availability, and Serviceability (RAS) capabilities using the Physical layer. In one embodiment, HPI supports hot plug and remove with one or more layers, which may involve software. Hot remove can include quiescing the link and clearing the initialization begin state/signal for the agent that is to be removed; the remote agent (i.e., the agent that is not being removed) can have its initialization signal cleared as well. An in-band reset (e.g., through BLS) can cause both agents to wait in a reset state, such as a Calibrate Reset State (CRS). The agent that is to be removed can then be removed, or it can be held in a targeted pin reset or powered down, among other examples and features. Indeed, some of the above events may be omitted and additional events can be added.

Hot add can also involve initialization speed. An initialization signal can be set on the agent so that it defaults to slow speed, and software can set the speed to slow, which may include clearing the initialization signal. The link can then come up in slow mode; the remote agent's PLL may not be relocked at this point. Operational speed can be set on both agents, and adaptation can be enabled if it has not been done previously. The initialization start indicator can then be cleared on both agents, and an in-band BLS reset can cause both agents to remain in CRS. Software can assert a warm reset, which can be a targeted reset or self-reset of the agent being added, which may cause its PLL to relock. Software may also set the initialization start signal by any known logic and further set it on the remote agent (thus advancing it to the Receiver Detect State (RDS)), and then de-assert the warm reset of the adding agent (thus advancing it to RDS as well). The link can then initialize at operational speed to a Transmitting Link State (or to Loopback if the adaptation signals are set), after which it can run at full operational speed. Some of these events may be omitted and additional events can be added.

Support for data lane failure recovery can also be provided. In one embodiment, a link in HPI can be resilient against hard errors on a lane by configuring itself to less than full width (e.g., less than half the full width), thereby excluding the faulty lane. The configuration can be done by the link state machine, and unused lanes can be disabled in the configuration state. As a result, flits can be sent across the link at the narrower width.
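
As an illustrative sketch only (the set of supported widths and the selection policy are assumptions, not the defined behavior), reconfiguring around a hard-failed lane might look like the following:

# Assumed set of widths the link can be configured to; the real set is
# design-specific (e.g., full width and reduced widths).
SUPPORTED_WIDTHS = (20, 10, 8)

def reconfigure_width(total_lanes, faulty_lanes):
    """Pick the largest supported width that can be mapped onto the
    remaining good lanes, excluding hard-failed lanes; unused lanes would
    then be disabled in the configuration state."""
    good = [lane for lane in range(total_lanes) if lane not in faulty_lanes]
    for width in SUPPORTED_WIDTHS:
        if width <= len(good):
            return good[:width]          # lanes to keep active
    raise RuntimeError("not enough good lanes to form a link")

print(reconfigure_width(20, faulty_lanes={3}))   # drops to a 10-lane map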

“In some implementations of HPI, lane reversal can be supported on some links. Lane reversal can refer, for instance, to lanes 0/1/2 . . . n of a transmitter being connected to lanes n/n−1/n−2 . . . 0 of a receiver (e.g., n can equal 19 or 7, among other examples). Lane reversal can be detected at the receiver, for example as identified in a field of a TS header. The receiver can handle the lane reversal by starting in the Polling state and using physical lanes n . . . 0 for logical lanes 0 . . . n. Hence, references to lanes may actually refer to logical lane numbers. Board designers may therefore lay out the physical or electrical design more efficiently, and HPI can work with virtual lane assignments, as described herein. Further, in one embodiment, polarity may be inverted (i.e., when a differential transmitter +/− is connected to a receiver −/+). Polarity can likewise be detected at the receiver from one or more TS header fields and can be handled, in one embodiment, in the Polling state.
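
A minimal sketch of handling lane reversal and polarity inversion at the receiver, assuming the reversal and inversion indications have already been decoded from a TS header, could look like this (the helper names and lane count are hypothetical):

def build_lane_map(n_lanes, reversed_detected):
    """Map logical lanes 0..n to physical lanes; if the TS header indicates
    lane reversal, logical lanes 0..n are carried on physical lanes n..0."""
    phys = list(range(n_lanes))
    return list(reversed(phys)) if reversed_detected else phys

def fix_polarity(bit, inverted):
    """Undo polarity inversion detected from a TS header field (i.e., the
    differential +/- pair wired to the receiver's -/+)."""
    return bit ^ 1 if inverted else bit

# Example with a hypothetical 8-lane link whose lanes arrive reversed.
lane_map = build_lane_map(8, reversed_detected=True)
print(lane_map)                           # [7, 6, 5, 4, 3, 2, 1, 0]
print(fix_polarity(1, inverted=True))     # 0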

Referring to FIG. 10, a block diagram is shown of an embodiment of a computing system including a multicore processor. Processor 1000 includes any processor or processing device, such as a microprocessor, an embedded processor, a digital signal processor (DSP), a network processor, a handheld processor, an application processor, a co-processor, a system on a chip (SoC), or other device to execute code. Processor 1000, in one embodiment, includes at least two cores, cores 1001 and 1002, which may include asymmetric cores or symmetric cores (the illustrated embodiment). However, processor 1000 may include any number of processing elements, which may be symmetric or asymmetric.

Summary for “High performance interconnect physical Layer”


“As computing systems advance, the components in them are becoming more complex. The interconnect architecture used to couple and communicate between components has also increased in complexity to ensure optimal component operation. Different market segments demand different aspects of interconnect architectures to suit their needs: servers require higher performance, while the mobile ecosystem is sometimes able to sacrifice overall performance for power savings. Yet it is the singular purpose of most fabrics to provide the highest possible performance with maximum power saving. Further, a variety of different interconnects can potentially benefit from the subject matter described herein.

“The Peripheral Component Interconnect Express (PCIe) interconnect fabric architecture and the QuickPath Interconnect (QPI) fabric architecture, among other examples, can potentially be improved according to one or more of the principles described herein. A primary goal of PCIe, for instance, is to enable components and devices from different vendors to inter-operate in an open architecture spanning multiple market segments: Clients (Desktops and Mobile), Servers (Standard and Enterprise), and Embedded and Communication devices. PCI Express is a general-purpose I/O interconnect defined for a wide variety of future computing platforms. Some PCI attributes, such as its usage model and load-store architecture, have been preserved through its revisions, whereas previously implemented parallel buses have been replaced with a fully serial, highly scalable interface. Newer versions of PCI Express make use of point-to-point interconnects and switch-based technology, and support packetized protocols, to deliver new levels of performance. PCI Express also supports advanced features such as Power Management, QoS, Hot-Plug/Hot-Swap support, and Data Integrity. Although the primary discussion herein is in reference to a new high-performance interconnect (HPI), aspects of the invention described in this document may also be applicable to other interconnect architectures, such as a PCIe-compliant architecture, a QPI-compliant architecture, a MIPI-compliant architecture, a high-performance architecture, or another known interconnect architecture.

Referring to FIG. 1, an embodiment of a fabric composed of point-to-point links that interconnect a set of components is illustrated. System 100 includes processor 105 and system memory 110 coupled to controller hub 115. Processor 105 may include any type of processing element, such as a microprocessor, an embedded processor, a co-processor, or other processor. Processor 105 is coupled to controller hub 115 through front-side bus (FSB) 106. In one embodiment, FSB 106 is a serial point-to-point interconnect as described below. In another embodiment, link 106 includes a serial, differential interconnect architecture that is compliant with a different interconnect standard.

“System memory 110 includes any memory device, such as random access memory (RAM), non-volatile (NV) memory, or other memory accessible by devices in system 100. System memory 110 is coupled to controller hub 115 through memory interface 116. Examples of a memory interface include a double-data-rate (DDR) memory interface, a dual-channel DDR memory interface, and a dynamic RAM (DRAM) memory interface, among others.

“In one embodiment, controller hub 115 may include a root hub, root complex, or root controller, such as in a PCIe interconnection hierarchy. Examples of controller hub 115 include a chipset, a memory controller hub (MCH), an interconnect controller hub (ICH), a southbridge, and a root controller/hub. The term chipset often refers to two physically separate controller hubs, for example a memory controller hub (MCH) coupled to an interconnect controller hub (ICH). Note that the MCH is often integrated with processor 105, while controller hub 115 communicates with I/O devices in a similar manner as described below. In some embodiments, peer-to-peer routing is optionally supported through root complex 115.

“Here, controller hub 115 is coupled to switch/bridge 120 through serial link 119. Input/output modules 117 and 121, which may also be referred to as interfaces/ports 117 and 121, can include/implement a layered protocol stack to provide communication between controller hub 115 and switch 120. In one embodiment, multiple devices are capable of being coupled to switch 120.

“Switch/bridge 120 routes packets/messages from device 125 upstream, i.e., up a hierarchy towards a root complex, to controller hub 115, and downstream, i.e., down a hierarchy away from a root controller, from processor 105 or system memory 110 to device 125. Switch 120, in one embodiment, is referred to as a logical assembly of multiple virtual PCI-to-PCI bridge devices. Device 125 includes any internal or external device or component to be coupled to an electronic system, such as an I/O device, a network interface controller (NIC), an add-in card, an audio processor, a network processor, a hard drive, a storage device, a router, a portable storage device, a Firewire device, a Universal Serial Bus (USB) device, a scanner, and other input/output devices. In PCIe vernacular, such a device is often referred to as an endpoint. Although not specifically shown, device 125 may include a bridge (e.g., a PCIe-to-PCI/PCI-X bridge) to support legacy or other versions of devices or interconnect fabrics supported by such devices.

“Graphics accelerator 130 can also be coupled to controller hub 115 through serial link 132. In one embodiment, graphics accelerator 130 is coupled to an MCH, which is coupled to an ICH; switch 120, and accordingly I/O device 125, is then coupled to the ICH. I/O modules 131 and 118 are also to implement a layered protocol stack to communicate between graphics accelerator 130 and controller hub 115. A graphics controller or the graphics accelerator 130 itself may be integrated in processor 105.

“Turning to FIG. 2, an embodiment of a layered protocol stack is illustrated. Layered protocol stack 200 can include any form of layered communication stack, such as a QPI stack, a PCIe stack, a next generation high performance computing interconnect (HPI) stack, or other layered stack. In one embodiment, protocol stack 200 includes a transaction layer 205, a link layer 210, and a physical layer 220. An interface, such as interfaces 117, 118, 121, 122, 126, and 131 in FIG. 1, may be represented as communication protocol stack 200. Representation as a communication protocol stack may also be referred to as a module or interface implementing/including a protocol stack.

“Packets can be used to communicate information between components. Packets can be formed in the Transaction Layer 205 and Data Link Layer 210 to carry information from the transmitting component to the receiving component. As the transmitted packets flow through the other layers, they are extended with additional information used to handle packets at those layers. At the receiving side, the reverse process occurs: packets are transformed from their Physical Layer 220 representation to the Data Link Layer 210 representation and, finally (for Transaction Layer Packets), to the form that can be processed by the Transaction Layer 205 of the receiving device.

“In one embodiment, transaction layer 205 can provide an interface between a device's processing core and the interconnect architecture, such as the Data Link Layer 210 and the Physical Layer 220. In this regard, a primary responsibility of the transaction layer 205 can include the assembly and disassembly of packets (i.e., transaction layer packets, or TLPs). The transaction layer 205 can also manage credit-based flow control for TLPs. In some implementations, split transactions can be utilized, allowing a link to carry other traffic while the target device gathers data for the response.

“Credit-based flow control can be used to realize virtual channels and networks utilizing the interconnect fabric. In one example, a device can advertise an initial amount of credits for each of the receive buffers in Transaction Layer 205. An external device at the opposite end of the link, such as controller hub 115 in FIG. 1, can count the number of credits consumed by each TLP. A transaction can be transmitted if it does not exceed the credit limit; upon receiving a response, the corresponding amount of credit is restored. One advantage of such a credit scheme is that the latency of credit return does not affect performance, provided the credit limit is not encountered.
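
A minimal sketch of the credit accounting described above, with invented class and method names, is shown below; an actual implementation would typically track credits separately per virtual channel and buffer type:

class CreditCounter:
    """Minimal credit-based flow control sketch: the receiver advertises an
    initial credit count per buffer, the sender debits credits as TLPs are
    sent, and credits are restored when credit returns arrive."""

    def __init__(self, advertised_credits):
        self.credits = advertised_credits

    def can_send(self, tlp_cost=1):
        return self.credits >= tlp_cost

    def send(self, tlp_cost=1):
        if not self.can_send(tlp_cost):
            raise RuntimeError("would exceed the advertised credit limit")
        self.credits -= tlp_cost

    def credit_return(self, amount=1):
        self.credits += amount

fc = CreditCounter(advertised_credits=4)
for _ in range(4):
    fc.send()                 # four TLPs consume the advertised credits
print(fc.can_send())          # False: must wait for a credit return
fc.credit_return(2)
print(fc.can_send())          # True again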

In one embodiment, four transaction address spaces can include a configuration address space, a memory address space, an input/output address space, and a message address space. Memory space transactions include one or more of read requests and write requests to transfer data to/from a memory-mapped location. In one embodiment, memory space transactions are capable of using two different address formats, e.g., a 32-bit short address format or a 64-bit long address format. Configuration space transactions can be used to access the configuration space of devices connected to the interconnect; transactions to the configuration space can include both read requests and write requests. Message space transactions (or, simply, messages) can also be defined to support in-band communication between interconnect agents. In one embodiment, transaction layer 205 can assemble packet header/payload 206.

Referring to FIG. 3, an example embodiment of a transaction layer packet descriptor is illustrated. In one embodiment, transaction descriptor 300 can be a mechanism for carrying transaction information. Transaction descriptor 300 supports identification of transactions within a system and can also be used to track modifications of default transaction ordering. Transaction descriptor 300 may include, for example, global identifier field 302, attributes field 304, and channel identifier field 306. In the illustrated example, global identifier field 302 is shown comprising local transaction identifier field 308 and source identifier field 310. In one embodiment, global transaction identifier 302 is unique for all outstanding requests.

According to one implementation, local transaction identifier field 308 is a field generated by a requesting agent, and it can be unique for all outstanding requests that require a completion for that agent. Furthermore, in this example, source identifier 310 uniquely identifies the requester agent within an interconnect hierarchy. Accordingly, together with source ID 310, the local transaction identifier field 308 provides global identification of a transaction within a hierarchy domain.

“Attributes field 304 specifies characteristics and relationships of the transaction. In this regard, attributes field 304 can be used to provide additional information that allows modification of the default handling of transactions. In one embodiment, attributes field 304 includes priority field 312, reserved field 314, ordering field 316, and no-snoop field 318. Here, priority sub-field 312 may be modified by an initiator to assign a priority to the transaction. Reserved attribute field 314 is left reserved for future or vendor-defined usage; possible usage models using priority or security attributes may be implemented using the reserved attribute field.

“In this example, ordering attribute field 316 is used to supply optional information conveying the type of ordering that may modify default ordering rules. According to one example implementation, an ordering attribute of “0” denotes that default ordering rules are to apply, while an ordering attribute of “1” denotes relaxed ordering, wherein writes and reads may pass in the same direction. Snoop attribute field 318 is utilized to determine whether transactions are snooped. As shown, channel ID field 306 identifies the channel with which a transaction is associated.
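
For illustration, the descriptor fields discussed in connection with FIG. 3 can be modeled as a simple record; the field widths and encodings below are placeholders rather than defined formats:

from dataclasses import dataclass

@dataclass
class TransactionDescriptor:
    """Sketch of transaction descriptor 300: a global identifier (local
    transaction ID plus source ID), attribute fields, and a channel ID.
    Field widths/encodings here are illustrative, not the spec's."""
    local_txn_id: int       # unique per outstanding request of the requester
    source_id: int          # uniquely identifies the requester agent
    priority: int = 0       # priority sub-field 312
    ordering: int = 0       # 0 = default ordering, 1 = relaxed ordering
    no_snoop: bool = False  # whether the transaction is snooped
    channel_id: int = 0

    def global_id(self):
        """Global identification of the transaction within a hierarchy domain."""
        return (self.source_id, self.local_txn_id)

req = TransactionDescriptor(local_txn_id=0x2A, source_id=0x01, ordering=1)
print(req.global_id())     # (1, 42)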

“Returning to the discussion of FIG. 2, a Link layer 210, also referred to as data link layer 210, can act as an intermediate stage between transaction layer 205 and physical layer 220. In one embodiment, a responsibility of the data link layer 210 is to provide a reliable mechanism for exchanging Transaction Layer Packets (TLPs) between two components on a link. One side of the Data Link Layer 210 accepts TLPs assembled by the Transaction Layer 205, applies a packet sequence identifier 211, i.e., an identification number or packet number, calculates and applies an error detection code, i.e., CRC 212, and submits the modified TLPs to the Physical Layer 220 for transmission across a physical medium to an external device.
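
A sketch of that transmit-side framing step is given below, using zlib's CRC-32 purely as a stand-in for the link-layer error detection code and an assumed sequence-number width:

import zlib

class DataLinkTx:
    """Sketch of the Data Link Layer transmit path: accept an assembled TLP,
    prepend a packet sequence identifier, append an error-detection code,
    and hand the result to the Physical layer. zlib's CRC-32 is used here
    only as a stand-in; the actual link-layer CRC differs."""

    def __init__(self):
        self.next_seq = 0

    def frame(self, tlp: bytes) -> bytes:
        seq = self.next_seq.to_bytes(2, "big")
        self.next_seq = (self.next_seq + 1) & 0xFFF   # assumed 12-bit sequence space
        crc = zlib.crc32(seq + tlp).to_bytes(4, "big")
        return seq + tlp + crc

tx = DataLinkTx()
frame = tx.frame(b"\x01\x02\x03")
print(frame.hex())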

“In one example, physical layer 220 includes logical sub-block 221 and electrical sub-block 222 to physically transmit a packet to an external device. Here, logical sub-block 221 is responsible for the “digital” functions of Physical Layer 220. In this regard, the logical sub-block can include a transmit section that prepares outgoing information for transmission by physical sub-block 222, and a receiver section that identifies and prepares received information before passing it to the Link Layer 210.

“Physical block 222 includes a transmitter and a receiver. The transmitter is supplied by logical sub-block 221 with symbols, which the transmitter serializes and transmits onto an external device. The receiver is supplied with serialized symbols from the external device and transforms the received signals into a bit-stream; the bit-stream is then de-serialized and supplied to logical sub-block 221. In one example embodiment, an 8b/10b transmission code is employed, where ten-bit symbols are transmitted/received. Here, special symbols are used to frame a packet with frames 223. In addition, in one example, the receiver also provides a symbol clock recovered from the incoming serial stream.

“Although transaction layer 205 and link layer 210 are discussed in reference to a specific embodiment of a protocol stack (such as a PCIe protocol stack), a layered protocol stack is not so limited. In fact, any layered protocol can be included/implemented and adopt the features described herein. As an example, a port/interface represented as a layered protocol can include: (1) a first layer to assemble packets, i.e., a transaction layer; a second layer to sequence packets, i.e., a link layer; and a third layer to transmit the packets, i.e., a physical layer. As a specific example, a high-performance interconnect layered protocol, as described herein, is utilized.

Referring next to FIG. 4, an example embodiment of a serial point-to-point fabric is illustrated. A serial point-to-point link can include any transmission path for transmitting serial data. In the embodiment shown, a link includes two low-voltage, differentially driven signal pairs: a transmit pair 406/411 and a receive pair 412/407. Accordingly, device 405 includes transmission logic 406 to transmit data to device 410 and receiving logic 407 to receive data from device 410. In other words, two transmitting paths, i.e., paths 416 and 417, and two receiving paths, i.e., paths 418 and 419, are included in some implementations of a link.

A transmission path refers to any path for transmitting data, such as a transmission line, a copper line, an optical line, a wireless communication channel, an infrared communication link, or other communication path. A connection between two devices, such as device 405 and device 410, is referred to as a link. A link may support one lane, with each lane representing a set of differential signal pairs (one pair for transmission, one pair for reception). To scale bandwidth, a link may aggregate multiple lanes denoted by xN, where N is any supported link width.

A differential pair refers to two transmission paths, such as lines 416 and 417, that transmit differential signals. As an example, when line 416 toggles from a low voltage level to a high voltage level, i.e., a rising edge, line 417 drives from a high logic level to a low logic level, i.e., a falling edge. Differential signals potentially demonstrate better electrical characteristics, such as better signal integrity, i.e., cross-coupling, voltage overshoot/undershoot, ringing, among other example advantages. This allows for a better timing window, which enables faster transmission frequencies.

“In one embodiment, a new High Performance Interconnect (HPI) is provided. HPI can include a next-generation cache-coherent, link-based interconnect. As one example, HPI may be utilized in high-performance computing platforms, such as workstations or servers, including in systems where PCIe or another interconnect protocol is typically used to connect processors, accelerators, I/O devices, and the like. However, HPI is not so limited; it may be utilized in any of the systems or platforms described herein. Furthermore, the individual ideas developed may be applied to other interconnects and platforms, such as PCIe and MIPI, among others.

To support multiple devices, in one example implementation, HPI can include a definition that is Instruction Set Architecture (ISA) agnostic (i.e., HPI can be implemented in many different devices). HPI can also be used to connect high-performance I/O devices, not just processors or accelerators. For example, a high-performance PCIe device could be coupled to HPI through an appropriate translation bridge (i.e., HPI to PCIe). Moreover, the HPI links can be utilized by many HPI-based devices, such as processors, in various ways (e.g., stars, rings, meshes, etc.). FIG. 5 illustrates example implementations of multiple potential multi-socket configurations. A two-socket configuration 505, as depicted, can include two HPI links; however, in other implementations, one HPI link may be utilized. For larger topologies, any configuration may be utilized as long as an identifier is assignable and there is some form of virtual path, among other additional or substitute features. In one example, the four-socket configuration 510 has an HPI link from each processor to another. In the eight-socket implementation shown in configuration 515, the sockets are likewise connected via HPI links; provided a virtual path or channel exists between the processors, the configuration is supported. In one embodiment, two (2) processors are supported in a native domain, with higher numbers of processors reached through the use of multiple domains or other interconnects between node controllers, among other examples.

The HPI architecture includes a definition of a layered protocol architecture, which in some examples includes protocol layers (coherent, non-coherent, and, optionally, other memory-based protocols), a routing layer, a link layer, and a physical layer. HPI can further include enhancements related to power management (such as power control units (PCUs)), design for test and debug (DFT), fault handling, registers, security, and so on. FIG. 6 illustrates an embodiment of an example HPI layered protocol stack; in some implementations, at least some of the layers illustrated in FIG. 6 may be optional. Each layer deals with its own level of granularity or quantum of information: the protocol layer 620a,b with packets 630, the link layer 610a,b with flits 635, and the physical layer 605a,b with phits 640. Note that a packet, in some embodiments, may include partial flits, a single flit, or multiple flits based on the implementation.

“As a first example, a width of a phit 640 includes a 1-to-1 mapping of link width to bits (e.g., a 20-bit link width includes a phit of 20 bits, etc.). Flits can have a greater size, such as 184, 192, or 200 bits. Note that if phit 640 is 20 bits wide and the size of flit 635 is 184 bits, then it takes a fractional number of phits 640 to transmit one flit 635 (e.g., 9.2 phits at 20 bits to transmit a 184-bit flit 635, or 9.6 phits at 20 bits to transmit a 192-bit flit 635, among other examples). Note that the width of the fundamental link at the physical layer can vary; for example, the number of lanes per direction can include 2, 4, 6, 8, 12, 14, 16, 18, 20, 22, 24, etc. In one embodiment, link layer 610a,b is capable of embedding multiple pieces of different transactions in a single flit, and one or multiple headers (e.g., 1, 2, 3, 4) can be embedded within the flit. In one example, HPI splits the headers into corresponding slots to enable multiple messages in the flit destined for different nodes.
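
The fractional phit-per-flit arithmetic above can be checked with a one-line helper:

def phits_per_flit(flit_bits, link_width_bits):
    """A phit maps 1:1 to the link width, so a flit spans
    flit_bits / link_width_bits phits (possibly fractional)."""
    return flit_bits / link_width_bits

print(phits_per_flit(184, 20))   # 9.2 phits for a 184-bit flit on a 20-lane link
print(phits_per_flit(192, 20))   # 9.6 phits for a 192-bit flit on a 20-lane link
print(phits_per_flit(192, 8))    # 24.0 phits on an 8-lane link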

“Physical layer 605a,b, in one embodiment, can be responsible for the fast transfer of information on the physical medium (electrical or optical, etc.). The physical link can be point-to-point between two Link layer entities, such as layers 605a and 605b. The Link layer 610a,b can abstract the Physical layer 605a,b from the upper layers and provide the ability to reliably transfer data (as well as requests) and manage flow control between two directly connected entities. The Link layer can also be responsible for virtualizing the physical channel into multiple virtual channels and message classes. The Protocol layer 620a,b relies on the Link layer 610a,b to map protocol messages into the appropriate message classes and virtual channels before handing them to the Physical layer 605a,b for transfer across the physical links. The Link layer 610a,b may support multiple messages, such as a request, snoop, response, writeback, and non-coherent data, among other examples.

“The Physical layer 605a,b (or PHY) of HPI can be implemented above the electrical layer (i.e., the electrical conductors connecting two components) and below the link layer 610a,b, as illustrated in FIG. 6. The Physical layer and its corresponding logic can reside on each agent, connecting the link layers of two agents (A and B) that are separated from each other (e.g., on devices on either side of a link) by the physical media (e.g., wires, conductors, optics, etc.). In one embodiment, the Physical layer 605a,b has two major phases: initialization and operation. During initialization, the connection is opaque to the link layer, and signaling can involve a combination of timed states and handshake events. During operation, the connection is transparent to the link layer, and signaling is at a speed with all lanes operating together as a single link. During the operation phase, the Physical layer transports flits between agent A and agent B. The connection is also referred to as a link and abstracts some physical aspects, including media, width, and speed, from the link layers while exchanging flits and control/status of the current configuration (e.g., width) with the link layer. The initialization phase includes minor phases, e.g., Polling and Configuration; the operation phase also includes minor phases (e.g., link power management states).

“In one embodiment, Link layer 610a,b can be implemented so as to provide reliable data transfer between two protocol or routing entities. The Link layer can abstract the Physical layer 605a,b from the Protocol layer 620a,b, can be responsible for flow control between two protocol agents (A and B), and can provide virtual channel services to the Protocol layer (Message Classes) and Routing layer (Virtual Networks). The interface between the Protocol layer 620a,b and the Link layer 610a,b is typically at the packet level. In one embodiment, the smallest transfer unit at the Link layer is referred to as a flit, which is a specified number of bits, such as 192 bits, or some other denomination. The Link layer 610a,b relies on the Physical layer 605a,b to frame the Physical layer's 605a,b units of transfer (phits) into the Link layer's 610a,b units of transfer (flits). In addition, the Link layer 610a,b may be logically broken into two parts: a sender and a receiver; a sender/receiver pair on one entity may be connected to a receiver/sender pair on another entity. Flow control can be performed on both a flit and a packet basis; error detection and correction can also be performed on a packet basis.

“In one embodiment, Routing layer 615a,b can provide a flexible and distributed method to route HPI transactions from a source to a destination. The scheme can be customized to suit multiple topologies through programmable routing tables at each router (the programming, in one embodiment, being performed by firmware, software, or a combination thereof). The routing functionality can be distributed, with routing done through a series of routing steps, each routing step defined by a lookup of a table at either the source, intermediate, or destination routers. A lookup at a source can be used to inject an HPI packet into the HPI fabric; a lookup at an intermediate router can be used to route an HPI packet from an input port to an output port; and a lookup at the destination port can be used to target the destination HPI protocol agent. Note that the Routing layer, in some implementations, can be thin, since the routing tables, and hence the routing algorithms, are not specifically defined by specification. This allows flexibility and a variety of usage models, including flexible platform architectural topologies defined by the system implementation. The Routing layer 615a,b relies on the Link layer 610a,b to provide the use of up to three (or more) virtual networks (VNs): in one example, two deadlock-free VNs, VN0 and VN1, with several message classes defined in each virtual network. A shared adaptive virtual network (VNA) may be defined in the Link layer, but this adaptive network may not be exposed directly in routing concepts, since each message class and virtual network may have dedicated resources and guaranteed forward progress, among other features and examples.

In some implementations of HPI, an embedded clock can be used. Data can be transmitted via the interconnect with a clock signal embedded in the data, allowing separate clock lanes to be omitted. This can be useful, for instance, as it allows more pins of a device to be dedicated to data transfer, particularly in systems where space for pins is at a premium.

An interconnect can establish a link between two agents on either side of it, with the agent sending data acting as a local agent and the agent receiving the data acting as a remote agent. State machines can be employed by both agents to manage various aspects of the link. In one embodiment, the Physical layer datapath can transmit flits from the link layer to the electrical front-end. The control path, in one implementation, includes a state machine (also referred to as a link training state machine or the like). The state machine's actions and exits from states can depend on internal signals, timers, external signals, or other information. In fact, some states, such as a few initialization states, can have timers that provide a timeout value for exiting the state. Note that detect, in some embodiments, refers to detecting an event on both legs of a lane, though not necessarily simultaneously; in other embodiments, detect refers to detection of an event by an agent of reference. Debounce, as one example, refers to sustained assertion of a signal. In one embodiment, HPI supports operation in the event of non-function lanes, allowing lanes to be dropped at specific states.

“States can be reset states, initialization states, and operational states, among other categories and subcategories. In one example, some initialization states can have a secondary timer that is used to exit the state on a timeout (essentially an abort due to failure to make progress in the state). An abort can include updating of status registers. Some states can also have a primary timer that is used to time the primary functions in the state. Other states can be defined such that internal or external signals (such as handshake protocols) drive the transition from the state to another state, among other examples.

“A state machine may also support debug through single stepping, freeze on initialization abort, and the use of testers. Here, state exits can be postponed or held until the debug software is ready; in some instances, the exit can be postponed or held until the secondary timeout. Actions and exits, in one embodiment, can be based on the exchange of training sequences. In one embodiment, the link state machine is to run in the local agent clock domain, and the transition from one state to the next is to coincide with a transmitter training sequence boundary. Status registers may be utilized to reflect the current state.

“FIG. 7 illustrates a representation of at least a portion of a state machine used in one example implementation of HPI. It should be appreciated that the state table of FIG. 7 includes a non-exhaustive listing of possible states. For instance, some transitions are omitted to simplify the diagram. Also, some states may be combined, split, or omitted, while others might be added. Such states can include:

“Event reset state: entered on a warm or cold reset event. Restores default values. Initializes counters (e.g., sync counters). May exit to another state, such as another reset state.

“Timed reset state: a timed state for in-band reset. Remote receivers can detect the EOS and enter the timed reset as well. The receiver has lanes holding electrical settings. May exit to a calibrate reset state.

“Calibrate reset state: calibration without signaling on the lane (e.g., receiver calibration state) or with drivers turned off. May be a predetermined amount of time in the state based on a timer. May set an operational speed. May act as a wait state when a port is not enabled. May include minimum residency time. Receiver conditioning or staggering off may occur based on design. After completing calibration, the receiver may exit to a detect state.

“Receiver detect state: detect the presence of a receiver on the lane(s). Receiver termination (e.g., receiver pulldown insertion) may be looked for. May exit to the calibrate reset state when a specified value is set or when another specified value is not set. May exit to the transmitter calibrate state if a receiver is detected or a timeout is reached.

“Transmitter calibrate state: for transmitter calibrations. May be a timed state allocated for transmitter calibrations. May include signaling on a lane; for example, an EOS, such as an electrical idle exit ordered set (EIEOS), may be driven continuously. May exit to a compliance state when calibration is complete or when a timer expires. May exit to the transmitter detect state if a counter has expired or a secondary timeout has occurred.

“Transmitter detect state: qualifies valid signaling. May be a handshake state in which an agent completes actions and exits to a next state based on remote agent signaling. The receiver may qualify valid signaling from the transmitter. The receiver, in one embodiment, looks for a wake detect and, if debounced on one or more lanes, looks for it on the other lanes. The transmitter drives a detect signal. May exit to a polling state upon completion of debounce on all lanes and/or a timeout, or if debounce is not completed on all lanes and there is a timeout. Here, one or more monitor lanes may be kept awake to debounce a wake signal; once that wake signal is debounced, the other lanes can potentially be debounced. This can enable power savings in low power states.

“Polling state: receiver adapts, initializes the drift buffer, and locks on bits/bytes (e.g., identifies symbol boundaries). In some cases, lanes may be rearranged. A remote agent may cause an exit to a next state (e.g., a link width state) in response to an acknowledgement message. Polling can additionally include locking to a training sequence header and an EOS. Lane-to-lane skew at the remote transmitter can be capped at a first length for top speed and at a second length for slow speed. Deskew can be performed in a slow mode as well as in an operational mode. The receiver may have a specific maximum lane-to-lane skew to deskew, such as 8, 16, or 32 intervals of skew. Receiver actions may also include latency fixing. In one embodiment, receiver actions can be completed upon successful deskew of a valid lane map. A successful handshake can be achieved, in one example, when a number of consecutive training sequence headers are received with acknowledgements, and a number of training sequences with acknowledgements are transmitted after the receiver has completed its actions.

“Link width state: the agent communicates a final lane map to the remote transmitter. The receiver receives the information and decodes it. The receiver may record a configured lane map in a structure after checking for a previous lane map value in a second structure. The receiver may also respond with an acknowledgement (“ACK”). May initiate an in-band reset; as one example, this is the first state to initiate an in-band reset. In one embodiment, exit to a next state (e.g., a flit configuration state) is performed in response to the ACK. Further, prior to entering a low power state, a reset signal may also be generated if the frequency of wake detect signal occurrences drops below a specified value (e.g., 1 per a number of unit intervals, such as 4K UI). The receiver may hold both current and previous lane maps. The transmitter may use different groups of lanes based on training sequences having different values. The lane map may not modify some status registers in some embodiments.

“Flitlock configuration state: entered by a transmitter, but the state is considered exited (i.e., a secondary timeout is moot) when both the transmitter and receiver have exited to a blocking link state or another link state. In one embodiment, the transmitter's exit to a link state includes a start of data sequence (SDS) and training sequence boundary after receiving a planetary alignment signal. Here, the receiver exit may be based on receiving an SDS from the remote transmitter. This state may be a bridge from the agent to the link state. The receiver identifies the SDS. The receiver may exit to a blocking link state (BLS) (or a control window) if the SDS is received after a descrambler is initialized. If a timeout occurs, the exit may be to a reset state. The transmitter drives lanes with a configuration signal. Transmitter exit may be to reset, BLS, or other states based on conditions or timeouts.

“Transmitting link state: a link state. Flits are sent to a remote agent. May be entered from a blocking link state and may return to a blocking link state on an event, such as a timeout. The transmitter transmits flits and the receiver receives flits. May also exit to a low power link state. In some implementations, the transmitting link state (TLS) can be referred to as the L0 state.

“Blocking link state: a link state. The transmitter and receiver operate in a unified manner. May be a timed state during which the link layer flits are held off while Physical layer information is communicated to the remote agent. May exit to a low power link state (or another link state based on the design). The blocking link state (BLS), in one embodiment, occurs periodically; the period is referred to as a BLS interval, can be timed, and may differ between slow speed and operational speed. Note that the link layer may be periodically blocked from sending flits so that a Physical layer control sequence of a certain length may be sent, such as during a transmitting link state or a partial width transmitting link state. In some implementations, the blocking link state (BLS) can be referred to as the L0 control, or L0c, state.

“Partial width transmitting link state: a link state. Power may be saved by entering a partial width state. In one embodiment, asymmetric partial width refers to each direction of a two-direction link having different widths, which may be supported in some designs. An example of an initiator, such as a transmitter, sending a partial width indication to enter a partial width transmitting link state is shown in the example of FIG. 9. Here, a partial width indication is sent while transmitting on a link with a first width to transition the link to transmit at a second, new width; a mismatch may result in a reset. Note that speeds may not be altered but widths may be. Flits are potentially sent at different widths. This state may be logically similar to a transmitting link state; however, because the width is smaller, it may take longer to transmit flits. May exit to other link states, such as a low power link state based on certain received and sent messages, or an exit of the partial width transmitting link state or a link blocking state based on other events. In one embodiment, a transmitter port may turn idle lanes off in a staggered manner, as shown in the timing diagram, to provide better signal integrity (i.e., noise mitigation). Here, non-retryable flits, e.g., null flits, may be utilized during periods where the link width is changing; a corresponding receiver may drop these null flits and turn idle lanes off in a staggered manner, as well as record the current and previous lane maps in one or more structures. Note that status and associated status registers may remain unaltered. In some implementations, the partial width transmitting link state can be referred to as a partial L0, or L0p, state.

“Exit partial width transmitting link state: exit the partial width state. May or may not use a blocking link state in some implementations. The transmitter initiates exit, in one embodiment, by sending partial width exit patterns on the idle lanes to train and deskew them. One example of an exit pattern starts with EIEOS, which is detected and debounced to signal that the lane is ready to start entry to a full transmitting link state, and may end with SDS or a Fast Training Sequence (FTS) on the idle lanes. Any failure during the exit sequence (receiver actions, such as deskew, not completed prior to timeout) stops flit transfers to the link layer and asserts a reset, which is handled by resetting the link on the next blocking link state occurrence. The SDS may also initialize the scrambler/descrambler on the lanes to appropriate values.”

A “low power link state” is a lower power link state. In one embodiment it is lower power than the partial width link state, since signaling in this embodiment is stopped in both directions and on all lanes. Transmitters may use a blocking link state to request a low power link state. Here, the receiver can decode the request and respond with an ACK or a NAK; otherwise a reset may be triggered. Some implementations can refer to the low power link state as the L1 state.

Some implementations allow state transitions to be facilitated so that states can be bypassed, for instance, when the actions of those states, such as certain configurations or calibrations, have already been completed. Previous state results and configurations of a link can be stored and reused in subsequent initializations and configurations of the link; rather than repeating the corresponding configurations and state actions, the corresponding states can be bypassed. Traditional systems implementing state bypasses, however, often have complex designs and costly validation escapes. Rather than using a traditional bypass, HPI can utilize short timers in certain states, for instance, where the state actions do not need to be repeated. This can potentially allow for more uniform and synchronized state machine transitions, among other benefits.

In one example, a software-based controller (e.g., through an external control point for the Physical layer) can enable a short timer for one or more particular states. For instance, a state for which actions have already been performed and stored can be short-timed to allow a quick exit from that state to the next. If, however, the previous state action fails or cannot be applied within the short timer duration, a state exit can be performed. Further, the controller can disable the short timer, for instance, when the state actions should be performed anew. A long, or default, timer can be set for each respective state. If configuration actions at a state cannot be completed within the long timer, a state exit can occur. The long timer can be set to a reasonable duration that allows the state actions to complete; the short timer, by contrast, may be considerably shorter, making it impractical in some cases to perform the state actions without falling back on previously performed state actions. A sketch of this timer selection follows.
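
The following Python sketch illustrates the short-timer approach (in place of a traditional bypass): it picks either a short or a default timeout for a state depending on whether stored results from a previous run can be reused. The `StateConfig` fields, `run_state` helper, and the calibration example are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class StateConfig:
    default_timeout_ui: int            # long timer, sized so state actions can complete
    short_timeout_ui: int              # much shorter; relies on saved results
    short_timer_enabled: bool = True   # a software control point may disable this
    saved_results: Optional[dict] = None

def run_state(cfg: StateConfig, perform_actions: Callable[[int], tuple]):
    """Run one initialization state.

    If previous results exist and the short timer is enabled, reuse them and
    exit quickly; otherwise repeat the actions under the default (long) timer.
    Returns (success, results)."""
    if cfg.short_timer_enabled and cfg.saved_results is not None:
        # Reuse stored results; only the short timer applies, no repeated actions.
        return True, cfg.saved_results
    done, results = perform_actions(cfg.default_timeout_ui)
    if done:
        cfg.saved_results = results     # store for later short-timed runs
    return done, results

# Example: a calibration that "completes" only when given enough time.
calib = StateConfig(default_timeout_ui=4096, short_timeout_ui=128)
ok, _ = run_state(calib, lambda t: (t >= 1000, {"tx_eq": 3}))
print("first pass:", ok)
ok, _ = run_state(calib, lambda t: (t >= 1000, {"tx_eq": 3}))
print("short-timed pass:", ok)
```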

In certain instances, while initialization (or re-initialization) of a link is taking place, as agents progress through a state machine toward an operational link state, one or more failures or state exits can occur that cause the state to reset (e.g., to a reset or other state). In effect, the initialization of the link can loop through one or more states without completing initialization and entering a link state. In one example, a count can be maintained of the number of unproductive loops in state transitions during the initialization of a link. For instance, each time an initialization returns to a reset state without reaching a link state, a counter can be incremented. The counter can be reset for the link once the link successfully enters a link state. Such counters can be maintained by agents on both sides of the link. Further, a threshold can be set, for instance, by a software-based controller utilizing one or more external control points. When the count of unproductive loops meets (or exceeds) the threshold, initialization of the link can be suspended, for instance, by setting and holding the link at or before the reset state. A software-based controller can then trigger a restart of the initialization or release it from suspension. In some instances, software-based tools can analyze the nature of the suspended initialization, perform diagnostics, set register values, and perform other operations so as to guard against further looping. In some cases, the controller can set a higher threshold or even override the counter in connection with restarting a suspended link initialization, among other examples.
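
The loop-counting behavior can be sketched as follows. This is an illustrative Python model, not the specification's mechanism; the class name, method names, and threshold handling are assumptions.

```python
class InitLoopTracker:
    """Counts unproductive initialization loops (reset -> ... -> back to reset).

    When the count meets the software-configured threshold, the link is held
    in reset until software releases it (after diagnostics, register tweaks,
    raising the threshold, etc.)."""

    def __init__(self, threshold: int):
        self.threshold = threshold    # set via an external control point
        self.count = 0
        self.held_in_reset = False

    def on_return_to_reset(self):
        self.count += 1
        if self.count >= self.threshold:
            self.held_in_reset = True  # stop looping; wait for software

    def on_link_state_entered(self):
        self.count = 0                 # productive initialization; clear counter

    def software_release(self, new_threshold=None):
        if new_threshold is not None:
            self.threshold = new_threshold
        self.count = 0
        self.held_in_reset = False     # restart initialization

tracker = InitLoopTracker(threshold=3)
for _ in range(3):
    tracker.on_return_to_reset()
print("held in reset:", tracker.held_in_reset)
tracker.software_release(new_threshold=10)
print("held in reset:", tracker.held_in_reset)
```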

In some implementations of HPI, supersequences can be defined, each corresponding to a respective state or an entry to/exit from that state. A supersequence can include a repeating sequence of data sets or symbols. The sequences can repeat, in some instances, until completion of a state or state transition, or communication of a corresponding event, among other examples. In some instances, the repeating sequence of a supersequence can repeat at a defined frequency, such as a defined number of unit intervals (UIs). A unit interval (UI) can correspond to the time for transmitting a single bit on a lane of a link or system. In some implementations, the repeating sequence can begin with an electrical ordered set (EOS). Accordingly, an instance of the EOS can be expected to repeat in accordance with the predefined frequency. Such ordered sets can be implemented as defined 16-byte codes that may be represented in hexadecimal format, among other examples. In one example, the EOS of a supersequence can be an EIEOS. An EIEOS can resemble a low frequency clock signal, for instance, a predefined number of repeating FF00 or FFF000 hexadecimal symbols, among other examples. A predefined set of data can follow an EOS, such as a predefined number of training sequences or other data. Such supersequences can be utilized in state transitions, including link state transitions, as well as initialization, among other examples.
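
A supersequence of this shape is straightforward to model. The sketch below builds a byte stream consisting of an EIEOS-like pattern (here the repeating FF00 form mentioned above) followed by a number of training-sequence payloads, repeated so that the EIEOS recurs at a fixed UI cadence. The sizes, the filler TS payload, and the helper names are assumptions, not the defined HPI encodings.

```python
# Illustrative sizes (assumed): 16-byte ordered sets, 8 UIs per byte on a single lane.
EIEOS = bytes([0xFF, 0x00] * 8)            # low-frequency-clock-like pattern, 16 bytes

def make_training_sequence(ts_id: int, length: int = 16) -> bytes:
    """Stand-in training sequence: a one-byte header followed by filler bytes."""
    return bytes([ts_id & 0xFF]) + bytes(length - 1)

def make_supersequence(ts_per_eieos: int, repeats: int) -> bytes:
    """One EOS (EIEOS) followed by ts_per_eieos training sequences, repeated."""
    period = EIEOS + b"".join(make_training_sequence(i) for i in range(ts_per_eieos))
    return period * repeats

seq = make_supersequence(ts_per_eieos=7, repeats=4)
period_bytes = len(seq) // 4
print("EIEOS recurs every", period_bytes * 8, "UIs on a single lane")  # ~1 KUI here
```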

In some implementations of interconnects, such as QPI, terminations of a serial data link can be brought on and off, such as when a link is reset or initialized. This approach can introduce complexity and time into the initialization of a link. In some implementations of HPI, terminations of the link can be maintained, including during a reset or re-initialization of the link. Further, HPI can permit hot-plugging of devices. When another device is introduced, whether through hot-plugging or otherwise, the voltage characteristics of the lane on which the new remote agent is added will change. The local agent can sense these changes in lane voltage to detect the presence of the remote agent and prompt initialization of the link. State machine states and timers can be defined in the state machine to coordinate the detection, configuration, and initialization of a link without terminations.

HPI can support re-initialization on an in-band reset without changing the termination values, through the screening of the lane by the receiving agent for incoming signaling. The signaling can be used to identify good lanes. As an example, the lane can be screened for any one of a set of predefined signals that are to be sent by a transmitter device so as to facilitate discovery and configuration of the link. In one example, a supersequence can be defined that corresponds to one or more initialization or re-initialization tasks. The predefined sequence can include an EIEOS followed by additional sequence data. In some instances, as each device on either side of a lane becomes active, the devices can begin sending a supersequence corresponding to a particular initialization state. In one embodiment, two types of pin resets can be supported: a power-on (or “cold”) reset and a warm reset. A reset initiated by software, or originating on one agent (in the Physical layer or another layer), may be communicated in-band to the other agent. Because of the embedded clock, however, an in-band reset may be handled by communicating to the other agent using an ordered set, such as a specific electrical ordered set or EIOS.

The ordered set can be sent after initialization. Blocking the link layer from sending flits can be accomplished by setting the blocking link state. As another example, link layer traffic may be stopped while a few NULL flits are sent, which can be discarded at the receiver.

Initialization can, as noted above, be performed first at slow speed and then at fast speed. Initialization at slow speed uses default values for the registers and timers. Software then uses the slow speed link to set up the registers, timers, and electrical parameters, and clears the calibration semaphores to pave the way for fast-speed initialization. Initialization may include states or tasks such as Reset, Detect, Polling, and Configuration, among potentially others.

In one example, a link layer blocking control sequence, i.e., a blocking link state (BLS or L0c) state, can include a timed state during which link layer flits are held off while the PHY information is communicated to the remote agent. Here, the transmitter and receiver may start a block control sequence timer. Upon expiration of the timers, the transmitter and receiver can exit the blocking state and take other actions, such as exit to reset, exit to a different link state (or other state), or exit to a state that allows flits to be sent across the link.

Link training can be provided in one embodiment and can include the sending of one or more of scrambled training sequences, ordered sets, and control sequences, such as in connection with a defined supersequence. A training sequence symbol may include one or more of a header, reserved portions, a target latency, a pair number, a physical lane map code, reference lanes or a group of lanes, and an initialization state. In one embodiment, the header can be sent with an ACK or NAK, among other examples. As an example, training sequences may be sent as part of supersequences and may be scrambled.
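
To make the training-sequence layout above concrete, here is a hypothetical Python representation of a TS symbol carrying the listed fields (a header carrying an ACK/NAK, target latency, pair number, lane map code, reference lanes, and initialization state). The field widths, the enum values, and the packing in `header_byte` are illustrative assumptions only.

```python
from dataclasses import dataclass
from enum import Enum

class Handshake(Enum):
    ACK = 0
    NAK = 1

class InitState(Enum):
    RESET = 0
    DETECT = 1
    POLLING = 2
    CONFIG = 3

@dataclass
class TrainingSequence:
    """Illustrative TS symbol; field names follow the description, widths assumed."""
    handshake: Handshake        # carried in the header
    target_latency: int
    pair_number: int
    lane_map_code: int          # physical lane map code
    reference_lanes: int
    init_state: InitState
    reserved: int = 0

    def header_byte(self) -> int:
        # Pack the handshake and initialization state into one byte (assumed layout).
        return (self.handshake.value << 7) | (self.init_state.value & 0x0F)

ts = TrainingSequence(Handshake.ACK, target_latency=12, pair_number=1,
                      lane_map_code=0x3, reference_lanes=0b0101,
                      init_state=InitState.POLLING)
print(f"TS header byte: 0x{ts.header_byte():02x}")
```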

In one embodiment, ordered sets and control sequences are not scrambled or staggered, and are transmitted identically, simultaneously, and completely on all lanes. Valid reception of an ordered set may include checking at least a portion of the ordered set (or the entire ordered set for partial ordered sets). Ordered sets can include an electrical ordered set (EOS), such as an Electrical Idle Ordered Set (EIOS) or an EIEOS. A supersequence can include a start of data sequence (SDS) or a Fast Training Sequence (FTS). Such sets and control sequences can be predefined and can have any pattern or hexadecimal representation, as well as any length; for example, ordered sets and supersequences can be 8, 16, 32, or 64 bytes in length. FTS, as an example, can additionally be utilized for fast bit lock during exit of a partial width transmitting link state. Note that the FTS definition can be per lane, and a rotated version of the FTS can be utilized.

Supersequences, in one embodiment, can include the insertion of an EOS, such as an EIEOS, in a training sequence stream. When signaling starts, lanes, in one implementation, power on in a staggered manner. This can result, however, in initial supersequences appearing truncated at the receiver on some lanes. Supersequences can be repeated over short intervals (e.g., approximately one thousand unit intervals, or ~1 KUI), however. The training supersequences can additionally be used for one or more of deskew, configuration, and communicating the initialization target, lane map, and so on. An EIEOS can be used for one or more of transitioning a lane from inactive to active, screening for good lanes, and identifying TS boundaries, among other examples.

Turning to FIG. 8, examples of supersequences are illustrated. For instance, an example Detect supersequence 805 is shown. The Detect supersequence 805 can include a repeating sequence of a single EIEOS (or other EOS) followed by a predefined number of instances of a particular training sequence (TS). In one example, the EIEOS can be transmitted, immediately followed by seven repeated instances of the TS. When the last of the seven TSes has been sent, the EIEOS can be transmitted again, followed by seven additional instances of the TS, and so on. This sequence can be repeated according to a particular predefined frequency. In the example of FIG. 8, the EIEOS can reappear on the lanes approximately once every one thousand UIs (~1 KUI), followed by the remainder of the Detect supersequence 805. A receiver can monitor the lanes for the presence of this repeating Detect supersequence 805 and, upon validating it, identify the corresponding condition, such as the presence of a device on the link.
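
A receiver-side check for this cadence might look like the following sketch: it scans a captured byte stream for the EIEOS pattern and verifies that it recurs at roughly the 1 KUI interval. The constants (EIEOS bytes, UIs per byte, tolerance) and helper names are assumptions, not the HPI-defined values.

```python
EIEOS = bytes([0xFF, 0x00] * 8)          # assumed 16-byte EIEOS pattern
UI_PER_BYTE = 8                          # one byte = 8 UIs on a single lane
DETECT_PERIOD_UI = 1024                  # ~1 KUI between EIEOS starts

def eieos_offsets(stream: bytes):
    """Byte offsets at which the EIEOS pattern appears in the capture."""
    return [i for i in range(len(stream) - len(EIEOS) + 1)
            if stream[i:i + len(EIEOS)] == EIEOS]

def matches_detect_cadence(stream: bytes, tolerance_ui: int = 16) -> bool:
    """True if successive EIEOS instances are about DETECT_PERIOD_UI apart."""
    offsets = eieos_offsets(stream)
    if len(offsets) < 2:
        return False
    gaps_ui = [(b - a) * UI_PER_BYTE for a, b in zip(offsets, offsets[1:])]
    return all(abs(gap - DETECT_PERIOD_UI) <= tolerance_ui for gap in gaps_ui)

# Build a fake capture: EIEOS + 7 x 16-byte TS payloads, repeated (1024 UI period).
ts = bytes(16)
capture = (EIEOS + ts * 7) * 3
print("looks like a Detect supersequence:", matches_detect_cadence(capture))
```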

As another example, another supersequence 810 can be defined and used to indicate a polling, configuration, or loopback condition or state. As with the example Detect supersequence 805, a receiver can monitor the lanes of a link for such a Poll/Config/Loop supersequence 810 in order to identify the polling state, configuration state, or loopback state or condition. In one example, a Poll/Config/Loop supersequence 810 can begin with an EIEOS followed by a predefined number of repeated instances of a TS. For instance, the EIEOS can be followed by thirty-one (31) instances of the TS, with the EIEOS repeating approximately every four thousand UIs (e.g., ~4 KUI).

As a further example, a partial width transmitting state (PWTS) exit supersequence 815 can be defined. In one example, a PWTS exit supersequence can include an initial EIEOS that repeats to pre-condition the lanes in advance of the sending of the full sequence. For instance, the sequence in supersequence 815 can begin with an EIEOS and repeat approximately once every 1 KUI. Further, fast training sequences (FTS) can be utilized in lieu of other training sequences, the FTS being designed to assist in quicker bit lock, byte lock, and deskew. In some cases an FTS can be unscrambled to further assist in bringing idle lanes back to active as quickly as possible. As with other supersequences preceding an entry into a link transmitting state, the supersequence 815 can be interrupted and terminated through the sending of a start of data sequence (SDS). Further, a partial FTS (FTSp) can be sent to assist in synchronizing the new lanes to the active lanes, such as by allowing bits to be subtracted from (or added to) the FTSp, among other examples.

Supersequences, such as the Detect supersequence 805 and the Poll/Config/Loop supersequence 810, can potentially be sent substantially throughout the initialization or re-initialization of a link. A receiver, on receiving and detecting a particular supersequence, can in some instances respond by echoing the same supersequence over the lanes to the transmitter. The receiving and validation of a particular supersequence by transmitter and receiver can serve as a handshake to acknowledge, for example, the state or condition communicated through the supersequence. For instance, such a handshake (e.g., utilizing a Detect supersequence 805) can be used to recognize re-initialization of a link. In another example, such a handshake can be utilized to indicate the end of an electrical reset or low power state, resulting in corresponding lanes being brought back up, among other examples. The end of an electrical reset can be identified, for instance, from a handshake between transmitter and receiver, each transmitter transmitting a Detect supersequence 805 to its respective receiver.

As another example, lanes can be monitored for supersequences, and the supersequences can be used in connection with screening lanes for state exits, entries, and wakes, among other events. The predefined and predictable nature of the supersequences can further be used to perform initialization tasks such as bit lock, byte lock, debouncing, descrambling, deskewing, adaptation, latency fixing, negotiated delays, and other potential uses. Indeed, lanes can be monitored substantially continuously for such events to quicken the system's ability to react to and process them.

In the case of debouncing, transients can be introduced on lanes as a result of a variety of conditions. For instance, the addition or powering-on of a device can introduce transients onto a lane. Voltage irregularities can also be presented on a lane because of poor lane quality or electrical failure. In some instances, “bouncing” on a lane can produce false positives, such as a false EIEOS. However, while supersequences can begin with an EIEOS, defined supersequences further include additional sequences as well as a defined frequency at which the EIEOS will repeat. As a result, even where a false EIEOS appears on a lane, a logic analyzer at the receiver can determine that the EIEOS is a false positive by validating the data that succeeds it. For instance, if an expected TS or other data does not follow the EIEOS, or the EIEOS does not repeat within one of the predefined frequencies of one of the predefined supersequences, the receiver logic analyzer can fail validation of the received EIEOS. False negatives can also occur when a device is added to or removed from a line. For instance, upon being added to a link, a device can begin sending a Detect supersequence 805 to alert the other side of the link of its presence and to initiate initialization of the link. However, transients introduced on the lanes may corrupt the initial EIEOS, TS instances, and other data of the supersequence. A logic analyzer on the receiving device can nonetheless continue to monitor the lanes and identify the next EIEOS sent by the new device in the repeating Detect supersequence 805, among other examples.
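
The false-positive filtering described above can be sketched as a simple validator: a candidate EIEOS is accepted only if the expected data follows it and it repeats at one of the predefined supersequence periods. The EIEOS bytes, the TS header value, and the period constants below are assumptions for illustration.

```python
EIEOS = bytes([0xFF, 0x00] * 8)                 # assumed 16-byte pattern
TS_HEADER = 0xAA                                 # assumed training-sequence header byte
SUPERSEQ_PERIODS_BYTES = (128, 512)              # ~1 KUI and ~4 KUI at 8 UI/byte

def is_real_eieos(stream: bytes, off: int) -> bool:
    """Filter transients ('bouncing') that merely resemble an EIEOS.

    A candidate at `off` is accepted only if (a) the bytes match the EIEOS,
    (b) a training sequence header follows it, and (c) the EIEOS repeats at
    one of the predefined supersequence periods."""
    if stream[off:off + len(EIEOS)] != EIEOS:
        return False
    after = off + len(EIEOS)
    if after >= len(stream) or stream[after] != TS_HEADER:
        return False                             # expected TS did not follow
    return any(stream[off + p:off + p + len(EIEOS)] == EIEOS
               for p in SUPERSEQ_PERIODS_BYTES)

# A transient that happens to look like an EIEOS, followed by garbage:
noise = EIEOS + bytes(300)
print("transient accepted?", is_real_eieos(noise, 0))
# A genuine Detect-style stream: EIEOS + 7 TSes (header 0xAA), repeating:
ts = bytes([TS_HEADER]) + bytes(15)
good = (EIEOS + ts * 7) * 3
print("real EIEOS accepted?", is_real_eieos(good, 0))
```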

In some cases, an HPI link is capable of operating at multiple speeds, facilitated by the embedded clock. For instance, a slow mode can be defined. In some instances, the slow mode can be used to facilitate initialization of a link. Calibration of the link can involve software-based controllers that provide logic for setting various calibrated characteristics of the link, such as which lanes the link is to use, the configuration of the link, the operational speed of the link, synchronization of the agents and lanes, deskew, target latency, and other potential characteristics. Such software-based tools can write data to Physical layer registers to control various aspects of the Physical layer facilities and logic.

The operational speed of a link can be considerably faster than the effective operating speed of the software-based controllers used in initialization of the link. A slow mode can therefore be used to allow the use of such software-based controllers, for instance during initialization or re-initialization of the link. The slow mode can be applied on the lanes connecting receivers and transmitters, for example when the link is turned on, initialized, or reset, so as to facilitate calibration of the link, among other uses.

In one embodiment, the clock can be embedded in the data, eliminating separate clock lanes. The flits sent over the lanes can be scrambled to facilitate clock recovery. The receiver clock recovery unit, as one example, can deliver sampling clocks to the receiver (i.e., the receiver recovers the clock from the data and uses it to sample the incoming data). Receivers in some implementations continuously adapt to the incoming bit stream. By embedding the clock, pinout can be reduced. However, embedding the clock in the in-band data can alter the manner in which an in-band reset is approached. In one embodiment, a blocking link state (BLS) can be utilized after initialization, and electrical ordered set supersequences can be utilized during initialization to facilitate a reset, among other considerations. The embedded clock can be common between the devices on a link, and the common operational clock can be set during calibration and configuration of the link. For instance, HPI links can reference a common clock with drift buffers. Such an implementation can realize lower latency than elastic buffers used with non-common reference clocks, among other potential advantages. Further, the reference clock distribution segments can be matched to within specified limits.

An HPI link, as noted above, can operate at multiple speeds, including a slow mode for initialization, default power-up, and so on. The operational (or “fast”) speed or mode of each device can be statically set by BIOS. The common clock on the link can be configured based on the respective operational speeds of the devices on either side of the link. For instance, the link speed can be based on the slower of the two device operational speeds, among other examples. Any operational speed change may be accompanied by a warm or cold reset.

In some cases, when the power is turned on the link comes up in slow mode, with a transfer rate of, for example, 100 MT/s. Software then sets up both sides of the link for the operational speed and begins the initialization. In other instances, a sideband mechanism can be used to set up a link, for instance where slow mode is unavailable, among other examples.

In one embodiment, a slow mode initialization phase can use the same encoding, scrambling, training sequences, states, and other features as the operational speed, but potentially with fewer features (e.g., no adaptation or electrical parameter setup). The slow mode operation phase can also potentially use the same encoding, scrambling, and so on (although other implementations may not), but may have fewer states and features than operational speed (e.g., no low power states).

Further, slow mode can be implemented using the native phase lock loop (PLL) clock frequency of the device. For instance, HPI can support an emulated slow mode without changing the PLL clock frequency. While some designs may use separate PLLs for slow and fast speeds, in some HPI implementations an emulated slow mode can be achieved by allowing the PLL clock to continue running at its fast speed during slow mode. For example, a transmitter can emulate a slower clock signal by repeating bits multiple times, so as to emulate a slow high clock signal and then a slow low clock signal. The receiver can then sample the received signal to locate the edges emulated by the repeating bits and thereby identify the bit. Ports sharing a PLL can coexist at slow and fast speeds in such an implementation.
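
The bit-repetition idea behind emulated slow mode can be shown with a small encode/decode sketch. This is not HPI's actual circuitry: the transmitter repeats each bit an integer number of times at the fast clock, and here the receiver simply groups and majority-votes the fast-rate samples to recover the slow-rate bit; the function names and the multiple of 32 are assumptions.

```python
def emulate_slow_tx(bits, multiple: int):
    """Transmitter side: repeat each bit `multiple` times at the fast rate,
    so the waveform resembles a slow high / slow low clocked signal."""
    out = []
    for b in bits:
        out.extend([b] * multiple)
    return out

def recover_slow_rx(samples, multiple: int):
    """Receiver side: sample at the fast rate and majority-vote each group of
    `multiple` samples to identify the emulated slow-rate bit."""
    bits = []
    for i in range(0, len(samples) - multiple + 1, multiple):
        group = samples[i:i + multiple]
        bits.append(1 if sum(group) * 2 >= multiple else 0)
    return bits

data = [1, 0, 1, 1, 0]
wire = emulate_slow_tx(data, multiple=32)          # e.g. fast:slow ratio of 32
wire[5] ^= 1                                       # a single fast-rate sample error
print(recover_slow_rx(wire, multiple=32) == data)  # True: bit still recovered
```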

Two devices on a link can share a common slow mode speed even when their operational speeds differ. For example, the common slow speed can be established during a discovery phase. An emulation multiple can be set as the ratio (integer or non-integer) of fast speed to slow speed, and the differing fast speeds can thereby be converted to work with the same slow speed. Accordingly, two device agents that support a common frequency can be hot attached irrespective of the speed at which the host port is running. Software discovery can then use the slow mode link to identify and set up the most optimal link operational speeds. Where the multiple is an integer ratio of fast speed to slow speed, different fast speeds can be used with the same slow speed, which may be used during the discovery phase of a hot attach.

Some implementations of HPI support adaptation of lanes on a link. The Physical layer can support both receiver adaptation and transmitter, or sender, adaptation. With receiver adaptation, the transmitter on a lane sends sample data to the receiver, which the receiver logic can process to identify shortcomings in the electrical characteristics of the lane and the quality of the signal. The receiver can then adjust the calibration of the lane to optimize it, based on the analysis of the received sample data. In the case of transmitter adaptation, the receiver again receives sample data and develops metrics describing the quality of the lane, but in this case communicates the metrics to the transmitter (e.g., using a backchannel, such as a software, hardware, embedded, sideband, or other channel) so that the transmitter can adjust the lane based on the feedback. Receiver adaptation can be initiated using the Polling supersequence sent from the remote transmitter. A similar procedure can be used to adapt transmitter parameters: both agents can enter the Loopback Pattern state as masters and transmit a specified pattern, and each receiver can measure a metric (e.g., BER) for the particular transmitter setting at the remote agent. Both agents can then enter the Loopback Marker state to reset and exchange metrics, from which the next transmitter setting can be identified. Eventually the optimal transmitter setting can be identified and saved for subsequent use.
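
The loopback-based adaptation loop can be sketched as a parameter sweep: for each candidate transmitter setting, a pattern is looped back, a bit error ratio (BER) is measured at the receiver, and the setting with the best metric is kept. The toy channel model, the candidate setting range, and the assumed ideal setting are invented purely for illustration.

```python
import random

def measure_ber(tx_setting: int, trials: int = 2000) -> float:
    """Toy loopback 'channel': error probability is lowest near an ideal setting
    (assumed to be 5 here); a real receiver would count errors in the echoed pattern."""
    p_err = 0.001 + 0.01 * abs(tx_setting - 5)
    errors = sum(1 for _ in range(trials) if random.random() < p_err)
    return errors / trials

def adapt_transmitter(candidate_settings):
    """Sweep transmitter settings, measure a BER metric for each via loopback,
    and return the best setting (to be saved for subsequent use)."""
    results = {setting: measure_ber(setting) for setting in candidate_settings}
    # In an HPI-like flow the agents would exchange these metrics at a
    # Loopback Marker before choosing the next setting to try.
    best = min(results, key=results.get)
    return best, results[best]

random.seed(0)
best, ber = adapt_transmitter(range(0, 11))
print(f"best transmitter setting: {best} (measured BER ~ {ber:.4f})")
```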

Since both devices on a link can run off the same reference clock (e.g., ref clk), elasticity buffers can be omitted (any elastic buffers may be bypassed or used as drift buffers with the lowest possible latency). However, phase adjustment or drift buffers can be utilized on each lane to transfer the receiver bit stream from the remote clock domain to the local clock domain. The latency of the drift buffers should be sufficient to handle the sum of drift from all sources in the electrical specification (e.g., temperature, voltage, residual SSC introduced by reference clock routing mismatches, and so on), but as small as possible to reduce transport delay. If the drift buffer is too shallow, drift errors can result and manifest as a series of CRC errors. Consequently, in some implementations, a drift alarm can be provided to initiate a Physical layer reset before an actual drift error occurs, among other examples.

Some implementations of HPI support the two sides running at the same nominal reference clock frequency but with a ppm difference. In this case, frequency adjustment (or elasticity) buffers may be needed, and they can be readjusted during an extended BLS window or during special sequences that occur periodically, among other examples.

The operation of the HPI PHY logical layer can be independent of the underlying transmission media, provided the latency does not result in latency fixing errors or timeouts at the link layer.

External interfaces can be provided in HPI to assist in management of the Physical layer. For instance, external signals (from pins, fuses, and other layers), timers, and control and status registers can be provided. The input signals can change at any time relative to the PHY state, but they are to be observed by the Physical layer at specific points within a respective state. For example, a changing alignment signal, as described below, may be received but have no effect once the link has entered a transmitting state. Similarly, command register values can be observed by Physical layer entities only at specific points in time. For instance, Physical layer logic can take a snapshot of the value and use it in subsequent operations. Consequently, in some implementations, updates to command registers may be associated with a limited subset of specific periods (e.g., while in a transmitting link state, or when holding in Reset calibration, or in a slow mode transmitting link state) to avoid anomalous behavior.

Since status values track hardware changes, the values read can depend on when they are read. Some status values, however, such as link map, latency, and speed, may not change after initialization; only a re-initialization (or an exit from a low power link state (LPLS)) may cause them to change. For example, a hard lane failure in a TLS may not cause reconfiguration of the link until a re-initialization is triggered, among other examples.

Interface signals can include signals that are not external pins but that nonetheless affect Physical layer behavior. Such interface signals can include, as examples, encoding and timing signals. Interface signals can be design specific, and these signals may be inputs or outputs. Some interface signals, such as semaphores and those prefixed EO, among other examples, may be active once per assertion edge; they can then be deasserted and reasserted to take effect again, among other examples. For instance, Table 1 includes an example listing of such functions:

TABLE 1
Function
  • input pin reset (aka warm reset)
  • input pin reset (aka cold reset)
  • input in-band reset pulse; causes semaphore to be set; semaphore is cleared when in-band reset occurs
  • input enables low power states
  • input loopback parameters; applied for loopback pattern
  • input to enter PWLTS
  • input to exit PWLTS
  • input to enter LPLS
  • input to exit LPLS
  • input from idle exit detect (aka squelch break)
  • input enables use of CPhyInitBegin
  • input from local or planetary alignment for transmitter to exit initialization
  • output when remote agent NAKs LPLS request
  • output when agent enters LPLS
  • output to link layer to force non-retryable flits
  • output to link layer to force NULL flits
  • output when transmitter is in partial width link transmitting state (PWLTS)
  • output when receiver is in PWLTS

CSR timer default values can be provided in pairs: one for slow mode and one for operational speed. In some instances, a value of 0 disables the timer (i.e., a timeout never occurs). Timers can include those shown in Table 2 below. Primary timers are used to time expected actions in a state. Secondary timers are used for aborting initializations that are not progressing, or for making forward state transitions at precise times in an automated test equipment (ATE) mode. In some cases, secondary timers can be much larger than the primary timers for a given state. Exponential timer sets are suffixed with exp, and the timer value is 2 raised to the power of the field value; linear timers take the field value directly. Either timer can use a different granularity. Additionally, some timers in the power management section can be grouped into a set called a timing profile, which can be associated with a common timing diagram.

TABLE 2
Timers
Tpriexp set:
  • Reset residency for driving EIEOS
  • Receiver calibration minimum time; for stagger transmitter off
  • Transmitter calibration minimum time; for stagger on
Tsecexp set:
  • Timed receiver calibration
  • Timed transmitter calibration
  • Squelch exit detect/debounce
  • DetectAtRx overhang for handshake
  • Adapt + bitlock/bytelock/deskew
  • Configure link widths
  • Wait for planetary aligned clean flit boundary
  • Re-bytelock/deskew
Tdebugexp set:
  • For hot plug; non-0 value to debug hangs
TBLSentry set:
  • BLS entry delay – fine
  • BLS entry delay – coarse
TBLS set:
  • BLS duration for transmitter
  • BLS duration for receiver
  • BLS clean flit interval for transmitter
  • BLS clean flit interval for receiver
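
The exponential/linear timer encoding described above can be computed as in the following sketch; the granularity value and the example register field values are placeholders, not defaults from the specification.

```python
def timer_value(field: int, exponential: bool, granularity_ui: int = 1) -> int:
    """Return a timeout in UIs from a CSR timer field value.

    Exponential timer sets (suffixed 'exp') use 2**field; linear sets take the
    field value directly. A field value of 0 is treated as 'timer disabled'."""
    if field == 0:
        return 0                     # timeout never occurs
    ticks = (1 << field) if exponential else field
    return ticks * granularity_ui

# Example pair: slow-mode vs. operational-speed defaults for the same timer.
slow_default, fast_default = 12, 20          # hypothetical field values
print("slow-mode timeout:", timer_value(slow_default, exponential=True), "UI")
print("operational timeout:", timer_value(fast_default, exponential=True), "UI")
print("linear example:", timer_value(500, exponential=False), "UI")
```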

Command and control registers can also be provided. Control registers can be late action and may be read or written by software in some instances. Late-action values can take effect in Reset (e.g., they pass through from a software-facing stage to a hardware-facing stage). Control semaphores (prefixed CP) are RW1S and can be cleared by hardware. Control registers may be utilized to perform any of the items described herein, and they may be modifiable by and accessible to hardware, software, firmware, or a combination thereof.

Status registers can be provided to track hardware changes. They are written and used by hardware and can be read-only, although debug software may also be able to write to them. Such registers may not affect interoperability and can typically be complemented with many private status registers. Status semaphores (prefixed SP) can be mandated, since they can be cleared by software to undo the actions that set the status. Default values, meaning the initial (on reset) values, can be provided as a subset of these status bits. On an initialization abort, this register can be copied into a storage structure.

Tool box registers can also be provided. For instance, testability tool-box registers in the Physical layer can provide pattern generation, pattern checking, and loopback control mechanisms. These registers can be used in conjunction with electrical parameters to determine margins; for example, interconnect built-in test may utilize this tool box for margining. For transmitter adaptation, these registers can be used in conjunction with the registers described in the preceding sections, among other examples.

In some implementations, HPI supports Reliability, Availability, and Serviceability (RAS) capabilities utilizing the Physical layer. In one embodiment, HPI supports hot plug and remove with one or more layers, which may include software. Hot remove can include quiescing the link and clearing the initialization begin state/signal for the agent to be removed. The remote agent (i.e., the agent that is not being removed, such as the host agent) can be set to slow speed and its initialization signal can also be cleared. An in-band reset (e.g., through BLS) can cause both agents to wait in a reset state, such as a Calibrate Reset State (CRS). The agent to be removed can then be removed (or it can be held in targeted pin reset or powered down), among other example events and features. Indeed, some of the above events may be omitted and other events may be added.

Hot add can also be supported. Initialization speed can default to slow, and an initialization signal can be set on the agent to be added. Software can set the speed to slow, which may include clearing the initialization signal, and the link can come up in slow mode. In some cases, a remote PLL may not be locked at this point. The operational speed can be set on both agents and adaptation enabled (if not done previously). The initialization start indicator can then be cleared on both agents, and an in-band BLS reset can cause both agents to wait in CRS. Software can assert a warm reset (e.g., a targeted or self-reset) of the agent being added, which may cause a PLL to relock. Software may also set the initialization start signal by any known logic and further set it on the remote agent (thus advancing it to a Receiver Detect State (RDS)). Software can de-assert the warm reset of the adding agent (thus advancing it to RDS). The link can then initialize at operational speed to a Transmitting Link State (or to Loopback if the adaptation signal is set) and thereafter run at full operational speed. Again, some of the above events may be omitted and other events may be added.

Data lane failure recovery support can also be provided. In one embodiment, a link in HPI can be resilient against hard errors on a single lane by configuring itself to less than full width (e.g., less than half the full width), thereby excluding the faulty lane. As an example, the configuration can be done by the link state machine, and unused lanes can be turned off in the configuration state. As a result, the flit may be sent across at a narrower width, among other examples.
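
A simplified model of dropping a faulty lane and reconfiguring to a smaller width might look like this; the supported-width set and the function name are assumptions, not values from the specification.

```python
SUPPORTED_WIDTHS = (20, 12, 8, 4)      # assumed widths the link could configure

def reconfigure_after_lane_failure(good_lanes, faulty_lane):
    """Exclude the faulty lane and pick the largest supported width that the
    remaining lanes can carry; unused lanes would be turned off in the
    configuration state and flits then sent at the narrower width."""
    remaining = [lane for lane in good_lanes if lane != faulty_lane]
    for width in SUPPORTED_WIDTHS:
        if width <= len(remaining):
            return sorted(remaining)[:width], width
    raise RuntimeError("no supported width fits the remaining lanes")

lanes, width = reconfigure_after_lane_failure(list(range(20)), faulty_lane=7)
print(f"link reconfigured to width {width} using lanes {lanes}")
```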

In some implementations of HPI, lane reversal can be supported on some links. Lane reversal can refer, for instance, to lanes 0/1/2 . . . of the transmitter being connected to lanes n/n-1/n-2 . . . of the receiver (where, e.g., n may equal 19 or 7, among other values). Lane reversal can be detected at the receiver as identified in a field of a TS header. The receiver can handle the lane reversal by starting in the Polling state and using physical lanes n . . . 0 for logical lanes 0 . . . n. Hence, references to a lane may refer to a logical lane number. Therefore, board designers may more efficiently lay out the physical or electrical design, and HPI can work with virtual lane assignments, as described herein. Further, in one embodiment, polarity may be inverted (i.e., a differential transmitter +/- is connected to a receiver -/+). Polarity can likewise be detected at the receiver from one or more TS header fields and handled, in one embodiment, in the Polling state.
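
Lane reversal and polarity inversion, once detected from the TS header fields, amount to a remapping and inversion step at the receiver. The following sketch shows one hypothetical way to apply them; the data layout and helper names are illustrative only.

```python
def remap_lanes(symbols_by_physical_lane, reversed_lanes: bool):
    """Map physical lanes to logical lanes.

    With lane reversal, physical lanes n..0 carry logical lanes 0..n, so the
    receiver simply reverses the per-lane ordering."""
    return list(reversed(symbols_by_physical_lane)) if reversed_lanes \
        else list(symbols_by_physical_lane)

def fix_polarity(bits, inverted: bool):
    """Undo polarity inversion (differential +/- wired to -/+) on one lane."""
    return [b ^ 1 for b in bits] if inverted else list(bits)

# Example: a 4-lane capture where the board reverses the lane order and
# logical lane 0 also arrives with inverted polarity.
physical = [[1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1]]   # physical lanes 0..3
logical = remap_lanes(physical, reversed_lanes=True)
logical[0] = fix_polarity(logical[0], inverted=True)
print(logical)
```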

Referring to FIG. 10, an embodiment of a block diagram for a computing system including a multicore processor is depicted. Processor 1000 includes any processor or processing device, such as a microprocessor, an embedded processor, a digital signal processor (DSP), a network processor, a handheld processor, an application processor, a co-processor, a system on a chip (SOC), or other device to execute code. Processor 1000, in one embodiment, includes at least two cores, core 1001 and 1002, which may include asymmetric cores or symmetric cores (the illustrated embodiment). However, processor 1000 may include any number of processing elements, which may be symmetric or asymmetric.
