Samuel Hammond Duncan - Arlington MA Glenn Arthur Herdeg - Leominster MA Ricky Charles Hetherington - Westboro MA Craig Durand Keefer - Nashua NH Maurice Bennet Steinman - Marlboro MA Paul Michael Guglielmi - Westboro MA
Assignee:
Compaq Computer Corporation - Houston TX
International Classification:
G06F 12/12
US Classification:
711/155, 711/144, 711/145, 711/133, 710/129, 710/126
Abstract:
A multiprocessor having improved bus efficiency is shown to include a number of processing units and a memory coupled to a system bus. Also coupled to the system bus is at least one I/O bridge system. A method for improving partial cache line writes from I/O devices to the central processing units incorporates a cache coherency protocol and an enhanced invalidation scheme to ensure atomicity, which minimizes bus utilization. In addition, a method for allowing peer-to-peer communication between I/O devices coupled to the system bus via different I/O bridges includes a command and address space configuration that allows for communication without the involvement of any central processing device. Interrupt performance is improved by storing an interrupt data structure in main memory. The I/O bridges maintain the data structure, and when the CPU is available the interrupts can be accessed by a fast memory read, thereby reducing the number of I/O reads required for interrupt handling.
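The interrupt mechanism described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the class name, record format, and `post`/`drain` methods are all hypothetical, and a Python deque stands in for the in-memory data structure maintained by the I/O bridges.

```python
from collections import deque

class InterruptTable:
    """Illustrative sketch: I/O bridges post interrupt records into a
    structure held in main memory; the CPU later drains them with
    ordinary (fast) memory reads instead of slow I/O-space reads."""

    def __init__(self):
        self._pending = deque()  # stands in for the in-memory structure

    def post(self, source_id, vector):
        # Bridge side: append an interrupt record (a memory write).
        self._pending.append((source_id, vector))

    def drain(self):
        # CPU side: collect all pending interrupts in one pass of
        # memory reads, then clear the structure.
        records = list(self._pending)
        self._pending.clear()
        return records

table = InterruptTable()
table.post(source_id=2, vector=0x41)  # e.g. a device behind bridge 2
table.post(source_id=5, vector=0x30)
print(table.drain())  # → [(2, 65), (5, 48)]
```

The point of the scheme is visible in `drain`: the CPU handles a batch of interrupts without issuing a single I/O-space read.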
Apparatus For Determining Memory Bank Availability In A Computer System
David M. Fenwick - Nashua NH Denis Foley - Shrewsbury MA David Hartwell - Bolton MA Ricky C. Hetherington - Westboro MA Dale R. Keck - Shrewsbury MA Elbert Bloom - Marlboro MA
Assignee:
Compaq Computer Corporation - Houston TX
International Classification:
G06F 12/02
US Classification:
710/17, 711/5
Abstract:
In accordance with the present invention, an apparatus includes a system bus having memory bank available signals. Coupled to the system bus are at least two memory modules, each having at least one memory bank. Each memory module includes a mechanism for associating each memory bank with one of the memory bank available signals. Further, each memory module includes logic for determining the availability status of each memory bank and for driving the associated memory bank available signal with values reflecting that status. Additionally, at least two commander modules are coupled to the system bus and include logic, responsive to the memory bank available signals, for preventing the commander module from gaining control of the system bus when the commander is attempting to access a memory bank determined to be unavailable. With such an arrangement, only commander modules seeking to access available memory banks are allowed to gain control of the system bus. This avoids stalling the system bus and improves system performance by allowing all initiated transactions to complete as quickly as possible.
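The arbitration rule above reduces to a simple predicate: a commander may request the bus only if its target bank's available signal is asserted. The sketch below is illustrative only; the function names and the dict representation of the per-bank signals are assumptions, not the patent's logic design.

```python
def bank_available(signals, bank_id):
    """signals: hypothetical map of bank id -> True (free) / False (busy),
    standing in for the per-bank available signals on the system bus."""
    return signals.get(bank_id, False)

def may_request_bus(target_bank, signals):
    # A commander is held off the system bus while its target bank is
    # busy, so the bus is never stalled waiting on an unavailable bank.
    return bank_available(signals, target_bank)

signals = {0: True, 1: False}        # bank 1 is mid-cycle (busy)
assert may_request_bus(0, signals) is True
assert may_request_bus(1, signals) is False
```

Because the check happens before arbitration rather than after, every transaction that does win the bus can complete without stalling it.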
Apparatus And Method For Distributed Non-Blocking Multi-Level Cache
Sharad Mehrotra - Cupertino CA Ricky C. Hetherington - Pleasanton CA
Assignee:
Sun Microsystems, Inc. - Palo Alto CA
International Classification:
G06F 12/08
US Classification:
711/118, 711/119, 711/122
Abstract:
A multi-level cache and method for operation therefor includes a first non-blocking cache receiving access requests from a device in a processor, and a first miss queue storing entries corresponding to access requests not serviced by the first non-blocking cache. A second non-blocking cache is provided for receiving access requests from the first miss queue, and a second miss queue is provided for storing entries corresponding to access requests not serviced by the second non-blocking cache. Other queuing structures, such as a victim queue and a write queue, are provided depending on the particular structure of the cache level within the multi-level cache hierarchy.
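The chained miss queues can be sketched as follows. This is a toy model under stated assumptions: each level is reduced to a set of cached addresses plus a deque for its miss queue, and the victim and write queues mentioned in the abstract are omitted.

```python
from collections import deque

class NonBlockingLevel:
    """Illustrative sketch of one non-blocking cache level: a miss is
    recorded in the miss queue (which feeds the next level) instead of
    stalling, so new requests keep arriving."""

    def __init__(self, contents):
        self.contents = set(contents)   # addresses currently cached
        self.miss_queue = deque()       # entries for unserviced accesses

    def access(self, addr):
        if addr in self.contents:
            return "hit"
        self.miss_queue.append(addr)    # record the miss, don't block
        return "miss"

l1 = NonBlockingLevel({0x100})
l2 = NonBlockingLevel({0x200})
for addr in (0x100, 0x200, 0x300):
    l1.access(addr)                     # hits return; misses are queued
while l1.miss_queue:                    # L1's miss queue feeds L2
    l2.access(l1.miss_queue.popleft())
print(list(l2.miss_queue))  # → [768]  (only 0x300 missed in both levels)
```

The key property is that `access` returns immediately either way; servicing a miss is decoupled from accepting the next request.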
Mechanism For Reordering Transactions In Computer Systems With Snoop-Based Cache Consistency Protocols
Robert Cypher - Saratoga CA Ricky C. Hetherington - Pleasanton CA Belliappa Kuttanna - Austin TX
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 12/00
US Classification:
711/141, 711/146, 712/216
Abstract:
An apparatus and method for expediting the processing of requests in a multiprocessor shared memory system. In a multiprocessor shared memory system, requests can be processed in any order provided two rules are followed. First, no request that grants access rights to a processor can be processed before an older request that revokes access rights from the processor. Second, all requests that reference the same cache line are processed in the order in which they arrive. In this manner, requests can be processed out-of-order to allow cache-to-cache transfers to be accelerated. In particular, foreign requests that require a processor to provide data can be processed by that processor before older local requests that are awaiting data. In addition, newer local requests can be processed before older local requests. As a result, the apparatus and method described herein may advantageously increase performance in multiprocessor shared memory systems by reducing latencies associated with a cache consistency protocol.
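The two reordering rules stated in the abstract can be expressed as a single predicate that decides whether a newer request may be processed ahead of an older one. The request representation below (dicts with `line`, `kind`, and `cpu` fields) is purely illustrative, not the patent's encoding.

```python
def may_reorder(newer, older):
    """Illustrative check of the two rules from the abstract.
    kind is 'grant' (gives access rights) or 'revoke' (takes them away)."""
    # Rule 2: requests to the same cache line stay in arrival order.
    if newer["line"] == older["line"]:
        return False
    # Rule 1: a request granting rights to a processor may not pass an
    # older request revoking rights from that same processor.
    if (newer["kind"] == "grant" and older["kind"] == "revoke"
            and newer["cpu"] == older["cpu"]):
        return False
    return True

older = {"line": 0x40, "kind": "revoke", "cpu": 1}
newer = {"line": 0x80, "kind": "grant",  "cpu": 1}
assert not may_reorder(newer, older)   # rule 1 blocks the reorder
newer2 = {"line": 0x80, "kind": "grant", "cpu": 2}
assert may_reorder(newer2, older)      # different processor: allowed
```

Everything the predicate permits is fair game for acceleration, which is how a foreign request needing data from a processor can jump ahead of that processor's older local requests.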
Method For Operating A Non-Blocking Hierarchical Cache Throttle
Ricky C. Hetherington - Pleasanton CA Thomas M. Wicki - Palo Alto CA
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 12/08
US Classification:
711/140, 711/122, 711/118
Abstract:
A multi-level cache and method for operating a multi-level cache that generates multiple cache system accesses simultaneously. Each access request includes an address identifying a memory location holding data that is the target of the access. An insertion pointer inserts each access request into an entry in a memory scheduling window. Each entry is marked as valid when it is ready to be applied to a first cache level. A picker picks valid entries from the memory scheduling window by pointing to the picked entry and applying the address therein to the first cache level. Picking occurs in a free-running mode regardless of whether the accesses hit in the first cache level. A second cache level receives accesses that have missed in the first cache level. A resource monitor in the second cache level determines when a predetermined number of resources are committed to servicing those misses; in response, the second cache level generates a stall signal, thereby stalling the picking process.
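The throttle interaction can be sketched as a counter and a threshold. This is a simplified model under assumed names: the class, the fixed `resource_limit`, and the pessimistic "every pick misses" loop are all illustrative stand-ins for the second-level resource monitor and the free-running picker.

```python
class CacheThrottle:
    """Illustrative sketch: the second level's resource monitor asserts
    a stall signal once a predetermined number of outstanding first-level
    misses are in flight, halting the picker."""

    def __init__(self, resource_limit):
        self.resource_limit = resource_limit
        self.outstanding = 0            # second-level resources in use

    def stall(self):
        return self.outstanding >= self.resource_limit

    def miss_arrived(self):
        self.outstanding += 1

    def miss_serviced(self):
        self.outstanding -= 1

t = CacheThrottle(resource_limit=2)
picked = 0
for _ in range(5):          # picker free-runs while no stall is asserted
    if t.stall():
        break
    t.miss_arrived()        # pessimistically assume each pick misses
    picked += 1
print(picked)  # → 2
```

Once the second level services a miss (`miss_serviced`), the stall signal drops and picking resumes, so throughput degrades gracefully instead of blocking on the first miss.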
Multiple-Core Processor With Flexible Cache Directory Scheme
Ricky C. Hetherington - Pleasanton CA, US Bikram Saha - Cupertino CA, US
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 12/08
US Classification:
711/122, 711/119
Abstract:
A multiple-core processor providing a flexible cache directory scheme. In one embodiment, a processor may include a second-level cache including a number of cache banks and a respective number of cache directories corresponding to the cache banks. The processor may further include a number of processor cores configured to access the cache banks, as well as core/bank mapping logic coupled to the second-level cache and the processor cores. Each of the processor cores may include a respective first-level cache. Each of the respective cache directories may be configured to store directory state information associated with portions of respective first-level caches of at least two of the processor cores. If fewer than all of the cache banks are enabled, the core/bank mapping logic may be configured to completely map directory state information associated with each respective first-level cache of enabled processor cores to respective cache directories associated with enabled cache banks.
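The "complete mapping onto enabled banks" behavior can be illustrated with a toy mapping function. The round-robin policy below is an assumption for illustration only; the abstract does not specify how the core/bank mapping logic distributes directory state.

```python
def map_cores_to_banks(enabled_cores, enabled_banks):
    """Illustrative sketch: assign directory state for each enabled
    core's first-level cache to one of the enabled L2 banks, so state
    is completely mapped even when some banks are disabled."""
    if not enabled_banks:
        raise ValueError("at least one cache bank must be enabled")
    mapping = {}
    for i, core in enumerate(sorted(enabled_cores)):
        # Hypothetical round-robin distribution across enabled banks.
        mapping[core] = enabled_banks[i % len(enabled_banks)]
    return mapping

# 8 cores, but only 2 of 4 banks enabled: every core's first-level
# directory state still lands in some enabled bank's directory.
mapping = map_cores_to_banks(range(8), enabled_banks=[0, 2])
assert set(mapping) == set(range(8))        # all cores covered
assert set(mapping.values()) <= {0, 2}      # only enabled banks used
```

The flexibility claimed in the title is exactly this: the same cores remain fully tracked whether four banks or two are enabled, with only the distribution changing.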
Method And Apparatus For Power Throttling In A Multi-Thread Processor
Robert T. Golla - Round Rock TX, US Ricky C. Hetherington - Pleasanton CA, US
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 1/00 G06F 9/30
US Classification:
713/322, 713/340, 712/216
Abstract:
A method and apparatus for controlling power consumption in a processor. In one embodiment, a processor includes a pipeline with logic for fetching, issuing, and executing instructions. The processor also includes a power management unit configured to insert M stalls into the pipeline every N instruction cycles, where M and N are integer values and M is less than N.
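The M-of-N stall schedule is easy to visualize. The sketch below is illustrative: placing the M stalls at the start of each N-cycle window is an arbitrary choice for the example, not something the abstract specifies.

```python
def stall_pattern(m, n, cycles):
    """Illustrative sketch: mark M stall cycles in every window of N
    instruction cycles (M < N), capping pipeline activity and hence
    power. True marks a cycle the power management unit stalls."""
    assert 0 <= m < n
    # Here the M stalls lead each N-cycle window (one possible placement).
    return [(c % n) < m for c in range(cycles)]

pattern = stall_pattern(m=2, n=8, cycles=16)
print(sum(pattern))  # → 4  (2 stalls in each of the two 8-cycle windows)
```

The ratio M/N is the knob: throughput and switching power both scale down by roughly that fraction, without clock or voltage changes.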
Multiple Independent Coherence Planes For Maintaining Coherency
Ricky C. Hetherington - Pleasanton CA, US Stephen E. Phillips - Los Gatos CA, US
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 12/08 G06F 13/00
US Classification:
711/141, 711/147, 711/202, 711/203, 709/216, 709/245
Abstract:
In one embodiment, a node comprises at least one processor core and a plurality of coherence units. The processor core is configured to generate an address to access a memory location. The address maps to a first coherence plane of a plurality of coherence planes. Coherence activity is performed within each coherence plane independent of other coherence planes, and a mapping of the address space to the coherence planes is independent of a physical location of the addressed memory in a distributed system memory. Each coherence unit corresponds to a respective coherence plane and is configured to manage coherency for the node and for the respective coherence plane. The coherence units operate independent of each other, and a first coherence unit corresponding to the first coherence plane is coupled to receive the address if external coherency activity is needed to complete the access to the memory location.
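The address-to-plane mapping can be sketched directly from the abstract's constraint that it depends only on the address, not on where the memory physically resides. The bit selection below (low-order line-index bits, 64-byte granule) is an assumed example, not the patent's mapping.

```python
def coherence_plane(address, num_planes):
    """Illustrative sketch: pick the coherence plane purely from address
    bits, independent of the physical location of the addressed memory
    in the distributed system."""
    line = address >> 6          # assume a 64-byte coherence granule
    return line % num_planes

# Two addresses in the same 64-byte line always share a plane, so one
# coherence unit sees all traffic for that line...
assert coherence_plane(0x1000, 4) == coherence_plane(0x103F, 4)
# ...while traffic spreads across all planes as the line index varies,
# letting the per-plane coherence units run independently in parallel.
assert {coherence_plane(a << 6, 4) for a in range(8)} == {0, 1, 2, 3}
```

Keeping each line confined to one plane is what makes the planes independent: no two coherence units ever need to coordinate over the same line.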