The Command Processor (CP) is the part of the chip that parses incoming packets from the driver and programs the GPU based on those packets. This is a more efficient programming method than MMIO for two reasons: command streams can be stored in buffers and queued up for later processing, and a single command packet can encompass a fairly substantial set of register writes, which reduces bandwidth requirements.
To use the CP, a block of GART memory is allocated for the ring buffer. This buffer is shared by the driver and the CP. The driver writes new packets into the ring, and the CP reads packets out of the ring and processes them, programming the card accordingly. To work properly, the driver and the CP must have a consistent view of the buffer, so each side keeps its own copy of a read pointer and a write pointer. The write pointer tracks where in the ring buffer the driver is writing new packets, and the read pointer tracks where the CP is reading packets for processing. If the CP's read and write pointers are equal, the queue is empty and the CP goes idle. Periodically, the driver updates the CP's copy of the write pointer and the CP updates the driver's copy of the read pointer, so that both sides stay in sync.
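The read/write pointer bookkeeping described above can be sketched as a small ring model. This is only an illustration under stated assumptions: the ring size, the `struct ring` layout, and the function names are hypothetical, the pointers are plain dword indices rather than hardware registers, and the "keep one slot free" rule is one common way to tell a full ring from an empty one, not necessarily what the hardware or any driver does.

```c
#include <stdint.h>

/* Hypothetical ring size in dwords; a power of two so wrapping is a mask. */
#define RING_DWORDS 8u

struct ring {
    uint32_t buf[RING_DWORDS];
    uint32_t rptr; /* next dword the CP will read */
    uint32_t wptr; /* next dword the driver will write */
};

/* Equal pointers mean the queue is empty (the CP would go idle). */
static int ring_empty(const struct ring *r)
{
    return r->rptr == r->wptr;
}

/* Free dwords; one slot stays unused so "full" never looks like "empty". */
static uint32_t ring_free(const struct ring *r)
{
    return (r->rptr - r->wptr - 1u) & (RING_DWORDS - 1u);
}

/* Driver side: append one dword, or return -1 if the ring is full. */
static int ring_write(struct ring *r, uint32_t dw)
{
    if (ring_free(r) == 0)
        return -1;
    r->buf[r->wptr] = dw;
    r->wptr = (r->wptr + 1u) & (RING_DWORDS - 1u);
    return 0;
}

/* CP side: consume one dword, or return -1 if the ring is empty. */
static int ring_read(struct ring *r, uint32_t *dw)
{
    if (ring_empty(r))
        return -1;
    *dw = r->buf[r->rptr];
    r->rptr = (r->rptr + 1u) & (RING_DWORDS - 1u);
    return 0;
}
```

In a real driver the two pointers live in separate places (a register and a memory copy), which is why the periodic synchronization mentioned above is needed; the single struct here collapses that for clarity.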
Three types of packets are primarily used with the CP: type 0, type 2, and type 3. Type 0 packets write data to a number of consecutive registers starting at a particular offset. Type 2 packets are filler packets (NOPs). Type 3 packets are opcode packets that program specific 2D/3D/video tasks.
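To make the three packet types concrete, here is a rough sketch of how their header dwords might be built. The bit positions are an assumption for illustration (type in bits 31:30, count-minus-one in bits 29:16, a dword register index in the low bits for type 0, an opcode in bits 15:8 for type 3); the text above does not specify them, and the helper names are hypothetical, so check the register reference for the specific ASIC before relying on any of this.

```c
#include <stdint.h>

/* ASSUMED header layout, for illustration only:
 *   bits 31:30  packet type
 *   bits 29:16  count, stored as (number of data dwords - 1)
 *   type 0: bits 12:0 hold the register offset in dwords
 *   type 3: bits 15:8 hold the opcode
 */

/* Type 0: write n_regs consecutive registers starting at a byte offset. */
static uint32_t pkt0_header(uint32_t reg_byte_offset, uint32_t n_regs)
{
    return (0u << 30)
         | (((n_regs - 1u) & 0x3FFFu) << 16)
         | ((reg_byte_offset >> 2) & 0x1FFFu);
}

/* Type 2: filler (NOP); carries no payload. */
static uint32_t pkt2_header(void)
{
    return 2u << 30;
}

/* Type 3: opcode packet followed by n_dwords of payload. */
static uint32_t pkt3_header(uint32_t opcode, uint32_t n_dwords)
{
    return (3u << 30)
         | (((n_dwords - 1u) & 0x3FFFu) << 16)
         | ((opcode & 0xFFu) << 8);
}

/* Recover the packet type from a header dword. */
static unsigned pkt_type(uint32_t header)
{
    return (unsigned)(header >> 30);
}
```

A type 0 packet built this way would be followed in the command stream by `n_regs` data dwords, one per register written.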
In addition to the ring buffer, the CP can read from buffers in GART memory called indirect buffers. The driver can use indirect buffers to store 2D/3D/video command streams. When the driver wants to execute one of these buffers, it queues up writes to the indirect buffer control registers (base and size) via type 0 packets in the ring buffer. When the CP encounters these writes, it starts fetching the command stream from the indirect buffer and continues until it reaches the end of that buffer, at which point it goes back to processing the ring buffer.
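The hand-off from ring buffer to indirect buffer and back can be modelled with a toy fetch loop. Everything here is a simplification under assumptions: `HYP_REG_IB_BASE` is a placeholder register offset (with the size register assumed to follow it), the header layout is the same assumed one as in a typical type 0 encoding (type in bits 31:30, count-minus-one in bits 29:16, dword register index in the low bits), GART memory is a plain array indexed in dwords, the ring does not wrap, and "processing" an indirect buffer just logs its dwords.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder offset for the IB base register; size is assumed to follow. */
#define HYP_REG_IB_BASE 0x0100u
#define LOG_DWORDS 16u

struct cp_model {
    const uint32_t *gart;         /* stand-in for GART memory, in dwords */
    uint32_t executed[LOG_DWORDS];/* dwords fetched from indirect buffers */
    size_t n_executed;
};

/* Driver side: queue a type 0 packet writing IB base and size (2 regs). */
static size_t emit_ib(uint32_t *ring, size_t wptr,
                      uint32_t ib_base_dw, uint32_t ib_size_dw)
{
    ring[wptr++] = (1u << 16) | (HYP_REG_IB_BASE >> 2); /* count-1 = 1 */
    ring[wptr++] = ib_base_dw;
    ring[wptr++] = ib_size_dw;
    return wptr;
}

/* CP side: walk the ring; on an IB write, drain the IB, then resume. */
static void cp_run(struct cp_model *cp, const uint32_t *ring,
                   size_t rptr, size_t wptr)
{
    while (rptr < wptr) {
        uint32_t hdr = ring[rptr++];
        uint32_t type = hdr >> 30;
        uint32_t count = (hdr >> 16) & 0x3FFFu;

        if (type == 2u)
            continue; /* filler packet, nothing to do */

        if (type == 0u && (hdr & 0x1FFFu) == (HYP_REG_IB_BASE >> 2)
            && count == 1u) {
            uint32_t base = ring[rptr++];
            uint32_t size = ring[rptr++];
            uint32_t i;
            /* Fetch from the indirect buffer until its end is reached. */
            for (i = 0; i < size && cp->n_executed < LOG_DWORDS; i++)
                cp->executed[cp->n_executed++] = cp->gart[base + i];
            continue; /* back to the ring buffer */
        }

        rptr += count + 1u; /* any other packet: skip its payload */
    }
}
```

The point of the sketch is the control flow: the ring only carries a short pointer to the work, and the bulk of the command stream is fetched from the indirect buffer before ring processing resumes.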