Get to know a radeon part 3
Command Processor
The Command Processor (CP) is the part of the chip that parses incoming packets from the driver and programs the GPU based on those packets. This is a more efficient programming method than MMIO for two reasons: command streams can be stored in buffers and queued up for eventual processing, and a single command packet can encompass a fairly substantial set of register writes, so bandwidth requirements are reduced.
Ring Buffer
To use the CP, a block of GART memory is allocated for the ring buffer. This buffer is shared by the driver and the CP. The driver writes new packets into the ring, and the CP reads packets out of the ring and processes them, programming the card appropriately. In order to work properly, both the driver and the CP must have a consistent view of the buffer. To do this, both sides keep track of both a read pointer and a write pointer. The write pointer tracks where in the ring buffer the driver is writing new packets, and the read pointer tracks where the CP is reading packets for processing. If the CP's read and write pointers are equal, the queue is empty and the CP will go idle. Periodically, the driver updates the CP's copy of the write pointer and the CP updates the driver's copy of the read pointer, so both sides have a consistent view.
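The pointer handling described above can be sketched in C. This is only a minimal model, not the radeon drm code: the names (struct ring, ring_emit, ring_fetch) and the 16-dword size are hypothetical, and for brevity both sides share one struct instead of each keeping its own copy of the pointers. One slot is deliberately left unused so that a full ring never looks like an empty one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_DWORDS 16u /* hypothetical size; must be a power of two */

struct ring {
    uint32_t buf[RING_DWORDS]; /* stands in for the GART allocation */
    uint32_t rptr;             /* next dword the CP will read */
    uint32_t wptr;             /* next dword the driver will write */
};

static void ring_init(struct ring *r)
{
    /* Masking below only wraps correctly for power-of-two sizes. */
    assert((RING_DWORDS & (RING_DWORDS - 1u)) == 0);
    r->rptr = 0;
    r->wptr = 0;
}

/* Equal pointers mean "empty"; the CP would go idle here. */
static bool ring_empty(const struct ring *r)
{
    return r->rptr == r->wptr;
}

/* Free dwords, keeping one slot unused so wptr never catches rptr. */
static uint32_t ring_free(const struct ring *r)
{
    return (r->rptr - r->wptr - 1u) & (RING_DWORDS - 1u);
}

/* Driver side: queue n dwords, or refuse if they do not fit. */
static bool ring_emit(struct ring *r, const uint32_t *pkt, uint32_t n)
{
    if (n > ring_free(r))
        return false;
    for (uint32_t i = 0; i < n; i++) {
        r->buf[r->wptr] = pkt[i];
        r->wptr = (r->wptr + 1u) & (RING_DWORDS - 1u);
    }
    return true;
}

/* CP side: fetch one dword, or report that the queue is empty. */
static bool ring_fetch(struct ring *r, uint32_t *out)
{
    if (ring_empty(r))
        return false;
    *out = r->buf[r->rptr];
    r->rptr = (r->rptr + 1u) & (RING_DWORDS - 1u);
    return true;
}
```

If the driver's copy of the read pointer is stale, ring_free simply reports less space than is really available, so a slow update costs throughput but never overwrites unprocessed packets.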
Packets
There are 3 types of packets that are primarily used with the CP: type 0, type 2, and type 3. Type 0 packets are used to write data to a number of consecutive registers starting at a particular offset. Type 2 packets are filler packets (NOPs). Type 3 packets are opcode packets that are used to program specific 2D/3D/video tasks.
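To make the type 0 case concrete, here is a rough C sketch of building one. The field layout is my assumption of how the header dword is packed (packet type in the top two bits, register count minus one in bits 29:16, register offset divided by 4 in the low 13 bits); check it against the radeon drm headers before relying on it. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packet type in the top two bits of the header dword (assumed layout). */
#define PKT_TYPE0 0x00000000u
#define PKT_TYPE2 0x80000000u /* filler / NOP */
#define PKT_TYPE3 0xC0000000u /* opcode packet */

static uint32_t pkt_type(uint32_t header)
{
    return header >> 30;
}

/* Type 0 header: write nregs consecutive registers from reg_offset. */
static uint32_t pkt0_header(uint32_t reg_offset, uint32_t nregs)
{
    assert(nregs >= 1u && nregs <= 0x4000u);              /* 14-bit count-1 */
    assert((reg_offset & 3u) == 0 && reg_offset < 0x8000u); /* 13-bit index */
    return PKT_TYPE0 | ((nregs - 1u) << 16) | (reg_offset >> 2);
}

/* Decode helpers, mirroring what a parser would do with the header. */
static uint32_t pkt0_count(uint32_t header)
{
    return ((header >> 16) & 0x3FFFu) + 1u;
}

static uint32_t pkt0_reg(uint32_t header)
{
    return (header & 0x1FFFu) << 2;
}

/* Write header plus register values into dst; returns dwords written. */
static size_t pkt0_build(uint32_t *dst, uint32_t reg_offset,
                         const uint32_t *vals, uint32_t nregs)
{
    dst[0] = pkt0_header(reg_offset, nregs);
    for (uint32_t i = 0; i < nregs; i++)
        dst[1 + i] = vals[i];
    return (size_t)nregs + 1u;
}
```

The saving over MMIO is visible in the return value: n register writes cost n + 1 dwords in the command stream, with a single header covering the whole run.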
Indirect Buffers
In addition to the ring buffer, the CP is able to read from buffers in GART memory called Indirect Buffers. The driver can use indirect buffers to store 2D/3D/video command streams. When the driver wants to execute one of these buffers, it queues up writes to the indirect buffer control registers (base and size) via a type 0 packet in the ring buffer. When the CP encounters this packet, it starts fetching the command stream from the indirect buffer and continues until the end of that buffer, at which point it goes back to processing the ring buffer.
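The hand-off between the ring and an indirect buffer can be illustrated with a toy model in C. Everything here is an assumption made for illustration: the register offsets are placeholders, the "GART" is a plain array indexed in dwords rather than a real address space, the header layout is the assumed type 0 packing (count minus one in bits 29:16, register offset divided by 4 in the low bits), and the toy CP only logs the dwords it would process instead of decoding them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder offsets for the IB control registers (base, then size). */
#define IB_BASE_REG 0x0738u
#define IB_SIZE_REG 0x073Cu /* assumed to directly follow IB_BASE_REG */

/* Driver side: one type 0 packet that writes base and size in one go. */
static size_t emit_ib(uint32_t *ring, uint32_t ib_base, uint32_t ib_dwords)
{
    ring[0] = (1u << 16) | (IB_BASE_REG >> 2); /* type 0, two registers */
    ring[1] = ib_base;                         /* dword index into gart[] */
    ring[2] = ib_dwords;
    return 3;
}

/*
 * Toy CP: walks the ring, which may hold type 0 packets only. Ordinary
 * register writes are logged as-is. A write to the IB registers makes the
 * CP log the indirect buffer contents instead, then resume with the ring.
 * Returns the number of dwords logged.
 */
static size_t cp_run(const uint32_t *ring, size_t ring_dwords,
                     const uint32_t *gart, uint32_t *log, size_t cap)
{
    size_t used = 0, i = 0;

    while (i < ring_dwords) {
        uint32_t hdr = ring[i++];
        uint32_t count = ((hdr >> 16) & 0x3FFFu) + 1u;
        uint32_t reg = (hdr & 0x1FFFu) << 2;

        assert((hdr >> 30) == 0 && i + count <= ring_dwords);

        if (reg == IB_BASE_REG && count == 2u) {
            uint32_t base = ring[i], size = ring[i + 1];
            for (uint32_t k = 0; k < size && used < cap; k++)
                log[used++] = gart[base + k]; /* fetch from the IB */
        } else {
            for (uint32_t k = 0; k < count && used < cap; k++)
                log[used++] = ring[i + k];    /* plain register data */
        }
        i += count; /* back to the ring, just past this packet */
    }
    return used;
}
```

The ring itself only carries three dwords per indirect buffer regardless of how long the buffered command stream is, which is what makes this useful for large 2D/3D/video streams.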
January 30th, 2008 at 1:11 pm
Very interesting. Thanks!
spell check: both sizes keep -> both sides keep
January 30th, 2008 at 1:22 pm
Very nice. What's the mechanism ensuring that the read and write pointers do not step on each other's shoes?
January 30th, 2008 at 1:40 pm
Both the driver and the CP keep their own copies of the read and write pointers.
January 30th, 2008 at 3:20 pm
I’ve always wondered why the ring buffer design is used instead of a normal buffer. Also, does it ever happen that the buffer fills up and you overwrite commands that haven’t been processed?
January 30th, 2008 at 3:23 pm
You've got to keep track in the driver and make sure you don't fill the ring buffer completely. If you hit the end of the buffer, the read and write pointers would be equal, which is assumed by the CP to be an empty queue, so it will idle.
January 30th, 2008 at 3:24 pm
That was my question
But then I thought about it a bit, and you can’t do that. Because even if the CP is slow to update your copy of the read pointer, you have a pessimistic guess of the remaining available space so you’ll just refrain from filling the buffer, even if you could.
January 31st, 2008 at 4:15 pm
Could you please, when talking about some radeon part (CP, MMIO, etc.), point to specific functions/parts of driver code?
January 31st, 2008 at 5:00 pm
There are 3 main parts to a DRI-enabled X driver: the ddx (handles modesetting, 2D accel, Xv), the mesa 3D driver (converts GL into card-specific commands), and the drm (kernel module that arbitrates access to the hw). The ddx and 3D driver send command buffers to the drm, which validates the buffers and submits them to the hw. In this case, the driver side of the CP handling (as referenced by the above post) is done in the radeon drm:
http://cgit.freedesktop.org/mesa/drm/