How the Real3D configuration registers (0x9C) work


Post by gm_matthew » Fri Aug 11, 2023 4:33 am

By studying the Real3D library files in Ghidra, I've figured out what the Real3D configuration registers actually do.

Code:
  local_118 = 0;
  local_14c = *(int *)(g_API_Data + 0x518) + *(int *)(g_API_Data + 0x51c) >> 2;  // _pingpong_size + _update_size
  local_11c = *(int *)(g_API_Data + 0x518) >> 2;                                 // _pingpong_size
  if ((*(uint *)(g_API_Data + 0x7e0) >> 1 & 1) != 0) {
    local_118 = 4;
    local_14c = local_14c + 4;
    local_11c = local_11c + 4;
  }
  if ((*(uint *)(g_API_Data + 0x7e0) >> 1 & 1) == 0) {
    local_1c = local_11c;    // 0x9c000000
    local_18 = local_118;    // 0x9c000004
    local_14 = local_14c;    // 0x9c000008
    _PRO_IO(&local_e0,0x31,local_118 << 2 | 0x8c000000,0xffffffff,0);
    _PRO_IO(local_110,0xc,local_118 * 4 + 0xc4U | 0x8c000000,0xffffffff,0);
    _PRO_IO(local_13c,8,0x8c0000f4,0xffffffff,0);
    _PRO_IO(&local_e0,0x31,local_14c << 2 | 0x8c000000,0xffffffff,0);
    _PRO_IO(local_110,0xc,local_14c * 4 + 0xc4U | 0x8c000000,0xffffffff,0);
    _PRO_IO(local_13c,8,(local_14c + 0x3d) * 4 | 0x8c000000,0xffffffff,0);
    _PRO_Flush();
    _PRO_IO(&local_1c,3,0x9c000000,0xffffffff,0);
    _PRO_Flush();
  }

(g_API_Data + 0x518) points to _pingpong_size, (g_API_Data + 0x51c) points to _update_size, and (g_API_Data + 0x7e0) points to _offline_mode.

0x9C000000 holds the size of the ping pong buffer (high culling RAM) in 32-bit words, and 0x9C000008 holds the combined size of the ping pong buffer and the update buffer, also in 32-bit words. Notably, all games leave a gap at the start of low culling RAM twice as large as that combined size, which suggests that both buffers are double-buffered. For example, Scud Race sets 0x9C000008 to 0x18000 words (0x60000 bytes) and never writes to low culling RAM below 0x8C0C0000, which is 2 x 0x60000 = 0xC0000 bytes above the base at 0x8C000000. The update buffer probably stores updated polygon and texture data while the Real3D Pro-1000 is busy rendering; polygon RAM and texture RAM (perhaps even low culling RAM?) would then be updated once the frame has finished rendering.
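
To make the arithmetic above concrete, here is a small sketch that turns the word count written to 0x9C000008 into the first low culling RAM address a game would be expected to use. The function names are made up for illustration, and the "gap is exactly twice the combined size" rule is the observation from this post, not something confirmed by documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Base of low culling RAM, taken from the addresses quoted in this post. */
#define LOW_CULLING_RAM_BASE 0x8C000000u

/* The 0x9C registers hold sizes in 32-bit words; convert to bytes. */
static uint32_t words_to_bytes(uint32_t words)
{
    assert(words <= UINT32_MAX / 4u);  /* guard against overflow */
    return words * 4u;
}

/* Assumption (observed, not documented): the reserved gap is twice the
 * combined ping pong + update size because both are double-buffered. */
static uint32_t reserved_gap_bytes(uint32_t combined_words)
{
    return 2u * words_to_bytes(combined_words);
}

/* Hypothetical helper: lowest address a game should write in low culling RAM. */
static uint32_t first_usable_low_culling_address(uint32_t combined_words)
{
    return LOW_CULLING_RAM_BASE + reserved_gap_bytes(combined_words);
}
```

Feeding in the Scud Race value 0x18000 gives a 0xC0000-byte gap, which lines up with the lowest address the game was seen writing to.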

0x9C000004 specifies a bunch of flags, but the only one that varies at all across games is _offline_mode; the PC does not send data to the Pro-1000 in offline mode. All games except for Sega Rally 2 and Star Wars Trilogy Arcade have this flag enabled, so I'm guessing it only affects receiving data from a PC via SCSI and does not affect data sent internally.

Well that's another little mystery solved :)

Re: How the Real3D configuration registers (0x9C) work

Post by Bart » Sat Aug 12, 2023 9:09 pm

Awesome work. What do you think the update buffer is wired up to?

If low culling RAM contains *both* parts of the ping pong RAM (high culling RAM), I take it then that the actual memory is mapped to what we call low culling RAM, and the high culling RAM region (ping pong buffer) is simply dynamically remapped to point to one or the other part of this reserved region, as defined by the configuration registers.

But update RAM seems a bit ambiguous. Texture memory is written via a FIFO. Polygon RAM looks more like an actual RAM region. Do you think data is copied out of polygon RAM into the update buffer? Is it possible the update buffer is only to implement the texture FIFO?

Re: How the Real3D configuration registers (0x9C) work

Post by gm_matthew » Sun Aug 13, 2023 4:41 am

Bart wrote: Awesome work. What do you think the update buffer is wired up to?

If low culling RAM contains *both* parts of the ping pong RAM (high culling RAM), I take it then that the actual memory is mapped to what we call low culling RAM, and the high culling RAM region (ping pong buffer) is simply dynamically remapped to point to one or the other part of this reserved region, as defined by the configuration registers.

But update RAM seems a bit ambiguous. Texture memory is written via a FIFO. Polygon RAM looks more like an actual RAM region. Do you think data is copied out of polygon RAM into the update buffer? Is it possible the update buffer is only to implement the texture FIFO?

Yes, the 4MB culling RAM (2MB on step 1.x) region includes both the "high culling" ping pong RAM which is double buffered and the low culling RAM which is not.

The issue is what happens when a memory region is updated while the video board is busy rendering. Games will often try to update a model in polygon RAM that is actively being used, which would be a race condition on real hardware if polygon RAM were updated immediately when the DMA transfer occurs. This is why the update buffer makes sense: it provides a place to store polygon RAM and texture RAM updates until the video board has finished rendering. In Supermodel we don't have to worry about this, as we effectively double-buffer all of the video memory regions if GPU multithreading is enabled (e.g. we have both polyRAM and polyRAMRO).
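
As a rough illustration of the idea, the sketch below queues writes that arrive while a frame is being rendered and applies them only when the frame ends. This is a toy model of the theory in this post, not how the Pro-1000 or Supermodel is actually implemented; every name and size here (VideoBoard, board_write, POLY_RAM_WORDS and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POLY_RAM_WORDS 16u  /* illustrative size only, not the real polygon RAM size */
#define MAX_PENDING    8u   /* illustrative capacity of the hypothetical update buffer */

typedef struct {
    uint32_t offset;  /* word offset into polygon RAM */
    uint32_t value;
} PendingWrite;

typedef struct {
    uint32_t     poly_ram[POLY_RAM_WORDS];
    PendingWrite pending[MAX_PENDING];  /* stands in for the update buffer */
    size_t       pending_count;
    int          rendering;             /* nonzero while a frame is being drawn */
} VideoBoard;

static void board_init(VideoBoard *b)
{
    memset(b, 0, sizeof *b);
}

/* Write one word. Applied immediately when idle, queued while rendering.
 * Returns 0 on success, -1 if the update buffer is full. */
static int board_write(VideoBoard *b, uint32_t offset, uint32_t value)
{
    assert(offset < POLY_RAM_WORDS);
    if (!b->rendering) {
        b->poly_ram[offset] = value;
        return 0;
    }
    if (b->pending_count == MAX_PENDING)
        return -1;
    b->pending[b->pending_count].offset = offset;
    b->pending[b->pending_count].value  = value;
    b->pending_count++;
    return 0;
}

static void board_begin_frame(VideoBoard *b)
{
    b->rendering = 1;
}

/* Rendering is done: replay queued writes in arrival order. */
static void board_end_frame(VideoBoard *b)
{
    for (size_t i = 0; i < b->pending_count; i++)
        b->poly_ram[b->pending[i].offset] = b->pending[i].value;
    b->pending_count = 0;
    b->rendering = 0;
}
```

The renderer only ever sees poly_ram, so a model that is updated mid-frame keeps its old contents until board_end_frame runs. Keeping a second read-only copy, as Supermodel does with polyRAMRO, gets the same effect by different means.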

Per the other topic, we know that geometry T&L starts before the frame buffers are swapped, so we need a place to store transformed polygons until they are ready to be rendered; this must be what the "polygon FIFO" is for, rather than storing pending texture updates. 3dfx Voodoo cards have an optional memory FIFO that works in the same way, allowing many triangles to be queued up in advance.

Re: How the Real3D configuration registers (0x9C) work

Post by Bart » Sun Aug 13, 2023 8:30 pm

Makes sense. So the working theory is that during GP, the Real3D is reading from polygon RAM, culling RAM, ping pong RAM and storing transformed results in update RAM? Then during DP, update RAM is read and the primitives therein are rasterized? I wonder if it's possible to read update RAM during DP to see what it contains. Would be interesting but obviously many more important things to figure out first :)

Re: How the Real3D configuration registers (0x9C) work

Post by Ian » Mon Aug 14, 2023 2:59 am

If you think about the 3dfx card, it didn't have hardware T&L, so polygons are projected onto the screen and clipped in software by the driver and then sent to the GPU. This is where the FIFO buffer comes in: they want to keep the rasterizer busy and don't want it to stall waiting for polygons.

The Real3D board has hardware T&L. The input coordinates of the polys are very different from the output coordinates, because they've been multiplied by the model view projection matrix, gone through the perspective divide, and then been multiplied by the viewport size. This means you need somewhere to store them. I'd be very surprised if the T&L happened entirely before the rasterizer ran, because that would mean you'd have to store a copy of every polygon, which would be very wasteful. But it would make sense to batch say 128 polys at a time, or some other arbitrary number, store them in a temp buffer and send them to the rasterizer. This means the T&L and rasterizer stages can run in parallel, and you only need a small buffer.
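
The three steps described above (multiply by the model view projection matrix, perspective divide, scale by the viewport size) look roughly like this in code. This is the generic textbook transform, not the Real3D's actual arithmetic: the types, the row-major matrix layout, and the OpenGL-style -1..1 normalized range with Y flipped are all assumptions made for the example.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } Vec4;
typedef struct { float m[4][4]; } Mat4;          /* row-major, column vectors */
typedef struct { float x, y, depth; } ScreenVertex;

static Vec4 mat4_mul_vec4(const Mat4 *m, Vec4 v)
{
    Vec4 r;
    r.x = m->m[0][0] * v.x + m->m[0][1] * v.y + m->m[0][2] * v.z + m->m[0][3] * v.w;
    r.y = m->m[1][0] * v.x + m->m[1][1] * v.y + m->m[1][2] * v.z + m->m[1][3] * v.w;
    r.z = m->m[2][0] * v.x + m->m[2][1] * v.y + m->m[2][2] * v.z + m->m[2][3] * v.w;
    r.w = m->m[3][0] * v.x + m->m[3][1] * v.y + m->m[3][2] * v.z + m->m[3][3] * v.w;
    return r;
}

/* Model space -> screen space for a single vertex. */
static ScreenVertex project_vertex(const Mat4 *mvp, Vec4 v,
                                   float viewport_w, float viewport_h)
{
    /* 1. Multiply by the model view projection matrix (clip space). */
    Vec4 clip = mat4_mul_vec4(mvp, v);
    assert(clip.w != 0.0f);

    /* 2. Perspective divide (normalized device coordinates, -1..1). */
    float ndc_x = clip.x / clip.w;
    float ndc_y = clip.y / clip.w;
    float ndc_z = clip.z / clip.w;

    /* 3. Scale by the viewport size; Y is flipped so 0 is the top row. */
    ScreenVertex s;
    s.x = (ndc_x * 0.5f + 0.5f) * viewport_w;
    s.y = (1.0f - (ndc_y * 0.5f + 0.5f)) * viewport_h;
    s.depth = ndc_z;
    return s;
}
```

The output vertex has nothing in common with the input one, which is the point being made: the transformed result has to live somewhere until the rasterizer consumes it, whether that is a small batch buffer or something larger.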

Re: How the Real3D configuration registers (0x9C) work

Post by Ian » Mon Aug 14, 2023 6:55 am

Thinking about it, what Matthew says makes a lot of sense.

Currently our frames look like this
[||||||irq2||||||||||||||||swapbuffers]

but really they should work like this

[irq2||||||||||||||||pingpong|||||swapbuffers]

Any writes after pingpong belong to the next frame, so it must be double buffered. Most games start writing the next frame as soon as the pingpong bit has flipped. It might be easier if we emulated it the way it actually works in the hardware.
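
A minimal model of that behaviour, assuming two banks where the CPU always writes to one and the renderer always reads the other, with the flip swapping their roles. The names and the tiny bank size are hypothetical, and how the real hardware lays the two banks out in memory is not known from this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PINGPONG_WORDS 8u  /* illustrative size only */

typedef struct {
    uint32_t bank[2][PINGPONG_WORDS];
    unsigned write_bank;   /* bank that CPU writes currently land in */
} PingPongRam;

static void pingpong_init(PingPongRam *p)
{
    memset(p, 0, sizeof *p);
}

/* CPU side: always writes to the bank being built for the next frame. */
static void pingpong_write(PingPongRam *p, uint32_t offset, uint32_t value)
{
    assert(offset < PINGPONG_WORDS);
    p->bank[p->write_bank][offset] = value;
}

/* Renderer side: always reads the bank the CPU is NOT writing to. */
static uint32_t pingpong_render_read(const PingPongRam *p, uint32_t offset)
{
    assert(offset < PINGPONG_WORDS);
    return p->bank[p->write_bank ^ 1u][offset];
}

/* The "pingpong" point in the frame: swap the roles of the two banks. */
static void pingpong_flip(PingPongRam *p)
{
    p->write_bank ^= 1u;
}
```

With this arrangement a write made right after the flip cannot disturb the frame being drawn, because it lands in the other bank and only becomes visible to the renderer at the next flip.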

Re: How the Real3D configuration registers (0x9C) work

Post by Bart » Mon Aug 14, 2023 5:02 pm

By pingpong, you mean flipping the ping pong buffers and kicking off GP (then DP), and swap buffers is displaying the result (for us, we render and then swap buffers), right? I think this makes sense. We can also correctly wire up IRQ4 and IRQ8 now, which is nice for completeness' sake.