Frame Timing

Technical discussion for those interested in Supermodel development and Model 3 reverse engineering. Prospective contributors welcome.
Forum rules
Keep it classy!

  • No ROM requests or links.
  • Do not ask to be a play tester.
  • Do not ask about release dates.
  • No drama!

Re: Frame Timing

Postby Bart » Mon Jun 13, 2016 9:18 pm

Spent some time looking into it. No breakthrough yet but I did notice that IRQ 0x02 (which is linked either to VBL, completion of processing on the Real3D, or both) is probably the only IRQ that needs to be fired during the Scud Race intro. IRQs 0x0D could be disabled without any visible impact. Interestingly, when 0x02 is disabled, the game gets stuck waiting for decrementer exceptions. Next, I'll have to disassemble the IRQ handler. I've done this several times over the years but always forget to save my work. This time, maybe I'll just commit my commented disassembly to the repo. Once I understand the IRQ, I'll have to dig in and figure out where the game is triggering the command port writes (0x88000000). I generated a trace of all subroutine calls during the FMV sequence so I should be able to home in on it. The game goes through dozens of subroutine calls per frame (*excluding* those triggered inside of IRQs), which is more than I expected in the FMV sequence!

My hunch is simply that 0x88000000 triggers an "action", either processing the texture FIFO or, if empty, rendering a frame, and then generating an IRQ. I should also check to see if games ever disable viewports when uploading textures mid-game, so that when 0x88000000 is triggered, no frame is rendered. That would be an even simpler protocol.
User avatar
Bart
Site Admin
 
Posts: 3086
Joined: Thu Sep 01, 2011 2:13 pm
Location: Reno, Nevada

Re: Frame Timing

Postby Ian » Tue Jun 14, 2016 2:12 am

Nice! It'll be quite exciting to fix these bugs :)
I've pretty much figured out how to make culling work, maybe later in the week will have some time to work on it.
I figured too we might be able to get a near/far by using the bounding boxes from only the leaf nodes (as apposed to the higher level nodes). By doing a min/max on those we'll get a pretty accurate representation of the Z values for each scene.
Ian
 
Posts: 2044
Joined: Tue Feb 23, 2016 9:23 am

Re: Frame Timing

Postby Bart » Wed Jun 15, 2016 9:54 pm

Progress is painfully slow due to time constraints. Busy with work and parents are coming to visit this weekend. Working through the IRQ handler with particular emphasis on IRQ 0x02. Here's what I've got so far:

Code: Select all
IRQHandler:
0x00001AC8: 0x3821FFC0  addi  r1,r1,-0x40
0x00001ACC: 0xBC01FF80  stmw  r0,-0x80(r1)
0x00001AD0: 0x3821FF80  addi  r1,r1,-0x80
0x00001AD4: 0x7FFA02A6  mfspr r31,srr0
0x00001AD8: 0x7FDB02A6  mfspr r30,srr1
0x00001ADC: 0x7FA00026  mfcr  r29
0x00001AE0: 0x7F8802A6  mfspr r28,lr
0x00001AE4: 0x7F6902A6  mfspr r27,ctr
0x00001AE8: 0x7F4102A6  mfspr r26,xer
0x00001AEC: 0xBF41FFE8  stmw  r26,-0x18(r1)
0x00001AF0: 0x3821FFC0  addi  r1,r1,-0x40
0x00001AF4: 0x7CA000A6  mfmsr r5
0x00001AF8: 0x54A504E2  and r5,r5,0xFFFFDFFF
0x00001AFC: 0x60A53032  ori r5,r5,0x3032
0x00001B00: 0x7CA00124  mtmsr r5
0x00001B04: 0x4C00012C  isync
0x00001B08: 0x480013F1  bl  0x00002EF8              ; ?? Step 2.x only? ??
0x00001B0C: 0x48000EE9  bl  0x000029F4              ; SCSI
0x00001B10: 0x3C60FE10  li  r3,0xFE100000
0x00001B14: 0x88630018  lbz r3,0x18(r3)
0x00001B18: 0x888D8008  lbz r4,-0x7FF8(r13)
0x00001B1C: 0x7C632038  and r3,r3,r4
0x00001B20: 0x5465063E  and r5,r3,0x000000FF
0x00001B24: 0x7CA03120  mtcrf 0x03,r5
0x00001B28: 0x40960014  bf  cr5[eq],0x00001B3C
0x00001B2C: 0x886D85B1  lbz r3,-0x7A4F(r13)
0x00001B30: 0x5460C801  rlwinm. r0,r3,25,0x80000000
0x00001B34: 0x41820008  bt  cr0[eq],0x00001B3C
0x00001B38: 0x48000F45  bl  0x00002A7C
0x00001B3C: 0x419F0109  btl cr7[so],0x00001C44    ; IRQ 0x01
0x00001B40: 0x419E0129  btl cr7[eq],IRQ02Handler  ; IRQ 0x02
0x00001B44: 0x419D01F1  btl cr7[gt],0x00001D34    ; IRQ 0x04
0x00001B48: 0x419C0191  btl cr7[lt],0x00001CD8    ; IRQ 0x08
0x00001B4C: 0x419A0279  btl cr6[eq],0x00001DC4    ; IRQ 0x20
0x00001B50: 0x419B00D1  btl cr6[so],0x00001C20    ; IRQ 0x10
0x00001B54: 0x4198025D  btl cr6[lt],0x00001DB0    ; IRQ 0x80
0x00001B58: 0x41990239  btl cr6[gt],0x00001D90    ; IRQ 0x40
0x00001B5C: 0x38600005  li  r3,0x00000005
0x00001B60: 0x38210040  addi  r1,r1,0x40
0x00001B64: 0xBB41FFE8  lmw r26,-0x18(r1)
0x00001B68: 0x7F4103A6  mtspr xer,r26
0x00001B6C: 0x7F6903A6  mtspr ctr,r27
0x00001B70: 0x7F8803A6  mtspr lr,r28
0x00001B74: 0x7FAFF120  mtcrf 0xFF,r29
0x00001B78: 0x7FDB03A6  mtspr srr1,r30
0x00001B7C: 0x7FFA03A6  mtspr srr0,r31
0x00001B80: 0x7C210B78  mr  r1,r1
0x00001B84: 0xB8410008  lmw r2,0x08(r1)
0x00001B88: 0x80010000  lwz r0,0x00(r1)
0x00001B8C: 0x80210004  lwz r1,0x04(r1)
0x00001B90: 0x38210040  addi  r1,r1,0x40
0x00001B94: 0x4C000064  rfi

IRQ02Handler:
0x00001C68: 0x806D8030  lwz r3,-0x7FD0(r13)
0x00001C6C: 0x7C7603A6  mtspr dec,r3    ; decrementer = -7fd0(r13)
0x00001C70: 0x3821FFF0  addi  r1,r1,-0x10
0x00001C74: 0x7C0802A6  mfspr r0,lr
0x00001C78: 0x90010000  stw r0,0x00(r1)
0x00001C7C: 0x480002A9  bl  0x00001F24    ; some TBL calculation
0x00001C80: 0x48001321  bl  0x00002FA0    ; LED
0x00001C84: 0x480013AD  bl  ReadControls  ; read inputs
0x00001C88: 0x48000151  bl  ReadReal3DStatusRegs
0x00001C8C: 0x804D8014  lwz r2,-0x7FEC(r13)
0x00001C90: 0x2C020000  cmpi  cr0,0,r2,0x00
0x00001C94: 0x7C4803A6  mtspr lr,r2
0x00001C98: 0x4C820021  bclrl 0x04,cr0[eq]
0x00001C9C: 0x80010000  lwz r0,0x00(r1)
0x00001CA0: 0x7C0803A6  mtspr lr,r0
0x00001CA4: 0x38210010  addi  r1,r1,0x10
0x00001CA8: 0x886D8034  lbz r3,-0x7FCC(r13)
0x00001CAC: 0x38630001  addi  r3,r3,0x01
0x00001CB0: 0x986D8034  stb r3,-0x7FCC(r13)
0x00001CB4: 0x38800002  li  r4,0x00000002
0x00001CB8: 0x3CA0F118  li  r5,0xF1180000
0x00001CBC: 0x38A50010  addi  r5,r5,0x10
0x00001CC0: 0x7C802D2C  stwbrx  r4,0,r5
0x00001CC4: 0x3C60FE10  li  r3,0xFE100000
0x00001CC8: 0x88630018  lbz r3,0x18(r3)
0x00001CCC: 0x70630002  andi. r3,r3,0x0002
0x00001CD0: 0x40A2FFF0  bf  cr0[eq],0x00001CC0
0x00001CD4: 0x4E800020  bclr  0x14,0


uint32_t  g_real3d_status_regs[9];  // @ -0x7a00(r13)
uint8_t   g_real3d_status_bits[5];  // @ -0x79dc(r13)

void ReadReal3DStatusRegs()
{
  volatile uint32_t *real3d_status_regs = 0x84000000;
  for (int i = 0; i < 9; i++)
    g_real3d_status_regs[i] = bswap32(real3d_status_regs[i]);
  uint32_t status_reg = g_real3d_status_regs[0];
  g_real3d_status_bits[0] = (status_reg & 0x08000000) >> 27;  // -79dc(r13)
  g_real3d_status_bits[1] = (status_reg & 0x04000000) >> 26;  // -79db(r13)
  g_real3d_status_bits[2] = (status_reg & 0x02000000) >> 25;  // -79da(r13) <-- read by game code
  g_real3d_status_bits[3] = (status_reg & 0x01000000) >> 24;  // -79d9(r13) <-- read by game code
  g_real3d_status_bits[4] = (status_reg & 0x00800000) >> 23;  // -79d8(r13) <-- read by game code
}


ReadReal3DStatusRegs:
0x00001DD8: 0x3821FFF0  addi  r1,r1,-0x10
0x00001DDC: 0x93E10000  stw r31,0x00(r1)
0x00001DE0: 0x93C10004  stw r30,0x04(r1)
0x00001DE4: 0x93A10008  stw r29,0x08(r1)
0x00001DE8: 0x3821FFF0  addi  r1,r1,-0x10
0x00001DEC: 0x7C0802A6  mfspr r0,lr
0x00001DF0: 0x90010000  stw r0,0x00(r1)
0x00001DF4: 0x3BC00000  li  r30,0x00000000            ; r30 = 0
0x00001DF8: 0x3BAD85FC  addi  r29,r13,-0x7A04         ; r29 = r13-7a04
0x00001DFC: 0x3BE00009  li  r31,0x00000009            ; r31 = 9
.L1:
0x00001E00: 0x387E0000  addi  r3,r30,0x00             ; r3 = r30
0x00001E04: 0x48000F29  bl  ReadReal3DStatusReg
0x00001E08: 0x947D0004  stwu  r3,0x04(r29)            ; r29 += 4; *r29 = r3
0x00001E0C: 0x3BDE0004  addi  r30,r30,0x04            ; r30 += 4
0x00001E10: 0x37FFFFFF  addic.  r31,r31,-0x01         ; r31--
0x00001E14: 0x40A2FFEC  bf  cr0[eq],0x00001E00        ; if (r31) goto .L1
0x00001E18: 0x806D8600  lwz r3,-0x7A00(r13)           ; r3 = -7a00(r13)
0x00001E1C: 0x54642FFE  rlwinm  r4,r3,5,0x00000001    ; r4 = (r3 & 0x08000000) >> 27
0x00001E20: 0x988D8624  stb r4,-0x79DC(r13)           ; -79dc(r13) = r4 byte
0x00001E24: 0x546437FE  rlwinm  r4,r3,6,0x00000001    ; r4 = (r3 & 0x04000000) >> 26
0x00001E28: 0x988D8625  stb r4,-0x79DB(r13)           ; -79db(r13) = r4 byte
0x00001E2C: 0x54643FFE  rlwinm  r4,r3,7,0x00000001    ; r4 = (r3 & 0x02000000) >> 25
0x00001E30: 0x988D8626  stb r4,-0x79DA(r13)           ; -79da(r13) = r4 byte
0x00001E34: 0x546447FE  rlwinm  r4,r3,8,0x00000001    ; r4 = (r3 & 0x01000000) >> 24
0x00001E38: 0x988D8627  stb r4,-0x79D9(r13)           ; -79d9(r13) = r4 byte
0x00001E3C: 0x54644FFE  rlwinm  r4,r3,9,0x00000001    ; r4 = (r3 & 0x00800000) >> 23
0x00001E40: 0x988D8628  stb r4,-0x79D8(r13)           ; -79d8(r13) = r4 byte
0x00001E44: 0x80010000  lwz r0,0x00(r1)
0x00001E48: 0x7C0803A6  mtspr lr,r0
0x00001E4C: 0x38210010  addi  r1,r1,0x10
0x00001E50: 0x83A10008  lwz r29,0x08(r1)
0x00001E54: 0x83C10004  lwz r30,0x04(r1)
0x00001E58: 0x83E10000  lwz r31,0x00(r1)
0x00001E5C: 0x38210010  addi  r1,r1,0x10
0x00001E60: 0x4E800020  bclr  0x14,0

ReadReal3DStatusReg:
0x00002D2C: 0x894D805C  lbz r10,-0x7FA4(r13)            ; r10 = -7fa4(r13) byte
0x00002D30: 0x5540C001  rlwinm. r0,r10,24,0x80000000    ; r0 = (r10 & 0x80) << 24
0x00002D34: 0x40820014  bf  cr0[eq],0x00002D48          ; if (r0 == 0) goto .L1
0x00002D38: 0x3D408400  li  r10,0x84000000              ; r10 = 0x84000000
0x00002D3C: 0x506A06BA  rlwimi  r10,r3,0,0x0000003C     ; r10 |= (r3 & 0x3c)
0x00002D40: 0x7C60542C  lwbrx r3,0,r10                  ; r3 = bswap32(*r10)
0x00002D44: 0x4E800020  bclr  0x14,0                    ; return
.L1:  ; this code path does not appear to be executed on Step 1.5
0x00002D48: 0x3D40C200  li  r10,0xC2000000
0x00002D4C: 0x892A000C  lbz r9,0x0C(r10)
0x00002D50: 0x5520C001  rlwinm. r0,r9,24,0x80000000
0x00002D54: 0x4082FFF8  bf  cr0[eq],0x00002D4C
0x00002D58: 0x3D208000  li  r9,0x80000000
0x00002D5C: 0x5069063A  rlwimi  r9,r3,0,0x000000FC
0x00002D60: 0x390A0010  addi  r8,r10,0x10
0x00002D64: 0x7D20452C  stwbrx  r9,0,r8
0x00002D68: 0x390A0014  addi  r8,r10,0x14
0x00002D6C: 0x7C60442C  lwbrx r3,0,r8
0x00002D70: 0x4E800020  bclr  0x14,0


Of the status bits, we already know that one of them has something to do with frame rendering. But what's interesting here is that I can see that a couple of others are read somewhere in the game code (not sure where, I just did a search through the entire disassembly and found them being used at one point -- could be a red herring, though). The decrementer register is also reloaded in IRQ 0x02. So this is definitely responsible for frame timing. This would also seem to indicate that the IRQ is probably not triggered more than once per frame... The plot thickens.

EDIT: just checked. The other bits seemingly have no effect (if I force them to 0 or flip them randomly, there is no effect). This is going to be tricky, I predict. I wonder if they're actually doing some sort of relatively precise cycle counting to time things.
User avatar
Bart
Site Admin
 
Posts: 3086
Joined: Thu Sep 01, 2011 2:13 pm
Location: Reno, Nevada

Re: Frame Timing

Postby Ian » Wed Jun 22, 2016 3:48 pm

there are these 2 functions in the api

PRO_GetRealtimeClockCount Get the processing time of a given logical unit.
PRO_GetProcessingTime Get the processing time of a given logical unit.
Ian
 
Posts: 2044
Joined: Tue Feb 23, 2016 9:23 am

Re: Frame Timing

Postby Bart » Wed Jun 22, 2016 7:08 pm

IRQ 0x04 and IRQ 0x02 seem to be linked. Currently, we assume that IRQ 0x02 gets triggered on VBL. It seems to measure how many cycles have elapsed since IRQ 0x04 and saves that number. Need to see how this number is being used... Almost certainly related to frame timing or the timing of Real3D processing (perhaps related to the API functions you found).
User avatar
Bart
Site Admin
 
Posts: 3086
Joined: Thu Sep 01, 2011 2:13 pm
Location: Reno, Nevada

Re: Frame Timing

Postby Bart » Wed Jun 22, 2016 9:03 pm

To-do: IRQs are a little wacky in Supermodel. Looks like I'm allowing nested IRQs but I'm not so sure that's such a great idea. Probably better to trigger IRQs only when we go from a non-pending->pending state.
User avatar
Bart
Site Admin
 
Posts: 3086
Joined: Thu Sep 01, 2011 2:13 pm
Location: Reno, Nevada

Re: Frame Timing

Postby Ian » Thu Jun 23, 2016 2:46 am

something I observed from ocean hunter and the ski game
Both visibly updates the textures, before swap buffers are called, which seems wrong
Ian
 
Posts: 2044
Joined: Tue Feb 23, 2016 9:23 am

Re: Frame Timing

Postby Bart » Sun Jun 26, 2016 6:14 pm

Ian wrote:something I observed from ocean hunter and the ski game
Both visibly updates the textures, before swap buffers are called, which seems wrong


Hopefully this is related to the Scud Race Plus FMV issue. I haven't had much time in the last couple of weeks (my parents recently visited and I've got some other personal projects that have been taking more time) but I did recently discover Scud's main program loop:

Code: Select all
0x0001495C: 0x886DC007  lbz r3,-0x3FF9(r13)
0x00014960: 0x2C030000  cmpi  cr0,0,r3,0x00
0x00014964: 0x4082002C  bf  cr0[eq],0x00014990  ; enter test menu
0x00014968: 0x38600000  li  r3,0x00000000
0x0001496C: 0x906D800C  stw r3,-0x7FF4(r13)
0x00014970: 0x4801E6D9  bl  0x00033048
0x00014974: 0x4BFF2B6D  bl  0x000074E0
0x00014978: 0x48000105  bl  0x00014A7C
0x0001497C: 0x4800014D  bl  0x00014AC8
0x00014980: 0x4802AA11  bl  0x0003F390
0x00014984: 0x4BFF2C35  bl  0x000075B8
0x00014988: 0x4801D599  bl  0x00031F20          ; timing
0x0001498C: 0x4BFFFFD0  b 0x0001495C


The subroutine at 0x31f20 appears to control the refresh rate. NOPing out the call to 0x75b8 prevents 3D graphics from being drawn and causes a texture FIFO overflow in the FMV sequence. Hopefully these two functions narrow down the problem and I don't have to spend any time in the actual FMV decompression/texture upload code, which is extremely complex.
User avatar
Bart
Site Admin
 
Posts: 3086
Joined: Thu Sep 01, 2011 2:13 pm
Location: Reno, Nevada

Re: Frame Timing

Postby Bart » Sun Jun 26, 2016 8:15 pm

What I've got so far in terms of translating the main loop and relevant exception handlers:

Code: Select all
uint8_t   g_dec_exception_count;    // @ -0x7fc9(r13)                     DeCExceptionHandler()
uint32_t  g_unknown_7fe0;           // @ -0x7fe0(r13)                     DecExceptionHandler()
uint32_t  g_unknown_7fd8;           // @ -0x7fd8(r13) (related to IRQ08?) DecExceptionHandler()

void      (*g_IRQ04Callback)();     // @ -0x7fe8(r13) (0x100018)          IRQ04Handler()
uint8_t   g_irq04_count;            // @ -0x7fca(r13)                     IRQ04Handler()
uint32_t  g_irq04_timestamp;        // @ -0x7fdc(r13) TBL cycle count at last IRQ 0x04, or 0 if not triggered(?)
uint32_t  g_time_since_irq04;       // @ -0x7fd4(r13) TBL cycles elapsed since IRQ 0x04, or 0 if g_irq04_time was 0

uint8_t   g_unknown_7ff7;           // @ -0x7ff7(r13)                     Sub31F20()
uint8_t   g_unknown_7ff6;           // @ -0x7ff6(r13)                     Sub31F20()
uint32_t  g_unknown_7fc8;           // @ -0x7fc8(r13)                     Sub31F20()

uint32_t  g_real3d_status_regs[9];  // @ -0x7a00(r13)
uint8_t   g_real3d_status_bits[5];  // @ -0x79dc(r13)

void Sub31F20()
{
  /*
   * Wait until dec has been re-loaded (as far as I can tell, this only happens
   * in the IRQ02 handler). Note the dec exception occurs when dec goes 0 -> -1.
   */
  while (DEC() < 0)
    ;
  write_reg_32_le(0xf118000c, 3); // unknown functionality -- some sort of ack?
  unsigned num_dec = g_dec_exception_count;
  while (num_dec == g_dec_exception_count)
    ;
  if (g_dec_exception_count >= 2)
    g_unknown_7ff7 += 1;
  g_dec_exception_count = 0;
  g_unknown_7ff6 = 0;
  g_unknown_7fc8 = 0;
}

void IRQ04Handler()
{
  g_irq04_timestamp = TBL();
  if (g_IRQ04Callback)
    g_IRQ04Callback();
  g_irq04_count++;
  volatile uint32_t *tilegen_irq_ack = 0xf1180010;  // tilegen
  volatile uint8_t *irq_pending = 0xfe100018;       // irq controller
  do
  {
    *tilegen_irq_ack = bswap32(0x04);
  } while (*irq_pending & 0x04);
}

// Called in IRQ02 handler!
void MeasureTimeSinceIRQ04()
{
  g_time_since_irq04 = 0;
  if (g_irq04_timestamp != 0)
    g_time_since_irq04 = TBL() - g_irq04_timestamp;
  else
    g_time_since_irq04 = 0;
  g_irq04_timestamp = 0;
}

void ReadReal3DStatusRegs()
{
  volatile uint32_t *real3d_status_regs = 0x84000000;
  for (int i = 0; i < 9; i++)
    g_real3d_status_regs[i] = bswap32(real3d_status_regs[i]);
  uint32_t status_reg = g_real3d_status_regs[0];
  g_real3d_status_bits[0] = (status_reg & 0x08000000) >> 27;  // -79dc(r13)
  g_real3d_status_bits[1] = (status_reg & 0x04000000) >> 26;  // -79db(r13)
  g_real3d_status_bits[2] = (status_reg & 0x02000000) >> 25;  // -79da(r13) <-- read by game code
  g_real3d_status_bits[3] = (status_reg & 0x01000000) >> 24;  // -79d9(r13) <-- read by game code
  g_real3d_status_bits[4] = (status_reg & 0x00800000) >> 23;  // -79d8(r13) <-- read by game code
}

void DecExceptionHandler()
{
  // ... Save registers and some MSR stuff ...

  g_dec_exception_count += 1;
 
  // Following code is Sub1EFC()
  if (g_unknown_7fe0 != 0)
    g_unknown_7fd8 = TBL() - g_unknown_7fe0;
  else
    g_unknown_7fd8 = 0;
  g_unknown_7fe0 = 0;
 
  // ... Restore registers and RFI ...
}
User avatar
Bart
Site Admin
 
Posts: 3086
Joined: Thu Sep 01, 2011 2:13 pm
Location: Reno, Nevada

Re: Frame Timing

Postby Ian » Mon Jun 27, 2016 10:39 am

Only briefly looked at this
however ..

Code: Select all
 
  g_real3d_status_bits[0] = (status_reg & 0x08000000) >> 27;  // -79dc(r13)
  g_real3d_status_bits[1] = (status_reg & 0x04000000) >> 26;  // -79db(r13)
  g_real3d_status_bits[2] = (status_reg & 0x02000000) >> 25;  // -79da(r13) <-- read by game code
  g_real3d_status_bits[3] = (status_reg & 0x01000000) >> 24;  // -79d9(r13) <-- read by game code
  g_real3d_status_bits[4] = (status_reg & 0x00800000) >> 23;  // -79d8(r13) <-- read by game code


That isn't
Code: Select all
   unsigned long  spare1       : 4;
   unsigned long  gp_done      : 1;
   unsigned long  dp_done      : 1;
   unsigned long  ping_pong    : 1;
   unsigned long  update_done  : 1;
   unsigned long  rend_done    : 1;
   unsigned long  tot_clks     :23;


Is it ? :)
Ian
 
Posts: 2044
Joined: Tue Feb 23, 2016 9:23 am

PreviousNext

Return to The Dark Room

Who is online

Users browsing this forum: No registered users and 1 guest

cron