S-buffer: how the Model 3 handles anti-aliasing and translucency

Technical discussion for those interested in Supermodel development and Model 3 reverse engineering. Prospective contributors welcome. Not for end-user support.
Forum rules
Keep it classy!
  • No ROM requests or links.
  • Do not ask to be a play tester.
  • Do not ask about release dates.
  • No drama!
gm_matthew
Posts: 15
Joined: Wed Nov 08, 2023 2:10 am

S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by gm_matthew »

I had been pondering for a while exactly how translucency worked on the Model 3, but recently, after studying this patent and digging into the Real3D firmware, I believe I may have discovered the key to how not just translucency but also anti-aliasing is implemented on the Model 3: the S-buffer.

First of all, what does the S in S-buffer stand for? Actually, I have no idea. But I do know that it contains metadata used to help Jupiter convert the video output into an anti-aliased image.

In the Real3D firmware it is possible to perform an "Earth snap test" that captures and displays some of the data being processed by the pixel processor (Earth), including what appears to be the output to the frame buffer:

Code: Select all

xxxxxxxx xxxxxxxx -------- -------- -------- -------- -------- -------- IRNG
-------- -------- --xxxxx- -------- -------- -------- -------- -------- Top Xing
-------- -------- -------x xxxx---- -------- -------- -------- -------- Bottom Xing
-------- -------- -------- ----xxxx x------- -------- -------- -------- Left Xing
-------- -------- -------- -------- -xxxxx-- -------- -------- -------- Right Xing
-------- -------- -------- -------- ------x- -------- -------- -------- Xluc Flag
-------- -------- -------- -------- -------x -------- -------- -------- Poly Xluc Flg
-------- -------- -------- -------- -------- xxxxxxxx xxxxxxxx xxxxxxxx Pix Color (RGB)
Each Earth ASIC outputs 16-bit data to three 3D-RAMs, for 48 bits in total. I'm presuming that IRNG stands for "ignoring", i.e. those bits are dropped so that the remaining data fits into a 48-bit word spread across the three 3D-RAMs. Xing stands for "edge crossing"; the patent describes these in (somewhat convoluted) detail, but essentially they describe how close a rendered polygon edge is to the pixel sample point. Xluc stands for translucency; there are two flags used to determine whether anti-aliasing should be performed for this particular pixel.
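
For illustration, here's one way the snap test word could be modeled in code. The bit positions come straight from the table above, but the struct and unpacking function are my own stand-ins for the sake of example, not anything taken from the firmware:

Code: Select all

#include <cstdint>

// Hypothetical view of one S-buffer entry as captured by the Earth snap test.
// Field positions follow the 64-bit layout above; note that bits 47-46 appear
// to be unused, and only the low 48 bits would be stored across the three
// 3D-RAMs (16 bits each) if IRNG is indeed dropped.
struct SBufferEntry
{
    uint16_t irng;         // IRNG - presumably "ignoring", bits 63-48
    uint8_t  topXing;      // 5-bit top edge crossing, bits 45-41
    uint8_t  bottomXing;   // 5-bit bottom edge crossing, bits 40-36
    uint8_t  leftXing;     // 5-bit left edge crossing, bits 35-31
    uint8_t  rightXing;    // 5-bit right edge crossing, bits 30-26
    bool     xlucFlag;     // translucency flag, bit 25
    bool     polyXlucFlag; // polygon translucency flag, bit 24
    uint32_t pixColor;     // 24-bit RGB color, bits 23-0
};

static SBufferEntry Unpack(uint64_t w)
{
    SBufferEntry e;
    e.irng         = uint16_t(w >> 48);
    e.topXing      = uint8_t((w >> 41) & 0x1F);
    e.bottomXing   = uint8_t((w >> 36) & 0x1F);
    e.leftXing     = uint8_t((w >> 31) & 0x1F);
    e.rightXing    = uint8_t((w >> 26) & 0x1F);
    e.xlucFlag     = ((w >> 25) & 1) != 0;
    e.polyXlucFlag = ((w >> 24) & 1) != 0;
    e.pixColor     = uint32_t(w & 0xFFFFFF);
    return e;
}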

Anti-aliasing is performed by taking the video output and blending adjacent pixels together using weighted averages calculated from the edge crossing values. Let's take a look at an example:

Image

A, B, C and D represent the sample points; each of these contains a 24-bit RGB color value, and top, bottom, left and right edge crossing values. In this particular case, A and C have right edge crossings and B and D have left edge crossings. (If an edge has no crossing, the value is set to zero.) The overall color of the pixel uses a weighted average of A, B, C and D depending on the values of the edge crossings; this pixel will be more weighted towards the color of sample points A and C.

Let's look at another example:

Image

This pixel will be weighted more towards the colors of sample points B, C and D.
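
As a toy numeric version of this second example (the weights are invented purely to show the mechanics):

Code: Select all

#include <cstdio>

// B, C and D get larger weights here, so the final pixel color lands close
// to theirs; a real implementation would derive these weights from the edge
// crossing values rather than hard-coding them.
int main()
{
    float colors[4][3] = { {0,0,0}, {1,0,0}, {1,0,0}, {1,0,0} }; // A, B, C, D
    float weights[4]   = { 0.10f, 0.30f, 0.30f, 0.30f };         // sum to 1.0
    float out[3] = { 0, 0, 0 };
    for (int i = 0; i < 4; i++)
        for (int c = 0; c < 3; c++)
            out[c] += weights[i] * colors[i][c];
    std::printf("pixel = (%.2f, %.2f, %.2f)\n", out[0], out[1], out[2]); // (0.90, 0.00, 0.00)
    return 0;
}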

I should point out now that this would not be a practical way to implement anti-aliasing in Supermodel; calculating the edge crossing values for each pixel when rendering on a GPU would be extremely difficult, if not outright impossible. It would be a lot more practical to implement a more conventional anti-aliasing method such as SSAA or MSAA; I believe Ian has been thinking about implementing SSAA.

So where does translucency come into this? Well, using the S-buffer it is possible to implement translucency by rendering only to every other pixel and then manipulating the edge crossing values to bias the color of the pixels. For example, suppose that we are rendering a red triangle onto a black background. The triangle is rendered like this, regardless of the translucency level:

Image

The edge crossing values are manipulated to bias the displayed pixels towards the red or black sample points depending on the desired translucency level. For example, at 50% translucency the triangle will end up being displayed like this:

Image
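
In code, the mechanism might look something like this (the alternating pattern and the direction of the bias are my assumptions for illustration; the patent excerpts later in this thread give the actual details):

Code: Select all

// Toy model of stipple translucency. A translucent polygon writes only the
// sample points its pattern selects, and complementary patterns exist so two
// polygons can interleave. The checkerboard and the bias formula below are
// guesses, not confirmed hardware behavior.
static bool SampleEnabled(int x, int y, bool patternSelect)
{
    bool even = ((x + y) & 1) == 0;          // alternating sample points
    return patternSelect ? !even : even;     // PS bit selects the complement
}

static float BiasCrossing(float crossing, float translucency)
{
    // Shrink the written sample's edge crossing as translucency rises, so the
    // post-filter weights it less and the background samples show through more.
    return crossing * (1.0f - translucency);
}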

All of this means that translucency can be achieved without alpha blending at all, which means there is no need to sort polygons. However, there are a few caveats:
  • Rendering a translucent polygon on top of another with the same translucency pattern erases the previous polygon.
  • If two polygons of opposing translucency patterns overlap, the result is opaque as the background is completely overwritten.
  • Resolution is effectively halved for translucent polygons.
So what does this mean for Supermodel? Well, it allowed me to discover that the result of two overlapping translucent polygons is always opaque, which has helped me out with my upcoming LOD blending implementation. But I think at least for now, we should stick with our current approach of simulating the results by rendering to separate layers and blending them together at the end.
Bart
Site Admin
Posts: 87
Joined: Tue Nov 07, 2023 5:50 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by Bart »

I believe the "S" stands for "span". As in a horizontal span of pixels (from left to right edge on the current line for the polygon). This is amazing insight, Matthew! The term was thrown around in the '90s, possibly inconsistently, but I remember reading this document ages ago and seeing it mentioned there for the first time.
gm_matthew
Posts: 15
Joined: Wed Nov 08, 2023 2:10 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by gm_matthew »

Bart wrote: Sun Nov 12, 2023 11:56 pm I believe the "S" stands for "span". As in a horizontal span of pixels (from left to right edge on the current line for the polygon). This is amazing insight, Matthew!
I'm actually not sure; I saw this article, but the "s-buffer" described in it seems to be different from the "s-buffer" used by Real3D.
Ian
Posts: 26
Joined: Wed Nov 08, 2023 10:26 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by Ian »

Resolution is effectively halved for translucent polygons.

Is that really the case? You'd think we would be able to see that in video captures. I thought most early AA implementations basically just smoothed the edges of the polygons.
Bart
Site Admin
Posts: 87
Joined: Tue Nov 07, 2023 5:50 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by Bart »

Am I understanding this?

- The sample points are on pixel boundaries (not center points).
- It doesn't matter which side of the diagonal lines (polygon edges) the polygon is filled on, because the colors at the sample points have already been rasterized somewhere (or more generally, they are known or can readily be fetched). The closer a sample point is to the edge, the less that point is weighted.

Presumably this happens *during* polygon rasterization to the frame buffer. S-buffer would seem to imply a list of horizontal polygon spans but it seems impractical that *all* spans for the entire frame would be stored. Do you think one of the ASICs (which one? Maybe Earth does this itself?) spits out an entire list of spans top-to-bottom for a polygon (each span would just be two x coordinates and one y coordinate), passes it to the Earth ASIC to perform this test, which would involve reading from the framebuffer?

If I understand correctly, the span x1 and x2 points would be stored in sub-pixel resolution. When rendering an opaque polygon, the system would need to sample from the frame buffer for each adjacent set of spans, on both sides. So if a polygon is 100 pixels tall, that would be: 4 pixels * (100 + 1) * 2 = 808 pixels to read and average together. Nearly half of those would not need to be sampled from the framebuffer because they would be sampled from texture RAM or the polygon color directly (the "interior" of the spans; i.e., the interior of the polygon).

Or am I misunderstanding?

EDIT: Looking at the patent now, it seems the edges are the polygon edges directly (not just horizontal spans, but the 3 or 4 edges between verts).
Bart
Site Admin
Posts: 87
Joined: Tue Nov 07, 2023 5:50 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by Bart »

Ian wrote: Mon Nov 13, 2023 12:33 am Resolution is effectively halved for translucent polygons.

Is that really the case? You'd think we would be able to see that in video captures. I thought most early AA implementations basically just smoothed the edges of the polygons. To do that you would just need to store the angle of the slope.
This would be really easy to test if I could display polygons on real hardware using an alternating line pattern. Really need to build a transfer cable.
Ian
Posts: 26
Joined: Wed Nov 08, 2023 10:26 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by Ian »

This is how it worked on the n64.

https://ultra64.ca/files/documentation/ ... ok%20jaggy.

My guess is it would be pretty similar to that. I could be wrong, but I'd be super surprised if it really rendered to every other line for alpha polys. Most games only use layer 0 unless they're doing LOD blending. The hardware could do alpha blending; it was basically built into the RAM.
Bart
Site Admin
Posts: 87
Joined: Tue Nov 07, 2023 5:50 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by Bart »

Some interesting notes on the patent:

- Controller (CPU), Geometry Processor, and Display Processor (pixel processor) all handle 3 different scenes in a pipeline. CPU is preparing frame 2 while geometry processor is transforming and setting up polygons from frame 1, and display processor is outputting frame 0.

- There is a weird description of depth sorting involving a priority list *and* a depth buffer, which doesn't make much sense to me and would seem to readily describe how Model 2 worked: it sorted polygons into a list using programmable criteria (e.g., which vertex to consider for depth, or whether to use the centroid). Is it possible Model 3 lacks a depth buffer and just sorts polygons? That would certainly put a strict bound on how many could be rendered per frame: the size of the polygon list. [EDIT: this language dates back to a patent from the 1980s and has just been re-used, so it's hard to tell what is actually being done by the Pro-1000]

Code: Select all

The Geometry Processor reads, from a database, descriptions of objects
that are potentially visible in the stored three-dimensional digital
representation of the scene, and the objects that are read are rotated
into display coordinates using the rotation matrices calculated in the
Controller. The Geometry Processor mathematically projects the
three-dimensional data onto the two-dimensional display window. In
addition, the Geometry Processor calculates (as by use of a depth-buffer
or the like) which objects are in front of or behind other objects and
stores this information in a priority list.
- S-buffer almost certainly refers to "span buffer". The patent references "span/sub-span" methods and, crucially, this patent that uses the term "spans". This is from 1989 and assigned to General Electric (GE made flight simulators).

- This related patent from Martin Marietta includes some more information including a flow-chart. They describe the process as being performed pixel-by-pixel, as if each polygon is rasterized by scanning the entire screen. I think more than likely there is some sort of edge/span buffer.

- "Spans" are not horizontal segments as I thought. They're almost like screen tiles? From the 1989 GE patent:
span.jpg

Code: Select all

As is apparent, some spans at the boundary of the face will contain only
a small portion of the face and some spans will be completely covered by
the face. FIG. 4 shows a face 30 and a face 32 and a set of spans that
need to be identified in order to process the face in detail. In
particular, span 34 is outside face 30 and face 32. Span 38 is wholly
contained within face 30 and not within face 32, and span 36 is on the
edge of face 30 and face 32. The part of the face that lies within each
span is processed in detail by the span processing of the display
processor. For a more detailed description of spans, faces, and span
processing, reference is made to U.S. patent application Ser. No.
810,738 filed Dec. 19, 1985 (still pending) and titled Method of Edge
Smoothing for a Computer Image Generation System, and to U.S. patent
application Ser. No. 810,737 filed Dec. 19, 1985 (still pending) and
titled Method of Comprehensive Distortion Correction for a Computer
Image Generation System, the disclosure of both of which are hereby
incorporated by reference.
Also, this patent's use of "depth buffer" doesn't appear to mean a memory region as we would understand it today. It seems like the whole processing algorithm involving the s-buffer method is referred to as the "depth buffer".

- A patent was filed in 1985 but still pending as of the filing above that discusses in greater detail span processing (Ser. No. 810,738 filed Dec. 19, 1985; but I couldn't quickly find it).
gm_matthew
Posts: 15
Joined: Wed Nov 08, 2023 2:10 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by gm_matthew »

Literally the only actual mention of "S-buffer" in the patent itself (other than the title) is this paragraph:
With reference to FIG. 7, if 2×2 S buffered subpixels per pixel 60 are desired for better sampling, the pixel resolution can be increased by a factor of 4, and then a post filter 62 applied to filter the pixels down to the display resolution.
One of the Jupiter modeword register flags is "Disable S-Buffer", which I can only presume must refer to the edge crossing values and translucency flags; if this is not the S-buffer then what is?

Model 3 definitely has a depth buffer; the video RAM tests make explicit reference to "depth buffer memory", and the Real3D developer's guide makes a reference to a z-buffer in the function PolygonIsLayered().
Bart
Site Admin
Posts: 87
Joined: Tue Nov 07, 2023 5:50 am

Re: S-buffer: how the Model 3 handles anti-aliasing and translucency

Post by Bart »

gm_matthew wrote: Mon Nov 13, 2023 2:35 am Literally the only actual mention of "S-buffer" in the patent itself (other than the title) is this paragraph:
With reference to FIG. 7, if 2×2 S buffered subpixels per pixel 60 are desired for better sampling, the pixel resolution can be increased by a factor of 4, and then a post filter 62 applied to filter the pixels down to the display resolution.
One of the Jupiter modeword register flags is "Disable S-Buffer", which I can only presume must refer to the edge crossing values and translucency flags; if this is not the S-buffer then what is?
I guess with S-buffering turned off you'd have a lack of blending at the edges and presumably a lack of transparency if this is how it's implemented.

By the way, this patent was filed in 1994 and granted in 1997 and is therefore probably exactly what found its way into the Pro-1000. They describe the polygon rasterization algorithm, including translucency in considerable detail. It's not exactly as you describe it but close: translucency is achieved by disabling sampling points. But they don't use a horizontal stipple pattern necessarily, which would halve the horizontal resolution of the texture map as Ian pointed out. Check out Fig. 5a and page 18:

Code: Select all

played or further processed, as required).

The edge intercept calculator performs the process steps shown in the
flow chart of FIG. 5a: the pixel row line Lp and column Jp data signals
are received in step 50 and the polygon 36 vertex data signals are
received in step 51. In step 52, the beginning of a display line L loop,
the calculator determines which display lines L, from the top line
(where L=L0) to the bottom line (L=Lmax), have the polygon present in at
least part of one pixel; these line numbers L are temporarily recorded
and the first of the line numbers Lp is set. This line number is entered
into step 53, where the left and right polygon edge-rowline
intersections Jl and Jr are found for that line; these limits are also
temporarily stored. Step 54 begins a nested pixel loop, considering each
pixel Jp along the line Lp established in step 52, from the limits (left
to right) found in step 53. Inside the nested loop are found steps 55,
56, 57 and 58, respectively acting for: finding, for the top-left corner
C0 of the pixel, the four crossings C and the associated normalized four
distances D of that corner constellation, clamping the distance D to a
maximum 1 pixel value, and setting a "covered" flag if the particular
distance D is greater than 1 pixel distance (step 55); operating on some
aspect of the pixel data if the polygon is translucent (translucency
considerations will be discussed in detail hereinbelow); computing the
corner point C0 color as a function of all related factors, such as
polygon color, texture effects (color, modulation, and the like) and so
forth (step 57); and then temporarily storing the relevant data signals
in video memory (the frame buffer 44). Thereafter, step 59 is entered
and the J loop determines: will the J value for a next-sequential pixel
exceed the righthand limit Jr? If not, further pixel processing on that
Lp line can proceed; output 59a is used and step 60 is entered, to
return the loop to step 54, with J=J+1. If the present pixel was at Jr,
the next pixel contains no part of the present polygon, and output 59b
is exited so that the process moves to the next line L=(Lp+1). Step 61
is entered and the line loop determines if the next-sequential line will
exceed the lower line limit Lmax for that polygon. If not, further pixel
processing on that line can proceed; output 61a is used and step 62 is
entered so that the line loop actually now returns to step 54, with
L=L+1; the pixel loop for that new line will be traversed. If the
just-completed line was at the bottom of the polygon, the next line
contains no part of the present polygon, and output 61b is exited so
that the process moves to the next polygon 36'. When the present polygon
processing is finished, step 63 is entered and the polygon loop
determines if all polygons in the present view window 34 have been
considered. If other polygons remain to be processed, exit 63a is taken,
the polygon designator is advanced at step 64 and the set of
next-polygon vertices fetched (step 51). If no other polygons remain for
consideration, further pixel processing in this system portion is not
necessary for that display frame; output 63b is exited so that the
process moves to the next image window frame of data signals and begins
the edge intercept calculation process for the pixel data of that
display frame.

TRANSLUCENCY

Each face polygon 36 has a certain, known degree of translucency. We
normalize the translucency level so that the level can be denoted by a
number between 0.0 (perfectly transparent) and 1.0 (completely opaque).
Translucency is accomplished by disregarding or disabling particular
pixel corner sample points, so that even though the polygon may lie on
(i.e., cover) the sample corner point, the polygon characteristics data
is not written into the video memory for that pixel corner; the edge
crossings associated with the omitted corner are also not written into
memory. By disabling sampling points depending on the amount of
translucency, polygons visually behind the translucent polygon can be
seen. The more translucent the polygon, the more sample points are
disabled. This is analogous to poking holes in the translucent polygon
to see what is behind. A pattern select code and translucency value are
assigned to a polygon. Translucency value indicates how many sample
points to disable, and the pattern select indicates which sample points
to disable.

The disablement process may be understood by reference to FIG. 6a, where
five different transparency levels T are shown, for each of the two
levels of a pattern select (PS) bit. It will be seen that for different
ranges of T levels, the translucency disablement pattern for the pattern
select bit at a low binary level (PS=0) is complementary to the selected
pattern with the pattern select bit at a high binary level (PS=1). While
only five levels of translucency T are illustrated, more translucency
levels are computable. The number of T levels is achieved by modifying
the edge crossings, as shown in FIGS. 6b-6e, as a function of
translucency. The more translucent the face polygon, the less the area
assigned to a sample point. This procedure increases or decreases the
area of the polygon on each sample corner point: as seen in FIG. 6b, the
translucency level T is sufficiently low (0<=T<1/4) that only one pixel
corner C1 is part of the selected pattern, and the other corners C2, C3
and C4 of that same pixel are disabled and not considered; the left and
right crossings are moved toward the sample point, so that distances Dl
and Dr respectively become 1.0 and 0.0, while the modified bottom
crossing distance Db' becomes (4*T*Db), where Db is the unmodified
bottom crossing 40" distance to corner C1. Similarly, the top crossing
40' distance Dt is modified to Dt' = 1 - (4*T*(Ds-Dt)), where Ds is the
unit pixel edge segment distance. As the translucency level T increases,
the edge crossings are moved away from the corner sample point, which
increases the effective area of the face polygon in the pixel being
considered.
5 different translucency levels, huh? That sounds familiar :)

Looks to me like the rasterizer takes the polygon vertices, computes ymin and ymax of the polygon, then loops over those scanlines. For each scanline, it loops from xmin to xmax. For each pixel, it computes the distances to the polygon edge, as in your diagram and clamps the distance to 1 pixel. The crossing distances, translucency, color (polygon color, texture color, with lighting and modulation applied) are stored to the frame buffer along with a "covered flag".
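
Putting that flow into code, a rough sketch of the loop structure (the control flow follows FIG. 5a as described above, but every type and helper stub here is mine):

Code: Select all

#include <cstdint>

struct Crossings { float top, bottom, left, right; };

struct FBEntry
{
    uint32_t  color;       // corner color: polygon color + texture (step 57)
    Crossings xing;        // clamped edge crossing distances (step 55)
    bool      covered;     // set when a crossing distance exceeds one pixel
    bool      translucent;
};

struct Polygon { int yMin, yMax; bool translucent; bool patternSelect; };

// Stubs standing in for the real edge math:
static void FindSpan(const Polygon &, int, int &xl, int &xr) { xl = 0; xr = -1; }
static Crossings EdgeDistances(const Polygon &, int, int) { return { 0, 0, 0, 0 }; }
static uint32_t  ShadeCorner(const Polygon &, int, int)   { return 0; }
static bool      SampleEnabled(int x, int y, bool ps)     { return (((x + y) & 1) == 0) != ps; }

static bool ClampToOnePixel(Crossings &c)
{
    bool covered = c.top > 1 || c.bottom > 1 || c.left > 1 || c.right > 1;
    if (c.top    > 1) c.top    = 1;
    if (c.bottom > 1) c.bottom = 1;
    if (c.left   > 1) c.left   = 1;
    if (c.right  > 1) c.right  = 1;
    return covered;
}

static void Rasterize(const Polygon &poly, FBEntry *fb, int pitch)
{
    for (int y = poly.yMin; y <= poly.yMax; y++)       // line loop (steps 52/61)
    {
        int xl, xr;
        FindSpan(poly, y, xl, xr);                     // step 53
        for (int x = xl; x <= xr; x++)                 // pixel loop (steps 54/59)
        {
            if (poly.translucent && !SampleEnabled(x, y, poly.patternSelect))
                continue;                              // disabled sample point
            Crossings c  = EdgeDistances(poly, x, y);  // step 55
            bool covered = ClampToOnePixel(c);
            uint32_t col = ShadeCorner(poly, x, y);    // step 57
            fb[y * pitch + x] = { col, c, covered, poly.translucent };
        }
    }
}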

EDIT:

Yes, stipple is used. I *think* this is what's happening per pixel:

- 8 neighboring pixels are considered.
- Our pixel is in the center.
- This means there are 4x4 corners in play. These are the "sample points."
- The color value in the frame buffer for each pixel is its top-left sample point.
- The rasterizer draws to the frame buffer and stores the translucency flag, some "covered" flag, and edge mumbo jumbo per pixel in addition to the color, which again, will be used for the top-left sample point.
- During a post-processing step (after all polygons have been drawn?), we have all of our sample values set up in each pixel of the frame buffer but now we need to compute the actual color of the pixel at its center point. This will be some mix of all 4 of its sample point corners but those sample points will be modulated by the edge calculations. And that is why 16 (4x4) sample points are involved because each of those 4 sample points can be influenced by its neighbors to result in the final color.
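
Here's a rough sketch of that resolve step as I'm picturing it; the corner gathering follows the list above, but the weighting function is my own invention since the actual filter kernel isn't documented:

Code: Select all

struct Sample
{
    float r, g, b;                  // sample point color (top-left corner)
    float top, bottom, left, right; // edge crossing distances for that corner
};

static float CornerWeight(const Sample &s)
{
    // Guess: crossings close to a corner shrink the polygon area that corner
    // represents within the pixel, and therefore its weight.
    return (1.0f - 0.5f * (s.left + s.right)) * (1.0f - 0.5f * (s.top + s.bottom));
}

// Resolve pixel (x, y); caller must ensure x+1 and y+1 are in range. The four
// corners are this pixel's own entry plus three neighbors, which is why a
// full-screen resolve ends up touching every adjacent frame buffer entry.
static void ResolvePixel(const Sample *fb, int pitch, int x, int y, float out[3])
{
    const Sample *c[4] = { &fb[y * pitch + x],       &fb[y * pitch + x + 1],
                           &fb[(y + 1) * pitch + x], &fb[(y + 1) * pitch + x + 1] };
    float w[4], total = 0.0f;
    for (int i = 0; i < 4; i++) { w[i] = CornerWeight(*c[i]); total += w[i]; }
    out[0] = out[1] = out[2] = 0.0f;
    for (int i = 0; i < 4; i++)
    {
        float k = (total > 0.0f) ? (w[i] / total) : 0.25f;
        out[0] += k * c[i]->r;
        out[1] += k * c[i]->g;
        out[2] += k * c[i]->b;
    }
}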

So... if we draw only opaque polygons, each polygon pixel overwrites the previous one if it's nearer to the camera. Translucent polygons are no different except that they also set the T flag and some pixels are not written (the disabled sample points). But disabling sample points alone is not all that happens. Fig. 6a shows that there are in fact 8 different stipple patterns. Yet we have 32 levels of translucency. The patent explains the additional levels are achieved by modulating the edge values for the pixels that are written.

Because of the T flag, if another translucent polygon overwrites one, the old color value is lost completely. This also means that depending on the stipple pattern used you might be able to overlay two translucent polygons and have them create an opaque image by writing to alternate pixels.
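
The overlap point is easy to convince yourself of with a toy example (checkerboard standing in for whatever patterns FIG. 6a actually specifies):

Code: Select all

#include <cstdio>

// Two translucent polygons with complementary stipple patterns cover every
// sample between them, so no background ('.') survives and the combined
// result is effectively opaque.
int main()
{
    char buf[4][9];
    for (int y = 0; y < 4; y++)
    {
        for (int x = 0; x < 8; x++)
        {
            buf[y][x] = '.';                          // background
            if (((x + y) & 1) == 0) buf[y][x] = 'A';  // polygon A, PS=0
            if (((x + y) & 1) == 1) buf[y][x] = 'B';  // polygon B, PS=1
        }
        buf[y][8] = '\0';
        std::printf("%s\n", buf[y]);                  // no '.' remains
    }
    return 0;
}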

Once everything is done rendering, a post-processing step mixes all the colors together. This I think means 8 pixels have to be read to establish the color of each individual pixel in the frame buffer.