The Active Network


Product: GeForce 4 Ti 4600
Company: NVIDIA
Estimated Street Price:
Review By: Julien Jay

GeForce 4 Technology Explanation
Lossless Z Compression

Already included in the GeForce 3, Lossless Z Compression concerns the Z value of a pixel (where Z stands for the depth of the pixel in a 3D scene). When a scene is rendered, the Z value (coded in 16, 24 or 32 bits) determines whether a pixel should be visible or not. The more detailed and realistic games become, the more depth values have to be read and written, which clogs the memory bus. Just like in ATI Radeon chips, the GeForce 3 Lossless Z Compression reduces the z-buffer bandwidth required by compressing the data stream by a factor of up to 4:1. Even though NVIDIA doesn’t detail the algorithm used, it can in theory reduce z-buffer memory traffic by 75%. As the name implies, the compression is lossless and doesn’t alter the way scenes are displayed. According to NVIDIA, the GeForce 4 achieves the 4:1 compression more often than the GeForce 3, thanks to a new compression algorithm.
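To put the 4:1 figure in perspective, here is a rough back-of-the-envelope sketch in Python. It is not NVIDIA's algorithm (which is undisclosed); the resolution, the two z-buffer accesses per pixel and the `hit_rate` values are illustrative assumptions, where `hit_rate` is the hypothetical fraction of depth data that actually reaches the full 4:1 ratio.

```python
def z_traffic_bytes(width, height, bytes_per_z, accesses_per_pixel,
                    hit_rate=0.0, ratio=4.0):
    """Approximate z-buffer traffic for one frame, in bytes.

    hit_rate is the assumed fraction of depth data that compresses at
    `ratio`:1; the rest is transferred uncompressed.
    """
    raw = width * height * bytes_per_z * accesses_per_pixel
    return raw * (hit_rate / ratio + (1.0 - hit_rate))


# 1024x768, 32-bit Z, one read and one write per pixel (assumed workload)
raw = z_traffic_bytes(1024, 768, 4, 2)                 # no compression
half = z_traffic_bytes(1024, 768, 4, 2, hit_rate=0.5)  # 4:1 achieved half the time
best = z_traffic_bytes(1024, 768, 4, 2, hit_rate=1.0)  # 4:1 achieved every time

for label, value in (("raw", raw), ("half", half), ("best", best)):
    print(f"{label}: {value / 2**20:.2f} MiB, saving {1 - value / raw:.1%}")
```

The theoretical 75% saving only appears in the best case; the more often the chip reaches 4:1, the closer real traffic gets to it, which is where the improved GeForce 4 algorithm would pay off.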

Z Occlusion Culling

Just like the old PowerVR chips from NEC or the Kyro 2, the Z-Occlusion Culling technology featured in the Lightspeed Memory Architecture II of the GeForce 4 is in fact a form of HSR (hidden surface removal). When a 3D scene is rendered by a conventional GPU, all the pixels are calculated, even those that end up hidden behind another pixel before the scene is finally displayed. The purpose of Z-Occlusion Culling is to skip the pixels that would be hidden so that they are never processed by the pixel shader, thus saving 50% of the bandwidth in current games. To get the best results from Z-Occlusion Culling, however, the 3D application should ideally sort the objects in its scene front to back before they are sent to the 3D chip. The Z-Occlusion Culling algorithms have been significantly improved in the GeForce 4 Ti 4600, making it approximately 25% more efficient than the GeForce 3 at discarding non-visible pixels.
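Why the drawing order matters can be shown with a tiny Python model of a depth test that rejects hidden pixels before shading. This is only a sketch of the general idea, not of NVIDIA's hardware: the one-scanline z-buffer, the `shade_count` function and the three test layers are all made up for illustration.

```python
def shade_count(fragments, width):
    """Count the fragments that pass the depth test and reach the pixel shader."""
    depth = [float("inf")] * width  # one-scanline z-buffer, smaller z = closer
    shaded = 0
    for x, z in fragments:
        if z < depth[x]:   # visible so far: shade it and store its depth
            depth[x] = z
            shaded += 1
        # otherwise the fragment is occluded and discarded before shading
    return shaded


# Three full-width layers at depths 1, 2 and 3 over a 4-pixel scanline
layers = [[(x, z) for x in range(4)] for z in (1.0, 2.0, 3.0)]
front_to_back = [f for layer in layers for f in layer]
back_to_front = [f for layer in reversed(layers) for f in layer]

print(shade_count(front_to_back, 4))  # 4: only the nearest layer is shaded
print(shade_count(back_to_front, 4))  # 12: every layer is shaded, then overwritten
```

With the nearest objects drawn first, everything behind them is rejected; drawn in the opposite order, nothing can be culled.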

nfiniteFX II

Like its predecessor, the GeForce 4 has been designed to fully exploit the features offered by DirectX 8.0. As a matter of fact, it supports Pixel and Vertex Shaders. First introduced with the GeForce 3, nfiniteFX is the name NVIDIA gave to the programmable engine that groups together the Pixel Shader and Vertex Shader 3D features.

If vertex and pixel shaders mean nothing to you, here is a description of what they are. A vertex is a corner of a triangle, where two edges meet, and thus every triangle is composed of three vertices. A vertex usually carries information such as its coordinates, weight, normal, color, texture coordinates, fog and point size data. A shader is a small program that executes mathematical operations to alter data: in the case of a Vertex Shader, a new vertex emerges with a different color, different texture coordinates, or a different position in space. Shaders are run by the GPU (so they don't consume CPU horsepower) and act either on the data associated with the vertices of every polygon, for Vertex Shaders, or on the pixels, for Pixel Shaders. Now let's look at the Pixel Shaders. Every pixel of a 3D scene is generated by a Pixel Shader, and the GeForce 4 comes with four of them, designed to convert a set of texture coordinates (s, t, r, q) into a color (ARGB) using a shader program. Pixel shaders use floating point math, texture lookups and the results of previous pixel shaders. A pixel shader can execute programmed texture address operations on up to four textures and then run eight freely programmed texture blending instructions that combine the pixel's color information with the data derived from those textures.
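For readers who prefer code to words, the two kinds of shader can be imitated in a few lines of Python. This is purely illustrative: real DirectX 8 shaders are written in a dedicated assembly-like language and run on the GPU, and the function names, the sine-wave effect and the checkerboard texture below are invented for the example.

```python
import math


def wave_vertex_shader(vertex, time):
    """Vertex shader stand-in: return a new vertex displaced by a sine wave."""
    x, y, z = vertex["position"]
    new_y = y + 0.1 * math.sin(x + time)
    return {**vertex, "position": (x, new_y, z)}


def checker_texture(s, t):
    """Hypothetical 2x2 checkerboard texture returning a grey level (0 or 1)."""
    return 1.0 if (int(s * 2) + int(t * 2)) % 2 == 0 else 0.0


def pixel_shader(s, t, base_color):
    """Pixel shader stand-in: turn texture coordinates into an (A, R, G, B) color."""
    a, r, g, b = base_color
    k = checker_texture(s, t)        # texture lookup
    return (a, r * k, g * k, b * k)  # modulate the base color by the texel


vertex = {"position": (0.0, 1.0, 0.0), "color": (1.0, 1.0, 1.0, 1.0)}
print(wave_vertex_shader(vertex, math.pi / 2)["position"])
print(pixel_shader(0.25, 0.25, (1.0, 1.0, 0.5, 0.0)))
print(pixel_shader(0.75, 0.25, (1.0, 1.0, 0.5, 0.0)))
```

The vertex function takes one vertex in and hands one modified vertex back, while the pixel function maps texture coordinates to a final color, which is the division of labor described above.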

A combiner then adds specular lighting and fog effects before the pixel is alpha-blended, which defines its opacity. With 27 instructions for the Vertex Shaders and 23 instructions for the Pixel Shaders, game developers are freer than ever to express their creativity and realize the craziest things their imagination suggests. In other words, the vertex shaders inject personality into characters and environments, while the pixel shaders create ambiance with materials and surfaces that mimic reality. With such technology, developers not only use hardwired instructions from NVIDIA but can also create and upload their own programs into the GPU, bringing brand new graphic effects to life! Because of this flexibility, listing all the effects that the GeForce 4 Ti 4600 can handle is simply impossible, but here are some of the best known: enhanced Matrix Palette Skinning, Keyframe Animation (used for 3D morphing), and the distortion of 3D objects to simulate waves, wind, etc. The only limit the developer will face is that a vertex shader can’t exceed 128 instructions.
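Two of the effects named above boil down to simple math, sketched here in Python. The helpers `transform`, `skin` and `morph` are hypothetical names, and the example works in 2D with a two-bone palette to stay short; a real vertex shader would do the same blend with 3D matrices.

```python
def transform(matrix, point):
    """Apply a 2x3 affine matrix, given as rows (a, b, tx), to a 2D point."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def skin(point, palette, influences):
    """Matrix palette skinning: blend the point as moved by each bone."""
    x = y = 0.0
    for bone_index, weight in influences:
        bx, by = transform(palette[bone_index], point)
        x += weight * bx
        y += weight * by
    return (x, y)


def morph(key_a, key_b, t):
    """Keyframe animation: linear interpolation between two key poses."""
    return tuple(a + (b - a) * t for a, b in zip(key_a, key_b))


# Bone 0 stays put, bone 1 moves 2 units along x (made-up palette)
palette = [((1, 0, 0), (0, 1, 0)), ((1, 0, 2), (0, 1, 0))]
print(skin((1.0, 1.0), palette, [(0, 0.5), (1, 0.5)]))  # (2.0, 1.0)
print(morph((0.0, 0.0), (2.0, 4.0), 0.25))              # (0.5, 1.0)
```

A vertex influenced equally by both bones ends up halfway between the two transformed positions, which is what keeps skin from tearing at the joints.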

NVIDIA demos using the dual vertex shaders

With nfiniteFX II, NVIDIA has enhanced and tweaked the existing pixel and vertex shaders. The GeForce 4 Ti 4600 now supports pixel shader versions 1.2 and 1.3 in addition to versions 1.0 and 1.1. It doesn’t, however, support the latest version 1.4, which is already supported by the ATI Radeon 8500. The GeForce 4 Ti 4600 brings many new pixel shader modes: among them is z-correct bump mapping, which prevents artefacts when a bump-mapped surface intersects other geometry, for improved realism.

NVIDIA’s latest GPU includes a brand new additional pipeline dedicated to the Vertex Shaders. Including two multithreaded Vertex Shader pipelines in a chip isn’t new, since the NV2A (the GPU that powers the Xbox) also has two. Two parallel Vertex Shaders can obviously process many more vertices in the same amount of time, providing a clear performance benefit for the application or API. This second pipeline, teamed with a higher clock frequency, makes the GeForce 4 Ti 4600 theoretically three times more powerful than the GeForce 3 Ti 500 at processing these instructions.


Those new Vertex Shaders can be used to render incredible fuzzy effects, as shown by the Wolfman demo from NVIDIA. The NVIDIA GeForce4 GPU family and nfiniteFX II engine represent the first time that realistic fur with per-pixel lighting can be applied to a highly complex animated character and run at high frame rates. The nfiniteFX II engine's dual vertex shaders are able to drive more than 100 million vertices per second. This power is needed, as the Wolfman contains over 100,000 polygons.

The Wolfman uses eight concentric fur shells. The color and density of the fur are controlled using a separate texture map that covers the entire body, which gives the fur its distinct look rather than a uniform pattern. The nfiniteFX II engine's advanced pixel shader support for 3 and 4 textures accelerates this type of rendering.

The Wolfman is not a mere static model. Rather, it is a completely skinned animation with a 61-bone skeleton. The complexity of this model is on par with that used in television and film special effects production. Each and every vertex of the skin, fur layers and fin geometry is deformed in real time to match the movement of the underlying skeleton. The complexity of this task is amazing, as the nfiniteFX II engine needs to handle these vertex deformations for each of the eight layers.

One of the unique properties of stranded material such as hair and fur is that it reflects light more in some directions than others. This is known as "anisotropic" lighting, and it is computationally expensive to reproduce. The nfiniteFX II engine has advanced pixel shaders that help the GeForce4 GPUs deliver 50% more performance than the GeForce3. These improvements allow the GPUs to deliver anisotropic lighting to the Wolfman while maintaining fast frame rates. Individual strands of hair and patches of fur react based on the position and intensity of the light and the angle at which the light strikes the fur.
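The shell technique itself is easy to picture in code. The Python sketch below shows the general idea of concentric fur shells, under the assumption that each shell is the skin pushed outward along the surface normal; it is not taken from NVIDIA's demo, and `fur_shells` and the sample values are invented for illustration.

```python
def fur_shells(position, normal, fur_length, shell_count=8):
    """Return one displaced copy of a skin vertex per shell, from the skin outward."""
    shells = []
    for i in range(1, shell_count + 1):
        offset = fur_length * i / shell_count  # evenly spaced layers
        shells.append(tuple(p + n * offset for p, n in zip(position, normal)))
    return shells


# One skin vertex at the origin with its normal pointing up (made-up values)
shells = fur_shells((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), fur_length=1.0)
for shell in shells:
    print(shell)
```

Every skin vertex is therefore processed once per shell on top of the skin itself, which is why a demo with eight layers leans so heavily on vertex shader throughput.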

Wolfman Demo
