ActiveWin.Com: NVIDIA GeForce 4 Ti 4600


Product: GeForce 4 Ti 4600
Company: NVIDIA
Website: http://www.nvidia.com
Estimated Street Price: $399.99
Review By: Julien Jay

GeForce Ti 4600 GPU

Table Of Contents

1: Introduction
2: GeForce 4 Ti 4600 Technology Explanation
3: GeForce 4 Ti 4600 Technology Explanation 2
4: GeForce 4 Ti 4600 Technology Explanation 3
5: nView
6: Direct 3D Benchmarks
7: OpenGL Benchmarks
8: Conclusion

Built on a 0.15-micron process and still manufactured by TSMC, this brand new GeForce 4 GPU packs 63 million transistors, against 57 million for the GeForce 3 Ti 500 and 55 million for the Pentium 4 Northwood. The GeForce 4 Ti 4600 is clocked at 300 MHz for the GPU and 325 MHz (650 MHz effective, since DDR transfers data twice per clock) for the memory. The GPU is backed by 128 MB of 2.8 ns DDR-SDRAM on a 128-bit bus. The features of this new GPU are listed below:

63 million transistors

Manufactured in TSMC's 0.15-micron process

GPU clocked at 300 MHz

Memory clocked at 650 MHz (effective DDR rate)

128 MB frame buffer by default

AGP 2x/4x

nfiniteFX II engine

Accuview Anti Aliasing

Light Speed Memory Architecture II

nView

The chip's theoretical performance figures are as follows:

Vertices per Second: 136 Million

Fill Rate: 4.8 Billion AA Samples/Sec

Fill Rate: 1200 Mpixels/Sec

Operations per Second: 1.23 Trillion

Memory Bandwidth: 10.4GB/Sec.

Maximum Memory: 128MB
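
As a rough sanity check on the memory figures, the sketch below derives the effective DDR clock and the peak bandwidth from the 325 MHz memory clock and 128-bit bus quoted in this review. The function names are illustrative only, and 1 GB is assumed to mean 10^9 bytes. With those inputs the arithmetic gives about 10.4 GB/s.

```python
def effective_ddr_clock_mhz(base_clock_mhz: float) -> float:
    """DDR memory transfers data on both clock edges, doubling the rate."""
    return base_clock_mhz * 2


def peak_bandwidth_gb_per_s(effective_clock_mhz: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s, assuming 1 GB = 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_mhz * 1e6 * bytes_per_transfer / 1e9


ddr_clock = effective_ddr_clock_mhz(325)              # 325 MHz base clock
bandwidth = peak_bandwidth_gb_per_s(ddr_clock, 128)   # 128-bit memory bus
print(f"{ddr_clock:.0f} MHz effective, {bandwidth:.1f} GB/s peak")
```

This is a theoretical peak: it ignores latency, bank switches and partially used accesses.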

Like the GeForce 3, the GeForce 4 Ti 4600 GPU has a single pixel shader, but it adds a second vertex shader, along with many new features that we'll see in detail below. Rather than relying on ground-breaking new technology, NVIDIA improved overall performance through many optimizations and fine tuning that cut processing times. This approach was already present in the GeForce 3.

GeForce 4 Ti 4600 GPU Die

GeForce 4 Technology Explanation
LightSpeed Memory Architecture II

The enhanced Light Speed Memory Architecture II aims to optimize memory bandwidth for a better and more realistic gaming experience. This architecture includes six technologies going by the names "Crossbar Memory Controller", "Quad Cache", "Z-Occlusion", "Lossless Z Compression", "Auto Pre-Charge" and "Fast Z-Clear".

CrossBar Memory Controller

Just like the GeForce 3, the GeForce 4 uses a memory controller called CrossBar, whose main task is to improve the chip's fill rate by avoiding wasted bits and thereby reducing latency. A traditional GPU uses a 256-bit memory controller that can only transfer data in 256-bit chunks. If a triangle is only one pixel in size, it requires a memory access of 32 bytes when only 8 bytes are actually needed: 75% of the memory bandwidth is wasted in the process! NVIDIA solved the problem with the CrossBar controller. Unlike yesterday's GPUs, it has four independent memory sub-controllers that can each handle a 64-bit block per clock, for a total of 256 bits (they can also group data and handle it as a single 256-bit access).

This memory controller is the key to better memory management, answering the needs of today's game developers as 3D scenes grow more complex (the number of triangles per frame has greatly increased in recent games). Compared to a traditional memory controller, the CrossBar cuts average latency down to 25%, so any 3D application can benefit from it. According to NVIDIA, the CrossBar controller can speed up memory access by up to four times. Obviously that best case only applies when the data being read or written are just 64 bits wide, which is far from being an everyday occurrence.

The difference between the GeForce 3 and GeForce 4 Crossbar Memory Controllers lies in the algorithms they use: for the GeForce 4, the load balancing algorithms have been streamlined for more efficient use of memory across the four partitions.

GeForce 4 Ti 4600 CrossBar Memory Controller
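
The one-pixel-triangle case described above can be worked through in a few lines. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it models nothing beyond the access granularity (32 bytes for a single 256-bit controller, 8 bytes for one 64-bit sub-controller).

```python
import math


def wasted_fraction(needed_bytes: int, access_bytes: int) -> float:
    """Fraction of fetched bytes that are not actually needed."""
    accesses = math.ceil(needed_bytes / access_bytes)
    fetched = accesses * access_bytes
    return 1 - needed_bytes / fetched


monolithic = wasted_fraction(8, 32)  # single 256-bit controller
crossbar = wasted_fraction(8, 8)     # one 64-bit sub-controller
print(f"256-bit access: {monolithic:.0%} wasted")
print(f"64-bit access:  {crossbar:.0%} wasted")
```

A full 32-byte request wastes nothing in either case, which is why the gain depends on how often small accesses occur.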

Quad Cache

The Quad Cache is a brand new cache memory sub-system that groups four distinct caches. Each of the four is dedicated to a specific task: one handles primitives, one vertices, one textures, and one pixels. NVIDIA doesn't disclose the size of each cache.

Once data has been processed by the GPU, a small quantity of data or the result of a calculation is stored in each cache, with the enormous advantage of being instantly available to the GPU. If the GPU needs some previously calculated information in order to render the next scene, it retrieves it from the Quad Cache rather than searching the whole memory for it. We can detail the use of the Quad Cache like this:

Vertex Cache: This cache stores vertices after they are sent over the AGP bus. It makes the AGP more efficient by making sure the same vertices are not transmitted more than once.

Primitive Cache: This cache stores the information produced by the operation that assembles vertices into fundamental primitives.

Dual Texture Cache: This feature was already present in the GeForce 3 but its algorithm has been slightly improved making it more efficient with multi texturing or high quality filtering.

Pixel Cache: Located at the end of the processing pipeline this cache is a coalescing cache. It waits for a certain quantity of pixels to be drawn before writing them to the memory using burst modes.
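
To illustrate the vertex cache idea, here is a minimal sketch that counts how many vertices have to cross the bus for an indexed triangle list. The FIFO replacement policy and the cache size are assumptions made for the example (NVIDIA does not disclose these details), and the function name is hypothetical.

```python
from collections import deque


def count_vertex_transfers(indices, cache_size):
    """Count vertices sent over the bus when a FIFO cache holds recent ones."""
    cache = deque(maxlen=cache_size)  # oldest entry is evicted automatically
    transfers = 0
    for index in indices:
        if index not in cache:
            transfers += 1            # cache miss: vertex must cross the bus
            cache.append(index)
    return transfers


# Two triangles sharing an edge (vertices 1 and 2).
triangle_indices = [0, 1, 2, 2, 1, 3]
without_cache = len(triangle_indices)
with_cache = count_vertex_transfers(triangle_indices, cache_size=4)
print(without_cache, with_cache)  # prints "6 4"
```

The more vertices neighbouring triangles share, the larger the saving, which is why meshes with heavy vertex reuse benefit most.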

Auto Pre-Charge

Information stored in memory, whatever the type of memory, is always organized in banks. The problem with this architecture appears when the GPU needs to access information in a bank other than the one currently open. To do so, the memory must close the bank currently in use, then pre-charge and activate the new bank before it can give the GPU the information it needs. This results in a dramatic loss of performance, since these operations take ten clock cycles to complete while the GPU does nothing but wait.

GeForce 4's Auto Pre-Charge feature automatically pre-charges a memory bank that isn't currently in use, chosen by a prediction algorithm, in order to boost performance. The activation phase is still required, but it takes only 2 or 3 clock cycles.
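
The cycle counts quoted above can be turned into a toy cost model. NVIDIA's prediction algorithm is not documented in this review, so the "next-higher bank" predictor below is purely hypothetical, as are the function and constant names. The sketch only shows how a correct guess turns a 10-cycle bank switch into a 3-cycle one.

```python
FULL_SWITCH_CYCLES = 10   # close + pre-charge + activate, per the text
ACTIVATE_ONLY_CYCLES = 3  # upper end of the "2 or 3 cycles" quoted


def stall_cycles(bank_accesses, num_banks, auto_precharge):
    """Total cycles spent waiting on bank switches for an access sequence."""
    open_bank = None
    precharged = None
    total = 0
    for bank in bank_accesses:
        if bank != open_bank:
            if auto_precharge and bank == precharged:
                total += ACTIVATE_ONLY_CYCLES   # prediction was right
            else:
                total += FULL_SWITCH_CYCLES     # no help from the predictor
            open_bank = bank
        # Hypothetical predictor: guess the next-higher bank will be needed.
        precharged = (open_bank + 1) % num_banks
    return total


accesses = [0, 0, 1, 1, 2, 0]
print(stall_cycles(accesses, 4, auto_precharge=False))  # prints 40
print(stall_cycles(accesses, 4, auto_precharge=True))   # prints 26
```

In this model the first access and the mispredicted jump back to bank 0 still pay the full cost, so the benefit depends entirely on how often the prediction is right.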

Fast Z-Clear

After rendering each frame of a 3D scene, a conventional GPU must clear the Z-buffer. Usually this is done by writing a value of 0 for every pixel in the buffer. Fast Z-Clear instead sets a flag for each area of the Z-buffer to mark it as cleared, rather than rewriting the whole buffer. This technology is nothing new: it is already present in ATI's Radeon GPUs. It is supposed to speed up rendering by up to 10%.
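
A flag-based clear can be sketched as follows. The tile size, the class and method names, and the way a flagged tile is filled in on its first write are all assumptions made for illustration; the actual hardware scheme is not described in this review.

```python
class TiledZBuffer:
    """Z-buffer split into square tiles, each with a 'cleared' flag."""

    def __init__(self, width, height, tile=8, clear_value=0.0):
        self.width, self.height = width, height
        self.tile, self.clear_value = tile, clear_value
        self.tiles_per_row = (width + tile - 1) // tile
        tile_rows = (height + tile - 1) // tile
        self.depth = [clear_value] * (width * height)
        self.cleared = [False] * (self.tiles_per_row * tile_rows)

    def _tile_of(self, x, y):
        return (y // self.tile) * self.tiles_per_row + x // self.tile

    def slow_clear(self):
        """Conventional clear: rewrite every depth value. Returns writes done."""
        for i in range(len(self.depth)):
            self.depth[i] = self.clear_value
        self.cleared = [False] * len(self.cleared)
        return len(self.depth)

    def fast_clear(self):
        """Flag-based clear: mark each tile as cleared. Returns writes done."""
        self.cleared = [True] * len(self.cleared)
        return len(self.cleared)

    def read(self, x, y):
        if self.cleared[self._tile_of(x, y)]:
            return self.clear_value  # stale data in the tile is ignored
        return self.depth[y * self.width + x]

    def write(self, x, y, z):
        t = self._tile_of(x, y)
        if self.cleared[t]:
            self._materialize(t)
        self.depth[y * self.width + x] = z

    def _materialize(self, t):
        """Fill a flagged tile with the clear value and drop its flag."""
        x0 = (t % self.tiles_per_row) * self.tile
        y0 = (t // self.tiles_per_row) * self.tile
        for y in range(y0, min(y0 + self.tile, self.height)):
            for x in range(x0, min(x0 + self.tile, self.width)):
                self.depth[y * self.width + x] = self.clear_value
        self.cleared[t] = False


zbuf = TiledZBuffer(640, 480)
zbuf.write(10, 10, 0.5)
flag_writes = zbuf.fast_clear()
print(zbuf.read(10, 10), flag_writes, zbuf.slow_clear())
```

For a 640x480 buffer with 8x8 tiles, the flag-based clear touches 4,800 flags instead of 307,200 depth values. Tiles that are later drawn into still have to be filled, so the real saving is smaller than that ratio suggests.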
