[SDL] sensible optimization [was re: tile based _junk_]

Darrell Johnson johnsond at westman.wave.ca
Thu Aug 19 16:17:10 PDT 1999


rival games wrote:
> 
> > We're talking about a simple scroller!  Any Pentium system, and most
> > 486ers, will do fine while regenerating the whole screen every frame.
> 
> Ha ha ha ha. I'm glad to hear you think 10fps is good.  My Pentium II 333 was
> running my engine at 40fps regenerating the screen every time, now I optimised
> it thanks to the influence of this mailing list (except for you) and it runs
> over 150fps. You're thinking of a 320x240x8bitcolor tile based engine.

Ahem, how were you regenerating the screen every time?  In one
assembly-coded long linear write to the screen buffer (the kind of
sensibly optimized 2d engine I was talking about), or in unoptimized C,
copying one tile at a time with plenty of unnecessary or inappropriate
(narrower than 32 bits, or not on 32-bit boundaries) memory accesses?  Copying
from the tiles in memory to the screen is not a lot faster than copying
one area of the screen to another (unless it is using a hardware speedup
in the video card, which you can't count on).
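To make the "one long linear write" idea concrete, here is a minimal
sketch in C rather than assembly.  It is not anyone's actual engine: the
function name, the 16x16 tile size, the 8-bit pixel format and the lack
of a scroll offset are all assumptions made for illustration.  Each
scanline is built left to right from 16-byte tile rows, so the screen
buffer is written once, front to back, in aligned chunks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { TILE_W = 16, TILE_H = 16 };  /* hypothetical tile size, in pixels */

/* Redraw the whole 8-bit screen from a tile map, one scanline at a time.
 * tiles: TILE_W * TILE_H bytes per tile, stored row by row.
 * map:   one tile index per cell, map_w cells per map row.
 * Destination writes run left to right and top to bottom, so the screen
 * buffer is filled in a single linear pass. */
void regenerate_screen(uint8_t *screen, int screen_w, int screen_h,
                       const uint8_t *tiles,
                       const uint8_t *map, int map_w)
{
    /* Assumption: the screen is a whole number of tiles in each direction. */
    assert(screen_w % TILE_W == 0 && screen_h % TILE_H == 0);
    assert(map_w >= screen_w / TILE_W);

    for (int y = 0; y < screen_h; ++y) {
        const uint8_t *map_row = map + (size_t)(y / TILE_H) * map_w;
        int row_in_tile = y % TILE_H;
        uint8_t *dst = screen + (size_t)y * screen_w;

        for (int tx = 0; tx < screen_w / TILE_W; ++tx) {
            /* Start of the matching pixel row inside the source tile. */
            const uint8_t *src =
                tiles + ((size_t)map_row[tx] * TILE_H + row_in_tile) * TILE_W;
            memcpy(dst, src, TILE_W);  /* one tile row per copy */
            dst += TILE_W;
        }
    }
}
```

A hand-written assembly version would follow the same access pattern;
the point of the sketch is only the order of the writes, not the speed
of this particular loop.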

If you haven't made a near-optimal full-regeneration version, how can
you claim another strategy is better?  The truth is that you are
comparing an unoptimized engine to an optimized one, nothing more.  This
strategy of reusing most of the screen is well suited to taking
advantage of one of the few C techniques that is as fast as well coded
assembly: a large memcpy.  So I'm guessing you use the efficient
memcpy, then fill in the luckily small holes (due to relatively slow
scrolling and small numbers of sprites) inefficiently with your
unoptimized (or poorly optimized) C routines.  This is not so much a
fundamental improvement as an easy means of gaining acceptable
performance in a special case.  Hacks of this kind are not suited to
high-performance games and are not good for your development as a
programmer.
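For reference, the reuse strategy being described looks roughly like
the sketch below.  This is a guess at the general shape, not the other
poster's code: scroll_left, redraw_fn and redraw_marker are hypothetical
names, and the screen is assumed to be an 8-bit buffer with no padding
between rows.  Note that the big copy has to be memmove rather than
memcpy, because the source and destination overlap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: redraws one rectangle of the screen from scratch. */
typedef void (*redraw_fn)(uint8_t *screen, int pitch,
                          int x, int y, int w, int h);

/* Stand-in redraw routine for the example: paints the rectangle 0xFF.
 * A real engine would draw tiles and sprites here. */
void redraw_marker(uint8_t *screen, int pitch, int x, int y, int w, int h)
{
    for (int row = 0; row < h; ++row)
        memset(screen + (size_t)(y + row) * pitch + x, 0xFF, (size_t)w);
}

/* Scroll an 8-bit screen left by dx pixels, reusing the pixels that
 * remain visible and redrawing only the exposed strip on the right. */
void scroll_left(uint8_t *screen, int w, int h, int dx, redraw_fn redraw)
{
    assert(dx > 0 && dx < w);

    /* One large overlapping copy: every pixel moves dx bytes toward the
     * start of the buffer. */
    memmove(screen, screen + dx, (size_t)w * h - (size_t)dx);

    /* The rightmost dx columns now hold stale pixels (wrapped in from the
     * next row, or left over on the last row), so they are redrawn. */
    redraw(screen, w, w - dx, 0, dx, h);
}
```

The cost of the memmove is fixed; the cost of the redraw call grows with
the size of the "holes", which is where the special-case nature of the
technique shows up.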

Also, your Pentium II 333 will not usually be even twice as fast as a
Pentium 160 for this stuff (maybe 150% of the speed, if that).  Their
memory access works at closer to the same rate, and you have to code
carefully to not have all the extra cycles eaten up by cache misses.  It
may seem a lot faster in a memory-hogging windowing environment or a 3d
game, but that's because the P2 333 undoubtedly has oodles more memory, a
faster hard drive, and a usable 3d card; none of which are factors here (Quake
without hardware 3d support would also run faster, but computation is
the limiting factor in that case, and Quake was very well written to
take advantage of higher clock rates).
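Since memory rate is the limit here, it helps to express a frame rate as
bytes written per second.  The helper below is a back-of-the-envelope
sketch; the function name is made up, and the example figures (320x240,
8-bit colour, 150 fps) are taken from the quoted message purely for
illustration, not as a measurement of anyone's engine.

```c
#include <assert.h>

/* Bytes that must be written to the screen buffer each second when every
 * pixel of every frame is redrawn. */
unsigned long long bytes_per_second(unsigned long long width,
                                    unsigned long long height,
                                    unsigned long long bytes_per_pixel,
                                    unsigned long long frames_per_second)
{
    return width * height * bytes_per_pixel * frames_per_second;
}
```

With those figures, bytes_per_second(320, 240, 1, 150) comes to
11,520,000 bytes per second.  Whether a given machine sustains that
depends on its memory and video bus, not mainly on its clock rate.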

To reiterate, I have never said that optimization is bad.  I do,
however, believe that reusing the unchanged parts of the screen is a bad
optimization; it limits you to modifying only small parts of the screen
at a time, so you can't have animated tiles or hundreds of sprites on
the screen.

As for the fellow who was talking about using this type of optimization
in a Starcraft-type game, think for a second: how much of the screen do
you think you could typically reuse in Starcraft?  Maybe a lot, in the
beginning phase, but when you get to the exciting major battles where
you want a high frame rate, the screen is filled with sprites and you
can't reuse any of it.  An inconsistent frame rate is worse than a
plain low frame rate; there's nothing worse than thinking your machine
is fast enough to run a game at a certain resolution, then having it all
fall apart just when it gets interesting.  As for maintaining a reusable
background layer, you are talking about a whole extra full-screen blit
per frame!  This is slower than an efficient full regeneration, if you
are using tiles (if you are using some sort of voxel engine or more
complex 3d system which isn't well suited to hardware acceleration, you
just might find this kind of strategy worthwhile; but I was never
talking about that stuff).
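The argument above can be put as a crude cost model.  This is only a
sketch under stated assumptions: costs are counted in units of "one
pixel moved by a fast bulk copy", and redraw_cost is a hypothetical
factor saying how many such units it takes to redraw one pixel from
tiles and sprites (1 for a well-optimized redraw, larger for a slow
one).  The names are invented for the example.

```c
#include <assert.h>

/* Cost of redrawing every pixel of the frame from the source data. */
unsigned long cost_full_regen(unsigned long pixels, unsigned long redraw_cost)
{
    return pixels * redraw_cost;
}

/* Cost of reusing the previous frame: one bulk copy of the whole screen,
 * plus a redraw of the pixels that actually changed. */
unsigned long cost_reuse(unsigned long pixels, unsigned long dirty_pixels,
                         unsigned long redraw_cost)
{
    assert(dirty_pixels <= pixels);
    return pixels + dirty_pixels * redraw_cost;
}
```

With redraw_cost at 1, cost_reuse is never below cost_full_regen, which
is the case being argued for here.  With a slow redraw (say a factor of
4), reuse wins while few pixels are dirty, but its cost climbs with
dirty_pixels and passes full regeneration once most of the screen is
changing, which is exactly when the frame rate matters most.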

Modern computers are fast.  They aren't so fast that you can forget
about efficiency, but they are fast in ways that change the rules: old
optimization strategies that were brilliant in their day are now useless
or even counter-productive.

I know some of you guys are mad at me; I've been pretty blunt in
essentially calling a lot of your work useless garbage, so I'm not
offended.  But take a minute and really think about what I'm saying
before you dismiss it out of hand.

BTW, you don't actually think that you're really running at 150 fps, do
you?  If nothing else, your monitor can't keep up with that.

Cheers,

Darrell Johnson


