[SDL] Best way to implement a simple "Framebuffer"

Patrick Baggett baggett.patrick at gmail.com
Mon Sep 26 07:52:23 PDT 2011


On Mon, Sep 26, 2011 at 9:31 AM, nils.stec <nils-stec at spectra-light.de>wrote:

> For a 6502 emulator I'm writing at the moment, I'm searching for an efficient
> way to use SDL with it.
>
> This emu has a 64k address space; starting at 0x200 there are 1024 bytes.
> The graphics screen is 32x32 pixels in size. Every byte is one color.
> It's compatible with the online emu located here:
> http://www.6502asm.com/beta/index.html
> If you want to try it, choose an example from the drop-down list, click on
> compile and then click on run. You will see the graphics output in the
> window to the right.
>
> The color definitions are:
>
> Black ($0)
> White ($1)
> Red ($2)
> Cyan ($3)
> Purple ($4)
> Green ($5)
> Blue ($6)
> Yellow ($7)
> Orange ($8)
> Brown ($9)
> Light red ($a)
> Dark gray ($b)
> Gray ($c)
> Light green ($d)
> Light blue ($e)
> Light gray ($f)
>
>
>
>
> At the moment I have a function which calculates the pixel position and the
> corresponding SDL color value in RGBA format, and draws a filled box of 8x8
> pixels for each pixel in this 32x32 byte memory space.
>
>
>
>
>  Code:
>
>
> int set_px(int x, int y, unsigned char pxcol) {
>    unsigned char r,g,b,a=255;
>    pxcol &= 0x0f;
>    switch(pxcol) {
>       case 0:      // black
>          r = 0; g = 0;   b = 0;
>          break;
>       case 1:      // white
>          r = 255; g = 255; b = 255;
>          break;
>       case 2:      // red
>          r = 135; g = 0;   b = 0;
>          break;
>       case 3:      // cyan
>          r = 40; g = 240; b = 255;
>          break;
>       case 4:      // purple
>          r = 200; g = 70; b = 200;
>          break;
>       case 5:      // green
>          r = 0; g = 255;   b = 0;
>          break;
>       case 6:      // blue
>          r = 0; g = 0; b = 255;
>          break;
>       case 7:      // yellow
>          r = 230; g = 220; b = 10;
>          break;
>       case 8:      // orange
>          r = 230; g = 155; b = 85;
>          break;
>       case 9:      // brown
>          r = 100; g = 70; b = 0;
>          break;
>       case 10:   // light red
>          r = 255; g = 120; b = 120;
>          break;
>       case 11:   // dark gray
>          r = 50; g = 50;   b = 50;
>          break;
>       case 12:   // gray
>          r = 100; g = 100; b = 100;
>          break;
>       case 13:      // light green
>          r = 170; g = 255; b = 100;
>          break;
>       case 14:   // light blue
>          r = 10; g = 160; b = 240;
>          break;
>       case 15:   // light gray
>          r = 180; g = 180; b = 180;
>          break;
>       default:
>          r = 0; g = 0; b = 0;
>          a = 255;
>          break;
>    }
>
>
Use a lookup table here. You've got 16 colors, so create an array with 16
elements, one per color. Then you can do:

unsigned int colorTable[16] =
{
    0xFF000000,    // black
    0xFFFFFFFF,    // white

    ....

};

Each entry packs one color: the lowest 8 bits are red, the next 8 bits are
green, then blue, then alpha. Keeping it as a single unsigned int is likely
faster, since most machines can write a 4-byte value about as fast as they
can write a single byte. To use the table, you simply look up the color:


unsigned int colorEntry = colorTable[pxcol];

r = colorEntry & 0xFF;
g = (colorEntry >> 8) & 0xFF;
b = (colorEntry >> 16) & 0xFF;
a = (colorEntry >> 24) & 0xFF;
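
As a rough sketch of what the complete table could look like, here it is
filled in with the RGB values from the switch statement in the original
post, packed as 0xAABBGGRR as described above. The names color_table,
rgba_t and lookup_color are made up for illustration; they are not from the
original code.

```c
#include <stdint.h>

/* 16 palette entries packed as 0xAABBGGRR (red in the lowest byte).
 * RGB values are copied from the switch statement in the original post;
 * alpha is always 255. */
static const uint32_t color_table[16] = {
    0xFF000000u, /* 0: black       (  0,   0,   0) */
    0xFFFFFFFFu, /* 1: white       (255, 255, 255) */
    0xFF000087u, /* 2: red         (135,   0,   0) */
    0xFFFFF028u, /* 3: cyan        ( 40, 240, 255) */
    0xFFC846C8u, /* 4: purple      (200,  70, 200) */
    0xFF00FF00u, /* 5: green       (  0, 255,   0) */
    0xFFFF0000u, /* 6: blue        (  0,   0, 255) */
    0xFF0ADCE6u, /* 7: yellow      (230, 220,  10) */
    0xFF559BE6u, /* 8: orange      (230, 155,  85) */
    0xFF004664u, /* 9: brown       (100,  70,   0) */
    0xFF7878FFu, /* a: light red   (255, 120, 120) */
    0xFF323232u, /* b: dark gray   ( 50,  50,  50) */
    0xFF646464u, /* c: gray        (100, 100, 100) */
    0xFF64FFAAu, /* d: light green (170, 255, 100) */
    0xFFF0A00Au, /* e: light blue  ( 10, 160, 240) */
    0xFFB4B4B4u  /* f: light gray  (180, 180, 180) */
};

/* Hypothetical helper type holding the unpacked channels. */
typedef struct { uint8_t r, g, b, a; } rgba_t;

/* Look up a palette index (only the low 4 bits are used) and unpack it. */
static rgba_t lookup_color(uint8_t pxcol)
{
    uint32_t entry = color_table[pxcol & 0x0F];
    rgba_t c;
    c.r = (uint8_t)(entry & 0xFF);
    c.g = (uint8_t)((entry >> 8) & 0xFF);
    c.b = (uint8_t)((entry >> 16) & 0xFF);
    c.a = (uint8_t)((entry >> 24) & 0xFF);
    return c;
}
```

If the destination takes a packed 32-bit pixel in the same byte order, the
unpacking step can be skipped and the table entry written directly.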



>    boxRGBA(screen, x*PIXEL_SIZE, y*PIXEL_SIZE, (x*PIXEL_SIZE)+PIXEL_SIZE,
> (y*PIXEL_SIZE)+PIXEL_SIZE, r, g, b, a);
>
>    return 0;
> }
>
>
>
> This isn't very efficient. On my Core2Duo with an onboard Intel card
> running Linux with the open source drivers I get about 30 frames per second;
> on my Phenom II with a Radeon 3000 running Linux with the closed source
> ATI drivers (fglrx) I get around 300 frames per second.
>
> On my Pentium 4, 1.4 GHz, Nvidia GeForce 4 MX440 64MB, I get about 20
> frames per second with the closed source Nvidia drivers.
> This is the screen updating routine:
>
>
>
>
>  Code:
>
>
>       for(x = 0; x < 32; x++) {
>          for(y = 0; y < 32; y++) {
>             // calculate address: starting at 0x200, 1 line is 32 bytes
>             // in size, 32 lines on screen
>             addr = graphix_address+(x+(y*32));
>             // give set_px the byte in "graphics" memory
>             set_px(x,y, get6502memory(addr));
>          }
>       }
>
>
>
First off, swap your inner and outer loops so that y is the outer loop and
x is the inner one. This will immediately improve performance, especially
for rendering areas much larger than 32x32. This is because the pixels at
(x, y) and (x, y+1) are quite far apart in memory, while (x, y) and (x+1, y)
are right next to each other. You can use that to your advantage by not
computing the address every time. For example, if you have a 32x32 image
with each pixel being 32 bits (4 bytes), then (0,0) is 0 bytes from the
start of the image, (1,0) is 4 bytes from the start, (2,0) is 8 bytes from
the start, etc. So to write a row of pixels you don't need to do:

for(x=0; x<32; x++) {
    addr = base_address + x + y*32;
}

Look at what changes inside the for() loop: only x. 'y' doesn't, and
base_address doesn't. So you can precompute those parts like this:

for(y=0; y<32; y++) {
    row_base = base_address + y*32;
    for(x=0; x<32; x++) {
        pixel = row_base + x;
    }
}
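
To make that concrete, here is a minimal sketch of the row-by-row read. It
assumes the emulator's 64k memory is reachable as a flat byte array (the
original code goes through get6502memory() instead), and the names mem, out
and read_screen are hypothetical.

```c
#include <stdint.h>

#define GFX_ADDR 0x200  /* start of graphics memory, per the original post */
#define SCREEN_W 32
#define SCREEN_H 32

/* Copy the emulated screen out of a flat memory image, one row at a time.
 * mem: the 64k address space (assumed to be a plain array).
 * out: SCREEN_W * SCREEN_H palette indices, masked to 0..15. */
static void read_screen(const uint8_t *mem, uint8_t *out)
{
    int x, y;
    for (y = 0; y < SCREEN_H; y++) {
        /* Both row pointers depend only on y, so compute them once per row. */
        const uint8_t *row_base = mem + GFX_ADDR + y * SCREEN_W;
        uint8_t *out_row = out + y * SCREEN_W;
        for (x = 0; x < SCREEN_W; x++) {
            out_row[x] = row_base[x] & 0x0F;
        }
    }
}
```

The inner loop now touches consecutive bytes only, which is the access
pattern the paragraph above is arguing for.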

If you know the graphics address is absolutely fixed, then you don't need to
use a variable and can just #define a constant like #define GFX_ADDR 0x200.

The same fix applies to the address computation in set_px() -- do you really
need to recompute the address each time? You could just take a pointer and
add 4 bytes to it each time (assuming 32-bit pixels) to get to the x+1 pixel.
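
One way this could look, as a sketch rather than a drop-in replacement for
set_px(): walk a pointer through a 32-bit destination buffer and write each
emulated pixel as a PIXEL_SIZE x PIXEL_SIZE block. The function name
blit_scaled and its parameters are hypothetical. With SDL the destination
might be the pixels of a locked 32-bit surface, with dst_pitch derived from
the surface's pitch (which SDL reports in bytes); that part is not shown
here.

```c
#include <stdint.h>

#define SCREEN_W   32
#define SCREEN_H   32
#define PIXEL_SIZE 8   /* each emulated pixel becomes an 8x8 block */

/* src: SCREEN_W * SCREEN_H colors, already converted to 32-bit values.
 * dst: 32-bit framebuffer at least SCREEN_W*PIXEL_SIZE wide and
 *      SCREEN_H*PIXEL_SIZE tall.
 * dst_pitch: length of one destination row in pixels (not bytes). */
static void blit_scaled(const uint32_t *src, uint32_t *dst, int dst_pitch)
{
    int x, y, sx, sy;
    for (y = 0; y < SCREEN_H; y++) {
        for (sy = 0; sy < PIXEL_SIZE; sy++) {
            /* Start of this destination row and of the matching source row. */
            uint32_t *p = dst + (y * PIXEL_SIZE + sy) * dst_pitch;
            const uint32_t *s = src + y * SCREEN_W;
            for (x = 0; x < SCREEN_W; x++) {
                uint32_t color = *s++;
                for (sx = 0; sx < PIXEL_SIZE; sx++) {
                    *p++ = color;  /* step to the next pixel, 4 bytes on */
                }
            }
        }
    }
}
```

No multiplication happens per pixel; the only address arithmetic in the
inner loops is the pointer increment.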

Mainly, you just need to focus on things in terms of rows and how to do
absolutely minimal amounts of work to plot the next pixel. For example, a
clear screen sort of operation doesn't need to be a double-nested for() loop
that sets each pixel to black, it can just be memset(framebuffer, 0,
32*32*sizeof(int)).
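
A small sketch of such a clear, assuming the framebuffer is a plain array of
32-bit pixels (the names framebuffer and clear_screen are hypothetical).
Note that memset() fills bytes, so a value of 0 produces 0x00000000 pixels:
black, but with the alpha byte also 0, unlike the 0xFF000000 table entry.
Whether that matters depends on whether the destination uses the alpha
channel.

```c
#include <stdint.h>
#include <string.h>

#define SCREEN_W 32
#define SCREEN_H 32

/* Hypothetical 32-bit framebuffer, one uint32_t per pixel. */
static uint32_t framebuffer[SCREEN_W * SCREEN_H];

/* Clear every pixel to 0x00000000 with a single call instead of a
 * double-nested loop. sizeof covers the whole array, so the size stays
 * correct if SCREEN_W or SCREEN_H change. */
static void clear_screen(void)
{
    memset(framebuffer, 0, sizeof framebuffer);
}
```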

Finally, on a stylistic note, you probably shouldn't be hard coding 32x32
all over your program. You'll probably want to expand your screen some time
without changing every instance of 32. Use a #define for it -- the compiler
doesn't output any slower code for it, but it does make it much easier to
change stuff.

Patrick


>
>
> I hope anyone of you know a better and faster solution for this.
>
>
> ------------------------------
>
> If you're interested in Embedded Linux and Microcontrollers have a look at
> my (german) page:
> http://krumeltee.wordpress.com/
>