This text file and almost all of the others in this new GS game folder
for Super Mario Bros IIgs can be found at the program author's website
URLs below:

http://www.d.umn.edu/~lscharen/code0998/   (standard item directory)
or
http://www.d.umn.edu/~lscharen/            (html document pages - WWW)

------------------------------------------------------------------------

This document describes a new type of compiled sprite I have constructed
for my programming use.

------------------------------------------------------------------------

Everyone who has used compiled sprites has been impressed by their
speed, but has gritted their teeth over their numerous shortcomings,
e.g. no clipping, special effects are difficult, etc. To address these
concerns I have created a type of compiled sprite that has the
following features:

  * Self-clipping: define a minimum and maximum X range and the sprite
    will clip itself
  * Vertical flipping: useful for effects and reducing code size
  * Independent settings for each scan line
  * Support of simple overlap management: useful for constructing
    pseudo-backgrounds out of sprites without slowdown

------------------------------------------------------------------------

First let's define a few things. The sprite itself is described as an
array of pointers to scanline functions. Each scanline function is
responsible for drawing one horizontal line of the sprite.

Sprite   dc    a4'scanline1'
         dc    a4'scanline2'
         dc    a4'scanline2'
         dc    a4'scanline5'
         dc    a4'scanline1'
         .
         .
         .

Notice that the same scanline can be used more than once. This is
helpful when horizontal slices of the sprite are similar, e.g. a solid
box.

Now what about the scanline functions? Each procedure contains a header
and several data tables, and assumes certain parameters are passed via
the registers.

X-reg: contains the rightmost word of the sprite to be drawn
Y-reg: contains the leftmost word of the sprite to be drawn
DP:    address of the left side of the scanline on the graphics screen
SP:    address of the rightmost side of the scanline

Note: X-reg must be >= Y-reg or else a crash is almost certain.

The header of the scanline function is as follows:

Entry    anop                 ;entry point
         bra   begin          ;jump into code
link     ds    4              ;pointer to address of code to return to
r_mask   ds    2              ;mask for rightmost word
l_mask   ds    2              ;mask for leftmost word
tmp1     ds    2              ;temporary storage
tmp2     ds    2
tmp3     ds    2

begin    lda   mask,x         ;first do the rightmost word
         ora   r_mask         ;combine the masks
         bne   Need2Msk       ;if we can blit the whole word, do it
         dex                  ;DEX twice because we do a jmp (Tbl+2,x)
         dex                  ;later, and we want to do the first word
         bra   patch          ;as well

Need2Msk and   <$00,x         ;AND it with the screen data
         sta   tmp1           ;save the result
         lda   r_mask
         eor   #$FFFF         ;invert the mask
         and   data,x         ;clear the data
         ora   tmp1           ;combine with the previous result
         pha                  ;put it on the screen

This code has just put the first word of data on the screen. Now we
need to set some dispatch vectors so we can jump into the speedy
compiled code.

patch    lda   Tbl,y          ;patch the dispatch table here so we do
         sta   tmp1           ;the left edge as a special case
         lda   Tbl+2,y        ;patch this in case Xreg == Yreg
         sta   tmp2           ;save these to restore later

         lda   #l_word
         sta   Tbl,y          ;patch code to do the last word
         lda   #e_code        ;and patch in the exit code
         sta   Tbl+2,y

         jmp   (Tbl+2,x)      ;now jump into the compiled sprite code

Now our dispatch vectors are set. The routine l_word draws the last
word of data in much the same way as we drew the rightmost word above.
e_code patches things up and exits cleanly. Now let's look at l_word
and e_code.

l_word   tyx                  ;now clip the last word (need it in X)
         lda   mask,x         ;same procedure as above
         ora   l_mask
         and   <$00,x
         sta   tmp3           ;the other space is used
         lda   l_mask
         eor   #$FFFF
         and   data,x
         ora   tmp3
         pha
;        jmp   e_code         ;this can be eliminated

e_code   lda   tmp1
         sta   Tbl,y          ;restore values in the table
         lda   tmp2
         sta   Tbl+2,y
         jmp   (link)         ;now jump to wherever

In case you're wondering, the reason for having both l_word and e_code
is to handle the case where X-reg == Y-reg. In that case the dispatch
code jumps directly to e_code.

What follows are the data tables and such. This code is taken from test
code I've compiled and worked with, so it is correct. Try and figure it
out. :)

data     dc    h'1111 2222 3300 0000 0000 0044 4455 5566 0000 0077 8888'
mask     dc    h'0000 0000 00ff ffff ffff ff00 0000 0000 ffff ff00 0000'

Tbl      dc    i2'w_10,w_9,w_8,w_7,w_6,w_5,w_4,w_3,w_2,w_1,w_0'

w_0      pea   $8888
         jmp   (Tbl+18)       ;this goes in right to left order
w_1      lda   <$12
         and   #$ff00
         ora   #$0077
         pha
         jmp   (Tbl+16)
w_2      pei   <$10           ;transparent word
         jmp   (Tbl+14)
w_3      pea   $5566
         jmp   (Tbl+12)
w_4      pea   $4455
         jmp   (Tbl+10)
w_5      lda   <$0A
         and   #$ff00
         ora   #$0044
         pha                  ;masked word
         jmp   (Tbl+8)
w_6      pei   <$08
         jmp   (Tbl+6)
w_7      pei   <$06
         jmp   (Tbl+4)
w_8      lda   <$04
         and   #$00ff
         ora   #$3300
         pha
         jmp   (Tbl+2)
w_9      pea   $2222
         jmp   (Tbl)
w_10     pea   $1111
         jmp   e_code         ;this is the last word, exit now

The algorithmic idea behind this sprite is to handle the left and right
edges as a special case, so that any data in the middle can be blitted
one word at a time. As you can see, there is a bit of overhead for each
word, but it is still ~2 to 3 times better than a bitmap (11 to 23
cycles vs. 33 cycles).

Also, since the scanlines are stored in an array, one could have an
independent table of offsets for each line, giving wave effects on a
per-sprite basis, or vertical flipping. With the self-clipping aspect,
a different Xmin and Xmax can also be defined for each line of the
graphics screen, allowing for very complex borders.
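
For readers who do not want to trace the assembly, here is a rough C
model of what one scanline routine does to the screen. It is only a
sketch under several assumptions: the names merge and draw_scanline are
hypothetical, words are treated as plain 16-bit values the way the
compiled code above uses them (byte order on the real machine is
ignored), left and right are word indices rather than byte offsets in
X and Y, and the dispatch-table patching is replaced by an ordinary
loop. A set mask bit means "keep the screen pixel".

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 11

/* Sample scanline from the text, as the compiled code sees each word. */
static const uint16_t data[WORDS] = {
    0x1111, 0x2222, 0x3300, 0x0000, 0x0000, 0x0044,
    0x4455, 0x5566, 0x0000, 0x0077, 0x8888
};
static const uint16_t mask[WORDS] = {
    0x0000, 0x0000, 0x00FF, 0xFFFF, 0xFFFF, 0xFF00,
    0x0000, 0x0000, 0xFFFF, 0xFF00, 0x0000
};

/* Keep the screen bits selected by 'keep', take the rest from 'pixels'. */
static uint16_t merge(uint16_t screen, uint16_t pixels, uint16_t keep)
{
    return (uint16_t)((screen & keep) | (pixels & (uint16_t)~keep));
}

/* Draw words left..right (inclusive) of the sample scanline into 'line'.
 * r_mask and l_mask are the extra clip masks for the two edge words.
 * Like the assembly, only r_mask is applied when left == right. */
void draw_scanline(uint16_t *line, int left, int right,
                   uint16_t l_mask, uint16_t r_mask)
{
    int i;

    /* Mirrors the note that X-reg must be >= Y-reg. */
    assert(0 <= left && left <= right && right < WORDS);

    /* Rightmost word: sprite mask combined with the right clip mask. */
    line[right] = merge(line[right], data[right],
                        (uint16_t)(mask[right] | r_mask));

    /* Middle words, right to left: sprite mask only (the PEA/PEI part). */
    for (i = right - 1; i > left; i--)
        line[i] = merge(line[i], data[i], mask[i]);

    /* Leftmost word: sprite mask combined with the left clip mask. */
    if (left < right)
        line[left] = merge(line[left], data[left],
                           (uint16_t)(mask[left] | l_mask));
}
```

Calling draw_scanline(line, 0, WORDS - 1, 0, 0) draws the whole
scanline unclipped. Narrowing left and right, and setting bits in
l_mask and r_mask, clips it to a window that need not fall on a word
boundary.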

If one were to use this system, a convenient structure, as defined in
C, might be:

#define numLines 20

typedef struct sprite {
   int height;
   int width;
   void (*scanlines[numLines])(void);
   int offset[numLines];
   int Xmin[numLines];
   int Xmax[numLines];
} sprite;

My main motivation for developing this type of sprite was to have a
technique to eliminate the erase/update part of a draw/erase/update
loop. In my Super Mario Bros game, I had at one point just a draw/draw
loop where the screen would scroll and, after the baseline of a sprite
was reached, the sprite would be drawn on top of the newly scrolled
area. This was good, but introduced noticeable flicker, especially for
large sprites. If I could draw the sprites one scanline at a time, some
tearing might occur, but there would be no flicker. So this format was
born. The extra features contained within it are just extensions of the
scanline-independent nature.

I've tried to be careful to make sure that the same scanline can be
used multiple times across several sprites. If one were to put some
effort into it, sprites could change dynamically by changing entries in
their scanline table. This could be used for some cool effects.

Also, there is one point that may need some explanation. The reason the
sprite exits via a jml (abs) is two-fold. First, the SP is used to pass
a parameter, and doing a js[lr] would corrupt that, not to mention use
memory that's not yours. Second, since we are blitting to Bank 01, the
DP and Stack are in that bank, so rt[ls] are not usable. By doing a jml
to and from the sprite, we can make a dispatch routine to call the
scanlines on a per-scanline basis.

If needed, the sprite format can be modified to draw to any bank and be
callable via an rtl. Some extra fields in the header need to be filled
out, and the PEA/PEI optimization is no longer possible, so I don't
know if it would be worth doing unless the variable clipping justified
it.
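
To make the per-scanline dispatch idea more concrete, here is a
minimal C sketch of a routine that walks a sprite's scanline table.
Everything beyond the struct fields is an assumption for illustration:
the scanline_fn signature, the draw_sprite name, the vflip flag, and
the line_a/line_b stand-ins (which only record that they were called)
are all hypothetical, and ordinary C calls take the place of the
jml-in/jml-out convention. Positions and widths are in words.

```c
#include <assert.h>
#include <stddef.h>

#define numLines 20

/* Hypothetical signature: screen row plus clipped left/right words. */
typedef void (*scanline_fn)(int y, int left, int right);

typedef struct sprite {
    int height;
    int width;
    scanline_fn scanlines[numLines];
    int offset[numLines];   /* per-line horizontal offset (wave effects) */
    int Xmin[numLines];     /* per-line clip window */
    int Xmax[numLines];
} sprite;

/* Stand-in scanline routines: they log their id instead of drawing. */
static char trace[numLines + 1];
static int trace_len, last_left, last_right;

static void record(char id, int left, int right)
{
    if (trace_len < numLines) {
        trace[trace_len++] = id;
        trace[trace_len] = '\0';
    }
    last_left = left;
    last_right = right;
}
static void line_a(int y, int l, int r) { (void)y; record('A', l, r); }
static void line_b(int y, int l, int r) { (void)y; record('B', l, r); }

/* Draw the sprite with its top-left word at (x, y).
 * Returns how many scanlines were actually dispatched. */
int draw_sprite(const sprite *s, int x, int y, int vflip)
{
    int row, drawn = 0;

    assert(s != NULL && s->height >= 0 && s->height <= numLines);

    for (row = 0; row < s->height; row++) {
        /* Vertical flip: read the scanline table bottom-up. */
        int src   = vflip ? s->height - 1 - row : row;
        int left  = x + s->offset[row];
        int right = left + s->width - 1;

        /* Per-line clipping against Xmin/Xmax. */
        if (left  < s->Xmin[row]) left  = s->Xmin[row];
        if (right > s->Xmax[row]) right = s->Xmax[row];
        if (left > right)
            continue;       /* line is completely clipped away */

        s->scanlines[src](y + row, left, right);
        drawn++;
    }
    return drawn;
}
```

Because flipping only changes the order in which the table is read, no
second copy of the sprite code is needed. Changing a table entry
between frames swaps in a different scanline at no extra drawing cost.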

I'm sure there are flaws and sub-optimal code in my sprites. If anyone
knows of another way to implement the self-clipping without testing
after each blitted word or special-casing everything, please get in
touch with me.

------------------------------------------------------------------------

One last thing. I always thought transparency was cool, so here's a way
to do it for a compiled sprite. It's not 100% correct, but it gives
pretty good results:

         lda   screen_data
         and   #$EEEE
         clc
         adc   #sprite_data
         ror
         sta   screen_data

where sprite_data has the form %XXX0 XXX0 XXX0 XXX0
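
The arithmetic behind this trick can be checked with a small C model.
This is a sketch, with blend_word being a hypothetical name. Clearing
the low bit of every 4-bit pixel leaves room for each per-pixel sum to
carry into its neighbour's empty bit, and the rotate right then brings
that carry back down, so every pixel ends up as
((screen & $E) + sprite) / 2. The dropped low bit of the screen pixel
is why the result is not 100% correct.

```c
#include <assert.h>
#include <stdint.h>

/* Average four packed 4-bit pixels of 'screen' with those of 'sprite'.
 * 'sprite' must already have the form %XXX0 XXX0 XXX0 XXX0. */
uint16_t blend_word(uint16_t screen, uint16_t sprite)
{
    uint32_t sum;

    assert((sprite & 0x1111) == 0);  /* low bit of each pixel clear */

    /* AND #$EEEE, CLC, ADC: a 17-bit sum, bit 16 is the carry flag. */
    sum = (uint32_t)(screen & 0xEEEE) + sprite;

    /* ROR: shift right, with the carry entering at bit 15. */
    return (uint16_t)(sum >> 1);
}
```

For example, blending screen word $FFFF with sprite word $EEEE gives
$EEEE, and blending $EEEE with $0000 gives $7777.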