        stx divisor
        sty dividend
        ldx #16
        lda #0
lp      asl dividend      ;top dividend bit -> carry, bit 0 cleared
        rol               ;shift it into the partial remainder in A
        cmp divisor       ;carry set if A >= divisor
        bcc nextbit       ;too small: result bit stays 0
        sbc divisor       ;carry is set, so this subtracts exactly
        inc dividend      ;set the result bit to 1
nextbit dex
        bne lp
        ;"dividend" now contains the result of the division.
        sta remainder

It is rather sneaky, and yes, it is correct. What it does is essentially the same computation, but it shifts the result into the dividend location instead of a separate result location.

However, I like the carry flag technique used by Shadow's code, so let's add it in:

        stx divisor
        sty dividend
        ldx #16
        lda #0
lp      rol dividend      ;previous result bit in (garbage the first time)
        rol
        cmp divisor
        bcc nextbit
        sbc divisor       ;leaves carry set = result bit of 1
nextbit dex
        bne lp
        rol dividend      ;last result bit in, garbage bit back out
        ;"dividend" now contains the result of the division.
        sta remainder

Note that now, each time a 1 bit is entered into the result, we save 7 cycles because the INC is gone. We do perform an extra ROL at the end, however. The relative performance is determined by what kind of answers we expect:

  - result is 0: the new code is 7 cycles slower;
  - result is a power of two: the new code is the same speed;
  - otherwise: the new code is faster (and this is most of the time, so we judge this a win).

Note that the first ROL to dividend shifts in a garbage bit, which is shifted out by the final one. That is what we are spending in order to make part of the main loop faster, and in this case we expect to save time overall, so it's a good thing.

What's interesting is that the effect is that of PRESERVING the carry flag of the caller. This is sometimes a big concern for assembly routines because the carry flag is a great place to pass booleans.

Lastly, here's a slightly rearranged version for clarity:

        stx divisor
        sty dividend
        ldx #16
        lda #0
        rol dividend      ;first dividend bit -> carry
lp      rol
        cmp divisor
        bcc nextbit
        sbc divisor
nextbit rol dividend      ;result bit in, next dividend bit out
        dex
        bne lp
        ;"dividend" now contains the result of the division.
        sta remainder

We are now communicating the dividend bit instead of the result bit in the carry flag when we branch back to lp.
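
The claims about the listings above (same quotient and remainder, and the caller's carry coming back unchanged) can be checked with a small model. The Python sketch below is not part of the original routines. It assumes 16-bit registers and memory, which the 16-iteration loop suggests, and the names rol, divide_inc and divide_carry are made up for illustration.

```python
MASK = 0xFFFF  # assumption: 16-bit accumulator and memory


def rol(value, carry):
    """Rotate left through carry; returns (new_value, new_carry)."""
    return ((value << 1) | carry) & MASK, (value >> 15) & 1


def divide_inc(dividend, divisor):
    """Model of the first listing: ASL the dividend, INC it on a 1 bit."""
    a = 0
    for _ in range(16):
        dividend, carry = rol(dividend, 0)   # asl dividend
        a = rol(a, carry)[0]                 # rol
        if a >= divisor:                     # cmp divisor / bcc nextbit
            a -= divisor                     # sbc divisor
            dividend += 1                    # inc dividend (bit 0 was 0)
    return dividend, a                       # (quotient, remainder)


def divide_carry(dividend, divisor, carry_in):
    """Model of the carry-flag listings: the result bit rides in carry."""
    a = 0
    carry = carry_in                         # whatever the caller left there
    for _ in range(16):
        dividend, carry = rol(dividend, carry)  # rol dividend
        a = rol(a, carry)[0]                    # rol
        carry = 1 if a >= divisor else 0        # cmp divisor
        if carry:
            a -= divisor                        # sbc divisor, carry stays set
    dividend, carry = rol(dividend, carry)      # final rol dividend
    return dividend, a, carry                # (quotient, remainder, carry out)
```

For example, divide_carry(100, 7, 1) should give a quotient of 14 and a remainder of 2, and hand back the carry of 1 it was given. The rearranged listing performs the same sequence of operations, so the same model covers it.
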

Note that we could also change the initial ROL to an ASL, making the garbage bit a 0 and thereby setting the carry flag to 0 at the end of the routine.

Todd Whitesel
toddpw @ tybalt.caltech.edu