Differences

This shows you the differences between two versions of the page.

--- base:advanced_optimizing [2021-04-19 08:26] – [LAX] bitbreaker
+++ base:advanced_optimizing [2022-12-09 14:47] – [Shifting] bitbreaker
@@ Line 718: / Line 718: @@
 The advantage is, that you can move bits also across registers and are not restricted to the accumulator only.
+When rotating with rol or ror, we effectively handle 9 bits: the bit falling out at one edge of the byte becomes the new carry, and the old carry is shifted in at the other edge. This introduces a gap of one bit when we wrap bits around:
+<code>
+        lda #%11111111
+        clc
+        rol
+        rol
+        ;-> A = %11111101
+        ;              ^
+        ;             gap :-(
+</code>
+There are several ways to avoid this behavior:
+<code>
+        lda #%11111111
+        asl
+        adc #0
+        ...
+        lda #%11111111
+        anc #$ff
+        rol
+        ...
+        lda #%11111111
+        cmp #$80
+        rol
+</code>
+In each variant, bit 7 is copied to the carry first and then shifted in at the right end again.
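+Because rotating %11111111 gives %11111111 again, the effect is easier to see with a mixed bit pattern. The following trace is only an illustrative sketch using the cmp/rol variant from above; the input value is an arbitrary assumption:
+<code>
+        lda #%10010110
+        cmp #$80       ;carry = 1, as bit 7 is set (A is unchanged)
+        rol            ;A = %00101101
+        cmp #$80       ;carry = 0, as bit 7 is clear
+        rol            ;A = %01011010
+        ;-> original pattern rotated left by two, no gap
+</code>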
+If you deal with chars, you often need numbers divided by 8. This also includes numbers bigger than 8 bits, as the screen is 320 pixels wide. If you include clipping, you might even span a bigger range.
+An easy way to shift 11 bits down to a final 8-bit result, without having to shift two different bytes independently, is the following:
+<code>
+        lda xhi        ;00000hhh
+        asr #$0f       ;000000hh h - can also be a plain lsr if no upper bits need to be masked off
+        ora xlo        ;lllll0hh h
+        ror            ;hlllll0h h
+        ror            ;hhlllll0 h
+        ror            ;hhhlllll 0
+</code>
+As the least significant 3 bits are lost during the shift anyway, we place the bits of the highbyte there and rotate them back in on the left side, so all we need to shift is a single byte. To make the rotation work, the highbyte needs to be preshifted by one before the lowbyte is merged in, so that its lowest bit already sits in the carry. The only prerequisite of this method is that the lowbyte must have its least significant three bits cleared.
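+As a worked example, here is a sketch for x = 319 ($013f). It is not part of the original routine: the and #$f8 and the scratch location tmp are assumptions, added only to satisfy the prerequisite when the lowbyte still carries its lower three bits, and a plain lsr is used because xhi has no upper bits set here:
+<code>
+        ;x = 319 = $013f -> xhi = $01, xlo = $3f
+        lda xlo
+        and #$f8       ;clear the 3 bits that are lost anyway -> $38
+        sta tmp        ;tmp: hypothetical scratch location
+        lda xhi        ;00000001
+        lsr            ;00000000 1
+        ora tmp        ;00111000 1
+        ror            ;10011100 0
+        ror            ;01001110 0
+        ror            ;00100111 0 -> $27 = 39 = 319 / 8
+</code>
+If the lowbyte is already stored with its lower three bits cleared, the first three instructions can be dropped and xlo can be merged in directly, as shown above.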
 ====== Jumpcode ======
@@ Line 1470: / Line 1518: @@
         tsx ;fetch value from table again
 </code>
+====== Limiting and masking ======
+Sometimes we want to extract the low nibble of a value and limit it to a given range.
+<code>
+        bpl .positive
+        cmp #$f0
+        bcs +
+        lda #$f0
++
+        and #$0f
+</code>
+As you can see, we limit the value to $f0..$ff first and then mask off the highnibble to end up with values that range from $00..$0f.
+Observe how this can be done cheaper by just shifting the range and making use of the 8-bit wrap-around and the carry:
+<code>
+        bpl .positive
+        ;clc
+        adc #$10
+        bcs +
+        lda #$00
++
+</code>
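+We add $10, so values from $f0..$ff wrap around to $00..$0f and set the carry, while everything below $f0 leaves the carry clear and is replaced by the limit $00. As the 8 bits wrapped by overflowing, the upper bits are already zero and we can forgo the and #$0f. The lownibble is not affected, as adding $10 only changes the highnibble. Note that the carry must be clear before the adc, hence the commented-out clc.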
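+To check that both versions agree, the shorter one can be traced by hand. The two input values below are arbitrary assumptions picked for illustration, and the carry is assumed to be clear on entry:
+<code>
+        ;input $f5: $f5 + $10 = $105 -> A = $05, carry set   -> branch taken, result $05
+        ;input $c3: $c3 + $10 = $d3  -> A = $d3, carry clear -> lda #$00,     result $00
+        adc #$10
+        bcs +
+        lda #$00
++
+</code>
+The first version yields the same results: $f5 stays within $f0..$ff and is masked to $05, while $c3 is limited to $f0 and then masked to $00.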
 ====== Misc stuff ======
@@ Line 1539: / Line 1615: @@
 <code>
-        lda bmp
+        lda bmp       ;could also use lax bmp, sbx #$08, stx bmp to save more cycles
         sec
         sbc #$08
@@ Line 1612: / Line 1688: @@
 **HAPPY OPTIMIZING!**
-Bitbreaker/Oxyron^Nuance
+Bitbreaker/Performers^Nuance