base:advanced_optimizing

Differences

This shows you the differences between two versions of the page.

base:advanced_optimizing [2021-04-19 08:26] – [LAX] bitbreakerbase:advanced_optimizing [2024-03-03 11:06] (current) – [ASR] bitbreaker
Line 718: Line 718:
  
 The advantage is, that you can move bits also across registers and are not restricted to the accumulator only. The advantage is, that you can move bits also across registers and are not restricted to the accumulator only.
 +
 +Rotating through the carry works on 9 bits: the bit that falls out at one end of the byte becomes the new carry, and the old carry is shifted in at the other end. When bits are meant to wrap around within the byte, this introduces a gap of one bit:
 +
 +<code>
 +        lda #%11111111
 +        clc
 +        rol
 +        rol
 +        ;-> A = %11111101
 +        ;              ^
 +        ;             gap :-(
 +</code>
 +
 +There are several ways to avoid this behavior:
 +
 +<code>
 +        lda #%11111111
 +        asl
 +        adc #0
 +        
 +        ...
 +        
 +        lda #%11111111
 +        anc #$ff
 +        rol
 +        
 +        ...
 +        
 +        lda #%11111111
 +        cmp #$80
 +        rol
 +</code>
 +
 +In all three variants bit 7 is copied to the carry first and then brought back in at the right end as bit 0.
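The equivalence of the three variants can be checked exhaustively with a small model. The following is a minimal Python sketch, not 6502 code; all function names are made up for illustration, and the flag behaviour of ANC (carry = bit 7 of the AND result) is modelled as it is commonly documented for this illegal opcode.

```python
def rol(a, carry):
    """ROL A: 9-bit rotate left through carry. Returns (new_a, new_carry)."""
    return ((a << 1) | carry) & 0xFF, a >> 7


def asl_adc0(a):
    """asl / adc #0: bit 7 moves to carry, then is added back in as bit 0."""
    carry = a >> 7
    a = (a << 1) & 0xFF
    return (a + carry) & 0xFF  # bit 0 is clear after asl, so no overflow


def anc_rol(a):
    """anc #$ff / rol: ANC leaves A unchanged here and copies bit 7 to carry."""
    a &= 0xFF
    carry = a >> 7
    return rol(a, carry)[0]


def cmp_rol(a):
    """cmp #$80 / rol: CMP sets carry when A >= $80, i.e. when bit 7 is set."""
    carry = 1 if a >= 0x80 else 0
    return rol(a, carry)[0]


def rotl8(a):
    """Reference: a true 8-bit rotate left without any gap."""
    return ((a << 1) | (a >> 7)) & 0xFF


# The problem case from the text: lda #%11111111 / clc / rol / rol
a, c = rol(0xFF, 0)
a, c = rol(a, c)
gap = a

all_equal = all(
    f(v) == rotl8(v) for v in range(256) for f in (asl_adc0, anc_rol, cmp_rol)
)
```

Under this model `gap` shows the hole at bit 1, while `all_equal` confirms that each variant behaves like a plain 8-bit rotate for every input.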
 +
 +If you deal with chars, you often need numbers divided by 8. This also includes numbers wider than 8 bits, as the screen is 320 pixels wide, and with clipping the range can be even bigger.
 +An easy way to shift 11 bits down to a final 8 bit result, without having to shift two different bytes independently, is the following:
 +
 +<code>
 +        lda xhi        ;00000hhh
 +        asr #$0f       ;000000hh h - can also be a plain lsr if no upper bits need to be masked off
 +        ora xlo        ;lllll0hh h
 +        ror            ;hlllll0h h
 +        ror            ;hhlllll0 h
 +        ror            ;hhhlllll 0
 +</code>
 +
 +As the least significant 3 bits are lost during the shift anyway, we place the bits of the highbyte there and rotate them back in on the left side, so all we need to shift then is a single byte. To make the rotation work, the highbyte needs to be preshifted by one before the lowbyte is merged in. The only prerequisite of this method is that the lowbyte must have its least significant three bits cleared.
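To convince ourselves that the rotation really yields the 11 bit value divided by 8, the sequence can be modelled and compared against a plain shift. This is a minimal Python sketch under the stated prerequisites (highbyte in the range 0..7, lowbyte with its lowest three bits cleared); the helper names are hypothetical.

```python
def asr(a, imm):
    """ASR/ALR #imm: AND with imm, then LSR. Returns (new_a, carry)."""
    a &= imm
    return a >> 1, a & 1


def ror(a, carry):
    """ROR A: 9-bit rotate right through carry. Returns (new_a, new_carry)."""
    return (carry << 7) | (a >> 1), a & 1


def div8(xhi, xlo):
    """Model of: lda xhi / asr #$0f / ora xlo / ror / ror / ror."""
    a, carry = asr(xhi, 0x0F)
    a |= xlo  # ORA leaves the carry untouched
    for _ in range(3):
        a, carry = ror(a, carry)
    return a


def check_div8():
    """Compare against (x >> 3) for every valid 11-bit input."""
    for xhi in range(8):
        for xlo in range(0, 256, 8):  # low three bits cleared
            if div8(xhi, xlo) != ((xhi << 8) | xlo) >> 3:
                return False
    return True
```

For example, `div8(0x01, 0x40)` models x = 320 and gives char column 40.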
 ====== Jumpcode ====== ====== Jumpcode ======
  
Line 859: Line 907:
                  
 In the same way this method can also be used to set bits (for e.g. with adc #$81) or to toggle bits. In the same way this method can also be used to set bits (for e.g. with adc #$81) or to toggle bits.
 +
 +When masking out bits, SAX or SBX is often a good choice.
 + 
 +<code>
 +       lax value
 +       and #%11110000
 +       sta highnibble
 +</code>
 +
 +After this we need to restore the value from X before we can mask the lower bits. That is better than another lda value, but still an extra instruction.
 +
 +<code>      
 +       lda value
 +       ldx #%11110000
 +       sax highnibble
 +</code>
 +
 +This already looks better: we still have the original value in A and can do another mask operation.
 +
 +<code>
 +       lax value
 +       eor #%00001111
 +       sax highnibble       
 +</code>
 +
 +This looks even better: X still holds the original value and can be reused, and A contains the original bits as well, with the bits covered by the EOR mask inverted. Having the value available in more than one register opens up more options for reuse and gives potential for further savings.
 +This was spotted in Krill's loader when doing lookups on the GCR tables, so thanks to Krill here :-)
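Why the EOR variant stores the high nibble is easy to miss: SAX stores A AND X, and a bit ANDed with its own inverse is always zero. The following minimal Python sketch models this, assuming the EOR mask is the 8 bit value %00001111; the function names are made up for illustration.

```python
def sax(a, x):
    """SAX: the value written to memory is A AND X."""
    return a & x


def high_nibble_via_eor(value):
    """Model of: lax value / eor #%00001111 / sax highnibble."""
    a = x = value        # lax loads A and X with the same value
    a ^= 0x0F            # invert the low nibble in A only
    stored = sax(a, x)   # low nibble: (not v) AND v = 0, high nibble: v AND v = v
    return stored, a, x


def check_all():
    """The stored byte is the high nibble; X is untouched; A has its low nibble inverted."""
    for v in range(256):
        stored, a, x = high_nibble_via_eor(v)
        if stored != (v & 0xF0) or x != v or a != (v ^ 0x0F):
            return False
    return True
```

So after the store, X can serve directly as the original value, and A only needs another eor with the same mask to get back to it.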
 ====== Illegal opcodes ====== ====== Illegal opcodes ======
  
Line 983: Line 1058:
  
 <code> <code>
-        and #$ff+        and #$fe
         lsr         lsr
-        clc 
 </code> </code>
 ===== ARR ===== ===== ARR =====
Line 1411: Line 1485:
 </code> </code>
  
-Depending on what you have in register A, you can express it in many differnet ways:+Depending on what you have in register A, you can express it in many different ways:
  
 <code> <code>
Line 1441: Line 1515:
 There are of course also other expressions possible, just ponder a while about the term. Also the carry flag after the negation can be influenced, depending on using sbc or adc for most cases ($00/$ff will cause an overflow). There are of course also other expressions possible, just ponder a while about the term. Also the carry flag after the negation can be influenced, depending on using sbc or adc for most cases ($00/$ff will cause an overflow).
  
-How'about forming terms with logical operations? We notice, that for e.g. (a + b) xor $ff is the same as (a xor $ff) - b:+How about forming terms with logical operations? We notice, that for e.g. (a + b) xor $ff is the same as (a xor $ff) - b:
  
 <code> <code>
Line 1470: Line 1544:
         tsx ;fetch value from table again         tsx ;fetch value from table again
 </code> </code>
 +
 +====== Limiting and masking ======
 +
 +Sometimes we want to extract the low nibble of a value and limit it to a given range.
 +
 +<code>
 +        bpl .positive
 +        cmp #$f0
 +        bcs +
 +        lda #$f0
 ++
 +        and #$0f
 +</code>
 +
 +As you can see, we limit the value to $f0..$ff first and then mask off the high nibble to end up with values in the range $00..$0f.
 +
 +Observe how this can be done cheaper by just shifting the range and making use of the 8 bit wrap-around into the carry:
 +
 +<code>
 +        bpl .positive
 +        ;clc
 +        adc #$10
 +        bcs +
 +        lda #$00
 ++
 +</code>
 +
 +We add $10, so whether the limit is reached can be read from the carry afterwards (this assumes the carry is clear before the adc, hence the commented-out clc). When the addition overflows, the 8 bit wrap-around has already zeroed the upper bits, so we can drop the and #$0f. The low nibble is not affected, as adding $10 only changes the upper four bits.
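That both sequences give the same result for every negative input can be checked with a small model. This is a minimal Python sketch with hypothetical function names; it only covers the path taken when the bpl is not taken, and it treats the state of the carry before the adc as an explicit parameter.

```python
def clamp_and_mask(a):
    """Model of: cmp #$f0 / bcs + / lda #$f0 / + / and #$0f."""
    if a < 0xF0:          # carry clear after cmp: below the limit
        a = 0xF0
    return a & 0x0F


def clamp_by_add(a, carry_in=0):
    """Model of: adc #$10 / bcs + / lda #$00 / +."""
    total = a + 0x10 + carry_in
    if total > 0xFF:      # carry set: value was in range and has wrapped
        return total & 0xFF
    return 0x00


def check_equivalent():
    """Compare both versions for all negative values ($80..$ff), carry clear."""
    return all(
        clamp_by_add(a) == clamp_and_mask(a) for a in range(0x80, 0x100)
    )
```

With a set carry going in, the results are off by one (for example `clamp_by_add(0xF5, 1)` gives 6 instead of 5), which is why the clc can only be omitted when the carry is known to be clear.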
  
 ====== Misc stuff ====== ====== Misc stuff ======
Line 1539: Line 1641:
  
 <code> <code>
-        lda bmp+        lda bmp       ;could also use lax bmp, sbx #$08, stx bmp to save more cycles
         sec         sec
         sbc #$08         sbc #$08
Line 1612: Line 1714:
 **HAPPY OPTIMIZING!** **HAPPY OPTIMIZING!**
  
-Bitbreaker/Oxyron^Nuance+Bitbreaker/Performers^Nuance
base/advanced_optimizing.1618813584.txt.gz · Last modified: 2021-04-19 08:26 by bitbreaker