base:8bit_multiplication_16bit_product_fast_no_tables

Differences

This shows you the differences between two versions of the page.

base:8bit_multiplication_16bit_product_fast_no_tables [2020-02-02 22:08] djmips
base:8bit_multiplication_16bit_product_fast_no_tables [2023-03-15 03:25] (current) djmips
Line 1:
+====== 8bit multiplication with 16bit product ======
+
+This code aims to be fast, without using tables.
+
 <code>
 ; mul 8x8 16 bit result for when you can't afford big tables
 ; by djmips
 ;
-; inputs are mul1 and mul2 and should be zero page.
+; inputs are mul1 and X.  mul1 and mul2 should be zp locations
 ; A should be zero entering but if you want it will factor in as 1/2 A added to the result.
 ;
 ; output is 16 bit in A : mul1   (A is high byte)
 ;
+; length = 65 bytes
 ; total cycles worst case = 113
 ; total cycles best case = 97
 ; avg = 105
-; inner loop credits supercat
+; inner loop credits Damon Slye CALL APPLE, JUNE 1983, P45-48.
 
 MUL:
-     dec mul2 ;5  ; decrement mul2 because we will be adding with carry set for speed (an extra one)
-     ror mul1 ;5      \
-     bcc b1   ;2/3      Best case 8 Worst case 10
-     adc mul2 ;3       /
-b1: ror     ;2       \
-     ror mul1 ;5         \
-     bcc b2   ;2/3      /  Best case 10 Worst case 12
-     adc mul2 ;3       /
+     cpx #$00
+     beq zro
+     dex          ; decrement mul2 because we will be adding with carry set for speed (an extra one)
+     stx mul2
+     ror mul1
+     bcc b1
+     adc mul2
+b1:  ror
+     ror mul1
+     bcc b2
+     adc mul2
 b2:  ror
      ror mul1
      bcc b3
-     adc mul2  ; 10 or 12
+     adc mul2
 b3:  ror
      ror mul1
      bcc b4
-     adc mul2  ; 10 or 12
+     adc mul2
 b4:  ror
      ror mul1
      bcc b5
-     adc mul2   ; 10 or 12
+     adc mul2
 b5:  ror
      ror mul1
      bcc b6
-     adc mul2   ; 10 or 12
+     adc mul2
 b6:  ror
      ror mul1
      bcc b7
-     adc mul2   ; 10 or 12
+     adc mul2
 b7:  ror
      ror mul1
      bcc b8
-     adc mul2  ; 10 or 12
-b8:  ror    ; 2
-     ror mul1  ; 5
-     inc mul2  ; 5
+     adc mul2
+b8:  ror
+     ror mul1
+     inx          ; Optional - this preserves X across the call - could also do inc mul2 or leave out
      rts
+
+zro: stx mul1
+     txa
+     rts
 </code>
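The shift-and-add logic of the newer revision can be checked off-target with a small model. The following is a minimal Python sketch, not part of the original page. It assumes A is zero on entry and decimal mode is clear. The names ''ror'', ''mul8x8'' and ''guard_zero'' are hypothetical and only used for illustration. The flag ''guard_zero=False'' skips the ''cpx''/''beq'' check to show what happens when the decremented multiplier wraps to $FF, which appears to be the case the added guard addresses.

```python
def ror(value, carry):
    """Rotate an 8-bit value right through carry; return (new_value, new_carry)."""
    return ((carry << 7) | (value >> 1)) & 0xFF, value & 1


def mul8x8(mul1, x, guard_zero=True):
    """Model of MUL: multiply mul1 by X, return (A, mul1) = (high byte, low byte).

    Assumes A = 0 on entry and decimal mode clear (assumptions of this sketch).
    """
    if guard_zero and x == 0:          # cpx #$00 / beq zro
        return 0, 0                    # zro: stx mul1 / txa (X is 0 here)

    a = 0
    carry = 1                          # cpx #$00 always leaves carry set
    mul2 = (x - 1) & 0xFF              # dex / stx mul2
    mul1, carry = ror(mul1, carry)     # first ror mul1: multiplier bit 0 -> carry

    for _ in range(8):                 # the eight unrolled steps ending at b1..b8
        if carry:                      # bcc not taken: adc mul2 with carry set
            total = a + mul2 + 1       # adds (X - 1) + 1, i.e. the original X
            a, carry = total & 0xFF, total >> 8
        a, carry = ror(a, carry)       # ror (accumulator), takes the adc carry
        mul1, carry = ror(mul1, carry) # ror mul1: next multiplier bit -> carry
    return a, mul1


if __name__ == "__main__":
    hi, lo = mul8x8(200, 100)
    print(hex((hi << 8) | lo))
```

Calling ''mul8x8(200, 100)'' gives the pair (0x4E, 0x20), that is 0x4E20 = 20000. With ''guard_zero=False'' and X = 0 the model adds 256 per set multiplier bit instead of 0, so ''mul8x8(1, 0, guard_zero=False)'' yields 256 rather than 0.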
  
  
base/8bit_multiplication_16bit_product_fast_no_tables.1580677701.txt.gz · Last modified: 2020-02-02 22:08 by djmips