--- base:8bit_multiplication_16bit_product_fast_no_tables [2020-02-02 22:00] – djmips
+++ base:8bit_multiplication_16bit_product_fast_no_tables [2023-03-15 03:25] (current) – djmips
@@ Line 1: / Line 1: @@
+====== 8bit multiplication with 16bit product ======
+This code aims to be fast, without using tables.
 <code>
 ; mul 8x8 16 bit result for when you can't afford big tables
 ; by djmips
 ;
-; inputs are mul1 and mul2 and A should be zero
+; inputs are mul1 and X.  mul1 and mul2 should be zero page locations (mul2 is used as scratch)
-; output is 16 bit in A : mul1
+; A should be zero on entry; a nonzero A is added to the 16 bit result (when X is not zero).
 ;
+; output is 16 bit in A : mul1   (A is high byte)
+;
+; length = 65 bytes
 ; total cycles worst case = 113
 ; total cycles best case = 97
 ; avg = 105
-; inner loop credits supercat
+; inner loop credits Damon Slye CALL APPLE, JUNE 1983, P45-48.
 MUL:
-     dec mul2	;5  ; decrement mul2 because we will be adding with carry set for speed (an extra one)
+     cpx #$00
-     ror mul1	;5      \
+     beq zro
-     bcc b1	  ;2/3     \  Best case 8 Worst case 10
+     dex          ; mul2 = X-1, because every adc below runs with carry set, which adds the extra one back
-     adc mul2	;3       /
+     stx mul2
-b1: ror		    ;2       \
+     ror mul1
-     ror mul1	;5         \
+     bcc b1
-     bcc b2	  ;2/3      /  Best case 10 Worst case 12
+     adc mul2
-     adc mul2	;3       /
+b1:  ror
+     ror mul1
+     bcc b2
+     adc mul2
 b2:  ror
      ror mul1
      bcc b3
-     adc mul2  ; 10 or 12
+     adc mul2
 b3:  ror
      ror mul1
      bcc b4
-     adc mul2  ; 10 or 12
+     adc mul2
 b4:  ror
      ror mul1
      bcc b5
-     adc mul2   ; 10 or 12
+     adc mul2
 b5:  ror
      ror mul1
      bcc b6
-     adc mul2   ; 10 or 12
+     adc mul2
 b6:  ror
      ror mul1
      bcc b7
-     adc mul2   ; 10 or 12
+     adc mul2
 b7:  ror
      ror mul1
      bcc b8
-     adc mul2  ; 10 or 12
+     adc mul2
-b8:  ror		   ; 2
+b8:  ror
-     ror mul1	 ; 5
+     ror mul1
-     inc mul2  ; 5
+     inx          ; Optional: restores X for the caller. Could use inc mul2 instead, or leave it out
      rts
+zro: stx mul1
+     txa
+     rts
 </code>
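
As a cross-check on the listing above, here is a minimal Python sketch that follows the new revision step by step. It is only a model under stated assumptions, not part of the original page: the names `mul8x8` and `ror` are made up for illustration, A is assumed to be zero on entry, and decimal mode is assumed to be off.

```python
def ror(value, carry_in):
    """Model of a 6502 ROR on one byte: returns (new_value, carry_out)."""
    return (carry_in << 7) | (value >> 1), value & 1


def mul8x8(mul1, x):
    """Hypothetical model of MUL: returns (A, mul1) = (high byte, low byte)."""
    if x == 0:                      # cpx #$00 / beq zro
        return 0, 0                 # zro: stx mul1 / txa
    a = 0                           # assumption: A is zero on entry
    carry = 1                       # cpx #$00 always leaves carry set
    mul2 = (x - 1) & 0xFF           # dex / stx mul2

    mul1, carry = ror(mul1, carry)  # first ror mul1: bit 0 moves into carry
    for _ in range(8):              # the eight unrolled steps, labels b1..b8
        if carry:                   # bcc bN falls through only when carry is set
            total = a + mul2 + 1    # adc mul2 with carry set adds X-1+1 = X
            a, carry = total & 0xFF, total >> 8
        a, carry = ror(a, carry)            # bN: ror (accumulator)
        mul1, carry = ror(mul1, carry)      # ror mul1: next multiplier bit into carry
    return a, mul1


if __name__ == "__main__":
    high, low = mul8x8(13, 10)
    print(f"13 * 10 = ${high:02X}{low:02X}")
```

Running the model over all 65536 input pairs is a cheap way to confirm that the `dex` plus carry-set `adc` trick and the `zro` special case give the same 16 bit product as ordinary multiplication.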