User Tools

Site Tools


base:fastest_multiplication

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
base:fastest_multiplication [2023-08-21 04:54] reposebase:fastest_multiplication [2024-02-13 08:24] (current) – accurate timing if Jack Asser code repose
Line 5: Line 5:
 Requires tables or a generator routine such as [[table_generator_routine_for_fast_8_bit_mul_table]] Requires tables or a generator routine such as [[table_generator_routine_for_fast_8_bit_mul_table]]
  
-Jack Asser's: 233 cycles  ref: [[seriously_fast_multiplication]] \\ +Jack Asser's: 238.1 cycles  ref: [[seriously_fast_multiplication]] \\ 
-Chris Jam's: 204.5  ref: http://csdb.dk/forums/?roomid=11&topicid=91766 \\ +Chris Jam's: 204.5 ref: http://csdb.dk/forums/?roomid=11&topicid=91766 \\ 
-Mine: 196 zp variation: 192 \\ +Mine: 198.6 zp variation: 194.6 \\ 
-Times above need to add 12 for jsr/rts \\ +Times above need to add for rts \\ 
-Note: updated 2023; corrected typos and timings \\+Note: updated 2023; corrected code, typos and timings \\ 
 +*Timing approximate for external code \\ 
 +This is the winner out of 120 published algorithms as independently tested here: 
 +https://github.com/TobyLobster/multiply_test \\ 
 +Note: I have a new version which is 188.1 cycles, excepting the RTS. \\
  
 <code> <code>
Line 22: Line 26:
 ;zero page ram required: minimum 8 bytes, ideally 14 ;zero page ram required: minimum 8 bytes, ideally 14
 ;do_add: 30 bytes in zp, if used ;do_add: 30 bytes in zp, if used
-;time: 196 cycles, option for 192 if you use 30 more zp bytes for do_add+;time: 198.6 cycles, option for 194.6 if you use 30 more zp bytes for do_add
 ;measurement method: average timings over all input combinations ;measurement method: average timings over all input combinations
  
 ;How to use: ;How to use:
-;put numbers in x/y and result is Y reg, reg, z1, z0+;put numbers in x/y and result is Y reg (z3)reg (z2), z1, z0
  
 ;tables of squares ;tables of squares
 ;sqr(x)=x^2/4 ;sqr(x)=x^2/4
 ;negsqr(x)=(255-x)^2/4 ;negsqr(x)=(255-x)^2/4
-sqrlo=$c000;511 bytes +sqrlo=$c100;511 bytes 
-sqrhi=$c200;511 bytes +sqrhi=$c300;511 bytes 
-negsqrlo=$c400;511 bytes +negsqrlo=$c500;511 bytes 
-negsqrhi=$c600;511 bytes+negsqrhi=$c700;511 bytes
  
 ;pointers to square tables above ;pointers to square tables above
Line 49: Line 53:
 z0=$80;product, 2 bytes z0=$80;product, 2 bytes
 z1=$81 z1=$81
-z2=$82 ;returned in reg+z2=$82 ;returned in reg
 z3=$83 ;returned in Y reg z3=$83 ;returned in Y reg
  
 ;Example showing use ;Example showing use
 +*=$c000
 lda #$ff lda #$ff
 sta x0 sta x0
Line 60: Line 65:
 jsr makesqrtables jsr makesqrtables
 jsr umult16 jsr umult16
-stx z2+sta z2
 sty z3 sty z3
 +rts
 ;result should be $fffe0001, e.g. as viewed with a typical m 0080 monitor command: ;result should be $fffe0001, e.g. as viewed with a typical m 0080 monitor command:
 ;0080 01 00 fe ff ;0080 01 00 fe ff
Line 167: Line 173:
 lda (p_sqr_hi),y lda (p_sqr_hi),y
 sbc (p_invsqr_hi),y sbc (p_invsqr_hi),y
-tay;x1*y1h;Y=z3, 30.992 cycles+tay;Y=x1*y1h, 30.992 cycles
 ;17+34+33+17+33+31=164.97 cycles for main multiply part (minimum=157, maximum=173) ;17+34+33+17+33+31=164.97 cycles for main multiply part (minimum=157, maximum=173)
  
Line 181: Line 187:
 c2a: lda #0 c2a: lda #0
 c2b: adc #0 c2b: adc #0
- tax;X=z2, 6 cycles + tax 
- bcc c1c;3/6 avg 4.5+ bcc c1c;9
  iny;z3++  iny;z3++
- clc+ clc;(+3) taken 7% of the time, 3*.07=+.21
  
 ;-add last number of column 1 ;-add last number of column 1
Line 192: Line 198:
  
 ;-add last number of column 2 ;-add last number of column 2
- txa;A=z2+ txa
 c2c: adc #0 c2c: adc #0
- tax;X=z2, 6 + ;A=z2 
- bcc fin;3/4 avg 3.5 + bcc fin;7 
- iny;z3+++ iny;(+1) taken 42% of the time, 1*.42=.42
  
-;Y=z3, X=z2 +;Y=z3, A=z2 
-;9+6+4.5+8+6+3.5=37 +;add partials part total cycles=33.63 (minimum=33, maximum=37) 
-fin: rts+;total time=164.97+33.63=198.6 
 +fin: rts;add 6 to include this (204.6)
  
 Diagram of the additions Diagram of the additions
base/fastest_multiplication.1692586473.txt.gz · Last modified: 2023-08-21 04:54 by repose