User Tools

Site Tools


base:fast_sqrt

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
Last revisionBoth sides next revision
base:fast_sqrt [2019-07-27 21:28] verzbase:fast_sqrt [2019-08-18 10:20] verz
Line 390: Line 390:
         }         }
 </code> </code>
-where //p// is the number of bits of the result; or half the bits of the radicand (+1 if the radicand has an odd amount of bits).\\ +where //p// is the number of bits of the result; (or half the bits of the radicand+1 if the radicand has an odd amount of bits).\\ 
 //p=ceil(#bits-of-radicand/2)//\\  //p=ceil(#bits-of-radicand/2)//\\ 
  
Line 397: Line 397:
  
 This is the code for a 32bit integer Sqrt. Provides the result and the remainder: This is the code for a 32bit integer Sqrt. Provides the result and the remainder:
-<code>+<code 6502acme>
 ;******************************************** ;********************************************
 ;*    sqrt32 ;*    sqrt32
 ;* ;*
 ;*   computes Sqrt of a 32bit number ;*   computes Sqrt of a 32bit number
 +;********************************************
 +;*   by Verz - Jul2019
 +;********************************************
 ;* ;*
-;*      input:  square, the 4-byte source number +;*  input:  square, 32bit source number 
-;*      output: sqrt, 16bit value +;*  output: sqrt,   16bit value 
-;*              remnd, 16bit value +;*          remnd,  17bit value 
-;*+;********************************************
  
 sqrt32  lda #0 sqrt32  lda #0
         sta sqrt        ; R=0         sta sqrt        ; R=0
         sta sqrt+1         sta sqrt+1
 +        sta M+4
         ;sta T+1        ; (T+1) is zero until last iteration; (T+0) is always 0         ;sta T+1        ; (T+1) is zero until last iteration; (T+0) is always 0
  
         ldy #14         ; 15 iterations (14-->0) + last iteration         ldy #14         ; 15 iterations (14-->0) + last iteration
-loopsq  clc +loopsq   
-        lda stablo,   ; (2*R+D) LSR 1; actually: R+(D LSR 1) +        lda sqrt        ; (2*R+D) LSR 1; actually: R+(D LSR 1) 
-        adc sqrt +        ora stablo,   ; using ORA instead of ADC is ok because the bit to be set 
-        sta T+2 +        sta T+2         ;    will have not been affected yet 
-        lda stabhi,y +        lda sqrt+1 
-        adc sqrt+1+        ora stabhi,y
         sta T+3         sta T+3
 +        bcs skip0       ; takes care of large numbers; if set, M>T
                  
         lda M+3         lda M+3
Line 436: Line 441:
         sbc T+3         sbc T+3
         sta M+3         sta M+3
-        clc             ; possibly unnecessary 
         lda sqrt        ; R=R+D         lda sqrt        ; R=R+D
-        adc stablo+1,y+        ora stablo+1,y
         sta sqrt         sta sqrt
         lda sqrt+1         lda sqrt+1
-        adc stabhi+1,y+        ora stabhi+1,y
         sta sqrt+1         sta sqrt+1
 skip1 skip1
Line 451: Line 455:
         bpl loopsq         bpl loopsq
 lastiter                ; code for last iteration lastiter                ; code for last iteration
 +        bcs skp0        ; takes care of large numbers; if set, M>T
         ; during last iteration D=1, so [(2*R+D) LSR 1] makes D the MSB of T+1         ; during last iteration D=1, so [(2*R+D) LSR 1] makes D the MSB of T+1
         lda M+3         lda M+3
Line 474: Line 479:
         sta M+3         sta M+3
         inc sqrt        ; R=R+D with D=1         inc sqrt        ; R=R+D with D=1
-        bne skp1 +skp1    asl M+1         ; M=M*2; location M+0=0
-        inc sqrt+1 +
-skp1    asl M           ; M=M*2 +
-        rol M+1+
         rol M+2         rol M+2
         rol M+3         rol M+3
 +        rol M+4
         rts         rts
  
 ;**** Variables and Shift table ;**** Variables and Shift table
-       byte 0 
 stabhi byte 0,0,0,0,0,0,0,0 stabhi byte 0,0,0,0,0,0,0,0
 stablo BYTE $01,$02,$04,$08,$10,$20,$40,$80 stablo BYTE $01,$02,$04,$08,$10,$20,$40,$80
        byte 0,0,0,0,0,0,0,0        byte 0,0,0,0,0,0,0,0
  
-square  = $5B     bytes: input value +square = $57    bytes: input value; during calculation needs the 5th byte 
-sqrt    = $5F     ; 2 bytes: result +sqrt   = $5F    ; 2 bytes: result 
-remnd   = M+2     ; 2 bytes: is in fact the high bytes of M (M LSR 16) +remnd  = M+2    ; 2 B + 1 b: is in the high bytes of M (M LSR 16); msb is in T+0 (the 5th byte of square
-      = $57     ; 4 bytes: could be 2 bytes: T+0 is always 0; T+1 is 0 until last iteration +     = $5B    ; 4 bytes: could be 2 bytes: T+0 is always 0; T+1 is 0 until last iteration 
-      $5B     ; 4 bytes: over the input square+     square ; 4 bytes: over the input square
 </code> </code>
-The algorithm is pretty fast: it has a top cycles count of around 1700, but seems to average at 1.3ms with variables in page zero.\\  +The algorithm is pretty fast: it has a top cycles count of around 1700, but seems to average at 1.3ms (using variables in page zero).\\ 
-I think it could be convenient to add a control to check whether M=0 and exit earlier.+
  
base/fast_sqrt.txt · Last modified: 2019-08-18 20:28 by verz