User Tools

Site Tools


base:fast_sqrt

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
base:fast_sqrt [2019-07-27 18:20] verzbase:fast_sqrt [2019-08-18 20:28] (current) verz
Line 390: Line 390:
         }         }
 </code> </code>
-where //p// is the number of bits of the result; or half the bits of the radicand (+1 if the radicand has an odd amount of bits).\\ +where //p// is the number of bits of the result; (or half the bits of the radicand+1 if the radicand has an odd amount of bits).\\ 
 //p=ceil(#bits-of-radicand/2)//\\  //p=ceil(#bits-of-radicand/2)//\\ 
  
 Incidentally the algorithm provides also the remainder via (M LSR p)\\  Incidentally the algorithm provides also the remainder via (M LSR p)\\ 
-//p// can be greater than necessary, just producing more iterations. A value of p=16 can be choosed to perform the calculations for 32bit numbers and smaller.+//p// can be greater than necessary, just producing more iterations. A value of p=16 permits to perform the calculations for 32bit numbers and smaller.
  
 This is the code for a 32bit integer Sqrt. Provides the result and the remainder: This is the code for a 32bit integer Sqrt. Provides the result and the remainder:
-<code>+<code 6502acme>
 ;******************************************** ;********************************************
 ;*    sqrt32 ;*    sqrt32
 ;* ;*
 ;*   computes Sqrt of a 32bit number ;*   computes Sqrt of a 32bit number
 +;********************************************
 +;*   by Verz - Jul2019
 +;********************************************
 ;* ;*
-;*      input:  square, the 4-byte source number +;*  input:  square, 32bit source number 
-;*      output: sqrt, 16bit value +;*  output: sqrt,   16bit value 
-;*              remnd, 16bit value +;*          remnd,  17bit value 
-;*+;********************************************
  
 sqrt32  lda #0 sqrt32  lda #0
         sta sqrt        ; R=0         sta sqrt        ; R=0
         sta sqrt+1         sta sqrt+1
 +        sta M+4
         ;sta T+1        ; (T+1) is zero until last iteration; (T+0) is always 0         ;sta T+1        ; (T+1) is zero until last iteration; (T+0) is always 0
  
 +        clc
         ldy #14         ; 15 iterations (14-->0) + last iteration         ldy #14         ; 15 iterations (14-->0) + last iteration
-loopsq  clc +loopsq   
-        lda stablo,   ; (2*R+D) LSR 1; actually: R+(D LSR 1) +        lda sqrt        ; (2*R+D) LSR 1; actually: R+(D LSR 1) 
-        adc sqrt +        ora stablo,   ; using ORA instead of ADC is ok because the bit to be set 
-        sta T+2 +        sta T+2         ;    will have not been affected yet 
-        lda stabhi,y +        lda sqrt+1 
-        adc sqrt+1+        ora stabhi,y
         sta T+3         sta T+3
 +        bcs skip0       ; takes care of large numbers; if set, M>T
                  
         lda M+3         lda M+3
Line 436: Line 442:
         sbc T+3         sbc T+3
         sta M+3         sta M+3
-        clc             ; possibly unnecessary 
         lda sqrt        ; R=R+D         lda sqrt        ; R=R+D
-        adc stablo+1,y+        ora stablo+1,y
         sta sqrt         sta sqrt
         lda sqrt+1         lda sqrt+1
-        adc stabhi+1,y+        ora stabhi+1,y
         sta sqrt+1         sta sqrt+1
 skip1 skip1
Line 451: Line 456:
         bpl loopsq         bpl loopsq
 lastiter                ; code for last iteration lastiter                ; code for last iteration
 +        bcs skp0        ; takes care of large numbers; if set, M>T
         ; during last iteration D=1, so [(2*R+D) LSR 1] makes D the MSB of T+1         ; during last iteration D=1, so [(2*R+D) LSR 1] makes D the MSB of T+1
         lda M+3         lda M+3
Line 474: Line 480:
         sta M+3         sta M+3
         inc sqrt        ; R=R+D with D=1         inc sqrt        ; R=R+D with D=1
-        bne skp1 +skp1    asl M+1         ; M=M*2; location M+0=0
-        inc sqrt+1 +
-skp1    asl M           ; M=M*2 +
-        rol M+1+
         rol M+2         rol M+2
         rol M+3         rol M+3
 +        rol M+4
         rts         rts
  
 ;**** Variables and Shift table ;**** Variables and Shift table
-       byte 0 
 stabhi byte 0,0,0,0,0,0,0,0 stabhi byte 0,0,0,0,0,0,0,0
 stablo BYTE $01,$02,$04,$08,$10,$20,$40,$80 stablo BYTE $01,$02,$04,$08,$10,$20,$40,$80
        byte 0,0,0,0,0,0,0,0        byte 0,0,0,0,0,0,0,0
  
-square  = $5B     bytes: input value +square = $57    bytes: input value; during calculation needs the 5th byte 
-sqrt    = $5F     ; 2 bytes: result +sqrt   = $5F    ; 2 bytes: result 
-remnd   = M+2     ; 2 bytes: is in fact the high bytes of M (M LSR 16) +remnd  = M+2    ; 2 B + 1 b: is in the high bytes of M (M LSR 16); msb is in T+0 (the 5th byte of square
-      = $57     ; 4 bytes: could be 2 bytes: T+0 is always 0; T+1 is 0 until last iteration +     = $5B    ; 4 bytes: could be 2 bytes: T+0 is always 0; T+1 is 0 until last iteration 
-      $5B     ; 4 bytes: over the input square+     square ; 4 bytes: over the input square
 </code> </code>
-The algorithm is pretty fast: it has a top cycles count of around 1700, but seems to average at 1.3ms with variables in page zero.\\  +The algorithm is pretty fast: it has a top cycles count of around 1700, but seems to average at 1.3ms (using variables in page zero).\\ 
-I think it could be convenient to add a control to check whether M=0 and exit earlier.+
  
base/fast_sqrt.1564244423.txt.gz · Last modified: 2019-07-27 18:20 by verz