User Tools

Site Tools


base:quicksort_16-bit_elements

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
base:quicksort_16-bit_elements [2016-08-14 21:08] litwr2base:quicksort_16-bit_elements [2020-10-20 18:53] (current) – a link added litwr2
Line 1: Line 1:
 ====== Quicksort (for 16-bit Elements) ====== ====== Quicksort (for 16-bit Elements) ======
-by Vladimir Lidovski aka litwr, 13 Aug 2016+by Vladimir Lidovski aka litwr, 13 Aug 2016 (with help of BigEd)
  
-It is well known that the best, the fastest sort routine is Quicksort.  It is very odd that it is not implemented for 6502 for all of 42 yearsfrom 1975 to 2016.  The main problem is in the stack depending nature of Quicksort and the stack limit of 256 bytes of 6502 architecture.  It is solvable.+It is well known that the best, the fastest sort routine is Quicksort.  It is very odd that its implementations for 6502 for all of 42 years (from 1975 to 2016) have a bit blurred and unofficial status.  The main problem is in the stack depending nature of Quicksort and the stack limit of 256 bytes of 6502 architecture.  It is solvable.
  
 The next Pascal code was translated to 6502 assembler. The next Pascal code was translated to 6502 assembler.
Line 61: Line 61:
           tya           tya
           adc i2hi           adc i2hi
-          lsr+          ror
           sta zp1hi           sta zp1hi
           ror zp1lo           ror zp1lo
Line 214: Line 214:
 | |Insertion |21.4 |0.14 |39.75 |0.16 | | |Insertion |21.4 |0.14 |39.75 |0.16 |
 | |Shell     |1.75 |0.8  |1.12  |0.82 | | |Shell     |1.75 |0.8  |1.12  |0.82 |
-| |Quick     |0.75 |0.4  |0.43  |0. |+| |Quick     |0.75 |0.4  |0.47  |0.94  |
  
 ^4096 Integers^Sort Type ^Random ^Ordered ^Reversed ^Zeros ^ ^4096 Integers^Sort Type ^Random ^Ordered ^Reversed ^Zeros ^
 | |Insertion |317.98 |0.6  |635.13 |0.58 | | |Insertion |317.98 |0.6  |635.13 |0.58 |
 | |Shell     |8.12   |4.02 |5.25   |4    | | |Shell     |8.12   |4.02 |5.25   |4    |
-| |Quick     |3.63   |1.85 |2.02   |4.13 |+| |Quick     |3.67   |1.88 |2.04   |4.21 |
  
 ^12288 Integers^Sort Type ^Random ^Ordered ^Reversed ^Zeros ^ ^12288 Integers^Sort Type ^Random ^Ordered ^Reversed ^Zeros ^
 | |Insertion |2877.48|1.78 |5714.08|1.75 | | |Insertion |2877.48|1.78 |5714.08|1.75 |
 | |Shell     |30.18  |13.82|18.05  |13.81| | |Shell     |30.18  |13.82|18.05  |13.81|
-| |Quick     |11.67  |6.78 |7.28   |12.93|+| |Quick     |11.74  |6.83 |7.33   |13.07|
 //Random//, //Ordered//, //Reversed//, //Zeros// mean the type of array filling.  Random filling just copies ROM content into array.  Ordered filling uses numbers from 0 with step 1.  Reversed filling uses numbers from $ffff with step -1.  Zeros filling is just an array filled with the zeros only. //Random//, //Ordered//, //Reversed//, //Zeros// mean the type of array filling.  Random filling just copies ROM content into array.  Ordered filling uses numbers from 0 with step 1.  Reversed filling uses numbers from $ffff with step -1.  Zeros filling is just an array filled with the zeros only.
  
 So Quicksort is more than two times faster than Shell Sort.  Its code occupies together with its call wrapping 259 bytes.  It also uses 6 bytes of zero page and 2 bytes of any RAM.  The Shell sort requires about 250 bytes for the code and data and it doesn't use stack.  It also uses up to 14 bytes at zero page.  All these 14 bytes were used in the mentioned above measurements. So Quicksort is more than two times faster than Shell Sort.  Its code occupies together with its call wrapping 259 bytes.  It also uses 6 bytes of zero page and 2 bytes of any RAM.  The Shell sort requires about 250 bytes for the code and data and it doesn't use stack.  It also uses up to 14 bytes at zero page.  All these 14 bytes were used in the mentioned above measurements.
 +
 +This version of Quicksort requires almost all stack (about 240 bytes) to sort fast more than 32 KB of data.  The required minimum is about 176 bytes but the stack with the only such minimum may slow down sorting dramatically.  If the stack has less free bytes than this minimum then this may give the meditation, the endless loop.
 +
 +It is possible to reduce the stack load by tail call elimination.  It makes Quicksort slightly faster (only by 3-4%) but reduces the stack load more than 50%.  So 128 free bytes in the stack will make the sort of 60 KB data without delays.
 +
 +<code>
 +i2lo = zp0lo
 +i2hi = zp0hi
 +j2lo = zp2lo
 +j2hi = zp2hi
 +x_lo = m2lo
 +x_hi = m2hi
 +ublo = m3lo
 +ubhi = m3hi
 +lblo = m4lo
 +lbhi = m4hi
 +
 +quicksort0
 +          tsx
 +          cpx #16  ;stack limit
 +          bcs qsok
 +
 +qs_csp    ldx #0
 +          txs
 +
 +quicksort lda #>array+size2-2
 +          sta ubhi
 +          lda #<array+size2-2
 +          sta ublo
 +          lda #>array
 +          sta lbhi
 +          lda #<array
 +          sta lblo
 +          tsx
 +          stx qs_csp+1
 +qsok      lda lblo
 +          sta i2lo
 +          lda lbhi
 +          sta i2hi
 +          ldy ubhi
 +          sty j2hi
 +          lda ublo
 +          sta j2lo
 +          clc        ;this code works only for the even align
 +          adc i2lo
 +          and #$fc
 +          sta zp1lo
 +          tya
 +          adc i2hi
 +          ror
 +          sta zp1hi
 +          ror zp1lo
 +
 +          ldy #0
 +          lda (zp1lo),y
 +          sta x_lo
 +          iny
 +          lda (zp1lo),y
 +          sta x_hi
 +qsloop1   ldy #0     ;compare array[i] and x
 +          lda (i2lo),y
 +          cmp x_lo
 +          iny
 +          lda (i2lo),y
 +          sbc x_hi
 +          bcs qs_l1
 +
 +          lda #2
 +          adc i2lo
 +          sta i2lo
 +          bcc qsloop1
 +
 +          inc i2hi
 +          bne qsloop1 ;=jmp
 +
 +qs_l1     ldy #0    ;compare array[j] and x
 +          lda x_lo
 +          cmp (j2lo),y
 +          iny
 +          lda x_hi
 +          sbc (j2lo),y
 +          bcs qs_l3
 +
 +          lda j2lo
 +          sbc #1
 +          sta j2lo
 +          bcs qs_l1
 +
 +          dec j2hi
 +          bne qs_l1 ;=jmp
 +
 +qs_l3     lda j2lo    ;compare i and j
 +          cmp i2lo
 +          lda j2hi
 +          sbc i2hi
 +          bcc qs_l8
 +
 +qs_l6     lda (j2lo),   ;exchange elements with i and j indices
 +          tax
 +          lda (i2lo),y
 +          sta (j2lo),y
 +          txa
 +          sta (i2lo),y
 +          dey
 +          bpl qs_l6
 +
 +          ;sec
 +          lda #1        ;CY=1
 +          adc i2lo
 +          sta i2lo
 +          bcc *+4
 +          inc i2hi
 +          sec
 +          lda j2lo
 +          sbc #2
 +          sta j2lo
 +          bcs *+4
 +          dec j2hi
 +          ;lda j2lo
 +          cmp i2lo
 +          lda j2hi
 +          sbc i2hi
 +          ;bcc *+5
 +          ;jmp qsloop1
 +          bcs qsloop1
 +
 +qs_l8     lda lblo
 +          cmp j2lo
 +          lda lbhi
 +          sbc j2hi
 +          bcs qs_l5
 +
 +          lda i2hi
 +          pha
 +          lda i2lo
 +          pha
 +          lda ubhi
 +          pha
 +          lda ublo
 +          pha
 +          lda j2hi
 +          sta ubhi
 +          lda j2lo
 +          sta ublo
 +          jsr quicksort0
 +          pla
 +          sta ublo
 +          pla
 +          sta ubhi
 +          pla
 +          sta i2lo
 +          pla
 +          sta i2hi
 +qs_l5     lda i2lo
 +          cmp ublo
 +          lda i2hi
 +          sbc ubhi
 +          bcs qs_l7
 +
 +          lda i2hi
 +          sta lbhi
 +          lda i2lo
 +          sta lblo
 +          jmp qsok
 +qs_l7     rts
 +</code>
 +
 +The locations //m3hi//, //m3lo//, //m4hi//, //m4lo// maybe situated anywhere in RAM.  The invocation just a call to quicksort routine.  However it is required to put the proper constants after //quicksort// label.  This makes the invocation code more complex in the general case.
 +
 +The other published 6502 Quicksort is at [[http://www.vcfed.org/forum/showthread.php?4687-QuickSort-in-6502-assembler|Vintage Computer Federation]]. More information can be found on page [[https://github.com/litwr2/6502-sorting|6502 sorting]]
base/quicksort_16-bit_elements.1471201714.txt.gz · Last modified: 2016-08-14 21:08 by litwr2