base:optimal_sort_16-bit_elements
no way to compare when less than two revisions
Differences
This shows you the differences between two versions of the page.
— | base:optimal_sort_16-bit_elements [2015-04-17 04:33] (current) – created - external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== Optimal Sort for any number of 16-bit elements ====== | ||
+ | by Mats Rosengren | ||
+ | |||
+ | ====== General Discussion ====== | ||
+ | The extension of the " | ||
+ | |||
+ | Consider first the loop using the Y register to sequentially access the elements starting with the second last and ending with the first in a sequence of up to 255 elements (note that the last access here is with 1 in the Y register, i.e. the " | ||
+ | < | ||
+ | L1 DEY | ||
+ | BEQ L3 | ||
+ | LDA (ZPADD),Y | ||
+ | " | ||
+ | BRA L1 | ||
+ | L3 | ||
+ | </ | ||
+ | To access a longer sequence one could use nested loops with the X register for the outer and the Y register for the inner loop as follows: | ||
+ | < | ||
+ | L1 TYA | ||
+ | BNE L2 | ||
+ | TXA | ||
+ | BEQ L3 | ||
+ | DEX | ||
+ | DEC ZPADD+1 | ||
+ | L2 DEY | ||
+ | LDA (ZPADD),Y | ||
+ | " | ||
+ | BRA L1 | ||
+ | L3 | ||
+ | </ | ||
+ | If initially the registers X and Y are given the values x and y and the value y is added to high part of the pointer ZPADD pointing to the first element of the sequence to be sorted the loop will be run through y + 256 * x times starting with the second last and ending with the first in a sequence of 1 + y + 256 * x elements. And when the loop is finished the pointer ZPADD will again point to the first element of the sequence while the X and the Y registers will take the value zero. Note that as here the last access is with 0 in the Y register the " | ||
+ | |||
+ | To work with 16 bit values one has to handle the 8 less significant bits and the 8 more significant bits separately. The comparison is then made such that first the 8 most significant bits are compared. The 8 less significant bits only need to be compared if the 8 more significant bits were equal. The complete routine is then as follows: | ||
+ | |||
+ | < | ||
+ | SORT CLC | ||
+ | LDA ZPADL+1 | ||
+ | ADC NVALH | ||
+ | STA WORK4L | ||
+ | BCS LER ;BAD INPUT VALUES (TOO FAR TO BRANCH DIRECTLY) | ||
+ | LDA ZPADH+1 | ||
+ | ADC NVALH | ||
+ | STA WORK4H | ||
+ | LER BCS L3B ;BAD INPUT VALUES | ||
+ | L0 LDY NVALL ;NUMBER OF DECREMENTS, LOW | ||
+ | LDX NVALH ;NUMBER OF DECREMENTS, HIGH | ||
+ | LDA WORK4L | ||
+ | STA ZPADL+1 | ||
+ | LDA (ZPADL), | ||
+ | STA WORK3L | ||
+ | LDA WORK4H | ||
+ | STA ZPADH+1 | ||
+ | LDA (ZPADH), | ||
+ | STA WORK3H | ||
+ | BRA L2 | ||
+ | L1 | ||
+ | BNE L1A | ||
+ | TXA | ||
+ | BEQ L3 ;EXIT INNER LOOP | ||
+ | DEX | ||
+ | DEC ZPADL+1 | ||
+ | DEC ZPADH+1 | ||
+ | L1A DEY | ||
+ | LDA (ZPADH),Y | ||
+ | CMP WORK2H | ||
+ | BNE L1B | ||
+ | LDA (ZPADL),Y | ||
+ | CMP WORK2L | ||
+ | L1B BCC L1 | ||
+ | L2 LDA (ZPADL),Y | ||
+ | STA WORK2L | ||
+ | LDA (ZPADH),Y | ||
+ | STA WORK2H | ||
+ | STY WORK1L | ||
+ | LDA ZPADL+1 | ||
+ | STA WORK1LH | ||
+ | LDA ZPADH+1 | ||
+ | STA WORK1HH | ||
+ | BRA L1 ;END INNER LOOP | ||
+ | L3 LDY NVALL ;WHERE THE LARGEST VALUE SHALL BE PUT | ||
+ | LDA WORK4L | ||
+ | STA ZPADL+1 | ||
+ | LDA WORK2L | ||
+ | STA (ZPADL),Y | ||
+ | LDA WORK4H | ||
+ | STA ZPADH+1 | ||
+ | LDA WORK2H | ||
+ | STA (ZPADH),Y | ||
+ | LDY WORK1L | ||
+ | LDA WORK1LH | ||
+ | STA ZPADL+1 | ||
+ | LDA WORK3L | ||
+ | STA (ZPADL), | ||
+ | LDA WORK1HH | ||
+ | STA ZPADH+1 | ||
+ | LDA WORK3H | ||
+ | STA (ZPADH), | ||
+ | LDA NVALL | ||
+ | BNE L3A | ||
+ | LDA NVALH | ||
+ | BEQ L3B ;EXIT OUTER LOOP | ||
+ | DEC NVALH | ||
+ | DEC WORK4L | ||
+ | DEC WORK4H | ||
+ | L3A DEC NVALL ;END OF THE SHORTER SEQUENCE STILL LEFT | ||
+ | BRA L0 ; | ||
+ | L3B RTS | ||
+ | </ | ||
+ | The calling program has to set the pointers ZPADL and ZPADH pointing to the start of the (separately stored) sequences of the 8 less and the 8 more significant bits of the values. The total number of values are 1 + y + 256 * x where the value x has to be given to variable NVALL and the value y has to be given to variable NVALH. At the end these variables will both be zero. | ||
+ | |||
+ | In addition the following 9 bytes are needed internally | ||
+ | < | ||
+ | WORK1L | ||
+ | WORK1LH | ||
+ | WORK1HH | ||
+ | WORK2L | ||
+ | WORK2H | ||
+ | WORK3L | ||
+ | WORK3H | ||
+ | WORK4L | ||
+ | WORK4H | ||
+ | </ | ||
+ | The following is a template for a program calling the routine: | ||
+ | < | ||
+ | ; | ||
+ | ZPADL = $30 ;2 BYTE POINTER IN PAGE ZERO TO THE "LESS SIGNIFICANT" | ||
+ | ZPADH = $32 ;2 BYTE POINTER IN PAGE ZERO TO THE "MORE SIGNIFICANT" | ||
+ | NVALL = $34 ;< "START ADDRESS" | ||
+ | NVALH = $35 ;> "START ADDRESS" | ||
+ | ; | ||
+ | ;9 BYTES USED AS WORKING AREA | ||
+ | ; | ||
+ | WORK1L | ||
+ | WORK1LH = $37 ;HIGH INDEX (ADDL+1) FOR POTENTIALLY LARGEST VALUE (LEAST SIGNIFICANT PART) | ||
+ | WORK1HH = $38 ;HIGH INDEX (ADDL+1) FOR POTENTIALLY LARGEST VALUE (MOST SIGNIFICANT PART) | ||
+ | WORK2L | ||
+ | WORK2H | ||
+ | WORK3L | ||
+ | WORK3H | ||
+ | WORK4L | ||
+ | WORK4H | ||
+ | | ||
+ | LDA #< SEQL | ||
+ | STA ZPADL | ||
+ | LDA #> SEQL | ||
+ | STA ZPADL+1 | ||
+ | LDA #< SEQH | ||
+ | STA ZPADH | ||
+ | LDA #> SEQH | ||
+ | STA ZPADH+1 | ||
+ | LDA #$09 | ||
+ | STA NVALL | ||
+ | LDA #$00 | ||
+ | STA NVALH | ||
+ | JSR SORT | ||
+ | BRK | ||
+ | SEQL .BYTE $30 | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | SEQH .BYTE $05 | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | " | ||
+ | </ | ||
+ | |||
+ | To test the efficiency (and correctness) of this algorithm I generated a totally random permutation of the numbers 0 - 9999 using a random number generator. This sequence was successfully sorted using subroutine SORT above. I measured the number of T1H counts of the W65C22S timer 1 during this sorting, it was 4324613. As T1H decrements every 256 PHI2 pulses and my W6502 runs at 8 megahertz this corresponds to 138.4 seconds. | ||
base/optimal_sort_16-bit_elements.txt · Last modified: 2015-04-17 04:33 by 127.0.0.1