Fast division for PICs
updated August 10th, 2009
If you went through our fast multiplying, now try the fast division if you dare.
The algorithm that has been applied here belongs to the CORDIC family. Also have a look at our CORDIC square-root function.
Normally division-algorithms follow the way, children are tought to operate. Let's take an example:
|
With RISC-technology, at assembler level, the tests are operated with substractions, checking whether the results are negative, zero or positive. The integer-division is done by successive substractions until the result is negative. A counter then indicates how often substractions were made.
As already pointed out, CORDIC has a very different approach to mathematical operations. The incredible speed of the algorithms are the result from a divide and conquer approach. Practically let's have see how our CORDIC division works:
Suppose you want to integer-divide 8710 = 10101112 through 610 = 1102.
numerator 01010111 base_index := 00000001 = 1 divisor 00000110 result:=0
00001100 00000010 = 2 00011000 00000100 = 4 00110000 00001000 = 8 01100000 00010000 = 16
01010111- 01100000 ---------- < 0
00110000 00001000 = 8
01010111- 00110000 ---------- 00100111, positive remainder
new numerator:= 00100111
result:= result(0) + 8 = 8
00011000 00000100 = 4
00100111- 00011000 ---------- 00001111, remainder positive, so
00001100 00000010 = 2
00001111- 00001100 ---------- 00000011, remainder positive, so
00000110 00000001 = 1
00000011- 00000110 ---------- < 0, so do nothing
|
Here PIC 16F84 and 628 code:
DIVV8 MOVF TEMPY8,F BTFSC STATUS,Z ;SKIP IF NON-ZERO RETURN CLRF RESULT8 MOVLW 1 MOVWF IDX16 SHIFT_IT8 BTFSC TEMPY8,7 GOTO DIVU8LOOP BCF STATUS,C RLF IDX16,F BCF STATUS,C RLF TEMPY8,F GOTO SHIFT_IT8 DIVU8LOOP MOVF TEMPY8,W SUBWF TEMPX8 BTFSC STATUS,C GOTO COUNT8 ADDWF TEMPX8 GOTO FINAL8 COUNT8 MOVF IDX16,W ADDWF RESULT8 FINAL8 BCF STATUS,C RRF TEMPY8,F BCF STATUS,C RRF IDX16,F BTFSS STATUS,C GOTO DIVU8LOOP RETURN |
SUB16 MOVF TEMPY16_H,W MOVWF TEMPYY MOVF TEMPY16,W SUBWF TEMPX16 BTFSS STATUS,C INCF TEMPYY,F MOVF TEMPYY,W SUBWF TEMPX16_H RETURN ADD16BIS MOVF TEMPY16,W ADDWF TEMPX16 BTFSC STATUS,C INCF TEMPX16_H,F MOVF TEMPY16_H,W ADDWF TEMPX16_H RETURN DIVV16 MOVF TEMPY16,F BTFSS STATUS,Z GOTO ZERO_TEST_SKIPPED MOVF TEMPY16_H,F BTFSC STATUS,Z RETURN ZERO_TEST_SKIPPED MOVLW 1 MOVWF IDX16 CLRF IDX16_H CLRF RESULT16 CLRF RESULT16_H SHIFT_IT16 BTFSC TEMPY16_H,7 GOTO DIVU16LOOP BCF STATUS,C RLF IDX16,F RLF IDX16_H,F BCF STATUS,C RLF TEMPY16,F RLF TEMPY16_H,F GOTO SHIFT_IT16 DIVU16LOOP CALL SUB16 BTFSC STATUS,C GOTO COUNTX CALL ADD16BIS GOTO FINALX COUNTX MOVF IDX16,W ADDWF RESULT16 BTFSC STATUS,C INCF RESULT16_H,F MOVF IDX16_H,W ADDWF RESULT16_H FINALX BCF STATUS,C RRF TEMPY16_H,F RRF TEMPY16,F BCF STATUS,C RRF IDX16_H,F RRF IDX16,F BTFSS STATUS,C GOTO DIVU16LOOP RETURN... somewhere in the code CALL DIVV16 |
Note that these programs work only for unsigned variables. Worst case for DIVV8 is about 144 cycles, which at 20 MHz is about 30 microseconds. The interest of this algorithm appears more clearly, if larger variables should be used.
Note: the blue marked lines have been corrected, due to a resident bug in this loop. (It should have been a while-do loop instead of a do-while loop. Thanks to Marc Hinault for reporting this bug.