Fast division for PICs

updated August 10th, 2009

If you went through our fast multiplying, now try the fast division if you dare.

The algorithm that has been applied here belongs to the CORDIC family. Also have a look at our CORDIC square-root function.

Normally division-algorithms follow the way, children are tought to operate. Let's take an example:

16546 is the numerator, 27 the divisor
z: start with the left-most digit:
if 1 < 27 then add the second digit
if 16 < 27 then add the third digit
165 > 27, so integer-divide 165 div 27 = 6
get the remainder, which is 165 - 6 * 27 = 3
now restart at z with the remainder

With RISC-technology, at assembler level, the tests are operated with substractions, checking whether the results are negative, zero or positive. The integer-division is done by successive substractions until the result is negative. A counter then indicates how often substractions were made.

As already pointed out, CORDIC has a very different approach to mathematical operations. The incredible speed of the algorithms are the result from a divide and conquer approach. Practically let's have see how our CORDIC division works:

Suppose you want to integer-divide 87₁₀= 1010111₂ through 6₁₀ = 110₂.

numerator 01010111 base_index := 00000001 = 1

divisor 00000110 result:=0

rotate divisor and base_index until the most significant bits of numerator and divisor are equal:

00001100 00000010 = 2

00011000 00000100 = 4

00110000 00001000 = 8

01100000 00010000 = 16

now substract both numerator and altered divisor:

01010111-

01100000

----------

< 0

if negative -which is the case here- rotate back divisor and base_index one digit to the right:

00110000 00001000 = 8

substract again rotated divisor from numerator:

01010111-

00110000

----------

00100111, positive remainder

now replace the divisor by the remainder:

new numerator:= 00100111

this time add the base_index to result:

result:= result(0) + 8 = 8

now rotate to the right divisor and base_index one digit:

00011000 00000100 = 4

substract again:

00100111-

00011000

----------

00001111, remainder positive, so

new numerator:=00001111
result:=result + base_index = 8+4 = 12
rotate:

00001100 00000010 = 2

substract :

00001111-

00001100

----------

00000011, remainder positive, so

new numerator:=00000011
result:=result + base_index = 12+2 = 14
rotate:

00000110 00000001 = 1

substract :

00000011-

00000110

----------

< 0, so do nothing

stop

Here PIC 16F84 and 628 code:

DIVV8
	MOVF  TEMPY8,F
	BTFSC STATUS,Z ;SKIP IF NON-ZERO
	RETURN
	CLRF  RESULT8
	MOVLW  1
	MOVWF  IDX16
SHIFT_IT8
	BTFSC  TEMPY8,7
	GOTO  DIVU8LOOP
	BCF   STATUS,C
	RLF   IDX16,F
	BCF   STATUS,C
	RLF   TEMPY8,F
	GOTO  SHIFT_IT8

DIVU8LOOP
	MOVF  TEMPY8,W
	SUBWF  TEMPX8
	BTFSC   STATUS,C
	GOTO   COUNT8
	ADDWF   TEMPX8
	GOTO  FINAL8
COUNT8
	MOVF  IDX16,W
	ADDWF  RESULT8
FINAL8
	BCF   STATUS,C
	RRF   TEMPY8,F
	BCF   STATUS,C
	RRF   IDX16,F
	BTFSS   STATUS,C
	GOTO   DIVU8LOOP
	RETURN

SUB16
	MOVF    TEMPY16_H,W
	MOVWF   TEMPYY
	MOVF    TEMPY16,W
	SUBWF   TEMPX16
	BTFSS   STATUS,C
	INCF    TEMPYY,F
	MOVF    TEMPYY,W
	SUBWF   TEMPX16_H
RETURN

ADD16BIS
	MOVF    TEMPY16,W
	ADDWF   TEMPX16
	BTFSC   STATUS,C
	INCF    TEMPX16_H,F
	MOVF    TEMPY16_H,W
	ADDWF   TEMPX16_H
	RETURN

DIVV16
	MOVF TEMPY16,F
	BTFSS  STATUS,Z
	GOTO  ZERO_TEST_SKIPPED
	MOVF TEMPY16_H,F
	BTFSC STATUS,Z
	RETURN

ZERO_TEST_SKIPPED
	MOVLW   1
	MOVWF   IDX16
	CLRF    IDX16_H
	CLRF    RESULT16
	CLRF    RESULT16_H

SHIFT_IT16
	BTFSC   TEMPY16_H,7
	GOTO 	DIVU16LOOP
	BCF     STATUS,C
  	RLF     IDX16,F
  	RLF     IDX16_H,F
  	BCF     STATUS,C
  	RLF     TEMPY16,F
  	RLF     TEMPY16_H,F
  	GOTO    SHIFT_IT16

DIVU16LOOP
	CALL    SUB16
	BTFSC   STATUS,C
	GOTO    COUNTX
	CALL    ADD16BIS
	GOTO    FINALX
COUNTX
	MOVF    IDX16,W
	ADDWF   RESULT16
	BTFSC   STATUS,C
	INCF    RESULT16_H,F
	MOVF    IDX16_H,W
	ADDWF   RESULT16_H
FINALX
	BCF     STATUS,C
	RRF     TEMPY16_H,F
	RRF     TEMPY16,F
	BCF     STATUS,C
	RRF     IDX16_H,F
	RRF     IDX16,F
	BTFSS   STATUS,C
	GOTO    DIVU16LOOP
	RETURN

... somewhere in the code CALL DIVV16

Note that these programs work only for unsigned variables. Worst case for DIVV8 is about 144 cycles, which at 20 MHz is about 30 microseconds. The interest of this algorithm appears more clearly, if larger variables should be used.

Note: the blue marked lines have been corrected, due to a resident bug in this loop. (It should have been a while-do loop instead of a do-while loop. Thanks to Marc Hinault for reporting this bug.

Main Page