Random numbers are a complex and highly debated subject. The routines presented here use a method known as the linear congruential method. While no method of generating random numbers is perfect, the linear congruential method is widely considered to be a reasonable method. Only the basics of this method will be covered. For further information, the book "The Art Of Computer Programming, Volume 2" by Donald Knuth is highly recommended. Half of the book (chapter 3) covers the topic of random numbers. The strengths and weakness of the common random number generators are discussed, and numerous empirical and theoretical tests for randomness are described. In addition, it suggests several guidelines for random number generators. The book is quite mathematically intense, but it covers random numbers thoroughly.
A pseudo-random number generator does not return an unpredictable number, like rolling a die would. The idea is to take a long sequence of numbers and return them, one by one, in an order that seems random. The key is to start at a random point in this sequence. This starting point is called the seed. The idea is to make sequence long enough so that only a fraction of the sequence will be used when generating random numbers. Then, by starting a random place is the sequence, the fraction of the sequence used will be unlikely to overlap from one run to the next.
Note that by starting with the same seed, the same sequence of numbers can be generated repeatedly. This can be a useful feature.
On the 6502, 8-bit or 16-bit random numbers will be the most common, so the generator will return a 32-bit number. Each of the 2^32 possible values will be returned by the generator, in a seemingly random order, of course.
The term "linear congruential" sounds scary, but there are only two steps involved: multplication and addition. To get the new seed (i.e. the next number in the sequence), the old seed is multiplied by a constant, a, called the multiplier, then a second constant, c, is added to the product. The new seed is the lower 32 bits (remember, this generator returns 32-bit numbers) of the result. (Actually, keeping only the lower 32 bits is really a third step, and in fact, it is an extremely important step.) The computation performed is:
seed = a * seed + c
It turns out that the real key to making a linear congruential generator work is selecting a good value for a, the multiplier. The constant, c, is not as critical as the multiplier, but it must be an odd number. In fact, c can be chosen to be one, and this is what the routines here use, since it will simplify the calculation.
Note that an odd number (the old seed) times an odd number (the multiplier, a) is an odd number, and an odd number (the product) plus an odd number (the constant, c) will be an even number. Likewise, an even number times an odd number, plus an odd number will be an odd number. So the seed will alternate between odd and even numbers. This means that the upper bits of the seed must be used for generating 8-bit and 16-bit random numbers.
The first subroutine is called RAND and it performs the seed = a * seed + 1
(remember, c = 1 was chosen) calculation. There are three versions, a fast
version, a short (in terms of number of bytes) version, and a middle version,
which is between the other two versions in terms of space and speed, yet is
reasonably short and reasonably fast.
The multiplier 1664525 (in hex, $19660D) is used in all three versions. This multiplier was chosen from line 16 of the table on p. 106 of "The Art Of Computer Programming, Volume 2". (That table lists various multipliers and their result in the spectral test, a critical theoretical test for measuring randomness.)
The seed is stored in SEED3 (high byte) through SEED0 (low byte). Putting
SEED3 through SEED0 on the zero page will be faster and shorter (i.e. fewer
bytes), but it is not necessary.
The fast version works by using four pages of pre-computed tables. There are four 256-byte tables, which are named T3, T2, T1, T0.
T3,X = byte X of T3 = bits 31-24 of 1664525 * X, (X = 0 to 255)
T2,X = byte X of T2 = bits 23-16 of 1664525 * X, (X = 0 to 255)
T1,X = byte X of T1 = bits 15-8 of 1664525 * X, (X = 0 to 255)
T0,X = byte X of T0 = bits 7-0 of 1664525 * X, (X = 0 to 255)
The product of 1664525 * SEED is:

Since only the lowest 4 bytes are kept, the multiplication is: (for brevity, SEED3 to SEED0 are called S3 to S0)

; Linear congruential pseudo-random number generator
;
; Calculate SEED = 1664525 * SEED + 1
;
; Enter with:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; Returns:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; TMP is overwritten
;
; For maximum speed, locate each table on a page boundary
;
; Assuming that (a) SEED0 to SEED3 and TMP are located on page zero, and (b)
; all four tables start on a page boundary:
;
; Space: 58 bytes for the routine
; 1024 bytes for the tables
; Speed: JSR RAND takes 94 cycles
;
RAND CLC ; compute lower 32 bits of:
LDX SEED0 ; 1664525 * ($100 * SEED1 + SEED0) + 1
LDY SEED1
LDA T0,X
ADC #1
STA SEED0
LDA T1,X
ADC T0,Y
STA SEED1
LDA T2,X
ADC T1,Y
STA TMP
LDA T3,X
ADC T2,Y
TAY ; keep byte 3 in Y for now (for speed)
CLC ; add lower 32 bits of:
LDX SEED2 ; 1664525 * ($10000 * SEED2)
LDA TMP
ADC T0,X
STA SEED2
TYA
ADC T1,X
CLC
LDX SEED3 ; add lower 32 bits of:
ADC T0,X ; 1664525 * ($1000000 * SEED3)
STA SEED3
RTS
;
; Generate T0, T1, T2 and T3 tables
;
; A different multiplier can be used by simply replacing the four bytes
; that are commented below
;
; To speed this routine up (which will make the routine one byte longer):
; 1. Delete the first INX instruction
; 2. Replace LDA Tn-1,X with LDA Tn,X (n = 0 to 3)
; 3. Replace STA Tn,X with STA Tn+1,X (n = 0 to 3)
; 4. Insert CPX #$FF between the INX and BNE GT1
;
GENTBLS LDX #0 ; 1664525 * 0 = 0
STX T0
STX T1
STX T2
STX T3
INX
CLC
GT1 LDA T0-1,X ; add 1664525 to previous entry to get next entry
ADC #$0D ; byte 0 of multiplier
STA T0,X
LDA T1-1,X
ADC #$66 ; byte 1 of multiplier
STA T1,X
LDA T2-1,X
ADC #$19 ; byte 2 of multiplier
STA T2,X
LDA T3-1,X
ADC #$00 ; byte 3 of multiplier
STA T3,X
INX ; note: carry will be clear here
BNE GT1
RTS
In the RAND routine, the ADC #1 instruction can be eliminated by calculating
1664525 * X + 1 and storing the result in tables T3I through T0I. SEED0 will
then use TnI (n = 0 to 3) and SEED1 through SEED3 will use Tn. Since table
T3 was only used by SEED0, it (T3) will no longer be necessary. Table T2
will be identical to T2I, so table T2 can also be eliminated. This means
only 6 pages of tables (T0, T1, T0I, T1I, T2I, and T3I) are needed and that
JSR RAND will take 92 cycles.
The short version is just an ordinary 32-bit * 32-bit multiplication routine.
The DB pseudo-op is called .BYTE on some assemblers. Consult the assembler
documentation for the pseudo-op name it expects.
; Linear congruential pseudo-random number generator
;
; Calculate SEED = 1664525 * SEED + 1
;
; Enter with:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; Returns:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; TMP, TMP+1, TMP+2 and TMP+3 are overwritten
;
; Note that TMP to TMP+3 and RAND6 are high byte first, low byte last
;
; Assuming that (a) SEED0 to SEED3 and TMP+0 to TMP+3 are all located on page
; zero, and (b) none of the branches cross a page boundary:
;
; Space: 53 bytes
; Speed: JSR RAND takes 2744 cycles, on average (1624 to 3864 cycles)
; specifically, JSR RAND takes 1624 + 70 * N cycles, where
; N = number of bits of SEED that are 1
;
RAND LDA #1 ; store 1 in TMP
LDX #3
RAND1 STA TMP,X
LSR
DEX
BPL RAND1
LDY #$20 ; calculate SEED = SEED * RAND4 + TMP
BNE RAND5 ; branch always
RAND2 BCC RAND4 ; branch if a zero was shifted out
CLC ; add multiplier to product
LDX #3
RAND3 LDA TMP,X
ADC RAND4,X
STA TMP,X
DEX
BPL RAND3
RAND4 ROR TMP ; shift result right
ROR TMP+1
ROR TMP+2
ROR TMP+3
RAND5 ROR SEED3 ; shift out old seed, and shift in new seed
ROR SEED2
ROR SEED1
ROR SEED0
DEY
BPL RAND2
RTS
RAND6 DB $00,$19,$66,$0D ; multiplier (high byte first!)
Since the carry is known to be set after the BCC RAND4, the CLC can be
eliminated by adding $0019660C instead of $0019660D, saving one byte.
The routine will take one fewer byte if RAND6 (the multiplier) is located on
the zero page. However, since the zero page is usually RAM, RAND6 would have
to be initialized somehow, which would likely nullify the one byte savings.
The middle version is similar to the typical 6502 non-looping multiply-by-10 routines, which are tailored to multiply by a specific number. Unfortunately, this routine must be rewritten if a different multiplier is to be used. The idea to add seed, $100 * seed, $10000 * seed, and/or $1000000 * seed when necessary (as determined by the multiplier), then shift seed (left), and repeat. Specifically,
Bit 0 of $0D (multiplier byte 0) is 1
Bit 0 of $66 (multiplier byte 1) is 0
Bit 0 of $19 (multiplier byte 2) is 1
Bit 0 of $00 (multiplier byte 3) is 0
so, seed is added to $10000 * seed, and seed is shifted left (multiplied by 2)
Bit 1 of $0D is 0
Bit 1 of $66 is 1
Bit 1 of $19 is 0
Bit 1 of $00 is 0
so, $100 * seed is added to the previous result, and seed is shifted left
Bit 2 of $0D is 1
Bit 2 of $66 is 1
Bit 2 of $19 is 0
Bit 2 of $00 is 0
so $100 * seed + seed is added to the previous result, and so on.
; Linear congruential pseudo-random number generator
;
; Calculate SEED = 1664525 * SEED + 1
;
; Enter with:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; Returns:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; TMP, TMP+1, TMP+2 and TMP+3 are overwritten
;
; Assuming that (a) SEED0 to SEED3 and TMP+0 to TMP+3 are all located on page
; zero, and (b) none of the branches cross a page boundary:
;
; Space: 106 bytes
; Speed: JSR RAND takes 517 cycles
;
RAND CLC ; copy SEED into TMP
LDA SEED0 ; and compute SEED = SEED * $10000 + SEED + 1
STA TMP
ADC #1
STA SEED0
LDA SEED1
STA TMP+1
ADC #0
STA SEED1
LDA SEED2
STA TMP+2
ADC TMP
STA SEED2
LDA SEED3
STA TMP+3
ADC TMP+1
STA SEED3
;
; Bit 7 of $00, $19, $66, and $0D is 0, so only 6 shifts are necessary
;
LDY #5
RAND1 ASL TMP ; shift TMP (old seed) left
ROL TMP+1
ROL TMP+2
ROL TMP+3
;
; Get X from the RAND4 table. When:
;
; X = $00, SEED = SEED + $10000 * TMP
; X = $01, SEED = SEED + $100 * TMP
; X = $FE, SEED = SEED + $10000 * TMP + TMP
; X = $FF, SEED = SEED + $100 * TMP + TMP
;
LDX RAND4,Y
BPL RAND2 ; branch if X = $00 or X = $01
CLC ; SEED = SEED + TMP
LDA SEED0
ADC TMP
STA SEED0
LDA SEED1
ADC TMP+1
STA SEED1
LDA SEED2
ADC TMP+2
STA SEED2
LDA SEED3
ADC TMP+3
STA SEED3
INX ; $FE -> $00, $FF -> $01
INX
RAND2 CLC
BEQ RAND3 ; if X = $00, SEED = SEED + TMP * $10000
LDA SEED1 ; SEED = SEED + TMP * $100
ADC TMP
STA SEED1
RAND3 LDA SEED2
ADC TMP,X
STA SEED2
LDA SEED3
ADC TMP+1,X
STA SEED3
DEY
BPL RAND1
RTS
RAND4 DB $01,$01,$00,$FE,$FF,$01
6 cycles (and 1 byte) can be saved by locating the RAND4 table on the zero
page. Since the zero page is usually RAM, RAND4 would have to be
initialized.
Here is an example of the middle routine rewritten for the multiplier 69069 ($10DCD). This multiplier was taken from line 15 of the table on p. 106 of "The Art Of Computer Programming, Volume 2". (According to that table, this multiplier does not do as well on the spectral test as the multiplier 1664525.)
; Linear congruential pseudo-random number generator
;
; Calculate SEED = SEED * 69069 + 1
;
; Enter with:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; Returns:
;
; SEED0 = byte 0 of seed
; SEED1 = byte 1 of seed
; SEED2 = byte 2 of seed
; SEED3 = byte 3 of seed
;
; TMP, TMP+1, TMP+2 and TMP+3 are overwritten
;
; Assuming that SEED0 to SEED3 and TMP+0 to TMP+3 are all located on page
; zero:
;
; Space: 173 bytes
; Speed: JSR RAND takes 326 cycles
;
RAND LDA SEED0 ; TMP = SEED * 2
ASL
STA TMP
LDA SEED1
ROL
STA TMP+1
LDA SEED2
ROL
STA TMP+2
LDA SEED3
ROL
STA TMP+3
CLC ; TMP = TMP + SEED (= SEED * 3)
LDA SEED0
ADC TMP
STA TMP
LDA SEED1
ADC TMP+1
STA TMP+1
LDA SEED2
ADC TMP+2
STA TMP+2
LDA SEED3
ADC TMP+3
STA TMP+3
CLC ; SEED = SEED + $10000 * SEED
LDA SEED2
ADC SEED0
TAX ; keep byte 2 in X for now (for speed)
LDA SEED3
ADC SEED1
TAY ; keep byte 3 in Y for now
CLC ; SEED = SEED + $100 * SEED
LDA SEED1
ADC SEED0
PHA ; push byte 1 onto stack
TXA
ADC SEED1
TAX
TYA
ADC SEED2
TAY
LDA TMP ; TMP = TMP * 4 (= old seed * $0C)
ASL
ROL TMP+1
ROL TMP+2
ROL TMP+3
ASL
ROL TMP+1
ROL TMP+2
ROL TMP+3
STA TMP
CLC ; SEED = SEED + TMP
ADC SEED0
STA SEED0
PLA ; pull byte 1 from stack
ADC TMP+1
STA SEED1
TXA
ADC TMP+2
TAX
TYA
ADC TMP+3
TAY
CLC
LDA TMP ; SEED = SEED + TMP * $100
ADC SEED1
STA SEED1
TXA
ADC TMP+1
TAX
TYA
ADC TMP+2
TAY
LDA TMP ; TMP = TMP * $10 (= old seed * $C0)
ASL ; put byte 0 of TMP in the accumulator
ROL TMP+1
ROL TMP+2
ROL TMP+3
ASL
ROL TMP+1
ROL TMP+2
ROL TMP+3
ASL
ROL TMP+1
ROL TMP+2
ROL TMP+3
ASL
ROL TMP+1
ROL TMP+2
ROL TMP+3
SEC ; SEED = SEED + TMP + 1
ADC SEED0
STA SEED0
LDA TMP+1
ADC SEED1
STA SEED1
TXA
ADC TMP+2
STA SEED2
TYA
ADC TMP+3
STA SEED3
RTS
Having selected a RAND routine, the next step is to turn the new seed into a
(pseudo-)random number. For a random number between 0 and 255 ($FF)
inclusive, simply use SEED3. For a random number between 0 and 65535
($FFFF), use SEED2 for the low byte and SEED3 for the high byte. For a
random number between 0 and 1, use bit 7 of SEED3. For a random number
between 0 and 3, use bits 7 through 6 of SEED3. For a random number between
0 and 7, use bits 7 through 5 of SEED3, and so on. Any random number between
0 and one less than a power of 2 is easily obtained, but what about a random
number between 0 and 5, say? The solution is to multiply SEED by 6 (5 plus
one) and keep the upper 8 bits of the 40-bit product. In this case, 6 is
known as the modulus.
The result is the RANDOM subroutine, which returns a random number, RND,
where 0 <= RND < MOD, and MOD is the modulus. There are two versions of
RANDOM, one for 8-bit RND and MOD, and one for 16-bit RND and MOD. The 8-bit
version, called RANDOM8, is up first, and it uses the number in the
accumulator as the modulus, and returns the random number in the accumulator.
; Linear congruential pseudo-random number generator
;
; Get the next SEED and obtain an 8-bit random number from it
;
; Requires the RAND subroutine
;
; Enter with:
;
; accumulator = modulus
;
; Exit with:
;
; accumulator = random number, 0 <= accumulator < modulus
;
; MOD, TMP, TMP+1, and TMP+2 are overwritten
;
; Note that TMP to TMP+2 are only used after RAND is called.
;
RANDOM8 STA MOD ; store modulus in MOD
JSR RAND ; get next seed
LDA #0 ; multiply SEED by MOD
STA TMP+2
STA TMP+1
STA TMP
SEC
ROR MOD ; shift out modulus, shifting in a 1 (will loop 8 times)
R8A BCC R8B ; branch if a zero was shifted out
CLC ; add SEED, keep upper 8 bits of product in accumulator
TAX
LDA TMP
ADC SEED0
STA TMP
LDA TMP+1
ADC SEED1
STA TMP+1
LDA TMP+2
ADC SEED2
STA TMP+2
TXA
ADC SEED3
R8B ROR ; shift product right
ROR TMP+2
ROR TMP+1
ROR TMP
LSR MOD ; loop until all 8 bits of MOD have been shifted out
BNE R8A
RTS
Here is the 16-bit version of RANDOM, called RANDOM16.
; Linear congruential pseudo-random number generator
;
; Get the next SEED and obtain an 16-bit random number from it
;
; Requires the RAND subroutine
;
; Enter with:
;
; MOD = modulus
;
; Exit with:
;
; RND = random number, 0 <= RND < MOD
;
; TMP is overwritten, but only after RND is called.
;
RANDOM16 JSR RAND ; get next seed
LDA #0 ; multiply SEED by MOD
STA RND+1
STA RND
STA TMP
LDY #16
R16A LSR MOD+1 ; shift out modulus
ROR MOD
BCC R16B ; branch if a zero was shifted out
CLC ; add SEED, keep upper 16 bits of product in RND
ADC SEED0
TAX
LDA TMP
ADC SEED1
STA TMP
LDA RND
ADC SEED2
STA RND
LDA RND+1
ADC SEED3
STA RND+1
TXA
R16B ROR RND+1 ; shift product right
ROR RND
ROR TMP
ROR
DEY
BNE R16A
RTS
There are 2^32 possible values for SEED. When using a modulus that is a
power of 2, each possible value (of RND) returned by RANDOM is equally
likely. However, when modulus of 6 is used, for example, some of the
possible values are a teensy (a very, very teensy) bit more likely than
others, since 2^32 is not a multiple of 6. But 2^32 - 4 is a multiple of 6,
so one solution is to reject four possible values of seed by simply calling
RAND again to get the next seed when one of those four possibilities is
encountered. Since only four values out of 2^32 get rejected, it is highly
unlikely that two consecutive seed values will be rejected, so at most, RAND
will be called twice. The right sequence can ensure that this is so. The
question is, which four values should be rejected?
To answer this question, a simpler example is in order, specifically a 4-bit seed, and a modulus of 7. Since 16-2 is multiple of 7, two values must be rejected. Since there are only 16 possible seed values, all 16 possibilites will be listed, along with the corresponding seed * modulus product.
seed seed * 7 ---- -------- 0000 000 0000 0001 000 0111 0010 000 1110 0011 001 0101 0100 001 1100 0101 010 0011 0110 010 1010 0111 011 0001 1000 011 1000 1001 011 1111 1010 100 0110 1011 100 1101 1100 101 0100 1101 101 1011 1110 110 0010 1111 110 1001
The random numbers (the upper 3 bits of seed * modulus) 0 and 3 occur three times, and the rest occur twice. Notice the lower 4 bits of seed * modulus. When the lower 4 bits are less than 2 (the number of seed values to reject), the upper 3 bits are 000 or 011. Also, when the lower 4 bits are greater than or equal to 16-2, the upper three bits are 000 or 011. This means than there are two ways to determine whether to get the next seed (a) if the lower bits of seed * modulus are less than the number of seed values to reject, or (b) if the lower bits are greater than or equal to 16 minus the number of seed values to reject.
The latter method will be used because it can be implemented little faster. For the 32-bit case, this means checking when the lower 32 bits of the seed * modulus product is less than 2^32 minus the number of seed values to reject. This will be checked by adding the number of seed values to reject to the lower 32 bits of the product and looking for a carry from the 32-bit addition.
The only remaining question is, how many seed values should be rejected? This can be calculated by dividing 2^32 by the modulus. The remainder from this division is the number of seed values to reject.
The subroutine that does all of this is called URANDOM, because it returns a
random number with a uniform distribution. Like RANDOM, it returns a random
number, RND, where 0 <= RND < MOD. There are also two versions, one for
8-bit RND and MOD, and one for 16-bit RND and MOD. The 8-bit version, called
URANDOM8, is once again up first, and it uses the number in the accumulator
as the modulus, and returns the random number in the accumulator.
; Linear congruential pseudo-random number generator
;
; Get the next SEED and obtain an 8-bit uniform random number from it
;
; Requires the RAND subroutine
;
; Enter with:
;
; accumulator = modulus
;
; Exit with:
;
; accumulator = random number, 0 <= accumulator < modulus
;
; MOD, REM, TMP, TMP+1, TMP+2, and TMP+3 are overwritten
;
; Note that TMP to TMP+3 are only used after RAND is called.
;
URANDOM8 STA MOD ; store modulus in MOD
LDX #32 ; calculate remainder of 2^32 / MOD
LDA #1
BNE UR8B
UR8A ASL ; shift dividend left
BCS UR8C ; branch if a one was shifted out
UR8B CMP MOD
BCC UR8D ; branch if partial dividend < MOD
UR8C SBC MOD ; subtract MOD from partial dividend
UR8D DEX
BPL UR8A
STA REM ; store remainder in REM
UR8E JSR RAND
LDA #0 ; multiply SEED by MOD
STA TMP+3
STA TMP+2
STA TMP+1
STA TMP
LDY MOD ; save MOD in Y
SEC
ROR MOD ; shift out modulus, shifting in a 1 (will loop 8 times)
UR8F BCC UR8G ; branch if a zero was shifted out
CLC ; add SEED to TMP
TAX
LDA TMP
ADC SEED0
STA TMP
LDA TMP+1
ADC SEED1
STA TMP+1
LDA TMP+2
ADC SEED2
STA TMP+2
LDA TMP+3
ADC SEED3
STA TMP+3
TXA
UR8G ROR TMP+3 ; shift product right
ROR TMP+2
ROR TMP+1
ROR TMP
ROR
LSR MOD ; loop until all 8 bits of MOD have been shifted out
BNE UR8F
CLC ; add remainder to product
ADC REM
BCC UR8H ; branch if no 8-bit carry
INC TMP ; carry a one to byte 1 of product
BNE UR8H ; branch if no 16-bit carry
INC TMP+1 ; carry a one to byte 2 of product
BNE UR8H ; branch if no 24-bit carry
INC TMP+2 ; carry a one to byte 3 of product
STY MOD ; restore MOD (does not affect Z flag!)
BEQ UR8E ; branch if 32-bit carry
UR8H LDA TMP+3 ; return upper 8 bits of product in accumulator
RTS
Here is the 16-bit version of URANDOM, called URANDOM16.
; Linear congruential pseudo-random number generator
;
; Get the next SEED and obtain an 16-bit random number from it
;
; Requires the RAND subroutine
;
; Enter with:
;
; MOD = modulus
;
; Exit with:
;
; RND = random number, 0 <= RND < MOD
;
; REMH, REML, TEMP, TMP, TMP+1, TMP+2, and TMP+3 are overwritten
;
; Note that TMP to TMP+3 are only used after RAND is called.
;
URANDOM16
LDX #32 ; calculate 2^32 / MOD
LDA #0
STA REMH
SEC ; prepare to shift in a 1
UR16A ROL ; shift dividend left
ROL REMH
BCS UR16B ; branch if a one was shifted out
TAY ; compare partial dividend to MOD
CMP MOD
LDA REMH
SBC MOD+1
TYA
BCC UR16C ; branch if partial dividend < MOD
UR16B SBC MOD ; subtract MOD from paritial dividend (computes remainder)
TAY
LDA REMH
SBC MOD+1
STA REMH
TYA
CLC
UR16C DEX
BPL UR16A
STA REML ; store low byte of remainder in REM
LDA MOD+1 ; save high byte of MOD in TEMP
STA TEMP
UR16D JSR RAND
LDA MOD ; save low byte of MOD in TMP+3
STA TMP+3
LDA #0 ; multiply SEED by MOD
STA RND+1
STA RND
STA TMP+2
STA TMP+1
LDY #16
UR16E LSR MOD+1 ; shift out modulus
ROR MOD
BCC UR16F ; branch if a zero was shifted out
CLC ; add SEED, keep upper 16 bits of product in RND
TAX
LDA TMP+1
ADC SEED0
STA TMP+1
LDA TMP+2
ADC SEED1
STA TMP+2
LDA RND
ADC SEED2
STA RND
LDA RND+1
ADC SEED3
STA RND+1
TXA
UR16F ROR RND+1 ; shift product right
ROR RND
ROR TMP+2
ROR TMP+1
ROR TMP
ROR
DEY
BNE UR16E
CLC ; add remainder to product
ADC REML
LDA TMP
ADC REMH
BCC UR16G ; branch if no 16-bit carry
INC TMP+1 ; carry a one to byte 2 of product
BNE UR16G ; branch if no 24-bit carry
LDA TMP+3 ; restore MOD
STA MOD
LDA TEMP
STA MOD+1
INC TMP+2 ; carry a one to byte 3 of product
BEQ UR16D ; branch if 32-bit carry
UR16G RTS
URANDOM16 could be sped up slightly by replacing the SEC instruction with a LDA #1 instruction followed by a branch (BNE) to the first TAY instruction. Then, the ROL at UR16A can be changed to an ASL, and the CLC just before UR16C can be deleted.
Finally, here is a routine that allows the RAND routine to be accessed via the USR function in EhBASIC. Rbyte4 through RByte1 (the seed for the RND function) could be used for SEED3 through SEED0 as follows:
SEED0 = Rbyte4
SEED3 = SEED0+1
SEED2 = SEED0+2
SEED1 = SEED0+3
Note that Rbyte4 through Rbyte1 are not in order in memory! The downside to
using Rbyte is if RND and USR are both used. The RND seed (Rbyte) cannot be
all zeros, or RND will always return 0, but an all zero seed is perfectly
fine for a linear congruential generator. (EhBASIC cleverly avoids an all
zeros seed by setting the seed only on a non-zero argument to RND.) SEED3
through SEED0 could use unused zero page locations (such as $DE to $E1) or
other unused RAM elsewhere instead.
FAC1_1 is used for TMP, since it and the next 3 locations (FAC1_2, FAC1_3,
and FAC1_s) are overwritten after the JSR RAND anyway. FAC1_e could also
be used since the next 3 locations are FAC1_1, FAC1_2, and FAC1_3.
Since the usage of USR is identical to that of RND, RND can simply be
replaced by USR. Remember, before using USR, the statement DOKE 11,address
(where address is the address of the routine below) must be used to set
the USR vector.
; USR-based linear congruential pseudo-random number generator
;
; Usage is identical to RND
;
; Requires the RAND subroutine
;
TMP = FAC1_1
LDA FAC1_e ; get FAC1 exponent
BEQ USRNG_1 ; do next random # if zero
LDX #<SEED0 ; set PRNG pointer low byte
LDY #>SEED0 ; set PRNG pointer high byte
JSR LAB_2778 ; pack FAC1 into (XY)
USRNG_1
JSR RAND ; get the next random number
LDX #$02 ; three bytes to copy
USRNG_2
LDA SEED3,X ; get PRNG byte
STA FAC1_1,X ; save FAC1 byte
DEX
BPL USRNG_2 ; loop if not complete
LDA SEED0 ; set the rounding byte
STA FAC1_r
LDA #$80 ; set the exponent
STA FAC1_e ; save FAC1 exponent
ASL ; clear A
STA FAC1_s ; save FAC1 sign
JMP LAB_24D5 ; normalize FAC1 & return
Last page update: January 30th, 2005.