Fix RAND() function behavior #363

Open
JanJakes wants to merge 1 commit into trunk from fix-rand-function

@JanJakes
Member

Summary

Replaces the old mt_rand(0, 1) stub — which returned an integer 0 or 1, not a float in [0, 1) as MySQL requires — with MySQL-compatible RAND() behavior.

Replaces #341.

Unseeded RAND()

Compiles to a native SQLite expression:

((RANDOM() & 0x001FFFFFFFFFFFFF) / 9007199254740992.0)

The 53-bit mask matches the 53-bit significand of an IEEE 754 double, so the masked integer converts to a double exactly, and the quotient after division by 2^53 is exact and strictly less than 1.0. This matches MySQL, where unseeded RAND() uses a thread-level state independent of RAND(N).
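
The range claim can be checked outside SQLite. The following is a minimal sketch, assuming 64-bit PHP; the helper name unseeded_rand_sketch is hypothetical and not part of this PR, and random_int() merely stands in for SQLite's RANDOM():

```php
<?php
// Hypothetical helper (not from the PR): mirrors the SQLite expression
// ((RANDOM() & 0x001FFFFFFFFFFFFF) / 9007199254740992.0) in PHP.
// Assumes 64-bit PHP so that the mask fits in a native integer.
function unseeded_rand_sketch( int $random64 ): float {
    // Keep the low 53 bits: a non-negative integer in [0, 2^53 - 1].
    $masked = $random64 & 0x001FFFFFFFFFFFFF;
    // Integers below 2^53 convert to double exactly, and dividing by a
    // power of two only adjusts the exponent, so no rounding occurs.
    return $masked / 9007199254740992.0;
}

// -1 has all 64 bits set, so it yields the largest possible result.
$largest = unseeded_rand_sketch( -1 );
// random_int() stands in for SQLite's RANDOM() here.
$sample = unseeded_rand_sketch( random_int( PHP_INT_MIN, PHP_INT_MAX ) );

var_dump( $largest < 1.0, $sample >= 0.0 && $sample < 1.0 );
```

The worst case is 2^53 - 1 divided by 2^53, which is the largest double below 1.0, so the upper bound stays exclusive.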

Seeded RAND(N)

Routes through a PHP UDF implementing MySQL's exact LCG from my_rnd_init() / my_rnd() (sql/item_func.cc, mysys/my_rnd.cc), bit-exact against MySQL 9.6. Requires 64-bit PHP.
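
For readers unfamiliar with the generator, here is a minimal sketch of the recurrence as I understand it from MySQL's my_rnd_init() / my_rnd(). It is not the code from this PR: the class name MysqlRandSketch is hypothetical, the seed is assumed to be already coerced to an integer, and 64-bit PHP is assumed.

```php
<?php
// Hypothetical sketch, not the PR's UDF. Assumes 64-bit PHP.
final class MysqlRandSketch {
    const MAX_VALUE = 0x3FFFFFFF;

    private $seed1;
    private $seed2;

    public function __construct( int $seed ) {
        // MySQL truncates the seed and both products to uint32. Masking
        // the seed first yields the same low 32 bits while keeping the
        // products far below PHP_INT_MAX.
        $s           = $seed & 0xFFFFFFFF;
        $this->seed1 = ( ( $s * 0x10001 + 55555555 ) & 0xFFFFFFFF ) % self::MAX_VALUE;
        $this->seed2 = ( ( $s * 0x10000001 ) & 0xFFFFFFFF ) % self::MAX_VALUE;
    }

    public function next(): float {
        // Two coupled linear congruential steps modulo MAX_VALUE.
        $this->seed1 = ( $this->seed1 * 3 + $this->seed2 ) % self::MAX_VALUE;
        $this->seed2 = ( $this->seed1 + $this->seed2 + 33 ) % self::MAX_VALUE;
        return $this->seed1 / (float) self::MAX_VALUE;
    }
}

$rng = new MysqlRandSketch( 3 );
printf( "%.14f\n", $rng->next() );
printf( "%.14f\n", $rng->next() );
```

The first two values should agree with the RAND(3) example in the MySQL manual (0.90576975597606 and 0.37307905813035), which is a quick sanity check for any reimplementation.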

Seed handling matches val_int(): NULL becomes 0, floats round to nearest (RAND(3.9) == RAND(4)), numeric strings follow the same path.
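
The coercion rules can be illustrated with a small sketch. The helper name coerce_seed_sketch is hypothetical and this is not the PR's code; the fallback to 0 for non-numeric input is an assumption, and tie-breaking on exact .5 values and partially numeric strings are deliberately not modeled.

```php
<?php
// Hypothetical helper (not from the PR): approximates the seed coercion
// described above. Exact .5 ties and strings such as '12abc' are NOT modeled.
function coerce_seed_sketch( $seed ): int {
    if ( null === $seed ) {
        return 0; // RAND(NULL) behaves like RAND(0).
    }
    if ( is_int( $seed ) ) {
        return $seed;
    }
    if ( is_float( $seed ) || ( is_string( $seed ) && is_numeric( $seed ) ) ) {
        return (int) round( (float) $seed ); // 3.9 and '3.9' both become 4.
    }
    return 0; // Assumption: non-numeric input falls back to 0.
}

var_dump(
    coerce_seed_sketch( null ),
    coerce_seed_sketch( 3.9 ),
    coerce_seed_sketch( '3.9' )
);
```

With this mapping, RAND(3.9) and RAND(4) receive the same integer seed and therefore start the same sequence.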

Known divergences

  • MySQL distinguishes constant vs non-constant seeds at parse time (constant = init once per statement, non-constant = reinit per row). A SQLite UDF can't see the expression, so we approximate by reinitializing only when the seed value changes. This diverges when a non-constant expression yields a stable value.
  • The UDF keeps one LCG state per connection, so multiple RAND(N) call sites in one query share a stream: SELECT RAND(1), RAND(1) returns (v1, v2) here vs (v, v) in MySQL.

Both are documented in the rand() docblock.
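
The shared-stream divergence can be modeled in a few lines. This is a hypothetical sketch rather than the PR's UDF: the class name, the flush() hook (my reading of the per-statement flush mentioned in the test plan), and the integer-only seed are assumptions, and 64-bit PHP is assumed.

```php
<?php
// Hypothetical model of the behavior described above, not the PR's code.
// One generator state is shared by every RAND(N) call on a connection and
// is reinitialized only when the incoming seed value changes.
final class SharedSeededRandSketch {
    const MAX_VALUE = 0x3FFFFFFF;

    private $last_seed = null;
    private $seed1     = 0;
    private $seed2     = 0;

    public function rand( int $seed ): float {
        if ( $seed !== $this->last_seed ) {
            // New seed value: reinitialize the shared state.
            $s               = $seed & 0xFFFFFFFF;
            $this->seed1     = ( ( $s * 0x10001 + 55555555 ) & 0xFFFFFFFF ) % self::MAX_VALUE;
            $this->seed2     = ( ( $s * 0x10000001 ) & 0xFFFFFFFF ) % self::MAX_VALUE;
            $this->last_seed = $seed;
        }
        // Same seed value as last time: just advance the shared stream.
        $this->seed1 = ( $this->seed1 * 3 + $this->seed2 ) % self::MAX_VALUE;
        $this->seed2 = ( $this->seed1 + $this->seed2 + 33 ) % self::MAX_VALUE;
        return $this->seed1 / (float) self::MAX_VALUE;
    }

    // Assumed hook: forget the seed between statements so that the next
    // RAND(N) call reinitializes the generator.
    public function flush(): void {
        $this->last_seed = null;
    }
}

$conn = new SharedSeededRandSketch();
// Models SELECT RAND(1), RAND(1): two call sites, one shared stream.
$v1 = $conn->rand( 1 );
$v2 = $conn->rand( 1 );
var_dump( $v1 === $v2 ); // bool(false): (v1, v2), not (v, v) as in MySQL

$conn->flush();
var_dump( $conn->rand( 1 ) === $v1 ); // bool(true): the next statement restarts
```

Matching MySQL's (v, v) result would need one state per call site, which the UDF cannot distinguish because it only sees argument values.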

Metadata

RAND() column metadata is now reported as DOUBLE / PARAM_STR, removing a long-standing TODO.

Test plan

  • CI green.
  • Bit-exact reference values against MySQL 9.6 (seeds 0, 1, 3, 5, and multi-row sequences).
  • NULL, float, numeric/non-numeric string, and negative seed handling.
  • RAND() in WHERE, UPDATE, INSERT, ORDER BY (LIMIT + seeded deterministic permutation).
  • Per-statement flush contract inside a transaction.

Replace the stub that returned mt_rand(0, 1) — an integer 0 or 1,
not a float in [0, 1) as MySQL requires.

Unseeded RAND() compiles to a native SQLite expression:

    ((RANDOM() & 0x001FFFFFFFFFFFFF) / 9007199254740992.0)

The 53-bit mask matches the IEEE 754 double mantissa, so division
by 2^53 is exact and strictly less than 1.0. Matches MySQL, where
unseeded RAND() uses a thread-level state independent of RAND(N).

Seeded RAND(N) routes through a PHP UDF implementing MySQL's exact
LCG from my_rnd_init()/my_rnd() (sql/item_func.cc, mysys/my_rnd.cc),
bit-exact against MySQL 9.6. Requires 64-bit PHP.

Seed handling matches val_int(): NULL becomes 0, floats round to
nearest (RAND(3.9) == RAND(4)), numeric strings follow the same path.

The UDF keeps a single LCG state per connection, so multiple RAND(N)
call sites in one query share a stream: `SELECT RAND(1), RAND(1)`
returns (v1, v2) here vs (v, v) in MySQL. Documented divergence.

Column metadata is now reported as DOUBLE / PARAM_STR.
@JanJakes requested review from a team and zaerl, and removed the request for a team, on April 17, 2026 11:55
@JanJakes changed the title from "Fix RAND() function behavior" to "Fix RAND() function behavior" on Apr 21, 2026
return mt_rand( 0, 1 );
public function rand( $seed ) {
// Requires 64-bit PHP. Seed * 0x10000001 can exceed PHP_INT_MAX on 32-bit.
$max_value = 0x3FFFFFFF;
Contributor

I really appreciate this defensive polish we have here. Despite the rarity of 32-bit systems, it is always better to handle them.

* The UDF also handles RAND(NULL) as RAND(0), matching MySQL.
*/
if ( 0 === count( $args ) ) {
return '((RANDOM() & 0x001FFFFFFFFFFFFF) / 9007199254740992.0)';
Contributor

Having some of these magic numbers declared at the class level can be useful to clean up the code a little. But feel free to ignore this; the comment above already explains it well.

Unfortunately, PHP has no built-in constant for 2^53 - 1, unlike JavaScript, which exposes it as Number.MAX_SAFE_INTEGER.

Contributor

@zaerl left a comment

"In MySQL, unseeded RAND() uses the thread-level random state"

I didn't know about this; it seems pretty strange to me.

Thanks for the great work here. I've added some comments, but feel free to ignore everything and ship this.
