Skip to content

williamwutq/bstack

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

bstack

A persistent, fsync-durable binary stack backed by a single file.

push and pop perform a durable sync before returning, so data survives a process crash or unclean shutdown. On macOS, fcntl(F_FULLFSYNC) is used instead of fdatasync to flush the drive's hardware write cache, which plain fdatasync does not guarantee.

Crates.io Docs.rs License: MIT

A 16-byte file header stores a magic number and a committed-length sentinel. On reopen, any mismatch between the header and the actual file size is repaired automatically — no user intervention required.

On Unix, open acquires an exclusive advisory flock; on Windows, LockFileEx is used instead. Both prevent two processes from concurrently corrupting the same stack file.

Minimal dependencies (libc on Unix, windows-sys on Windows). No unsafe beyond required FFI calls.

Warning: bstack files must only be opened through this crate or a compatible implementation that understands the file format, header protocol, and locking semantics. Reading or writing the file with raw tools (dd, xxd, custom open(2) calls, etc.) while a BStack instance is live, or manually editing the header fields, can silently corrupt the committed-length sentinel or bypass the advisory lock. The authors make no guarantees about the behaviour of the crate — including freedom from data loss or logical corruption — when the file has been accessed outside of this crate's controlled interface.


Quick start

use bstack::BStack;

let stack = BStack::open("log.bin")?;

// push appends bytes and returns the starting logical offset.
let off0 = stack.push(b"hello")?;  // 0
let off1 = stack.push(b"world")?;  // 5

assert_eq!(stack.len()?, 10);

// peek reads from a logical offset to the end.
assert_eq!(stack.peek(off1)?, b"world");

// get reads an arbitrary half-open logical byte range.
assert_eq!(stack.get(3, 8)?, b"lowor");

// pop removes bytes from the tail and returns them.
assert_eq!(stack.pop(5)?, b"world");
assert_eq!(stack.len()?, 5);

API

impl BStack {
    /// Open or create a stack file at `path`.
    /// Acquires an exclusive flock on Unix, or LockFileEx on Windows.
    /// Validates the header and performs crash recovery on existing files.
    pub fn open(path: impl AsRef<Path>) -> io::Result<Self>;

    /// Append `data` and durable-sync.  Returns the starting logical offset.
    /// An empty slice is valid and a no-op on disk.
    pub fn push(&self, data: &[u8]) -> io::Result<u64>;

    /// Remove and return the last `n` bytes, then durable-sync.
    /// `n = 0` is valid.  Errors if `n` exceeds the current payload size.
    pub fn pop(&self, n: u64) -> io::Result<Vec<u8>>;

    /// Remove the last `buf.len()` bytes and write them into `buf`, then durable-sync.
    /// An empty buffer is a valid no-op.  Errors if `buf.len()` exceeds the current payload size.
    /// Prefer this over `pop` when a buffer is already available to avoid an extra allocation.
    pub fn pop_into(&self, buf: &mut [u8]) -> io::Result<()>;

    /// Overwrite `data` bytes in place starting at logical `offset`.
    /// Never changes the file size; errors if the write would exceed the
    /// current payload.  Requires the `set` feature.
    #[cfg(feature = "set")]
    pub fn set(&self, offset: u64, data: &[u8]) -> io::Result<()>;

    /// Copy all bytes from `offset` to the end of the payload.
    /// `offset == len()` returns an empty Vec.
    pub fn peek(&self, offset: u64) -> io::Result<Vec<u8>>;

    /// Fill `buf` with exactly `buf.len()` bytes starting at logical `offset`.
    /// An empty buffer is a valid no-op.
    /// Prefer this over `peek` when a buffer is already available to avoid an extra allocation.
    pub fn peek_into(&self, offset: u64, buf: &mut [u8]) -> io::Result<()>;

    /// Copy bytes in the half-open logical range `[start, end)`.
    /// `start == end` returns an empty Vec.
    pub fn get(&self, start: u64, end: u64) -> io::Result<Vec<u8>>;

    /// Fill `buf` with bytes from the half-open logical range `[start, start + buf.len())`.
    /// An empty buffer is a valid no-op.
    /// Prefer this over `get` when a buffer is already available to avoid an extra allocation.
    pub fn get_into(&self, start: u64, buf: &mut [u8]) -> io::Result<()>;

    /// Current payload size in bytes (excludes the 16-byte header).
    pub fn len(&self) -> io::Result<u64>;

    /// Create a `BStackReader` positioned at the start of the payload.
    pub fn reader(&self) -> BStackReader<'_>;

    /// Create a `BStackReader` positioned at `offset` bytes into the payload.
    pub fn reader_at(&self, offset: u64) -> BStackReader<'_>;
}

// BStack and &BStack both implement std::io::Write (each write = one push + durable_sync).
// BStackReader implements std::io::Read + std::io::Seek + From<&BStack>.

Standard I/O adapters

Writing — impl Write for BStack / impl Write for &BStack

BStack implements [std::io::Write]. Each call to write is forwarded to push, so every write is atomically appended and durably synced before returning. flush is a no-op.

&BStack also implements Write (mirroring impl Write for &File), which lets you pass a shared reference wherever a writer is expected.

use std::io::Write;
use bstack::BStack;

let mut stack = BStack::open("log.bin")?;

// write / write_all forward to push.
stack.write_all(b"hello")?;
stack.write_all(b"world")?;
assert_eq!(stack.len()?, 10);

// io::copy works out of the box.
let mut src = std::io::Cursor::new(b"more data");
std::io::copy(&mut src, &mut stack)?;

Wrapping in BufWriter batches small writes into fewer push calls (and fewer durable_sync calls):

use std::io::{BufWriter, Write};
use bstack::BStack;

let stack = BStack::open("log.bin")?;
let mut bw = BufWriter::new(&stack);
for chunk in chunks {
    bw.write_all(chunk)?;
}
bw.flush()?; // one push + one durable_sync for the whole batch

Note: Each raw write call issues one durable_sync. If you call write or write_all in a tight loop, prefer push directly or use BufWriter to batch.

Reading — BStackReader

[BStackReader] wraps a &BStack with a cursor and implements [std::io::Read] and [std::io::Seek].

use std::io::{Read, Seek, SeekFrom};
use bstack::{BStack, BStackReader};

let stack = BStack::open("log.bin")?;
stack.push(b"hello world")?;

// From the beginning:
let mut reader = stack.reader();

// From an arbitrary offset:
let mut mid = stack.reader_at(6);

// From<&BStack> is also implemented:
let mut r = BStackReader::from(&stack);

let mut buf = [0u8; 5];
reader.read_exact(&mut buf)?;  // b"hello"

reader.seek(SeekFrom::Start(6))?;
reader.read_exact(&mut buf)?;  // b"world"

// read_to_end, BufReader, etc. all work.
let mut out = Vec::new();
stack.reader().read_to_end(&mut out)?;

BStackReader borrows the stack immutably, so multiple readers can coexist and run concurrently with each other and with peek/get calls.


Feature flags

set

BStack::set(offset, data) — in-place overwrite of existing payload bytes without changing the file size or the committed-length header.

[dependencies]
bstack = { version = "0.1", features = ["set"] }

File format

┌────────────────────────┬──────────────┬──────────────┐
│      header (16 B)     │  payload 0   │  payload 1   │  ...
│  magic[8] | clen[8 LE] │              │              │
└────────────────────────┴──────────────┴──────────────┘
^                        ^              ^              ^
file offset 0        offset 16       16+n0          EOF
  • magic — 8 bytes: BSTK + major(1 B) + minor(1 B) + patch(1 B) + reserved(1 B). This version writes BSTK\x00\x01\x02\x00 (0.1.2). open accepts any 0.1.x file (first 6 bytes BSTK\x00\x01) and rejects a different major or minor as incompatible.
  • clen — little-endian u64 recording the last successfully committed payload length. Updated on every push and pop before the durable sync.

All user-visible offsets (returned by push, accepted by peek/get) are logical — 0-based from the start of the payload region (file byte 16).


Durability

Operation Sequence
push lseek(END)write(data)lseek(8)write(clen) → sync
pop, pop_into lseekreadftruncatelseek(8)write(clen) → sync
set (feature) lseek(offset)write(data) → sync
peek, peek_into, get, get_into pread(2) on Unix; ReadFile+OVERLAPPED on Windows; lseekread elsewhere

durable_sync on macOS issues fcntl(F_FULLFSYNC). Unlike fdatasync, this flushes the drive controller's write cache, providing the same "barrier to stable media" guarantee that fsync gives on Linux. Falls back to sync_data if the device does not support F_FULLFSYNC.

durable_sync on Linux / other Unix calls sync_data (fdatasync).

durable_sync on Windows calls sync_data, which maps to FlushFileBuffers. This flushes the kernel write-back cache and waits for the drive to acknowledge, providing equivalent durability to fdatasync.

Push rollback: if the write or sync fails, a best-effort ftruncate and header reset restore the pre-push state.


Crash recovery

The committed-length sentinel in the header ensures automatic recovery on the next open:

Condition Cause Recovery
file_size − 16 > clen partial tail write (crashed before header update) truncate to 16 + clen, durable-sync
file_size − 16 < clen partial truncation (crashed before header update) set clen = file_size − 16, durable-sync

No caller action is required; recovery is transparent.


Multi-process safety

On Unix, open calls flock(LOCK_EX | LOCK_NB) on the file. If another process already holds the lock, open returns immediately with io::ErrorKind::WouldBlock. The lock is released when the BStack is dropped.

On Windows, open calls LockFileEx with LOCKFILE_EXCLUSIVE_LOCK | LOCKFILE_FAIL_IMMEDIATELY covering the entire file range. The same WouldBlock semantics apply (ERROR_LOCK_VIOLATION maps to io::ErrorKind::WouldBlock in Rust). The lock is released when the BStack is dropped.

Both flock (Unix) and LockFileEx (Windows) are advisory and per-process. They protect against concurrent BStack::open calls across well-behaved processes, not against raw file access.


Thread safety

BStack wraps the file in a RwLock<File>.

Operation Lock (Unix / Windows) Lock (other)
push, pop, pop_into write write
set (feature) write write
peek, peek_into, get, get_into read write
len read read

On Unix and Windows, peek, peek_into, get, and get_into use a cursor-safe positional read (pread(2) / read_exact_at on Unix; ReadFile with OVERLAPPED via seek_read on Windows) that does not modify the shared file-position cursor. Multiple concurrent calls to any of these methods can therefore run in parallel. Any in-progress push, pop, or pop_into still blocks all readers via the write lock, so readers always observe a consistent, committed state.

On other platforms a seek is required; peek, peek_into, get, and get_into fall back to the write lock and reads serialise.


Known limitations

  • No record framing. The file stores raw bytes; the caller must track how many bytes each logical record occupies.
  • Push rollback is best-effort. A failure during rollback is silently swallowed; crash recovery on the next open will repair the state.
  • No O_DIRECT. Writes go through the page cache; durability relies on durable_sync, not cache bypass.
  • Single file only. There is no WAL, manifest, or secondary index.
  • Multi-process lock is advisory. flock (Unix) and LockFileEx (Windows) protect well-behaved processes but not raw file access.