jayeshmepani/libpostal-ffi-python

PostalKit

A zero-setup, one-command-install Python package for libpostal, designed as a strict 1:1 C FFI wrapper.

Parsing international street addresses shouldn't require a Ph.D. in C compilation. postalkit runs the libpostal C library natively in Python with no build step, and without abstracting away its raw power.

Like FFI bindings in PHP, it exposes the exact C structs, constants, and functions, so you can port C logic directly to Python.

✨ Why PostalKit?

The standard postal package requires you to install autoconf, make, and pkg-config, compile the C code yourself, and manually download a ~2GB machine learning model.

PostalKit handles everything automatically:

  • Zero C compilation: Downloads pre-compiled libpostal shared binaries for your OS and architecture.
  • Auto-downloads models: Fetches the required libpostal ML models transparently on first use.
  • Strict 1:1 C Mapping: Exposes libpostal_parse_address, libpostal_expand_address, and all ctypes structs exactly as defined in libpostal.h.
  • Cross-platform: Works on Linux (x86_64, arm64), macOS (Intel, Apple Silicon), and Windows.

📦 Installation

pip install postalkit

🚀 Quickstart

Because this is a true 1:1 FFI wrapper, you use the exact function names and C structs defined in the upstream libpostal C headers. Memory is managed as it is in C: whatever libpostal allocates, you free with the matching destroy function.

import ctypes
import postalkit

# 1. Get the C-struct for parser options
options = postalkit.libpostal_get_address_parser_default_options()

# 2. Call the C-function directly (strings must be passed as bytes)
address = b"221B Baker St London"
response_ptr = postalkit.libpostal_parse_address(address, options)

# 3. Access the raw C-arrays
response = response_ptr.contents
for i in range(response.num_components):
    component = response.components[i].decode('utf-8')
    label = response.labels[i].decode('utf-8')
    print(f"{label}: {component}")

# 4. Manually destroy the C pointer to free memory, exactly as in C!
postalkit.libpostal_address_parser_response_destroy(response_ptr)
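
Because the destroy call is manual, an exception raised while decoding components would skip it and leak the response. One way to guard against that is a small context manager. The sketch below is an illustration only: `owned` is a hypothetical helper, not part of postalkit, and the demo uses a stand-in pointer and destroy function so it runs without libpostal installed.

```python
from contextlib import contextmanager


@contextmanager
def owned(ptr, destroy):
    """Yield ptr, then always pass it to destroy, even if the body raises.

    `owned` is a hypothetical helper for illustration, not a postalkit API.
    """
    try:
        yield ptr
    finally:
        destroy(ptr)


# Stand-in demo: a list records which "pointers" were destroyed.
freed = []
try:
    with owned("fake-pointer", freed.append) as ptr:
        # Simulate a failure while reading the response.
        raise ValueError("decode failed")
except ValueError:
    pass

# The destroy callback still ran exactly once.
print(freed)
```

With the real library, the same pattern would presumably wrap the pointer returned by `postalkit.libpostal_parse_address(address, options)` and pass `postalkit.libpostal_address_parser_response_destroy` as the destroy callback.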

🧠 True 1:1 FFI Coverage

The package exposes the full libpostal API surface:

  • All 46 C functions (libpostal_tokenize, libpostal_classify_language, libpostal_is_name_duplicate_fuzzy, etc.)
  • All 10 C Structs (libpostal_normalize_options_t, libpostal_duplicate_options_t, etc.)
  • All 42 C Constants & Bitwise Flags (LIBPOSTAL_ADDRESS_HOUSE_NUMBER, LIBPOSTAL_NORMALIZE_TOKEN_DELETE_HYPHENS, etc.)
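
The bitwise flags combine the same way they do in C. The following is a minimal sketch of composing and testing a mask. The numeric values are assumptions modeled on the `(1 << n)` style of libpostal.h and are defined locally so the example runs on its own; in real code, use the constants exported by the package. `has_flag` is a hypothetical helper.

```python
# Assumed values for illustration; prefer the constants the package exports.
LIBPOSTAL_ADDRESS_NAME = 1 << 1
LIBPOSTAL_ADDRESS_HOUSE_NUMBER = 1 << 2
LIBPOSTAL_ADDRESS_STREET = 1 << 3


def has_flag(mask, flag):
    """Return True when every bit of flag is set in mask (hypothetical helper)."""
    return (mask & flag) == flag


# Combine flags with bitwise OR, as in C.
address_components = LIBPOSTAL_ADDRESS_HOUSE_NUMBER | LIBPOSTAL_ADDRESS_STREET

# Test membership with bitwise AND.
print(has_flag(address_components, LIBPOSTAL_ADDRESS_STREET))
print(has_flag(address_components, LIBPOSTAL_ADDRESS_NAME))

# Clear a single flag with AND NOT.
without_street = address_components & ~LIBPOSTAL_ADDRESS_STREET
print(without_street == LIBPOSTAL_ADDRESS_HOUSE_NUMBER)
```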

You can directly port any libpostal C/C++ tutorial code into Python line-by-line.
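
To make the struct mapping concrete, here is a standalone sketch of how a C response struct with two parallel `char **` arrays can be mirrored and read with ctypes. It does not import postalkit. The class `AddressParserResponse` and the helpers `to_pairs` and `make_fake_response` are hypothetical names; the field layout is assumed to follow `libpostal_address_parser_response_t` in libpostal.h, and the response is built in Python rather than returned by the library.

```python
import ctypes


class AddressParserResponse(ctypes.Structure):
    """Hypothetical mirror of a C struct holding two parallel char** arrays."""

    _fields_ = [
        ("num_components", ctypes.c_size_t),
        ("components", ctypes.POINTER(ctypes.c_char_p)),
        ("labels", ctypes.POINTER(ctypes.c_char_p)),
    ]


def to_pairs(response):
    """Walk the parallel arrays and return decoded (label, component) tuples."""
    return [
        (response.labels[i].decode("utf-8"), response.components[i].decode("utf-8"))
        for i in range(response.num_components)
    ]


def make_fake_response(pairs):
    """Build a response in Python so the sketch runs without libpostal."""
    n = len(pairs)
    array_type = ctypes.c_char_p * n
    labels = array_type(*[label.encode("utf-8") for label, _ in pairs])
    components = array_type(*[comp.encode("utf-8") for _, comp in pairs])
    response = AddressParserResponse(
        n,
        ctypes.cast(components, ctypes.POINTER(ctypes.c_char_p)),
        ctypes.cast(labels, ctypes.POINTER(ctypes.c_char_p)),
    )
    # Return the arrays as well so they stay alive as long as the caller needs.
    return response, (labels, components)


response, keepalive = make_fake_response(
    [("house_number", "221b"), ("road", "baker st"), ("city", "london")]
)

# A pointer to the struct is dereferenced with .contents, as in the Quickstart.
response_ptr = ctypes.pointer(response)
pairs = to_pairs(response_ptr.contents)
for label, component in pairs:
    print(f"{label}: {component}")
```

Indexing a `POINTER(c_char_p)` yields Python `bytes`, which is why each element is decoded before printing.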

🛠️ Advanced Usage

Pre-downloading assets (e.g., for Docker images or CI):

from postalkit.data.manager import ensure_all_assets
ensure_all_assets()

📄 License

MIT License. Developed with 💙 by Jayesh Mepani.