ClickHouse usage analytics: events/gauges tables with daily MV#3
ClickHouse usage analytics: events/gauges tables with daily MV#3lohanidamodar wants to merge 98 commits intomainfrom
Conversation
- Database adapter - ClickHouse adapter
- Removed hardcoded column definitions in Usage class, replacing with dynamic schema derived from SQL adapter. - Introduced new Query class for building ClickHouse queries with fluent interface. - Added support for advanced query operations including find and count methods. - Enhanced error handling and SQL injection prevention mechanisms. - Created comprehensive usage guide for ClickHouse adapter. - Added unit tests for Query class to ensure functionality and robustness. - Maintained backward compatibility with existing methods while improving overall architecture.
…metric logging with deterministic IDs
…ed tags in ClickHouse and Database adapters
…pdate tests for new behavior
- Remove $useFinal property and setUseFinal() (MergeTree doesn't support FINAL) - Remove buildTableReference $useFinal param - Fix resolveTenantFromMetric mixed type handling - Remove unreachable branch in Database::getTotal() - Remove always-true count check in addBatch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove separate billing table and MV (monthly aggregation) - Daily MV uses same column definitions as source table - Billing queries use daily table (SUM over daily aggregated rows) - Only events are pre-aggregated; gauges query raw table Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split the single MergeTree table into two separate tables: - Events table with dedicated columns for path, method, status, resource, resourceId - Gauges table with simple metric/value/time/tags schema Event-specific columns are automatically extracted from tags during addBatch. The daily SummingMergeTree MV now aggregates by metric, resource, resourceId. All read methods accept an optional $type parameter to target specific tables, with null querying both tables transparently. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Query the pre-aggregated daily SummingMergeTree table for fast billing/analytics instead of scanning raw events. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- country: LowCardinality(Nullable(String)) for efficient low-cardinality storage - userAgent: Nullable(String) with bloom filter index - Both extracted from tags into dedicated columns like other event fields - Added getCountry() and getUserAgent() getters on Metric Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Daily table only has metric, value, time, resource, resourceId, tenant. No path/status/userAgent/country/tags — those don't aggregate meaningfully. MV groups by metric, resource, resourceId, tenant, day. ORDER BY includes resource and resourceId for efficient billing queries. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use Metric::EVENT_COLUMNS to extract all event columns from tags instead of hardcoding the list. Now country and userAgent are properly stored in dedicated columns instead of being left in tags JSON. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Daily table now only has metric, value, time, tenant. One row per metric per project per day — optimal for billing. Resource-level breakdown queries the raw events table directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Single query with GROUP BY metric for summing multiple metrics from the daily table. Returns array<string, int>. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updated for events/gauges split, event-specific columns, daily MV, query-time aggregation, billing methods, and complete API reference with examples. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…country/userAgent - Remove stale 'type' key from addBatch() @param array shape in Usage.php, Adapter.php, Database.php - Fix mixed-to-string cast in ClickHouse.php event column extraction with type-safe checks - Reduce path size from 1024 to 255 and userAgent size from 512 to 255 in Metric::getEventSchema() to stay within MySQL 768-byte index limit - Update MetricTest assertions: 11 attributes, 9 indexes, 7 EVENT_COLUMNS - Update ClickHouseTest: userAgent/country are now event columns, not tags Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add UsageQuery class extending Query with a custom groupByInterval method that enables time-bucketed aggregated queries. When present in the queries array, the ClickHouse adapter switches from raw row returns to aggregated results grouped by time bucket (SUM for events, argMax for gauges). Supported intervals: 1m, 5m, 15m, 1h, 1d, 1w, 1M. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Query::parse() uses static::isMethod() which allows UsageQuery
to extend the valid method list. Without this override, parsing
'groupByInterval("time","1h")' throws "Invalid query method".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add array shape type to findAggregatedFromTable $parsed parameter so PHPStan recognizes typed keys (filters, params, orderBy) - Provide default for optional groupByInterval key access - Split compound type check for $interval to satisfy PHPStan string narrowing in exception message interpolation - Remove extra trailing blank line in UsageBase.php (PSR-12) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Database adapter silently cast string tenant IDs to int via (int), which truncates non-numeric strings (e.g. UUIDs) to 0 — effectively disabling tenant isolation. Now throws InvalidArgumentException for non-numeric tenants so the mismatch is caught immediately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on, getTotal ambiguity - Buffer key now includes tag hash so events with same metric but different tags (e.g. different paths) stay as separate entries instead of silently discarding the second call's tags - Daily table queries (findDaily, sumDaily, sumDailyBatch) now validate attributes against the daily schema (metric, value, time, tenant) instead of the full event schema. Querying path/method/status on the daily table now throws immediately instead of causing a ClickHouse "No such column" runtime error - Changed (int) cast to (float) for agg_value in getTimeSeries to avoid truncating fractional gauge values or large event sums - getTotal() now throws when a metric exists in both event and gauge tables instead of silently adding incompatible aggregations Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| if (is_array($json) && isset($json['data']) && is_array($json['data'])) { | ||
| foreach ($json['data'] as $row) { | ||
| $metricName = $row['metric'] ?? ''; | ||
|
|
||
| if (!isset($totals[$metricName])) { | ||
| continue; | ||
| } | ||
|
|
||
| $totals[$metricName] += (int) ($row['agg_val'] ?? 0); | ||
| } | ||
| } |
There was a problem hiding this comment.
getTotalBatch() silently sums semantically incompatible aggregations
When $type is null, the method queries both the events table (SUM(value)) and the gauges table (argMax(value, time)) and accumulates both results into $totals[$metricName] +=. If a metric name exists in both tables the result is the sum of a cumulative total and an instantaneous snapshot — a number with no valid meaning.
This is directly inconsistent with getTotal() (line 2137), which explicitly throws when a metric is found in both tables:
if ($eventTotal > 0 && $gaugeTotal > 0) {
throw new Exception("Metric '{$metric}' exists in both event and gauge tables...");
}Apply the same guard here, or at minimum document the known inconsistency.
| /** | ||
| * Enable or disable shared tables mode (multi-tenant with tenant column). | ||
| * | ||
| * @param bool $sharedTables | ||
| * @return self | ||
| */ | ||
| abstract public function setSharedTables(bool $sharedTables): self; |
There was a problem hiding this comment.
This seems specific to Database adapter, should we move it to Databases.php?
| * - path: API endpoint path (events only) | ||
| * - method: HTTP method (events only) | ||
| * - status: HTTP status code (events only) |
There was a problem hiding this comment.
We're going to need a separate HttpLog.php type anyways, do we need these properties?
PHPStan level max flagged getTimeSeries() annotations as int while the ClickHouse adapter emits floats via agg_value cast. Updates the abstract, both adapters, the Usage facade, and zeroFillTimeSeries to float. Also throws on json_encode failure in Usage::collect so the md5() input is guaranteed string instead of string|false. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| // Also check the other type's attributes for cross-table queries | ||
| $otherType = $type === 'event' ? 'gauge' : 'event'; | ||
| foreach ($this->getAttributes($otherType) as $attribute) { | ||
| if ($attribute['$id'] === $attributeName) { | ||
| return true; | ||
| } | ||
| } | ||
|
|
||
| throw new Exception("Invalid attribute name: {$attributeName}"); |
There was a problem hiding this comment.
Cross-type attribute fallback passes invalid columns to gauge queries
validateAttributeName() falls back to checking the opposite type's attributes. When find($queries, null) (or count/sum with $type = null) is called with event-specific filters like Query::equal('path', '/v1/'), parseQueries forwards the same $queries to both tables. For the gauge call, validateAttributeName('path', 'gauge') passes because it finds path in the event schema, generating WHERE \path` = {param:String}against the gauges table — which has no such column — causing a ClickHouseUNKNOWN_IDENTIFIER` server error at runtime.
The cross-type fallback should be removed so that an attribute is only valid if it exists in the target table's own schema:
// Remove lines 1138-1144 (the "Also check the other type's attributes" block)
throw new Exception("Invalid attribute name: {$attributeName}");Any cross-table query with type-specific filters should require the caller to pass an explicit $type.
| public function purge(array $queries = [], ?string $type = null): bool | ||
| { | ||
| $this->setOperationContext('purge()'); | ||
|
|
||
| $typesToPurge = []; | ||
| if ($type === Usage::TYPE_EVENT || $type === null) { | ||
| $typesToPurge[] = Usage::TYPE_EVENT; | ||
| } | ||
| if ($type === Usage::TYPE_GAUGE || $type === null) { | ||
| $typesToPurge[] = Usage::TYPE_GAUGE; | ||
| } | ||
|
|
||
| foreach ($typesToPurge as $purgeType) { | ||
| $tableName = $this->getTableForType($purgeType); | ||
| $escapedTable = $this->escapeIdentifier($this->database) . '.' . $this->escapeIdentifier($tableName); | ||
|
|
||
| $parsed = $this->parseQueries($queries, $purgeType); | ||
| $whereData = $this->buildWhereClause($parsed['filters'], $parsed['params']); | ||
| $whereClause = $whereData['clause']; | ||
| $params = $whereData['params']; | ||
|
|
||
| if (empty($whereClause)) { | ||
| $whereClause = ' WHERE 1=1'; | ||
| } | ||
|
|
||
| $sql = "DELETE FROM {$escapedTable}{$whereClause}"; | ||
| $this->query($sql, $params); | ||
| } | ||
|
|
||
| return true; | ||
| } |
There was a problem hiding this comment.
purge() leaves stale aggregates in the daily MV table
DELETE FROM on the events and gauges tables does not cascade to the _events_daily SummingMergeTree table populated by the materialized view. After purge(), the daily table still contains all previously aggregated rows. Immediate re-insertion of the same metrics will double-count in the billing table, and any billing query against _events_daily will return inflated totals. This also means every test teardown (which calls purge()) leaves residual daily-table state that accumulates across runs.
| public function purge(array $queries = [], ?string $type = null): bool | ||
| { | ||
| $this->db->getAuthorization()->skip(function () use ($queries) { | ||
| $dbQueries = $this->convertQueriesToDatabase($queries); | ||
| $dbQueries[] = DatabaseQuery::limit(100); | ||
|
|
||
| do { | ||
| $documents = $this->db->find( | ||
| collection: $this->collection, | ||
| queries: $dbQueries, | ||
| ); | ||
|
|
||
| foreach ($documents as $document) { | ||
| $this->db->deleteDocument($this->collection, $document->getId()); | ||
| } | ||
| } while (! empty($documents)); | ||
| }); | ||
|
|
||
| return true; | ||
| } |
There was a problem hiding this comment.
purge() ignores $type parameter, deletes all records regardless of type
The $type argument is not forwarded into convertQueriesToDatabase($queries) and no type discriminator is injected into $dbQueries. A call to purge([], Usage::TYPE_EVENT) will delete both events and gauges indiscriminately. Since addBatch() does not persist a type field in each document, the fix requires both write and read/delete paths to be aligned.
Summary
Complete rewrite of the usage analytics library with a two-table architecture optimized for both real-time analytics and billing.
Architecture
Key Changes
API
Write
Read
Billing (Daily MV)
Test Plan