Hive Pattern Mining

The WPfaker Hive collects individual field-name-to-faker-method mappings from participating installations. Hive Pattern Mining takes this a step further: the WPfaker API periodically analyzes the collective dataset to discover generalizable patterns that no single installation could derive on its own. When many field names share a common suffix, prefix, or structure and all map to the same faker method, a new pattern is extracted and distributed to every Hive participant. These mined patterns extend your local detection library beyond the 780+ built-in patterns, with no plugin update, no AI call, and no manual configuration.

How It Works

Hive Pattern Mining operates through two complementary mechanisms:

Server-side pattern mining runs periodically on the WPfaker API. It analyzes the accumulated Hive dataset, identifies clusters of field names that share a structural similarity and map to the same faker method, and extracts generalizable patterns from those clusters. These mined patterns are then distributed to all Hive participants via the API.

Real-time inference activates when your installation encounters an unknown field name that has no exact match in the Hive. Before falling back to AI detection, the API checks the Hive for structurally similar field names. If related entries point to the same faker method with sufficient confidence, the inference result is returned immediately — no AI call needed.

Both mechanisms integrate into WPfaker's existing detection priority chain: mined patterns sit between the built-in pattern matcher and the exact Hive lookup, and real-time inference sits between the exact Hive lookup and AI detection.

Server-Side Pattern Mining

The WPfaker API periodically scans the Hive dataset looking for field name clusters. A cluster forms when multiple distinct field names share a structural element — a common suffix, prefix, or contained substring — and all map to the same faker method.

For example, imagine the Hive contains these individual mappings contributed by different installations:

| Field Name | Faker Method |
| --- | --- |
| primary_phone | phone |
| office_phone | phone |
| home_phone | phone |
| mobile_phone | phone |
| emergency_phone | phone |

The mining engine recognizes that all five field names end in _phone and map to the phone faker method. It extracts the generalizable pattern *_phone → phone with high confidence. This single pattern now covers every future field name ending in _phone — including ones the Hive has never seen before, like backup_phone or customer_phone.
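
To make the wildcard notation concrete, here is a minimal sketch of how a pattern such as *_phone could be applied to a field name the Hive has never seen. It assumes mined patterns behave like shell-style globs. The `MINED_PATTERNS` table and the `match_mined_pattern` helper are hypothetical names used for illustration, not part of the WPfaker API.

```python
from fnmatch import fnmatchcase

# Hypothetical mined patterns: glob -> faker method (assumption, not the real API payload).
MINED_PATTERNS = {"*_phone": "phone"}


def match_mined_pattern(field_name, patterns=MINED_PATTERNS):
    """Return the faker method of the first glob that matches, or None."""
    for glob, method in patterns.items():
        # fnmatchcase does a case-sensitive shell-style match.
        if fnmatchcase(field_name, glob):
            return method
    return None
```

With this sketch, `match_mined_pattern("backup_phone")` yields `"phone"`, while a name such as `phone_type` does not match because the pattern only covers the suffix.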

Mined patterns are distributed to all Hive participants via the API in real time. No plugin update is required. As soon as a pattern is mined, every participating installation can use it on the next generation run.

The mining engine applies confidence thresholds to avoid false positives. A pattern is only extracted when a sufficient number of distinct field names support it and no contradictory mappings exist in the dataset. Patterns with lower confidence are held back until more data confirms them.
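
The clustering and threshold rules described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's actual algorithm: it only considers the last underscore-delimited token as a suffix, and the `min_support` value of 3 is an arbitrary placeholder for the real confidence threshold.

```python
from collections import defaultdict


def mine_suffix_patterns(mappings, min_support=3):
    """Derive '*_suffix' patterns from a dict of field_name -> faker method.

    A pattern is emitted only when at least `min_support` distinct field
    names share the suffix and every one of them maps to the same method.
    """
    by_suffix = defaultdict(dict)
    for name, method in mappings.items():
        head, sep, tail = name.rpartition("_")
        if sep and head and tail:  # skip names without a usable suffix
            by_suffix[tail][name] = method

    patterns = {}
    for suffix, members in by_suffix.items():
        methods = set(members.values())
        # Contradictory mappings (more than one method) block the pattern.
        if len(members) >= min_support and len(methods) == 1:
            patterns[f"*_{suffix}"] = methods.pop()
    return patterns
```

Fed the five phone mappings from the table, this sketch produces the single pattern `*_phone` mapped to `phone`. A suffix whose members disagree on the faker method produces nothing, which mirrors the "no contradictory mappings" rule.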

Value List Enrichment

Some fields pick a random value from a short list — for example, a field called favorite_volcano might only have three options: Vesuvius, Etna, and Krakatoa. When WPfaker generates 50 posts, those same three names repeat over and over. The data looks fake.

The WPfaker Hive server detects these thin value lists (10 or fewer entries) and automatically expands them to 50–100+ diverse, realistic values. This happens entirely on the WPfaker server using its own AI — completely independent of any AI provider you may have configured in the plugin. No API key, no configuration, and no cost on your end.

How enrichment works

  1. Detection: After each mining run, the Hive server identifies fields with 10 or fewer possible values.
  2. Concept extraction: The server determines the underlying concept from the field name (e.g., favorite_volcano → "volcano").
  3. AI generation: The Hive server's AI generates 50–100 diverse values for that concept, matching your language and domain. This does not use your AI provider or API key.
  4. Delivery: The expanded value list replaces the thin original in the standard API response. Your plugin receives 50 or more values where it previously had 3.
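
The first two steps can be sketched as below. The threshold of 10 comes from the text. Everything else is an assumption for illustration: the helper names are hypothetical, and taking the last underscore-delimited token as the concept is a naive stand-in for whatever the Hive server really does.

```python
THIN_LIST_MAX = 10  # "10 or fewer entries", as stated in the text


def find_thin_lists(value_lists, max_size=THIN_LIST_MAX):
    """Return the subset of field_name -> values whose list is thin."""
    return {
        name: values
        for name, values in value_lists.items()
        if len(values) <= max_size
    }


def guess_concept(field_name):
    """Naive concept guess: the last underscore-delimited token (assumption)."""
    return field_name.lower().split("_")[-1]
```

For the example above, `favorite_volcano` with three values would be flagged as thin, and `guess_concept("favorite_volcano")` returns `"volcano"`.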

Locale support

Value lists are generated per language. A volcano list for German contains localized names (Vesuv, Ätna, Yellowstone-Caldera), while the English list uses Vesuvius, Etna, Yellowstone. The Hive supports 12 locales:

German, English (US), English (British), French, Spanish, Italian, Portuguese (Brazilian), Dutch, Polish, Japanese, Korean, and Chinese (Simplified).

If no list exists for your exact locale, the Hive falls back through increasingly general matches until it finds one.
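
One plausible reading of "increasingly general matches" is sketched below: try the exact locale, then the bare language code, then a default. The exact chain used by the Hive is not documented here, so the order, the `en_US` default, and both function names are assumptions.

```python
def locale_fallback_chain(locale, default="en_US"):
    """Build candidates from most to least specific, e.g. de_AT, de, en_US."""
    chain = [locale]
    language = locale.split("_")[0]
    if language != locale:
        chain.append(language)
    if default not in chain:
        chain.append(default)
    return chain


def pick_value_list(lists_by_locale, locale, default="en_US"):
    """Return the first value list found along the fallback chain, or None."""
    for candidate in locale_fallback_chain(locale, default):
        if candidate in lists_by_locale:
            return lists_by_locale[candidate]
    return None
```

Under these assumptions, a request for `de_AT` would be served from a `de` list if no Austrian-specific list exists, and only fall back to the default when no German list is available at all.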

No plugin changes needed

Value list enrichment is fully transparent. Your plugin already supports value lists of any size — it simply receives a larger list through the same API response it always used. No plugin update is required.

TIP

Value list enrichment is fully automatic and runs on the WPfaker Hive server. It does not use your configured AI provider or API key. As long as the Hive is enabled, thin lists are detected and enriched at no cost to you.

Real-Time Inference

When your installation encounters a field name that does not match any built-in pattern and has no exact Hive match, the API performs a real-time inference step before deferring to AI detection.

The inference engine searches the Hive for field names that are structurally similar to the unknown field. It uses fuzzy matching to find entries that share significant substrings, common suffixes or prefixes, or similar word structures. If the similar entries consistently point to the same faker method, the inference result is returned with a confidence score.

For example, if the Hive contains mappings for billing_street, shipping_street, and office_street, and your installation encounters warehouse_street for the first time, the inference engine recognizes the _street suffix pattern from the similar entries and returns street_address as the probable faker method — instantly, without an AI call.

If the inference confidence is too low — because the similar entries map to different methods, or too few similar entries exist — the field falls through to AI detection as it normally would.
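
The voting logic behind inference can be sketched as follows. This is a simplified stand-in rather than the real engine: instead of true fuzzy matching it treats two names as similar when they share their first or last underscore-delimited token, and the `min_votes` and `min_agreement` thresholds are illustrative values, not documented ones.

```python
from collections import Counter


def shares_affix(a, b):
    """True if both names have several tokens and share the first or last one."""
    ta, tb = a.split("_"), b.split("_")
    return len(ta) > 1 and len(tb) > 1 and (ta[0] == tb[0] or ta[-1] == tb[-1])


def infer_method(unknown, hive, min_votes=2, min_agreement=0.8):
    """Return (method, confidence) from similar Hive entries, or None."""
    votes = Counter(
        method for known, method in hive.items() if shares_affix(unknown, known)
    )
    total = sum(votes.values())
    if total < min_votes:
        return None  # too few similar entries
    method, count = votes.most_common(1)[0]
    confidence = count / total
    # Similar entries that disagree lower the confidence below the cutoff.
    return (method, confidence) if confidence >= min_agreement else None
```

With billing_street, shipping_street, and office_street in the Hive, `infer_method("warehouse_street", hive)` returns `("street_address", 1.0)`. When the similar entries split between methods, the function returns `None` and the field would fall through to AI detection.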

Detection Priority

With Hive Pattern Mining enabled, WPfaker's detection priority chain expands to seven levels:

| Priority | Method | Description | Speed | Cost |
| --- | --- | --- | --- | --- |
| 1 | Template configuration | Manual override via Templates | Instant | Free |
| 2 | Built-in pattern match | 780+ local patterns | Instant | Free |
| 3 | Hive mined patterns | Generalizable patterns extracted from community data | API call | Free |
| 4 | Hive exact match | Individual field name lookup in the Hive | API call | Free |
| 5 | Real-time inference | Fuzzy matching against similar Hive entries | API call | Free |
| 6 | AI detection | External AI provider (Gemini, Claude, GPT) | API call | Paid |
| 7 | Default field type mapping | Generic fallback based on field type | Instant | Free |

The key insight is that levels 3, 4, and 5 are all free and powered by the community. The more installations participate in the Hive, the more mined patterns emerge, and the fewer fields ever need to reach level 6 (paid AI detection). Over time, the community's collective intelligence covers an ever-growing proportion of real-world field naming conventions.
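
Conceptually, the chain is a first-match-wins walk through an ordered list of detectors. The sketch below illustrates that idea only. The `detect` function, the stub detectors, and the sample data are all hypothetical, and the real plugin resolves levels 3 to 6 through API calls rather than local lambdas.

```python
def detect(field_name, detectors):
    """Walk the ordered (label, detector) pairs and return the first hit."""
    for label, detector in detectors:
        method = detector(field_name)
        if method is not None:
            return label, method
    raise LookupError(f"no detector matched {field_name!r}")


# Stub data standing in for each level (illustrative only).
templates = {"legacy_code": "word"}
builtin = {"email": "email"}
hive_exact = {"warehouse_street": "street_address"}

DETECTORS = [
    ("template", templates.get),
    ("built-in pattern", builtin.get),
    ("hive mined pattern", lambda n: "phone" if n.endswith("_phone") else None),
    ("hive exact match", hive_exact.get),
    ("real-time inference", lambda n: None),  # stub: nothing inferred
    ("ai detection", lambda n: None),         # stub: no AI provider configured
    ("default", lambda n: "text"),            # level 7 always answers
]
```

Because the walk stops at the first hit, a field such as `backup_phone` is resolved at level 3 and never reaches the paid AI level, while a field configured in a Template never leaves level 1.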

Enabling Hive Pattern Mining

Hive Pattern Mining requires no separate configuration. It is automatically active when the Hive is enabled:

  1. Navigate to WPfaker > Settings
  2. Scroll to the AI-Powered Field Detection section
  3. Enable the WPfaker Hive toggle

That's it. Mined patterns and real-time inference are received automatically as part of the Hive API responses. There is no additional toggle, no separate API key, and no extra cost.

For detailed setup instructions, see AI Settings.

Privacy

Hive Pattern Mining inherits the same privacy guarantees as the existing Hive:

  • Transmitted data includes: field configs (name, label, plugin type, faker method, faker params, confidence), taxonomy terms, title templates, and number ranges. Each report also includes the post type slug, post type label, and locale.
  • No field values, site URLs, domain names, or user information leave your installation
  • Mined patterns contain no identifiable data — they are abstract structural rules like *_phone → phone
  • All data is fully anonymous

The mined patterns distributed back to your installation are derived from aggregated, anonymized community data. No individual installation's contributions are identifiable in the mined output.

For a complete list of all data WPfaker transmits externally, see the Privacy & Data Transparency page.

WARNING

The Hive is reciprocal. Disabling it stops both your contribution of mappings and your receipt of mined patterns and inference results. Only installations with the Hive enabled can benefit from community-powered detection.

Benefits

Reduced AI costs. Every field name that matches a mined pattern or inference result is one fewer AI API call. For sites with many custom post types and unusual field names, this can eliminate the majority of paid AI calls.

Continuous improvement without plugin updates. New mined patterns appear as the community grows. Your detection library expands in real time via the API — no need to wait for a plugin release or manually update pattern lists.

Community-driven intelligence. The more installations participate, the smarter the system becomes. Domain-specific patterns emerge naturally: real estate field conventions from property listing sites, e-commerce patterns from WooCommerce installations, medical terminology from healthcare sites. No single installation could build this breadth of coverage alone.

Domain-specific patterns. Unlike the built-in pattern library, which covers general-purpose conventions, Hive-mined patterns reflect real-world usage across diverse industries and niches. Patterns like *_listing_price, *_dosage, or *_sku_variant emerge organically from the community dataset.

Troubleshooting

Mined patterns not appearing

Verify that the Hive is enabled in WPfaker > Settings > AI-Powered Field Detection. Mined patterns require an active license and an active Hive connection. If the Hive toggle is on but patterns are not being received, check your site's ability to reach the WPfaker API.

Unexpected field mappings from mined patterns

If a mined pattern produces an incorrect mapping for a specific field on your site, you can override it by configuring that field in a Template. Template configurations always take the highest priority and override all automatic detection methods, including mined patterns.

Fields still falling through to AI despite Hive being enabled

The Hive can only provide matches for field name patterns that the community has contributed. Highly unusual or site-specific field names may not have enough Hive data to generate a mined pattern or inference result. In these cases, AI detection fills the gap as expected, and your AI-detected mapping is contributed back to the Hive for future participants.

Released under the GPL2 License. wpfaker.com