How we know what we know

Research Methodology

Every chart, score, and ranking on Raw Honey Guide traces back to a specific dataset and a specific method. This page documents the catalog, the calculations, and the limits of each one — so anyone can verify, critique, or extend the work.

Last updated · companion to Editorial Policy and Open Data.

The Catalog

All quantitative claims on the site derive from a single hand-curated dataset of 210 raw-honey jars from 168 brands across 16 countries. The seed file lives at backend/src/main/resources/seed-data/honeys.json and the production version is downloadable as JSON or CSV from /open-data under a CC BY 4.0 licence. Per-jar fields:

  • Floral source: enum — Clover, Wildflower, Manuka, Acacia, Buckwheat, Tupelo, Sourwood, Sage, Linden, Chestnut, Heather, Eucalyptus, Lavender, Orange Blossom, Avocado, Blueberry, Other.
  • Honey type: Raw (89%), Creamed (4.3%), Comb, Infused (3.3%), Pasteurised (0.5%), Filtered.
  • Origin country: USA, New Zealand, Australia, Argentina, Mexico, Canada, Brazil, Greece, Turkey, Spain, France, Italy, Hungary, Germany, UK, Other.
  • Brand & region: Where stated by the producer; 168 distinct brands; the largest single brand holds 6 listings (≈ 3% share).
  • Mid-jar price: Average of observed listings over a 30–90 day window; not adjusted for sale pricing or pack-size.
  • Certifications: USDA Organic (6.2%), Non-GMO, Fair Trade, Kosher, UMF, MGO — recorded only when printed on the jar.
  • UMF / MGO ratings: Where the producer prints a numeric score on the label.
Sample bias to be honest about. The catalog deliberately oversamples premium, raw, and artisan jars — the kind a thoughtful shopper would actually consider. It under-samples mass-market filtered grocery honey. Read every "X% of the catalog" sentence as "X% of premium / artisan jars," not as a US retail-shelf census.

Catalog at a glance

Every number in every data story on this site traces back to the table below. Read it before quoting any "X% of raw honey…" stat — the shape of the catalog is the shape of our findings.

By origin

16 countries

USA alone is 44% of the catalog — a reflection of our editorial geography, not the global production share (China and Argentina dominate commodity honey by volume).

  • USA
    93
    44%
  • New Zealand
    22 · 10%
  • Other
    19 · 9%
  • Australia
    10 · 5%
  • Canada
    9 · 4%
  • Italy
    9 · 4%
  • France
    9 · 4%
  • Spain
    8 · 4%
  • Greece
    6 · 3%
  • UK
    5 · 2%
  • Mexico
    4 · 2%
  • Hungary
    4 · 2%
  • Germany
    4 · 2%
  • Brazil
    3 · 1%
  • Turkey
    3 · 1%
  • Argentina
    2 · 1%

By floral source

17 buckets

Deliberately varietal-spread: the largest monofloral (wildflower, 22) is only 10% of the catalog. Manuka (18) is well-represented because readers ask about it most.

  • Other / mixed polyfloral
    35
  • Wildflower
    22
  • Clover
    20
  • Manuka
    18
  • Orange Blossom
    14
  • Acacia
    14
  • Buckwheat
    10
  • Lavender
    10
  • Eucalyptus
    10
  • Linden
    10
  • Sage
    8
  • Sourwood
    8
  • Chestnut
    8
  • Heather
    8
  • Tupelo
    5
  • Blueberry
    5
  • Avocado
    5

By honey type

89% raw

The "raw honey guide" naming is load-bearing — only 13 jars (6.2%) in the catalog are non-raw, and they are there mostly to benchmark what changes when honey is creamed, infused, or filtered.

  • Raw
    187 · 89%
  • Creamed
    9 · 4.3%
  • Infused
    7 · 3.3%
  • Filtered
    3 · 1.4%
  • Comb
    3 · 1.4%
  • Pasteurized
    1 · 0.5%

By certification claim

11.9% certified

Only 25 of 210 jars carry any third-party certification claim printed on the label. A jar can carry more than one; the bars below count each claim independently.

  • USDA Organic
    13 · 6.2%
  • UMF (any tier)
    6 · 2.9%
  • Non-GMO Project
    4 · 1.9%
  • True Source Certified
    4 · 1.9%

Bars scaled 6× for legibility — real certification rate is what the right-hand % column shows. The low rate partly reflects a real regulatory gap: the U.S. has never finalised federal raw-honey organic standards, so many genuine raw beekeepers cannot certify even when they meet equivalent practices. See how to read honey labels for what each cert actually verifies.

Per-story methods

Every data story, embeddable widget, and decision tool we publish is documented below — its sample, its calculation, its source citations, and what it explicitly does not claim.

The State of Raw Honey 2026

n = 210 jars, 168 brands, 16 countries

Original-research data story. Six hand-built CSS bar charts and one SVG MGO scatter, all derived directly from the catalog seed file. Headline statistics:

  • Catalog mean mid-jar price: $23.54.
  • Manuka mean: $55.27 (n=18) — 4× the rest of the catalog.
  • Non-Manuka mean: roughly $19.
  • New Zealand mean: $50.42 vs USA mean $18.93 (driven almost entirely by NZ being 18-of-22 Manuka).
  • Brand long tail: 168 distinct brands; the largest single brand holds 6 listings (≈ 3% share).
  • Certification: 13 of 210 (6.2%) carry USDA Organic.

Method. All charts derive from a single read of `seed-data/honeys.json`. Means are simple arithmetic means with no winsorising. The MGO regression is OLS over the 18 Manuka jars that declare both MGO and price: price ≈ $19.34 + $0.0973 × MGO, r² = 0.97.

Open the story →

Honey Crystallization Timeline

18 unifloral honeys × peer-reviewed F/G ratios

A scrollable SVG timeline plus a G/W-versus-time scatter, with Manikis & Thrasivoulou (2001) "fast zone" shading. We tier 18 unifloral honeys from very-fast to very-slow and map 175 of the 210 catalog jars to those tiers (the remaining 35 are listed as wildflower or "other" and are too botanically heterogeneous to score).

  • Tier shares of mappable catalog: fast 38.3%, medium 26.9%, slow 24.0%, very-slow 10.9%.
  • Single best predictor: fructose-to-glucose ratio. F/G < 1.0 crystallises in weeks; F/G > 1.5 often stays liquid for years.
  • Glucose-to-water ratio is a close second predictor — above 2.1 is the "fast zone".

Method. F/G and G/W mid-points are read from Persano Oddo & Piro (2004), White (1979), White & Doner (1980), Escuredo et al. (2014), and Manikis & Thrasivoulou (2001). Tier thresholds follow the long-standing apicultural convention used in extension service literature. We do not attempt to predict an individual jar's crystallisation date — only the typical window for the variety stored at 14–20 °C.

Open the story →

Honey Harvest Calendar

35 varieties × 12 months × 2 hemispheres

A hand-built SVG Gantt of bloom-to-extraction windows with a hemisphere toggle, a radial "bloom clock" cross-view, and an "in season right now" picker computed from the live month.

  • US peak extraction: late June through early September, anchored by the Upper Midwest clover flow.
  • Briefest tracked window: acacia at roughly 10–14 days in late May.
  • Longest single-variety windows: wildflower, clover, and eucalyptus (each over two months).
  • Hemisphere wrap: NZ Manuka harvest dated "2024" actually means December 2023 to February 2024.

Method. Each variety entry encodes a hemisphere (N / S), one or two month bands (year-end wraps split into two), a peak month, and a region. Sources cited inline on the page: FAO 2019 Agribusiness Handbook, Crane 1999, Somerville 2010, NZ MPI 2022, Persano Oddo & Piro 2004, Al-Ghamdi 2013, plus US extension service flora guides for the regional entries. Windows are typical — micro-climate and year-to-year weather can shift any flow ±2–3 weeks.

Open the story →

A standalone SVG scatter of every Manuka in the catalog declaring both MGO and price, with the OLS regression line overlaid and UMF tier reference points labelled inline. Iframe-friendly, light + dark themes.

  • Regression: price ≈ $19.34 + $0.0973 × MGO.
  • r² = 0.97 — MGO alone explains 97% of price variance in the sample.
  • Per-MGO marginal cost: roughly $0.10 added to the jar for every additional MGO point.
  • Premium-over-line: jars priced above the regression line are paying for brand reputation, packaging, or distribution.

Method. Single-factor OLS, no transformation. Sample is small (n=18) by design — only Manuka jars where the producer prints a numeric MGO score on the label are included. Confidence intervals and prediction bands are deliberately omitted to keep the chart legible at 320 px embed widths.

Open the story →

Raw Honey Price by Variety (Embeddable Widget)

16 floral sources, n = 175 of 210 jars

A horizontal bar chart of average mid-jar price by floral source, sorted high to low. Iframe-friendly, light + dark themes.

  • Manuka tops the chart at $55.27 (n=18).
  • Clover anchors the bottom at $13.97 (n=20).
  • Only floral sources with at least 5 jars in the catalog are charted, so single-jar outliers cannot dominate.

Method. Simple arithmetic means by floral source. Excluded sources are those with n < 5 (the cut keeps the chart honest about sample size) and the OTHER bucket where botanical origin is too heterogeneous to be meaningful.

Open the story →

Sidr Authenticity Wizard (Decision Tool)

6 evidence criteria, 100-point scoring rubric

A six-step decision-tree wizard that scores a jar of Yemeni Sidr against an evidence-weighted authenticity rubric. The structure mirrors the consumer checklist published in /blog/yemeni-honey-guide.

  • Provenance specificity: 20 points (e.g. named wadi, beekeeper).
  • Price sanity: 20 points (calibrated against $250–$500 / kg baseline for genuine Wadi Doan Sidr).
  • Laboratory or pollen evidence: 20 points (≥ 45% Ziziphus pollen is the gold standard).
  • Vendor type: 15 points (specialty importer > generic marketplace).
  • Sensory profile: 15 points (taste, aroma, viscosity match the documented Sidr profile).
  • Physical properties: 10 points (colour, crystallisation behaviour).

Method. The 20 / 20 / 20 / 15 / 15 / 10 weighting reflects the relative diagnostic value of each criterion as documented in food-safety surveys from Saudi and Emirati public-health agencies (which found 70–80% of retail-labelled "Yemeni Sidr" failed authentication on at least one criterion). Critically, an "I don't know" answer drops that step from the denominator rather than scoring zero: a buyer who can only answer 4 of 6 criteria for a jar they have not tasted gets a score over the 4 they actually answered. Without that semantic, partial knowledge would always tip the result into the "likely adulterated" tier regardless of evidence — see /blog/yemeni-honey-guide for the underlying source survey.

Open the story →

Known limitations

  • Curation bias. The 210-jar catalog is a curated sample. It is not a US retail census; mass-market filtered honey is intentionally under-represented.
  • Pricing snapshot. Mid-jar prices reflect a 30–90 day observation window. Sale prices, subscription discounts, and bulk pack-outs are not modelled.
  • Lab-rating trust chain. UMF and MGO ratings are taken as printed on the jar. We trust the certifier for UMF jars and the named lab for MGO-only declarations; we do not run independent confirmation testing.
  • Bloom variability. Harvest windows in the calendar are typical mid-points. Microclimate and year-to-year weather routinely shift flow timing ±2–3 weeks.
  • Crystallisation tier ≠ guarantee. Tiers are the most likely behaviour for the variety stored at 14–20 °C. Temperature, glucose seed content, and trace pollen all affect a specific jar.
  • Sidr score is a screen, not a verdict. The wizard summarises the evidence a buyer can see without lab work. A high score reduces fraud risk but cannot replace pollen analysis or a UMF-style chain of custody.

Versioning

The catalog is a living dataset — varieties, brands, and prices are added or refreshed as we learn of them. Every data story records its own last-updated date in its hero, and the underlying seed file is timestamped through git history. When a refresh changes a headline number by more than 2 percentage points, we bump the story's dateModified and note the delta in-line.

How to cite

For the dataset: Raw Honey Guide honey catalog (2026), https://rawhoneyguide.com/open-data, CC BY 4.0. For a specific data story, please link the article URL and credit Raw Honey Guide. Corrections and reuse questions: hello@rawhoneyguide.com.

Frequently asked questions

Where does the Raw Honey Guide catalog data come from?
The catalog is hand-curated from publicly listed retail honey jars across 168 brands and 16 countries. Each entry records floral source, honey type (raw / creamed / comb / infused / pasteurized / filtered), origin country, region (where stated), brand, mid-jar price (averaged over a 30–90 day window), certification claims (USDA Organic, Non-GMO, UMF, MGO), and where applicable UMF / MGO laboratory ratings as printed on the jar. Source data lives in `backend/src/main/resources/seed-data/honeys.json` and is published in machine-readable form at /open-data under a CC BY 4.0 licence so anyone can reproduce or critique our analyses.
How was the Manuka MGO ↔ price regression computed?
We extracted the 18 Manuka jars in the catalog with both MGO ratings and prices declared, fit an ordinary least-squares linear regression of price on MGO, and recovered price ≈ $19.34 + $0.0973 × MGO with r² = 0.97. The model is intentionally simple — a straight OLS line lets readers verify the result on the back of an envelope, and r² = 0.97 says MGO alone explains 97% of price variation in this slice of the catalog. We have not adjusted for jar weight (most are 250 g), brand premium, or import path, all of which a multi-factor model would account for.
How were the crystallization speed tiers assigned?
Each variety was assigned a tier (very-fast / fast / medium / slow / very-slow) based on its typical fructose-to-glucose (F/G) and glucose-to-water (G/W) ratios as reported in peer-reviewed unifloral surveys (Persano Oddo & Piro 2004; White & Doner 1980; Escuredo et al. 2014). We then mapped each catalog jar to a tier through its declared floral source. Ratios are typical mid-points — individual jars vary based on glucose seed content, moisture, and storage temperature, and the tier is a starting point, not a guarantee.
How was the harvest calendar built?
For each of 35 commercially meaningful unifloral honeys we recorded a hemisphere, a typical bloom-to-extraction window (1–12 month bands, with year-end wraps split into two segments), and a peak month. Windows are derived from FAO Beekeeping handbooks (2019), Crane (1999), Somerville's Honey and Pollen Flora of South-Eastern Australia (2010), the NZ Ministry for Primary Industries (2022), Persano Oddo & Piro (2004), Spanish ASEMIEL data, US extension services (UF IFAS, OSU, Cornell), and Al-Ghamdi (2013) for Arabian Peninsula entries. Microclimate variability means actual flow timing within a region can shift ±2–3 weeks year over year.
How does the Sidr Authenticity Wizard score a jar?
The wizard evaluates six independent evidence categories — Provenance specificity (20 pts), Price sanity (20), Lab / pollen evidence (20), Vendor type (15), Sensory profile (15), Physical properties (10) — totalling 100 points. Weights mirror the relative diagnostic value documented in the Saudi and Emirati food-safety studies that found 70–80% of retail-labelled "Yemeni Sidr" was adulterated. A "don't know" answer drops that step from the denominator (it is not scored as zero), so a buyer who can answer four of six criteria for a jar they have not yet tasted gets a fair score normalised over what they actually know. The result is reported as earned / possible × 100 plus a tier badge.
What are the biggest known limitations of the data?
Three. (1) The catalog is a curated sample of premium and artisan jars; it deliberately under-samples mass-market filtered grocery honey, so figures like "89% raw" describe the catalog, not the broader retail market. (2) Prices are mid-point listings observed over a window and do not adjust for sale pricing, subscription discounts, or large-format pack-outs. (3) MGO and UMF ratings are taken at face value from the jar label — we trust the chain of custody for UMF-certified jars and the listed lab for MGO-only declarations, but we have not run independent confirmation testing.
How can I cite or re-use this methodology?
The honey dataset is published at /open-data under the Creative Commons Attribution 4.0 (CC BY 4.0) licence — free to re-use, including commercially, with credit. For citation: "Raw Honey Guide honey catalog (2026), https://rawhoneyguide.com/open-data, CC BY 4.0." For the data stories themselves, please link the article URL and credit Raw Honey Guide. Questions, corrections, or requests for the underlying CSV: hello@rawhoneyguide.com.
What is the shape of the 210-jar catalog — where do most listings come from?
As of April 2026 the catalog skews toward U.S. and Pacific origins: USA 93 jars (44%), New Zealand 22 (10%), Australia 10, Canada 9, Italy 9, France 9, Spain 8, Greece 6, UK 5, Mexico 4, Hungary 4, Germany 4, Brazil 3, Turkey 3, Argentina 2, and an "Other" bucket of 19 for countries with one or two listings. By floral source the catalog is deliberately diverse: 22 wildflower, 20 clover, 18 Manuka, 14 orange blossom, 14 acacia, 10 each of buckwheat / lavender / eucalyptus / linden, 8 each of sage / sourwood / chestnut / heather, and a 35-jar "other/mixed" bucket for polyfloral entries. 187 of 210 (89%) are raw; only 25 (11.9%) carry any third-party certification claim — 13 USDA Organic, 6 UMF-tier, 4 Non-GMO Project, 4 True Source.