muddler

A cocktail menu scanner that works most of the time.

Point a phone at a menu. It returns the ingredients, a best guess at the measurements, and a flavor profile. Run by one person who keeps finding new ways for it to be wrong.

Cocktail menus are information design at its worst. Handwriting, ink on black, proprietary names, missing measurements, a bartender's sense of humor. muddler is a mobile app that tries anyway. It runs OCR on the image, resolves the text against a database of around nine thousand ingredients, and fills in the gaps using a small set of structural cocktail templates. For clean printed menus it's right most of the time. For chalkboards and cursive, it is humbled regularly.

Input, output.

A photographed menu in, a single parsed cocktail out. Real menus have more mistakes in them than this one.

The Long Pour
est. somewhere · cocktails, 14–18

Old Fashioned  14
bourbon, demerara, ango, orange peel

Smoked Rosemary Gimlet  16
gin, lime, rosemary syrup, mezcal rinse

Jungle Bird  14
dark rum, campari, pineapple, lime

Paper Plane  15
bourbon, amaro, aperol, lemon

Corpse Reviver №2  15
gin, lillet, cointreau, lemon, absinthe rinse

no substitutions · split checks fine
Photograph of menu · The Long Pour
muddler / scan / 02
name      Smoked Rosemary Gimlet
family    Sour   # inferred
price     $16

ingredients
  - gin              2 oz     # base
  - lime juice       0.75 oz  # citrus
  - rosemary syrup   0.75 oz  # sweet
  - mezcal rinse     ~0.1 oz  # aromatic

profile   sour 7 · boozy 6 · aromatic 5
glass     coupe
method    shake, fine-strain

# 3 of 4 ingredients matched exactly
# "rosemary syrup" not in db — inferred from name
Structured return · Smoked Rosemary Gimlet

Match confidence

Three of four ingredients matched exactly. "Rosemary syrup" wasn't in the database; the parser pulled it from the cocktail's name.

Why "Sour"?

Gin + citrus + sweetener is the structural signature of the Sour family. The ratios are applied as 2 : 0.75 : 0.75 because the menu didn't list any.

Mezcal rinse

Marked aromatic rather than base because it was written as "rinse". A rinse is ~0.1 oz coating the glass before pouring.
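The reasoning in the three notes above can be sketched in a few lines. This is an illustration under assumptions, not muddler's code: the role lookup table, `tag_role`, and `looks_like_sour` are hypothetical names, and only the ~0.1 oz rinse and the base + citrus + sweet signature come from the text.

```python
from typing import Dict, Iterable, Optional, Tuple

RINSE_OZ = 0.1  # the text gives a rinse as roughly 0.1 oz

# Hypothetical role lookup; the real database holds around nine thousand entries.
ROLE_BY_INGREDIENT: Dict[str, str] = {
    "gin": "base",
    "mezcal": "base",
    "lime": "citrus",
    "rosemary syrup": "sweet",
}


def tag_role(raw: str) -> Tuple[str, str, Optional[float]]:
    """Return (name, role, fixed amount or None) for one menu token."""
    name = raw.strip().lower()
    # The word "rinse" overrides the spirit's usual role: it coats the glass.
    if name.endswith(" rinse"):
        return name, "aromatic", RINSE_OZ
    return name, ROLE_BY_INGREDIENT.get(name, "unknown"), None


def looks_like_sour(roles: Iterable[str]) -> bool:
    """Base + citrus + sweet is the structural signature of a Sour."""
    return {"base", "citrus", "sweet"} <= set(roles)


tokens = ["gin", "lime", "rosemary syrup", "mezcal rinse"]
tagged = [tag_role(t) for t in tokens]
is_sour = looks_like_sour(role for _, role, _ in tagged)
```

Without the rinse check, mezcal would be tagged as a second base spirit and the drink would read as much boozier than it is.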

Three phases. Each fails in its own way.

An image becomes text, the text becomes ingredients, the ingredients become a recipe. Any of those steps can quietly be wrong.

01 Read the image

Tesseract runs three passes with different page segmentation modes. The highest-confidence result wins. Dark-background detection inverts menus printed on black.

Known failures. Handwriting, flourish fonts, text embedded as an image in a source PDF. We haven't solved them.
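A rough sketch of the pass selection described above, assuming the OCR call is injected as a function that returns text and a confidence score. The names `read_menu` and `is_dark_background`, the mean-brightness heuristic, and the choice of modes 3, 4, and 6 are assumptions; the text only says three page segmentation modes are tried and the most confident result wins.

```python
from typing import Callable, List, Sequence, Tuple

Gray = List[List[int]]                            # rows of 0-255 gray levels
OcrFn = Callable[[Gray, int], Tuple[str, float]]  # (image, psm) -> (text, confidence)

# Assumption: 3, 4 and 6 are real Tesseract page segmentation modes,
# picked here only for illustration.
PSM_MODES = (3, 4, 6)


def is_dark_background(gray: Gray, threshold: int = 128) -> bool:
    """Guess 'light text on dark' from the mean gray level (hypothetical heuristic)."""
    pixels = [p for row in gray for p in row]
    return bool(pixels) and sum(pixels) / len(pixels) < threshold


def invert(gray: Gray) -> Gray:
    """Flip a dark-background image so the text ends up dark on light."""
    return [[255 - p for p in row] for row in gray]


def read_menu(gray: Gray, ocr: OcrFn,
              modes: Sequence[int] = PSM_MODES) -> Tuple[str, float, int]:
    """Run one OCR pass per mode and keep the highest-confidence result."""
    if is_dark_background(gray):
        gray = invert(gray)
    passes = [(ocr(gray, psm), psm) for psm in modes]
    (text, confidence), psm = max(passes, key=lambda p: p[0][1])
    return text, confidence, psm
```

In practice `ocr` would wrap a Tesseract binding and pass the mode through its `--psm` option. Keeping it as a parameter means the selection logic can be tested without a camera or a bar.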

02 Resolve the ingredients

Each parsed token goes through an eight-tier matcher. The tiers include exact, alias-exact, alias-substring, substring, category, and sequence similarity. Brand names and common OCR misreads are mostly covered.

Known failures. New product releases, long proprietary house-spec names, and anything in a language the database doesn't speak yet.
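A minimal sketch of a tiered matcher, covering only the six tiers named above. The tiny database, the category table, the whole-word matching, and the 0.8 similarity cutoff are all assumptions made for the example; `resolve` is a hypothetical name. Each tier runs only if every earlier one came up empty.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple

# Hypothetical miniature database: canonical name -> aliases.
INGREDIENTS: Dict[str, List[str]] = {
    "angostura bitters": ["ango", "angostura"],
    "lime juice": ["lime"],
    "gin": [],
    "dark rum": ["black rum"],
}
# Hypothetical category words that fall back to a generic entry.
CATEGORIES: Dict[str, str] = {"rum": "dark rum", "bitters": "angostura bitters"}


def _contains(phrase: str, token: str) -> bool:
    """Whole-word containment, so 'gin' does not match inside 'ginger'."""
    return f" {phrase} " in f" {token} "


def resolve(token: str, min_ratio: float = 0.8) -> Optional[Tuple[str, str]]:
    """Return (canonical name, tier that matched), or None if nothing fits."""
    t = " ".join(token.lower().split())
    if not t:
        return None
    if t in INGREDIENTS:                                   # exact
        return t, "exact"
    for name, aliases in INGREDIENTS.items():              # alias-exact
        if t in aliases:
            return name, "alias-exact"
    for name, aliases in INGREDIENTS.items():              # alias-substring
        if any(_contains(a, t) for a in aliases):
            return name, "alias-substring"
    for name in INGREDIENTS:                               # substring
        if _contains(name, t):
            return name, "substring"
    for word, name in CATEGORIES.items():                  # category
        if word in t.split():
            return name, "category"
    # Sequence similarity: the last resort, mostly for OCR misreads.
    best = max(INGREDIENTS, key=lambda n: SequenceMatcher(None, t, n).ratio())
    if SequenceMatcher(None, t, best).ratio() >= min_ratio:
        return best, "sequence"
    return None
```

For example, `resolve("ango")` lands on the alias-exact tier, while a misread like `"dark rurn"` falls all the way through to sequence similarity. Returning the tier alongside the name is what would let a scan report how many ingredients matched exactly.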

03 Infer the specs

Menus rarely list measurements. muddler classifies each cocktail into one of seven structural families and applies the standard ratios for that family.

Known failures. Every specific measurement will be argued by every bartender. The ratios are defensible, not definitive.
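Filling in the gaps might look something like the sketch below. The ratios are copied from the family templates on this page; Tiki is left out because it has no fixed ratio. The function name `fill_specs`, the tuple layout, and the role labels are assumptions, and units are not asserted beyond the ounces shown in the Sour example.

```python
from typing import Dict, List, Optional, Tuple, Union

Amount = Union[float, str]  # a number, or a word such as "dash" or "top"

# Family templates as listed on this page (role, amount).
FAMILY_RATIOS: Dict[str, List[Tuple[str, Amount]]] = {
    "Sour": [("base", 2.0), ("citrus", 0.75), ("sweet", 0.75)],
    "Spirit Forward": [("base", 2.0), ("modifier", 1.0), ("bitters", "dash")],
    "Highball": [("base", 1.0), ("mixer", 3.0)],
    "Collins": [("base", 2.0), ("citrus", 1.0), ("syrup", 1.0), ("soda", 2.0)],
    "Fizz": [("base", 2.0), ("citrus", 1.0), ("syrup", 0.75), ("soda", "top")],
    "Spritz": [("sparkling", 3.0), ("aperitif", 2.0), ("soda", 1.0)],
}

Ingredient = Tuple[str, str, Optional[Amount]]   # (name, role, amount listed on menu)
Filled = Tuple[str, Optional[Amount], bool]      # (name, amount, was it inferred?)


def fill_specs(family: str, ingredients: List[Ingredient]) -> List[Filled]:
    """Keep any measurement the menu lists; otherwise use the family ratio."""
    template = dict(FAMILY_RATIOS.get(family, []))
    filled: List[Filled] = []
    for name, role, listed in ingredients:
        if listed is not None:
            filled.append((name, listed, False))
        else:
            amount = template.get(role)
            # No template entry means no guess: the amount stays empty.
            filled.append((name, amount, amount is not None))
    return filled
```

The `inferred` flag is the important part. Anything the template supplied should be marked as a guess, because a bartender will argue with it.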

Every cocktail is one of these underneath.

Modern bartending is mostly recombination. muddler classifies drinks into one of seven templates, then applies standard ratios when the menu is silent. Most cocktails fit. The weird ones are flagged as weird.

01

Sour

2 · 0.75 · 0.75   base / citrus / sweet

  • spirit
  • lemon or lime
  • syrup or liqueur
  • egg white (optional)

e.g. Whiskey Sour, Daiquiri, Gimlet

02

Spirit Forward

2 · 1 · dash   base / modifier / bitters

  • base spirit
  • fortified wine or amaro
  • aromatic bitters

e.g. Manhattan, Negroni, Boulevardier

03

Highball

1 · 3   base / long mixer

  • base spirit
  • soda, tonic, ginger beer, or cola
  • citrus wedge

e.g. Gin and Tonic, Moscow Mule

04

Collins

2 · 1 · 1 · 2   base / citrus / syrup / soda

  • base spirit
  • lemon
  • simple syrup
  • soda water

e.g. Tom Collins, John Collins

05

Fizz

2 · 1 · 0.75 · top   base / citrus / syrup / soda

  • base spirit
  • lemon or lime
  • simple syrup
  • soda or sparkling
  • egg white (occasional)

e.g. Gin Fizz, Ramos Gin Fizz

06

Tiki

blend of rums · citrus · orgeat or falernum

  • two or more rums
  • lime or pineapple
  • orgeat, falernum, or curaçao
  • bitters or absinthe

e.g. Mai Tai, Jungle Bird, Zombie

07

Spritz

3 · 2 · 1   sparkling / aperitif / soda

  • prosecco or other sparkling
  • aperitivo (Aperol, Campari, Select)
  • soda water

e.g. Aperol Spritz, Hugo Spritz

Things we found in nine thousand ingredients.

An incomplete tour of the weirder corners.

  • i.

    Seven spellings of falernum

    Three are wrong. Two are regional. Two we can't explain. The ingredient normalizer treats all of them as the same thing.

  • ii.

    Angostura, 1824

    Angostura Bitters has been in continuous production since 1824. The only ingredient in the database older than most American bars.

  • iii.

    Three grenadines

    Three different things are sold under that name. Two contain no pomegranate. One is red dye and corn syrup with a story.

  • iv.

    Twelve botanicals

    One Japanese gin lists twelve botanicals, six of which don't appear in any other entry. Yuzu, sansho, kabosu, hinoki, sakura leaf, and something called yomogi that is technically mugwort.

  • v.

    Category: Other

    Quietly absorbs 144 ingredients that nobody could agree on. Egg whites. Toasted coconut. One entry just says "smoke."

  • vi.

    The word "dash"

    Bartender sources disagree on what a dash is by a factor of three. We picked one and documented the choice. Someone will email us about it.

An incomplete list.

In no particular order, here is what you should expect not to work.

Handwritten menus
OCR confidence drops sharply. Most handwriting gets parsed; some doesn't. Flourished cursive is a war.
Menus rendered as images
A PDF where the menu is an embedded JPEG reads the same as a photograph. Text on the page source doesn't help us.
House specials with no listed ingredients
A cocktail called "The Unicorn" is exactly as useful to us as a cocktail called "The Unicorn." We pass them through untouched.
Measurement inferences
Bartenders will argue about every one of them. That is fine. The ratios are defensible, not definitive.
Non-Latin scripts
Basic Latin and Latin-1 only for now. Japanese, Thai, Cyrillic, Hangul menus all fail. We know.
Anything photographed in the dark
A menu in a mood-lit bar is functionally unreadable. Flash helps; flash on a laminated menu does not.
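For the non-Latin scripts item in the list above, one way a pre-check might look is below. It assumes "Basic Latin and Latin-1" means code points U+0000 to U+00FF and counts only letters, so stray punctuation does not trip it. The function name is hypothetical and nothing here is confirmed as how muddler decides.

```python
from typing import List


def unsupported_letters(text: str) -> List[str]:
    """Letters outside Basic Latin and Latin-1 Supplement (above U+00FF)."""
    return sorted({ch for ch in text if ch.isalpha() and ord(ch) > 0xFF})


def is_supported_script(text: str) -> bool:
    """True when every letter in the text falls inside the supported range."""
    return not unsupported_letters(text)
```

Accented Latin text such as "Crème de cassis" passes, while a line in katakana or Cyrillic is reported with the offending letters, which is more useful than a silent bad parse.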

Availability.

Platforms
iOS and Android.
Offline use
A seed subset of the ingredient database ships with the app. Scans still work without a connection.
Accounts
Optional. Scanning and browsing work without one. A free account adds custom ingredients, saved scans, and more as the app grows.
Database
~9,000 ingredients, 600+ base recipes, 7 cocktail families, 16 syrup specs. The library keeps growing.
Privacy
Scans are processed and discarded.
Made by
Sticctape. One person, at a bar, usually.