Cocktail menus are information design at its worst. Handwriting, ink on black, proprietary names, missing measurements, a bartender's sense of humor. muddler is a mobile app that tries anyway. It runs OCR on the image, resolves the text against a database of around nine thousand ingredients, and fills in the gaps using a small set of structural cocktail templates. For clean printed menus it's right most of the time. For chalkboards and cursive, it is humbled regularly.
Worked example
Input, output.
A photographed menu goes in. A single parsed cocktail comes out. Real menus produce more mistakes than this one.
name Smoked Rosemary Gimlet
family Sour # inferred
price $16
ingredients
- gin 2 oz # base
- lime juice 0.75 oz # citrus
- rosemary syrup 0.75 oz # sweet
- mezcal rinse ~0.1 oz # aromatic
profile sour 7 · boozy 6 · aromatic 5
glass coupe
method shake, fine-strain
# 3 of 4 ingredients matched exactly
# "rosemary syrup" not in db — inferred from nameMatch confidence
Three of four ingredients matched exactly. "Rosemary syrup" wasn't in the database; the parser pulled it from the cocktail's name.
Why "Sour"?
Gin + citrus + sweetener is the structural signature of the Sour family. The menu didn't list measurements, so the family's standard 2 : 0.75 : 0.75 ratio is applied.
Mezcal rinse
Marked aromatic rather than base because it was written as "rinse". A rinse is about 0.1 oz of spirit swirled to coat the glass before the drink is poured.
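The two inferences in this example, the Sour ratio and the rinse override, can be sketched in a few lines of Python. This is an illustrative sketch rather than muddler's actual code: `SOUR_RATIOS`, `RINSE_OZ`, and `fill_measurements` are hypothetical names, and the only values taken from the example are the 2 : 0.75 : 0.75 ratio and the ~0.1 oz rinse.

```python
# Hypothetical sketch: fill in missing measurements for a Sour.
# The ratio and rinse volume come from the worked example; the names are made up.
SOUR_RATIOS = {"base": 2.0, "citrus": 0.75, "sweet": 0.75}  # ounces
RINSE_OZ = 0.1  # a rinse only coats the glass


def fill_measurements(ingredients):
    """Turn (name, role) pairs into (name, role, ounces) triples."""
    filled = []
    for name, role in ingredients:
        if "rinse" in name.lower():
            # Written as a rinse: treat it as aromatic, not as a second base.
            filled.append((name, "aromatic", RINSE_OZ))
        else:
            filled.append((name, role, SOUR_RATIOS[role]))
    return filled


menu_line = [
    ("gin", "base"),
    ("lime juice", "citrus"),
    ("rosemary syrup", "sweet"),
    ("mezcal rinse", "base"),  # a spirit, but the word "rinse" changes its role
]

for name, role, oz in fill_measurements(menu_line):
    print(f"{name:<15} {oz:>5} oz  # {role}")
```

Checking for the word "rinse" before looking up the role keeps a second spirit from being poured at a full 2 oz.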
The pipeline
Three phases. Each fails in its own way.
An image becomes text, the text becomes ingredients, the ingredients become a recipe. Any of those steps can quietly be wrong.
01 Read the image
Tesseract runs three passes with different page segmentation modes. The highest-confidence result wins. Dark-background detection inverts menus printed on black.
Known failures. Handwriting, flourish fonts, text embedded as an image in a source PDF. We haven't solved them.
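A rough sketch of the selection logic in this phase, in Python. It is not muddler's code. The OCR call is passed in as a plain function so the example runs without Tesseract installed. The three mode numbers (3, 6, and 11 are real Tesseract page segmentation modes) and the brightness threshold of 128 are assumptions, since the text above names neither.

```python
from statistics import mean


def is_dark_background(gray_pixels, threshold=128):
    """True when the average 0-255 luminance is below `threshold` (assumed value)."""
    return mean(gray_pixels) < threshold


def invert(gray_pixels):
    """Flip light-on-dark to dark-on-light."""
    return [255 - p for p in gray_pixels]


def best_pass(gray_pixels, ocr, modes=(3, 6, 11)):
    """Run `ocr(pixels, mode)` once per mode and keep the most confident result.

    `ocr` is a caller-supplied function returning (text, confidence).
    """
    if is_dark_background(gray_pixels):
        gray_pixels = invert(gray_pixels)
    results = [ocr(gray_pixels, mode) for mode in modes]
    return max(results, key=lambda result: result[1])


def fake_ocr(pixels, mode):
    # Stand-in for a real Tesseract call; texts and confidences are invented.
    canned = {
        3: ("Smoked Rosemary Gim1et", 61.0),
        6: ("Smoked Rosemary Gimlet", 88.0),
        11: ("Smoked Rosemary", 40.0),
    }
    return canned[mode]


# A mostly dark "image": it gets inverted before the three passes run.
print(best_pass([20, 30, 25, 240], fake_ocr))
```

Picking by confidence alone is simple, but a confident pass can still be wrong, which is one reason the later phases carry their own failure lists.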
02 Resolve the ingredients
Each parsed token goes through an eight-tier matcher whose tiers include exact, alias-exact, alias-substring, substring, category, and sequence similarity. Brand names and common OCR misreads are mostly covered.
Known failures. New product releases, long proprietary house-spec names, and anything in a language the database doesn't speak yet.
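To make the tier idea concrete, here is a minimal Python sketch of a fall-through matcher. It is an assumption-heavy illustration, not the real matcher. It implements only five of the named tiers and skips the category tier. The three-entry `DB`, the `match` function, and the 0.8 similarity cutoff are all hypothetical. Sequence similarity is done with the standard library's `difflib.SequenceMatcher`, which may not be what muddler uses.

```python
from difflib import SequenceMatcher

# Tiny hypothetical database: canonical name -> known aliases.
DB = {
    "lime juice": ["fresh lime", "lime"],
    "gin": ["london dry gin"],
    "falernum": ["velvet falernum"],
}


def match(token, db=DB, min_ratio=0.8):
    """Return (canonical_name, tier), or (None, "unmatched")."""
    t = token.strip().lower()
    if t in db:
        return t, "exact"
    for name, aliases in db.items():
        if t in aliases:
            return name, "alias-exact"
    for name, aliases in db.items():
        if any(alias in t for alias in aliases):
            return name, "alias-substring"
    for name in db:
        if name in t:
            return name, "substring"
    # Last resort: fuzzy matching, which catches some OCR misreads.
    best = max(db, key=lambda name: SequenceMatcher(None, t, name).ratio())
    if SequenceMatcher(None, t, best).ratio() >= min_ratio:
        return best, "sequence-similarity"
    return None, "unmatched"


for token in ["Gin", "fresh lime", "house velvet falernum", "fa1ernum", "rosemary syrup"]:
    print(token, "->", match(token))
```

The tiers run from strictest to loosest so that a cheap exact hit never gets overridden by a fuzzy one. The substring tiers are the risky ones: in this sketch "gin" would happily match inside "ginger beer", which a real matcher would need to guard against.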
03 Infer the specs
Menus rarely list measurements. muddler classifies each cocktail into one of seven structural families and applies the standard ratios for that family.
Known failures. Every specific measurement will be argued by every bartender. The ratios are defensible, not definitive.
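Classification by structure can be sketched as a lookup over ingredient roles. The Python below is a simplified illustration under stated assumptions, not the app's classifier. The role names and the `FAMILY_SIGNATURES` table are hypothetical, only five of the seven families are covered (Tiki and Fizz need more than a set of roles to tell apart), and "first match wins" is an assumed rule.

```python
# Hypothetical role signatures, checked in order; the first full match wins.
# Collins is listed before Sour because its signature is a superset of Sour's.
FAMILY_SIGNATURES = [
    ("Spritz", {"sparkling", "aperitif", "soda"}),
    ("Collins", {"base", "citrus", "sweet", "soda"}),
    ("Sour", {"base", "citrus", "sweet"}),
    ("Spirit Forward", {"base", "modifier", "bitters"}),
    ("Highball", {"base", "mixer"}),
]


def classify(roles):
    """Return the first family whose signature is contained in `roles`."""
    present = set(roles)
    for family, signature in FAMILY_SIGNATURES:
        if signature <= present:  # subset test
            return family
    return "unclassified"  # flagged rather than forced into a family


print(classify(["base", "citrus", "sweet", "aromatic"]))  # Sour
print(classify(["base", "citrus", "sweet", "soda"]))      # Collins
print(classify(["base", "mixer", "citrus"]))              # Highball
print(classify(["smoke"]))                                # unclassified
```

Extra roles, such as the aromatic rinse in a gimlet, do not block a match because the test is "signature is a subset of what is present", not equality.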
The seven structural families
Nearly every cocktail is one of these underneath.
Modern bartending is mostly recombination. muddler classifies drinks into one of seven templates, then applies standard ratios when the menu is silent. Most cocktails fit. The weird ones are flagged as weird.
Sour
2 · 0.75 · 0.75 base / citrus / sweet
- spirit
- lemon or lime
- syrup or liqueur
- egg white (optional)
e.g. Whiskey Sour, Daiquiri, Gimlet
Spirit Forward
2 · 1 · dash base / modifier / bitters
- base spirit
- fortified wine or amaro
- aromatic bitters
e.g. Manhattan, Negroni, Boulevardier
Highball
1 · 3 base / long mixer
- base spirit
- soda, tonic, ginger beer, or cola
- citrus wedge
e.g. Gin and Tonic, Moscow Mule
Collins
2 · 1 · 1 · 2 base / citrus / syrup / soda
- base spirit
- lemon
- simple syrup
- soda water
e.g. Tom Collins, John Collins
Fizz
2 · 1 · 0.75 · top base / citrus / syrup / soda
- base spirit
- lemon or lime
- simple syrup
- soda or sparkling
- egg white (occasional)
e.g. Gin Fizz, Ramos Gin Fizz
Tiki
blend of rums · citrus · orgeat or falernum
- two or more rums
- lime or pineapple
- orgeat, falernum, or curaçao
- bitters or absinthe
e.g. Mai Tai, Jungle Bird, Zombie
Spritz
3 · 2 · 1 sparkling / aperitif / soda
- prosecco or other sparkling
- aperitivo (Aperol, Campari, Select)
- soda water
e.g. Aperol Spritz, Hugo Spritz
Notes from the database
Things we found in nine thousand ingredients.
An incomplete tour of the weirder corners.
- i.
Seven spellings of falernum
Three are wrong. Two are regional. Two we can't explain. The ingredient normalizer treats all of them as the same thing.
- ii.
Angostura, 1824
Angostura Bitters has been in continuous production since 1824. The only ingredient in the database older than most American bars.
- iii.
Three grenadines
Three different things are sold under that name. Two contain no pomegranate. One is red dye and corn syrup with a story.
- iv.
Twelve botanicals
One Japanese gin lists twelve botanicals, six of which don't appear in any other entry. Yuzu, sansho, kabosu, hinoki, sakura leaf, and something called yomogi that is technically mugwort.
- v.
Category: Other
Quietly absorbs 144 ingredients that nobody could agree on. Egg whites. Toasted coconut. One entry just says "smoke."
- vi.
The word "dash"
Bartender sources disagree on what a dash is by a factor of three. We picked one and documented the choice. Someone will email us about it.
What it gets wrong
An incomplete list.
In no particular order, here is what you should expect to not work.
- Handwritten menus
- OCR confidence drops sharply. Most handwriting gets parsed; some doesn't. Flourished cursive is a war.
- Menus rendered as images
- A PDF where the menu is an embedded JPEG reads the same as a photograph. Any text in the page source doesn't help us.
- House specials with no listed ingredients
- A cocktail called "The Unicorn" is exactly as useful to us as a cocktail called "The Unicorn." We pass them through untouched.
- Measurement inferences
- Bartenders will argue about every one of them. That is fine. The ratios are defensible, not definitive.
- Non-Latin scripts
- Basic Latin and Latin-1 only for now. Japanese, Thai, Cyrillic, Hangul menus all fail. We know.
- Anything photographed in the dark
- A menu in a mood-lit bar is functionally unreadable. Flash helps; flash on a laminated menu does not.
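The script limitation in the list above is the easiest one to test for. Basic Latin plus Latin-1 is the Unicode range U+0000 to U+00FF, so a one-line Python check can tell whether a string stays inside it. Whether muddler runs a check like this before matching is an assumption, and `is_supported_script` is a hypothetical name.

```python
def is_supported_script(text):
    """True when every character is in Basic Latin or Latin-1 (U+0000 to U+00FF)."""
    return all(ord(ch) <= 0xFF for ch in text)


print(is_supported_script("Curaçao Daiquirí"))  # True: ç and í are Latin-1
print(is_supported_script("ハイボール"))          # False: katakana
print(is_supported_script("Водка"))             # False: Cyrillic
```

One caveat: typographic quotes and long dashes also sit outside Latin-1, so a real check would probably normalize punctuation first rather than reject a menu over a curly apostrophe.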
Practical notes
Availability.
- Platforms
- iOS and Android.
- Offline use
- A seed subset of the ingredient database ships with the app, so scans still work without a connection.
- Accounts
- Optional. Scanning and browsing work without one. A free account adds custom ingredients, saved scans, and more as the app grows.
- Database
- ~9,000 ingredients, 600+ base recipes, 7 cocktail families, 16 syrup specs. The library keeps growing.
- Privacy
- Scans are processed and discarded. Details here.
- Made by
- Sticctape. One person, at a bar, usually.